disdrodb.api package

Submodules#

disdrodb.api.checks module#

DISDRODB Checks Functions.

disdrodb.api.checks.check_campaign_name(campaign_name)[source][source]#

Check that the campaign name is upper case.

disdrodb.api.checks.check_campaign_names(campaign_names)[source][source]#

Check DISDRODB campaign names.

disdrodb.api.checks.check_data_archive_dir(data_archive_dir: str)[source][source]#

Raise an error if the path does not end with DISDRODB.

disdrodb.api.checks.check_data_availability(product, data_source, campaign_name, station_name, data_archive_dir=None, **product_kwargs)[source][source]#

Check the station product data directory has files inside. If not, raise an error.

disdrodb.api.checks.check_data_source(data_source)[source][source]#

Check that the data_source name is upper case.

disdrodb.api.checks.check_data_sources(data_sources)[source][source]#

Check DISDRODB data sources.

disdrodb.api.checks.check_directories_inside(dir_path)[source][source]#

Check there are directories inside the specified dir_path.

disdrodb.api.checks.check_filepaths(filepaths)[source][source]#

Ensure filepaths is a list of strings.

disdrodb.api.checks.check_folder_partitioning(folder_partitioning)[source][source]#

Check if the given folder partitioning scheme is valid.

Parameters:

folder_partitioning (str or None) –

Defines the subdirectory structure based on the dataset’s start time. Allowed values are:

  • "" or None: No additional subdirectories; files are saved directly in <dir>.

  • "year": Files are stored under a subdirectory for the year (<dir>/2025).

  • "year/month": Files are stored under subdirectories by year and month (<dir>/2025/04).

  • "year/month/day": Files are stored under subdirectories by year, month and day (<dir>/2025/04/01).

  • "year/month_name": Files are stored under subdirectories by year and month name (<dir>/2025/April).

  • "year/quarter": Files are stored under subdirectories by year and quarter (<dir>/2025/Q2).

Returns:

The verified folder partitioning scheme.

Return type:

folder_partitioning
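The validation described above can be sketched as follows. This is a minimal illustration, not the DISDRODB implementation: the function name `validate_folder_partitioning` is hypothetical, and normalizing None to an empty string is an assumption.

```python
# Hypothetical re-implementation for illustration only.
# The allowed values are copied from the list documented above.
ALLOWED_PARTITIONINGS = (
    "",
    "year",
    "year/month",
    "year/month/day",
    "year/month_name",
    "year/quarter",
)


def validate_folder_partitioning(folder_partitioning):
    """Return the scheme if it is valid, otherwise raise an error."""
    # Assumption: None is treated like "" (no additional subdirectories).
    scheme = "" if folder_partitioning is None else folder_partitioning
    if not isinstance(scheme, str):
        raise TypeError("folder_partitioning must be a string or None.")
    if scheme not in ALLOWED_PARTITIONINGS:
        raise ValueError(
            f"Invalid folder_partitioning {scheme!r}. "
            f"Allowed values: {ALLOWED_PARTITIONINGS}"
        )
    return scheme


print(validate_folder_partitioning("year/month"))
```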

disdrodb.api.checks.check_invalid_fields_policy(invalid_fields)[source][source]#

Check invalid fields policy.

disdrodb.api.checks.check_issue_dir(data_source, campaign_name, metadata_archive_dir=None)[source][source]#

Check existence of the issue directory. If it does not exist, raise an error.

disdrodb.api.checks.check_issue_file(data_source, campaign_name, station_name, metadata_archive_dir=None)[source][source]#

Check existence of a valid issue YAML file. If it does not exist, raise an error.

disdrodb.api.checks.check_measurement_interval(measurement_interval)[source][source]#

Check measurement interval validity.

disdrodb.api.checks.check_measurement_intervals(measurement_intervals)[source][source]#

Check measurement intervals.

The input can be a list. Each value must be a positive natural number.

disdrodb.api.checks.check_metadata_archive_dir(metadata_archive_dir: str)[source][source]#

Raise an error if the path does not end with DISDRODB.

disdrodb.api.checks.check_metadata_file(metadata_archive_dir, data_source, campaign_name, station_name, check_validity=True)[source][source]#

Check existence of a valid metadata YAML file. If it does not exist, raise an error.

disdrodb.api.checks.check_path(path: str) None[source][source]#

Check if a path exists.

Parameters:

path (str) – Path to check.

Raises:

FileNotFoundError – If the path does not exist.

disdrodb.api.checks.check_path_is_a_directory(dir_path, path_name='')[source][source]#

Check that the path exists and is a directory.

disdrodb.api.checks.check_product(product)[source][source]#

Check DISDRODB product.

disdrodb.api.checks.check_product_kwargs(product, product_kwargs)[source][source]#

Validate that product_kwargs for a given product contains exactly the required parameters.

Parameters:
  • product (str) – The product name (e.g., “L2E”, “L2M”).

  • product_kwargs (dict) – Keyword arguments provided for this product.

Returns:

The validated product_kwargs.

Return type:

dict

Raises:

ValueError – If required arguments are missing or if there are unexpected extra arguments.
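A minimal sketch of this "exactly the required parameters" check. The mapping of required arguments is an assumption inferred from the parameter descriptions on this page (temporal_resolution for L1, L2E and L2M; model_name for L2M), and the names `REQUIRED_PRODUCT_KWARGS` and `validate_product_kwargs` are hypothetical.

```python
# Assumption: required keyword arguments per product, inferred from this page.
# Products not listed here are assumed to require no extra arguments.
REQUIRED_PRODUCT_KWARGS = {
    "L1": {"temporal_resolution"},
    "L2E": {"temporal_resolution"},
    "L2M": {"temporal_resolution", "model_name"},
}


def validate_product_kwargs(product, product_kwargs):
    """Raise ValueError on missing or unexpected arguments."""
    required = REQUIRED_PRODUCT_KWARGS.get(product, set())
    provided = set(product_kwargs)
    missing = required - provided
    unexpected = provided - required
    if missing:
        raise ValueError(f"Missing arguments for {product}: {sorted(missing)}")
    if unexpected:
        raise ValueError(f"Unexpected arguments for {product}: {sorted(unexpected)}")
    return product_kwargs


# "1MIN" is one of the example temporal resolutions given on this page.
print(validate_product_kwargs("L2E", {"temporal_resolution": "1MIN"}))
```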

disdrodb.api.checks.check_rolling(rolling)[source][source]#

Check rolling argument validity.

disdrodb.api.checks.check_sample_interval(sample_interval)[source][source]#

Check sample_interval argument validity.

disdrodb.api.checks.check_scattering_table_dir(scattering_table_dir: str)[source][source]#

Raise an error if the directory does not exist.

disdrodb.api.checks.check_sensor_name(sensor_name: str) None[source][source]#

Check sensor name.

Parameters:

sensor_name (str) – Name of the sensor.

Raises:
  • TypeError – Error if sensor_name is not a string.

  • ValueError – Error if the input sensor name has not been found in the list of available sensors.

disdrodb.api.checks.check_start_end_time(start_time, end_time)[source][source]#

Check start_time and end_time value validity.

disdrodb.api.checks.check_station_inputs(data_source, campaign_name, station_name, metadata_archive_dir=None)[source][source]#

Check validity of station inputs.

disdrodb.api.checks.check_station_names(station_names)[source][source]#

Check DISDRODB station names.

disdrodb.api.checks.check_temporal_resolution(temporal_resolution)[source][source]#

Check temporal resolution validity.

disdrodb.api.checks.check_time(time)[source][source]#

Check time validity.

Returns a datetime.datetime object with seconds precision.

Parameters:

time (datetime.datetime, datetime.date, numpy.datetime64 or str) – Time object. If a string is given, it must follow the ISO format YYYY-MM-DD hh:mm:ss.

Returns:

time

Return type:

datetime.datetime
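The conversion described above can be sketched with the standard library alone. This is an illustration under stated assumptions, not the DISDRODB code: the function name `normalize_time` is hypothetical, and numpy.datetime64 inputs are left out to keep the example dependency-free.

```python
import datetime


def normalize_time(time):
    """Convert the input to a datetime.datetime with seconds precision."""
    # datetime.datetime must be tested first: it is a subclass of datetime.date.
    if isinstance(time, datetime.datetime):
        return time.replace(microsecond=0)
    if isinstance(time, datetime.date):
        return datetime.datetime(time.year, time.month, time.day)
    if isinstance(time, str):
        # Expects the ISO format YYYY-MM-DD hh:mm:ss.
        return datetime.datetime.fromisoformat(time).replace(microsecond=0)
    raise TypeError(f"Unsupported time type: {type(time).__name__}")


print(normalize_time("2025-04-01 12:30:15"))
```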

disdrodb.api.checks.check_url(url: str) bool[source][source]#

Check url.

Parameters:

url (str) – URL to check.

Returns:

True if the URL is well formatted, False otherwise.

Return type:

bool
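One possible way to implement such a format check is sketched below with `urllib.parse`. The criteria used here (an http or https scheme plus a non-empty host) are an assumption; the rules actually applied by check_url are not stated on this page, and `is_well_formatted_url` is a hypothetical name.

```python
from urllib.parse import urlparse


def is_well_formatted_url(url):
    """Return True if the URL has an http(s) scheme and a host."""
    if not isinstance(url, str):
        return False
    parsed = urlparse(url)
    # Assumption: only http and https URLs count as well formatted.
    return parsed.scheme in ("http", "https") and bool(parsed.netloc)


print(is_well_formatted_url("https://example.com/data.zip"))
print(is_well_formatted_url("not a url"))
```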

disdrodb.api.checks.check_valid_fields(fields, available_fields, field_name, invalid_fields_policy='raise')[source][source]#

Check if fields are valid.

disdrodb.api.checks.get_current_utc_time()[source][source]#

Get current UTC time.

disdrodb.api.checks.has_available_data(data_source, campaign_name, station_name, product, data_archive_dir=None, **product_kwargs)[source][source]#

Return True if data are available for the given product and station.

disdrodb.api.checks.select_required_product_kwargs(product, product_kwargs)[source][source]#

Select the required product arguments.

disdrodb.api.configs module#

Retrieve sensor configuration files.

disdrodb.api.configs.available_sensor_names() list[source][source]#

Get available names of sensors.

Returns:

sensor_names – Sorted list of the available sensors.

Return type:

list

disdrodb.api.configs.get_sensor_configs_dir(sensor_name: str, product: str) str[source][source]#

Retrieve configs directory.

Parameters:
  • sensor_name (str) – Name of the sensor.

  • product (str) – DISDRODB product.

Returns:

config_sensor_dir – Config directory.

Return type:

str

Raises:

ValueError – Error if the config directory does not exist.

disdrodb.api.configs.read_config_file(sensor_name: str, product: str, filename: str) dict[source][source]#

Read a config yaml file and return the dictionary.

Parameters:
  • sensor_name (str) – Name of the sensor.

  • product (str) – DISDRODB product.

  • filename (str) – Name of the file.

Returns:

Content of the config file.

Return type:

dict

Raises:

ValueError – Error if file does not exist.

disdrodb.api.create_directories module#

Tools to create RAW, L0A and L0B DISDRODB directories.

disdrodb.api.create_directories.create_data_directory(data_archive_dir, product, data_source, campaign_name, station_name, **product_kwargs)[source][source]#

Create station product data directory.

disdrodb.api.create_directories.create_initial_station_structure(data_source, campaign_name, station_name, data_archive_dir=None, metadata_archive_dir=None)[source][source]#

Create the DISDRODB Data and Metadata Archive structure for a single station.

disdrodb.api.create_directories.create_issue_directory(metadata_archive_dir, data_source, campaign_name)[source][source]#

Create issue directory.

disdrodb.api.create_directories.create_l0_directory_structure(data_archive_dir, metadata_archive_dir, data_source, campaign_name, station_name, force, product)[source][source]#

Create directory structure for the first L0 DISDRODB product.

If the input data are raw text files, use product = "L0A". If the input data are raw netCDF files, use product = "L0B".

product = "L0A" will call run_l0a. product = "L0B" will call run_l0b_nc.

disdrodb.api.create_directories.create_logs_directory(product, data_source, campaign_name, station_name, data_archive_dir=None, **product_kwargs)[source][source]#

Initialize the logs directory structure for a DISDRODB product.

disdrodb.api.create_directories.create_metadata_directory(metadata_archive_dir, data_source, campaign_name)[source][source]#

Create metadata directory.

disdrodb.api.create_directories.create_product_directory(data_source, campaign_name, station_name, product, force, data_archive_dir=None, metadata_archive_dir=None, **product_kwargs)[source][source]#

Initialize the directory structure for a DISDRODB product.

If product files already exist:

  • If force=True, all existing data inside the product directory are removed.

  • If force=False, an error is raised.

disdrodb.api.create_directories.create_test_archive(test_data_archive_dir, data_source, campaign_name, station_name, data_archive_dir=None, metadata_archive_dir=None, force=False)[source][source]#

Create test DISDRODB Archive for a single existing station.

This function copies the metadata and issue files of a station. This makes it possible to test data download and DISDRODB processing.

disdrodb.api.create_directories.ensure_empty_data_dir(data_dir, force)[source][source]#

Remove the content of the data_dir directory.
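The force semantics documented for create_product_directory (remove existing data if force=True, raise an error if force=False) can be sketched as follows. This is a minimal illustration with the standard library: the name `ensure_empty_dir`, the exception type, and the choice to recreate the emptied directory are assumptions.

```python
import os
import shutil
import tempfile


def ensure_empty_dir(data_dir, force):
    """Empty data_dir if force is True; raise if it has content and force is False."""
    # Nothing to do if the directory is missing or already empty.
    if not os.path.isdir(data_dir) or not os.listdir(data_dir):
        return
    if not force:
        raise ValueError(f"{data_dir} already contains files. Use force=True to remove them.")
    # Assumption: the directory itself is kept (removed and recreated empty).
    shutil.rmtree(data_dir)
    os.makedirs(data_dir)


# Usage on a temporary directory containing one file.
demo_dir = tempfile.mkdtemp()
open(os.path.join(demo_dir, "file.txt"), "w").close()
ensure_empty_dir(demo_dir, force=True)
print(os.listdir(demo_dir))
shutil.rmtree(demo_dir)
```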

disdrodb.api.info module#

Retrieve file information from DISDRODB products file names and filepaths.

disdrodb.api.info.check_groups(groups)[source][source]#

Check groups validity.

disdrodb.api.info.get_campaign_name_from_filepaths(filepaths)[source][source]#

Return the DISDRODB campaign name of the specified files.

disdrodb.api.info.get_end_time_from_filepaths(filepaths)[source][source]#

Return the end time of the specified files.

disdrodb.api.info.get_groups_value(groups, filepath)[source][source]#

Return a string associated with the group keys.

If multiple keys are specified, the value returned is a string of format: <group_value_1>/<group_value_2>/...

If a single key is specified and is start_time or end_time, the function returns a datetime.datetime object.

disdrodb.api.info.get_info_from_filepath(filepath)[source][source]#

Retrieve file information dictionary from filepath.

disdrodb.api.info.get_key_from_filepath(filepath, key)[source][source]#

Extract specific key information from a filepath.

disdrodb.api.info.get_key_from_filepaths(filepaths, key)[source][source]#

Extract specific key information from a list of filepaths.

disdrodb.api.info.get_product_from_filepaths(filepaths)[source][source]#

Return the DISDRODB product name of the specified files.

disdrodb.api.info.get_sample_interval_from_filepaths(filepaths)[source][source]#

Return the sample interval of the specified files.

disdrodb.api.info.get_season(time)[source][source]#

Get season from datetime.datetime or datetime.date object.

disdrodb.api.info.get_start_end_time_from_filepaths(filepaths)[source][source]#

Return the start and end time of the specified files.

disdrodb.api.info.get_start_time_from_filepaths(filepaths)[source][source]#

Return the start time of the specified files.

disdrodb.api.info.get_station_name_from_filepaths(filepaths)[source][source]#

Return the DISDRODB station name of the specified files.

disdrodb.api.info.get_time_component(time, component)[source][source]#

Get time component from datetime.datetime object.

disdrodb.api.info.get_version_from_filepaths(filepaths)[source][source]#

Return the DISDRODB product version of the specified files.

disdrodb.api.info.group_filepaths(filepaths, groups=None)[source][source]#

Group filepaths in a dictionary if groups are specified.

Parameters:
  • filepaths (list) – List of filepaths.

  • groups (list or str) – The group keys by which to group the filepaths. Valid group keys are product, subproduct, campaign_name, station_name, start_time, end_time, temporal_resolution, sample_interval, data_format, year, month, day, doy, dow, hour, minute, second, month_name, quarter, season. The time components are extracted from start_time. If groups is None, the input filepaths list is returned. The default value is None.

Returns:

Either a dictionary of format {<group_value>: <list_filepaths>} or the original input filepaths (if groups=None).

Return type:

dict or list
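The grouping logic can be sketched generically as below. This is not the DISDRODB implementation: `group_by_keys`, the `get_info` callable and the toy file names are hypothetical, since the DISDRODB file name convention is not described on this page. Unlike get_groups_value, the sketch always returns string keys.

```python
import os
from collections import defaultdict


def group_by_keys(filepaths, groups=None, get_info=None):
    """Group filepaths by one or more keys extracted by get_info."""
    if groups is None:
        return filepaths
    if isinstance(groups, str):
        groups = [groups]
    grouped = defaultdict(list)
    for filepath in filepaths:
        info = get_info(filepath)
        # Multiple keys are joined as <group_value_1>/<group_value_2>/...
        value = "/".join(str(info[key]) for key in groups)
        grouped[value].append(filepath)
    return dict(grouped)


def toy_info(filepath):
    """Parse a toy '<station_name>_<year>.nc' file name (hypothetical format)."""
    station_name, year = os.path.basename(filepath)[:-3].split("_")
    return {"station_name": station_name, "year": year}


files = ["A_2024.nc", "A_2025.nc", "B_2025.nc"]
print(group_by_keys(files, "year", toy_info))
print(group_by_keys(files, ["station_name", "year"], toy_info))
```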

disdrodb.api.info.infer_archive_dir_from_path(path: str) str[source][source]#

Return the disdrodb base directory from a file or directory path.

Assumption: no data_source, campaign_name, station_name or file name contains the word DISDRODB.

Parameters:

path (str) – Directory or file path within the DISDRODB archive.

Returns:

Path of the DISDRODB directory.

Return type:

str

disdrodb.api.info.infer_campaign_name_from_path(path: str) str[source][source]#

Return the campaign name from a file or directory path.

Assumption: no data_source, campaign_name, station_name or file name contains the word DISDRODB.

Parameters:

path (str) – Directory or file path within the DISDRODB archive.

Returns:

Name of the campaign.

Return type:

str

disdrodb.api.info.infer_data_source_from_path(path: str) str[source][source]#

Return the data_source from a file or directory path.

Assumption: no data_source, campaign_name, station_name or file name contains the word DISDRODB.

Parameters:

path (str) – Directory or file path within the DISDRODB archive.

Returns:

Name of the data source.

Return type:

str

disdrodb.api.info.infer_disdrodb_tree_path(path: str) str[source][source]#

Return the directory tree path from the archive directory.

Current assumption: no data_source, campaign_name, station_name or file name contains the word DISDRODB.

Parameters:

path (str) – Directory or file path within the DISDRODB archive.

Returns:

Path inside the DISDRODB archive. Format: DISDRODB/RAW/<DATA_SOURCE>/<CAMPAIGN_NAME>/... or DISDRODB/<ARCHIVE_VERSION>/<DATA_SOURCE>/<CAMPAIGN_NAME>/...

Return type:

str

disdrodb.api.info.infer_disdrodb_tree_path_components(path: str) list[source][source]#

Return a list with the components of a DISDRODB path.

Parameters:

path (str) – Directory or file path within the DISDRODB archive.

Returns:

Path elements of the DISDRODB archive. Format: [data_archive_dir, product_version, data_source, campaign_name, …]

Return type:

list

disdrodb.api.info.infer_path_info_dict(path: str) dict[source][source]#

Return a dictionary with the data_archive_dir, data_source and campaign_name of the disdrodb_path.

Parameters:

path (str) – Directory or file path within the DISDRODB archive.

Returns:

Dictionary with the path elements of the DISDRODB archive. Valid keys: "data_archive_dir", "data_source", "campaign_name"

Return type:

dict
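A minimal sketch of how such path elements could be inferred, assuming the tree layout DISDRODB/<RAW or ARCHIVE_VERSION>/<DATA_SOURCE>/<CAMPAIGN_NAME>/... given for infer_disdrodb_tree_path. The function name `path_info` is hypothetical, POSIX-style paths are assumed, and the sketch matches a path component equal to DISDRODB, which may differ from how the library searches the path.

```python
from pathlib import PurePosixPath


def path_info(path):
    """Infer archive directory, data source and campaign name from a path."""
    parts = PurePosixPath(path).parts
    if "DISDRODB" not in parts:
        raise ValueError(f"{path!r} is not inside a DISDRODB archive.")
    idx = parts.index("DISDRODB")
    # Need DISDRODB/<version>/<data_source>/<campaign_name> at least.
    if len(parts) < idx + 4:
        raise ValueError(f"{path!r} does not include a campaign directory.")
    return {
        "data_archive_dir": str(PurePosixPath(*parts[: idx + 1])),
        "data_source": parts[idx + 2],
        "campaign_name": parts[idx + 3],
    }


# DATA_SOURCE and CAMPAIGN are placeholder names.
print(path_info("/data/DISDRODB/RAW/DATA_SOURCE/CAMPAIGN/file.txt"))
```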

disdrodb.api.info.infer_path_info_tuple(path: str) tuple[source][source]#

Return a tuple with the data_archive_dir, data_source and campaign_name of the disdrodb_path.

Parameters:

path (str) – Directory or file path within the DISDRODB archive.

Returns:

Tuple with the path elements of the DISDRODB archive, in the order: data_archive_dir, data_source, campaign_name.

Return type:

tuple

disdrodb.api.io module#

Routines to list and open DISDRODB products.

disdrodb.api.io.filter_by_time(filepaths, start_time=None, end_time=None)[source][source]#

Filter filepaths by start_time and end_time.

Parameters:
  • filepaths (list) – List of filepaths.

  • start_time (datetime.datetime) – Start time. If None, will be set to 1997-01-01.

  • end_time (datetime.datetime) – End time. If None, will be set to the current UTC time.

Returns:

filepaths – List of valid filepaths. If there are no valid filepaths, an empty list is returned.

Return type:

list
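The defaults described above can be sketched as follows. This is an illustration, not the DISDRODB code: the input format (tuples of filepath, file start time and file end time), the function name `filter_periods_by_time` and the overlap criterion used to decide which files are kept are all assumptions.

```python
import datetime


def filter_periods_by_time(file_periods, start_time=None, end_time=None):
    """Keep the filepaths whose time period overlaps [start_time, end_time]."""
    if start_time is None:
        start_time = datetime.datetime(1997, 1, 1)
    if end_time is None:
        # Current UTC time as a naive datetime, to compare with naive inputs.
        end_time = datetime.datetime.now(datetime.timezone.utc).replace(tzinfo=None)
    return [
        filepath
        for filepath, file_start, file_end in file_periods
        # Assumption: a file is kept if its period overlaps the requested one.
        if file_start <= end_time and file_end >= start_time
    ]


periods = [
    ("a.nc", datetime.datetime(2020, 1, 1), datetime.datetime(2020, 1, 2)),
    ("b.nc", datetime.datetime(2021, 1, 1), datetime.datetime(2021, 1, 2)),
]
print(filter_periods_by_time(periods, start_time=datetime.datetime(2020, 6, 1)))
```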

disdrodb.api.io.filter_dataset_by_time(ds, start_time=None, end_time=None)[source][source]#

Subset an xarray.Dataset by time, robust to duplicated/non-monotonic indices.

NOTE: ds.sel(time=slice(start_time, end_time)) fails in the presence of duplicated timesteps because the time ‘index is not monotonic increasing or decreasing’.

Parameters:
  • ds (xr.Dataset) – Dataset with a time coordinate.

  • start_time (np.datetime64 or None) – Inclusive start bound. If None, no lower bound is applied.

  • end_time (np.datetime64 or None) – Inclusive end bound. If None, no upper bound is applied.

Returns:

Subset dataset with the same ordering of timesteps (duplicates preserved).

Return type:

xr.Dataset
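The idea of subsetting with an element-wise mask instead of a slice can be illustrated without xarray. In this sketch, plain Python lists stand in for the dataset's time coordinate and one data variable; the function name `subset_by_time` is hypothetical and the example does not reproduce the library's code.

```python
import datetime


def subset_by_time(times, values, start_time=None, end_time=None):
    """Subset parallel lists with an inclusive, element-wise time mask."""
    # The mask is evaluated per timestep, so the timesteps may be
    # unsorted or duplicated: order and duplicates are preserved.
    mask = [
        (start_time is None or t >= start_time) and (end_time is None or t <= end_time)
        for t in times
    ]
    kept_times = [t for t, keep in zip(times, mask) if keep]
    kept_values = [v for v, keep in zip(values, mask) if keep]
    return kept_times, kept_values


def minute(m):
    return datetime.datetime(2025, 4, 1, 0, m)


times = [minute(2), minute(1), minute(1), minute(5)]  # unsorted, with a duplicate
values = [10, 20, 21, 30]
print(subset_by_time(times, values, start_time=minute(1), end_time=minute(2)))
```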

disdrodb.api.io.filter_filepaths(filepaths, debugging_mode)[source][source]#

Filter out filepaths if debugging_mode=True.

disdrodb.api.io.find_files(data_source, campaign_name, station_name, product, debugging_mode: bool = False, data_archive_dir: str | None = None, metadata_archive_dir: str | None = None, glob_pattern=None, start_time=None, end_time=None, **product_kwargs)[source][source]#

Retrieve DISDRODB product files for a given station.

Parameters:
  • data_source (str) – The name of the institution (for campaigns spanning multiple countries) or the name of the country (for campaigns or sensor networks within a single country). Must be provided in UPPER CASE.

  • campaign_name (str) – The name of the campaign. Must be provided in UPPER CASE.

  • station_name (str) – The name of the station.

  • product (str) – The name of the DISDRODB product.

  • debugging_mode (bool, optional) – If True, it selects a maximum of 3 files for debugging purposes. The default value is False.

  • data_archive_dir (str, optional) – The base directory of DISDRODB, expected in the format <...>/DISDRODB. If not specified, the path specified in the DISDRODB active configuration will be used.

  • glob_pattern (str, optional) – Glob pattern to search for raw data files. The default is "*". The argument is used only if product="RAW".

  • temporal_resolution (str, optional) – The temporal resolution of the product (e.g., “1MIN”, “10MIN”, “1H”). It must be specified only for the L1, L2E and L2M products.

  • model_name (str) – The model name of the statistical distribution for the DSD. It must be specified only for the L2M product.

Returns:

filepaths – List of file paths.

Return type:

list

disdrodb.api.io.is_within_time_period(l_start_time, l_end_time, start_time, end_time)[source][source]#

Assess which files are within the start and end time.

disdrodb.api.io.open_data_archive(data_archive_dir=None)[source][source]#

Open the DISDRODB Data Archive.

disdrodb.api.io.open_dataset(data_source, campaign_name, station_name, product, product_kwargs=None, debugging_mode: bool = False, data_archive_dir: str | None = None, metadata_archive_dir: str | None = None, chunks=-1, parallel=False, compute=False, start_time=None, end_time=None, variables=None, **open_kwargs)[source][source]#

Open the DISDRODB product files of a given station as an xarray.Dataset.

Parameters:
  • data_source (str) – The name of the institution (for campaigns spanning multiple countries) or the name of the country (for campaigns or sensor networks within a single country). Must be provided in UPPER CASE.

  • campaign_name (str) – The name of the campaign. Must be provided in UPPER CASE.

  • station_name (str) – The name of the station.

  • product (str) – The name of the DISDRODB product.

  • debugging_mode (bool, optional) – If True, it selects a maximum of 3 files for debugging purposes. The default value is False.

  • data_archive_dir (str, optional) – The base directory of DISDRODB, expected in the format <...>/DISDRODB. If not specified, the path specified in the DISDRODB active configuration will be used.

  • product_kwargs (dict, optional) – DISDRODB product options. It must be specified only for the L1, L2E and L2M products. For the L1, L2E and L2M products, temporal_resolution is required. For the L2M product, model_name is also required.

  • **open_kwargs (optional) – Additional keyword arguments passed to xarray.open_mfdataset().

Return type:

xarray.Dataset

disdrodb.api.io.open_file_explorer(path)[source][source]#

Open the native file-browser showing ‘path’.

disdrodb.api.io.open_logs_directory(data_source, campaign_name, station_name=None, data_archive_dir=None)[source][source]#

Open the DISDRODB Data Archive logs directory of a station.

disdrodb.api.io.open_metadata_archive(metadata_archive_dir=None)[source][source]#

Open the DISDRODB Metadata Archive.

disdrodb.api.io.open_metadata_directory(data_source, campaign_name, station_name=None, metadata_archive_dir=None)[source][source]#

Open the DISDRODB Metadata Archive station(s) metadata directory.

disdrodb.api.io.open_netcdf_files(filepaths, chunks=-1, start_time=None, end_time=None, variables=None, parallel=False, compute=True, **open_kwargs)[source][source]#

Open DISDRODB netCDF files using xarray.

Using combine="nested" and join="outer" ensures that duplicated timesteps are not overwritten.

disdrodb.api.io.open_product_directory(product, data_source, campaign_name, station_name, data_archive_dir=None)[source][source]#

Open the DISDRODB Data Archive station product directory.

disdrodb.api.io.open_readers_directory()[source][source]#

Open the disdrodb software readers directory.

disdrodb.api.io.remove_product(data_archive_dir, product, data_source, campaign_name, station_name, logger=None, verbose=True, **product_kwargs)[source][source]#

Remove all product files of a specific station.

disdrodb.api.path module#

Define paths within the DISDRODB infrastructure.

disdrodb.api.path.define_campaign_dir(archive_dir, product, data_source, campaign_name, check_exists=False)[source][source]#

Return the campaign directory in the DISDRODB infrastructure.

If product="METADATA", it returns the path in the DISDRODB Metadata Archive. Otherwise, it returns the path in the DISDRODB Data Archive.

Parameters:
  • product (str) – The DISDRODB product. See disdrodb.available_products(). If “METADATA” is specified, it returns the path in the DISDRODB Metadata Archive.

  • data_source (str) – The data source. Must be specified if campaign_name is specified.

  • campaign_name (str) – The campaign name.

  • archive_dir (str, optional) – The base directory of DISDRODB, expected in the format <...>/DISDRODB. If not specified, the path specified in the DISDRODB active configuration will be used.

  • check_exists (bool, optional) – Whether to check if the directory exists. The default value is False.

Returns:

Campaign directory path.

Return type:

str

disdrodb.api.path.define_config_dir(product)[source][source]#

Define the config directory path of a given DISDRODB product.

disdrodb.api.path.define_data_dir(product, data_source, campaign_name, station_name, data_archive_dir=None, check_exists=False, **product_kwargs)[source][source]#

Return the station product data directory in the DISDRODB infrastructure.

Parameters:
  • product (str) – The DISDRODB product. See disdrodb.available_products().

  • data_source (str) – The data source.

  • campaign_name (str) – The campaign name.

  • station_name (str) – The station name.

  • data_archive_dir (str, optional) – The base directory of DISDRODB, expected in the format <...>/DISDRODB. If not specified, the path specified in the DISDRODB active configuration will be used.

  • check_exists (bool, optional) – Whether to check if the directory exists. The default value is False.

  • temporal_resolution (str, optional) – The temporal resolution of the product. It must be specified only for the L1, L2E and L2M products.

  • model_name (str) – The name of the fitted statistical distribution for the DSD. It must be specified only for the L2M product.

Returns:

data_dir – Station data directory path

Return type:

str

disdrodb.api.path.define_data_source_dir(archive_dir, product, data_source, check_exists=False)[source][source]#

Return the data source directory in the DISDRODB infrastructure.

If product="METADATA", it returns the path in the DISDRODB Metadata Archive. Otherwise, it returns the path in the DISDRODB Data Archive.

Parameters:
  • product (str) – The DISDRODB product. See disdrodb.available_products(). If “METADATA” is specified, it returns the path in the DISDRODB Metadata Archive.

  • data_source (str) – The data source.

  • archive_dir (str, optional) – The base directory of DISDRODB, expected in the format <...>/DISDRODB. If not specified, the path specified in the DISDRODB active configuration will be used.

  • check_exists (bool, optional) – Whether to check if the directory exists. If True, an error is raised if the directory does not exist. The default value is False.

Returns:

Data source directory path.

Return type:

str

disdrodb.api.path.define_disdrodb_path(archive_dir, product, data_source='', campaign_name='', check_exists=True)[source][source]#

Return the directory path in the DISDRODB Metadata and Data Archive.

If product="METADATA", it returns the path in the DISDRODB Metadata Archive. Otherwise, it returns the path in the DISDRODB Data Archive.

If data_source and campaign_name are not specified, it returns the product directory.

If data_source is specified, it returns the data_source directory.

If campaign_name is specified, it returns the campaign_name directory.

Parameters:
  • archive_dir (str) – The DISDRODB archive directory

  • product (str) – The DISDRODB product. See disdrodb.available_products(). If “METADATA” is specified, it returns the path in the DISDRODB Metadata Archive.

  • data_source (str, optional) – The data source. Must be specified if campaign_name is specified.

  • campaign_name (str, optional) – The campaign name.

  • check_exists (bool, optional) – Whether to check if the directory exists. If True, an error is raised if the directory does not exist. The default value is True.

Returns:

dir_path – Directory path

Return type:

str

disdrodb.api.path.define_file_folder_path(obj, dir_path, folder_partitioning)[source][source]#

Define the folder path where a file is to be saved, based on the dataset’s start time.

Parameters:
  • obj (xarray.Dataset or pandas.DataFrame) – The object containing time information.

  • dir_path (str) – Directory within the DISDRODB Data Archive where DISDRODB product files are to be saved. It can be a product directory or a logs directory.

  • folder_partitioning (str or None) –

    Defines the subdirectory structure in which files are saved. Allowed values are:

    • None or "": Files are saved directly in dir_path.

    • "year": Files are saved under a subdirectory for the year.

    • "year/month": Files are saved under subdirectories for year and month.

    • "year/month/day": Files are saved under subdirectories for year, month and day.

    • "year/month_name": Files are saved under subdirectories for year and month name.

    • "year/quarter": Files are saved under subdirectories for year and quarter.

Returns:

A complete directory path where the file should be saved.

Return type:

str

disdrodb.api.path.define_filename(product: str, campaign_name: str, station_name: str, start_time=None, end_time=None, add_version=True, add_time_period=True, add_extension=True, prefix='', suffix='', **product_kwargs) str[source][source]#

Define DISDRODB products filename.

Parameters:
  • product (str) – The DISDRODB product. See disdrodb.available_products().

  • campaign_name (str) – Name of the campaign.

  • station_name (str) – Name of the station.

  • start_time (datetime.datetime, optional) – Start time. Required if add_time_period = True.

  • end_time (datetime.datetime, optional) – End time. Required if add_time_period = True.

  • temporal_resolution (str, optional) – The temporal resolution of the product. It must be specified only for the L1, L2E and L2M products.

  • model_name (str) – The model name of the fitted statistical distribution for the DSD. It must be specified only for the L2M product.

Returns:

DISDRODB product file name.

Return type:

str

disdrodb.api.path.define_issue_dir(data_source, campaign_name, metadata_archive_dir=None, check_exists=False)[source][source]#

Return the issue directory in the DISDRODB infrastructure.

Parameters:
  • data_source (str) – The data source.

  • campaign_name (str) – The campaign name.

  • metadata_archive_dir (str, optional) – The base directory of the DISDRODB Metadata Archive, expected in the format <...>/DISDRODB. If not specified, the path specified in the DISDRODB active configuration will be used.

  • check_exists (bool, optional) – Whether to check if the directory exists. The default value is False.

Returns:

issue_dir – Issue directory path.

Return type:

str

disdrodb.api.path.define_issue_filepath(data_source, campaign_name, station_name, metadata_archive_dir=None, check_exists=False)[source][source]#

Return the station issue filepath in the DISDRODB infrastructure.

Parameters:
  • data_source (str) – The data source.

  • campaign_name (str) – The campaign name.

  • station_name (str) – The station name.

  • metadata_archive_dir (str, optional) – The base directory of the DISDRODB Metadata Archive, expected in the format <...>/DISDRODB. If not specified, the path specified in the DISDRODB active configuration will be used.

  • check_exists (bool, optional) – Whether to check if the file exists. The default value is False.

Returns:

Station issue file path.

Return type:

str

disdrodb.api.path.define_l0a_filename(df, campaign_name: str, station_name: str) str[source][source]#

Define L0A file name.

Parameters:
  • df (pandas.DataFrame) – L0A DataFrame.

  • campaign_name (str) – Name of the campaign.

  • station_name (str) – Name of the station.

Returns:

L0A file name.

Return type:

str

disdrodb.api.path.define_l0b_filename(ds, campaign_name: str, station_name: str) str[source][source]#

Define L0B file name.

disdrodb.api.path.define_l0c_filename(ds, campaign_name: str, station_name: str) str[source][source]#

Define L0C file name.

disdrodb.api.path.define_l1_filename(ds, campaign_name, station_name: str, temporal_resolution: str) str[source][source]#

Define L1 file name.

disdrodb.api.path.define_l2e_filename(ds, campaign_name: str, station_name: str, temporal_resolution: str) str[source][source]#

Define L2E file name.

disdrodb.api.path.define_l2m_filename(ds, campaign_name: str, station_name: str, temporal_resolution: str, model_name: str) str[source][source]#

Define L2M file name.

disdrodb.api.path.define_logs_dir(product, data_source, campaign_name, station_name, data_archive_dir=None, check_exists=False, **product_kwargs)[source][source]#

Return the station log directory in the DISDRODB infrastructure.

Parameters:
  • product (str) – The DISDRODB product. See disdrodb.available_products().

  • data_source (str) – The data source.

  • campaign_name (str) – The campaign name.

  • station_name (str) – The station name.

  • data_archive_dir (str, optional) – The base directory of DISDRODB, expected in the format <...>/DISDRODB. If not specified, the path specified in the DISDRODB active configuration will be used.

  • check_exists (bool, optional) – Whether to check if the directory exists. The default value is False.

Returns:

Station logs directory path.

Return type:

str

disdrodb.api.path.define_metadata_dir(data_source, campaign_name, metadata_archive_dir=None, check_exists=False)[source][source]#

Return the metadata directory in the DISDRODB infrastructure.

Parameters:
  • data_source (str) – The data source.

  • campaign_name (str) – The campaign name.

  • data_archive_dir (str, optional) – The base directory of DISDRODB, expected in the format <...>/DISDRODB. If not specified, the path specified in the DISDRODB active configuration will be used.

  • check_exists (bool, optional) – Whether to check if the directory exists. The default value is False.

Returns:

metadata_dir – Metadata directory path

Return type:

str

disdrodb.api.path.define_metadata_filepath(data_source, campaign_name, station_name, metadata_archive_dir=None, check_exists=False)[source][source]#

Return the station metadata filepath in the DISDRODB infrastructure.

Parameters:
  • data_source (str) – The data source.

  • campaign_name (str) – The campaign name.

  • station_name (str) – The station name.

  • metadata_archive_dir (str, optional) – The base directory of the DISDRODB Metadata Archive, expected in the format <...>/DISDRODB. If not specified, the path specified in the DISDRODB active configuration will be used.

  • check_exists (bool, optional) – Whether to check if the directory exists. The default value is False.

Returns:

metadata_filepath – Station metadata file path

Return type:

str

disdrodb.api.path.define_partitioning_tree(time, folder_partitioning)[source][source]#

Define the time directory tree given a timestep.

Parameters:
  • time (datetime.datetime) – Timestep.

  • folder_partitioning (str or None) –

    Define the subdirectory structure where saving files. Allowed values are:

    • None: Files are saved directly in data_dir.

    • "year": Files are saved under a subdirectory for the year.

    • "year/month": Files are saved under subdirectories for year and month.

    • "year/month/day": Files are saved under subdirectories for year, month and day.

    • "year/month_name": Files are saved under subdirectories for year and month name.

    • "year/quarter": Files are saved under subdirectories for year and quarter.

Returns:

A time partitioned directory tree.

Return type:

str
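
The partitioning options listed above can be illustrated with a minimal, self-contained sketch. The helper name sketch_partitioning_tree is hypothetical and this is not the library's implementation; it only assumes that each scheme maps a timestep to a relative directory path, with zero-padded months and days and English month names.

```python
import calendar
import datetime
import os


def sketch_partitioning_tree(time, folder_partitioning):
    """Illustrative only: map a timestep to a relative directory tree."""
    # Assumption: None (or an empty string) means no subdirectory at all.
    if folder_partitioning in (None, ""):
        return ""
    year = f"{time.year:04d}"
    if folder_partitioning == "year":
        return year
    if folder_partitioning == "year/month":
        return os.path.join(year, f"{time.month:02d}")
    if folder_partitioning == "year/month/day":
        return os.path.join(year, f"{time.month:02d}", f"{time.day:02d}")
    if folder_partitioning == "year/month_name":
        # calendar.month_name follows the active locale (English by default).
        return os.path.join(year, calendar.month_name[time.month])
    if folder_partitioning == "year/quarter":
        quarter = (time.month - 1) // 3 + 1
        return os.path.join(year, f"Q{quarter}")
    raise ValueError(f"Invalid folder_partitioning: {folder_partitioning!r}")


timestep = datetime.datetime(2025, 4, 1)
for scheme in (None, "year", "year/month", "year/month/day", "year/quarter"):
    print(scheme, "->", sketch_partitioning_tree(timestep, scheme))
```

Raising a ValueError for unknown schemes is a design choice of this sketch, so that a misspelled scheme fails early instead of silently writing files to an unexpected location.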

disdrodb.api.path.define_product_dir_tree(product, **product_kwargs)[source][source]#

Return the product directory tree.

Parameters:
  • product (str) – The DISDRODB product. See disdrodb.available_products().

  • temporal_resolution (str, optional) – The temporal resolution of the product. It must be specified only for the L1, L2E and L2M products.

  • model_name (str) – The custom model name of the fitted statistical distribution. It must be specified only for the L2M product.

Returns:

data_dir – Product directory tree

Return type:

str
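
The rule for which keyword arguments each product requires can be sketched as a small validation helper. The function name check_product_kwargs_sketch and the error messages are hypothetical and not part of the documented API; the sketch only encodes what the parameter descriptions state: temporal_resolution for L1, L2E and L2M, and model_name for L2M only.

```python
# Assumption: requirements taken from the parameter descriptions above.
REQUIRED_KWARGS = {
    "L1": {"temporal_resolution"},
    "L2E": {"temporal_resolution"},
    "L2M": {"temporal_resolution", "model_name"},
}


def check_product_kwargs_sketch(product, **product_kwargs):
    """Illustrative only: validate the product-specific keyword arguments."""
    required = REQUIRED_KWARGS.get(product, set())
    missing = required - set(product_kwargs)
    unexpected = set(product_kwargs) - required
    if missing:
        raise ValueError(f"{product} requires: {sorted(missing)}")
    if unexpected:
        raise ValueError(f"{product} does not accept: {sorted(unexpected)}")
    return dict(product_kwargs)


# The values below are placeholders chosen for illustration.
print(check_product_kwargs_sketch("L0B"))
print(check_product_kwargs_sketch("L2E", temporal_resolution="1MIN"))
```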

disdrodb.api.path.define_station_dir(product, data_source, campaign_name, station_name, data_archive_dir=None, check_exists=False)[source][source]#

Return the station product directory in the DISDRODB infrastructure.

Parameters:
  • product (str) – The DISDRODB product. See disdrodb.available_products().

  • data_source (str) – The data source.

  • campaign_name (str) – The campaign name.

  • station_name (str) – The station name.

  • data_archive_dir (str, optional) – The base directory of DISDRODB, expected in the format <...>/DISDRODB. If not specified, the path specified in the DISDRODB active configuration will be used.

  • check_exists (bool, optional) – Whether to check if the directory exists. The default value is False.

Returns:

station_dir – Station data directory path

Return type:

str

disdrodb.api.path.define_temporal_resolution(seconds, rolling)[source][source]#

Define the DISDRODB product temporal resolution.

Prefix the measurement interval with ROLL if rolling=True.
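
As a rough illustration of the ROLL prefix, the following sketch builds a resolution label from a number of seconds. The function name sketch_temporal_resolution is hypothetical, and the unit suffixes (D, H, MIN, S) are an assumption made for this example; only the ROLL prefix behaviour comes from the description above.

```python
def sketch_temporal_resolution(seconds, rolling):
    """Illustrative only: build a label such as '1MIN' or 'ROLL10MIN'."""
    if seconds <= 0:
        raise ValueError("seconds must be a positive integer.")
    parts = []
    remainder = int(seconds)
    # Assumption: the interval is split into days, hours, minutes and seconds.
    for unit, size in (("D", 86400), ("H", 3600), ("MIN", 60), ("S", 1)):
        count, remainder = divmod(remainder, size)
        if count:
            parts.append(f"{count}{unit}")
    label = "".join(parts)
    # Prefix with ROLL when the product uses a rolling window.
    return f"ROLL{label}" if rolling else label


print(sketch_temporal_resolution(60, rolling=False))
print(sketch_temporal_resolution(600, rolling=True))
```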

disdrodb.api.search module#

disdrodb.api.search.available_campaigns(product=None, data_sources=None, station_names=None, available_data=False, raise_error_if_empty=False, invalid_fields_policy='raise', data_archive_dir=None, metadata_archive_dir=None, **kwargs)[source][source]#

Return campaign names for which stations are available.

disdrodb.api.search.available_data_sources(product=None, campaign_names=None, station_names=None, available_data=False, raise_error_if_empty=False, invalid_fields_policy='raise', data_archive_dir=None, metadata_archive_dir=None, **kwargs)[source][source]#

Return data sources for which stations are available.

disdrodb.api.search.available_stations(product=None, data_sources=None, campaign_names=None, station_names=None, return_tuple=True, available_data=False, raise_error_if_empty=False, invalid_fields_policy='raise', data_archive_dir=None, metadata_archive_dir=None, **filter_kwargs)[source][source]#

Return stations information for which metadata or product data are available on disk.

This function queries the DISDRODB Metadata Archive and, optionally, the local DISDRODB Data Archive to identify stations that satisfy the specified filters.

If the DISDRODB product is not specified, it lists the stations present in the DISDRODB Metadata Archive given the specified filtering criteria. If the DISDRODB product is specified, it lists the stations present in the local DISDRODB Data Archive given the specified filtering criteria.

Parameters:
  • product (str or None, optional) –

    Name of the product to filter on (e.g., “RAW”, “L0A”, “L1”).

    If the DISDRODB product is not specified (default), it lists the stations present in the DISDRODB Metadata Archive given the specified filtering criteria.

    If the DISDRODB product is specified, it lists the stations present in the local DISDRODB Data Archive given the specified filtering criteria. The default is None.

  • data_sources (str or sequence of str, optional) – One or more data source identifiers to filter stations by. The name(s) must be UPPER CASE. If None, no filtering on data source is applied. The default is None.

  • campaign_names (str or sequence of str, optional) – One or more campaign names to filter stations by. The name(s) must be UPPER CASE. If None, no filtering on campaign is applied. The default is None.

  • station_names (str or sequence of str, optional) – One or more station names to include. If None, all stations matching other filters are considered. The default is None.

  • available_data (bool, optional) –

    If product is not specified:

    • if available_data is False, return stations present in the DISDRODB Metadata Archive

    • if available_data is True, return stations with data available on the online DISDRODB Decentralized Data Archive (i.e., stations with the disdrodb_data_url in the metadata).

    If product is specified:

    • if available_data is False, return stations where the product directory exists in the local DISDRODB Data Archive.

    • if available_data is True, return stations where product data exists in the local DISDRODB Data Archive.

    The default is False.

  • return_tuple (bool, optional) – If True, return a list of tuples (data_source, campaign_name, station_name). If False, return only a list of station names. The default is True.

  • raise_error_if_empty (bool, optional) – If True and no stations satisfy the criteria, raise a ValueError. If False, return an empty list/tuple. The default is False.

  • invalid_fields_policy ({'raise', 'warn', 'ignore'}, optional) –

    How to handle invalid filter values for data_sources, campaign_names, or station_names that are not present in the metadata archive:

    • 'raise': raise a ValueError (default)

    • 'warn': emit a warning, then ignore invalid entries

    • 'ignore': silently drop invalid entries

  • data_archive_dir (str or Path-like, optional) – Path to the root of the local DISDRODB Data Archive. Required only if product is specified. If None, the default data archive base directory is used. The default is None.

  • metadata_archive_dir (str or Path-like, optional) – Path to the root of the DISDRODB Metadata Archive. If None, the default metadata base directory is used. Default is None.

  • **product_kwargs (dict, optional) – Additional arguments required for some products. They must be specified only for the L1, L2E and L2M products. For the L1, L2E and L2M products, temporal_resolution is required. For the L2M product, model_name is also required.

Returns:

If return_tuple=True, return a list of tuples (data_source, campaign_name, station_name). If return_tuple=False, return a list of station names.

Return type:

list

Examples

>>> # List all stations present in the DISDRODB Metadata Archive
>>> stations = available_stations()
>>> # List all stations with data on the online DISDRODB Decentralized Data Archive
>>> stations = available_stations(available_data=True)
>>> # List stations with raw data available in the local DISDRODB Data Archive
>>> raw_stations = available_stations(product="RAW", available_data=True)
>>> # List stations of specific data sources
>>> stations = available_stations(data_sources=["NASA", "EPFL"])
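
The three invalid_fields_policy modes described above can be sketched in isolation. The helper filter_valid_values is hypothetical and does not reproduce the library's internals; it only assumes that invalid entries are those absent from the set of names found in the metadata archive. The name "UNKNOWN" is a made-up invalid data source.

```python
import warnings


def filter_valid_values(requested, available, policy="raise"):
    """Illustrative only: apply a 'raise' / 'warn' / 'ignore' policy."""
    if policy not in {"raise", "warn", "ignore"}:
        raise ValueError(f"Invalid policy: {policy!r}")
    invalid = [value for value in requested if value not in available]
    if invalid and policy == "raise":
        raise ValueError(f"Invalid values: {invalid}")
    if invalid and policy == "warn":
        warnings.warn(f"Ignoring invalid values: {invalid}", stacklevel=2)
    # With 'warn' and 'ignore', invalid entries are dropped from the result.
    return [value for value in requested if value in available]


available_sources = ["EPFL", "NASA"]
# 'ignore' silently drops the unknown entry and keeps the valid one.
print(filter_valid_values(["NASA", "UNKNOWN"], available_sources, policy="ignore"))
```

Defaulting to 'raise' surfaces typos in filter values immediately, whereas 'warn' and 'ignore' are more convenient when looping over many candidate names.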

disdrodb.api.search.get_required_product(product)[source][source]#

Determine the required product for input product processing.

disdrodb.api.search.is_disdrodb_data_url_specified(metadata_filepath)[source][source]#

Check if the disdrodb_data_url is specified in the metadata file.

disdrodb.api.search.list_campaign_names(metadata_archive_dir, data_sources=None, campaign_names=None, invalid_fields_policy='raise', return_tuple=False)[source][source]#

List campaign names in the DISDRODB Metadata Archive.

disdrodb.api.search.list_data_sources(metadata_archive_dir, data_sources=None, invalid_fields_policy='raise')[source][source]#

List data source names in the DISDRODB Metadata Archive.

disdrodb.api.search.list_station_names(metadata_archive_dir, data_sources=None, campaign_names=None, station_names=None, invalid_fields_policy='raise', return_tuple=False)[source][source]#

List station names in the DISDRODB Metadata Archive.

disdrodb.api.search.select_stations_matching_metadata_values(metadata_archive_dir, list_info, filter_kwargs)[source][source]#

Keep only the stations with the specified metadata key matching the specified value.

disdrodb.api.search.select_stations_with_disdrodb_data_url(metadata_archive_dir, list_info)[source][source]#

Keep only the stations with disdrodb_data_url specified in the metadata file.

disdrodb.api.search.select_stations_with_product_data(data_archive_dir, product, list_info, **product_kwargs)[source][source]#

Keep only the stations with product data.

disdrodb.api.search.select_stations_with_product_directory(data_archive_dir, product, list_info)[source][source]#

Keep only the stations with the product directory.

Module contents#