disdrodb.api package#
Submodules#
disdrodb.api.checks module#
DISDRODB Checks Functions.
- disdrodb.api.checks.check_campaign_name(campaign_name)[source][source]#
Check that the campaign name is upper case.
- disdrodb.api.checks.check_campaign_names(campaign_names)[source][source]#
Check DISDRODB campaign names.
- disdrodb.api.checks.check_data_archive_dir(data_archive_dir: str)[source][source]#
Raise an error if the path does not end with DISDRODB.
- disdrodb.api.checks.check_data_availability(product, data_source, campaign_name, station_name, data_archive_dir=None, **product_kwargs)[source][source]#
Check the station product data directory has files inside. If not, raise an error.
- disdrodb.api.checks.check_data_source(data_source)[source][source]#
Check that the data_source name is upper case.
- disdrodb.api.checks.check_directories_inside(dir_path)[source][source]#
Check that there are directories inside the specified dir_path.
- disdrodb.api.checks.check_folder_partitioning(folder_partitioning)[source][source]#
Check if the given folder partitioning scheme is valid.
- Parameters:
folder_partitioning (str or None) –
Defines the subdirectory structure based on the dataset’s start time. Allowed values are:
"": No additional subdirectories; files are saved directly in data_dir.
"year": Files are stored under a subdirectory for the year (<data_dir>/2025).
"year/month": Files are stored under subdirectories by year and month (<data_dir>/2025/04).
"year/month/day": Files are stored under subdirectories by year, month and day (<data_dir>/2025/04/01).
"year/month_name": Files are stored under subdirectories by year and month name (<data_dir>/2025/April).
"year/quarter": Files are stored under subdirectories by year and quarter (<data_dir>/2025/Q2).
- Returns:
The verified folder partitioning scheme.
- Return type:
folder_partitioning
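As a rough illustration of the check described above, the following sketch validates a scheme against the allowed values. It is not the library's implementation: the helper name `validate_folder_partitioning` and the mapping of `None` to `""` are assumptions made for this example.

```python
# Hypothetical helper, not part of disdrodb: mirrors the allowed values listed above.
ALLOWED_PARTITIONING = (
    "",
    "year",
    "year/month",
    "year/month/day",
    "year/month_name",
    "year/quarter",
)


def validate_folder_partitioning(folder_partitioning):
    """Return the scheme if it is valid, otherwise raise an error."""
    # Assumption: None is treated as "no additional subdirectories".
    if folder_partitioning is None:
        return ""
    if not isinstance(folder_partitioning, str):
        raise TypeError("folder_partitioning must be a string or None.")
    if folder_partitioning not in ALLOWED_PARTITIONING:
        raise ValueError(
            f"Invalid folder_partitioning {folder_partitioning!r}. "
            f"Allowed values: {ALLOWED_PARTITIONING}"
        )
    return folder_partitioning
```

Returning the verified value lets a caller normalize and validate the argument in a single assignment.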
- disdrodb.api.checks.check_invalid_fields_policy(invalid_fields)[source][source]#
Check invalid fields policy.
- disdrodb.api.checks.check_issue_dir(data_source, campaign_name, metadata_archive_dir=None)[source][source]#
Check the existence of the issue directory. If it does not exist, raise an error.
- disdrodb.api.checks.check_issue_file(data_source, campaign_name, station_name, metadata_archive_dir=None)[source][source]#
Check the existence of a valid issue YAML file. If it does not exist, raise an error.
- disdrodb.api.checks.check_measurement_interval(measurement_interval)[source][source]#
Check measurement interval validity.
- disdrodb.api.checks.check_measurement_intervals(measurement_intervals)[source][source]#
Check measurement intervals.
The input can be a list. Each value must be a positive natural number.
- disdrodb.api.checks.check_metadata_archive_dir(metadata_archive_dir: str)[source][source]#
Raise an error if the path does not end with DISDRODB.
- disdrodb.api.checks.check_metadata_file(metadata_archive_dir, data_source, campaign_name, station_name, check_validity=True)[source][source]#
Check the existence of a valid metadata YAML file. If it does not exist, raise an error.
- disdrodb.api.checks.check_path(path: str) None[source][source]#
Check if a path exists.
- Parameters:
path (str) – Path to check.
- Raises:
FileNotFoundError – If the path does not exist.
- disdrodb.api.checks.check_path_is_a_directory(dir_path, path_name='')[source][source]#
Check that the path exists and is a directory.
- disdrodb.api.checks.check_product_kwargs(product, product_kwargs)[source][source]#
Validate that product_kwargs for a given product contains exactly the required parameters.
- Parameters:
- Returns:
The validated product_kwargs.
- Return type:
- Raises:
ValueError – If required arguments are missing or if there are unexpected extra arguments.
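The "exactly the required parameters" rule can be sketched as a set comparison. This is a minimal sketch, not the library's code: the helper name and the `REQUIRED_PRODUCT_KWARGS` table are hypothetical, with the required keys inferred from the parameter descriptions on this page (`sample_interval` and `rolling` for L2E and L2M, plus `model_name` for L2M).

```python
# Assumption: required arguments per product, inferred from this page.
REQUIRED_PRODUCT_KWARGS = {
    "L2E": {"sample_interval", "rolling"},
    "L2M": {"sample_interval", "rolling", "model_name"},
}


def validate_product_kwargs(product, product_kwargs):
    """Hypothetical helper: require exactly the expected keys for the product."""
    # Products without an entry are assumed to take no extra arguments.
    required = REQUIRED_PRODUCT_KWARGS.get(product, set())
    provided = set(product_kwargs)
    missing = required - provided
    extra = provided - required
    if missing:
        raise ValueError(f"Missing arguments for {product}: {sorted(missing)}")
    if extra:
        raise ValueError(f"Unexpected arguments for {product}: {sorted(extra)}")
    return product_kwargs
```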
- disdrodb.api.checks.check_sample_interval(sample_interval)[source][source]#
Check sample_interval argument validity.
- disdrodb.api.checks.check_sensor_name(sensor_name: str) None[source][source]#
Check sensor name.
- Parameters:
- Raises:
TypeError – If sensor_name is not a string.
ValueError – If the input sensor name is not found in the list of available sensors.
- disdrodb.api.checks.check_station_names(station_names)[source][source]#
Check DISDRODB station names.
- disdrodb.api.checks.check_valid_fields(fields, available_fields, field_name, invalid_fields_policy='raise')[source][source]#
Check if fields are valid.
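The `'raise'`, `'warn'` and `'ignore'` policies documented for `available_stations` on this page suggest the following behaviour. The sketch below is an assumption about how such a policy could be applied, not the library's implementation; `filter_valid_fields` is a hypothetical name.

```python
import warnings


def filter_valid_fields(fields, available_fields, field_name, invalid_fields_policy="raise"):
    """Hypothetical helper: apply a 'raise' / 'warn' / 'ignore' policy to invalid entries."""
    if invalid_fields_policy not in ("raise", "warn", "ignore"):
        raise ValueError(f"Invalid policy: {invalid_fields_policy!r}")
    # Accept a single string as well as a sequence of strings.
    if isinstance(fields, str):
        fields = [fields]
    invalid = [field for field in fields if field not in available_fields]
    if invalid and invalid_fields_policy == "raise":
        raise ValueError(f"Invalid {field_name}: {invalid}")
    if invalid and invalid_fields_policy == "warn":
        warnings.warn(f"Ignoring invalid {field_name}: {invalid}", stacklevel=2)
    # With 'warn' and 'ignore', invalid entries are dropped from the result.
    return [field for field in fields if field in available_fields]
```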
disdrodb.api.configs module#
Retrieve sensor configuration files.
- disdrodb.api.configs.available_sensor_names() list[source][source]#
Get available names of sensors.
- Returns:
sensor_names – Sorted list of the available sensors
- Return type:
- disdrodb.api.configs.get_sensor_configs_dir(sensor_name: str, product: str) str[source][source]#
Retrieve configs directory.
- Parameters:
- Returns:
config_sensor_dir – Config directory.
- Return type:
- Raises:
ValueError – Error if the config directory does not exist.
disdrodb.api.create_directories module#
Tools to create RAW, L0A and L0B DISDRODB directories.
- disdrodb.api.create_directories.create_data_directory(data_archive_dir, product, data_source, campaign_name, station_name, **product_kwargs)[source][source]#
Create station product data directory.
- disdrodb.api.create_directories.create_initial_station_structure(data_source, campaign_name, station_name, data_archive_dir=None, metadata_archive_dir=None)[source][source]#
Create the DISDRODB Data and Metadata Archive structure for a single station.
- disdrodb.api.create_directories.create_issue_directory(metadata_archive_dir, data_source, campaign_name)[source][source]#
Create issue directory.
- disdrodb.api.create_directories.create_l0_directory_structure(data_archive_dir, metadata_archive_dir, data_source, campaign_name, station_name, force, product)[source][source]#
Create directory structure for the first L0 DISDRODB product.
If the input data are raw text files, use product = "L0A".
If the input data are raw netCDF files, use product = "L0B".
product = "L0A" will call run_l0a.
product = "L0B" will call run_l0b_nc.
- disdrodb.api.create_directories.create_logs_directory(product, data_source, campaign_name, station_name, data_archive_dir=None, **product_kwargs)[source][source]#
Initialize the logs directory structure for a DISDRODB product.
- disdrodb.api.create_directories.create_metadata_directory(metadata_archive_dir, data_source, campaign_name)[source][source]#
Create metadata directory.
- disdrodb.api.create_directories.create_product_directory(data_source, campaign_name, station_name, product, force, data_archive_dir=None, metadata_archive_dir=None, **product_kwargs)[source][source]#
Initialize the directory structure for a DISDRODB product.
If product files already exist:
- If force=True, it removes all existing data inside the product directory.
- If force=False, it raises an error.
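The `force` semantics can be illustrated on a temporary directory. This is a sketch under stated assumptions, not the library's code: `prepare_output_dir` is a hypothetical helper, and the use of `ValueError` for the `force=False` case is a guess at the error type.

```python
import os
import shutil
import tempfile


def prepare_output_dir(dir_path, force):
    """Hypothetical helper: create dir_path, handling existing content according to force."""
    if os.path.isdir(dir_path) and os.listdir(dir_path):
        if not force:
            # Assumption: the error type raised by the library may differ.
            raise ValueError(f"{dir_path} already contains files. Use force=True to overwrite.")
        shutil.rmtree(dir_path)
    os.makedirs(dir_path, exist_ok=True)
    return dir_path


with tempfile.TemporaryDirectory() as tmp_dir:
    product_dir = os.path.join(tmp_dir, "L0A")
    prepare_output_dir(product_dir, force=False)  # directory does not exist yet
    open(os.path.join(product_dir, "old_file.txt"), "w").close()
    prepare_output_dir(product_dir, force=True)  # existing content is removed
    remaining = os.listdir(product_dir)
```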
- disdrodb.api.create_directories.create_test_archive(test_data_archive_dir, data_source, campaign_name, station_name, data_archive_dir=None, metadata_archive_dir=None, force=False)[source][source]#
Create test DISDRODB Archive for a single existing station.
This function is used to make a copy of the metadata and issue files of a station. This makes it possible to test data download and DISDRODB processing.
disdrodb.api.info module#
Retrieve file information from DISDRODB products file names and filepaths.
- disdrodb.api.info.get_campaign_name_from_filepaths(filepaths)[source][source]#
Return the DISDRODB campaign name of the specified files.
- disdrodb.api.info.get_end_time_from_filepaths(filepaths)[source][source]#
Return the end time of the specified files.
- disdrodb.api.info.get_info_from_filepath(filepath)[source][source]#
Retrieve file information dictionary from filepath.
- disdrodb.api.info.get_key_from_filepath(filepath, key)[source][source]#
Extract specific key information from a filepath.
- disdrodb.api.info.get_key_from_filepaths(filepaths, key)[source][source]#
Extract specific key information from a list of filepaths.
- disdrodb.api.info.get_product_from_filepaths(filepaths)[source][source]#
Return the DISDRODB product name of the specified files.
- disdrodb.api.info.get_sample_interval_from_filepaths(filepaths)[source][source]#
Return the sample interval of the specified files.
- disdrodb.api.info.get_season(time)[source][source]#
Get season from datetime.datetime or datetime.date object.
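A season lookup of this kind is often a simple month-to-label mapping. The sketch below assumes meteorological seasons labelled by the initials of their months (`DJF`, `MAM`, `JJA`, `SON`); the labels actually returned by `get_season` are not stated on this page, and `season_from_time` is a hypothetical name.

```python
import datetime


def season_from_time(time):
    """Hypothetical helper: map a date or datetime to a three-letter season label."""
    # Assumption: meteorological seasons, labelled by the initials of their months.
    month_to_season = {
        12: "DJF", 1: "DJF", 2: "DJF",
        3: "MAM", 4: "MAM", 5: "MAM",
        6: "JJA", 7: "JJA", 8: "JJA",
        9: "SON", 10: "SON", 11: "SON",
    }
    # Both datetime.date and datetime.datetime expose a .month attribute.
    return month_to_season[time.month]
```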
- disdrodb.api.info.get_start_end_time_from_filepaths(filepaths)[source][source]#
Return the start and end time of the specified files.
- disdrodb.api.info.get_start_time_from_filepaths(filepaths)[source][source]#
Return the start time of the specified files.
- disdrodb.api.info.get_station_name_from_filepaths(filepaths)[source][source]#
Return the DISDRODB station name of the specified files.
- disdrodb.api.info.get_time_component(time, component)[source][source]#
Get time component from datetime.datetime object.
- disdrodb.api.info.get_version_from_filepaths(filepaths)[source][source]#
Return the DISDRODB product version of the specified files.
- disdrodb.api.info.group_filepaths(filepaths, groups=None)[source][source]#
Group filepaths in a dictionary if groups are specified.
- Parameters:
filepaths (list) – List of filepaths.
groups (list or str) – The group keys by which to group the filepaths. Valid group keys are product, subproduct, campaign_name, station_name, start_time, end_time, accumulation_acronym, sample_interval, data_format, year, month, day, doy, dow, hour, minute, second, month_name, quarter, season. The time components are extracted from start_time. If groups is None, the input filepaths list is returned. The default value is None.
- Returns:
Either a dictionary of format {<group_value>: <list_filepaths>} or the original input filepaths (if groups=None).
- Return type:
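The grouping behaviour can be sketched independently of how file names are parsed. In the example below the per-file information is supplied as a pre-built dictionary (a hypothetical stand-in for what `get_info_from_filepath` would provide), and joining several group values with `/` is an assumption about the key format.

```python
from collections import defaultdict


def group_filepaths_by(info_by_filepath, groups=None):
    """Hypothetical helper: group filepaths by one or more info keys.

    info_by_filepath maps each filepath to a dictionary of already-parsed information.
    """
    if groups is None:
        return list(info_by_filepath)
    if isinstance(groups, str):
        groups = [groups]
    grouped = defaultdict(list)
    for filepath, info in info_by_filepath.items():
        # Assumption: multiple group values are joined with "/".
        group_value = "/".join(str(info[key]) for key in groups)
        grouped[group_value].append(filepath)
    return dict(grouped)


example_info = {
    "a.nc": {"station_name": "S1", "year": 2024},
    "b.nc": {"station_name": "S1", "year": 2025},
    "c.nc": {"station_name": "S2", "year": 2025},
}
by_station = group_filepaths_by(example_info, groups="station_name")
```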
- disdrodb.api.info.infer_archive_dir_from_path(path: str) str[source][source]#
Return the disdrodb base directory from a file or directory path.
Assumption: no data_source, campaign_name, station_name or file name contains the word DISDRODB.
- disdrodb.api.info.infer_campaign_name_from_path(path: str) str[source][source]#
Return the campaign name from a file or directory path.
Assumption: no data_source, campaign_name, station_name or file name contains the word DISDRODB.
- disdrodb.api.info.infer_data_source_from_path(path: str) str[source][source]#
Return the data_source from a file or directory path.
Assumption: no data_source, campaign_name, station_name or file name contains the word DISDRODB.
- disdrodb.api.info.infer_disdrodb_tree_path(path: str) str[source][source]#
Return the directory tree path from the archive directory.
Current assumption: no data_source, campaign_name, station_name or file name contains the word DISDRODB.
- disdrodb.api.info.infer_disdrodb_tree_path_components(path: str) list[source][source]#
Return a list with the components of a DISDRODB path (disdrodb_path).
- disdrodb.api.info.infer_path_info_dict(path: str) dict[source][source]#
Return a dictionary with the data_archive_dir, data_source and campaign_name of the disdrodb_path.
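The `infer_*` functions above rely on the word DISDRODB appearing exactly once in the path. A minimal sketch of that idea follows; it is not the library's implementation, `split_disdrodb_path` is a hypothetical name, the example path is illustrative, and whether the returned tree components include the `DISDRODB` element itself is an assumption.

```python
from pathlib import PurePosixPath


def split_disdrodb_path(path):
    """Hypothetical helper: split a path at its 'DISDRODB' component."""
    parts = PurePosixPath(path).parts
    if "DISDRODB" not in parts:
        raise ValueError(f"'DISDRODB' is not a component of {path!r}")
    idx = parts.index("DISDRODB")
    # Everything up to and including DISDRODB is the archive directory.
    archive_dir = str(PurePosixPath(*parts[: idx + 1]))
    # Assumption: the tree components start at the DISDRODB element.
    tree_components = list(parts[idx:])
    return archive_dir, tree_components


archive_dir, components = split_disdrodb_path("/tmp/DISDRODB/RAW/EPFL/EXAMPLE_CAMPAIGN")
```

If a data source, campaign, station or file name contained the word DISDRODB as a path component, the split point would be ambiguous, which is why the documented assumption matters.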
disdrodb.api.io module#
Routines to list and open DISDRODB products.
- disdrodb.api.io.filter_filepaths(filepaths, debugging_mode)[source][source]#
Filter out filepaths if debugging_mode=True.
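The `debugging_mode` description elsewhere on this page states that at most 3 files are selected. A sketch of that behaviour, assuming the first three entries are the ones kept (the library may select differently) and using the hypothetical name `limit_filepaths`:

```python
def limit_filepaths(filepaths, debugging_mode, max_files=3):
    """Hypothetical helper: keep at most max_files paths when debugging."""
    # Assumption: the first max_files entries are kept.
    if debugging_mode:
        return filepaths[:max_files]
    return filepaths


all_files = [f"file_{i}.nc" for i in range(5)]
subset = limit_filepaths(all_files, debugging_mode=True)
```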
- disdrodb.api.io.find_files(data_source, campaign_name, station_name, product, debugging_mode: bool = False, data_archive_dir: str | None = None, glob_pattern='*', **product_kwargs)[source][source]#
Retrieve DISDRODB product files for a given station.
- Parameters:
data_source (str) – The name of the institution (for campaigns spanning multiple countries) or the name of the country (for campaigns or sensor networks within a single country). Must be provided in UPPER CASE.
campaign_name (str) – The name of the campaign. Must be provided in UPPER CASE.
station_name (str) – The name of the station.
product (str) – The name of the DISDRODB product.
debugging_mode (bool, optional) – If True, it selects at most 3 files for debugging purposes. The default value is False.
data_archive_dir (str, optional) – The base directory of DISDRODB, expected in the format <...>/DISDRODB. If not specified, the path specified in the DISDRODB active configuration will be used.
glob_pattern (str, optional) – Glob pattern to search for raw data files. The default is "*". The argument is used only if product="RAW".
sample_interval (int, optional) – The sampling interval in seconds of the product. It must be specified only for products L2E and L2M.
rolling (bool, optional) – Whether the dataset has been resampled by aggregating or rolling. It must be specified only for products L2E and L2M.
model_name (str) – The model name of the statistical distribution for the DSD. It must be specified only for product L2M.
- Returns:
filepaths – List of file paths.
- Return type:
- disdrodb.api.io.open_data_archive(data_archive_dir=None)[source][source]#
Open the DISDRODB Data Archive.
- disdrodb.api.io.open_dataset(data_source, campaign_name, station_name, product, product_kwargs=None, debugging_mode: bool = False, data_archive_dir: str | None = None, **open_kwargs)[source][source]#
Open the DISDRODB product files for a given station.
- Parameters:
data_source (str) – The name of the institution (for campaigns spanning multiple countries) or the name of the country (for campaigns or sensor networks within a single country). Must be provided in UPPER CASE.
campaign_name (str) – The name of the campaign. Must be provided in UPPER CASE.
station_name (str) – The name of the station.
product (str) – The name of the DISDRODB product.
sample_interval (int, optional) – The sampling interval in seconds of the product. It must be specified only for products L2E and L2M.
rolling (bool, optional) – Whether the dataset has been resampled by aggregating or rolling. It must be specified only for products L2E and L2M.
model_name (str) – The model name of the statistical distribution for the DSD. It must be specified only for product L2M.
debugging_mode (bool, optional) – If True, it selects at most 3 files for debugging purposes. The default value is False.
data_archive_dir (str, optional) – The base directory of DISDRODB, expected in the format <...>/DISDRODB. If not specified, the path specified in the DISDRODB active configuration will be used.
- Return type:
- disdrodb.api.io.open_file_explorer(path)[source][source]#
Open the native file-browser showing ‘path’.
- disdrodb.api.io.open_logs_directory(data_source, campaign_name, station_name=None, data_archive_dir=None)[source][source]#
Open the DISDRODB Data Archive logs directory of a station.
- disdrodb.api.io.open_metadata_archive(metadata_archive_dir=None)[source][source]#
Open the DISDRODB Metadata Archive.
- disdrodb.api.io.open_metadata_directory(data_source, campaign_name, station_name=None, metadata_archive_dir=None)[source][source]#
Open the DISDRODB Metadata Archive station(s) metadata directory.
- disdrodb.api.io.open_product_directory(product, data_source, campaign_name, station_name, data_archive_dir=None)[source][source]#
Open the DISDRODB Data Archive station product directory.
disdrodb.api.path module#
Define paths within the DISDRODB infrastructure.
- disdrodb.api.path.define_accumulation_acronym(seconds, rolling)[source][source]#
Define the accumulation acronym.
Prefix the accumulation interval acronym with ROLL if rolling=True.
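The ROLL prefix rule can be sketched as follows. Only the prefixing is taken from the description above; the format of the interval label itself (`"30S"`, `"10MIN"`, `"1H"`) is an assumption made for illustration, and `accumulation_acronym` is a hypothetical name.

```python
def accumulation_acronym(seconds, rolling):
    """Hypothetical helper: build an interval label and prefix it with ROLL if rolling."""
    if seconds <= 0:
        raise ValueError("seconds must be a positive integer.")
    # Assumption: the label format below is illustrative only.
    if seconds % 3600 == 0:
        label = f"{seconds // 3600}H"
    elif seconds % 60 == 0:
        label = f"{seconds // 60}MIN"
    else:
        label = f"{seconds}S"
    return f"ROLL{label}" if rolling else label
```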
- disdrodb.api.path.define_campaign_dir(archive_dir, product, data_source, campaign_name, check_exists=False)[source][source]#
Return the campaign directory in the DISDRODB infrastructure.
If product="METADATA", it returns the path in the DISDRODB Metadata Archive. Otherwise, it returns the path in the DISDRODB Data Archive.
- Parameters:
product (str) – The DISDRODB product. See disdrodb.available_products(). If “METADATA” is specified, it returns the path in the DISDRODB Metadata Archive.
data_source (str) – The data source. Must be specified if campaign_name is specified.
campaign_name (str) – The campaign name.
archive_dir (str, optional) – The base directory of DISDRODB, expected in the format <...>/DISDRODB. If not specified, the path specified in the DISDRODB active configuration will be used.
check_exists (bool, optional) – Whether to check if the directory exists. The default value is False.
- Returns:
station_dir – Station data directory path
- Return type:
- disdrodb.api.path.define_config_dir(product)[source][source]#
Define the config directory path of a given DISDRODB product.
- disdrodb.api.path.define_data_dir(product, data_source, campaign_name, station_name, data_archive_dir=None, check_exists=False, **product_kwargs)[source][source]#
Return the station product data directory in the DISDRODB infrastructure.
- Parameters:
product (str) – The DISDRODB product. See disdrodb.available_products().
data_source (str) – The data source.
campaign_name (str) – The campaign name.
station_name (str) – The station name.
data_archive_dir (str, optional) – The base directory of DISDRODB, expected in the format <...>/DISDRODB. If not specified, the path specified in the DISDRODB active configuration will be used.
check_exists (bool, optional) – Whether to check if the directory exists. The default value is False.
sample_interval (int, optional) – The sampling interval in seconds of the product. It must be specified only for products L2E and L2M.
rolling (bool, optional) – Whether the dataset has been resampled by aggregating or rolling. It must be specified only for products L2E and L2M.
model_name (str) – The name of the fitted statistical distribution for the DSD. It must be specified only for product L2M.
- Returns:
data_dir – Station data directory path
- Return type:
- disdrodb.api.path.define_data_source_dir(archive_dir, product, data_source, check_exists=False)[source][source]#
Return the data source directory in the DISDRODB infrastructure.
If product="METADATA", it returns the path in the DISDRODB Metadata Archive. Otherwise, it returns the path in the DISDRODB Data Archive.
- Parameters:
product (str) – The DISDRODB product. See disdrodb.available_products(). If “METADATA” is specified, it returns the path in the DISDRODB Metadata Archive.
data_source (str) – The data source.
archive_dir (str, optional) – The base directory of DISDRODB, expected in the format <...>/DISDRODB. If not specified, the path specified in the DISDRODB active configuration will be used.
check_exists (bool, optional) – Whether to check if the directory exists. The default value is False.
- Returns:
station_dir – Station data directory path
- Return type:
- disdrodb.api.path.define_disdrodb_path(archive_dir, product, data_source='', campaign_name='', check_exists=True)[source][source]#
Return the directory path in the DISDRODB Metadata and Data Archive.
If product="METADATA", it returns the path in the DISDRODB Metadata Archive. Otherwise, it returns the path in the DISDRODB Data Archive.
If data_source and campaign_name are not specified, it returns the product directory.
If data_source is specified, it returns the data_source directory.
If campaign_name is specified, it returns the campaign_name directory.
- Parameters:
archive_dir (str) – The DISDRODB archive directory.
product (str) – The DISDRODB product. See disdrodb.available_products(). If “METADATA” is specified, it returns the path in the DISDRODB Metadata Archive.
data_source (str, optional) – The data source. Must be specified if campaign_name is specified.
campaign_name (str, optional) – The campaign name.
check_exists (bool, optional) – Whether to check if the directory exists. The default value is True.
- Returns:
dir_path – Directory path
- Return type:
- disdrodb.api.path.define_file_folder_path(obj, data_dir, folder_partitioning)[source][source]#
Define the folder path in which to save a file, based on the dataset’s starting time.
- Parameters:
obj (xarray.Dataset or pandas.DataFrame) – The object containing time information.
data_dir (str) – Directory within the DISDRODB Data Archive where DISDRODB product files are to be saved.
folder_partitioning (str or None) –
Defines the subdirectory structure in which files are saved. Allowed values are:
None: Files are saved directly in data_dir.
"year": Files are saved under a subdirectory for the year.
"year/month": Files are saved under subdirectories for year and month.
"year/month/day": Files are saved under subdirectories for year, month and day.
"year/month_name": Files are saved under subdirectories for year and month name.
"year/quarter": Files are saved under subdirectories for year and quarter.
- Returns:
A complete directory path where the file should be saved.
- Return type:
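The mapping from a start time to a partitioned folder can be sketched with the standard library. This is not the library's implementation: the documented function extracts the start time from an xarray or pandas object, whereas this hypothetical `build_partitioned_dir` helper takes a `datetime` directly. The zero-padded month and day and the `Q<n>` quarter label follow the example paths given for `check_folder_partitioning` on this page.

```python
import calendar
import datetime
import os


def build_partitioned_dir(data_dir, start_time, folder_partitioning):
    """Hypothetical helper: join data_dir with subdirectories derived from start_time."""
    year = f"{start_time.year:04d}"
    month = f"{start_time.month:02d}"
    day = f"{start_time.day:02d}"
    quarter = f"Q{(start_time.month - 1) // 3 + 1}"
    parts_by_scheme = {
        None: [],
        "": [],
        "year": [year],
        "year/month": [year, month],
        "year/month/day": [year, month, day],
        # calendar.month_name is locale-dependent; English names are the default.
        "year/month_name": [year, calendar.month_name[start_time.month]],
        "year/quarter": [year, quarter],
    }
    if folder_partitioning not in parts_by_scheme:
        raise ValueError(f"Unsupported folder_partitioning: {folder_partitioning!r}")
    return os.path.join(data_dir, *parts_by_scheme[folder_partitioning])


start = datetime.datetime(2025, 4, 1, 12, 30)
quarter_dir = build_partitioned_dir("data", start, "year/quarter")
```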
- disdrodb.api.path.define_filename(product: str, campaign_name: str, station_name: str, obj=None, add_version=True, add_time_period=True, add_extension=True, prefix='', suffix='', **product_kwargs) str[source][source]#
Define DISDRODB products filename.
- Parameters:
obj (xarray.Dataset or pandas.DataFrame) – xarray Dataset or pandas DataFrame. Required if add_time_period = True.
campaign_name (str) – Name of the campaign.
station_name (str) – Name of the station.
sample_interval (int, optional) – The sampling interval in seconds of the product. It must be specified only for products L2E and L2M.
rolling (bool, optional) – Whether the dataset has been resampled by aggregating or rolling. It must be specified only for products L2E and L2M.
model_name (str) – The model name of the fitted statistical distribution for the DSD. It must be specified only for product L2M.
- Returns:
DISDRODB product file name.
- Return type:
- disdrodb.api.path.define_issue_dir(data_source, campaign_name, metadata_archive_dir=None, check_exists=False)[source][source]#
Return the issue directory in the DISDRODB infrastructure.
- Parameters:
data_source (str) – The data source.
campaign_name (str) – The campaign name.
metadata_archive_dir (str, optional) – The base directory of DISDRODB, expected in the format <...>/DISDRODB. If not specified, the path specified in the DISDRODB active configuration will be used.
check_exists (bool, optional) – Whether to check if the directory exists. The default value is False.
- Returns:
issue_dir – Issue directory path
- Return type:
- disdrodb.api.path.define_issue_filepath(data_source, campaign_name, station_name, metadata_archive_dir=None, check_exists=False)[source][source]#
Return the station issue filepath in the DISDRODB infrastructure.
- Parameters:
data_source (str) – The data source.
campaign_name (str) – The campaign name.
station_name (str) – The station name.
metadata_archive_dir (str, optional) – The base directory of DISDRODB, expected in the format <...>/DISDRODB. If not specified, the path specified in the DISDRODB active configuration will be used.
check_exists (bool, optional) – Whether to check if the file exists. The default value is False.
- Returns:
issue_dir – Station data directory path
- Return type:
- disdrodb.api.path.define_l0a_filename(df, campaign_name: str, station_name: str) str[source][source]#
Define L0A file name.
- Parameters:
df (pandas.DataFrame) – L0A DataFrame.
campaign_name (str) – Name of the campaign.
station_name (str) – Name of the station.
- Returns:
L0A file name.
- Return type:
- disdrodb.api.path.define_l0b_filename(ds, campaign_name: str, station_name: str) str[source][source]#
Define L0B file name.
- Parameters:
ds (xarray.Dataset) – L0B xarray Dataset.
campaign_name (str) – Name of the campaign.
station_name (str) – Name of the station.
- Returns:
L0B file name.
- Return type:
- disdrodb.api.path.define_l0c_filename(ds, campaign_name: str, station_name: str) str[source][source]#
Define L0C file name.
- Parameters:
ds (xarray.Dataset) – L0B xarray Dataset.
campaign_name (str) – Name of the campaign.
station_name (str) – Name of the station.
- Returns:
L0C file name.
- Return type:
- disdrodb.api.path.define_l1_filename(ds, campaign_name, station_name: str) str[source][source]#
Define L1 file name.
- Parameters:
ds (xarray.Dataset) – L1 xarray Dataset.
campaign_name (str) – Name of the campaign.
station_name (str) – Name of the station.
- Returns:
L1 file name.
- Return type:
- disdrodb.api.path.define_l2e_filename(ds, campaign_name: str, station_name: str, sample_interval: int, rolling: bool) str[source][source]#
Define L2E file name.
- Parameters:
ds (xarray.Dataset) – L1 xarray Dataset.
campaign_name (str) – Name of the campaign.
station_name (str) – Name of the station.
- Returns:
L2E file name.
- Return type:
- disdrodb.api.path.define_l2m_filename(ds, campaign_name: str, station_name: str, sample_interval: int, rolling: bool, model_name: str) str[source][source]#
Define L2M file name.
- Parameters:
ds (xarray.Dataset) – L1 xarray Dataset.
campaign_name (str) – Name of the campaign.
station_name (str) – Name of the station.
- Returns:
L2M file name.
- Return type:
- disdrodb.api.path.define_logs_dir(product, data_source, campaign_name, station_name, data_archive_dir=None, check_exists=False, **product_kwargs)[source][source]#
Return the station log directory in the DISDRODB infrastructure.
- Parameters:
product (str) – The DISDRODB product. See disdrodb.available_products().
data_source (str) – The data source.
campaign_name (str) – The campaign name.
station_name (str) – The station name.
data_archive_dir (str, optional) – The base directory of DISDRODB, expected in the format <...>/DISDRODB. If not specified, the path specified in the DISDRODB active configuration will be used.
check_exists (bool, optional) – Whether to check if the directory exists. The default value is False.
- Returns:
station_dir – Station data directory path
- Return type:
- disdrodb.api.path.define_metadata_dir(data_source, campaign_name, metadata_archive_dir=None, check_exists=False)[source][source]#
Return the metadata directory in the DISDRODB infrastructure.
- Parameters:
data_source (str) – The data source.
campaign_name (str) – The campaign name.
metadata_archive_dir (str, optional) – The base directory of DISDRODB, expected in the format <...>/DISDRODB. If not specified, the path specified in the DISDRODB active configuration will be used.
check_exists (bool, optional) – Whether to check if the directory exists. The default value is False.
- Returns:
metadata_archive_dir – Station data directory path
- Return type:
- disdrodb.api.path.define_metadata_filepath(data_source, campaign_name, station_name, metadata_archive_dir=None, check_exists=False)[source][source]#
Return the station metadata filepath in the DISDRODB infrastructure.
- Parameters:
data_source (str) – The data source.
campaign_name (str) – The campaign name.
station_name (str) – The station name.
metadata_archive_dir (str, optional) – The base directory of DISDRODB, expected in the format <...>/DISDRODB. If not specified, the path specified in the DISDRODB active configuration will be used.
check_exists (bool, optional) – Whether to check if the file exists. The default value is False.
- Returns:
metadata_archive_dir – Station data directory path
- Return type:
- disdrodb.api.path.define_product_dir_tree(product, **product_kwargs)[source][source]#
Return the product directory tree.
- Parameters:
product (str) – The DISDRODB product. See disdrodb.available_products().
sample_interval (int, optional) – The sampling interval in seconds of the product. It must be specified only for products L2E and L2M.
rolling (bool, optional) – Whether the dataset has been resampled by aggregating or rolling. It must be specified only for products L2E and L2M.
model_name (str) – The custom model name of the fitted statistical distribution. It must be specified only for product L2M.
- Returns:
data_dir – Station data directory path
- Return type:
- disdrodb.api.path.define_station_dir(product, data_source, campaign_name, station_name, data_archive_dir=None, check_exists=False)[source][source]#
Return the station product directory in the DISDRODB infrastructure.
- Parameters:
product (str) – The DISDRODB product. See disdrodb.available_products().
data_source (str) – The data source.
campaign_name (str) – The campaign name.
station_name (str) – The station name.
data_archive_dir (str, optional) – The base directory of DISDRODB, expected in the format <...>/DISDRODB. If not specified, the path specified in the DISDRODB active configuration will be used.
check_exists (bool, optional) – Whether to check if the directory exists. The default value is False.
- Returns:
station_dir – Station data directory path
- Return type:
disdrodb.api.search module#
- disdrodb.api.search.available_campaigns(product=None, data_sources=None, station_names=None, available_data=False, raise_error_if_empty=False, invalid_fields_policy='raise', data_archive_dir=None, metadata_archive_dir=None, **product_kwargs)[source][source]#
Return campaign names for which stations are available.
- disdrodb.api.search.available_data_sources(product=None, campaign_names=None, station_names=None, available_data=False, raise_error_if_empty=False, invalid_fields_policy='raise', data_archive_dir=None, metadata_archive_dir=None, **product_kwargs)[source][source]#
Return data sources for which stations are available.
- disdrodb.api.search.available_stations(product=None, data_sources=None, campaign_names=None, station_names=None, return_tuple=True, available_data=False, raise_error_if_empty=False, invalid_fields_policy='raise', data_archive_dir=None, metadata_archive_dir=None, **product_kwargs)[source][source]#
Return stations information for which metadata or product data are available on disk.
This function queries the DISDRODB Metadata Archive and, optionally, the local DISDRODB Data Archive to identify stations that satisfy the specified filters.
If the DISDRODB product is not specified, it lists the stations present in the DISDRODB Metadata Archive given the specified filtering criteria. If the DISDRODB product is specified, it lists the stations present in the local DISDRODB Data Archive given the specified filtering criteria.
- Parameters:
product (str or None, optional) –
Name of the product to filter on (e.g., “RAW”, “L0A”, “L1”).
If the DISDRODB product is not specified (default), it lists the stations present in the DISDRODB Metadata Archive given the specified filtering criteria.
If the DISDRODB product is specified, it lists the stations present in the local DISDRODB Data Archive given the specified filtering criteria. The default is None.
data_sources (str or sequence of str, optional) – One or more data source identifiers to filter stations by. The name(s) must be UPPER CASE. If None, no filtering on data source is applied. The default is None.
campaign_names (str or sequence of str, optional) – One or more campaign names to filter stations by. The name(s) must be UPPER CASE. If None, no filtering on campaign is applied. The default is None.
station_names (str or sequence of str, optional) – One or more station names to include. If None, all stations matching other filters are considered. The default is None.
available_data (bool, optional) –
If product is not specified:
if available_data is False, return stations present in the DISDRODB Metadata Archive;
if available_data is True, return stations with data available on the online DISDRODB Decentralized Data Archive (i.e., stations with the disdrodb_data_url in the metadata).
If product is specified:
if available_data is False, return stations where the product directory exists in the local DISDRODB Data Archive;
if available_data is True, return stations where product data exists in the local DISDRODB Data Archive.
The default is False.
return_tuple (bool, optional) – If True, return a list of tuples (data_source, campaign_name, station_name). If False, return only a list of station names. The default is True.
raise_error_if_empty (bool, optional) – If True and no stations satisfy the criteria, raise a ValueError. If False, return an empty list/tuple. The default is False.
invalid_fields_policy ({'raise', 'warn', 'ignore'}, optional) –
How to handle invalid filter values for data_sources, campaign_names, or station_names that are not present in the metadata archive:
'raise': raise a ValueError (default)
'warn': emit a warning, then ignore invalid entries
'ignore': silently drop invalid entries
data_archive_dir (str or Path-like, optional) – Path to the root of the local DISDRODB Data Archive. Required only if product is specified. If None, the default data archive base directory is used. Default is None.
metadata_archive_dir (str or Path-like, optional) – Path to the root of the DISDRODB Metadata Archive. If None, the default metadata base directory is used. Default is None.
**product_kwargs (dict, optional) – Additional arguments required for some products. For example, for the “L2E” product, you need to specify rolling and sample_interval. For the “L2M” product, you also need to specify the model_name.
- Returns:
If return_tuple=True, return a list of tuples (data_source, campaign_name, station_name). If return_tuple=False, return a list of station names.
- Return type:
Examples
>>> from disdrodb.api.search import available_stations
>>> # List all stations present in the DISDRODB Metadata Archive
>>> stations = available_stations()
>>> # List all stations present in the online DISDRODB Data Archive
>>> stations = available_stations(available_data=True)
>>> # List stations with raw data available in the local DISDRODB Data Archive
>>> raw_stations = available_stations(product="RAW", available_data=True)
>>> # List stations of specific data sources
>>> stations = available_stations(data_sources=["GPM", "EPFL"])
- disdrodb.api.search.get_required_product(product)[source][source]#
Determine the required product for input product processing.
- disdrodb.api.search.is_disdrodb_data_url_specified(metadata_filepath)[source][source]#
Check if the disdrodb_data_url is specified in the metadata file.
- disdrodb.api.search.keep_list_info_elements_with_product_data(data_archive_dir, product, list_info, **product_kwargs)[source][source]#
Keep only the stations with product data.
- disdrodb.api.search.keep_list_info_elements_with_product_directory(data_archive_dir, product, list_info)[source][source]#
Keep only the stations with the product directory.
- disdrodb.api.search.keep_list_info_with_disdrodb_data_url(metadata_archive_dir, list_info)[source][source]#
Keep only the stations with disdrodb_data_url specified in the metadata file.
- disdrodb.api.search.list_campaign_names(metadata_archive_dir, data_sources=None, campaign_names=None, invalid_fields_policy='raise', return_tuple=False)[source][source]#
List campaign names in the DISDRODB Metadata Archive.