Archive Processing#
DISDRODB enables processing of disdrometer data archives from the command line or by calling Python functions. This guide describes how to generate all DISDRODB products (L0A through L2M) for single stations or entire archives.
For product descriptions, see Products. For configuration options, see Products Configuration.
Processing Chain Overview#
DISDRODB processes raw disdrometer data through a sequential chain:
Raw Data → L0A → L0B → L0C → L1 → L2E → L2M
Processing Levels:
L0A: Standardized tabular data (Apache Parquet format)
L0B: netCDF4 with physical dimensions and CF-compliant metadata
L0C: Time-consistent datasets with fixed measurement intervals
L1: Temporally resampled data with hydrometeor classification
L2E: Empirical rainfall parameters and radar observables
L2M: Modeled DSD parameters from parametric fitting
Each level builds on the previous one. You can process the entire chain or individual levels, but prerequisite products must exist (e.g., L1 processing requires L0C data).
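The prerequisite rule can be sketched in a few lines of plain Python. This is an illustration only: the names `PROCESSING_CHAIN`, `required_input`, and `products_to_generate` are hypothetical helpers, not part of the DISDRODB API, and the sketch assumes the chain is strictly linear as listed above.

```python
# Product order as listed in the processing chain above.
PROCESSING_CHAIN = ["RAW", "L0A", "L0B", "L0C", "L1", "L2E", "L2M"]


def required_input(product):
    """Return the product that must already exist before `product` can be generated."""
    index = PROCESSING_CHAIN.index(product)  # raises ValueError for unknown names
    if index == 0:
        raise ValueError("Raw data has no prerequisite product.")
    return PROCESSING_CHAIN[index - 1]


def products_to_generate(available, target):
    """List, in order, the products still to be generated to reach `target`."""
    start = PROCESSING_CHAIN.index(available)
    stop = PROCESSING_CHAIN.index(target)
    return PROCESSING_CHAIN[start + 1 : stop + 1]


l1_input = required_input("L1")
remaining = products_to_generate("L0C", "L2E")
```

Here `l1_input` names the product L1 processing depends on, and `remaining` lists the levels that still have to be run when only L0C data exists and L2E is the goal.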
Processing Options#
Common Parameters
force: If True, overwrite existing data; if False, raise an error if data exists
verbose: Print detailed processing information to the terminal. Only applies if parallel is False
debugging_mode: Process only a subset of data for testing (3 files for L0A, 100 rows for L0B)
parallel: Process files simultaneously in multiple processes (recommended for large archives)
data_archive_dir: Path to the DISDRODB Data Archive (if not using the default configuration)
metadata_archive_dir: Path to the DISDRODB Metadata Archive (if not using the default configuration)
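The documented behavior of `force` can be illustrated with a small stand-alone sketch. The helper `prepare_output_file` is hypothetical and is not how DISDRODB implements the option internally. It only mimics the described semantics: raise an error when a product already exists and `force` is False, otherwise discard the old file.

```python
import os
import tempfile


def prepare_output_file(filepath, force=False):
    """Illustrative helper (not a DISDRODB function) mimicking the `force` option."""
    if os.path.exists(filepath):
        if not force:
            raise FileExistsError(f"{filepath} already exists. Use force=True to overwrite it.")
        os.remove(filepath)  # force=True: discard the existing file before rewriting
    return filepath


with tempfile.TemporaryDirectory() as tmp_dir:
    existing = os.path.join(tmp_dir, "product.nc")
    open(existing, "w").close()  # simulate a previously generated product

    try:
        prepare_output_file(existing, force=False)
        outcome = "overwritten"
    except FileExistsError:
        outcome = "error raised"

    prepare_output_file(existing, force=True)
    removed = not os.path.exists(existing)
```

Defaulting to `force=False` in a sketch like this keeps accidental reruns from silently replacing existing products.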
Station Selection (for archive-wide processing)
data_sources: List of data sources to process (e.g., ["EPFL", "NASA", "ITALY"])
campaign_names: List of campaigns to process (e.g., ["EPFL_2008", "LOCARNO_2019"])
station_names: List of specific stations to process (e.g., ["TC-RM", "TC-TO"])
If none specified, all available stations are processed. These filters can be combined to select specific subsets.
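One way to picture how the filters interact is the sketch below. It assumes that combined filters act as a logical AND, so a station must satisfy every filter that is given. That reading is an assumption, not confirmed behavior. The function `select_stations` and the archive listing are hypothetical and for illustration only, including the placeholder campaign name `CAMPAIGN_X`.

```python
def select_stations(stations, data_sources=None, campaign_names=None, station_names=None):
    """Keep stations matching every specified filter (None means no restriction)."""
    selected = []
    for data_source, campaign_name, station_name in stations:
        if data_sources is not None and data_source not in data_sources:
            continue
        if campaign_names is not None and campaign_name not in campaign_names:
            continue
        if station_names is not None and station_name not in station_names:
            continue
        selected.append((data_source, campaign_name, station_name))
    return selected


# Hypothetical archive content, for illustration only.
archive = [
    ("EPFL", "EPFL_2008", "10"),
    ("EPFL", "EPFL_2008", "20"),
    ("EPFL", "LOCARNO_2019", "10"),
    ("ITALY", "CAMPAIGN_X", "TC-RM"),
]

everything = select_stations(archive)
one_campaign = select_stations(archive, campaign_names=["EPFL_2008"])
one_station = select_stations(archive, campaign_names=["EPFL_2008"], station_names=["10"])
```

With no filters every station is returned. Adding `station_names` on top of `campaign_names` narrows the selection further.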
L0 Processing (Single Station)#
Process raw data through L0A, L0B, and L0C levels for a specific station.
See disdrodb.run_l0_station() for detailed parameter documentation.
Command Line
disdrodb_run_l0_station <data_source> <campaign_name> <station_name> [options]
Example:
disdrodb_run_l0_station EPFL EPFL_2008 10 --l0a_processing True --l0b_processing True --l0c_processing True --force True --verbose True --parallel False
Type disdrodb_run_l0_station --help for all available options.
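When the command is launched from a script, it can help to assemble the argument list programmatically. The sketch below only builds the command shown above and does not execute it. The helper `build_l0_station_command` is hypothetical, and it assumes every option is passed as `--name value`, as in the example.

```python
import shlex


def build_l0_station_command(data_source, campaign_name, station_name, **options):
    """Assemble the argument list for disdrodb_run_l0_station (illustrative helper)."""
    command = ["disdrodb_run_l0_station", data_source, campaign_name, station_name]
    for name, value in options.items():
        command += [f"--{name}", str(value)]
    return command


command = build_l0_station_command(
    "EPFL", "EPFL_2008", "10", force=True, verbose=True, parallel=False
)
command_line = shlex.join(command)  # shell-safe string, useful for logging

# To actually run it (requires DISDRODB to be installed):
# import subprocess
# subprocess.run(command, check=True)
```

Passing the list form to `subprocess.run` avoids shell quoting problems with station or campaign names.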
Python
import disdrodb
disdrodb.run_l0_station(
data_source="EPFL",
campaign_name="EPFL_2008",
station_name="10",
# Processing levels
l0a_processing=True,
l0b_processing=True,
l0c_processing=True,
# Options
remove_l0a=False,
remove_l0b=False,
force=True,
verbose=True,
debugging_mode=False,
parallel=False,
)
L0 Processing (Multiple Stations)#
Process multiple stations simultaneously. Filters can be combined to select specific subsets of the archive.
See disdrodb.run_l0() for detailed parameter documentation.
Command Line
disdrodb_run_l0 --data_sources <sources> --campaign_names <campaigns> --station_names <stations> [options]
Example - Process entire campaign:
disdrodb_run_l0 --campaign_names EPFL_2008 --l0a_processing True --l0b_processing True --l0c_processing True --parallel False
Example - Process multiple campaigns:
disdrodb_run_l0 --campaign_names 'EPFL_2008 LOCARNO_2018' --l0a_processing True --l0b_processing True --l0c_processing True --parallel True
Type disdrodb_run_l0 --help for all available options.
Python
import disdrodb
disdrodb.run_l0(
data_sources=["EPFL"],
campaign_names=["EPFL_2008"],
# station_names=["10", "20"], # Optional: specific stations only
# Processing levels
l0a_processing=True,
l0b_processing=True,
l0c_processing=True,
# Options
remove_l0a=False,
remove_l0b=False,
force=True,
verbose=True,
debugging_mode=False,
parallel=True,
)
L1 Processing (Single Station)#
Generate temporally resampled data with hydrometeor classification from L0C products.
See disdrodb.run_l1_station() for detailed argument documentation.
Command Line
disdrodb_run_l1_station <data_source> <campaign_name> <station_name> [options]
Example:
disdrodb_run_l1_station EPFL EPFL_2008 10 --force True --verbose True --parallel True
Python
import disdrodb
disdrodb.run_l1_station(
data_source="EPFL",
campaign_name="EPFL_2008",
station_name="10",
force=True,
verbose=True,
debugging_mode=False,
parallel=True,
)
L1 Processing (Multiple Stations)#
Process L1 products for multiple stations.
See disdrodb.run_l1() for detailed argument documentation.
Command Line
disdrodb_run_l1 --campaign_names <campaigns> [options]
Example:
disdrodb_run_l1 --campaign_names EPFL_2008 --force True --parallel True
Python
import disdrodb
disdrodb.run_l1(
campaign_names=["EPFL_2008"],
force=True,
verbose=True,
parallel=True,
)
L2E Processing (Single Station)#
Compute integrated rainfall parameters from L1 products.
See disdrodb.run_l2e_station() for detailed argument documentation.
Command Line
disdrodb_run_l2e_station <data_source> <campaign_name> <station_name> [options]
Example:
disdrodb_run_l2e_station EPFL EPFL_2008 10 --force True --verbose True --parallel True
Python
import disdrodb
disdrodb.run_l2e_station(
data_source="EPFL",
campaign_name="EPFL_2008",
station_name="10",
force=True,
verbose=True,
debugging_mode=False,
parallel=True,
)
L2E Processing (Multiple Stations)#
Process L2E products for multiple stations.
See disdrodb.run_l2e() for detailed argument documentation.
Command Line
disdrodb_run_l2e --campaign_names <campaigns> [options]
Example:
disdrodb_run_l2e --campaign_names EPFL_2008 --force True --parallel True
Python
import disdrodb
disdrodb.run_l2e(
campaign_names=["EPFL_2008"],
force=True,
verbose=True,
parallel=True,
)
L2M Processing (Single Station)#
Fit parametric DSD models to the drop number concentration derived in L2E products.
See disdrodb.run_l2m_station() for detailed argument documentation.
Command Line
disdrodb_run_l2m_station <data_source> <campaign_name> <station_name> [options]
Example:
disdrodb_run_l2m_station EPFL EPFL_2008 10 --force True --verbose True --parallel True
Python
import disdrodb
disdrodb.run_l2m_station(
data_source="EPFL",
campaign_name="EPFL_2008",
station_name="10",
force=True,
verbose=True,
debugging_mode=False,
parallel=True,
)
L2M Processing (Multiple Stations)#
Process L2M products for multiple stations.
See disdrodb.run_l2m() for detailed argument documentation.
Command Line
disdrodb_run_l2m --campaign_names <campaigns> [options]
Example:
disdrodb_run_l2m --campaign_names EPFL_2008 --force True --parallel True
Python
import disdrodb
disdrodb.run_l2m(
campaign_names=["EPFL_2008"],
force=True,
verbose=True,
parallel=True,
)
Complete Processing Chain#
Process entire chain from raw data to final products in one command.
By default, L2M processing is disabled. Enable it with l2m_processing=True.
Single Station
See disdrodb.run_station() for detailed argument documentation.
import disdrodb
disdrodb.run_station(
data_source="EPFL",
campaign_name="EPFL_2008",
station_name="10",
# L0 processing
l0a_processing=True,
l0b_processing=True,
l0c_processing=True,
remove_l0a=False,
remove_l0b=False,
# L1 and L2 processing
l1_processing=True,
l2e_processing=True,
l2m_processing=False,
# Options
force=True,
verbose=True,
debugging_mode=False,
parallel=True,
)
Multiple Stations
See disdrodb.run() for detailed parameter documentation.
import disdrodb
disdrodb.run(
campaign_names=["EPFL_2008"],
# L0 processing
l0a_processing=True,
l0b_processing=True,
l0c_processing=True,
remove_l0a=False,
remove_l0b=False,
# L1 and L2 processing
l1_processing=True,
l2e_processing=True,
l2m_processing=False,
# Options
force=True,
verbose=True,
debugging_mode=False,
parallel=True,
)
Best Practices#
Processing Strategy
Start Small: Test with debugging_mode=True on a single station before processing entire archives
Sequential Testing: Process one product level at a time initially to verify configurations
Parallel Processing: Enable parallel=True for large archives to speed up processing
Memory Usage: Monitor memory usage when processing multiple stations in parallel; adjust archive options if necessary
Disk Space: Monitor disk space, especially when processing L0 products without removing intermediate files
Configuration: Customize products configurations before large-scale processing (see Products Configuration)
Error Handling
Use force=False to prevent accidental overwriting of existing data
Use verbose=True during initial testing to monitor progress
Check the logs directory for detailed error messages
Memory Management
Adjust parallel settings based on available memory
Consider processing in batches for very large archives
Use remove_l0a=True and remove_l0b=True to save disk space after L0C generation
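Batch processing can be sketched as follows. The helpers `split_into_batches` and `process_in_batches` are hypothetical and not part of DISDRODB. The processing step is passed in as a callable, so the sketch runs without DISDRODB installed. In practice the callable could wrap a call such as `disdrodb.run_l0(campaign_names=..., parallel=True)`.

```python
def split_into_batches(items, batch_size):
    """Split a list into consecutive chunks of at most `batch_size` elements."""
    if batch_size < 1:
        raise ValueError("batch_size must be at least 1.")
    return [items[i : i + batch_size] for i in range(0, len(items), batch_size)]


def process_in_batches(campaign_names, batch_size, process_batch):
    """Call `process_batch` once per batch and return the number of batches."""
    batches = split_into_batches(campaign_names, batch_size)
    for batch in batches:
        process_batch(campaign_names=batch)
    return len(batches)


# Stand-in for a real processing call; it only records what it was asked to process.
processed = []
n_batches = process_in_batches(
    ["EPFL_2008", "LOCARNO_2018", "LOCARNO_2019"],
    batch_size=2,
    process_batch=lambda campaign_names: processed.append(campaign_names),
)
```

Smaller batches lower peak memory and disk usage at the cost of more separate runs.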
Command Reference#
Single Station Commands
disdrodb_run_l0_station: L0A → L0B → L0C
disdrodb_run_l0a_station: Raw → L0A
disdrodb_run_l0b_station: L0A → L0B
disdrodb_run_l0c_station: L0B → L0C
disdrodb_run_l1_station: L0C → L1
disdrodb_run_l2e_station: L1 → L2E
disdrodb_run_l2m_station: L2E → L2M
disdrodb_run_station: Raw → all products
Archive-Wide Commands
disdrodb_run_l0: L0 processing for multiple stations
disdrodb_run_l0a: L0A processing for multiple stations
disdrodb_run_l0b: L0B processing for multiple stations
disdrodb_run_l0c: L0C processing for multiple stations
disdrodb_run_l1: L1 processing for multiple stations
disdrodb_run_l2e: L2E processing for multiple stations
disdrodb_run_l2m: L2M processing for multiple stations
disdrodb_run: Complete chain for multiple stations
Use <command> --help for detailed parameter information.
By typing disdrodb followed by TAB TAB TAB in the terminal, you can see all available DISDRODB commands.