Station Metadata


The metadata for each station are defined in a YAML file. The metadata YAML file expects a standardized set of keys.

The value of the following metadata keys must always be specified:

  • the data_source must be the same as the data_source where the metadata are located.

  • the campaign_name must be the same as the campaign_name where the metadata are located.

  • the station_name must be the same as the name of the metadata YAML file without the .yml extension.

  • the sensor_name must be one of the implemented sensor configurations. See disdrodb.available_sensor_names(). If the sensor that produced your data is not among the available sensors, you first need to add the sensor configuration. For this task, read the section Add new sensor configs.

  • the platform_type must be either 'fixed' or 'mobile'. If 'mobile', the DISDRODB L0 processing allows the latitude/longitude/altitude coordinates to vary with time.

  • the raw_data_format must be either 'txt' or 'netcdf'. Use 'txt' if the source raw data are text/ASCII files and 'netcdf' if they are netCDF files.

  • the raw_data_glob_pattern defines which raw data files in the DISDRODB/RAW/<DATA_SOURCE>/<CAMPAIGN_NAME>/<STATION_NAME>/data directory will be ingested by the DISDRODB L0 processing chain. For instance, if all raw files of a station end with .txt, you can specify the glob pattern as *.txt. Because this simple pattern contains no path separator (/), it is searched recursively through all subfolders (e.g. <year>/<month>/) under data/ and picks up every .txt file. If data/ contains other .txt files that you do not want to process (e.g. geolocation information for mobile platforms or auxiliary weather data), you can narrow the match by adding the filename prefix of the files you want to process to the glob pattern (e.g. SPECTRUM_*.txt).

    Finally, to restrict the search to a particular data/ subdirectory, include that folder name in your pattern. Specifying <custom>/*.txt returns only the files located directly inside the data/<custom> directory, while <custom>/**/*.txt returns all files in the data/<custom> directory and in all of its subdirectories (e.g. <year>/<month>/).

  • the reader reference tells the disdrodb software which reader function to use to correctly ingest the station’s raw data files. Under the hood, a reader is simply a Python function that knows how to read a raw data file and make it compliant with the DISDRODB standards. All reader scripts live in the disdrodb/l0/readers directory, organized by sensor name and data source: disdrodb/l0/readers/<sensor_name>/<DATA_SOURCE>/<READER_NAME>.py. To point the disdrodb software to the correct reader, the reader reference must be defined as <DATA_SOURCE>/<READER_NAME>.

    For example, to select the OTT Parsivel GPM IFLOODS reader (defined at disdrodb/l0/readers/PARSIVEL/GPM/IFLOODS.py), the reader reference GPM/IFLOODS must be used.
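The matching rules described above for raw_data_glob_pattern can be tried out on a throwaway directory. The sketch below is not the disdrodb implementation: list_raw_files is a hypothetical helper written with Python's pathlib, under the assumption that a pattern without a path separator is searched recursively while a pattern containing one is matched relative to data/. All file and folder names are made up for illustration.

```python
import tempfile
from pathlib import Path


def list_raw_files(data_dir, glob_pattern):
    """Hypothetical helper mimicking the matching rules described above."""
    data_dir = Path(data_dir)
    if "/" in glob_pattern:
        # The pattern names a subdirectory: match it relative to data/.
        matches = data_dir.glob(glob_pattern)
    else:
        # No path separator: search data/ and all of its subdirectories.
        matches = data_dir.rglob(glob_pattern)
    return sorted(p.relative_to(data_dir).as_posix() for p in matches if p.is_file())


# Build a small throwaway data/ directory to try the patterns on.
tmp = tempfile.TemporaryDirectory()
data_dir = Path(tmp.name) / "data"
for name in [
    "2020/01/SPECTRUM_20200101.txt",
    "2020/01/GPS_20200101.txt",
    "custom/SPECTRUM_a.txt",
    "custom/2021/02/SPECTRUM_b.txt",
]:
    path = data_dir / name
    path.parent.mkdir(parents=True, exist_ok=True)
    path.write_text("dummy")

all_txt = list_raw_files(data_dir, "*.txt")                     # every .txt file
spectrum_only = list_raw_files(data_dir, "SPECTRUM_*.txt")      # skips GPS_*.txt
custom_direct = list_raw_files(data_dir, "custom/*.txt")        # data/custom only
custom_recursive = list_raw_files(data_dir, "custom/**/*.txt")  # data/custom and below
```

Printing the four lists side by side is a quick way to confirm that a pattern selects only the files you intend to process before launching the L0 processing.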

The disdrodb_data_url metadata key references the remote/online repository where the station’s raw data are stored. At this URL, a single zip file provides all data available for a given station.

To check the validity of the metadata YAML files, run the following code:

from disdrodb import check_metadata_archive, check_metadata_archive_geolocation

check_metadata_archive()
check_metadata_archive_geolocation()
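To build intuition for the constraints listed earlier, the kind of consistency checks involved can be sketched by hand. This is an illustration only, not what check_metadata_archive does internally: find_metadata_issues and reader_script_path are hypothetical helpers, the metadata are given as a plain dictionary instead of being read from a YAML file, and the station name apu01 and the measurement interval are made-up values.

```python
from pathlib import PurePosixPath

ALLOWED_PLATFORM_TYPES = {"fixed", "mobile"}
ALLOWED_RAW_DATA_FORMATS = {"txt", "netcdf"}


def find_metadata_issues(metadata, data_source, campaign_name, filename):
    """Return a list of human-readable issues (hypothetical helper)."""
    issues = []
    if metadata.get("data_source") != data_source:
        issues.append("data_source does not match the metadata location")
    if metadata.get("campaign_name") != campaign_name:
        issues.append("campaign_name does not match the metadata location")
    # The station name is the YAML file name without its .yml extension.
    if metadata.get("station_name") != PurePosixPath(filename).stem:
        issues.append("station_name does not match the YAML file name")
    if metadata.get("platform_type") not in ALLOWED_PLATFORM_TYPES:
        issues.append("platform_type must be 'fixed' or 'mobile'")
    if metadata.get("raw_data_format") not in ALLOWED_RAW_DATA_FORMATS:
        issues.append("raw_data_format must be 'txt' or 'netcdf'")
    # The reader reference has the form <DATA_SOURCE>/<READER_NAME>.
    parts = str(metadata.get("reader", "")).split("/")
    if len(parts) != 2 or not all(parts):
        issues.append("reader must be defined as <DATA_SOURCE>/<READER_NAME>")
    return issues


def reader_script_path(sensor_name, reader_reference):
    """Build the reader script path from the directory layout described above."""
    return f"disdrodb/l0/readers/{sensor_name}/{reader_reference}.py"


# Illustrative metadata (station name and interval are made up).
metadata = {
    "data_source": "GPM",
    "campaign_name": "IFLOODS",
    "station_name": "apu01",
    "sensor_name": "PARSIVEL",
    "measurement_interval": 60,
    "platform_type": "fixed",
    "raw_data_format": "txt",
    "raw_data_glob_pattern": "*.txt",
    "reader": "GPM/IFLOODS",
}

issues = find_metadata_issues(metadata, "GPM", "IFLOODS", "apu01.yml")
reader_path = reader_script_path(metadata["sensor_name"], metadata["reader"])
```

An empty issues list means the dictionary satisfies the checks in this sketch; the disdrodb functions shown above remain the authoritative validation.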

The DISDRODB metadata keys and their descriptions are listed below:

Mandatory keys

| Keys | Description |
| --- | --- |
| data_source | Station data source. |
| campaign_name | Station campaign name. |
| station_name | Name of the station (and of the metadata file). |
| sensor_name | Sensor name. It defines the processing chain in DISDRODB. |
| measurement_interval | Sensor measurement sampling interval(s) in seconds. |
| raw_data_format | File format of the raw data. Either ‘txt’ or ‘netcdf’. |
| raw_data_glob_pattern | Glob pattern to search for raw files. |
| platform_type | Type of station. Either ‘fixed’ or ‘mobile’. |
| deployment_status | Deployment status. Either ‘ongoing’ or ‘terminated’. |
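A quick way to spot an incomplete metadata file is to compare its keys against the table above. The following is a minimal sketch, not part of the disdrodb API: MANDATORY_KEYS simply copies the keys of the table, missing_mandatory_keys is a hypothetical helper, and the example values are made up.

```python
# Keys copied from the "Mandatory keys" table above.
MANDATORY_KEYS = (
    "data_source",
    "campaign_name",
    "station_name",
    "sensor_name",
    "measurement_interval",
    "raw_data_format",
    "raw_data_glob_pattern",
    "platform_type",
    "deployment_status",
)


def missing_mandatory_keys(metadata):
    """Return the mandatory keys that are absent or left empty (hypothetical helper)."""
    return [key for key in MANDATORY_KEYS if metadata.get(key) in (None, "")]


# Illustrative, deliberately incomplete metadata.
incomplete = {
    "data_source": "GPM",
    "campaign_name": "IFLOODS",
    "station_name": "apu01",
    "sensor_name": "PARSIVEL",
    "raw_data_format": "txt",
    "raw_data_glob_pattern": "*.txt",
    "platform_type": "fixed",
    "deployment_status": "",
}

missing = missing_mandatory_keys(incomplete)
```

Here missing contains measurement_interval (absent) and deployment_status (empty).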

Station description

| Keys | Description |
| --- | --- |
| title | Station dataset title |
| description | Station dataset description |
| project_name | Full project/campaign name of the station |
| keywords | Keywords related to the station and the campaign |
| summary | Summary information of the station |
| comment | Comment on the station measurements |
| history | History of the raw data file |
| station_id | ID of the station |
| location | Village, town or region where the disdrometer is located |
| country | Country where the disdrometer is located |
| continent | Continent where the disdrometer is located |

Deployment info

| Keys | Description |
| --- | --- |
| latitude | WGS84 latitude in degrees north [-90,90]. If the disdrometer is moving, specify -9999 |
| longitude | WGS84 longitude in degrees east [-180,180]. If the disdrometer is moving, specify -9999 |
| altitude | Elevation above sea level in meters. If the disdrometer is moving, specify -9999 |
| deployment_status | Deployment status. Possible values: ‘terminated’ or ‘ongoing’ |
| deployment_mode | Deployment mode. Possible values: ‘land’, ‘ship’, ‘truck’, ‘cable’ |
| platform_type | Platform type. Possible values: ‘fixed’ or ‘mobile’ |
| platform_protection | Platform protection. Possible values: ‘N/A’, ‘shielded’, ‘unshielded’ |
| platform_orientation | Platform orientation in 0-360 degrees from the North direction (clockwise) |
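The coordinate ranges and the -9999 convention from the table above can be expressed as a small check. This is a sketch under stated assumptions rather than the logic of check_metadata_archive_geolocation: the constant MOVING_PLATFORM_VALUE and both helper functions are hypothetical names introduced here for illustration.

```python
# Value specified in the metadata when the disdrometer is moving.
MOVING_PLATFORM_VALUE = -9999


def is_valid_latitude(value):
    """True for -9999 or a WGS84 latitude within [-90, 90] (hypothetical helper)."""
    return value == MOVING_PLATFORM_VALUE or -90 <= value <= 90


def is_valid_longitude(value):
    """True for -9999 or a WGS84 longitude within [-180, 180] (hypothetical helper)."""
    return value == MOVING_PLATFORM_VALUE or -180 <= value <= 180


fixed_station_ok = is_valid_latitude(46.5) and is_valid_longitude(6.6)
mobile_station_ok = is_valid_latitude(-9999) and is_valid_longitude(-9999)
swapped_coordinates_ok = is_valid_latitude(120.0) and is_valid_longitude(35.0)
```

The last case shows a typical mistake this kind of check catches: latitude and longitude entered in the wrong order.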

Sensor Info

| Keys | Description |
| --- | --- |
| sensor_long_name | Sensor long name |
| sensor_manufacturer | Sensor manufacturer. Examples: Thies Clima, OTT Hydromet, Vaisala, Campbell, … |
| sensor_wavelength | Sensor wavelength |
| sensor_serial_number | Sensor serial number |
| firmware_iop | Input/Output Processor Firmware [Available for OTT Parsivels] |
| firmware_dsp | Digital Signal Processor Firmware [Available for OTT Parsivels] |
| firmware_version | Firmware version |
| sensor_beam_length | Length of the laser beam’s measurement area in mm |
| sensor_beam_width | Width of the laser beam’s measurement area in mm |
| sensor_nominal_width | Expected width of the sensor beam under typical operating conditions |
| calibration_sensitivity | Sensor sensitivity |
| calibration_certification_date | Sensor calibration date(s) |
| calibration_certification_url | Sensor calibration certification url |

Source information

| Keys | Description |
| --- | --- |
| source | Source information |
| source_convention | Raw data file convention (e.g. ARM v1.XXX, NASA v1.XX, …) |
| source_processing_date | Date of source raw data file creation |

Data Attribution

| Keys | Description |
| --- | --- |
| contributors | People contributing to the disdrometer dataset |
| authors | People responsible for the dataset and to be contacted for questions |
| authors_url | Web URL to contact the authors |
| contact | People to contact to request further information |
| contact_information | Email address of the contact people |
| acknowledgements | Acknowledgements |
| references | Literature references describing the usage of the sensor |
| documentation | Further documentation describing the sensor/campaign/network |
| website | Website reporting sensor information |
| institution | Institution providing funding or operating the sensor |
| source_repository | Repository where the original raw file can be retrieved |
| license | Data license |
| doi | Digital Object Identifier of the sensor/campaign/network dataset |