disdrodb.data_transfer package
Submodules
disdrodb.data_transfer.download_data module
Routines to download data from the DISDRODB Decentralized Data Archive.
- disdrodb.data_transfer.download_data.build_ftp_server_wget_command(url: str, cut_dirs: int, dst_dir: str, verbose: bool) → list[str]
Construct the wget command list for FTP recursive download.
- disdrodb.data_transfer.download_data.build_webserver_wget_command(url: str, cut_dirs: int, dst_dir: str, verbose: bool) → list[str]
Construct the wget command list for subprocess.run.
Notes
The following wget arguments are used:
- -q : quiet mode (no detailed progress)
- -r : recursive
- -np : no parent
- -nH : no host directories
- --timestamping : download missing files, or files whose remote version is newer
- --cut-dirs : strip all but the last path segment from the remote path
- -P dst_dir : download into dst_dir
- url : the URL to download from
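The argument list above can be illustrated with a minimal sketch. The function name `sketch_wget_command` is hypothetical, and the rule that `-q` is added only when `verbose` is false is an assumption; the exact flags and their order in disdrodb may differ.

```python
def sketch_wget_command(url: str, cut_dirs: int, dst_dir: str, verbose: bool) -> list[str]:
    """Assemble a wget argument list (illustrative only, not the disdrodb code)."""
    cmd = ["wget"]
    if not verbose:
        # Assumption: quiet mode is requested only when verbose output is disabled.
        cmd.append("-q")
    cmd += [
        "-r",  # recursive download
        "-np",  # never ascend to the parent directory
        "-nH",  # do not create a directory named after the host
        "--timestamping",  # fetch missing files or files newer on the remote
        f"--cut-dirs={cut_dirs}",  # number of leading remote path segments to drop
        "-P",
        dst_dir,  # local directory prefix
        url,
    ]
    return cmd


# The list form can be handed to subprocess.run without shell quoting.
command = sketch_wget_command("https://example.org/data/station/", 2, "/tmp/station", verbose=False)
print(command)
```

Building the command as a list rather than a single string avoids shell interpretation of characters in the URL or destination path.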
- disdrodb.data_transfer.download_data.check_consistent_station_name(metadata_filepath, station_name)
Check consistent station_name between YAML file name and metadata key.
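A minimal sketch of such a consistency check follows. It assumes that `station_name` is the value read from the metadata and that a mismatch is reported with a `ValueError`; the name `sketch_check_station_name` is hypothetical, and the actual disdrodb behavior may differ.

```python
import os


def sketch_check_station_name(metadata_filepath: str, station_name: str) -> None:
    """Raise ValueError if the YAML file name and the metadata value disagree."""
    # The station name implied by the file is its base name without extension.
    filename_station = os.path.splitext(os.path.basename(metadata_filepath))[0]
    if filename_station != station_name:
        raise ValueError(
            f"Station name {station_name!r} does not match "
            f"the metadata file name {filename_station!r}."
        )


# A matching pair passes silently.
sketch_check_station_name("metadata/STATION_A.yml", "STATION_A")
```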
- disdrodb.data_transfer.download_data.click_download_archive_options(function: object)
Click command line options for DISDRODB archive download.
- Parameters:
function (object) – Function.
- disdrodb.data_transfer.download_data.click_download_options(function: object)
Click command line options for DISDRODB download.
- Parameters:
function (object) – Function.
- disdrodb.data_transfer.download_data.compute_cut_dirs(url: str) → int
Compute the wget cut_dirs value to download directly in dst_dir.
Given a URL ending with ‘/’, compute the total number of path segments. Returning len(segments) strips all of them, so that files within the final directory land directly in dst_dir without creating an extra subfolder.
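The segment count described above can be sketched as follows. The name `sketch_cut_dirs` is hypothetical, and parsing the URL with `urllib.parse` is an assumption about how the path might be split.

```python
from urllib.parse import urlparse


def sketch_cut_dirs(url: str) -> int:
    """Count the non-empty path segments of a URL (illustrative only)."""
    path = urlparse(url).path
    # Splitting "/a/b/" on "/" yields empty strings at both ends; drop them.
    segments = [segment for segment in path.split("/") if segment]
    return len(segments)


# The path "/parsivel/PAR001_Cabauw/" has two segments.
print(sketch_cut_dirs("https://example.org/parsivel/PAR001_Cabauw/"))
```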
- disdrodb.data_transfer.download_data.download_archive(data_sources: str | list[str] | None = None, campaign_names: str | list[str] | None = None, station_names: str | list[str] | None = None, force: bool = False, data_archive_dir: str | None = None, metadata_archive_dir: str | None = None)
Download all DISDRODB stations whose metadata specify a disdrodb_data_url.
- Parameters:
data_sources (str or list of str, optional) – Data source name (e.g. EPFL). If not provided (None), all data sources will be downloaded. The default value is data_sources=None.
campaign_names (str or list of str, optional) – Campaign name (e.g. EPFL_ROOF_2012). If not provided (None), all campaigns will be downloaded. The default value is campaign_names=None.
station_names (str or list of str, optional) – Station name. If not provided (None), all stations will be downloaded. The default value is station_names=None.
force (bool, optional) – If True, delete existing files and re-download raw data files. The default value is False.
data_archive_dir (str, optional) – DISDRODB Data Archive directory. Format: <...>/DISDRODB. If None (the default), the disdrodb config variable data_archive_dir is used.
- disdrodb.data_transfer.download_data.download_ftp_server_data(url: str, dst_dir: str, verbose: bool = True) → None
Download data from an FTP server with anonymous login.
- Parameters:
url (str) – FTP server URL pointing to a folder. Example: “ftp://ftp.example.com/path/to/data/”
dst_dir (str) – Local directory in which to save the downloaded files (DISDRODB station data directory).
verbose (bool, optional) – Print wget output (default is True).
- disdrodb.data_transfer.download_data.download_station(data_source: str, campaign_name: str, station_name: str, force: bool = False, data_archive_dir: str | None = None, metadata_archive_dir: str | None = None) → None
Download data of a single DISDRODB station from the DISDRODB remote repository.
- Parameters:
data_source (str) – The name of the institution (for campaigns spanning multiple countries) or the name of the country (for campaigns or sensor networks within a single country). Must be provided in UPPER CASE.
campaign_name (str) – The name of the campaign. Must be provided in UPPER CASE.
station_name (str) – The name of the station.
data_archive_dir (str, optional) – DISDRODB Data Archive directory, expected in the format <...>/DISDRODB. If None (the default), the data_archive_dir path specified in the active DISDRODB configuration is used.
force (bool, optional) – If True, remove existing data and re-download. The default value is False.
- disdrodb.data_transfer.download_data.download_station_data(metadata_filepath: str, data_archive_dir: str, force: bool = False, verbose=True) → None
Download and unzip the station data.
- Parameters:
metadata_filepath (str) – Metadata file path.
data_archive_dir (str, optional) – DISDRODB Data Archive directory. Format: <...>/DISDRODB. If None (the default), the disdrodb config variable data_archive_dir is used.
force (bool, optional) – If True, delete existing files and re-download them. The default value is False.
- disdrodb.data_transfer.download_data.download_web_server_data(url: str, dst_dir: str, verbose=True) → None
Download data from a web server via HTTP or HTTPS.
Use the system’s wget command to recursively download all files and subdirectories under the given HTTPS “directory” URL. Works on both Windows and Linux, provided that wget is installed and on the PATH.
The function performs the following steps:
- Ensure wget is available.
- Normalize the URL to end with ‘/’.
- Compute cut-dirs so that only the last segment of the path remains locally.
- Build and run the wget command.
- Parameters:
url (str) – HTTPS URL pointing to webserver folder. Example: “https://ruisdael.citg.tudelft.nl/parsivel/PAR001_Cabauw/”
dst_dir (str) – Local directory in which to save the downloaded files (DISDRODB station data directory).
verbose (bool, optional) – Print wget output (default is True).
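The steps listed for this function can be sketched as a planning routine that stops short of running wget. Everything here is an assumption for illustration: the name `sketch_plan_web_download`, the injectable `which` argument (used so the availability check can be exercised without wget installed), the `FileNotFoundError`, and the choice to count every path segment for `--cut-dirs`.

```python
import shutil
from urllib.parse import urlparse


def sketch_plan_web_download(url: str, dst_dir: str, verbose: bool = True, which=shutil.which) -> list[str]:
    """Return the wget command that would be run (illustrative only)."""
    # 1. Ensure wget is available on the PATH.
    if which("wget") is None:
        raise FileNotFoundError("wget was not found on the PATH.")
    # 2. Normalize the URL so that it ends with '/'.
    if not url.endswith("/"):
        url += "/"
    # 3. Count the remote path segments to pass to --cut-dirs.
    cut_dirs = len([segment for segment in urlparse(url).path.split("/") if segment])
    # 4. Build the command; executing it would be subprocess.run(cmd, check=True).
    cmd = ["wget"] if verbose else ["wget", "-q"]
    cmd += ["-r", "-np", "-nH", "--timestamping", f"--cut-dirs={cut_dirs}", "-P", dst_dir, url]
    return cmd


# Plan a download with a stand-in for shutil.which, so no wget install is needed.
planned = sketch_plan_web_download(
    "https://example.org/parsivel/PAR001_Cabauw",
    "/tmp/station",
    which=lambda name: "/usr/bin/wget",
)
print(planned)
```

Separating planning from execution keeps the URL handling testable without network access.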
- disdrodb.data_transfer.download_data.download_zip_file(url, dst_dir)
Download a zip file from Zenodo and extract the station raw data.
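The extraction half of this operation can be sketched with the standard library alone. The helper `sketch_extract_zip` is hypothetical and takes the archive as bytes; the download step is omitted because it needs network access, and the way disdrodb fetches and unpacks the file may differ.

```python
import io
import os
import tempfile
import zipfile


def sketch_extract_zip(zip_bytes: bytes, dst_dir: str) -> list[str]:
    """Extract an in-memory zip archive into dst_dir and return member names."""
    os.makedirs(dst_dir, exist_ok=True)
    with zipfile.ZipFile(io.BytesIO(zip_bytes)) as archive:
        archive.extractall(dst_dir)
        return archive.namelist()


# Build a tiny archive in memory to stand in for a downloaded file.
buffer = io.BytesIO()
with zipfile.ZipFile(buffer, "w") as archive:
    archive.writestr("station/raw_2020.txt", "sample")

with tempfile.TemporaryDirectory() as tmp_dir:
    names = sketch_extract_zip(buffer.getvalue(), tmp_dir)
    extracted = os.path.exists(os.path.join(tmp_dir, "station", "raw_2020.txt"))
```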
- disdrodb.data_transfer.download_data.ensure_trailing_slash(url: str) → str
Return url guaranteed to end with a slash.
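A minimal sketch of this normalization, under the assumption that a URL already ending in a slash is returned unchanged (the name `sketch_ensure_trailing_slash` is hypothetical):

```python
def sketch_ensure_trailing_slash(url: str) -> str:
    """Return url with exactly one slash appended if it lacks a trailing one."""
    return url if url.endswith("/") else url + "/"


normalized = sketch_ensure_trailing_slash("https://example.org/data")
print(normalized)
```

The trailing slash matters for recursive wget downloads because it marks the last path component as a directory rather than a file.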
disdrodb.data_transfer.upload_data module
Routines to upload data to the DISDRODB Decentralized Data Archive.
- disdrodb.data_transfer.upload_data.click_upload_archive_options(function: object)
Click command line options for DISDRODB archive upload.
- Parameters:
function (object) – Function.
- disdrodb.data_transfer.upload_data.click_upload_options(function: object)
Click command arguments for DISDRODB data upload.
- disdrodb.data_transfer.upload_data.upload_archive(platform: str | None = None, force: bool = False, data_archive_dir: str | None = None, metadata_archive_dir: str | None = None, **fields_kwargs) → None
Find all stations containing local data and upload them to a remote repository.
- Parameters:
platform (str, optional) – Name of the remote platform. The default platform is "sandbox.zenodo" (for testing purposes). Switch to "zenodo" for final data dissemination.
force (bool, optional) – If True, upload even if data already exists on another remote location. The default value is force=False.
data_archive_dir (str, optional) – The directory path where the DISDRODB Data Archive is located. The directory path must end with <...>/DISDRODB. If None, the data_archive_dir path specified in the active DISDRODB configuration is used.
data_sources (str or list of str, optional) – Data source name (e.g. EPFL). If not provided (None), all data sources will be uploaded. The default value is data_sources=None.
campaign_names (str or list of str, optional) – Campaign name (e.g. EPFL_ROOF_2012). If not provided (None), all campaigns will be uploaded. The default value is campaign_names=None.
station_names (str or list of str, optional) – Station name. If not provided (None), all stations will be uploaded. The default value is station_names=None.
- disdrodb.data_transfer.upload_data.upload_station(data_source: str, campaign_name: str, station_name: str, platform: str | None = 'sandbox.zenodo', force: bool = False, data_archive_dir: str | None = None, metadata_archive_dir: str | None = None) → None
Upload data from a single DISDRODB station to a remote repository.
This function also automatically updates the disdrodb_data_url in the metadata file.
- Parameters:
data_source (str) – The name of the institution (for campaigns spanning multiple countries) or the name of the country (for campaigns or sensor networks within a single country). Must be provided in UPPER CASE.
campaign_name (str) – The name of the campaign. Must be provided in UPPER CASE.
station_name (str) – The name of the station.
data_archive_dir (str, optional) – The directory path where the DISDRODB Data Archive is located. The directory path must end with <...>/DISDRODB. If None, the data_archive_dir path specified in the active DISDRODB configuration is used.
platform (str, optional) – Name of the remote data storage platform. The default platform is "sandbox.zenodo" (for testing purposes). Switch to "zenodo" for final data dissemination.
force (bool, optional) – If True, upload the data and overwrite the disdrodb_data_url. The default value is force=False.
disdrodb.data_transfer.zenodo module
DISDRODB Zenodo utility.
Module contents
Routines to download and upload data to the DISDRODB Decentralized Data Archive.