disdrodb.l2 package

Submodules

disdrodb.l2.empirical_dsd module

Functions for computation of DSD parameters.

The functions in this module expect xarray.DataArray objects as input. Zeros and NaN values in the input arrays are handled correctly. Infinite values should be removed beforehand; otherwise they propagate through the computations.

disdrodb.l2.empirical_dsd.add_bins_metrics(ds)

Add bin metrics if missing.

disdrodb.l2.empirical_dsd.compute_integral_parameters(drop_number_concentration, velocity, diameter, diameter_bin_width, sample_interval, water_density)

Compute integral parameters of a drop size distribution (DSD).

Parameters:
  • drop_number_concentration (xr.DataArray) – Drop number concentration in each diameter bin [#/m3/mm].

  • velocity (xr.DataArray) – Fall velocity of drops in each diameter bin [m/s]. If a velocity_method dimension is present, the parameters are computed for each velocity estimate.

  • diameter (array-like) – Diameter of drops in each bin in m.

  • diameter_bin_width (array-like) – Width of each diameter bin in mm.

  • sample_interval (float) – Time interval over which the samples are collected in seconds.

  • water_density (float or array-like) – Density of water [kg/m3].

Returns:

ds – Dataset containing the computed integral parameters:

  • Nt: Total number concentration [#/m3]

  • M1 to M6: Moments of the drop size distribution

  • Z: Reflectivity factor [dBZ]

  • W: Liquid water content [g/m3]

  • D10: Diameter at the 10th quantile of the cumulative LWC distribution [mm]

  • D50: Median volume drop diameter [mm]

  • D90: Diameter at the 90th quantile of the cumulative LWC distribution [mm]

  • Dmode: Diameter at which the distribution peaks [mm]

  • Dm: Mean volume drop diameter [mm]

  • sigma_m: Standard deviation of the volume drop diameter [mm]

  • Nw: Normalized intercept parameter [m⁻³·mm⁻¹]

  • R: Rain rate [mm/h]

  • P: Rain accumulation [mm]

  • TKE: Total Kinetic Energy [J/m2]

  • KED: Kinetic Energy per unit rainfall Depth [J·m⁻²·mm⁻¹]

  • KEF: Kinetic Energy Flux [J·m⁻²·h⁻¹]

Return type:

xarray.Dataset

disdrodb.l2.empirical_dsd.compute_qc_bins_metrics(ds)

Compute quality-control metrics for drop-count bins along the diameter dimension.

This function selects the first available drop-related variable from the dataset, optionally collapses over velocity methods and the velocity dimension, then computes four metrics per time step:

  1. Nbins: total number of diameter bins between the first and last non-zero count

  2. Nbins_missing: number of bins with zero or NaN counts in that interval

  3. Nbins_missing_fraction: fraction of missing bins (zeros) in the interval

  4. Nbins_missing_consecutive: maximum length of consecutive missing bins

Parameters:

ds (xr.Dataset) – Input dataset containing one of the following variables: ‘drop_counts’, ‘drop_number_concentration’, or ‘drop_number’. If a ‘velocity_method’ dimension exists, only the first method is used. If a velocity dimension (specified by VELOCITY_DIMENSION) exists, it is summed over.

Returns:

Dataset with a new ‘metric’ dimension of size 4 and coordinates: [‘Nbins’, ‘Nbins_missing’, ‘Nbins_missing_fraction’, ‘Nbins_missing_consecutive’], indexed by ‘time’.

Return type:

xr.Dataset
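The four metrics can be illustrated for a single time step with a plain-Python sketch. This is not the library's xarray implementation: the helper name qc_bins_metrics, the treatment of NaN as missing, and the values returned for a spectrum without drops are assumptions made for illustration.

```python
def qc_bins_metrics(counts):
    """Return (Nbins, Nbins_missing, Nbins_missing_fraction, Nbins_missing_consecutive).

    counts holds the per-diameter-bin drop counts of one time step.
    Zero and NaN counts are both treated as missing (assumption of this sketch).
    """
    # c == c is False for NaN, so NaN bins count as "no drops"
    has_drops = [c == c and c > 0 for c in counts]
    if not any(has_drops):
        # Assumed convention for an empty spectrum
        return 0, 0, float("nan"), 0
    first = has_drops.index(True)
    last = len(has_drops) - 1 - has_drops[::-1].index(True)
    interval = has_drops[first:last + 1]
    nbins = len(interval)
    nbins_missing = interval.count(False)
    # Longest run of consecutive missing bins inside the interval
    longest = run = 0
    for filled in interval:
        run = 0 if filled else run + 1
        longest = max(longest, run)
    return nbins, nbins_missing, nbins_missing / nbins, longest


counts = [0, 3, 0, 0, 5, float("nan"), 2, 0]
metrics = qc_bins_metrics(counts)
print(metrics)  # (6, 3, 0.5, 2)
```

For these counts the interval spans six bins, three of which are empty, and the longest gap is two bins.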

disdrodb.l2.empirical_dsd.compute_spectrum_parameters(drop_number_concentration, velocity, diameter, sample_interval, water_density=1000)

Compute drop size spectrum of rain rate, kinetic energy, mass and reflectivity.

Parameters:
  • drop_number_concentration (xr.DataArray) – Drop number concentration in each diameter bin [#/m3/mm].

  • velocity (xr.DataArray) – Fall velocity of drops in each diameter bin [m/s]. If a velocity_method dimension is present, the parameters are computed for each velocity estimate.

  • diameter (array-like) – Diameter of drops in each bin in m.

  • sample_interval (float) – Time interval over which the samples are collected in seconds.

  • water_density (float or array-like) – Density of water [kg/m3].

Returns:

ds – Dataset containing the following spectra:

  • KE_spectrum: Kinetic Energy spectrum [J/m2/mm]

  • R_spectrum: Rain Rate spectrum [mm/h/mm]

  • W_spectrum: Mass spectrum [g/m3/mm]

  • Z_spectrum: Reflectivity spectrum [dBZ of mm6/m3/mm]

Return type:

xarray.Dataset

disdrodb.l2.empirical_dsd.count_bins_with_drops(ds)

Count the number of diameter bins with data.

disdrodb.l2.empirical_dsd.get_bin_dimensions(xr_obj)

Return the dimensions of the drop spectrum.

disdrodb.l2.empirical_dsd.get_drop_average_velocity(drop_number)

Calculate the drop average velocity \( v_m(D) \) per diameter class.

The average velocity is obtained by weighting each velocity bin by the number of drops it contains. If no drops are recorded in a given diameter bin, the average velocity for that bin is set to NaN.

Parameters:

drop_number (xarray.DataArray) – Array of drop counts \( n(D,v) \) per diameter (and velocity, if available) bins over the measurement interval. The DataArray must have the velocity_bin_center coordinate.

Returns:

average_velocity – Array of drop average velocity \( v_m(D) \) in m·s⁻¹. At timesteps with zero drop counts, it returns NaN.

Return type:

xarray.DataArray
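The count-weighted average described above can be sketched in plain Python for one time step. The helper name and the nested-list layout (one row per diameter bin, one column per velocity bin) are assumptions for illustration; the library itself operates on xarray.DataArray objects.

```python
def drop_average_velocity(drop_number, velocity_bin_center):
    """Count-weighted mean velocity per diameter bin.

    drop_number[i][j] is the drop count in diameter bin i and velocity bin j.
    velocity_bin_center[j] is the velocity of bin j in m/s.
    """
    averages = []
    for counts in drop_number:
        total = sum(counts)
        if total == 0:
            # No drops in this diameter bin: the average is undefined
            averages.append(float("nan"))
        else:
            weighted = sum(n * v for n, v in zip(counts, velocity_bin_center))
            averages.append(weighted / total)
    return averages


velocity_bin_center = [1.0, 2.0, 4.0]  # m/s, illustrative values
drop_number = [[2, 2, 0], [0, 0, 0], [0, 1, 3]]
v_m = drop_average_velocity(drop_number, velocity_bin_center)
print(v_m)  # [1.5, nan, 3.5]
```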

disdrodb.l2.empirical_dsd.get_drop_number_concentration(drop_number, velocity, diameter_bin_width, sampling_area, sample_interval)

Calculate the volumetric drop number concentration \( N(D) \) per diameter class.

Computes the drop number concentration \( N(D) \) [m⁻³·mm⁻¹] for each diameter class from the measured drop counts and the sensor parameters. \( N(D) \), also referred to as the drop size distribution, is the number of drops per unit volume of air per unit diameter interval.

Parameters:
  • velocity (xarray.DataArray) – Array of drop fall velocities \( v(D) \) corresponding to each diameter bin in meters per second (m/s). Typically the estimated fall velocity is used, but one can also pass the velocity bin centers of the optical disdrometer, which are then broadcast along the diameter bin dimension.

  • diameter_bin_width (xarray.DataArray) – Width of each diameter bin \( \Delta D \) in millimeters (mm).

  • drop_number (xarray.DataArray) – Array of drop counts \( n(D) \) or \( n(D,v) \) per diameter (and velocity, if available) bins over the measurement interval.

  • sample_interval (float or xarray.DataArray) – Time over which the drops are counted \( \Delta t \) in seconds (s).

  • sampling_area (float or xarray.DataArray) – The effective sampling area \( A \) of the sensor in square meters (m²).

Returns:

drop_number_concentration – Array of drop number concentrations \( N(D) \) in m⁻³·mm⁻¹, representing the number of drops per unit volume per unit diameter interval.

Return type:

xarray.DataArray or ndarray

Notes

The drop number concentration \( N(D) \) is calculated using:

\[N(D) = \frac{n(D)}{A_{\text{eff}}(D) \cdot \Delta D \cdot \Delta t \cdot v(D)}\]

where:

  • \( n(D,v) \): Number of drops counted in diameter (and velocity) bins.

  • \( A_{\text{eff}}(D) \): Effective sampling area of the sensor for diameter \( D \) in square meters (m²).

  • \( \Delta D \): Diameter bin width in millimeters (mm).

  • \( \Delta t \): Measurement interval in seconds (s).

  • \( v(D) \): Fall velocity of drops in diameter bin \( D \) in meters per second (m/s).

The effective sampling area \( A_{\text{eff}}(D) \) depends on the sensor and may vary with drop diameter.
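As a worked instance of the formula, the following plain-Python sketch evaluates \( N(D) \) for three diameter bins. It assumes a sampling area that does not vary with diameter; the helper name and all numeric values are hypothetical.

```python
def drop_number_concentration(n, velocity, diameter_bin_width,
                              sampling_area, sample_interval):
    """N(D) [m^-3 mm^-1] = n(D) / (A * dD * dt * v(D)) for each diameter bin."""
    return [
        n_i / (sampling_area * dd_i * sample_interval * v_i)
        for n_i, v_i, dd_i in zip(n, velocity, diameter_bin_width)
    ]


n = [12, 30, 6]                       # drop counts per diameter bin
velocity = [2.0, 4.0, 6.0]            # fall velocity per bin [m/s]
diameter_bin_width = [0.25, 0.25, 0.5]  # bin widths [mm]
sampling_area = 0.005                 # [m^2], hypothetical sensor value
sample_interval = 60                  # [s]

N = drop_number_concentration(n, velocity, diameter_bin_width,
                              sampling_area, sample_interval)
print([round(value, 2) for value in N])  # [80.0, 100.0, 6.67]
```

The first bin samples a volume of 0.005 m² × 2 m/s × 60 s = 0.6 m³, so 12 drops in a 0.25 mm wide bin give 80 drops per m³ per mm.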

disdrodb.l2.empirical_dsd.get_drop_volume(diameter)

Compute the volume of a droplet assuming it is spherical.

Parameters:

diameter (float or array-like) – The diameter of the droplet(s). Can be a scalar or an array of diameters.

Returns:

The volume of the droplet(s) calculated in cubic units based on the input diameter(s).

Return type:

array-like

Notes

The volume is calculated using the formula for the volume of a sphere: V = (π/6) * d^3, where d is the diameter of the droplet.

disdrodb.l2.empirical_dsd.get_effective_sampling_area(sensor_name, diameter)

Compute the effective sampling area of the disdrometer in m².

The diameter must be provided in meters.

disdrodb.l2.empirical_dsd.get_equivalent_reflectivity_factor(drop_number_concentration, diameter, diameter_bin_width)

Compute the equivalent reflectivity factor in decibels relative to 1 mm⁶·m⁻³ (dBZ).

The equivalent reflectivity (in mm⁶·m⁻³) is obtained from the sixth moment of the drop size distribution (DSD). The reflectivity factor is expressed in decibels relative to 1 mm⁶·m⁻³ using the formula:

\[Z = 10 \cdot \log_{10}(z)\]

where \( z \) is the reflectivity in linear units of the DSD.

To convert back the reflectivity factor to linear units (mm⁶·m⁻³), use the formula:

\[z = 10^{(Z/10)}\]

Parameters:
  • drop_number_concentration (xarray.DataArray) – Array representing the concentration of droplets per diameter class in number per unit volume.

  • diameter (xarray.DataArray) – Array of droplet diameters in meters (m).

  • diameter_bin_width (xarray.DataArray) – Array representing the width of each diameter bin in millimeters (mm).

Returns:

The equivalent reflectivity factor in decibels (dBZ).

Return type:

xarray.DataArray

Notes

The function computes the sixth moment of the DSD using the formula:

\[z = \sum n(D) \cdot D^6 \cdot \Delta D\]

where \( n(D) \) is the drop number concentration, \( D \) is the drop diameter, and \( \Delta D \) is the diameter bin width.
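The sixth-moment sum and the conversion between linear units and dBZ can be sketched as follows. The sketch takes diameters directly in millimeters so that \( z \) comes out in mm⁶·m⁻³; the helper name, the NaN returned for an empty spectrum, and the numeric values are assumptions for illustration.

```python
import math


def equivalent_reflectivity_factor(N, diameter_mm, diameter_bin_width):
    """Return (z, Z): linear reflectivity [mm^6 m^-3] and its value in dBZ."""
    z = sum(n * d**6 * dd for n, d, dd in zip(N, diameter_mm, diameter_bin_width))
    # Assumed convention of this sketch: NaN when there are no drops
    Z = 10 * math.log10(z) if z > 0 else float("nan")
    return z, Z


N = [1000.0, 100.0]             # drop number concentration [m^-3 mm^-1]
diameter_mm = [1.0, 2.0]        # bin center diameters [mm]
diameter_bin_width = [0.5, 0.5]  # bin widths [mm]

z_linear, Z_dbz = equivalent_reflectivity_factor(N, diameter_mm, diameter_bin_width)
z_back = 10 ** (Z_dbz / 10)     # convert dBZ back to linear units
print(f"z = {z_linear:.1f} mm^6/m^3, Z = {Z_dbz:.2f} dBZ")
```

Here the 2 mm bin holds a tenth of the concentration of the 1 mm bin yet contributes 3200 of the 3700 mm⁶·m⁻³, which shows how strongly the sixth power weights large drops.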

disdrodb.l2.empirical_dsd.get_equivalent_reflectivity_spectrum(drop_number_concentration, diameter)

Compute the equivalent reflectivity per diameter class.

The equivalent reflectivity per unit diameter Z(D) [in mm⁶·m⁻³ / mm] is expressed in decibels using the formula:

\[Z(D) = 10 \cdot \log_{10}(z(D))\]

where \( z(D) \) is the equivalent reflectivity spectrum in linear units of the DSD.

To convert back the reflectivity factor to linear units (mm⁶·m⁻³ / mm), use the formula:

\[z(D) = 10^{(Z(D)/10)}\]

To obtain the total equivalent reflectivity factor (z), multiply z(D) by the diameter bin widths and sum over the diameter bins.

Parameters:
  • drop_number_concentration (xarray.DataArray) – Array representing the concentration of droplets per diameter class in number per unit volume.

  • diameter (xarray.DataArray) – Array of droplet diameters in meters (m).

Returns:

The equivalent reflectivity spectrum in decibels (dBZ).

Return type:

xarray.DataArray

disdrodb.l2.empirical_dsd.get_kinetic_energy_spectrum(drop_number_concentration, velocity, diameter, sample_interval, water_density=1000)

Compute the rainfall kinetic energy per diameter class.

To obtain the Total Kinetic Energy (TKE), multiply KE(D) by the diameter bin widths and sum over the diameter bins.

Parameters:
  • drop_number_concentration (xarray.DataArray) – Array of drop number concentrations \( N(D) \) in m⁻³·mm⁻¹.

  • velocity (xarray.DataArray or float) – The fall velocities \( v \) of the drops, in meters per second (m/s).

  • diameter (xarray.DataArray) – The equivalent volume diameters \( D \) of the drops in each bin, in meters (m).

  • sample_interval (float) – The time over which the drops are counted \( \Delta t \) in seconds (s).

  • water_density (float, optional) – The density of water \( \rho_w \) in kilograms per cubic meter (kg/m³). Default is 1000 kg/m³.

Returns:

Kinetic Energy Spectrum [J/m2/mm]

Return type:

xr.DataArray

disdrodb.l2.empirical_dsd.get_kinetic_energy_variables(drop_number_concentration, velocity, diameter, diameter_bin_width, sample_interval, water_density=1000)

Compute rainfall kinetic energy descriptors from the drop number concentration.

Parameters:
  • drop_number_concentration (xarray.DataArray) – Array of drop number concentrations \( N(D) \) in m⁻³·mm⁻¹.

  • velocity (xarray.DataArray or float) – The fall velocities \( v \) of the drops, in meters per second (m/s).

  • diameter (xarray.DataArray) – The equivalent volume diameters \( D \) of the drops in each bin, in meters (m).

  • diameter_bin_width (xarray.DataArray) – Width of each diameter bin \( \Delta D \) in millimeters (mm).

  • sample_interval (float) – The time over which the drops are counted \( \Delta t \) in seconds (s).

  • water_density (float, optional) – The density of water \( \rho_w \) in kilograms per cubic meter (kg/m³). Default is 1000 kg/m³.

Returns:

Xarray Dataset with relevant rainfall kinetic energy variables:

  • TKE: Total Kinetic Energy [J/m2]

  • KED: Kinetic Energy per unit rainfall Depth [J·m⁻²·mm⁻¹]. Typical values range between 0 and 40 J·m⁻²·mm⁻¹.

  • KEF: Kinetic Energy Flux [J·m⁻²·h⁻¹]. Typical values range between 0 and 5000 J·m⁻²·h⁻¹. KEF is related to KED by the rain rate: KED = KEF/R.

Return type:

xarray.Dataset

disdrodb.l2.empirical_dsd.get_kinetic_energy_variables_from_drop_number(drop_number, velocity, sampling_area, diameter, sample_interval, water_density=1000)

Compute rainfall kinetic energy descriptors from the measured drop number spectrum.

Parameters:
  • drop_number (xarray.DataArray) – The number of drops in each diameter (and velocity, if available) bin(s).

  • velocity (xarray.DataArray or float) – The fall velocities \( v \) of the drops in each bin, in meters per second (m/s). Values are broadcast to match the dimensions of drop_number.

  • diameter (xarray.DataArray) – The equivalent volume diameters \( D \) of the drops in each bin, in meters (m).

  • sampling_area (float) – The effective sampling area \( A \) of the sensor in square meters (m²).

  • sample_interval (float) – The time over which the drops are counted \( \Delta t \) in seconds (s).

  • water_density (float, optional) – The density of water \( \rho_w \) in kilograms per cubic meter (kg/m³). Default is 1000 kg/m³.

Returns:

Xarray Dataset with relevant rainfall kinetic energy variables:

  • TKE: Total Kinetic Energy [J/m2]

  • KED: Kinetic Energy per unit rainfall Depth [J·m⁻²·mm⁻¹]. Typical values range between 0 and 40 J·m⁻²·mm⁻¹.

  • KEF: Kinetic Energy Flux [J·m⁻²·h⁻¹]. Typical values range between 0 and 5000 J·m⁻²·h⁻¹. KEF is related to KED by the rain rate: KED = KEF/R.

Return type:

xarray.Dataset

Notes

KED measures the energy associated with each unit of rainfall depth. It is useful for analyzing the erosive potential of raindrops as a function of rainfall intensity.

The kinetic energy of a rain drop is defined as:

\[KE(D) = \frac{1}{2} \cdot m_{drop} \cdot v_{drop}^2 = \frac{\pi \rho_{w}}{12} \cdot D^3 \cdot v^2\]

The Total Kinetic Energy (TKE) is calculated using:

\[TKE = \frac{1}{A} \sum_{i,j} \left( n_{ij} \cdot KE(D_{i}) \right) = \frac{\pi \rho_{w}}{12 \cdot A} \sum_{i,j} \left( n_{ij} \cdot D_{i}^3 \cdot v_{j}^2 \right)\]

The Kinetic Energy Flux (KEF) is calculated using:

\[KEF = \frac{TKE}{\Delta t} \cdot 3600\]

KED is calculated using:

\[KED = \frac{KEF}{R} = \frac{3600}{\Delta t \cdot R} \cdot \frac{\pi \rho_w}{12 \cdot A} \sum_{i,j} \left( n_{ij} \cdot D_i^3 \cdot v_j^2 \right)\]

where:

  • \( n_{ij} \) is the number of drops in diameter bin \( i \) and velocity bin \( j \).

  • \( D_i \) is the diameter of bin \( i \).

  • \( v_j \) is the velocity of bin \( j \).

  • \( A \) is the sampling area.

  • \( \Delta t \) is the measurement interval in seconds.

  • \( R \) is the rainfall rate in mm/hr.
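The relations between TKE, KEF and KED can be checked numerically with a plain-Python sketch. It assumes diameters in meters, a constant sampling area, and a rain rate derived from the same drop counts as the water volume collected per unit area and time; the helper name and the numeric values are hypothetical, and this is not the library's implementation.

```python
import math


def kinetic_energy_variables(drop_number, velocity, diameter,
                             sampling_area, sample_interval, water_density=1000.0):
    """Return TKE [J/m2], KEF [J/m2/h], KED [J/m2/mm] and R [mm/h].

    drop_number[i][j]: counts in diameter bin i and velocity bin j.
    diameter[i] in meters, velocity[j] in m/s.
    """
    sum_d3_v2 = 0.0
    sum_d3 = 0.0
    for counts, d in zip(drop_number, diameter):
        for n, v in zip(counts, velocity):
            sum_d3_v2 += n * d**3 * v**2
            sum_d3 += n * d**3
    tke = math.pi * water_density / (12 * sampling_area) * sum_d3_v2
    kef = tke / sample_interval * 3600
    # Water volume per unit area and time [m/s], converted to mm/h
    rain_rate = math.pi / 6 * sum_d3 / (sampling_area * sample_interval) * 1000 * 3600
    ked = kef / rain_rate if rain_rate > 0 else float("nan")
    return {"TKE": tke, "KEF": kef, "KED": ked, "R": rain_rate}


drop_number = [[40, 10], [5, 25]]  # counts per (diameter, velocity) bin
velocity = [3.0, 6.0]              # [m/s]
diameter = [1e-3, 2e-3]            # [m]
result = kinetic_energy_variables(drop_number, velocity, diameter,
                                  sampling_area=0.005, sample_interval=60)
for name, value in result.items():
    print(f"{name}: {value:.3f}")
```

A useful sanity check: if all drops fall at 6 m/s, 1 mm of rain over 1 m² is 1 kg of water, so KED is ½ · 1 kg · (6 m/s)² = 18 J·m⁻²·mm⁻¹ regardless of the drop sizes.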

disdrodb.l2.empirical_dsd.get_liquid_water_content(drop_number_concentration, diameter, diameter_bin_width, water_density=1000)

Calculate the liquid water content based on drop number concentration and drop diameter.

Parameters:
  • drop_number_concentration (array-like) – The concentration of droplets (number of droplets per unit volume) in each diameter bin.

  • diameter (array-like) – The diameters of the droplets for each bin, in meters (m).

  • diameter_bin_width (array-like) – The width of each diameter bin, in millimeters (mm).

  • water_density (float, optional) – The density of water in kg/m^3. The default is 1000 kg/m3.

Returns:

The calculated liquid water content in grams per cubic meter (g/m3).

Return type:

array-like

disdrodb.l2.empirical_dsd.get_liquid_water_content_from_moments(moment_3, water_density=1000)

Calculate the liquid water content (LWC) from the third moment of the DSD.

LWC represents the mass of liquid water per unit volume of air.

Parameters:
  • moment_3 (float or array-like) – The third moment of the drop size distribution, \( M_3 \), in units of [m⁻³·mm³] (number per cubic meter times diameter cubed).

  • water_density (float, optional) – The density of water in kilograms per cubic meter (kg/m³). Default is 1000 kg/m³ (approximate density of water at 20°C).

Returns:

lwc – The liquid water content in grams per cubic meter (g/m³).

Return type:

float or array-like

Notes

The liquid water content is calculated using the formula:

\[\text{LWC} = \frac{\pi \rho_w}{6} \cdot M_3\]

where:

  • \( \text{LWC} \) is the liquid water content [g/m³].

  • \( \rho_w \) is the density of water [g/mm³].

  • \( M_3 \) is the third moment of the DSD [m⁻³·mm³].

Examples

Compute the liquid water content from the third moment:

>>> from disdrodb.l2.empirical_dsd import get_liquid_water_content_from_moments
>>> moment_3 = 1e6  # Example value in [m⁻³·mm³]
>>> lwc = get_liquid_water_content_from_moments(moment_3)
>>> print(f"LWC: {lwc:.4f} g/m³")
LWC: 523.5988 g/m³
disdrodb.l2.empirical_dsd.get_liquid_water_spectrum(drop_number_concentration, diameter, water_density=1000)

Calculate the mass spectrum W(D) per diameter class.

It represents the mass of liquid water as a function of raindrop diameter. The integrated liquid water content can be obtained by multiplying the spectrum with the diameter bins intervals and summing over the diameter bins.

Parameters:
  • drop_number_concentration (array-like) – The concentration of droplets (number of droplets per unit volume) in each diameter bin.

  • diameter (array-like) – The diameters of the droplets for each bin, in meters (m).

Returns:

The calculated rain drop mass spectrum in grams per cubic meter per unit diameter (g/m3/mm).

Return type:

array-like

disdrodb.l2.empirical_dsd.get_mean_volume_drop_diameter(moment_3, moment_4)

Calculate the volume-weighted mean volume diameter \( D_m \) from DSD moments.

The mean volume diameter of a drop size distribution (DSD) is computed using the third and fourth moments.

The volume-weighted mean volume diameter is also referred to as the mass mean diameter. It represents the first moment of the mass spectrum.

If no drops are recorded, the output value is NaN.

Parameters:
  • moment_3 (float or array-like) – The third moment of the drop size distribution, \( M_3 \), in units of [m⁻³·mm³].

  • moment_4 (float or array-like) – The fourth moment of the drop size distribution, \( M_4 \), in units of [m⁻³·mm⁴].

Returns:

D_m – The mean volume diameter in millimeters (mm).

Return type:

float or array-like

Notes

The mean volume diameter is calculated using the formula:

\[D_m = \frac{M_4}{M_3}\]

where:

  • \( D_m \) is the mean volume diameter [mm].

  • \( M_3 \) is the third moment of the DSD [m⁻³·mm³].

  • \( M_4 \) is the fourth moment of the DSD [m⁻³·mm⁴].

Examples

Compute the mean volume diameter from the third and fourth moments:

>>> moment_3 = 1e6  # Example value in [m⁻³·mm³]
>>> moment_4 = 5e6  # Example value in [m⁻³·mm⁴]
>>> D_m = get_mean_volume_drop_diameter(moment_3, moment_4)
>>> print(f"Mean Volume Diameter D_m: {D_m:.4f} mm")
Mean Volume Diameter D_m: 5.0000 mm
disdrodb.l2.empirical_dsd.get_median_volume_drop_diameter(drop_number_concentration, diameter, diameter_bin_width, water_density=1000)

Compute the median volume drop diameter (D50).

The median volume drop diameter (D50) is the diameter that splits the total liquid water content in half: drops smaller than D50 contribute half of the rainwater content in the sampled volume, and drops larger than D50 contribute the other half. D50 is sensitive to the concentration of large drops.

The 50 in D50 stands for the 50th percentile of the distribution.

Parameters:
  • drop_number_concentration (xarray.DataArray) – The drop number concentration \( N(D) \) for each diameter bin, typically in units of number per cubic meter per millimeter (m⁻³·mm⁻¹).

  • diameter (xarray.DataArray) – The equivalent volume diameters \( D \) of the drops in each bin, in meters (m).

  • diameter_bin_width (xarray.DataArray) – The width \( \Delta D \) of each diameter bin, in millimeters (mm).

  • water_density (float, optional) – The density of water in kg/m^3. The default is 1000 kg/m3.

Returns:

Median volume drop diameter (D50) [mm]. The drop diameter that divides the volume of water contained in the sample into two equal parts.

Return type:

xarray.DataArray

disdrodb.l2.empirical_dsd.get_min_max_diameter(drop_counts)

Get the minimum and maximum diameters where drop_counts is non-zero.

Parameters:

drop_counts (xarray.DataArray) – Drop counts with dimensions (“time”, “diameter_bin_center”) and coordinate “diameter_bin_center”. The diameter coordinate is assumed to be monotonically increasing.

Returns:

  • min_drop_diameter (xarray.DataArray) – Minimum diameter where drop_counts is non-zero, for each time step.

  • max_drop_diameter (xarray.DataArray) – Maximum diameter where drop_counts is non-zero, for each time step.

disdrodb.l2.empirical_dsd.get_mode_diameter(drop_number_concentration, diameter)

Get raindrop diameter with highest occurrence.

Parameters:
  • drop_number_concentration (xarray.DataArray) – The drop number concentration N(D) for each diameter bin, typically in units of number per cubic meter per millimeter (m⁻³·mm⁻¹).

  • diameter (xarray.DataArray) – The equivalent volume diameters D of the drops in each bin, in meters (m).

Returns:

The diameter with the highest drop number concentration.

Return type:

xarray.DataArray

disdrodb.l2.empirical_dsd.get_moment(drop_number_concentration, diameter, diameter_bin_width, moment)

Calculate the m-th moment of the drop size distribution.

Computes the m-th moment of the drop size distribution (DSD), denoted as E[D**m], where D is the drop diameter and m is the order of the moment. This is useful in meteorology and hydrology for characterizing precipitation. For example, weather radar measurements correspond to the sixth moment of the DSD (m = 6).

Parameters:
  • drop_number_concentration (xarray.DataArray) – The drop number concentration N(D) for each diameter bin, typically in units of number per cubic meter per millimeter (m⁻³ mm⁻¹).

  • diameter (xarray.DataArray) – The equivalent volume diameters D of the drops in each bin, in meters (m).

  • diameter_bin_width (xarray.DataArray) – The width dD of each diameter bin, in millimeters (mm).

  • moment (int or float) – The order m of the moment to compute.

Returns:

moment_value – The computed m-th moment of the drop size distribution, typically in units dependent on the input units, such as mmᵐ m⁻³.

Return type:

xarray.DataArray

Notes

The m-th moment is calculated using the formula:

\[M_m = \sum_{\text{bins}} N(D) \cdot D^m \cdot dD\]

where:

  • \( M_m \) is the m-th moment of the DSD.

  • \( N(D) \) is the drop number concentration for diameter \( D \).

  • \( D^m \) is the diameter raised to the power of \( m \).

  • \( dD \) is the diameter bin width.

This computation integrates over the drop size distribution to provide a scalar value representing the statistical moment.
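The discrete sum can be written in a few lines of plain Python. The sketch below takes diameters in millimeters so that the result has units of mmᵐ·m⁻³; the helper name moment_of_dsd and the numeric values are hypothetical.

```python
def moment_of_dsd(N, diameter_mm, diameter_bin_width, moment):
    """m-th moment: sum over bins of N(D) * D**m * dD."""
    return sum(
        n * d**moment * dd
        for n, d, dd in zip(N, diameter_mm, diameter_bin_width)
    )


N = [1000.0, 100.0]             # drop number concentration [m^-3 mm^-1]
diameter_mm = [1.0, 2.0]        # bin center diameters [mm]
diameter_bin_width = [0.5, 0.5]  # bin widths [mm]

M0 = moment_of_dsd(N, diameter_mm, diameter_bin_width, 0)  # total number concentration
M3 = moment_of_dsd(N, diameter_mm, diameter_bin_width, 3)
M4 = moment_of_dsd(N, diameter_mm, diameter_bin_width, 4)
print(M0, M3, M4)  # 550.0 900.0 1300.0
```

The ratio M4 / M3 of these values, about 1.44 mm, is the mean volume diameter \( D_m \) of this toy distribution.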

disdrodb.l2.empirical_dsd.get_normalized_intercept_parameter(liquid_water_content, mean_volume_diameter, water_density=1000)

Calculate the normalized intercept parameter \( N_w \) of the drop size distribution.

A higher \( N_w \) indicates a higher concentration of smaller drops. The \( N_w \) is used in models to represent the DSD when assuming a normalized gamma distribution.

Parameters:
  • liquid_water_content (float or array-like) – Liquid water content \( LWC \) in grams per cubic meter (g/m³).

  • mean_volume_diameter (float or array-like) – Mean volume diameter \( D_m \) in millimeters (mm).

  • water_density (float, optional) – Density of water \( \rho_w \) in kilograms per cubic meter (kg/m³). The default is 1000 kg/m³.

Returns:

Nw – Normalized intercept parameter \( N_w \) in units of m⁻³·mm⁻¹.

Return type:

xarray.DataArray or float

Notes

The normalized intercept parameter \( N_w \) is calculated using the formula:

\[N_w = \frac{256}{\pi \rho_w} \cdot \frac{W}{D_m^4}\]

where:

  • \( N_w \) is the normalized intercept parameter.

  • \( W \) is the liquid water content in g/m³.

  • \( D_m \) is the mean volume diameter in mm.

  • \( \rho_w \) is the density of water in kg/m³.

disdrodb.l2.empirical_dsd.get_normalized_intercept_parameter_from_moments(moment_3, moment_4)

Calculate the normalized intercept parameter \( N_w \) of the drop size distribution.

Parameters:
  • moment_3 (float or array-like) – The third moment of the drop size distribution, \( M_3 \), in units of [m⁻³·mm³] (number per cubic meter times diameter cubed).

  • moment_4 (float or array-like) – The fourth moment of the drop size distribution, \( M_4 \), in units of [m⁻³·mm⁴].

Returns:

Nw – Normalized intercept parameter \( N_w \) in units of m⁻³·mm⁻¹.

Return type:

xarray.DataArray or float

References

Testud, J., S. Oury, R. A. Black, P. Amayenc, and X. Dou, 2001: The Concept of “Normalized” Distribution to Describe Raindrop Spectra: A Tool for Cloud Physics and Cloud Remote Sensing. J. Appl. Meteor. Climatol., 40, 1118-1140, https://doi.org/10.1175/1520-0450(2001)040<1118:TCONDT>2.0.CO;2

disdrodb.l2.empirical_dsd.get_quantile_volume_drop_diameter(drop_number_concentration, diameter, diameter_bin_width, fraction, water_density=1000)

Compute the diameter corresponding to a specified fraction of the cumulative liquid water content (LWC).

This function calculates the diameter \( D_f \) at which the cumulative LWC reaches a specified fraction \( f \) of the total LWC for each drop size distribution (DSD). When \( f = 0.5 \), it computes the median volume drop diameter.

Parameters:
  • drop_number_concentration (xarray.DataArray) – The drop number concentration \( N(D) \) for each diameter bin, typically in units of number per cubic meter per millimeter (m⁻³·mm⁻¹).

  • diameter (xarray.DataArray) – The equivalent volume diameters \( D \) of the drops in each bin, in meters (m).

  • diameter_bin_width (xarray.DataArray) – The width \( \Delta D \) of each diameter bin, in millimeters (mm).

  • fraction (float or numpy.ndarray) – The fraction \( f \) of the total liquid water content for which the diameter is computed. Use 0.5 for the median volume diameter (D50), 0.1 for D10, 0.9 for D90, etc. Values must be between 0 and 1 (exclusive).

  • water_density (float, optional) – The density of water in kg/m^3. The default is 1000 kg/m3.

Returns:

D_f – The diameter \( D_f \) corresponding to the specified fraction \( f \) of cumulative LWC, in millimeters (mm). For fraction=0.5, this is the median volume drop diameter D50.

Return type:

xarray.DataArray

Notes

The calculation involves computing the cumulative sum of the liquid water content contributed by each diameter bin and finding the diameter at which the cumulative sum reaches the specified fraction \( f \) of the total liquid water content.

Linear interpolation is used between the two diameter bins where the cumulative LWC crosses the target LWC fraction.
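The cumulative-sum and interpolation steps can be sketched in plain Python for a single spectrum. Several details are assumptions of this sketch rather than documented behavior: the interpolation is done between bin center diameters, the first bin center is returned when the first bin alone already reaches the target, and the constant factor \( \pi \rho_w / 6 \) is dropped because it cancels in the fraction. The helper name and the values are hypothetical.

```python
def quantile_volume_drop_diameter(N, diameter_mm, diameter_bin_width, fraction):
    """Diameter [mm] at which the cumulative LWC reaches `fraction` of the total."""
    # Per-bin contribution to LWC, up to a constant factor
    lwc = [n * d**3 * dd for n, d, dd in zip(N, diameter_mm, diameter_bin_width)]
    total = sum(lwc)
    if total == 0:
        return float("nan")  # no drops: the quantile is undefined
    target = fraction * total
    cumulative = 0.0
    for i, contribution in enumerate(lwc):
        previous = cumulative
        cumulative += contribution
        if cumulative >= target:
            if i == 0:
                return diameter_mm[0]
            # Linear interpolation between the two bins bracketing the target
            weight = (target - previous) / (cumulative - previous)
            return diameter_mm[i - 1] + weight * (diameter_mm[i] - diameter_mm[i - 1])
    return diameter_mm[-1]  # guard against floating-point round-off


N = [1000.0, 100.0, 10.0]           # [m^-3 mm^-1]
diameter_mm = [1.0, 2.0, 3.0]       # bin centers [mm]
diameter_bin_width = [1.0, 1.0, 1.0]  # [mm]

d10 = quantile_volume_drop_diameter(N, diameter_mm, diameter_bin_width, 0.1)
d50 = quantile_volume_drop_diameter(N, diameter_mm, diameter_bin_width, 0.5)
d90 = quantile_volume_drop_diameter(N, diameter_mm, diameter_bin_width, 0.9)
print(d10, d50, d90)
```

The per-bin contributions here are 1000, 800 and 270, so the 50% target of 1035 is crossed in the second bin, 35/800 of the way from 1 mm to 2 mm.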

disdrodb.l2.empirical_dsd.get_rain_accumulation(rain_rate, sample_interval)

Calculate the total rain accumulation over a specified time period.

Parameters:
  • rain_rate (float or array-like) – The rain rate in millimeters per hour (mm/h).

  • sample_interval (int) – The time over which to accumulate rain, specified in seconds.

Returns:

The total rain accumulation in millimeters (mm) over the specified time period.

Return type:

float or numpy.ndarray
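Because the rain rate is given per hour and the interval in seconds, the accumulation amounts to a unit conversion. A minimal sketch, assuming a constant rain rate over the interval (the helper name is hypothetical):

```python
def rain_accumulation(rain_rate, sample_interval):
    """Rain depth [mm] for a rain rate [mm/h] sustained over sample_interval [s]."""
    return rain_rate * sample_interval / 3600


# 12 mm/h sustained for one minute
accumulation = rain_accumulation(12.0, 60)
print(f"{accumulation:.2f} mm")  # 0.20 mm
```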

disdrodb.l2.empirical_dsd.get_rain_rate(drop_number_concentration, velocity, diameter, diameter_bin_width)

Compute the rain rate \( R \) [mm/h] based on the drop size distribution and raindrop velocities.

Calculates the rain rate by integrating over the drop size distribution (DSD), considering the volume of water falling per unit time and area.

Parameters:
  • drop_number_concentration (xarray.DataArray) – Array of drop number concentrations \( N(D) \) in m⁻³·mm⁻¹.

  • velocity (xarray.DataArray) – Array of drop fall velocities \( v(D) \) corresponding to each diameter bin in meters per second (m/s).

  • diameter (xarray.DataArray) – Array of drop diameters \( D \) in meters (m).

  • diameter_bin_width (xarray.DataArray) – Width of each diameter bin \( \Delta D \) in millimeters (mm).

Returns:

rain_rate – The rain rate \( R \) in millimeters per hour (mm/h), representing the volume of water falling per unit area per unit time.

Return type:

xarray.DataArray

Notes

The rain rate \( R \) is calculated using:

\[R = \frac{\pi}{6} \times 10^{-6} \times 3600 \times \sum_{\text{bins}} N(D) \cdot v(D) \cdot D^3 \cdot \Delta D\]

where:

  • \( N(D) \): Drop number concentration [m⁻³·mm⁻¹].

  • \( v(D) \): Fall velocity of drops in diameter bin \( D \) [m/s].

  • \( D \): Drop diameter [mm].

  • \( \Delta D \): Diameter bin width [mm].

  • The factor \( \frac{\pi}{6} \) converts the diameter cubed to the volume of a sphere.

  • The factor \( 10^{-6} \) converts the water flux from mm³·m⁻²·s⁻¹ to mm·s⁻¹ (1 mm³ = 10⁻⁹ m³ and 1 m = 10³ mm).

  • The factor \( 3600 \) converts from seconds to hours.
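The unit bookkeeping can be verified with a short plain-Python sketch. With \( D \) in mm, \( v \) in m/s and \( N(D) \) in m⁻³·mm⁻¹, the sum is a water flux in mm³·m⁻²·s⁻¹; multiplying by 10⁻⁶ gives mm/s and by 3600 gives mm/h, which is equivalent to the prefactor 6π × 10⁻⁴ commonly quoted in the literature. The helper name and the numeric values are hypothetical.

```python
import math


def rain_rate_from_dsd(N, velocity, diameter_mm, diameter_bin_width):
    """Rain rate [mm/h] from N(D) [m^-3 mm^-1], v(D) [m/s], D [mm] and dD [mm]."""
    flux = sum(
        n * v * d**3 * dd
        for n, v, d, dd in zip(N, velocity, diameter_mm, diameter_bin_width)
    )
    # pi/6: sphere volume; 1e-6: mm^3 m^-2 s^-1 -> mm/s; 3600: s -> h
    return math.pi / 6 * 1e-6 * 3600 * flux


N = [1000.0, 100.0]             # [m^-3 mm^-1]
velocity = [4.0, 6.5]           # [m/s]
diameter_mm = [1.0, 2.0]        # [mm]
diameter_bin_width = [0.5, 0.5]  # [mm]

rain_rate = rain_rate_from_dsd(N, velocity, diameter_mm, diameter_bin_width)
print(f"R = {rain_rate:.2f} mm/h")  # R = 8.67 mm/h
```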

disdrodb.l2.empirical_dsd.get_rain_rate_contribution(drop_number_concentration, velocity, diameter, diameter_bin_width)

Compute the rain rate contribution per diameter class.

Parameters:
  • drop_number_concentration (xarray.DataArray) – Array of drop number concentrations \( N(D) \) in m⁻³·mm⁻¹.

  • velocity (xarray.DataArray) – Array of drop fall velocities \( v(D) \) corresponding to each diameter bin in meters per second (m/s).

  • diameter (xarray.DataArray) – Array of drop diameters \( D \) in meters (m).

  • diameter_bin_width (xarray.DataArray) – Width of each diameter bin \( \Delta D \) in millimeters (mm).

Returns:

The rain rate contribution percentage per diameter class.

Return type:

xarray.DataArray

disdrodb.l2.empirical_dsd.get_rain_rate_from_drop_number(drop_number, sampling_area, diameter, sample_interval)

Compute the rain rate \( R \) [mm/h] from the measured drop counts.

This function calculates the rain rate by integrating over the drop size distribution (DSD), considering the volume of water falling per unit time and area. It uses the number of drops counted in each diameter class, the effective sampling area of the sensor, the diameters of the drops, and the time interval over which the drops are counted.

Parameters:
  • drop_number (xarray.DataArray) – Array representing the number of drops per diameter class and, optionally, velocity class \( n(D, (v)) \).

  • sample_interval (float or xarray.DataArray) – The time duration over which drops are counted \( \Delta t \) in seconds (s).

  • sampling_area (float or xarray.DataArray) – The effective sampling area \( A \) of the sensor in square meters (m²).

  • diameter (xarray.DataArray) – Array of drop diameters \( D \) in meters (m).

Returns:

rain_rate – The computed rain rate \( R \) in millimeters per hour (mm/h), which represents the volume of water falling per unit area per unit time.

Return type:

xarray.DataArray

Notes

The rain rate \( R \) is calculated using the following formula:

\[ \begin{aligned} R &= \frac{\pi}{6} \times 10^{3} \times 3600 \times \sum_{\text{bins}} \frac{n(D) \cdot D^3}{A(D) \cdot \Delta t} \\ &= \pi \times 0.6 \times 10^{6} \times \sum_{\text{bins}} \frac{n(D) \cdot D^3}{A(D) \cdot \Delta t} \\ &= \pi \times 6 \times 10^{5} \times \sum_{\text{bins}} \frac{n(D) \cdot D^3}{A(D) \cdot \Delta t} \end{aligned} \]

where:

  • \( n(D) \) is the number of drops in each diameter class.

  • \( A(D) \) is the effective sampling area [m²].

  • \( D \) is the drop diameter [m].

  • \( \Delta t \) is the time interval for drop counts [s].

The factor \( 10^{3} \times 3600 \) converts the result from meters per second to millimeters per hour.

In the literature, when the diameter is expected in millimeters, the formula is given as:

\[ R = \pi \times 6 \times 10^{-4} \times \sum_{\text{bins}} \frac{n(D) \cdot D^3}{A(D) \cdot \Delta t} \]

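
As a rough numerical check of the unit conversions, the sketch below computes the rain rate as the volume of water collected per unit sampling area and per unit time, with the diameter in meters. The function name and the use of plain lists instead of xarray.DataArray are assumptions made for illustration only.

```python
import math


def rain_rate_from_counts(drop_number, diameter, sampling_area, sample_interval):
    """Return the rain rate [mm/h] from per-bin drop counts.

    drop_number: drops counted in each diameter bin [-]
    diameter: bin-center diameters [m]
    sampling_area: effective sampling area [m2]
    sample_interval: counting interval [s]
    """
    # Water depth [m]: sphere volume of every drop divided by the sampling area
    depth_m = sum(
        n * math.pi / 6 * d**3 / sampling_area
        for n, d in zip(drop_number, diameter)
    )
    # m -> mm (x 1000) and per second -> per hour (x 3600)
    return depth_m * 1000 * 3600 / sample_interval


# 1000 drops of 1 mm diameter over 0.005 m2 during 60 s
rain_rate = rain_rate_from_counts([1000], [1e-3], sampling_area=0.005, sample_interval=60)
```

For this input the result equals \( 2\pi \approx 6.28 \) mm/h, which matches \( \pi \times 6 \times 10^{5} \times n \cdot D^3 / (A \cdot \Delta t) \) evaluated by hand.
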
disdrodb.l2.empirical_dsd.get_rain_rate_spectrum(drop_number_concentration, velocity, diameter)[source][source]#

Compute the rain rate per diameter class.

It represents the rain rate as a function of raindrop diameter. The total rain rate can be obtained by multiplying the spectrum with the diameter bin width and summing over the diameter bins.

Parameters:
  • drop_number_concentration (xarray.DataArray) – Array of drop number concentrations \( N(D) \) in m⁻³·mm⁻¹.

  • velocity (xarray.DataArray) – Array of drop fall velocities \( v(D) \) corresponding to each diameter bin in meters per second (m/s).

  • diameter (xarray.DataArray) – Array of drop diameters \( D \) in meters (m).

Returns:

The rain rate spectrum in millimeters per hour per mm, representing the volume of water falling per unit area per unit time per unit diameter.

Return type:

xarray.DataArray
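
The relation between the spectrum and the total rain rate can be sketched as follows, assuming the spectrum is \( \frac{\pi}{6} \cdot N(D) \cdot v(D) \cdot D^3 \) with the diameter in meters, scaled to mm/h. Function names and the list-based inputs are hypothetical and only serve the illustration.

```python
import math


def rain_rate_spectrum(drop_number_concentration, velocity, diameter):
    """Rain rate per unit diameter [mm/h/mm]; N(D) in m-3 mm-1, v(D) in m/s, D in m."""
    factor = math.pi / 6 * 1000 * 3600  # sphere volume, m -> mm, s -> h
    return [
        factor * n * v * d**3
        for n, v, d in zip(drop_number_concentration, velocity, diameter)
    ]


def total_rain_rate(spectrum, diameter_bin_width):
    """Integrate the spectrum over the diameter bins (bin width in mm)."""
    return sum(r * w for r, w in zip(spectrum, diameter_bin_width))


spectrum = rain_rate_spectrum([1000.0, 200.0], [4.0, 6.5], [1e-3, 2e-3])
total = total_rain_rate(spectrum, [0.5, 0.5])  # mm/h
```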

disdrodb.l2.empirical_dsd.get_std_volume_drop_diameter(moment_3, moment_4, moment_5)[source][source]#

Calculate the standard deviation of the mass-weighted drop diameter (σₘ).

This parameter is often also referred to as the mass spectrum standard deviation. It quantifies the spread or variability of the DSD.

If drops are recorded in just one bin, the standard deviation of the mass-weighted drop diameter is set to 0. If no drops are recorded, the output value is NaN.

Parameters:
  • drop_number_concentration (xarray.DataArray) – The drop number concentration \( N(D) \) for each diameter bin, typically in units of number per cubic meter per millimeter (m⁻³·mm⁻¹).

  • diameter (xarray.DataArray) – The equivalent volume diameters \( D \) of the drops in each bin, in meters (m).

  • diameter_bin_width (xarray.DataArray) – The width \( \Delta D \) of each diameter bin, in millimeters (mm).

  • mean_volume_diameter (xarray.DataArray) – The mean volume diameter \( D_m \), in millimeters (mm). This is typically computed using the third and fourth moments or directly from the DSD.

Returns:

sigma_m – The standard deviation of the mass-weighted drop diameter, \( \sigma_m \), in millimeters (mm).

Return type:

xarray.DataArray or float

Notes

The standard deviation of the mass-weighted drop diameter is calculated using the formula:

\[ \sigma_m = \sqrt{\frac{\sum [N(D) \cdot (D - D_m)^2 \cdot D^3 \cdot \Delta D]}{\sum [N(D) \cdot D^3 \cdot \Delta D]}} \]

where:

  • \( N(D) \) is the drop number concentration for diameter \( D \) [m⁻³·mm⁻¹].

  • \( D \) is the drop diameter [mm].

  • \( D_m \) is the mean volume diameter [mm].

  • \( \Delta D \) is the diameter bin width [mm].

  • The numerator computes the weighted variance of diameters.

  • The weighting factor \( D^3 \) accounts for mass (since mass ∝ \( D^3 \)).

Physical Interpretation:

  • A smaller \( \sigma_m \) indicates that the mass is concentrated around the mean mass-weighted diameter, implying less variability in drop sizes.

  • A larger \( \sigma_m \) suggests a wider spread of drop sizes contributing to the mass, indicating greater variability.
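
The function signature takes the moments moment_3, moment_4 and moment_5 rather than the binned DSD. With \( M_k = \sum N(D) \cdot D^k \cdot \Delta D \) and \( D_m = M_4 / M_3 \), expanding the square in the formula above gives \( \sigma_m^2 = M_5 / M_3 - (M_4 / M_3)^2 \). The sketch below checks this identity numerically. It assumes all moments are computed with the diameter in millimeters, and the helper names are illustrative rather than part of the library.

```python
import math


def moment(n, d, dd, order):
    """Discrete DSD moment: sum of N(D) * D**order * dD over the bins."""
    return sum(ni * di**order * ddi for ni, di, ddi in zip(n, d, dd))


def sigma_m_from_bins(n, d, dd):
    """Mass-weighted standard deviation computed directly from the bins."""
    weights = [ni * di**3 * ddi for ni, di, ddi in zip(n, d, dd)]  # mass weights
    total = sum(weights)
    d_m = sum(w * di for w, di in zip(weights, d)) / total
    variance = sum(w * (di - d_m) ** 2 for w, di in zip(weights, d)) / total
    return math.sqrt(variance)


def sigma_m_from_moments(m3, m4, m5):
    """Same quantity from the third, fourth and fifth moments."""
    # max() guards against a tiny negative value caused by rounding
    return math.sqrt(max(m5 / m3 - (m4 / m3) ** 2, 0.0))


n = [100.0, 50.0, 10.0]   # N(D) [m-3 mm-1]
d = [0.5, 1.0, 2.0]       # D [mm]
dd = [0.25, 0.5, 1.0]     # bin width [mm]

direct = sigma_m_from_bins(n, d, dd)
via_moments = sigma_m_from_moments(*(moment(n, d, dd, k) for k in (3, 4, 5)))
```

Both routes give the same value up to floating-point rounding, and a DSD confined to a single bin yields 0 in either formulation.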

References

  • Smith, P. L., Johnson, R. W., & Kliche, D. V. (2019). On Use of the Standard Deviation of the Mass Distribution as a Parameter in Raindrop Size Distribution Functions. Journal of Applied Meteorology and Climatology, 58(4), 787-796. https://doi.org/10.1175/JAMC-D-18-0086.1

  • Williams, C. R., and Coauthors, 2014: Describing the Shape of Raindrop Size Distributions Using Uncorrelated Raindrop Mass Spectrum Parameters. J. Appl. Meteor. Climatol., 53, 1282-1296, https://doi.org/10.1175/JAMC-D-13-076.1.

disdrodb.l2.empirical_dsd.get_total_number_concentration(drop_number_concentration, diameter_bin_width)[source][source]#

Compute the total number concentration \( N_t \) from the drop size distribution.

Calculates the total number concentration \( N_t \) [m⁻³] by integrating the drop number concentration over all diameter bins.

Parameters:
  • drop_number_concentration (xarray.DataArray) – Array of drop number concentrations \( N(D) \) in m⁻³·mm⁻¹.

  • diameter_bin_width (xarray.DataArray) – Width of each diameter bin \( \Delta D \) in millimeters (mm).

Returns:

total_number_concentration – Total number concentration \( N_t \) in m⁻³, representing the total number of drops per unit volume.

Return type:

xarray.DataArray or ndarray

Notes

The total number concentration \( N_t \) is calculated by integrating over the diameter bins:

\[ N_t = \sum_{\text{bins}} N(D) \cdot \Delta D \]

where:

  • \( N(D) \): Drop number concentration in each diameter bin [m⁻³·mm⁻¹].

  • \( \Delta D \): Diameter bin width in millimeters (mm).
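
A short worked example of the sum above, using plain Python lists as a stand-in for xarray.DataArray inputs (the function name is hypothetical):

```python
def total_number_concentration(drop_number_concentration, diameter_bin_width):
    """Return N_t [m-3] as the sum of N(D) [m-3 mm-1] times the bin width [mm]."""
    return sum(n * w for n, w in zip(drop_number_concentration, diameter_bin_width))


# Three bins: 100 * 0.25 + 50 * 0.5 + 10 * 1.0
n_t = total_number_concentration([100.0, 50.0, 10.0], [0.25, 0.5, 1.0])
```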

disdrodb.l2.event module#

Functions for event definition.

disdrodb.l2.event.get_files_partitions(list_partitions, filepaths, sample_interval, accumulation_interval, rolling)[source][source]#

Provide information about the required files for each event.

For each event in list_partitions, this function identifies the file paths from filepaths that overlap with the event period, adjusted by the accumulation_interval. The event period is extended backward or forward based on the rolling parameter.

Parameters:
  • list_partitions (list of dict) – List of events, where each event is a dictionary containing at least ‘start_time’ and ‘end_time’ keys with numpy.datetime64 values.

  • filepaths (list of str) – List of file paths corresponding to data files.

  • sample_interval (numpy.timedelta64 or int) – The sample interval of the input dataset.

  • accumulation_interval (numpy.timedelta64 or int) – Time interval to adjust the event period for accumulation. If an integer is provided, it is assumed to be in seconds.

  • rolling (bool) – If True, adjust the event period backward by accumulation_interval (rolling backward). If False, adjust forward (aggregate forward).

Returns:

A list where each element is a dictionary containing:

  • ‘start_time’: Adjusted start time of the event (numpy.datetime64).

  • ‘end_time’: Adjusted end time of the event (numpy.datetime64).

  • ‘filepaths’: List of file paths overlapping with the adjusted event period.

Return type:

list of dict
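
The overlap logic can be sketched as below. This is one reading of the description, assuming that rolling=True moves the start of the period back by accumulation_interval and rolling=False moves the end forward. The helper name, the (path, start, end) tuples and the use of datetime objects instead of numpy.datetime64 are illustrative assumptions; the actual function derives the file time coverage from the file paths.

```python
from datetime import datetime, timedelta


def select_overlapping_files(partition, files, accumulation_interval, rolling):
    """Return the adjusted period and the files overlapping it.

    files: list of (path, file_start, file_end) tuples (hypothetical layout).
    """
    start, end = partition["start_time"], partition["end_time"]
    if rolling:
        start = start - accumulation_interval  # look backward
    else:
        end = end + accumulation_interval      # aggregate forward
    selected = [
        path for path, f_start, f_end in files
        if f_start <= end and f_end >= start   # closed-interval overlap test
    ]
    return {"start_time": start, "end_time": end, "filepaths": selected}


day = datetime(2024, 1, 1)
files = [
    ("a.nc", day, day + timedelta(minutes=55)),
    ("b.nc", day + timedelta(hours=1), day + timedelta(hours=1, minutes=55)),
    ("c.nc", day + timedelta(hours=2, minutes=5), day + timedelta(hours=3)),
]
partition = {"start_time": day + timedelta(hours=1), "end_time": day + timedelta(hours=2)}

backward = select_overlapping_files(partition, files, timedelta(minutes=10), rolling=True)
forward = select_overlapping_files(partition, files, timedelta(minutes=10), rolling=False)
```

With a 10-minute accumulation interval, the backward variant also picks up the file that ends shortly before the event, while the forward variant picks up the file that starts shortly after it.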

disdrodb.l2.event.group_timesteps_into_event(timesteps, event_max_time_gap, event_min_size=0, event_min_duration='0S', neighbor_min_size=0, neighbor_time_interval='0S')[source][source]#

Group candidate timesteps into events based on temporal criteria.

This function groups valid candidate timesteps into events by considering how they cluster in time. Any isolated timesteps (based on neighborhood criteria) are first removed. Then, consecutive timesteps are grouped into the same event if the time gap between them does not exceed event_max_time_gap. Finally, events that do not meet minimum size or duration requirements are filtered out.

Please note that neighbor_min_size and neighbor_time_interval are very sensitive to the actual sample interval of the data.

Parameters:
  • timesteps (np.ndarray) – Candidate timesteps to be grouped into events.

  • neighbor_time_interval (str) – The time interval around a given timestep defining the neighborhood. Only timesteps that fall within this time interval before or after a timestep are considered neighbors.

  • neighbor_min_size (int, optional) –

    The minimum number of neighboring timesteps required within neighbor_time_interval for a timestep to be considered non-isolated. Isolated timesteps are removed. Defaults to 0.

    • If neighbor_min_size=0, no timestep is considered isolated and no filtering occurs.

    • If neighbor_min_size=1, the timestep must have at least one neighbor within neighbor_time_interval.

    • If neighbor_min_size=2, the timestep must have at least two neighbors within neighbor_time_interval.

  • event_max_time_gap (str) – The maximum time interval between two timesteps to be considered part of the same event. This parameter is used to group timesteps into events.

  • event_min_duration (str) – The minimum duration an event must span. Events shorter than this duration are discarded.

  • event_min_size (int, optional) – The minimum number of valid timesteps required for an event. Defaults to 0.

Returns:

A list of events, where each event is represented as a dictionary with keys:

  • “start_time”: np.datetime64, start time of the event

  • “end_time”: np.datetime64, end time of the event

  • “duration”: np.timedelta64, duration of the event

  • “n_timesteps”: int, number of valid timesteps in the event

Return type:

list of dict
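
The final filtering step and the shape of the returned dictionaries can be sketched as follows. It is assumed here that the duration is the difference between the last and first timestep of an event, and that events are kept when they reach both minimums. The helper name and the use of datetime and timedelta in place of NumPy time types and duration strings are illustrative only.

```python
from datetime import datetime, timedelta


def summarize_events(events, event_min_size, event_min_duration):
    """Drop events that are too small or too short and describe the others."""
    summaries = []
    for timesteps in events:  # each event is a sorted list of timesteps
        duration = timesteps[-1] - timesteps[0]
        if len(timesteps) < event_min_size or duration < event_min_duration:
            continue
        summaries.append({
            "start_time": timesteps[0],
            "end_time": timesteps[-1],
            "duration": duration,
            "n_timesteps": len(timesteps),
        })
    return summaries


t0 = datetime(2024, 1, 1)
events = [
    [t0, t0 + timedelta(minutes=1), t0 + timedelta(minutes=2)],
    [t0 + timedelta(hours=3)],  # single timestep, zero duration
]
summaries = summarize_events(events, event_min_size=2, event_min_duration=timedelta(minutes=1))
```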

disdrodb.l2.event.group_timesteps_into_events(timesteps, event_max_time_gap)[source][source]#

Group valid timesteps into events based on a maximum allowed dry interval.

Parameters:
  • timesteps (array-like of np.datetime64) – Sorted array of valid timesteps.

  • event_max_time_gap (np.timedelta64) – Maximum time interval allowed between consecutive valid timesteps for them to be considered part of the same event.

Returns:

A list of events, where each event is an array of timesteps.

Return type:

list of np.ndarray
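
A minimal sketch of the grouping rule, assuming a new event starts whenever the gap to the previous timestep exceeds event_max_time_gap. It uses datetime and timedelta from the standard library in place of np.datetime64 and np.timedelta64, and the function name is hypothetical.

```python
from datetime import datetime, timedelta


def split_into_events(timesteps, event_max_time_gap):
    """Group timesteps into events separated by gaps larger than the maximum gap."""
    events = []
    for t in sorted(timesteps):
        if events and t - events[-1][-1] <= event_max_time_gap:
            events[-1].append(t)   # gap small enough: same event
        else:
            events.append([t])     # first timestep or gap too large: new event
    return events


t0 = datetime(2024, 1, 1)
timesteps = [
    t0,
    t0 + timedelta(minutes=1),
    t0 + timedelta(minutes=2),
    t0 + timedelta(hours=3),
    t0 + timedelta(hours=3, minutes=1),
]
events = split_into_events(timesteps, event_max_time_gap=timedelta(hours=1))
```

Here the almost three-hour dry interval splits the five timesteps into two events of three and two timesteps.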

disdrodb.l2.event.remove_isolated_timesteps(timesteps, neighbor_min_size, neighbor_time_interval)[source][source]#

Remove isolated timesteps that do not have enough neighboring timesteps within a specified time gap.

A timestep is considered isolated (and thus removed) if it does not have at least neighbor_min_size other timesteps within the neighbor_time_interval before or after it. In other words, for each timestep t, we count how many other timesteps fall into the time interval [t - neighbor_time_interval, t + neighbor_time_interval], excluding the timestep itself. If the count of such neighbors is less than neighbor_min_size, that timestep is removed.

Parameters:
  • timesteps (array-like of np.datetime64) – Sorted or unsorted array of valid timesteps.

  • neighbor_time_interval (np.timedelta64) – The time interval around a given timestep defining the neighborhood. Only timesteps that fall within this time interval before or after a timestep are considered neighbors.

  • neighbor_min_size (int) –

    The minimum number of neighboring timesteps required within neighbor_time_interval for a timestep to be considered non-isolated.

    • If neighbor_min_size=0, no timestep is considered isolated and no filtering occurs.

    • If neighbor_min_size=1, the timestep must have at least one neighbor within neighbor_time_interval.

    • If neighbor_min_size=2, the timestep must have at least two neighbors within neighbor_time_interval.

Returns:

Array of timesteps with isolated entries removed.

Return type:

np.ndarray
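
The neighbor count described above can be sketched with a sorted list and binary search. This is an illustrative stand-in (hypothetical function name, datetime objects instead of np.datetime64), not the library code.

```python
from bisect import bisect_left, bisect_right
from datetime import datetime, timedelta


def drop_isolated_timesteps(timesteps, neighbor_min_size, neighbor_time_interval):
    """Keep timesteps with at least neighbor_min_size neighbors in the window."""
    ordered = sorted(timesteps)
    kept = []
    for t in ordered:
        first = bisect_left(ordered, t - neighbor_time_interval)
        last = bisect_right(ordered, t + neighbor_time_interval)
        n_neighbors = last - first - 1  # exclude the timestep itself
        if n_neighbors >= neighbor_min_size:
            kept.append(t)
    return kept


t0 = datetime(2024, 1, 1)
timesteps = [
    t0,
    t0 + timedelta(minutes=1),
    t0 + timedelta(minutes=2),
    t0 + timedelta(hours=1),  # no other timestep within 5 minutes
]
filtered = drop_isolated_timesteps(timesteps, 1, timedelta(minutes=5))
```

With neighbor_min_size=0 the condition is always satisfied, so the input is returned unchanged apart from sorting.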

disdrodb.l2.processing module#

Implement DISDRODB L2 processing.

disdrodb.l2.processing.check_l2e_input_dataset(ds)[source][source]#

Check dataset validity for L2E production.

disdrodb.l2.processing.check_l2m_input_dataset(ds)[source][source]#

Check dataset validity for L2M production.

disdrodb.l2.processing.define_diameter_array(diameter_min=0, diameter_max=10, diameter_spacing=0.05)[source][source]#

Define an array of diameters and their corresponding bin properties.

Parameters:
  • diameter_min (float, optional) – The minimum diameter value. The default value is 0 mm.

  • diameter_max (float, optional) – The maximum diameter value. The default value is 10 mm.

  • diameter_spacing (float, optional) – The spacing between diameter values. The default value is 0.05 mm.

Returns:

A DataArray containing the center of each diameter bin, with coordinates for the bin width, lower bound, upper bound, and center.

Return type:

xr.DataArray
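
As an illustration of the bin properties listed above, the sketch below builds regularly spaced bins between diameter_min and diameter_max. It assumes the bins tile the range without gaps and returns a plain dictionary of lists instead of an xr.DataArray; the function name is hypothetical.

```python
def diameter_bins(diameter_min=0.0, diameter_max=10.0, diameter_spacing=0.05):
    """Return center, width, lower and upper bound of regularly spaced bins [mm]."""
    # Derive the bin count first to avoid accumulating floating-point drift
    n_bins = round((diameter_max - diameter_min) / diameter_spacing)
    lower = [diameter_min + i * diameter_spacing for i in range(n_bins)]
    upper = [diameter_min + (i + 1) * diameter_spacing for i in range(n_bins)]
    center = [(lo + up) / 2 for lo, up in zip(lower, upper)]
    width = [up - lo for lo, up in zip(lower, upper)]
    return {"center": center, "width": width, "lower": lower, "upper": upper}


bins = diameter_bins()
```

With the default arguments this yields 200 bins of 0.05 mm, the first one centered at 0.025 mm.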

disdrodb.l2.processing.define_velocity_array(ds)[source][source]#

Create the fall velocity DataArray using various methods.

If ‘velocity_bin_center’ is a dimension in the dataset, returns a Dataset with ‘measured_velocity’, ‘average_velocity’, and ‘fall_velocity’ as variables. Otherwise, returns the ‘fall_velocity’ DataArray from the input dataset.

Parameters:

ds (xarray.Dataset) – The input dataset containing velocity variables.

Returns:

velocity

Return type:

xarray.DataArray

disdrodb.l2.processing.generate_l2_radar(ds, frequency=None, num_points=1024, diameter_max=10, canting_angle_std=7, axis_ratio_model='Thurai2007', permittivity_model='Turner2016', water_temperature=10, elevation_angle=0, parallel=True)[source][source]#

Simulate polarimetric radar variables from empirical drop number concentration or the estimated PSD.

Parameters:
  • ds (xarray.Dataset) – Dataset containing the drop number concentration variable or the PSD parameters.

  • frequency (str, float, or list of str and float, optional) – Frequencies in GHz for which to compute the radar parameters. Alternatively, strings can be used to specify common radar frequencies. If None, the common radar frequencies will be used. See disdrodb.scattering.available_radar_bands().

  • num_points (int or list of int, optional) – Number of bins into which the PSD is discretized.

  • diameter_max (float or list of float, optional) – Maximum diameter. The default value is 10 mm.

  • canting_angle_std (float or list of float, optional) – Standard deviation of the canting angle. The default value is 7.

  • axis_ratio_model (str or list of str, optional) – Models to compute the axis ratio. The default model is Thurai2007. See available models with disdrodb.scattering.available_axis_ratio_models().

  • permittivity_model (str or list of str, optional) – Permittivity model used to compute the refractive index and the rayleigh_dielectric_factor. The default is Turner2016. See available models with disdrodb.scattering.available_permittivity_models().

  • water_temperature (float or list of float, optional) – Water temperature in degree Celsius to be used in the permittivity model. The default is 10 degC.

  • elevation_angle (float or list of float, optional) – Radar elevation angles in degrees. Specify 90 degrees for vertically pointing radars. The default is 0 degrees.

  • parallel (bool, optional) – Whether to compute radar variables in parallel. The default value is True.

Returns:

Dataset containing the computed radar parameters.

Return type:

xarray.Dataset

disdrodb.l2.processing.generate_l2e(ds, ds_env=None, compute_spectra=False, compute_percentage_contribution=False, minimum_ndrops=1, minimum_nbins=1, minimum_rain_rate=0.01)[source][source]#

Generate the DISDRODB L2E dataset from the DISDRODB L1 dataset.

Parameters:
  • ds (xarray.Dataset) –

    DISDRODB L1 dataset. Alternatively, an xarray dataset with at least:

    • variables: drop_number, fall_velocity

    • dimension: DIAMETER_DIMENSION

    • coordinates: diameter_bin_center, diameter_bin_width, sample_interval

    • attributes: sensor_name

  • ds_env (xarray.Dataset, optional) – Environmental dataset used for fall velocity and water density estimates. If None, a default environment dataset will be loaded.

Returns:

DISDRODB L2E dataset.

Return type:

xarray.Dataset

disdrodb.l2.processing.generate_l2m(ds, psd_model, optimization=None, optimization_kwargs=None, diameter_min=0, diameter_max=10, diameter_spacing=0.05, ds_env=None, fall_velocity_method='Beard1976', minimum_ndrops=1, minimum_nbins=3, minimum_rain_rate=0.01, gof_metrics=True)[source][source]#

Generate the DISDRODB L2M dataset from a DISDRODB L2E dataset.

This function estimates the PSD model parameters and subsequently computes the DSD integral parameters. Optionally, radar variables at various bands are simulated using T-matrix simulations. Goodness-of-fit metrics of the PSD can also optionally be included in the output dataset.

Parameters:
  • ds (xarray.Dataset) – DISDRODB L2E dataset.

  • psd_model (str) – The PSD model to fit. See disdrodb.psd.available_psd_models().

  • ds_env (xarray.Dataset, optional) – Environmental dataset used for fall velocity and water density estimates. If None, a default environment dataset will be loaded.

  • diameter_min (float, optional) – Minimum PSD diameter. The default value is 0 mm.

  • diameter_max (float, optional) – Maximum PSD diameter. The default value is 10 mm.

  • diameter_spacing (float, optional) – PSD diameter spacing. The default value is 0.05 mm.

  • optimization (str, optional) – The fitting optimization procedure. Either “GS” (Grid Search), “ML” (Maximum Likelihood) or “MOM” (Method of Moments).

  • optimization_kwargs (dict, optional) – Dictionary with arguments to customize the fitting procedure.

  • minimum_nbins (int) – Minimum number of bins with drops required to fit the PSD model. The default value is 3.

  • gof_metrics (bool, optional) – Whether to add goodness-of-fit metrics to the output dataset. The default is True.

Returns:

DISDRODB L2M dataset.

Return type:

xarray.Dataset

disdrodb.l2.processing.select_timesteps_with_drops(ds, minimum_ndrops=0)[source][source]#

Select timesteps with at least the specified number of drops.

disdrodb.l2.processing.select_timesteps_with_minimum_nbins(ds, minimum_nbins)[source][source]#

Select timesteps with at least the specified number of diameter bins with drops.

disdrodb.l2.processing.select_timesteps_with_minimum_rain_rate(ds, minimum_rain_rate)[source][source]#

Select timesteps with at least the specified rain rate.

disdrodb.l2.routines module#

Implements routines for DISDRODB L2 processing.

class disdrodb.l2.routines.ProcessingOptions(product, filepaths, parallel, temporal_resolutions=None)[source][source]#

Bases: object

Define L2 products processing options.

get_files_partitions(temporal_resolution)[source][source]#

Return files partitions dictionary for a specific L2E product.

get_folder_partitioning(temporal_resolution)[source][source]#

Return the folder partitioning for a specific L2E product.

get_product_options(temporal_resolution)[source][source]#

Return product options dictionary for a specific L2E product.

disdrodb.l2.routines.define_temporal_partitions(filepaths, strategy, parallel, strategy_options)[source][source]#

Define temporal file processing partitions.

Parameters:
  • filepaths (list) – List of file paths to be processed.

  • strategy (str) –

    Which partitioning strategy to apply:

    • 'time_block' defines fixed time intervals (e.g. monthly) covering input files.

    • 'event' detects clusters of precipitation (“events”).

  • parallel (bool) – If True, parallel data loading is used to identify events.

  • strategy_options (dict) –

    Dictionary with strategy-specific parameters:

    If strategy == 'time_block', supported options are:

    • freq: Time unit for blocks. One of {‘year’, ‘season’, ‘month’, ‘day’}.

    See identify_time_partitions for more information.

    If strategy == 'event', supported options are:

    • min_drops (int): Minimum number of drops to consider a timestep.

    • neighbor_min_size (int): Minimum cluster size for merging neighboring events.

    • neighbor_time_interval (str): Time window (e.g. “5MIN”) to merge adjacent clusters.

    • event_max_time_gap (str): Maximum allowed gap (e.g. “6H”) within a single event.

    • event_min_duration (str): Minimum total duration (e.g. “5MIN”) of an event.

    • event_min_size (int): Minimum number of records in an event.

    See identify_events for more information.

Returns:

A list of dictionaries, each containing:

  • start_time (numpy.datetime64[s])

    Inclusive start of an event or time block.

  • end_time (numpy.datetime64[s])

    Inclusive end of an event or time block.

Return type:

list

Notes

  • The 'event' strategy requires loading data into memory to identify clusters.

  • The 'time_block' strategy can operate on metadata alone, without full data loading.

  • The 'event' strategy implicitly selects which files are processed.

  • The 'time_block' strategy does not perform any selection of the files to process.

disdrodb.l2.routines.identify_events(filepaths, parallel=False, min_drops=5, neighbor_min_size=2, neighbor_time_interval='5MIN', event_max_time_gap='6H', event_min_duration='5MIN', event_min_size=3)[source][source]#

Return a list of rainy events.

A timestep is considered rainy when N > min_drops. Isolated rainy timesteps (based on neighborhood criteria) are first removed. Then, consecutive rainy timesteps are grouped into the same event if the time gap between them does not exceed event_max_time_gap. Finally, events that do not meet minimum size or duration requirements are filtered out.

Parameters:
  • filepaths (list) – List of L1C file paths.

  • parallel (bool) – Whether to load the files in parallel. Set parallel=True only in a multiprocessing environment. The default is False.

  • neighbor_time_interval (str) – The time interval around a given timestep defining the neighborhood. Only timesteps that fall within this time interval before or after a timestep are considered neighbors.

  • neighbor_min_size (int, optional) –

    The minimum number of neighboring timesteps required within neighbor_time_interval for a timestep to be considered non-isolated. Isolated timesteps are removed. Defaults to 2.

    • If neighbor_min_size=0, no timestep is considered isolated and no filtering occurs.

    • If neighbor_min_size=1, the timestep must have at least one neighbor within neighbor_time_interval.

    • If neighbor_min_size=2, the timestep must have at least two neighbors within neighbor_time_interval.

  • event_max_time_gap (str) – The maximum time interval between two timesteps to be considered part of the same event. This parameter is used to group timesteps into events.

  • event_min_duration (str) – The minimum duration an event must span. Events shorter than this duration are discarded.

  • event_min_size (int, optional) – The minimum number of valid timesteps required for an event. Defaults to 3.

Returns:

A list of events, where each event is represented as a dictionary with keys:

  • “start_time”: np.datetime64, start time of the event

  • “end_time”: np.datetime64, end time of the event

  • “duration”: np.timedelta64, duration of the event

  • “n_timesteps”: int, number of valid timesteps in the event

Return type:

list of dict

disdrodb.l2.routines.identify_time_partitions(filepaths: list[str], freq: str) → list[dict][source][source]#

Identify the set of time blocks covered by files.

The result is a minimal, sorted, and unique set of time partitions.

Parameters:
  • filepaths (list of str) – Paths to input files from which start and end times will be extracted via get_start_end_time_from_filepaths.

  • freq ({'none', 'hour', 'day', 'month', 'quarter', 'season', 'year'}) – Frequency determining the granularity of candidate blocks. See generate_time_blocks for more details.

Returns:

A list of dictionaries, each containing:

  • start_time (numpy.datetime64[s])

    Inclusive start of a time block.

  • end_time (numpy.datetime64[s])

    Inclusive end of a time block.

Only those blocks that overlap at least one file’s interval are returned. The list is sorted by start_time and contains no duplicate blocks.

Return type:

list of dict
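
To illustrate what a minimal, sorted and unique set of blocks means, the sketch below handles the 'day' frequency only. It assumes the file intervals are already known as (start, end) pairs and that each block ends one second before the next one starts. The function name and the use of datetime objects instead of numpy.datetime64[s] are illustrative assumptions.

```python
from datetime import datetime, timedelta


def daily_partitions(file_intervals):
    """Return the sorted, unique daily blocks overlapping at least one file interval."""
    days = set()
    for start, end in file_intervals:
        day = start.replace(hour=0, minute=0, second=0, microsecond=0)
        while day <= end:
            days.add(day)  # the set removes blocks shared by several files
            day += timedelta(days=1)
    return [
        {"start_time": d, "end_time": d + timedelta(days=1) - timedelta(seconds=1)}
        for d in sorted(days)
    ]


partitions = daily_partitions([
    (datetime(2024, 1, 1, 22), datetime(2024, 1, 2, 1)),  # spans midnight
    (datetime(2024, 1, 2, 5), datetime(2024, 1, 2, 6)),
])
```

Both files touch 2 January, yet that block appears only once in the result.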

disdrodb.l2.routines.is_possible_product(accumulation_interval, sample_interval, rolling)[source][source]#

Assess if production is possible given the requested accumulation interval and source sample_interval.

disdrodb.l2.routines.precompute_scattering_tables(frequency, num_points, diameter_max, canting_angle_std, axis_ratio_model, permittivity_model, water_temperature, elevation_angle, verbose=True)[source][source]#

Precompute the pyTMatrix scattering tables required for radar variables simulations.

disdrodb.l2.routines.run_l2e_station(data_source, campaign_name, station_name, force: bool = False, verbose: bool = True, parallel: bool = True, debugging_mode: bool = False, data_archive_dir: str | None = None, metadata_archive_dir: str | None = None)[source][source]#

Generate the L2E product of a specific DISDRODB station when invoked from the terminal.

This function is intended to be called through the disdrodb_run_l2e_station command-line interface.

This routine generates L2E files. Files are defined based on the DISDRODB archive settings, which allow L2E files to be produced either per custom block of time (e.g. day, month, or year) or per (rainy) event.

For stations with varying measurement intervals, DISDRODB defines a separate list of partitions for each measurement interval option. In other words, DISDRODB does not mix files with data acquired at different sample intervals when resampling the data.

L0C product generation ensures the creation of files with unique sample intervals.

Parameters:
  • data_source (str) – The name of the institution (for campaigns spanning multiple countries) or the name of the country (for campaigns or sensor networks within a single country). Must be provided in UPPER CASE.

  • campaign_name (str) – The name of the campaign. Must be provided in UPPER CASE.

  • station_name (str) – The name of the station.

  • force (bool, optional) – If True, existing data in the destination directories will be overwritten. If False (default), an error will be raised if data already exists in the destination directories.

  • verbose (bool, optional) – If True (default), detailed processing information will be printed to the terminal. If False, less information will be displayed.

  • parallel (bool, optional) – If True (default), files will be processed in multiple processes simultaneously, with each process using a single thread to avoid issues with the HDF/netCDF library. If False, files will be processed sequentially in a single process, and multi-threading will be automatically exploited to speed up I/O tasks.

  • debugging_mode (bool, optional) – If True, the amount of data processed will be reduced. Only the first 3 files will be processed. The default value is False.

  • data_archive_dir (str, optional) – The base directory of DISDRODB, expected in the format <...>/DISDRODB. If not specified, the path specified in the DISDRODB active configuration will be used.

disdrodb.l2.routines.run_l2m_station(data_source, campaign_name, station_name, force: bool = False, verbose: bool = True, parallel: bool = True, debugging_mode: bool = False, data_archive_dir: str | None = None, metadata_archive_dir: str | None = None)[source][source]#

Run the L2M processing of a specific DISDRODB station when invoked from the terminal.

This function is intended to be called through the disdrodb_run_l2m_station command-line interface.

Parameters:
  • data_source (str) – The name of the institution (for campaigns spanning multiple countries) or the name of the country (for campaigns or sensor networks within a single country). Must be provided in UPPER CASE.

  • campaign_name (str) – The name of the campaign. Must be provided in UPPER CASE.

  • station_name (str) – The name of the station.

  • force (bool, optional) – If True, existing data in the destination directories will be overwritten. If False (default), an error will be raised if data already exists in the destination directories.

  • verbose (bool, optional) – If True (default), detailed processing information will be printed to the terminal. If False, less information will be displayed.

  • parallel (bool, optional) – If True (default), files will be processed in multiple processes simultaneously, with each process using a single thread to avoid issues with the HDF/netCDF library. If False, files will be processed sequentially in a single process, and multi-threading will be automatically exploited to speed up I/O tasks.

  • debugging_mode (bool, optional) – If True, the amount of data processed will be reduced. Only the first 3 files will be processed. The default value is False.

  • data_archive_dir (str, optional) – The base directory of DISDRODB, expected in the format <...>/DISDRODB. If not specified, the path specified in the DISDRODB active configuration will be used.

Module contents#

Module for DISDRODB L2 production.

disdrodb.l2.generate_l2_radar(ds, frequency=None, num_points=1024, diameter_max=10, canting_angle_std=7, axis_ratio_model='Thurai2007', permittivity_model='Turner2016', water_temperature=10, elevation_angle=0, parallel=True)[source][source]#

Simulate polarimetric radar variables from empirical drop number concentration or the estimated PSD.

Parameters:
  • ds (xarray.Dataset) – Dataset containing the drop number concentration variable or the PSD parameters.

  • frequency (str, float, or list of str and float, optional) – Frequencies in GHz for which to compute the radar parameters. Alternatively, strings can be used to specify common radar frequencies. If None, the common radar frequencies will be used. See disdrodb.scattering.available_radar_bands().

  • num_points (int or list of int, optional) – Number of bins into which the PSD is discretized.

  • diameter_max (float or list of float, optional) – Maximum diameter. The default value is 10 mm.

  • canting_angle_std (float or list of float, optional) – Standard deviation of the canting angle. The default value is 7.

  • axis_ratio_model (str or list of str, optional) – Models to compute the axis ratio. The default model is Thurai2007. See available models with disdrodb.scattering.available_axis_ratio_models().

  • permittivity_model (str or list of str, optional) – Permittivity model used to compute the refractive index and the rayleigh_dielectric_factor. The default is Turner2016. See available models with disdrodb.scattering.available_permittivity_models().

  • water_temperature (float or list of float, optional) – Water temperature in degree Celsius to be used in the permittivity model. The default is 10 degC.

  • elevation_angle (float or list of float, optional) – Radar elevation angles in degrees. Specify 90 degrees for vertically pointing radars. The default is 0 degrees.

  • parallel (bool, optional) – Whether to compute radar variables in parallel. The default value is True.

Returns:

Dataset containing the computed radar parameters.

Return type:

xarray.Dataset

disdrodb.l2.generate_l2e(ds, ds_env=None, compute_spectra=False, compute_percentage_contribution=False, minimum_ndrops=1, minimum_nbins=1, minimum_rain_rate=0.01)[source][source]#

Generate the DISDRODB L2E dataset from the DISDRODB L1 dataset.

Parameters:
  • ds (xarray.Dataset) –

    DISDRODB L1 dataset. Alternatively, an xarray dataset with at least:

    • variables: drop_number, fall_velocity

    • dimension: DIAMETER_DIMENSION

    • coordinates: diameter_bin_center, diameter_bin_width, sample_interval

    • attributes: sensor_name

  • ds_env (xarray.Dataset, optional) – Environmental dataset used for fall velocity and water density estimates. If None, a default environment dataset will be loaded.

Returns:

DISDRODB L2E dataset.

Return type:

xarray.Dataset

disdrodb.l2.generate_l2m(ds, psd_model, optimization=None, optimization_kwargs=None, diameter_min=0, diameter_max=10, diameter_spacing=0.05, ds_env=None, fall_velocity_method='Beard1976', minimum_ndrops=1, minimum_nbins=3, minimum_rain_rate=0.01, gof_metrics=True)[source][source]#

Generate the DISDRODB L2M dataset from a DISDRODB L2E dataset.

This function estimates the PSD model parameters and subsequently computes the DSD integral parameters. Optionally, radar variables at various bands are simulated using T-matrix simulations. Goodness-of-fit metrics of the PSD can also optionally be included in the output dataset.

Parameters:
  • ds (xarray.Dataset) – DISDRODB L2E dataset.

  • psd_model (str) – The PSD model to fit. See disdrodb.psd.available_psd_models().

  • ds_env (xarray.Dataset, optional) – Environmental dataset used for fall velocity and water density estimates. If None, a default environment dataset will be loaded.

  • diameter_min (float, optional) – Minimum PSD diameter. The default value is 0 mm.

  • diameter_max (float, optional) – Maximum PSD diameter. The default value is 10 mm.

  • diameter_spacing (float, optional) – PSD diameter spacing. The default value is 0.05 mm.

  • optimization (str, optional) – The fitting optimization procedure. Either “GS” (Grid Search), “ML” (Maximum Likelihood) or “MOM” (Method of Moments).

  • optimization_kwargs (dict, optional) – Dictionary with arguments to customize the fitting procedure.

  • minimum_nbins (int) – Minimum number of bins with drops required to fit the PSD model. The default value is 3.

  • gof_metrics (bool, optional) – Whether to add goodness-of-fit metrics to the output dataset. The default is True.

Returns:

DISDRODB L2M dataset.

Return type:

xarray.Dataset