disdrodb.retrievals package
Submodules
disdrodb.retrievals.lut module
Routines for 1D and 2D Look Up Tables (LUT).
- class disdrodb.retrievals.lut.NearestNeighbourLUT1D(df, x, columns=None, dtype=<class 'numpy.float32'>)
Bases: object
A 1D nearest neighbor lookup table using a k-d tree.
This class builds a k-d tree from 1D points and their associated values, enabling fast nearest neighbor queries for interpolation or lookup purposes.
- Parameters:
df (pandas.DataFrame) – Input DataFrame containing the coordinate and values.
x (str) – Name of the column representing the x-coordinate.
columns (list of str, optional) – List of column names to include in the lookup table values. If None, all columns are included. Default is None.
dtype (numpy.dtype, optional) – Data type for storing points and values. Default is np.float32.
Notes
Rows with NaNs in the selected value columns are discarded before constructing the lookup table.
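The row filtering described in the note can be sketched in plain Python. This is a dependency-free illustration, not the library's implementation: the helper `drop_nan_rows` and the sample rows are hypothetical, and the sketch assumes a row is discarded as soon as any selected value column is NaN. With pandas, `DataFrame.dropna(subset=columns)` expresses the same rule.

```python
import math

# Hypothetical sample rows: one coordinate column "x" and two value columns.
rows = [
    {"x": 0.0, "a": 10.0, "b": 1.0},
    {"x": 1.0, "a": math.nan, "b": 2.0},
    {"x": 2.0, "a": 30.0, "b": math.nan},
]


def drop_nan_rows(rows, columns):
    """Keep only the rows whose selected value columns are all non-NaN."""
    return [
        row for row in rows
        if not any(math.isnan(row[col]) for col in columns)
    ]


# Only the selected columns matter: a NaN in "b" is ignored when columns=["a"].
kept_a = drop_nan_rows(rows, ["a"])
kept_ab = drop_nan_rows(rows, ["a", "b"])
```

Under this assumption, the choice of `columns` changes how many points end up in the table, so a narrower selection can retain more rows.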
- predict(x, return_distance=False, max_distance=None)
Query the lookup table for nearest neighbor values.
- Parameters:
x (array-like) – x-coordinates of query points. Can be scalar or array.
return_distance (bool, optional) – If True, include the distance to the nearest neighbor in the output. Default is False.
max_distance (float, tuple of float, or None, optional) –
Maximum distance threshold for valid predictions.
If None: no distance masking is applied.
If float: points with distance > max_distance are set to NaN.
If tuple (dx,): points are masked if |x - x_nearest| > dx.
Default is None.
- Returns:
DataFrame containing the nearest neighbor values for each query point. Columns include x coordinates, the lookup values, and optionally the distance to the nearest neighbor. Values beyond max_distance are NaN.
- Return type:
pandas.DataFrame
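The 1D query and the `max_distance` masking can be illustrated without the library. The sketch below is an assumption-laden stand-in: it replaces the k-d tree with a binary search over sorted points (`bisect`), handles a single finite scalar query, and returns a tuple instead of a DataFrame. The function name `nearest_1d` is hypothetical.

```python
import bisect
import math


def nearest_1d(points, values, x, max_distance=None):
    """Return (nearest_x, value, distance) for a finite scalar query x.

    `points` must be sorted in ascending order and `values[i]` belongs to
    `points[i]`. The value is NaN when the nearest point lies farther away
    than `max_distance`.
    """
    # Find the insertion position, then compare the neighbours on each side.
    i = bisect.bisect_left(points, x)
    candidates = [j for j in (i - 1, i) if 0 <= j < len(points)]
    j = min(candidates, key=lambda k: abs(points[k] - x))
    distance = abs(points[j] - x)
    value = values[j]
    # Mask the value when the match is too far from the query.
    if max_distance is not None and distance > max_distance:
        value = math.nan
    return points[j], value, distance


points = [0.0, 1.0, 2.0]
values = [10.0, 20.0, 30.0]

near = nearest_1d(points, values, 1.2)                  # matches x = 1.0
far = nearest_1d(points, values, 5.0, max_distance=1.0)  # matches x = 2.0, value masked
```

In this sketch the nearest coordinate and the distance are still reported for a masked query, and only the looked-up value becomes NaN. Whether the library masks the coordinate columns as well is not stated above.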
- class disdrodb.retrievals.lut.NearestNeighbourLUT2D(df, x, y, columns=None, dtype=<class 'numpy.float32'>)
Bases: object
A 2D nearest neighbor lookup table using a k-d tree.
This class builds a k-d tree from 2D points and their associated values, enabling fast nearest neighbor queries for interpolation or lookup purposes.
- Parameters:
df (pandas.DataFrame) – Input DataFrame containing the coordinates and values.
x (str) – Name of the column representing the x-coordinate.
y (str) – Name of the column representing the y-coordinate.
columns (list of str, optional) – List of column names to include in the lookup table values. If None, all columns are included. Default is None.
dtype (numpy.dtype, optional) – Data type for storing points and values. Default is np.float32.
- dtype
Data type used for storage.
- Type:
numpy.dtype
- points
Array of shape (n, 2) containing the coordinates.
- Type:
numpy.ndarray
- values
Array of shape (n, len(columns)) containing the lookup values.
- Type:
numpy.ndarray
- tree
k-d tree for fast nearest neighbor queries.
- Type:
- Raises:
ValueError – If x or y are not columns in the DataFrame.
ValueError – If any of the specified columns are not found in the DataFrame.
Notes
Rows with NaNs in the selected value columns are discarded before constructing the lookup table.
Examples
>>> import pandas as pd
>>> from disdrodb.retrievals.lut import NearestNeighbourLUT2D
>>> df = pd.DataFrame({'x': [0, 1, 2], 'y': [0, 1, 2], 'value': [10, 20, 30]})
>>> lut = NearestNeighbourLUT2D(df, 'x', 'y', columns=['value'])
>>> lut.predict([0.5], [0.5])
- predict(x, y, return_distance=False, max_distance=None)
Query the lookup table for nearest neighbor values.
- Parameters:
x (array-like) – x-coordinates of query points. Can be scalar or array.
y (array-like) – y-coordinates of query points. Can be scalar or array.
return_distance (bool, optional) – If True, include the distance to the nearest neighbor in the output. Default is False.
max_distance (float, tuple of float, or None, optional) –
Maximum distance threshold for valid predictions.
If None: no distance masking is applied.
If float: points with Euclidean distance > max_distance are set to NaN.
If tuple (dx, dy): points are masked if |x - x_nearest| > dx OR |y - y_nearest| > dy.
Default is None.
- Returns:
DataFrame containing the nearest neighbor values for each query point. Columns include x, y coordinates, the lookup values, and optionally the distance to the nearest neighbor. Values beyond max_distance are NaN.
- Return type:
pandas.DataFrame
- Raises:
ValueError – If x and y do not have the same shape.
Notes
The method automatically converts scalar inputs to 1D arrays. Input values that are NaN or inf will produce NaN output rows.
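The two forms of `max_distance` are not interchangeable, which a small sketch makes concrete. It encodes only the rule as documented above (a float is compared against the Euclidean distance, a tuple against each axis separately); the helper `is_masked` and the sample coordinates are hypothetical and not part of the library.

```python
import math


def is_masked(query, nearest, max_distance=None):
    """Return True if a match would be set to NaN under `max_distance`."""
    if max_distance is None:
        return False
    dx = abs(query[0] - nearest[0])
    dy = abs(query[1] - nearest[1])
    if isinstance(max_distance, tuple):
        # Per-axis thresholds: exceeding either one masks the point.
        max_dx, max_dy = max_distance
        return dx > max_dx or dy > max_dy
    # Scalar threshold: compare against the Euclidean distance.
    return math.hypot(dx, dy) > max_distance


query = (0.0, 0.0)
nearest = (0.6, 0.6)  # Euclidean distance is about 0.85

masked_scalar = is_masked(query, nearest, 0.7)         # True: 0.85 > 0.7
masked_box = is_masked(query, nearest, (0.7, 0.7))     # False: both offsets <= 0.7
masked_narrow = is_masked(query, nearest, (0.5, 1.0))  # True: dx = 0.6 > 0.5
```

A scalar threshold describes a circle around the query, whereas a tuple describes an axis-aligned box, so the tuple form suits coordinates with different units or scales.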
- predict_dict(x, y)
Query the lookup table and return result as a dictionary.
- Parameters:
x (scalar) – x-coordinate of the query point.
y (scalar) – y-coordinate of the query point.
- Returns:
Dictionary with column names as keys and nearest neighbor values as values, including x and y coordinates.
- Return type:
dict
- Raises:
ValueError – If x or y are not scalars.
Notes
This is a convenience method for single-point queries returning a dict instead of a DataFrame.
Module contents
DISDRODB module for DSD retrievals.
- class disdrodb.retrievals.NearestNeighbourLUT1D(df, x, columns=None, dtype=<class 'numpy.float32'>)
Bases: object
A 1D nearest neighbor lookup table using a k-d tree.
This class builds a k-d tree from 1D points and their associated values, enabling fast nearest neighbor queries for interpolation or lookup purposes.
- Parameters:
df (pandas.DataFrame) – Input DataFrame containing the coordinate and values.
x (str) – Name of the column representing the x-coordinate.
columns (list of str, optional) – List of column names to include in the lookup table values. If None, all columns are included. Default is None.
dtype (numpy.dtype, optional) – Data type for storing points and values. Default is np.float32.
Notes
Rows with NaNs in the selected value columns are discarded before constructing the lookup table.
- predict(x, return_distance=False, max_distance=None)
Query the lookup table for nearest neighbor values.
- Parameters:
x (array-like) – x-coordinates of query points. Can be scalar or array.
return_distance (bool, optional) – If True, include the distance to the nearest neighbor in the output. Default is False.
max_distance (float, tuple of float, or None, optional) –
Maximum distance threshold for valid predictions.
If None: no distance masking is applied.
If float: points with distance > max_distance are set to NaN.
If tuple (dx,): points are masked if |x - x_nearest| > dx.
Default is None.
- Returns:
DataFrame containing the nearest neighbor values for each query point. Columns include x coordinates, the lookup values, and optionally the distance to the nearest neighbor. Values beyond max_distance are NaN.
- Return type:
pandas.DataFrame
- class disdrodb.retrievals.NearestNeighbourLUT2D(df, x, y, columns=None, dtype=<class 'numpy.float32'>)
Bases: object
A 2D nearest neighbor lookup table using a k-d tree.
This class builds a k-d tree from 2D points and their associated values, enabling fast nearest neighbor queries for interpolation or lookup purposes.
- Parameters:
df (pandas.DataFrame) – Input DataFrame containing the coordinates and values.
x (str) – Name of the column representing the x-coordinate.
y (str) – Name of the column representing the y-coordinate.
columns (list of str, optional) – List of column names to include in the lookup table values. If None, all columns are included. Default is None.
dtype (numpy.dtype, optional) – Data type for storing points and values. Default is np.float32.
- dtype
Data type used for storage.
- Type:
numpy.dtype
- points
Array of shape (n, 2) containing the coordinates.
- Type:
numpy.ndarray
- values
Array of shape (n, len(columns)) containing the lookup values.
- Type:
numpy.ndarray
- tree
k-d tree for fast nearest neighbor queries.
- Type:
- Raises:
ValueError – If x or y are not columns in the DataFrame.
ValueError – If any of the specified columns are not found in the DataFrame.
Notes
Rows with NaNs in the selected value columns are discarded before constructing the lookup table.
Examples
>>> import pandas as pd
>>> from disdrodb.retrievals import NearestNeighbourLUT2D
>>> df = pd.DataFrame({'x': [0, 1, 2], 'y': [0, 1, 2], 'value': [10, 20, 30]})
>>> lut = NearestNeighbourLUT2D(df, 'x', 'y', columns=['value'])
>>> lut.predict([0.5], [0.5])
- predict(x, y, return_distance=False, max_distance=None)
Query the lookup table for nearest neighbor values.
- Parameters:
x (array-like) – x-coordinates of query points. Can be scalar or array.
y (array-like) – y-coordinates of query points. Can be scalar or array.
return_distance (bool, optional) – If True, include the distance to the nearest neighbor in the output. Default is False.
max_distance (float, tuple of float, or None, optional) –
Maximum distance threshold for valid predictions.
If None: no distance masking is applied.
If float: points with Euclidean distance > max_distance are set to NaN.
If tuple (dx, dy): points are masked if |x - x_nearest| > dx OR |y - y_nearest| > dy.
Default is None.
- Returns:
DataFrame containing the nearest neighbor values for each query point. Columns include x, y coordinates, the lookup values, and optionally the distance to the nearest neighbor. Values beyond max_distance are NaN.
- Return type:
pandas.DataFrame
- Raises:
ValueError – If x and y do not have the same shape.
Notes
The method automatically converts scalar inputs to 1D arrays. Input values that are NaN or inf will produce NaN output rows.
- predict_dict(x, y)
Query the lookup table and return result as a dictionary.
- Parameters:
x (scalar) – x-coordinate of the query point.
y (scalar) – y-coordinate of the query point.
- Returns:
Dictionary with column names as keys and nearest neighbor values as values, including x and y coordinates.
- Return type:
dict
- Raises:
ValueError – If x or y are not scalars.
Notes
This is a convenience method for single-point queries returning a dict instead of a DataFrame.