es_sfgtools.tiledb_tools.tiledb_schemas module

This module defines the TileDB array schemas and provides a set of classes for interacting with those arrays.

The schemas are defined for various types of seafloor geodesy data, including kinematic position data, IMU data, acoustic data, GNSS observation data, and raw shot data.

The TBDArray class and its subclasses provide a high-level interface for creating, writing to, and reading from these TileDB arrays, handling both local and S3 storage.

class es_sfgtools.tiledb_tools.tiledb_schemas.TBDArray(uri: Path | S3Path | str)

Bases: object

A base class for interacting with a TileDB array.

This class provides common functionality for creating, reading, and writing pandas DataFrames to and from a TileDB array. It is intended to be subclassed for specific data types and schemas.

dataframe_schema

A pandera schema for validating DataFrames.

array_schema

A tiledb.ArraySchema for creating the array.

name

A human-readable name for the array type.

Type:

str

uri

The URI of the TileDB array.

Type:

str

array_schema = None
consolidate()

Consolidates and vacuums the TileDB array to improve performance.

dataframe_schema = None
get_unique_dates(field: str) ndarray

Gets the unique dates from a specified datetime field in the array.

Parameters:

field (str) – The name of the datetime field to query.

Returns:

An array of unique dates, or None if an error occurs.

Return type:

np.ndarray
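As a rough illustration of what "unique dates" means here, the sketch below truncates timestamps to calendar days, removes duplicates, and sorts the result. It uses only the standard library and returns a list rather than an np.ndarray; the helper name unique_dates and the sample timestamps are hypothetical and are not part of es_sfgtools.

```python
from datetime import date, datetime
from typing import Iterable, List


def unique_dates(timestamps: Iterable[datetime]) -> List[date]:
    """Hypothetical helper: collapse timestamps to sorted, de-duplicated days."""
    # A set comprehension drops repeated days; sorted() gives a stable order.
    return sorted({ts.date() for ts in timestamps})


# Illustrative timestamps only; two of them fall on the same day.
samples = [
    datetime(2023, 6, 1, 0, 5),
    datetime(2023, 6, 1, 23, 59),
    datetime(2023, 6, 3, 12, 0),
]
print(unique_dates(samples))
# [datetime.date(2023, 6, 1), datetime.date(2023, 6, 3)]
```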

name = 'TBD Array'
read_df(start: datetime | datetime64, end: datetime | datetime64 = None, validate: bool = True, **kwargs) DataFrame

Read a DataFrame from the array between a start and end date.

Parameters:
  • start (datetime.datetime | np.datetime64) – The start date for the data slice.

  • end (datetime.datetime | np.datetime64, optional) – The end date for the data slice. Defaults to None, in which case the slice ends one day after start.

  • validate (bool, optional) – Whether to validate the returned DataFrame. Defaults to True.

Returns:

A DataFrame containing the data for the specified date range. Returns an empty DataFrame if no data is found or on error.

Return type:

pd.DataFrame
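The default for end can be pictured with a small standard-library sketch. The helper resolve_window is hypothetical and only mirrors the documented rule (no end means one day after start); whether the real read treats the end bound as inclusive or exclusive is not stated here.

```python
from datetime import datetime, timedelta
from typing import Optional, Tuple


def resolve_window(
    start: datetime, end: Optional[datetime] = None
) -> Tuple[datetime, datetime]:
    """Hypothetical helper: return the (start, end) pair a read would cover."""
    if end is None:
        # Documented default: the slice ends one day after `start`.
        end = start + timedelta(days=1)
    return start, end


start, end = resolve_window(datetime(2023, 6, 1))
print(end)  # 2023-06-02 00:00:00
```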

view(network: str = '', station: str = '')

Generates a plot showing the dates for which data is available.

Parameters:
  • network (str, optional) – Network name to display in the title. Defaults to “”.

  • station (str, optional) – Station name to display in the title. Defaults to “”.

Raises:

ValueError – If no data is found in the array.

write_df(df: DataFrame, validate: bool = True)

Write a pandas DataFrame to the array.

When validate is True, the DataFrame is validated against the class’s dataframe_schema before being written.

Parameters:
  • df (pd.DataFrame) – The DataFrame to write.

  • validate (bool, optional) – Whether to validate the DataFrame. Defaults to True.

class es_sfgtools.tiledb_tools.tiledb_schemas.TDBAcousticArray(uri: Path | S3Path | str)

Bases: TBDArray

Handles TileDB storage for acoustic ranging data.

array_schema = ArraySchema(   domain=Domain(*[     Dim(name='time', domain=(numpy.datetime64('-292275055-05-16T16:47:04.193'), numpy.datetime64('292278994-08-17T07:12:55.807')), tile=numpy.timedelta64(-1,'ms'), dtype='datetime64[ms]', filters=FilterList([ZstdFilter(level=7), ])),     Dim(name='transponderID', domain=('', ''), tile=None, dtype='|S0', var=True, filters=FilterList([ZstdFilter(level=7), ])),   ]),   attrs=[     Attr(name='returnTime', dtype='datetime64[ns]', var=False, nullable=False, enum_label=None),     Attr(name='tt', dtype='float64', var=False, nullable=False, enum_label=None),     Attr(name='dbv', dtype='float32', var=False, nullable=False, enum_label=None),     Attr(name='xc', dtype='uint8', var=False, nullable=False, enum_label=None),     Attr(name='snr', dtype='float64', var=False, nullable=False, enum_label=None),     Attr(name='tat', dtype='float64', var=False, nullable=False, enum_label=None),   ],   cell_order='col-major',   tile_order='row-major',   capacity=10000,   sparse=True,   allows_duplicates=False, )
dataframe_schema

alias of AcousticDataFrame

get_unique_dates(field='triggerTime') ndarray

Gets unique dates from the ‘triggerTime’ field.

read_df(start: datetime, end: datetime = None, **kwargs) DataFrame

Reads acoustic data for a given time range.

write_df(df: DataFrame)

Writes an acoustic data DataFrame to the array.

class es_sfgtools.tiledb_tools.tiledb_schemas.TDBGNSSObsArray(uri: Path | S3Path | str)

Bases: TBDArray

Handles TileDB storage for GNSS observation data.

array_schema = ArraySchema(   domain=Domain(*[     Dim(name='time', domain=(315964800000, 4102444800000), tile=43200000, dtype='int64', filters=FilterList([ZstdFilter(level=7), ])),     Dim(name='sys', domain=(0, 254), tile=1, dtype='uint8', filters=FilterList([ZstdFilter(level=7), ])),     Dim(name='sat', domain=(0, 254), tile=1, dtype='uint8', filters=FilterList([ZstdFilter(level=7), ])),     Dim(name='obs', domain=(0, 65534), tile=1, dtype='uint16', filters=FilterList([ZstdFilter(level=7), ])),   ]),   attrs=[     Attr(name='range', dtype='float64', var=False, nullable=False, enum_label=None, filters=FilterList([FloatScaleFilter(factor=0.0001,offset=0.0,bytewidth=8), ZstdFilter(level=7), ])),     Attr(name='phase', dtype='float64', var=False, nullable=False, enum_label=None, filters=FilterList([FloatScaleFilter(factor=0.0001,offset=0.0,bytewidth=8), ZstdFilter(level=7), ])),     Attr(name='doppler', dtype='float64', var=False, nullable=False, enum_label=None, filters=FilterList([FloatScaleFilter(factor=0.0001,offset=0.0,bytewidth=8), ZstdFilter(level=7), ])),     Attr(name='snr', dtype='float32', var=False, nullable=False, enum_label=None, filters=FilterList([FloatScaleFilter(factor=0.001,offset=0.0,bytewidth=4), ZstdFilter(level=7), ])),     Attr(name='slip', dtype='uint16', var=False, nullable=False, enum_label=None, filters=FilterList([BitWidthReductionFilter(window=256), ZstdFilter(level=7), ])),     Attr(name='flags', dtype='uint16', var=False, nullable=False, enum_label=None, filters=FilterList([BitWidthReductionFilter(window=256), ZstdFilter(level=7), ])),     Attr(name='fcn', dtype='int8', var=False, nullable=False, enum_label=None, filters=FilterList([ZstdFilter(level=7), ])),   ],   cell_order='row-major',   tile_order='row-major',   capacity=500000,   sparse=True,   allows_duplicates=False, )
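The integer bounds on the time dimension above are easier to read as dates. The sketch below assumes they are milliseconds since the Unix epoch, which is an interpretation of the numbers and not something the schema states. Under that assumption the lower bound falls on 1980-01-06, which is the GPS epoch, and the tile extent is 12 hours.

```python
from datetime import datetime, timedelta, timezone

# Values copied from the `time` dimension of the schema above.
TIME_DOMAIN_MS = (315964800000, 4102444800000)
TIME_TILE_MS = 43200000

UNIX_EPOCH = datetime(1970, 1, 1, tzinfo=timezone.utc)


def ms_to_utc(ms: int) -> datetime:
    """Interpret an integer as milliseconds since the Unix epoch (assumption)."""
    return UNIX_EPOCH + timedelta(milliseconds=ms)


domain_start, domain_end = (ms_to_utc(ms) for ms in TIME_DOMAIN_MS)
tile_hours = TIME_TILE_MS / 3_600_000  # milliseconds per hour

print(domain_start.isoformat())  # 1980-01-06T00:00:00+00:00
print(domain_end.isoformat())    # 2100-01-01T00:00:00+00:00
print(tile_hours)                # 12.0
```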
get_unique_dates(field: str = 'time') ndarray

Gets unique dates from a specified datetime field in the array.

Parameters:

field (str, optional) – The name of the datetime field to query. Defaults to “time”.

Returns:

An array of unique dates, or None if an error occurs.

Return type:

np.ndarray
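The FloatScaleFilter entries in the TDBGNSSObsArray schema imply that range, phase, and doppler are stored at a resolution of 0.0001, and snr at 0.001. The sketch below imitates that quantization in plain Python, assuming the filter stores round((value - offset) / factor) as an integer and reverses it on read. The function names and the sample value are hypothetical; this is not TileDB's implementation.

```python
def float_scale_encode(value: float, factor: float, offset: float = 0.0) -> int:
    """Quantize a float to an integer count of `factor`-sized steps (sketch)."""
    return round((value - offset) / factor)


def float_scale_decode(stored: int, factor: float, offset: float = 0.0) -> float:
    """Recover an approximate float from the stored integer (sketch)."""
    return stored * factor + offset


# Illustrative range value in metres, not real data.
raw_range = 21_345_678.123456
stored = float_scale_encode(raw_range, factor=0.0001)
restored = float_scale_decode(stored, factor=0.0001)

print(stored)  # 213456781235
# Digits finer than the 0.0001 step are lost in the round trip.
print(abs(restored - raw_range) < 0.0001)  # True
```

The practical consequence, under the same assumption, is that a value read back from these attributes can differ from the value written by up to half the scale factor.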

class es_sfgtools.tiledb_tools.tiledb_schemas.TDBIMUPositionArray(uri: Path | S3Path | str)

Bases: TBDArray

Handles TileDB storage for IMU position and orientation data.

array_schema = ArraySchema(   domain=Domain(*[     Dim(name='time', domain=(numpy.datetime64('-292275055-05-16T16:47:04.193'), numpy.datetime64('292278994-08-17T07:12:55.807')), tile=numpy.timedelta64(-1,'ms'), dtype='datetime64[ms]', filters=FilterList([ZstdFilter(level=7), ])),   ]),   attrs=[     Attr(name='azimuth', dtype='float64', var=False, nullable=False, enum_label=None),     Attr(name='pitch', dtype='float64', var=False, nullable=False, enum_label=None),     Attr(name='roll', dtype='float64', var=False, nullable=False, enum_label=None),     Attr(name='latitude', dtype='float64', var=False, nullable=False, enum_label=None),     Attr(name='longitude', dtype='float64', var=False, nullable=False, enum_label=None),     Attr(name='height', dtype='float64', var=False, nullable=False, enum_label=None),     Attr(name='latitude_std', dtype='float64', var=False, nullable=True, enum_label=None),     Attr(name='longitude_std', dtype='float64', var=False, nullable=True, enum_label=None),     Attr(name='height_std', dtype='float64', var=False, nullable=True, enum_label=None),     Attr(name='northVelocity', dtype='float64', var=False, nullable=False, enum_label=None),     Attr(name='eastVelocity', dtype='float64', var=False, nullable=False, enum_label=None),     Attr(name='upVelocity', dtype='float64', var=False, nullable=False, enum_label=None),     Attr(name='northVelocity_std', dtype='float64', var=False, nullable=True, enum_label=None),     Attr(name='eastVelocity_std', dtype='float64', var=False, nullable=True, enum_label=None),     Attr(name='upVelocity_std', dtype='float64', var=False, nullable=True, enum_label=None),     Attr(name='roll_std', dtype='float64', var=False, nullable=True, enum_label=None),     Attr(name='pitch_std', dtype='float64', var=False, nullable=True, enum_label=None),     Attr(name='azimuth_std', dtype='float64', var=False, nullable=True, enum_label=None),   ],   cell_order='col-major',   tile_order='row-major',   capacity=10000,   sparse=True,   allows_duplicates=False, )
dataframe_schema

alias of IMUPositionDataFrame

get_unique_dates(field='time') ndarray

Gets unique dates from the ‘time’ field.

class es_sfgtools.tiledb_tools.tiledb_schemas.TDBKinPositionArray(uri: Path | S3Path | str)

Bases: TBDArray

Handles TileDB storage for kinematic GNSS position data.

array_schema = ArraySchema(   domain=Domain(*[     Dim(name='time', domain=(numpy.datetime64('-292275055-05-16T16:47:04.193'), numpy.datetime64('292278994-08-17T07:12:55.807')), tile=numpy.timedelta64(-1,'ms'), dtype='datetime64[ms]', filters=FilterList([ZstdFilter(level=7), ])),   ]),   attrs=[     Attr(name='latitude', dtype='float64', var=False, nullable=False, enum_label=None),     Attr(name='longitude', dtype='float64', var=False, nullable=False, enum_label=None),     Attr(name='height', dtype='float64', var=False, nullable=False, enum_label=None),     Attr(name='east', dtype='float64', var=False, nullable=False, enum_label=None),     Attr(name='north', dtype='float64', var=False, nullable=False, enum_label=None),     Attr(name='up', dtype='float64', var=False, nullable=False, enum_label=None),     Attr(name='number_of_satellites', dtype='uint8', var=False, nullable=False, enum_label=None),     Attr(name='pdop', dtype='float64', var=False, nullable=False, enum_label=None),     Attr(name='wrms', dtype='float64', var=False, nullable=False, enum_label=None),   ],   cell_order='col-major',   tile_order='row-major',   capacity=10000,   sparse=True,   allows_duplicates=False, )
dataframe_schema

alias of KinPositionDataFrame

get_unique_dates(field='time') ndarray

Gets unique dates from the ‘time’ field.

name = 'Kin Position Data'
class es_sfgtools.tiledb_tools.tiledb_schemas.TDBShotDataArray(uri: Path | S3Path | str)

Bases: TBDArray

Handles TileDB storage for processed shot data.

array_schema = ArraySchema(   domain=Domain(*[     Dim(name='pingTime', domain=(numpy.datetime64('1677-09-21T00:12:43.145224193'), numpy.datetime64('2262-04-11T23:47:16.854775807')), tile=numpy.timedelta64(-1,'ns'), dtype='datetime64[ns]', filters=FilterList([ZstdFilter(level=7), ])),     Dim(name='transponderID', domain=('', ''), tile=None, dtype='|S0', var=True, filters=FilterList([ZstdFilter(level=7), ])),   ]),   attrs=[     Attr(name='head0', dtype='float64', var=False, nullable=False, enum_label=None),     Attr(name='pitch0', dtype='float64', var=False, nullable=False, enum_label=None),     Attr(name='roll0', dtype='float64', var=False, nullable=False, enum_label=None),     Attr(name='head1', dtype='float64', var=False, nullable=False, enum_label=None),     Attr(name='pitch1', dtype='float64', var=False, nullable=False, enum_label=None),     Attr(name='roll1', dtype='float64', var=False, nullable=False, enum_label=None),     Attr(name='east0', dtype='float64', var=False, nullable=False, enum_label=None),     Attr(name='north0', dtype='float64', var=False, nullable=False, enum_label=None),     Attr(name='up0', dtype='float64', var=False, nullable=False, enum_label=None),     Attr(name='east1', dtype='float64', var=False, nullable=False, enum_label=None),     Attr(name='north1', dtype='float64', var=False, nullable=False, enum_label=None),     Attr(name='up1', dtype='float64', var=False, nullable=False, enum_label=None),     Attr(name='east_std0', dtype='float64', var=False, nullable=True, enum_label=None),     Attr(name='north_std0', dtype='float64', var=False, nullable=True, enum_label=None),     Attr(name='up_std0', dtype='float64', var=False, nullable=True, enum_label=None),     Attr(name='east_std1', dtype='float64', var=False, nullable=True, enum_label=None),     Attr(name='north_std1', dtype='float64', var=False, nullable=True, enum_label=None),     Attr(name='up_std1', dtype='float64', var=False, nullable=True, enum_label=None),     Attr(name='returnTime', dtype='datetime64[ns]', var=False, nullable=False, enum_label=None),     Attr(name='tt', dtype='float64', var=False, nullable=False, enum_label=None),     Attr(name='dbv', dtype='float32', var=False, nullable=False, enum_label=None),     Attr(name='xc', dtype='uint8', var=False, nullable=False, enum_label=None),     Attr(name='snr', dtype='float64', var=False, nullable=False, enum_label=None),     Attr(name='tat', dtype='float64', var=False, nullable=False, enum_label=None),     Attr(name='isUpdated', dtype='bool', var=False, nullable=False, enum_label=None),   ],   cell_order='col-major',   tile_order='row-major',   capacity=10000,   sparse=True,   allows_duplicates=False, )
dataframe_schema

alias of ShotDataFrame

get_unique_dates(field='pingTime') ndarray

Gets unique dates from the ‘pingTime’ field.

name = 'Shot Data'
read_df(start: datetime, end: datetime = None, **kwargs) DataFrame

Read a DataFrame from the array between the start and end dates.

Parameters:
  • start (datetime.datetime) – The start date.

  • end (datetime.datetime, optional) – The end date. Defaults to None.

Returns:

A DataFrame of shot data, or None on error.

Return type:

pd.DataFrame

write_df(df: DataFrame, validate: bool = True)

Write a shot data DataFrame to the array.

Handles conversion of timestamp columns from float or datetime objects to the required nanosecond-precision numpy datetime64 format.

Parameters:
  • df (pd.DataFrame) – The dataframe to write.

  • validate (bool, optional) – Whether to validate the dataframe. Defaults to True.
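The timestamp conversion described for this write_df can be sketched without numpy by producing integer nanoseconds since the Unix epoch, which is the value a datetime64[ns] holds internally. The helper to_epoch_ns is hypothetical. It assumes that float inputs are Unix seconds and that naive datetime objects are in UTC; neither assumption is confirmed by this page, and the library's own conversion may differ.

```python
from datetime import datetime, timezone
from typing import Union

UNIX_EPOCH = datetime(1970, 1, 1, tzinfo=timezone.utc)
NS_PER_SECOND = 1_000_000_000


def to_epoch_ns(value: Union[float, datetime]) -> int:
    """Hypothetical helper: convert a timestamp to integer nanoseconds."""
    if isinstance(value, datetime):
        if value.tzinfo is None:
            # Assumption: naive datetimes are already in UTC.
            value = value.replace(tzinfo=timezone.utc)
        delta = value - UNIX_EPOCH
        # Integer arithmetic avoids the rounding of timedelta.total_seconds().
        whole_seconds = delta.days * 86_400 + delta.seconds
        return whole_seconds * NS_PER_SECOND + delta.microseconds * 1_000
    # Assumption: floats are seconds since the Unix epoch.
    return round(value * NS_PER_SECOND)


from_float = to_epoch_ns(1685577600.5)
from_datetime = to_epoch_ns(datetime(2023, 6, 1, 0, 0, 0, 500000))
print(from_float == from_datetime)  # True
```

A float carries only about 15 to 16 significant digits, so timestamps held as float seconds cannot represent every nanosecond; converting to an integer representation before writing keeps the stored value exact from that point on.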