es_sfgtools.data_mgmt.assetcatalog.handler module

class es_sfgtools.data_mgmt.assetcatalog.handler.PreProcessCatalogHandler(db_path: Path)

Bases: object

Manages the preprocessing asset catalog, a database located at db_path.

add_entry(entry: AssetEntry) → bool

Adds an entry to the database.

Parameters:

entry (AssetEntry) – The entry to add.

Returns:

True if the entry was added, False otherwise.

Return type:

bool
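The True/False contract above can be illustrated with a minimal, self-contained sketch. It does not use es_sfgtools: the `assets` table, its columns, and the free function `add_entry` below are hypothetical stand-ins, and treating a duplicate row as the reason for returning False is an assumption, since the reference does not say when False is returned.

```python
import sqlite3


def add_entry(conn: sqlite3.Connection, local_path: str, asset_type: str) -> bool:
    """Insert one row; return True on success, False on a constraint violation."""
    try:
        with conn:  # commits on success, rolls back on error
            conn.execute(
                "INSERT INTO assets (local_path, type) VALUES (?, ?)",
                (local_path, asset_type),
            )
        return True
    except sqlite3.IntegrityError:
        # Assumption: a duplicate local_path is what makes the add fail.
        return False


# Hypothetical schema, for illustration only.
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE assets (id INTEGER PRIMARY KEY, local_path TEXT UNIQUE, type TEXT)"
)

first = add_entry(conn, "/data/a.ctd", "ctd")   # new row
second = add_entry(conn, "/data/a.ctd", "ctd")  # same path again
```

Under these assumptions the first call succeeds and the second is rejected by the UNIQUE constraint.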

add_merge_job(parent_type: str, child_type: str, parent_ids: List[int], **kwargs)

Adds a merge job to the database.

Parameters:
  • parent_type (str) – The parent asset type.

  • child_type (str) – The child asset type.

  • parent_ids (List[int]) – The parent asset IDs.

add_or_update(entry: AssetEntry) → bool

Adds or updates an entry in the database.

Parameters:

entry (AssetEntry) – The entry to add or update.

Returns:

True if the entry was added or updated, False otherwise.

Return type:

bool
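One common way to get add-or-update behavior is to try an UPDATE first and fall back to an INSERT when no row matched. The sketch below shows that pattern with the standard-library `sqlite3` module only. The table layout, the choice of `local_path` as the matching key, and the function signature are all assumptions made for illustration, not the handler's actual implementation.

```python
import sqlite3


def add_or_update(conn: sqlite3.Connection, local_path: str, remote_path: str) -> bool:
    """Update the row matching local_path, or insert it if none exists."""
    try:
        with conn:  # one transaction: commit on success, roll back on error
            cur = conn.execute(
                "UPDATE assets SET remote_path = ? WHERE local_path = ?",
                (remote_path, local_path),
            )
            if cur.rowcount == 0:  # nothing matched, so this is a new entry
                conn.execute(
                    "INSERT INTO assets (local_path, remote_path) VALUES (?, ?)",
                    (local_path, remote_path),
                )
        return True
    except sqlite3.Error:
        return False


# Hypothetical schema, for illustration only.
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE assets (id INTEGER PRIMARY KEY, local_path TEXT UNIQUE, remote_path TEXT)"
)

add_or_update(conn, "/data/a.ctd", "bucket/a_v1.ctd")  # inserts
add_or_update(conn, "/data/a.ctd", "bucket/a_v2.ctd")  # updates the same row
rows = conn.execute("SELECT local_path, remote_path FROM assets").fetchall()
```

After both calls the table holds a single row carrying the second remote path.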

delete_entries(network: str, station: str, campaign: str, type: AssetType | str, where: str = None) → None

Deletes entries from the Assets table based on the specified criteria.

Parameters:
  • network (str) – The network identifier for the assets to be deleted.

  • station (str) – The station identifier for the assets to be deleted.

  • campaign (str) – The campaign identifier for the assets to be deleted.

  • type (AssetType | str) – The type of asset to be deleted. Can be an AssetType enum or a string representation.

  • where (str, optional) – Additional SQL conditions to filter the assets to be deleted, by default None.

Raises:
  • KeyError – If the provided asset type string is invalid and cannot be mapped to an AssetType enum.

  • Exception – If an error occurs during the deletion process.
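The interplay of the `type` and `where` parameters can be sketched as follows. This is a stand-alone illustration rather than the library's code: the two-member `AssetType` enum, the name-based lookup that raises KeyError, the `assets` schema, and the way `where` is appended to the statement are all assumptions.

```python
import sqlite3
from enum import Enum
from typing import Optional, Union


class AssetType(Enum):  # hypothetical stand-in for the real enum
    CTD = "ctd"
    SVP = "svp"


def delete_entries(
    conn: sqlite3.Connection,
    network: str,
    station: str,
    campaign: str,
    type: Union[AssetType, str],
    where: Optional[str] = None,
) -> None:
    if isinstance(type, str):
        type = AssetType[type.upper()]  # raises KeyError for unknown names
    sql = (
        "DELETE FROM assets "
        "WHERE network = ? AND station = ? AND campaign = ? AND type = ?"
    )
    if where:
        sql += f" AND ({where})"  # raw SQL fragment: pass trusted text only
    with conn:
        conn.execute(sql, (network, station, campaign, type.value))


# Hypothetical schema and rows, for illustration only.
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE assets "
    "(network TEXT, station TEXT, campaign TEXT, type TEXT, local_path TEXT)"
)
conn.executemany(
    "INSERT INTO assets VALUES (?, ?, ?, ?, ?)",
    [
        ("net", "sta", "2024_A", "ctd", "/old/a.ctd"),
        ("net", "sta", "2024_A", "ctd", "/new/b.ctd"),
        ("net", "sta", "2024_A", "svp", "/old/c.svp"),
    ],
)

delete_entries(conn, "net", "sta", "2024_A", "ctd", where="local_path LIKE '/old/%'")
remaining = [
    row[0] for row in conn.execute("SELECT local_path FROM assets ORDER BY local_path")
]
```

Because `where` is described as additional SQL conditions, a string built this way is interpolated into the statement as-is, so it should never be assembled from untrusted input.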

delete_entry(entry: AssetEntry) → bool

Deletes an entry from the database.

Parameters:

entry (AssetEntry) – The entry to delete.

Returns:

True if the entry was deleted, False otherwise.

Return type:

bool

find_entry(entry: AssetEntry) → AssetEntry | None

Finds an entry in the database.

Parameters:

entry (AssetEntry) – The entry to find.

Returns:

The entry if found, otherwise None.

Return type:

AssetEntry | None

get_assets(network: str, station: str, campaign: str, type: AssetType | str) → List[AssetEntry]

Gets assets for a given network, station, campaign, and type.

Parameters:
  • network (str) – The network name.

  • station (str) – The station name.

  • campaign (str) – The campaign name.

  • type (AssetType | str) – The asset type.

Returns:

A list of assets.

Return type:

List[AssetEntry]

get_ctds(station: str, campaign: str) → List[AssetEntry]

Get all svp, ctd and seabird assets for a given station and campaign.

Parameters:
  • station (str) – The station.

  • campaign (str) – The campaign.

Returns:

A list of AssetEntry objects.

Return type:

List[AssetEntry]

get_dtype_counts(network: str, station: str, campaign: str, **kwargs) → Dict[str, int]

Gets the counts of each data type for a given network, station, and campaign.

Parameters:
  • network (str) – The network name.

  • station (str) – The station name.

  • campaign (str) – The campaign name.

Returns:

A dictionary of data types and their counts.

Return type:

Dict[str, int]
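The shape of the returned dictionary can be shown with a grouped count over a toy table. This is a sketch under stated assumptions: the `assets` schema, the sample rows, and the use of a SQL GROUP BY are illustrative choices, and the real method may compute the counts differently.

```python
import sqlite3
from typing import Dict


def get_dtype_counts(
    conn: sqlite3.Connection, network: str, station: str, campaign: str
) -> Dict[str, int]:
    """Count rows per asset type for one network/station/campaign."""
    cur = conn.execute(
        "SELECT type, COUNT(*) FROM assets "
        "WHERE network = ? AND station = ? AND campaign = ? "
        "GROUP BY type",
        (network, station, campaign),
    )
    return {asset_type: count for asset_type, count in cur}


# Hypothetical schema and rows, for illustration only.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE assets (network TEXT, station TEXT, campaign TEXT, type TEXT)")
conn.executemany(
    "INSERT INTO assets VALUES (?, ?, ?, ?)",
    [
        ("net", "sta", "2024_A", "ctd"),
        ("net", "sta", "2024_A", "ctd"),
        ("net", "sta", "2024_A", "svp"),
        ("net", "sta", "2023_B", "ctd"),  # different campaign, not counted
    ],
)

counts = get_dtype_counts(conn, "net", "sta", "2024_A")
```

Types with no matching rows are simply absent from the result in this sketch; whether the real method reports them with a zero count is not stated in the reference.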

get_local_assets(network: str, station: str, campaign: str, type: AssetType) → List[AssetEntry]

Get local assets for a given network, station, campaign, and type.

Parameters:
  • network (str) – The network.

  • station (str) – The station.

  • campaign (str) – The campaign.

  • type (AssetType) – The asset type.

Returns:

A list of AssetEntry objects.

Return type:

List[AssetEntry]

get_single_entries_to_process(network: str, station: str, campaign: str, parent_type: AssetType, child_type: AssetType = None, override: bool = False, local_only: bool = False) → List[AssetEntry]

Get single entries to process.

Parameters:
  • network (str) – The network name.

  • station (str) – The station name.

  • campaign (str) – The campaign name.

  • parent_type (AssetType) – The parent asset type.

  • child_type (AssetType, optional) – The child asset type, by default None.

  • override (bool, optional) – Whether to override existing entries, by default False.

  • local_only (bool, optional) – Whether to consider only local entries, by default False.

Returns:

A list of assets.

Return type:

List[AssetEntry]

is_merge_complete(parent_type: str, child_type: str, parent_ids: List[int], **kwargs) → bool

Checks if a merge job is complete.

Parameters:
  • parent_type (str) – The parent asset type.

  • child_type (str) – The child asset type.

  • parent_ids (List[int]) – The parent asset IDs.

Returns:

True if the merge job is complete, False otherwise.

Return type:

bool
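add_merge_job and is_merge_complete take the same three arguments, which suggests they are used as a record-then-check pair. The sketch below shows one way such bookkeeping could work. It rests on several assumptions: the `mergejobs` table, the order-independent key built from sorted parent IDs, and the rule that a recorded job counts as complete are all hypothetical and not taken from the library.

```python
import sqlite3
from typing import List


def _ids_key(parent_ids: List[int]) -> str:
    """Build an order-independent key such as '1-2-3' from the parent IDs."""
    return "-".join(str(i) for i in sorted(parent_ids))


def add_merge_job(
    conn: sqlite3.Connection, parent_type: str, child_type: str, parent_ids: List[int]
) -> None:
    with conn:
        # OR IGNORE makes recording the same job twice a no-op.
        conn.execute(
            "INSERT OR IGNORE INTO mergejobs VALUES (?, ?, ?)",
            (parent_type, child_type, _ids_key(parent_ids)),
        )


def is_merge_complete(
    conn: sqlite3.Connection, parent_type: str, child_type: str, parent_ids: List[int]
) -> bool:
    row = conn.execute(
        "SELECT 1 FROM mergejobs "
        "WHERE parent_type = ? AND child_type = ? AND parent_ids = ?",
        (parent_type, child_type, _ids_key(parent_ids)),
    ).fetchone()
    return row is not None


# Hypothetical schema, for illustration only.
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE mergejobs ("
    "parent_type TEXT, child_type TEXT, parent_ids TEXT, "
    "PRIMARY KEY (parent_type, child_type, parent_ids))"
)

before = is_merge_complete(conn, "ctd", "svp", [3, 1, 2])
add_merge_job(conn, "ctd", "svp", [1, 2, 3])
after = is_merge_complete(conn, "ctd", "svp", [3, 1, 2])
```

Sorting the IDs before building the key means the same set of parents matches regardless of the order in which it is passed.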

query_catalog(query: str) → DataFrame

Queries the catalog.

Parameters:

query (str) – The query to execute.

Returns:

A dataframe with the results.

Return type:

pd.DataFrame

remote_file_exist(network: str, station: str, campaign: str, type: AssetType, remote_path: str) → bool

Check if a remote file name exists in the catalog as a local file.

Parameters:
  • network (str) – The network.

  • station (str) – The station.

  • campaign (str) – The campaign.

  • type (AssetType) – The asset type.

  • remote_path (str) – The remote path.

Returns:

True if the file exists, False if not.

Return type:

bool

update_local_path(id, local_path: str)

Update the local path for an entry in the database.

Parameters:
  • id (int) – The id of the entry to update.

  • local_path (str) – The new local path.