es_sfgtools.workflows.pipelines.sv3_pipeline module

class es_sfgtools.workflows.pipelines.sv3_pipeline.SV3Pipeline(directory_handler: DirectoryHandler | None = None, asset_catalog: PreProcessCatalogHandler | None = None, config: SV3PipelineConfig = None)

Bases: WorkflowABC

Orchestrates the end-to-end processing of Sonardyne SV3 and Novatel GNSS data for seafloor geodesy.

This class manages a comprehensive workflow for processing seafloor geodesy data, including:

  1. GNSS Data Preprocessing:
       • Processes Novatel 770 binary files (primary GNSS observations)
       • Processes Novatel 000 binary files (secondary GNSS + IMU positions)
       • Stores observations in TileDB arrays for efficient access

  2. RINEX Generation:
       • Converts TileDB GNSS observations to daily RINEX files
       • Manages RINEX metadata and file organization

  3. Precise Point Positioning:
       • Downloads GNSS product files (SP3, OBX, ATT)
       • Runs PRIDE-PPPAR for high-precision positioning
       • Generates kinematic (KIN) and residual files

  4. Kinematic Position Processing:
       • Converts KIN files to structured dataframes
       • Stores kinematic positions in TileDB for interpolation

  5. Acoustic Data Processing:
       • Processes Sonardyne DFOP00 files (acoustic ping-reply sequences)
       • Generates preliminary shotdata with acoustic ranges

  6. Shotdata Refinement:
       • Interpolates high-precision GNSS positions to acoustic ping times
       • Refines shotdata with improved position estimates

  7. Sound Velocity Profile Processing:
       • Processes CTD and Seabird files
       • Generates sound velocity profiles for acoustic corrections

The pipeline operates on a hierarchical directory structure (network/station/campaign) and uses TileDB for efficient storage and retrieval of time-series data.

directory_handler

Manages the project directory structure, including network, station, and campaign directories.

Type:

DirectoryHandler

config

Configuration settings for all pipeline steps, including Novatel, RINEX, PRIDE, DFOP00, and position update configs.

Type:

SV3PipelineConfig

asset_catalog

SQLite-based catalog for tracking processed assets and their relationships (parent-child, merge jobs).

Type:

PreProcessCatalogHandler

current_network

Current network identifier (e.g., “cascadia-gorda”).

Type:

str

current_station

Current station identifier (e.g., “NCC1”).

Type:

str

current_campaign

Current campaign identifier (e.g., “2023_A_1126”).

Type:

str

current_network_dir

Directory object for current network.

Type:

NetworkDir

current_station_dir

Directory object for current station.

Type:

StationDir

current_campaign_dir

Directory object for current campaign.

Type:

CampaignDir

shotDataPreTDB

Preliminary shotdata (before position refinement).

Type:

TDBShotDataArray

kinPositionTDB

High-precision kinematic positions.

Type:

TDBKinPositionArray

imuPositionTDB

IMU-derived positions (from Novatel 000).

Type:

TDBIMUPositionArray

shotDataFinalTDB

Final shotdata (after position refinement).

Type:

TDBShotDataArray

gnssObsTDBURI

Primary GNSS observation array (from Novatel 770).

Type:

Path

gnssObsTDB_secondaryURI

Secondary GNSS observation array (from Novatel 000).

Type:

Path

set_network_station_campaign(network_id, station_id, campaign_id)

Set the current processing context and initialize directories and TileDB arrays.

_build_rinex_metadata()

Prepare metadata for RINEX file generation from GNSS observations.

pre_process_novatel()

Preprocess Novatel 770 and 000 binary files into TileDB arrays.

get_rinex_files()

Generate daily RINEX files from TileDB GNSS observations.

process_rinex()

Process RINEX files using PRIDE-PPPAR to generate kinematic (KIN) files.

process_kin()

Convert KIN files to structured dataframes and store them in TileDB.

process_dfop00()

Process Sonardyne DFOP00 files to generate preliminary shotdata.

update_shotdata()

Refine shotdata by interpolating high-precision GNSS positions.

process_svp()

Process CTD and Seabird files to generate sound velocity profiles.

run_pipeline()

Execute the full processing pipeline in sequence.

get_rinex_files() → None

Generate and catalog daily RINEX files for the current campaign.

Steps:

  1. Consolidates GNSS observation data

  2. Determines processing year from config or campaign name

  3. Invokes tile2rinex to generate daily RINEX files

  4. Creates AssetEntry for each RINEX file

  5. Updates asset catalog with merge job

Raises:
  • ValueError – If a processing year cannot be determined from the campaign name.

  • Exception – If an error occurs during RINEX file generation.
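Step 2 above can be sketched as follows. This is a minimal illustration, assuming campaign names start with a four-digit year as in “2023_A_1126”; the function name `resolve_processing_year` and its `configured_year` parameter are hypothetical and not part of es_sfgtools.

```python
import re
from typing import Optional


def resolve_processing_year(campaign_name: str, configured_year: Optional[int] = None) -> int:
    """Return the configured year if given, else the leading 4-digit year of the campaign name."""
    # A year set in the configuration takes precedence over the campaign name.
    if configured_year is not None:
        return configured_year
    # Assumption: the campaign name begins with YYYY followed by "_" or end of string.
    match = re.match(r"(\d{4})(?:_|$)", campaign_name)
    if match is None:
        raise ValueError(
            f"Cannot determine processing year from campaign name {campaign_name!r}"
        )
    return int(match.group(1))
```

Raising ValueError when no year can be parsed mirrors the documented failure mode; the actual parsing rule used by the pipeline may differ.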

mid_process_workflow: bool = False

pre_process_novatel() → None

Preprocess Novatel 770 and 000 binary files for the current context.

Processing steps:

  1. Novatel 770: Extracts GNSS observations to the primary TileDB array

  2. Novatel 000: Extracts GNSS observations to the secondary array, plus IMU positions

Both steps check if processing is needed (via override config or merge status) and update the asset catalog upon completion.

Raises:

Exception – If no Novatel 770 or 000 files are found.
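The "override config or merge status" check could look roughly like the sketch below. Everything here is an assumption made for illustration: the `merge_jobs` table, its columns, and the helper names are hypothetical and do not reflect the real catalog schema. The idea is only that a step is skipped when the same set of parent assets has already been merged into the same kind of output.

```python
import sqlite3
from typing import Iterable


def open_catalog() -> sqlite3.Connection:
    """Create an in-memory catalog with a hypothetical merge_jobs table."""
    conn = sqlite3.connect(":memory:")
    conn.execute(
        "CREATE TABLE merge_jobs (child_type TEXT, parent_ids TEXT, "
        "PRIMARY KEY (child_type, parent_ids))"
    )
    return conn


def _merge_key(parent_ids: Iterable[int]) -> str:
    # Sort so the same set of parents always yields the same key.
    return ",".join(str(i) for i in sorted(parent_ids))


def needs_processing(conn: sqlite3.Connection, child_type: str,
                     parent_ids: Iterable[int], override: bool = False) -> bool:
    """Return True if override is set or this parent set was never merged into child_type."""
    if override:
        return True
    row = conn.execute(
        "SELECT 1 FROM merge_jobs WHERE child_type = ? AND parent_ids = ?",
        (child_type, _merge_key(parent_ids)),
    ).fetchone()
    return row is None


def record_merge(conn: sqlite3.Connection, child_type: str,
                 parent_ids: Iterable[int]) -> None:
    """Remember that these parents have been merged into child_type."""
    conn.execute(
        "INSERT OR IGNORE INTO merge_jobs VALUES (?, ?)",
        (child_type, _merge_key(parent_ids)),
    )
```

With this shape, adding a new raw file changes the parent set, so the check reports that processing is needed again.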

process_dfop00() → None

Process Sonardyne DFOP00 files to generate preliminary shotdata.

Steps:

  1. Retrieves DFOP00 files needing processing

  2. Converts each file to shotdata dataframe (acoustic ping-reply sequences)

  3. Writes dataframes to preliminary shotdata TileDB array

  4. Marks files as processed in asset catalog

Uses multiprocessing for efficient parallel processing.

process_kin() → None

Process KIN files to generate kinematic position dataframes.

Steps:

  1. Retrieves KIN files needing processing

  2. Converts each KIN file to a structured dataframe

  3. Writes dataframes to kinematic position TileDB array

  4. Marks files as processed in asset catalog

process_rinex() → None

Run PRIDE-PPPAR on RINEX files to generate KIN and residual files.

Processing steps:

  1. Retrieves RINEX files needing processing

  2. Downloads GNSS product files (SP3, OBX, ATT) for each unique DOY

  3. Runs PRIDE-PPPAR in parallel to convert RINEX to KIN format

  4. Adds KIN and residual files to asset catalog

Uses multiprocessing for efficient parallel processing of multiple RINEX files.
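Downloading products once per unique day of year (DOY) avoids repeated fetches when several RINEX files fall on the same day. A minimal sketch of that de-duplication, assuming each RINEX file can be mapped to a calendar date; the helper name `unique_year_doy` is hypothetical:

```python
from datetime import date
from typing import Iterable


def unique_year_doy(rinex_dates: Iterable[date]) -> list[tuple[int, int]]:
    """Collapse observation dates to sorted, de-duplicated (year, day-of-year) pairs."""
    # tm_yday is the 1-based day of year (1 January -> 1).
    return sorted({(d.year, d.timetuple().tm_yday) for d in rinex_dates})
```

The year is kept alongside the DOY so that campaigns spanning a year boundary do not collide.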

process_svp(override: bool = False) → None

Process CTD and Seabird files to generate sound velocity profiles (SVP).

Processing order:

  1. Tries CTD files with CTD_to_svp_v2

  2. If that fails, tries CTD_to_svp_v1

  3. If still no success, tries Seabird files

The first successful SVP is saved to the campaign directory and processing stops.

Parameters:

override (bool, optional) – If True, forces reprocessing even if SVP file exists. Default is False.
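The "first success wins" ordering can be illustrated with a small fallback chain. This is a sketch only: it assumes a failed attempt shows up as either an exception or an empty result, and the converters are passed in as plain callables rather than calling the real CTD_to_svp_v2, CTD_to_svp_v1, or Seabird readers. The name `first_successful_svp` is hypothetical.

```python
from pathlib import Path
from typing import Callable, Iterable, Optional, Sequence

# A profile is modelled here as rows of (depth, sound speed); this is an assumption.
Profile = Sequence[tuple[float, float]]
Converter = Callable[[Path], Profile]


def first_successful_svp(
    attempts: Iterable[tuple[Converter, Iterable[Path]]],
) -> Optional[Profile]:
    """Try each (converter, files) pair in order and return the first non-empty profile."""
    for converter, files in attempts:
        for path in files:
            try:
                profile = converter(path)
            except Exception:
                # Treat any converter error as "this attempt failed" and move on.
                continue
            if profile:
                # Stop at the first usable profile; later converters are never called.
                return profile
    return None
```

A caller would pass the attempts in priority order, for example the v2 CTD reader first, then v1, then the Seabird reader.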

run_pipeline() → None

Execute the complete SV3 data processing pipeline in sequence.

Pipeline steps (in order):

  1. pre_process_novatel(): Process Novatel GNSS data

  2. get_rinex_files(): Generate RINEX files

  3. process_rinex(): Run PRIDE-PPPAR on RINEX

  4. process_kin(): Convert KIN files to dataframes

  5. process_dfop00(): Process acoustic data

  6. update_shotdata(): Refine shotdata with high-precision positions

  7. process_svp(): Generate sound velocity profile

Each step checks if processing is needed via config overrides or catalog status.
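Because every step is a zero-argument method, the documented order can also be driven by name, which is convenient for re-running only the tail of the pipeline. The sketch below is not how run_pipeline() is implemented; `PIPELINE_STEPS` and `run_steps` are hypothetical helpers that only rely on the method names listed above.

```python
# Method names in the documented execution order.
PIPELINE_STEPS = (
    "pre_process_novatel",
    "get_rinex_files",
    "process_rinex",
    "process_kin",
    "process_dfop00",
    "update_shotdata",
    "process_svp",
)


def run_steps(pipeline, steps=PIPELINE_STEPS) -> list[str]:
    """Call each named method on `pipeline` in order and return the names that ran."""
    completed = []
    for name in steps:
        # Look the method up by name and call it with no arguments.
        getattr(pipeline, name)()
        completed.append(name)
    return completed
```

For example, `run_steps(pipeline, PIPELINE_STEPS[4:])` would run only the acoustic, refinement, and SVP stages on an object that exposes those methods.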

set_network_station_campaign(network_id: str, station_id: str, campaign_id: str) → None

Set the current network, station, and campaign context for pipeline processing.

This method establishes the processing context and performs several initialization tasks:

  1. Resets previous context and clears TileDB arrays if context changes

  2. Calls parent method to handle context switching

  3. Validates data availability

  4. Initializes TileDB arrays

  5. Configures logging

  6. Prepares RINEX metadata

Parameters:
  • network_id (str) – Network identifier (e.g., “cascadia-gorda”).

  • station_id (str) – Station identifier (e.g., “NCC1”).

  • campaign_id (str) – Campaign identifier (e.g., “2023_A_1126”).

update_shotdata()

Refine shotdata with interpolated high-precision kinematic positions.
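As an illustration of interpolating positions to ping times, the sketch below uses plain linear interpolation between the two bracketing kinematic epochs. The interpolation scheme actually used by update_shotdata() is not stated here, so linear interpolation, the (east, north, up) tuple layout, and the name `interpolate_position` are all assumptions.

```python
from bisect import bisect_left
from typing import Optional, Sequence

Position = tuple[float, float, float]  # assumed (east, north, up) layout


def interpolate_position(
    ping_time: float,
    times: Sequence[float],
    positions: Sequence[Position],
) -> Optional[Position]:
    """Linearly interpolate a position at ping_time; return None outside the covered span."""
    # `times` must be sorted ascending and aligned index-for-index with `positions`.
    if not times or ping_time < times[0] or ping_time > times[-1]:
        return None
    i = bisect_left(times, ping_time)
    if times[i] == ping_time:
        # Exact epoch match: no interpolation needed.
        return positions[i]
    t0, t1 = times[i - 1], times[i]
    weight = (ping_time - t0) / (t1 - t0)
    return tuple(a + weight * (b - a) for a, b in zip(positions[i - 1], positions[i]))
```

Returning None outside the covered time span avoids extrapolating positions for pings that have no surrounding GNSS solution.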

es_sfgtools.workflows.pipelines.sv3_pipeline.rinex_to_kin_wrapper(rinex_prideconfig_path: tuple[AssetEntry, Path], writedir: Path, pridedir: Path, site: str, pride_config: PrideCLIConfig) → tuple[AssetEntry | None, AssetEntry | None]

Wrapper function to convert a RINEX file to KIN format using PRIDE configuration.

This function takes a tuple containing an AssetEntry for the RINEX file and the path to the PRIDE configuration file, along with directories for writing output and PRIDE processing, the site name, and a PRIDE CLI configuration object. It updates the PRIDE configuration with the provided config file path, then calls rinex_to_kin to perform the conversion. If successful, it returns AssetEntry objects for the generated KIN file and its residuals file; otherwise, returns (None, None).

Parameters:
  • rinex_prideconfig_path (tuple[AssetEntry, Path]) – Tuple containing the RINEX AssetEntry and PRIDE config file path.

  • writedir (Path) – Directory where output files should be written.

  • pridedir (Path) – Directory for PRIDE processing.

  • site (str) – Name of the site/station.

  • pride_config (PrideCLIConfig) – PRIDE CLI configuration object.

Returns:

AssetEntry for the generated KIN file and AssetEntry for the residuals file, or (None, None) if conversion fails.

Return type:

tuple[Optional[AssetEntry], Optional[AssetEntry]]

Raises:

Exception – If an error occurs during AssetEntry creation for KIN or RES file.
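Taking the per-file inputs as a single tuple in the first position makes a function like this easy to hand to a map-style parallel call, with the shared arguments bound once. The sketch below shows that calling pattern with stand-ins only: `convert_job`, `fake_convert`, and the output file names are hypothetical, plain paths replace AssetEntry objects, and the built-in `map` stands in for a process pool.

```python
from functools import partial
from pathlib import Path
from typing import Callable, Optional

# Hypothetical converter type: returns (kin_path, res_path), or None when conversion fails.
ConvertFn = Callable[[Path, Path, Path, str], Optional[tuple[Path, Path]]]


def convert_job(job: tuple[Path, Path], writedir: Path, site: str, convert: ConvertFn):
    """Unpack one (rinex, config) job and normalise a failed conversion to (None, None)."""
    rinex_path, config_path = job
    result = convert(rinex_path, config_path, writedir, site)
    return result if result is not None else (None, None)


def fake_convert(rinex_path: Path, config_path: Path, writedir: Path, site: str):
    # Placeholder logic: pretend that only files ending in ".rnx" convert successfully.
    if rinex_path.suffix != ".rnx":
        return None
    return (writedir / f"{rinex_path.stem}.kin", writedir / f"{rinex_path.stem}.res")


jobs = [(Path("day330.rnx"), Path("cfg330")), (Path("broken.txt"), Path("cfg331"))]
# Bind the arguments shared by every job so the mapped callable takes only the tuple.
worker = partial(convert_job, writedir=Path("out"), site="NCC1", convert=fake_convert)
results = list(map(worker, jobs))
```

Callers can then drop the `(None, None)` pairs before cataloging the outputs.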