es_sfgtools.workflows.pipelines.sv3_pipeline module
- class es_sfgtools.workflows.pipelines.sv3_pipeline.SV3Pipeline(directory_handler: DirectoryHandler | None = None, asset_catalog: PreProcessCatalogHandler | None = None, config: SV3PipelineConfig = None)
Bases: WorkflowABC
Orchestrates the end-to-end processing of Sonardyne SV3 and Novatel GNSS data for seafloor geodesy.
This class manages a comprehensive workflow for processing seafloor geodesy data, including:
GNSS Data Preprocessing:
  - Processes Novatel 770 binary files (primary GNSS observations)
  - Processes Novatel 000 binary files (secondary GNSS + IMU positions)
  - Stores observations in TileDB arrays for efficient access
RINEX Generation:
  - Converts TileDB GNSS observations to daily RINEX files
  - Manages RINEX metadata and file organization
Precise Point Positioning:
  - Downloads GNSS product files (SP3, OBX, ATT)
  - Runs PRIDE-PPPAR for high-precision positioning
  - Generates kinematic (KIN) and residual files
Kinematic Position Processing:
  - Converts KIN files to structured dataframes
  - Stores kinematic positions in TileDB for interpolation
Acoustic Data Processing:
  - Processes Sonardyne DFOP00 files (acoustic ping-reply sequences)
  - Generates preliminary shotdata with acoustic ranges
Shotdata Refinement:
  - Interpolates high-precision GNSS positions to acoustic ping times
  - Refines shotdata with improved position estimates
Sound Velocity Profile Processing:
  - Processes CTD and Seabird files
  - Generates sound velocity profiles for acoustic corrections
The pipeline operates on a hierarchical directory structure (network/station/campaign) and uses TileDB for efficient storage and retrieval of time-series data.
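As a minimal sketch of the network/station/campaign hierarchy, the snippet below joins the three identifiers into a single path. The helper name `campaign_path` and the root `/data/sfg` are hypothetical and are not part of es_sfgtools; the actual layout is managed by `DirectoryHandler`.

```python
from pathlib import PurePosixPath


def campaign_path(root: str, network: str, station: str, campaign: str) -> PurePosixPath:
    """Hypothetical helper: build root/network/station/campaign."""
    return PurePosixPath(root) / network / station / campaign


# Identifiers taken from the examples in this documentation.
path = campaign_path("/data/sfg", "cascadia-gorda", "NCC1", "2023_A_1126")
print(path)  # /data/sfg/cascadia-gorda/NCC1/2023_A_1126
```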
- directory_handler
Manages the project directory structure, including network, station, and campaign directories.
- Type:
- config
Configuration settings for all pipeline steps, including Novatel, RINEX, PRIDE, DFOP00, and position update configs.
- Type:
- asset_catalog
SQLite-based catalog for tracking processed assets and their relationships (parent-child, merge jobs).
- Type:
PreProcessCatalog
- current_network
Current network identifier (e.g., “cascadia-gorda”).
- Type:
str
- current_station
Current station identifier (e.g., “NCC1”).
- Type:
str
- current_campaign
Current campaign identifier (e.g., “2023_A_1126”).
- Type:
str
- current_network_dir
Directory object for current network.
- Type:
- current_station_dir
Directory object for current station.
- Type:
- current_campaign_dir
Directory object for current campaign.
- Type:
- shotDataPreTDB
Preliminary shotdata (before position refinement).
- Type:
- kinPositionTDB
High-precision kinematic positions.
- Type:
- imuPositionTDB
IMU-derived positions (from Novatel 000).
- Type:
- shotDataFinalTDB
Final shotdata (after position refinement).
- Type:
- gnssObsTDBURI
Primary GNSS observation array (from Novatel 770).
- Type:
Path
- gnssObsTDB_secondaryURI
Secondary GNSS observation array (from Novatel 000).
- Type:
Path
- set_network_station_campaign(network_id, station_id, campaign_id)
Set the current processing context and initialize directories and TileDB arrays.
- _build_rinex_metadata()
Prepare metadata for RINEX file generation from GNSS observations.
- pre_process_novatel()
Preprocess Novatel 770 and 000 binary files into TileDB arrays.
- get_rinex_files()
Generate daily RINEX files from TileDB GNSS observations.
- process_rinex()
Process RINEX files using PRIDE-PPPAR to generate Kinematic files.
- process_kin()
Convert Kinematic files to structured dataframes and store in TileDB.
- process_dfop00()
Process Sonardyne DFOP00 files to generate preliminary shotdata.
- update_shotdata()
Refine shotdata by interpolating high-precision GNSS positions.
- process_svp()
Process CTD and Seabird files to generate sound velocity profiles.
- run_pipeline()
Execute the full processing pipeline in sequence.
- get_rinex_files() None
Generate and catalog daily RINEX files for the current campaign.
Steps:
1. Consolidates GNSS observation data
2. Determines processing year from config or campaign name
3. Invokes tile2rinex to generate daily RINEX files
4. Creates AssetEntry for each RINEX file
5. Updates asset catalog with merge job
- Raises:
ValueError – If a processing year cannot be determined from the campaign name.
Exception – If an error occurs during RINEX file generation.
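The year lookup in step 2 can be sketched as follows. This assumes campaign names begin with a four-digit year, as in the example "2023_A_1126"; the function names `year_from_campaign` and `resolve_year` are hypothetical and may not match how the library implements the check.

```python
import re
from typing import Optional


def year_from_campaign(campaign: str) -> int:
    """Hypothetical helper: read a leading four-digit year from a campaign name."""
    match = re.match(r"(\d{4})(?:_|$)", campaign)
    if match is None:
        raise ValueError(
            f"Cannot determine processing year from campaign name {campaign!r}"
        )
    return int(match.group(1))


def resolve_year(config_year: Optional[int], campaign: str) -> int:
    """Prefer an explicit configured year; otherwise fall back to the campaign name."""
    if config_year is not None:
        return config_year
    return year_from_campaign(campaign)


print(resolve_year(None, "2023_A_1126"))  # 2023
```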
- mid_process_workflow: bool = False
- pre_process_novatel() None
Preprocess Novatel 770 and 000 binary files for the current context.
Processing steps:
1. Novatel 770: Extracts GNSS observations to primary TileDB array
2. Novatel 000: Extracts GNSS observations to secondary array + IMU positions
Both steps check if processing is needed (via override config or merge status) and update the asset catalog upon completion.
- Raises:
Exception – If no Novatel 770 or 000 files are found.
- process_dfop00() None
Process Sonardyne DFOP00 files to generate preliminary shotdata.
Steps:
1. Retrieves DFOP00 files needing processing
2. Converts each file to shotdata dataframe (acoustic ping-reply sequences)
3. Writes dataframes to preliminary shotdata TileDB array
4. Marks files as processed in asset catalog
Uses multiprocessing for efficient parallel processing.
- process_kin() None
Process KIN files to generate kinematic position dataframes.
Steps:
1. Retrieves KIN files needing processing
2. Converts each KIN file to a structured dataframe
3. Writes dataframes to kinematic position TileDB array
4. Marks files as processed in asset catalog
- process_rinex() None
Run PRIDE-PPPAR on RINEX files to generate KIN and residual files.
Processing steps:
1. Retrieves RINEX files needing processing
2. Downloads GNSS product files (SP3, OBX, ATT) for each unique DOY
3. Runs PRIDE-PPPAR in parallel to convert RINEX to KIN format
4. Adds KIN and residual files to asset catalog
Uses multiprocessing for efficient parallel processing of multiple RINEX files.
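Because products are downloaded once per unique day of year (DOY), the set of days can be derived from the RINEX file dates before any download starts. The sketch below illustrates that deduplication with the standard library only; `unique_year_doy` is a hypothetical name, and how the pipeline obtains the date of each RINEX file is not shown here.

```python
from datetime import date
from typing import Iterable, List, Tuple


def unique_year_doy(dates: Iterable[date]) -> List[Tuple[int, int]]:
    """Hypothetical helper: collapse dates to sorted, unique (year, DOY) pairs."""
    return sorted({(d.year, d.timetuple().tm_yday) for d in dates})


# Two files from the same day need only one product download.
rinex_dates = [date(2023, 5, 1), date(2023, 5, 1), date(2023, 5, 2)]
print(unique_year_doy(rinex_dates))  # [(2023, 121), (2023, 122)]
```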
- process_svp(override: bool = False) None
Process CTD and Seabird files to generate sound velocity profiles (SVP).
Processing order:
1. Tries CTD files with CTD_to_svp_v2
2. If that fails, tries CTD_to_svp_v1
3. If still no success, tries Seabird files
The first successful SVP is saved to the campaign directory and processing stops.
- Parameters:
override (bool, optional) – If True, forces reprocessing even if SVP file exists. Default is False.
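The fallback order described above amounts to trying converters in sequence and stopping at the first usable result. The sketch below shows that pattern with stand-in converters; `first_successful_svp` and the three `fake_*` functions are hypothetical, and the real `CTD_to_svp_v2`, `CTD_to_svp_v1`, and Seabird readers may have different signatures and return types.

```python
from typing import Callable, List, Optional, Sequence, Tuple

Profile = List[Tuple[float, float]]  # (depth, sound speed) pairs; an assumed shape
Attempt = Tuple[Callable[[str], Profile], Sequence[str]]


def first_successful_svp(attempts: Sequence[Attempt]) -> Optional[Profile]:
    """Try each (converter, files) pair in order; return the first non-empty profile."""
    for converter, files in attempts:
        for path in files:
            try:
                profile = converter(path)
            except Exception:
                continue  # this converter failed on this file; keep trying
            if profile:
                return profile
    return None


# Stand-in converters, for illustration only.
def fake_v2(path: str) -> Profile:
    raise ValueError("unsupported CTD layout")


def fake_v1(path: str) -> Profile:
    return []  # parsed, but produced no usable rows


def fake_seabird(path: str) -> Profile:
    return [(0.0, 1500.0), (100.0, 1490.0)]


svp = first_successful_svp(
    [(fake_v2, ["cast.ctd"]), (fake_v1, ["cast.ctd"]), (fake_seabird, ["cast.cnv"])]
)
print(svp)  # [(0.0, 1500.0), (100.0, 1490.0)]
```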
- run_pipeline() None
Execute the complete SV3 data processing pipeline in sequence.
Pipeline steps (in order):
1. pre_process_novatel(): Process Novatel GNSS data
2. get_rinex_files(): Generate RINEX files
3. process_rinex(): Run PRIDE-PPPAR on RINEX
4. process_kin(): Convert KIN files to dataframes
5. process_dfop00(): Process acoustic data
6. update_shotdata(): Refine shotdata with high-precision positions
7. process_svp(): Generate sound velocity profile
Each step checks if processing is needed via config overrides or catalog status.
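The ordering and skip behavior can be illustrated with a simplified stand-in. In this sketch a plain set plays the role of the catalog and tracks completion per step, which is an assumption made for brevity; the actual pipeline tracks individual assets in its catalog, and `run_in_order` is a hypothetical function.

```python
from typing import Callable, Dict, FrozenSet, List, Set

# Step names in the documented execution order.
STEP_ORDER = [
    "pre_process_novatel",
    "get_rinex_files",
    "process_rinex",
    "process_kin",
    "process_dfop00",
    "update_shotdata",
    "process_svp",
]


def run_in_order(
    step_funcs: Dict[str, Callable[[], None]],
    done: Set[str],
    override: FrozenSet[str] = frozenset(),
) -> List[str]:
    """Run steps in order, skipping finished ones unless they are overridden."""
    executed = []
    for name in STEP_ORDER:
        if name in done and name not in override:
            continue
        step_funcs[name]()
        done.add(name)
        executed.append(name)
    return executed


# No-op stand-ins for the real methods.
steps = {name: (lambda: None) for name in STEP_ORDER}
already_done = {"pre_process_novatel", "get_rinex_files"}
ran = run_in_order(steps, already_done, override=frozenset({"get_rinex_files"}))
print(ran[0])  # get_rinex_files
```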
- set_network_station_campaign(network_id: str, station_id: str, campaign_id: str) None
Set the current network, station, and campaign context for pipeline processing.
This method establishes the processing context and performs several initialization tasks:
1. Resets previous context and clears TileDB arrays if context changes
2. Calls parent method to handle context switching
3. Validates data availability
4. Initializes TileDB arrays
5. Configures logging
6. Prepares RINEX metadata
- Parameters:
network_id (str) – Network identifier (e.g., “cascadia-gorda”).
station_id (str) – Station identifier (e.g., “NCC1”).
campaign_id (str) – Campaign identifier (e.g., “2023_A_1126”).
- update_shotdata()
Refine shotdata with interpolated high-precision kinematic positions.
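To make the interpolation step concrete, the sketch below linearly interpolates one position component to a ping time. Linear interpolation, the `interpolate_position` name, and the choice to return None outside the covered time span are all assumptions for illustration; the library's interpolation scheme and gap handling may differ.

```python
from bisect import bisect_right
from typing import Optional, Sequence


def interpolate_position(
    times: Sequence[float], values: Sequence[float], t: float
) -> Optional[float]:
    """Linearly interpolate values at time t; times must be strictly increasing."""
    if not times or t < times[0] or t > times[-1]:
        return None  # no extrapolation beyond the kinematic solution
    i = bisect_right(times, t)
    if i == len(times):
        return values[-1]  # t coincides with the last sample
    t0, t1 = times[i - 1], times[i]
    weight = (t - t0) / (t1 - t0)
    return values[i - 1] + weight * (values[i] - values[i - 1])


kin_times = [0.0, 1.0, 2.0]      # epochs of the kinematic positions (seconds)
kin_east = [10.0, 20.0, 40.0]    # one position component (metres)
print(interpolate_position(kin_times, kin_east, 1.5))  # 30.0
```

In practice the same weight would be applied to each coordinate component of the antenna position at a given ping time.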
- es_sfgtools.workflows.pipelines.sv3_pipeline.rinex_to_kin_wrapper(rinex_prideconfig_path: tuple[AssetEntry, Path], writedir: Path, pridedir: Path, site: str, pride_config: PrideCLIConfig) tuple[AssetEntry | None, AssetEntry | None]
Wrapper function to convert a RINEX file to KIN format using PRIDE configuration.
This function takes a tuple containing an AssetEntry for the RINEX file and the path to the PRIDE configuration file, along with directories for writing output and PRIDE processing, the site name, and a PRIDE CLI configuration object. It updates the PRIDE configuration with the provided config file path, then calls rinex_to_kin to perform the conversion. If successful, it returns AssetEntry objects for the generated KIN file and its residuals file; otherwise, it returns (None, None).
- Parameters:
rinex_prideconfig_path (tuple[AssetEntry, Path]) – Tuple containing the RINEX AssetEntry and PRIDE config file path.
writedir (Path) – Directory where output files should be written.
pridedir (Path) – Directory for PRIDE processing.
site (str) – Name of the site/station.
pride_config (PrideCLIConfig) – PRIDE CLI configuration object.
- Returns:
AssetEntry for the generated KIN file and AssetEntry for the residuals file, or (None, None) if conversion fails.
- Return type:
tuple[Optional[AssetEntry], Optional[AssetEntry]]
- Raises:
Exception – If an error occurs during AssetEntry creation for KIN or RES file.
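Returning (None, None) on a failed conversion lets a parallel map continue past a bad file, with the caller filtering the results afterwards. The sketch below shows that pattern using stand-ins: `convert_stub` and `safe_convert` are hypothetical, plain paths replace AssetEntry objects, and the built-in `map` stands in for a process pool's map. It does not cover the documented case where AssetEntry creation itself raises.

```python
from functools import partial
from pathlib import Path
from typing import Optional, Tuple


def convert_stub(rinex: Path, writedir: Path) -> Tuple[Path, Path]:
    """Stand-in conversion: fails when the input file has no extension."""
    if not rinex.suffix:
        raise RuntimeError(f"cannot convert {rinex}")
    return writedir / f"{rinex.stem}.kin", writedir / f"{rinex.stem}.res"


def safe_convert(rinex: Path, writedir: Path) -> Tuple[Optional[Path], Optional[Path]]:
    """Wrap the conversion so a failure yields (None, None) instead of raising."""
    try:
        return convert_stub(rinex, writedir)
    except Exception:
        return None, None


# Bind the shared argument once so the worker takes a single item per call,
# which is the shape a pool's map expects.
worker = partial(safe_convert, writedir=Path("out"))
results = list(map(worker, [Path("a.rnx"), Path("broken")]))

# Keep only the successful conversions.
kin_files = [kin for kin, _ in results if kin is not None]
print(len(kin_files))  # 1
```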