es_sfgtools.workflows.pipelines.shotdata_gnss_refinement module

es_sfgtools.workflows.pipelines.shotdata_gnss_refinement.analyze_offsets(merged_positions: DataFrame) → None

Analyzes the offsets between smoothed and original antenna positions.

Calculates the absolute differences for the X, Y, and Z coordinates between the columns ‘ant_x_smoothed’, ‘ant_y_smoothed’, ‘ant_z_smoothed’ and their respective original columns ‘ant_x’, ‘ant_y’, ‘ant_z’. Computes summary statistics (count, mean, std, min, 25%, 50%, 75%, max) for each offset and prints the results in a formatted table.

Parameters:

merged_positions (pd.DataFrame) – DataFrame containing the columns ‘ant_x’, ‘ant_y’, ‘ant_z’, ‘ant_x_smoothed’, ‘ant_y_smoothed’, and ‘ant_z_smoothed’.

Returns:

Prints the summary statistics to the console. If the DataFrame is empty, prints a message and returns.

Return type:

None
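
The offset statistics described above can be sketched with plain pandas. This is an illustration only, not the module's code: the helper name offset_summary and the demo values are hypothetical, and only the six column names come from the description.

```python
import pandas as pd


def offset_summary(merged_positions: pd.DataFrame) -> pd.DataFrame:
    """Absolute smoothed-vs-original offsets, summarised per axis (illustrative)."""
    offsets = pd.DataFrame({
        f"{axis}_offset": (
            merged_positions[f"ant_{axis}_smoothed"] - merged_positions[f"ant_{axis}"]
        ).abs()
        for axis in ("x", "y", "z")
    })
    # describe() yields count, mean, std, min, 25%, 50%, 75%, max
    return offsets.describe()


# Hypothetical demo values
demo = pd.DataFrame({
    "ant_x": [1.0, 2.0, 3.0],
    "ant_y": [0.0, 0.0, 0.0],
    "ant_z": [5.0, 5.0, 5.0],
    "ant_x_smoothed": [1.5, 1.5, 3.0],
    "ant_y_smoothed": [0.1, -0.1, 0.0],
    "ant_z_smoothed": [5.0, 5.0, 5.0],
})
summary = offset_summary(demo)
print(summary.to_string())
```

The real function prints its table and returns None; the sketch returns the table so it can be inspected.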

es_sfgtools.workflows.pipelines.shotdata_gnss_refinement.combine_data(imu_position_data: DataFrame, ppp_position_data: DataFrame) → DataFrame

Combines IMU position and PPP position data into a single DataFrame with a fixed column order.

Parameters:
  • imu_position_data (pd.DataFrame) – DataFrame containing IMU position data with columns matching the expected column order.

  • ppp_position_data (pd.DataFrame) – DataFrame containing PPP position data with columns matching the expected column order.

Returns:

Combined DataFrame containing both position and GPS data, ordered by time and columns. Note: Rows with NaN values are retained to preserve kinematic velocity information.

Return type:

pd.DataFrame
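
A minimal sketch of this kind of combination, assuming both inputs share the same columns. COLUMN_ORDER and combine_sketch are hypothetical; this reference does not list the module's actual column order.

```python
import numpy as np
import pandas as pd

# Hypothetical column order; the module's real order is not given in this reference.
COLUMN_ORDER = ["time", "ant_x", "ant_y", "ant_z", "east", "north", "up"]


def combine_sketch(imu_df: pd.DataFrame, ppp_df: pd.DataFrame) -> pd.DataFrame:
    """Stack two position tables, fix the column order, and sort by time."""
    combined = pd.concat(
        [imu_df[COLUMN_ORDER], ppp_df[COLUMN_ORDER]], ignore_index=True
    )
    # Rows containing NaN are deliberately kept (no dropna call).
    return combined.sort_values("time", kind="stable").reset_index(drop=True)


imu = pd.DataFrame({
    "time": [0.0, 2.0], "ant_x": [1.0, 3.0], "ant_y": [0.0, 0.0], "ant_z": [0.0, 0.0],
    "east": [0.1, 0.1], "north": [0.0, 0.0], "up": [0.0, 0.0],
})
# Columns arrive in a different order, and the first row has no velocity yet.
ppp = pd.DataFrame({
    "up": [np.nan, 0.0], "north": [np.nan, 0.0], "east": [np.nan, 0.2],
    "ant_z": [0.0, 0.0], "ant_y": [0.0, 0.0], "ant_x": [2.0, 4.0], "time": [1.0, 3.0],
})
combined = combine_sketch(imu, ppp)
print(combined)
```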

es_sfgtools.workflows.pipelines.shotdata_gnss_refinement.filter_spatial_outliers(df: DataFrame, radius: float = 5000) → DataFrame

Filters out rows that are outside a specified radius from the median ECEF position.

Parameters:
  • df (pd.DataFrame) – Input DataFrame containing ECEF position columns ‘ant_x’, ‘ant_y’, ‘ant_z’.

  • radius (float) – Radius in meters to define the acceptable range from the median position.

Returns:

Filtered DataFrame with rows outside the specified radius removed.

Return type:

pd.DataFrame
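
The radius test can be sketched as a Euclidean distance from the per-axis median. The helper name filter_by_radius is hypothetical, and whether the real function treats the boundary as inclusive is an assumption.

```python
import numpy as np
import pandas as pd


def filter_by_radius(df: pd.DataFrame, radius: float = 5000.0) -> pd.DataFrame:
    """Keep rows whose ECEF position lies within `radius` metres of the median."""
    xyz = df[["ant_x", "ant_y", "ant_z"]].to_numpy(dtype=float)
    median = np.median(xyz, axis=0)              # per-axis median position
    dist = np.linalg.norm(xyz - median, axis=1)  # 3-D distance to the median
    return df[dist <= radius]                    # inclusive bound is an assumption


# Hypothetical positions: four clustered points and one about 10 km away
demo = pd.DataFrame({
    "ant_x": -2.7e6 + np.array([0.0, 10.0, -10.0, 5.0, 10000.0]),
    "ant_y": np.full(5, -4.3e6),
    "ant_z": np.full(5, 3.8e6),
})
kept = filter_by_radius(demo, radius=5000.0)
print(f"{len(demo)} rows in, {len(kept)} rows kept")
```

Using the median rather than the mean keeps the reference point stable when a few samples are far off.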

es_sfgtools.workflows.pipelines.shotdata_gnss_refinement.interpolate_enu(tenu_l: ndarray, enu_l_sig: ndarray, tenu_r: ndarray, enu_r_sig: ndarray) → ndarray

Interpolate the ENU values between the left and right ENU values.

Parameters:
  • tenu_l (np.ndarray) – The left enu time values in unix epoch.

  • enu_l_sig (np.ndarray) – The standard deviation of the left enu values in ECEF coordinates.

  • tenu_r (np.ndarray) – The right enu time values in unix epoch.

  • enu_r_sig (np.ndarray) – The standard deviation of the right enu values in ECEF coordinates.

Returns:

The interpolated ENU values and their standard deviations, predicted at the time values from tenu_r.

Return type:

np.ndarray

es_sfgtools.workflows.pipelines.shotdata_gnss_refinement.interpolate_enu_kernelridge(kin_position_data: ndarray, shot_data: ndarray, lengthscale: float = 0.5) → ndarray

Interpolate the enu values using Kernel Ridge Regression.

Parameters:
  • kin_position_data (np.ndarray) – The kinematic position data.

  • shot_data (np.ndarray) – The shot data.

  • lengthscale (float, optional) – The length scale for the kernel, by default 0.5.

Returns:

The interpolated ENU values at the time values taken from shot_data.

Return type:

np.ndarray
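
For orientation, kernel ridge regression with an RBF kernel can be written in closed form with NumPy. This is a sketch of the technique, not the module's implementation: the function names, the separate time and value arrays, the alpha regularisation value, and the mean-centring step are all assumptions, and the layout of kin_position_data and shot_data is not specified in this reference.

```python
import numpy as np


def rbf_kernel(a: np.ndarray, b: np.ndarray, lengthscale: float) -> np.ndarray:
    """Gaussian (RBF) kernel matrix between two 1-D arrays of times."""
    d2 = (a[:, None] - b[None, :]) ** 2
    return np.exp(-d2 / (2.0 * lengthscale ** 2))


def kernel_ridge_predict(t_train, y_train, t_query, lengthscale=0.5, alpha=1e-3):
    """Fit y(t) by kernel ridge regression and evaluate it at t_query."""
    K = rbf_kernel(t_train, t_train, lengthscale)
    y_mean = y_train.mean(axis=0)  # centre the targets before fitting
    coef = np.linalg.solve(K + alpha * np.eye(len(t_train)), y_train - y_mean)
    return rbf_kernel(t_query, t_train, lengthscale) @ coef + y_mean


# Hypothetical kinematic track sampled at 4 Hz, evaluated at three shot times
t_kin = np.arange(0.0, 10.0, 0.25)
enu_kin = np.column_stack([np.sin(t_kin), np.cos(t_kin), 0.1 * t_kin])
t_shot = np.array([2.1, 5.3, 7.7])
enu_shot = kernel_ridge_predict(t_kin, enu_kin, t_shot, lengthscale=0.5)
expected = np.column_stack([np.sin(t_shot), np.cos(t_shot), 0.1 * t_shot])
print(np.abs(enu_shot - expected).max())
```

A larger lengthscale gives a smoother curve; a smaller one follows the samples more closely but extrapolates poorly across gaps.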

es_sfgtools.workflows.pipelines.shotdata_gnss_refinement.interpolate_enu_radius_regression(kin_position_df: DataFrame, shotdata_df: DataFrame, lengthscale: float = 0.1) → DataFrame

Interpolate the ENU values using Radius Neighbors Regression.

Parameters:
  • kin_position_df (pd.DataFrame) – The kinematic position data.

  • shotdata_df (pd.DataFrame) – The shot data.

  • lengthscale (float, optional) – The length scale for the kernel, by default 0.1.

Returns:

The updated shotdata DataFrame.

Return type:

pd.DataFrame
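
Radius neighbors regression predicts each query point as the average of all training samples within a fixed radius. The NumPy sketch below shows the idea with uniform weights. The function name and the array layout are hypothetical, and treating lengthscale as the neighbor radius is an assumption; scikit-learn's RadiusNeighborsRegressor implements the same technique.

```python
import numpy as np


def radius_neighbors_predict(t_train, y_train, t_query, radius=0.1):
    """Average the training values whose time lies within `radius` of each query."""
    out = np.full((len(t_query), y_train.shape[1]), np.nan)
    for i, t in enumerate(t_query):
        mask = np.abs(t_train - t) <= radius
        if mask.any():  # queries with no neighbours stay NaN
            out[i] = y_train[mask].mean(axis=0)
    return out


# Hypothetical samples: a dense cluster near t=0 and a lone sample at t=1
t_train = np.array([0.0, 0.05, 0.10, 0.15, 1.0])
y_train = np.array([[0.0] * 3, [1.0] * 3, [2.0] * 3, [3.0] * 3, [10.0] * 3])
t_query = np.array([0.05, 0.5, 1.0])
pred = radius_neighbors_predict(t_train, y_train, t_query, radius=0.06)
print(pred)
```

Unlike kernel ridge regression, this approach never extrapolates: a shot with no kinematic samples nearby gets no estimate.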

es_sfgtools.workflows.pipelines.shotdata_gnss_refinement.main(shotdata: DataFrame, kin_positions: DataFrame, positions_data: DataFrame, gnss_pos_psd: float | ndarray = 3.125e-05, vel_psd: float | ndarray = 0.0025, cov_err: float | ndarray = 0.25, start_dt: float | Timestamp = 0.05, filter_radius: float = 5000) → DataFrame

Refines shotdata using GNSS and IMU data through Kalman filtering and smoothing.

Parameters:
  • shotdata (pd.DataFrame) – DataFrame containing shot event data to be refined.

  • kin_positions (pd.DataFrame) – DataFrame containing kinematic GNSS positions.

  • positions_data (pd.DataFrame) – DataFrame containing original positions data.

  • gnss_pos_psd (float or array-like, optional) – GNSS position process noise spectral density (default: constants.GNSS_POS_PSD).

  • vel_psd (float or array-like, optional) – Velocity process noise spectral density (default: constants.VEL_PSD).

  • cov_err (float or array-like, optional) – Initial covariance error (default: constants.COV_ERR).

  • start_dt (float or pd.Timestamp, optional) – Start datetime for filtering (default: constants.START_DT).

  • filter_radius (float, optional) – Radius for spatial outlier filtering in meters (default: 5000).

Returns:

Updated shotdata DataFrame with refined positions and antenna offsets.

Return type:

pd.DataFrame

Notes

  • Combines positions and kinematic GNSS data, filters spatial outliers, and applies Kalman filter smoothing.

  • Merges smoothed results with original and kinematic positions for offset analysis.

  • Updates shotdata with refined positions and prints summary statistics of antenna offsets.

es_sfgtools.workflows.pipelines.shotdata_gnss_refinement.merge_shotdata_kinposition(shotdata_pre: TDBShotDataArray, shotdata: TDBShotDataArray, kin_position: TDBKinPositionArray, position_data: TDBIMUPositionArray, dates: List[datetime64], filter_radius: float = 5000) → TDBShotDataArray

Merge the shotdata and kin_position data.

Parameters:
  • shotdata_pre (TDBShotDataArray) – The DFOP00 data.

  • shotdata (TDBShotDataArray) – The shotdata array to write to.

  • kin_position (TDBKinPositionArray) – The TileDB KinPosition array.

  • position_data (TDBIMUPositionArray) – The TileDB IMU position array.

  • dates (List[datetime64]) – The dates to merge.

  • filter_radius (float, optional) – Radius for spatial outlier filtering in meters, by default 5000.

Returns:

The updated shotdata array.

Return type:

TDBShotDataArray

es_sfgtools.workflows.pipelines.shotdata_gnss_refinement.merge_shotdata_kinposition_radius_regression(shotdata_pre: TDBShotDataArray, shotdata: TDBShotDataArray, kin_position: TDBKinPositionArray, dates: List[datetime64], lengthscale: float = 0.1, plot: bool = False) → TDBShotDataArray

Merge the shotdata and kin_position data.

Parameters:
  • shotdata_pre (TDBShotDataArray) – The DFOP00 data.

  • shotdata (TDBShotDataArray) – The shotdata array to write to.

  • kin_position (TDBKinPositionArray) – The TileDB KinPosition array.

  • dates (List[datetime64]) – The dates to merge.

  • lengthscale (float, optional) – The length scale for the kernel, by default 0.1.

  • plot (bool, optional) – Plot the interpolated values, by default False.

Returns:

The updated shotdata array.

Return type:

TDBShotDataArray

es_sfgtools.workflows.pipelines.shotdata_gnss_refinement.prepare_kinematic_data(kin_positions: DataFrame) → DataFrame

Prepares kinematic GPS data for Kalman filtering.

This is done by computing velocities and filtering outliers.

This function takes a DataFrame containing kinematic GPS positions and processes it as follows:

  • Copies the input DataFrame to avoid modifying the original.

  • Renames position columns (‘east’, ‘north’, ‘up’) to antenna coordinates (‘ant_x’, ‘ant_y’, ‘ant_z’).

  • Initializes velocity columns (‘east’, ‘north’, ‘up’) with NaN values.

  • Adds uncertainty and correlation columns with default values.

  • Calculates velocity components by differentiating position over time.

  • Filters out rows with velocity spikes using a z-score threshold.

  • Prints the reduction in data size after filtering.

Parameters:

kin_positions (pd.DataFrame) – DataFrame containing kinematic GPS positions with columns ‘east’, ‘north’, ‘up’, and ‘time’.

Returns:

Processed DataFrame with velocity columns and outlier rows removed.

Return type:

pd.DataFrame
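
The velocity and spike-filter steps can be sketched as follows. This is an illustration under stated assumptions: the helper name add_velocity_and_filter, the z-score threshold of 3.0, and ‘time’ being float seconds are all hypothetical, and the uncertainty and correlation columns are omitted.

```python
import numpy as np
import pandas as pd


def add_velocity_and_filter(kin: pd.DataFrame, z_max: float = 3.0) -> pd.DataFrame:
    """Derive velocities from positions and drop rows with velocity spikes."""
    out = kin.copy()  # leave the caller's DataFrame untouched
    out = out.rename(columns={"east": "ant_x", "north": "ant_y", "up": "ant_z"})
    dt = out["time"].diff()  # seconds between consecutive samples (assumed unit)
    for pos, vel in (("ant_x", "east"), ("ant_y", "north"), ("ant_z", "up")):
        out[vel] = out[pos].diff() / dt  # first row stays NaN
    vel = out[["east", "north", "up"]]
    z = (vel - vel.mean()) / vel.std(ddof=0)  # per-column z-score
    spike = (z.abs() > z_max).any(axis=1)     # NaN z-scores compare False
    filtered = out[~spike]
    print(f"Kinematic rows: {len(out)} -> {len(filtered)}")
    return filtered


# Hypothetical 1 Hz track moving 0.5 m/s east with one 100 m position glitch
t = np.arange(50, dtype=float)
east = 0.5 * t
east[25] += 100.0
kin = pd.DataFrame({"time": t, "east": east, "north": 0.0, "up": 0.0})
filtered = add_velocity_and_filter(kin)
```

A single bad position produces two velocity spikes (into and out of the glitch), so both affected rows are dropped in this sketch.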

es_sfgtools.workflows.pipelines.shotdata_gnss_refinement.prepare_positions_data(positions_data: DataFrame) → DataFrame

Prepares IMU positions data for Kalman filtering.

This is done by converting geodetic coordinates to ECEF, computing median positions, and adding velocity and uncertainty columns.

Parameters:

positions_data (pandas.DataFrame) – DataFrame containing IMU position and velocity data with columns: ‘latitude’, ‘longitude’, ‘height’, ‘eastVelocity’, ‘northVelocity’, ‘upVelocity’, and their respective standard deviations.

Returns:

positions_data_copy – A copy of the input DataFrame with additional columns:

  • ‘ant_x’, ‘ant_y’, ‘ant_z’: ECEF coordinates

  • ‘east’, ‘north’, ‘up’: velocity components

  • ‘ant_sigx’, ‘ant_sigy’, ‘ant_sigz’: uncertainties in position

  • ‘rho_xy’, ‘rho_xz’, ‘rho_yz’: correlation coefficients (set to 0)

  • ‘east_sig’, ‘north_sig’, ‘up_sig’: uncertainties in velocity

  • ‘v_sden’, ‘v_sdeu’, ‘v_sdnu’: additional velocity uncertainty columns (set to 0)

Return type:

pandas.DataFrame

Notes

Also sets global variables MEDIAN_EAST_POSITION, MEDIAN_NORTH_POSITION, and MEDIAN_UP_POSITION to the median ECEF coordinates.
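
The geodetic-to-ECEF step can be illustrated with the standard WGS84 closed-form conversion. The function name and demo inputs are hypothetical, and the module may well delegate this to a library routine instead.

```python
import numpy as np

WGS84_A = 6378137.0              # semi-major axis, metres
WGS84_F = 1.0 / 298.257223563    # flattening


def geodetic_to_ecef(lat_deg, lon_deg, height_m):
    """Convert geodetic latitude/longitude (degrees) and height (m) to ECEF (m)."""
    lat = np.radians(lat_deg)
    lon = np.radians(lon_deg)
    e2 = WGS84_F * (2.0 - WGS84_F)                      # first eccentricity squared
    n = WGS84_A / np.sqrt(1.0 - e2 * np.sin(lat) ** 2)  # prime vertical radius
    x = (n + height_m) * np.cos(lat) * np.cos(lon)
    y = (n + height_m) * np.cos(lat) * np.sin(lon)
    z = (n * (1.0 - e2) + height_m) * np.sin(lat)
    return x, y, z


# Hypothetical inputs: equator, mid-latitude, and pole on the prime meridian
lat = np.array([0.0, 45.0, 90.0])
lon = np.zeros(3)
height = np.zeros(3)
x, y, z = geodetic_to_ecef(lat, lon, height)
median_xyz = np.median(np.column_stack([x, y, z]), axis=0)  # per-axis median
print(median_xyz)
```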

es_sfgtools.workflows.pipelines.shotdata_gnss_refinement.run_kalman_filter_and_smooth(df_all: DataFrame, start_dt: float, gnss_pos_psd: float, vel_psd: float, cov_err: float) → DataFrame

Runs a Kalman filter simulation on GNSS shot data and processes the results.

Parameters:
  • df_all (pd.DataFrame) – Input DataFrame containing GNSS shot data. Rows with NaN values are dropped before processing.

  • start_dt (float) – Initial time delta for the Kalman filter simulation.

  • gnss_pos_psd (float) – Position process spectral density for GNSS measurements.

  • vel_psd (float) – Velocity process spectral density for the filter.

  • cov_err (float) – Initial covariance error for the filter.

Returns:

DataFrame containing smoothed GNSS positions and associated covariance statistics. If the input DataFrame is empty after dropping NaNs, returns an empty DataFrame.

Return type:

pd.DataFrame
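
To make the filter-then-smooth idea concrete, here is a one-dimensional constant-velocity Kalman filter followed by a Rauch-Tung-Striebel backward pass. It is a textbook sketch, not the module's filter: the function name, the state layout [position, velocity], the white-noise-acceleration process model, and the way accel_psd and cov_err enter are assumptions. The real function works on 3-D GNSS and IMU data, and its exact use of gnss_pos_psd, vel_psd, and start_dt is not documented here.

```python
import numpy as np


def kalman_rts_1d(t, z, meas_var, accel_psd=0.0025, cov_err=0.25):
    """Constant-velocity Kalman filter plus RTS smoother for position fixes z(t)."""
    n = len(t)
    H = np.array([[1.0, 0.0]])  # only position is observed
    x_f = np.zeros((n, 2)); P_f = np.zeros((n, 2, 2))  # filtered estimates
    x_p = np.zeros((n, 2)); P_p = np.zeros((n, 2, 2))  # predicted estimates
    F_all = np.zeros((n, 2, 2))
    x = np.array([z[0], 0.0])
    P = np.eye(2) * cov_err     # initial covariance
    for k in range(n):
        dt = t[k] - t[k - 1] if k > 0 else 0.0
        F = np.array([[1.0, dt], [0.0, 1.0]])
        Q = accel_psd * np.array([[dt**3 / 3, dt**2 / 2], [dt**2 / 2, dt]])
        x = F @ x               # predict
        P = F @ P @ F.T + Q
        x_p[k], P_p[k], F_all[k] = x, P, F
        S = H @ P @ H.T + meas_var  # innovation variance, shape (1, 1)
        K = P @ H.T / S             # Kalman gain, shape (2, 1)
        x = x + (K * (z[k] - H @ x)).ravel()  # update
        P = (np.eye(2) - K @ H) @ P
        x_f[k], P_f[k] = x, P
    x_s, P_s = x_f.copy(), P_f.copy()
    for k in range(n - 2, -1, -1):  # backward RTS pass
        C = P_f[k] @ F_all[k + 1].T @ np.linalg.inv(P_p[k + 1])
        x_s[k] = x_f[k] + C @ (x_s[k + 1] - x_p[k + 1])
        P_s[k] = P_f[k] + C @ (P_s[k + 1] - P_p[k + 1]) @ C.T
    return x_s, P_s


# Hypothetical data: steady 0.5 m/s motion observed with 0.2 m noise at 2 Hz
rng = np.random.default_rng(0)
t = np.arange(0.0, 20.0, 0.5)
truth = 2.0 + 0.5 * t
meas_var = 0.04
z = truth + rng.normal(0.0, np.sqrt(meas_var), size=t.size)
x_s, P_s = kalman_rts_1d(t, z, meas_var)
rms_raw = np.sqrt(np.mean((z - truth) ** 2))
rms_smooth = np.sqrt(np.mean((x_s[:, 0] - truth) ** 2))
print(f"raw RMS {rms_raw:.3f} m, smoothed RMS {rms_smooth:.3f} m")
```

The backward pass lets each estimate use later measurements as well as earlier ones, which is why smoothed positions are generally less noisy than the forward-filtered ones.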

es_sfgtools.workflows.pipelines.shotdata_gnss_refinement.update_shotdata_with_smoothed_positions(shotdata: DataFrame, smoothed_results: DataFrame) → DataFrame

Interpolates smoothed positions onto shotdata ping and return times.

Parameters:
  • shotdata (pd.DataFrame) – The shotdata DataFrame.

  • smoothed_results (pd.DataFrame) – The smoothed results DataFrame.

Returns:

The updated shotdata DataFrame.

Return type:

pd.DataFrame
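
One way to picture this step is linear interpolation of each smoothed coordinate onto the two shot timestamps. The sketch below uses numpy.interp. The helper name, the column names ‘pingTime’, ‘returnTime’, ‘time’, and the ‘0’/‘1’ output suffixes are hypothetical, and the interpolation scheme actually used by the module is not stated in this reference.

```python
import numpy as np
import pandas as pd


def interpolate_onto_shots(shotdata: pd.DataFrame, smoothed: pd.DataFrame) -> pd.DataFrame:
    """Linearly interpolate smoothed positions at ping and return times."""
    out = shotdata.copy()
    t = smoothed["time"].to_numpy()  # must be increasing for np.interp
    for axis in ("x", "y", "z"):
        values = smoothed[f"ant_{axis}"].to_numpy()
        # Hypothetical naming: suffix 0 = at ping time, suffix 1 = at return time
        out[f"ant_{axis}0"] = np.interp(out["pingTime"].to_numpy(), t, values)
        out[f"ant_{axis}1"] = np.interp(out["returnTime"].to_numpy(), t, values)
    return out


# Hypothetical smoothed track and a single shot
smoothed = pd.DataFrame({
    "time": [0.0, 1.0, 2.0],
    "ant_x": [0.0, 10.0, 20.0],
    "ant_y": [0.0, 0.0, 0.0],
    "ant_z": [5.0, 5.0, 5.0],
})
shots = pd.DataFrame({"pingTime": [0.5], "returnTime": [1.5]})
updated = interpolate_onto_shots(shots, smoothed)
print(updated)
```

Note that numpy.interp clamps to the end values outside the sampled time range, so shots before the first or after the last smoothed sample would need separate handling.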