API Documentation

This module provides the tools to download OOI data from the OOI Raw Data Server.

Request Module

Tools for downloading OOI data

Hydrophone Request

This module handles the downloading of OOI data. Currently, the supported OOI sensors include all broadband hydrophones (Fs = 64 kHz), all low frequency hydrophones (Fs = 200 Hz), and bottom mounted OBSs. All supported hydrophone nodes are listed in the Hydrophone Nodes section below.

ooipy.request.hydrophone_request.get_acoustic_data(starttime, endtime, node, fmin=None, fmax=None, max_workers=-1, append=True, verbose=False, mseed_file_limit=None, large_gap_limit=1800.0, obspy_merge_method=0, gapless_merge=True, single_ms_buffer=False, jupyter_hub=False)

Get broadband acoustic data for specific time frame and sensor node. The data is returned as a HydrophoneData object. This object is based on the obspy data trace.

>>> import datetime
>>> from ooipy.request import hydrophone_request
>>> start_time = datetime.datetime(2017,3,10,0,0,0)
>>> end_time = datetime.datetime(2017,3,10,0,5,0)
>>> node = 'PC01A'
>>> data = hydrophone_request.get_acoustic_data(start_time, end_time, node)
>>> # To access stats for retrieved data:
>>> print(data.stats)
>>> # To access numpy array of data:
>>> print(data.data)
Parameters:
  • starttime (datetime.datetime) – time of the first noise sample

  • endtime (datetime.datetime) – time of the last noise sample

  • node (str) – hydrophone name or identifier

  • fmin (float, optional) – lower cutoff frequency of hydrophone’s bandpass filter. Default is None which results in no filtering.

  • fmax (float, optional) – upper cutoff frequency of hydrophone’s bandpass filter. Default is None which results in no filtering.

  • print_exceptions (bool, optional) – whether or not exceptions are printed in the terminal line

  • max_workers (int, optional) – maximum number of workers for concurrent processing. Default is -1 (uses number of available cores)

  • append (bool, optional) – specifies if extra mseed files should be appended at beginning and end in case of boundary gaps in data. Default is True

  • verbose (bool, optional) – specifies whether print statements should occur or not

  • mseed_file_limit (int, optional) – If the number of mseed traces to be merged exceeds this value, the function returns None. For some days the mseed files contain only a few seconds or milliseconds of data, and merging a huge number of files can dramatically slow down the program. If None (default), the number of mseed files will not be limited. This also limits the number of traces in a single file.

  • large_gap_limit (float, optional) – Defines the length in seconds of large gaps in the data. Sometimes, large data gaps are present on particular days. This can cause long interpolation times if data_gap_mode 0 or 2 are used, possibly resulting in a memory overflow. If a data gap is longer than large_gap_limit, data are only retrieved before the gap (if the gap stretches beyond the requested time) or after the gap (if the gap starts prior to the requested time), or not at all (if the gap is within the requested time).

  • obspy_merge_method (int, optional) – either 0 or 1. See the [obspy documentation](https://docs.obspy.org/packages/autogen/obspy.core.trace.Trace.html#handling-overlaps) for a description of the merge methods.

  • gapless_merge (bool, optional) – OOI BB hydrophones have had problems with data fragmentation, where individual files are only fractions of seconds long. Before June 2023, these were saved as separate mseed files. After 2023 (and in some cases, but not all, retroactively), 5 minute mseed files contain many fragmented traces. These traces are essentially not possible to merge with obspy.merge. If True, a method that merges traces without consideration of gaps will be attempted. This will only be done if there is full data coverage over the 5 minute file length, but could still result in unaligned data. Default value is True. You should probably not use this method for data before June 2023 because it will likely cause an error.

  • single_ms_buffer (bool) – If True, then 5 minute samples that have ± 1 ms of data will also be allowed when using gapless merge. There is an issue in the broadband hydrophone data where there is occasionally ± 1 ms of data for a 5 minute segment (64 samples). This is likely due to the GPS clock errors that cause the data fragmentation in the first place.

  • jupyter_hub (bool) – If True, ooipy uses the local directory structure of the OOI JupyterHub to access the data. This should be set to True if you are using the OOI JupyterHub. Default is False, which accesses data through the HTTP raw data server.

Return type:

HydrophoneData
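The large_gap_limit rule described in the parameter list can be sketched in a few lines of plain Python. This is only an illustration of the documented behaviour, not ooipy's implementation: the helper name window_after_large_gap and its gap_start and gap_end arguments are hypothetical, and treating a gap that covers the whole request as "no data" is an assumption.

```python
import datetime


def window_after_large_gap(starttime, endtime, gap_start, gap_end,
                           large_gap_limit=1800.0):
    """Return the (start, end) window that is still retrieved, or None.

    Hypothetical helper that mirrors the documented large_gap_limit rule.
    """
    gap_length = (gap_end - gap_start).total_seconds()
    outside = gap_end <= starttime or gap_start >= endtime
    if gap_length <= large_gap_limit or outside:
        # Small gap, or gap outside the request: keep the full window.
        return starttime, endtime
    if gap_start <= starttime and gap_end >= endtime:
        # Assumption: a gap covering the whole request leaves no data.
        return None
    if gap_start > starttime and gap_end >= endtime:
        # Gap stretches beyond the requested time: keep data before the gap.
        return starttime, gap_start
    if gap_start <= starttime and gap_end < endtime:
        # Gap starts prior to the requested time: keep data after the gap.
        return gap_end, endtime
    # Gap lies within the requested time: nothing is retrieved.
    return None


t0 = datetime.datetime(2017, 3, 10, 0, 0, 0)
t1 = datetime.datetime(2017, 3, 10, 2, 0, 0)
gap = (datetime.datetime(2017, 3, 10, 1, 0, 0),
       datetime.datetime(2017, 3, 10, 3, 0, 0))
print(window_after_large_gap(t0, t1, *gap))
```

In this example the two-hour gap exceeds the default limit of 1800 seconds and runs past the end of the request, so only the first hour would be kept.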

ooipy.request.hydrophone_request.get_acoustic_data_LF(starttime, endtime, node, fmin=None, fmax=None, verbose=False, zero_mean=False, channel='HDH', correct=False, merge_traces=False)

Get low frequency acoustic data for specific time frame and sensor node. The data is returned as a HydrophoneData object. This object is based on the obspy data trace. Example usage is shown below. This function does not include the full functionality provided by the IRIS data portal.

If there is no data for the specified time window, None is returned.

>>> import datetime
>>> from ooipy.request import hydrophone_request
>>> starttime = datetime.datetime(2017,3,10,7,0,0)
>>> endtime = datetime.datetime(2017,3,10,7,1,30)
>>> location = 'Axial_Base'
>>> fmin = None
>>> fmax = None
>>> # Returns ooipy.hydrophone.basic.HydrophoneData object
>>> data_trace = hydrophone_request.get_acoustic_data_LF(
...     starttime, endtime, location, fmin, fmax, zero_mean=True)
>>> # Access data stats
>>> data_trace.stats
>>> # Access numpy array containing data
>>> data_trace.data
Parameters:
  • starttime (datetime.datetime) – time of the first noise sample

  • endtime (datetime.datetime) – time of the last noise sample

  • node (str) – hydrophone

  • fmin (float, optional) – lower cutoff frequency of hydrophone’s bandpass filter. Default is None which results in no filtering.

  • fmax (float, optional) – upper cutoff frequency of hydrophone’s bandpass filter. Default is None which results in no filtering.

  • verbose (bool, optional) – specifies whether print statements should occur or not

  • zero_mean (bool, optional) – specifies whether the mean should be removed. Default is False

  • channel (str) – Channel of hydrophone to get data from. Currently supported options are ‘HDH’ - hydrophone, ‘HNE’ - east seismometer, ‘HNN’ - north seismometer, ‘HNZ’ - z seismometer. NOTE calibration is only valid for ‘HDH’ channel. All other channels are for raw data only at this time.

  • correct (bool) – whether or not to use IRIS calibration code. NOTE: when this is true, computing PSDs is currently broken as calibration is computed twice

  • merge_traces (bool) – if True, merges all traces between starttime and endtime that are returned from Earthscope

Returns:

hydrophone_data – Hydrophone data object. If there is no data in the time window, None is returned

Return type:

HydrophoneData

ooipy.request.hydrophone_request.ooipy_read(device, node, starttime, endtime, fmin=None, fmax=None, verbose=False, data_gap_mode=0, zero_mean=False)

This function is under development.

General Purpose OOIpy read function. Parses input parameters to appropriate, device specific, read function.

Parameters:
  • device (str) – Specifies device type. Valid options are ‘broadband_hydrophone’ and ‘low_frequency_hydrophone’

  • node (str) – Specifies data acquisition device location. TODO add available options

  • starttime (datetime.datetime) – Specifies start time of data requested

  • endtime (datetime.datetime) – Specifies end time of data requested

  • fmin (float) – Low frequency corner for filtering. If None is given, then no filtering happens. Broadband hydrophone data is filtered using Obspy. Low frequency hydrophone data uses IRIS filtering.

  • fmax (float) – High frequency corner for filtering.

  • verbose (bool) – Specifies whether or not to print status update statements.

  • data_gap_mode (int) – specifies how gaps in data are handled. See the documentation for get_acoustic_data

Returns:

hydrophone_data – Object that stores hydrophone data. Similar to obspy trace.

Return type:

HydrophoneData

CTD Request

Tools for downloading CTD Data. The first place you should look is the ooi data explorer, but if the data you need is not available, then these tools may be helpful.

ooipy.request.ctd_request.get_ctd_data(start_datetime, end_datetime, location, limit=10000, only_profilers=False, delivery_method='auto')

Requests CTD data between start_datetime and end_datetime for the specified location if data is available. For each location, data of all available CTDs are requested and concatenated in a list. That is, the final data list can consist of multiple segments of data, where each segment contains the data from one instrument. This means that the final list might not be ordered in time, but rather should be treated as an unordered list of CTD data points.

Parameters:
  • start_datetime (datetime.datetime) – time of first sample from CTD

  • end_datetime (datetime.datetime) – time of last sample from CTD

  • location (str) – location for which data are requested. Possible choices are:

    • ‘oregon_inshore’

    • ‘oregon_shelf’

    • ‘oregon_offshore’

    • ‘oregon_slope’

    • ‘washington_inshore’

    • ‘washington_shelf’

    • ‘washington_offshore’

    • ‘axial_base’

  • limit (int) – maximum number of data points returned in one request. The limit applies to each instrument separately. That is, the final list of data points can contain more samples than indicated by limit if data from multiple CTDs is available at the given location and time. Default is 10,000.

  • only_profilers (bool) – Specifies whether only data from the water column profilers should be requested. Default is False

  • delivery_method (str) – Specifies which delivery method is considered. For details please refer to http://oceanobservatories.org/glossary/. Options are:

    • ‘auto’ (default): automatically uses the method that has data available

    • ‘streamed’: only considers data that are streamed to shore via cable

    • ‘telemetered’: only considers data that are streamed to shore via satellite

    • ‘recovered’: only considers data that were retrieved when the instrument was recovered

Returns:

ctd_data – object, where the data array is stored in the raw_data attribute. Each data sample consists of a dictionary of parameters measured by the CTD.

Return type:

ooipy.ctd.basic.CtdProfile
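Because the returned samples are concatenated instrument by instrument, a caller who needs them in chronological order has to sort them first. The following is a minimal sketch of that step using plain dictionaries. The key names "time" and "temperature" and the sample values are made up for illustration, since the actual keys depend on what each CTD reports.

```python
def sort_samples_by_time(raw_data, time_key="time"):
    """Return a new list of CTD sample dictionaries ordered by timestamp.

    time_key is a hypothetical key name; check the keys of the actual samples.
    """
    return sorted(raw_data, key=lambda sample: sample[time_key])


# Made-up samples from two instruments, concatenated instrument by instrument.
raw_data = [
    {"time": 30.0, "temperature": 8.1},
    {"time": 90.0, "temperature": 8.0},
    {"time": 10.0, "temperature": 4.2},
    {"time": 70.0, "temperature": 4.3},
]
ordered = sort_samples_by_time(raw_data)
print([sample["time"] for sample in ordered])
```

The original list is left untouched, so the per-instrument grouping is still available if it is needed later.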

ooipy.request.ctd_request.get_ctd_data_daily(datetime_day, location, limit=10000, only_profilers=False, delivery_method='auto')

Requests CTD data for the specified day and location. The day is split into 24 1-hour periods, and for each 1-hour period ooipy.request.ctd_request.get_ctd_data is called. The data for all 1-hour periods are then concatenated and stored in an ooipy.ctd.basic.CtdProfile object.

Parameters:
  • datetime_day (datetime.datetime) – Day for which CTD data are requested

  • location (str) – See ooipy.request.ctd_request.get_ctd_data

  • limit (int) – See ooipy.request.ctd_request.get_ctd_data

  • only_profilers (bool) – See ooipy.request.ctd_request.get_ctd_data

  • delivery_method (str) – See ooipy.request.ctd_request.get_ctd_data

Returns:

ooipy.ctd.basic.CtdProfile object, where the data array is stored in the raw_data attribute. Each data sample consists of a dictionary of parameters measured by the CTD.
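The split of a day into 24 one-hour request windows can be sketched with the standard library alone. The helper name hourly_periods is hypothetical, and truncating datetime_day to midnight is an assumption about how the day boundary is chosen.

```python
import datetime


def hourly_periods(datetime_day):
    """Split the day containing datetime_day into 24 (start, end) windows."""
    # Assumption: the day starts at midnight of the given date.
    day_start = datetime.datetime(
        datetime_day.year, datetime_day.month, datetime_day.day)
    one_hour = datetime.timedelta(hours=1)
    return [(day_start + i * one_hour, day_start + (i + 1) * one_hour)
            for i in range(24)]


periods = hourly_periods(datetime.datetime(2017, 3, 10))
print(len(periods))
print(periods[0])
print(periods[-1])
```

Each window could then be passed as start_datetime and end_datetime to a separate get_ctd_data call, which keeps every individual request below the per-request limit more easily than a single request for the whole day.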

Hydrophone data object

The ooipy.get_acoustic_data and ooipy.get_acoustic_data_LF functions return an ooipy.HydrophoneData object.

The ooipy.HydrophoneData object inherits from obspy.Trace and adds methods for computing calibrated spectrograms and power spectral densities.

class ooipy.hydrophone.basic.HydrophoneData(data=array([], dtype=float64), header=None, node='')

Object that stores hydrophone data

type

Either ‘broadband’ or ‘low_frequency’. Specifies the type of hydrophone that the data is from.

Type:

str

compute_psd_welch(win='hann', L=4096, overlap=0.5, avg_method='median', interpolate=None, scale='log', verbose=True)

Compute power spectral density estimates of noise data using Welch’s method.

Parameters:
  • win (str, optional) – Window function used to taper the data. See scipy.signal.get_window for a list of possible window functions (Default is Hann-window.)

  • L (int, optional) – Length of each data block for computing the FFT (Default is 4096).

  • overlap (float, optional) – Percentage of overlap between adjacent blocks if Welch’s method is used. Parameter is ignored if avg_time is None. (Default is 50%)

  • avg_method (str, optional) – Method for averaging the periodograms when using Welch’s method. Either ‘mean’ or ‘median’ (default) can be used

  • interpolate (float, optional) – Resolution in frequency domain in Hz. If None (default), the resolution will be sampling frequency fs divided by L. If interpolate is smaller than fs/L, the PSD will be interpolated using zero-padding

  • scale (str, optional) – If ‘log’ (default), the PSD is returned in logarithmic scale (dB re 1 µPa^2/Hz). If ‘lin’, the PSD is returned in linear scale (µPa^2/Hz)

  • verbose (bool, optional) – If true (default), exception messages and some comments are printed.

Returns:

psd – An xarray.DataArray object that contains frequency bins and PSD values. If no noise data is available, None is returned.

Return type:

xr.DataArray
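The relationship between the block length L, the interpolate option, and the two output scales can be checked with a little arithmetic. The sketch below is not taken from ooipy: the function names are hypothetical, and the rule that zero-padding extends each block to ceil(fs / interpolate) points is an assumption about how a finer resolution would be reached.

```python
import math


def frequency_resolution(fs, L=4096, interpolate=None):
    """Return (resolution in Hz, FFT length) for blocks of L samples.

    Assumption: when interpolate is finer than fs / L, each block is
    zero-padded to ceil(fs / interpolate) points.
    """
    native = fs / L
    if interpolate is None or interpolate >= native:
        return native, L
    nfft = math.ceil(fs / interpolate)
    return fs / nfft, nfft


def lin_to_log(psd_lin):
    """Convert a linear PSD value (µPa^2/Hz) to dB re 1 µPa^2/Hz."""
    return 10.0 * math.log10(psd_lin)


# Broadband hydrophone (Fs = 64 kHz) with the default block length.
print(frequency_resolution(64000))
# Requesting 1 Hz resolution requires padding each block to 64000 points.
print(frequency_resolution(64000, interpolate=1.0))
print(lin_to_log(1.0e6))
```

With the defaults, a broadband recording therefore has a frequency resolution of 15.625 Hz, while the low frequency hydrophones (Fs = 200 Hz) already resolve well below 1 Hz at the same block length.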

compute_spectrogram(win='hann', L=4096, avg_time=None, overlap=0.5, verbose=True, average_type='median')

Compute spectrogram of acoustic signal. For each time step of the spectrogram either a modified periodogram (avg_time=None) or a power spectral density estimate using Welch’s method with median or mean averaging is computed.

Parameters:
  • win (str, optional) – Window function used to taper the data. See scipy.signal.get_window for a list of possible window functions (Default is Hann-window.)

  • L (int, optional) – Length of each data block for computing the FFT (Default is 4096).

  • avg_time (float, optional) – Time in seconds that is covered in one time step of the spectrogram. Default value is None and one time step covers L samples. If the signal covers a long time period it is recommended to use a higher value for avg_time to avoid memory overflows and to facilitate visualization.

  • overlap (float, optional) – Percentage of overlap between adjacent blocks if Welch’s method is used. Parameter is ignored if avg_time is None. (Default is 50%)

  • verbose (bool, optional) – If true (default), exception messages and some comments are printed.

  • average_type (str) – type of averaging if Welch PSD estimate is used. options are ‘median’ (default) and ‘mean’.

Returns:

spectrogram – An xarray.DataArray object that contains time and frequency bins as well as corresponding values. If no noise data is available, None is returned.

Return type:

xr.DataArray
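To see why a larger avg_time reduces memory use, it helps to count the time steps a spectrogram would contain. The sketch below is a rough estimate, not ooipy's code: the function name is hypothetical, and it assumes that time steps do not overlap each other and that trailing samples which do not fill a complete step or block are dropped.

```python
def spectrogram_shape(n_samples, fs, L=4096, avg_time=None, overlap=0.5):
    """Return (number of time steps, blocks averaged per time step)."""
    if avg_time is None:
        # One modified periodogram of L samples per time step.
        return n_samples // L, 1
    step_samples = int(avg_time * fs)
    if step_samples < L:
        raise ValueError("avg_time must cover at least L samples")
    # Standard Welch segmentation: blocks of L samples advancing by hop.
    hop = int(L * (1.0 - overlap))
    blocks_per_step = (step_samples - L) // hop + 1
    return n_samples // step_samples, blocks_per_step


five_minutes = 300 * 64000  # samples in 5 minutes at Fs = 64 kHz
print(spectrogram_shape(five_minutes, 64000))
print(spectrogram_shape(five_minutes, 64000, avg_time=10.0))
```

Under these assumptions a 5 minute broadband segment yields 4687 time steps with the defaults, but only 30 time steps (each averaging 311 blocks) with avg_time=10.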

frequency_calibration(N)

Apply a frequency dependent sensitivity correction to the acoustic data based on the information from the calibration sheets. Hydrophone deployments are found at https://github.com/OOI-CabledArray/deployments. Hydrophone calibration sheets are found at https://github.com/OOI-CabledArray/calibrationFiles.

Parameters:
  • N (int) – length of the data segment

Returns:

output_array – array with correction coefficient for every frequency

Return type:

np.array

get_asset_ID()

get_asset_ID returns the hydrophone asset ID for a given data sample. This data can be found here for broadband hydrophones. Because low frequency hydrophones remain constant with location and time, the {location}-{channel} string combination is returned for low frequency hydrophones.

save(file_format, filename, wav_kwargs={})

Save hydrophone data in the specified format. Supported formats are:

  • pickle - saves the HydrophoneData object as a pickle file

  • netCDF - saves the HydrophoneData object as netCDF. Time coordinates are not included

  • mat - saves the HydrophoneData object as a .mat file

  • wav - calls the wav_write method to save the HydrophoneData object as a .wav file

Parameters:
  • file_format (str) – format to save HydrophoneData object as. Supported formats are [‘pkl’, ‘nc’, ‘mat’, ‘wav’]

  • filename (str) – file path to save the HydrophoneData object to. The file extension should not be included

  • wav_kwargs (dict) – dictionary of keyword arguments to pass to wav_write method

Return type:

None

wav_write(filename, norm=False, new_sample_rate=None)

method that stores HydrophoneData into .wav file

Parameters:
  • filename (str) – filename to store .wav file as

  • norm (bool) – specifies whether data should be normalized to 1

  • new_sample_rate (float) – specifies new sample rate of wav file to be saved. (Resampling is done with scipy.signal.resample()). Default is None which keeps original sample rate of data.
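The effect of the norm option can be illustrated with the standard library wave module. This is a standalone sketch and not the wav_write implementation: the function write_wav is hypothetical, it writes 16-bit mono PCM, clips values outside [-1, 1], and does not resample.

```python
import math
import os
import struct
import tempfile
import wave


def write_wav(filename, samples, sample_rate, norm=False):
    """Write mono float samples to a 16-bit PCM .wav file."""
    if norm:
        # Scale so that the largest absolute sample value becomes 1.
        peak = max(abs(s) for s in samples)
        if peak > 0:
            samples = [s / peak for s in samples]
    # Clip to [-1, 1] and scale to the signed 16-bit range.
    ints = [int(max(-1.0, min(1.0, s)) * 32767) for s in samples]
    with wave.open(filename, "wb") as wav_file:
        wav_file.setnchannels(1)
        wav_file.setsampwidth(2)
        wav_file.setframerate(int(sample_rate))
        wav_file.writeframes(struct.pack("<%dh" % len(ints), *ints))


# One second of a quiet 440 Hz tone sampled at 8 kHz (made-up test signal).
tone = [0.25 * math.sin(2 * math.pi * 440 * n / 8000) for n in range(8000)]
path = os.path.join(tempfile.gettempdir(), "ooipy_sketch_tone.wav")
write_wav(path, tone, 8000, norm=True)
```

Without normalization the tone would use only a quarter of the available amplitude range. With norm=True its peak is scaled to full range, which is usually what is wanted for listening but discards the absolute level of the recording.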

ooipy.hydrophone.basic.node_id(node)

mapping of name of hydrophone node to ID

Parameters:
  • node (str) – name or ID of the hydrophone node

Returns:

ID of hydrophone node

Return type:

str

ooipy.hydrophone.basic.node_name(node)

mapping of ID of hydrophone node to name

Parameters:
  • node (str) – ID or name of the hydrophone node

Returns:

name of hydrophone node

Return type:

str

CTD data object

Module for CTD data objects

class ooipy.ctd.basic.CtdData(raw_data=None, extract_parameters=True)

Object that stores conductivity, temperature, depth (CTD) data, and provides functions for calculating sound speed, temperature, pressure, and salinity profiles. When a CtdData object is created and extract_parameters = True (default), then temperature, pressure, salinity, and time are automatically extracted from the raw data.

raw_data

list containing samples from CTD. Each sample is a dictionary containing all parameters measured by the CTD.

Type:

list of dict

temperature

array containing temperature samples in degrees Celsius.

Type:

numpy.ndarray

pressure

array containing pressure samples in dbar.

Type:

numpy.ndarray

salinity

array containing salinity samples in parts per thousand.

Type:

numpy.ndarray

depth

array containing depth samples in meters.

Type:

numpy.ndarray

density

array containing density samples in kg/cubic meter.

Type:

numpy.ndarray

conductivity

array containing conductivity samples in siemens/meter.

Type:

numpy.ndarray

sound_speed

array containing sound speed samples in meter/second.

Type:

numpy.ndarray

time

array containing time samples as datetime.datetime objects.

Type:

numpy.ndarray

sound_speed_profile

object for sound speed profile.

Type:

ooipy.ctd.basic.CtdProfile

temperature_profile

object for temperature profile.

Type:

ooipy.ctd.basic.CtdProfile

salinity_profile

object for salinity profile.

Type:

ooipy.ctd.basic.CtdProfile

pressure_profile

object for pressure profile.

Type:

ooipy.ctd.basic.CtdProfile

density_profile

object for density profile.

Type:

ooipy.ctd.basic.CtdProfile

conductivity_profile

object for conductivity profile.

Type:

ooipy.ctd.basic.CtdProfile

calc_depth_from_pressure()

Calculates depth from pressure array

calc_sound_speed()

Calculates sound speed from temperature, salinity and pressure array. The equation for calculating the sound speed is from: Chen, C. T., & Millero, F. J. (1977). Speed of sound in seawater at high pressures. Journal of the Acoustical Society of America, 62(5), 1129–1135. https://doi.org/10.1121/1.381646

get_parameter(parameter)

Extension of get_parameter_from_rawdata. In addition, sound speed and depth can be requested.

get_parameter_from_rawdata(parameter)

Extracts parameters from raw data dictionary.

get_profile(max_depth, parameter)

Compute the profile for sound speed, temperature, pressure, or salinity over the water column.

Parameters:
  • max_depth (int) – The profile will be computed from 0 meters to max_depth meters in 1-meter increments

  • parameter (str) – name of the parameter. Possible choices are:

    • ‘sound_speed’

    • ‘temperature’

    • ‘salinity’

    • ‘pressure’

    • ‘density’

    • ‘conductivity’
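The idea of a profile with one data point per 1-meter interval can be sketched as a simple binning step. This is an illustration only, not the get_profile implementation: the function depth_profile is hypothetical, the use of population variance and NaN for empty bins are assumptions, and the sample values are made up.

```python
import math
import statistics


def depth_profile(depths, values, max_depth):
    """Bin samples into 1-meter depth intervals from 0 to max_depth.

    Returns per-bin lists: parameter mean, parameter variance, depth mean,
    depth variance, and sample count. Empty bins hold NaN (an assumption).
    """
    bins = [[] for _ in range(max_depth)]
    for depth, value in zip(depths, values):
        index = int(math.floor(depth))
        if 0 <= index < max_depth:
            bins[index].append((depth, value))

    nan = float("nan")
    profile = {"parameter_mean": [], "parameter_var": [],
               "depth_mean": [], "depth_var": [], "n_samp": []}
    for samples in bins:
        d = [s[0] for s in samples]
        v = [s[1] for s in samples]
        profile["n_samp"].append(len(samples))
        profile["parameter_mean"].append(statistics.mean(v) if v else nan)
        profile["parameter_var"].append(statistics.pvariance(v) if v else nan)
        profile["depth_mean"].append(statistics.mean(d) if d else nan)
        profile["depth_var"].append(statistics.pvariance(d) if d else nan)
    return profile


# Made-up sound speed samples (m/s) at four depths (m).
profile = depth_profile([0.2, 0.8, 1.5, 3.1],
                        [1480.0, 1482.0, 1490.0, 1500.0], max_depth=4)
print(profile["n_samp"])
print(profile["parameter_mean"])
```

The five lists correspond to the parameter_mean, parameter_var, depth_mean, depth_var, and n_samp attributes that a CtdProfile object stores.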

ntp_seconds_to_datetime(ntp_seconds)

Converts timestamp into datetime object.
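A conversion of this kind can be sketched as follows, assuming that ntp_seconds counts seconds from the NTP epoch of 1900-01-01 00:00:00 UTC. The function below is a standalone illustration with a hypothetical name, not the method itself, and it returns a naive datetime that should be read as UTC.

```python
import datetime

# NTP timestamps count seconds from 1900-01-01 00:00:00 UTC.
NTP_EPOCH = datetime.datetime(1900, 1, 1)


def ntp_to_datetime(ntp_seconds):
    """Convert seconds since the NTP epoch to a naive UTC datetime."""
    return NTP_EPOCH + datetime.timedelta(seconds=ntp_seconds)


# 2208988800 s is the offset between the NTP epoch and the Unix epoch.
print(ntp_to_datetime(2208988800))  # 1970-01-01 00:00:00
```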

class ooipy.ctd.basic.CtdProfile(parameter_mean, parameter_var, depth_mean, depth_var, n_samp)

Simple object that stores a parameter profile over the water column. For each 1-meter interval, there is one data point in the profile.

parameter_mean

mean of parameter within each 1-meter depth interval

Type:

array of float

parameter_var

variance of parameter within each 1-meter depth interval

Type:

array of float

depth_mean

mean of depth within each 1-meter depth interval

Type:

array of float

depth_var

variance of depth within each 1-meter depth interval

Type:

array of float

n_samp

number of samples within each 1-meter depth interval

Type:

array of int

convert_to_ssp()

converts to numpy array with correct format for arlpy simulation

Returns:

ssp – 2D numpy array containing the sound speed profile. Column 0 is depth, column 1 is sound speed (in m/s)

Return type:

numpy array

plot(**kwargs)

Redirects to ooipy.ooiplotlib.plot_ctd_profile(). Please see ooipy.hydrophone.basic.plot_psd().