API Documentation¶
This module provides the tools to download OOI data from the OOI Raw Data Server.
Request Module¶
Tools for downloading OOI data
Hydrophone Request¶
Oregon Shelf Base Seafloor (Fs = 64 kHz): ‘LJ01D’
Oregon Slope Base Seafloor (Fs = 64 kHz): ‘LJ01A’
Slope Base Shallow (Fs = 64 kHz): ‘PC01A’
Axial Base Shallow Profiler (Fs = 64 kHz): ‘PC03A’
Offshore Base Seafloor (Fs = 64 kHz): ‘LJ01C’
Axial Base Seafloor (Fs = 64 kHz): ‘LJ03A’
Axial Base Seafloor (Fs = 200 Hz): ‘Axial_Base’, ‘AXABA1’
-: ‘Central_Caldera’, ‘AXCC1’
-: ‘Eastern_Caldera’, ‘AXEC2’
Southern Hydrate (Fs = 200 Hz): ‘Southern_Hydrate’, ‘HYS14’
Oregon Slope Base Seafloor (Fs = 200 Hz): ‘Slope_Base’, ‘HYSB1’
This module handles the downloading of OOI data. Currently, the supported OOI sensors include all broadband hydrophones (Fs = 64 kHz), all low frequency hydrophones (Fs = 200 Hz), and bottom mounted OBSs. All supported hydrophone nodes are listed in the Hydrophone Nodes section.
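The node identifiers listed in this section can be collected in a small lookup for validating a node name before a request is made. This is only a sketch: the dictionaries and the helper `sampling_rate_hz` are illustrative names that are not part of ooipy, it assumes that both identifiers given for a low frequency hydrophone refer to the same instrument, and it only includes entries whose name and sampling rate are stated in the list.

```python
# Illustrative lookup built from the node list in this documentation.
# BROADBAND_NODES, LOW_FREQUENCY_NODES and sampling_rate_hz are hypothetical
# helpers, not part of the ooipy API.
BROADBAND_NODES = {
    "LJ01D": "Oregon Shelf Base Seafloor",
    "LJ01A": "Oregon Slope Base Seafloor",
    "PC01A": "Slope Base Shallow",
    "PC03A": "Axial Base Shallow Profiler",
    "LJ01C": "Offshore Base Seafloor",
    "LJ03A": "Axial Base Seafloor",
}

# Assumption: both identifiers of a row name the same hydrophone.
LOW_FREQUENCY_NODES = {
    "Axial_Base": "Axial Base Seafloor",
    "AXABA1": "Axial Base Seafloor",
    "Southern_Hydrate": "Southern Hydrate",
    "HYS14": "Southern Hydrate",
    "Slope_Base": "Oregon Slope Base Seafloor",
    "HYSB1": "Oregon Slope Base Seafloor",
}


def sampling_rate_hz(node):
    """Return the sampling rate in Hz stated in the node list."""
    if node in BROADBAND_NODES:
        return 64_000
    if node in LOW_FREQUENCY_NODES:
        return 200
    raise ValueError(f"unknown hydrophone node: {node!r}")
```

A check like this makes it easy to decide up front whether the broadband or the low frequency request function applies to a given node.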
- ooipy.request.hydrophone_request.get_acoustic_data(starttime, endtime, node, fmin=None, fmax=None, max_workers=-1, append=True, verbose=False, mseed_file_limit=None, large_gap_limit=1800.0, obspy_merge_method=0, gapless_merge=False)¶
Get broadband acoustic data for a specific time frame and sensor node. The data is returned as a HydrophoneData object. This object is based on the obspy data trace.
>>> import datetime
>>> import ooipy
>>> start_time = datetime.datetime(2017,3,10,0,0,0)
>>> end_time = datetime.datetime(2017,3,10,0,5,0)
>>> node = 'PC01A'
>>> data = ooipy.request.get_acoustic_data(start_time, end_time, node)
>>> # To access stats for retrieved data:
>>> print(data.stats)
>>> # To access numpy array of data:
>>> print(data.data)
- Parameters:
starttime (datetime.datetime) – time of the first noise sample
endtime (datetime.datetime) – time of the last noise sample
node (str) – hydrophone name or identifier
fmin (float, optional) – lower cutoff frequency of hydrophone’s bandpass filter. Default is None which results in no filtering.
fmax (float, optional) – upper cutoff frequency of hydrophone’s bandpass filter. Default is None which results in no filtering.
print_exceptions (bool, optional) – whether or not exceptions are printed in the terminal line
max_workers (int, optional) – number of maximum workers for concurrent processing. Default is -1 (uses number of available cores)
append (bool, optional) – specifies if extra mseed files should be appended at beginning and end in case of boundary gaps in data. Default is True
verbose (bool, optional) – specifies whether print statements should occur or not
data_gap_mode (int, optional) – How gaps in the raw data will be handled. Options are: ‘0’: gaps will be linearly interpolated; ‘1’: no interpolation, mask array is returned; ‘2’: subtract mean of data and fill gap with zeros, mask array is returned.
mseed_file_limit (int, optional) – If the number of mseed traces to be merged exceeds this value, the function returns None. For some days the mseed files contain only a few seconds or milliseconds of data, and merging a huge number of files can dramatically slow down the program. If None (default), the number of mseed files will not be limited. This also limits the number of traces in a single file.
large_gap_limit (float, optional) – Defines the length in seconds of large gaps in the data. Sometimes, large data gaps are present on particular days. This can cause long interpolation times if data_gap_mode 0 or 2 is used, possibly resulting in a memory overflow. If a data gap is longer than large_gap_limit, data are only retrieved before the gap (if the gap stretches beyond the requested time) or after the gap (if the gap starts prior to the requested time), or not at all (if the gap is within the requested time).
obspy_merge_method (int, optional) – either 0 or 1; see the [obspy documentation](https://docs.obspy.org/packages/autogen/obspy.core.trace.Trace.html#handling-overlaps) for a description of the merge methods.
gapless_merge (bool, optional) – OOI BB hydrophones have had problems with data fragmentation, where individual files are only fractions of seconds long. Before June 2023, these were saved as separate mseed files. After 2023 (and in some cases, but not all, retroactively), 5-minute mseed files contain many fragmented traces. These traces are essentially not possible to merge with obspy.merge. If True, an experimental method to merge traces without consideration of gaps will be attempted. This will only be done if there is full data coverage over the 5-minute file length, but could still result in unaligned data. This is an experimental feature and should be used with caution.
- Return type:
HydrophoneData
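The three cases described for large_gap_limit can be written out as a small decision function. This is a sketch of the rule as it is worded in the parameter description, not the code that get_acoustic_data runs; the function name `window_after_large_gap` is hypothetical, and it assumes a single gap that overlaps the requested window.

```python
from datetime import datetime


def window_after_large_gap(start, end, gap_start, gap_end, large_gap_limit=1800.0):
    """Return the (start, end) interval that is kept, or None if nothing is kept.

    Hypothetical helper that mirrors the documented large_gap_limit rule.
    """
    gap_seconds = (gap_end - gap_start).total_seconds()
    if gap_seconds <= large_gap_limit:
        # Not a "large" gap: the full window is kept and the gap is
        # handled by the normal gap handling.
        return (start, end)

    starts_before = gap_start <= start  # gap begins prior to the requested time
    ends_after = gap_end >= end         # gap stretches beyond the requested time

    if starts_before and ends_after:
        return None                     # gap covers the whole request
    if ends_after:
        return (start, gap_start)       # keep only data before the gap
    if starts_before:
        return (gap_end, end)           # keep only data after the gap
    return None                         # gap lies within the requested time
```

For example, a request for 00:00 to 02:00 with a gap from 01:00 to 03:00 keeps only 00:00 to 01:00 under this reading.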
- ooipy.request.hydrophone_request.get_acoustic_data_LF(starttime, endtime, node, fmin=None, fmax=None, verbose=False, zero_mean=False, channel='HDH', correct=False)¶
Get low frequency acoustic data for a specific time frame and sensor node. The data is returned as a HydrophoneData object. This object is based on the obspy data trace. Example usage is shown below. This function does not include the full functionality provided by the IRIS data portal. If there is no data for the specified time window, then None is returned.
>>> import datetime
>>> from ooipy.request import hydrophone_request
>>> starttime = datetime.datetime(2017,3,10,7,0,0)
>>> endtime = datetime.datetime(2017,3,10,7,1,30)
>>> location = 'Axial_Base'
>>> fmin = None
>>> fmax = None
>>> # Returns an ooipy.hydrophone.basic.HydrophoneData object
>>> data_trace = hydrophone_request.get_acoustic_data_LF(
...     starttime, endtime, location, fmin, fmax, zero_mean=True)
>>> # Access data stats
>>> data_trace.stats
>>> # Access numpy array containing data
>>> data_trace.data
- Parameters:
starttime (datetime.datetime) – time of the first noise sample
endtime (datetime.datetime) – time of the last noise sample
node (str) – hydrophone
fmin (float, optional) – lower cutoff frequency of hydrophone’s bandpass filter. Default is None which results in no filtering.
fmax (float, optional) – upper cutoff frequency of hydrophone’s bandpass filter. Default is None which results in no filtering.
verbose (bool, optional) – specifies whether print statements should occur or not
zero_mean (bool, optional) – specifies whether the mean should be removed. Default is False
channel (str) – Channel of hydrophone to get data from. Currently supported options are ‘HDH’ - hydrophone, ‘HNE’ - east seismometer, ‘HNN’ - north seismometer, ‘HNZ’ - z seismometer. NOTE calibration is only valid for ‘HDH’ channel. All other channels are for raw data only at this time.
correct (bool) – whether or not to use IRIS calibration code. NOTE: when this is true, computing PSDs is currently broken as calibration is computed twice
- Returns:
hydrophone_data – Hydrophone data object. If there is no data in the time window, None is returned
- Return type:
HydrophoneData
- ooipy.request.hydrophone_request.ooipy_read(device, node, starttime, endtime, fmin=None, fmax=None, verbose=False, data_gap_mode=0, zero_mean=False)¶
This function is under development.
General Purpose OOIpy read function. Parses input parameters to appropriate, device specific, read function.
- Parameters:
device (str) – Specifies device type. Valid options are ‘broadband_hydrohpone’ and ‘low_frequency_hydrophone’
node (str) – Specifies data acquisition device location. TODO add available options
starttime (datetime.datetime) – Specifies start time of data requested
endtime (datetime.datetime) – Specifies end time of data requested
fmin (float) – Low frequency corner for filtering. If None is given, then no filtering happens. Broadband hydrophone data is filtered using Obspy. Low frequency hydrophone data uses IRIS filtering.
fmax (float) – High frequency corner for filtering.
verbose (bool) – Specifies whether or not to print status update statements.
data_gap_mode (int) – specifies how gaps in data are handled; see the documentation for get_acoustic_data
- Returns:
hydrophone_data – Object that stores hydrophone data. Similar to obspy trace.
- Return type:
CTD Request¶
Tools for downloading CTD data. The first place you should look is the OOI Data Explorer, but if the data you need is not available there, then these tools may be helpful.
- ooipy.request.ctd_request.get_ctd_data(start_datetime, end_datetime, location, limit=10000, only_profilers=False, delivery_method='auto')¶
Requests CTD data between start_datetime and end_datetime for the specified location if data is available. For each location, data of all available CTDs are requested and concatenated in a list. That is, the final data list can consist of multiple segments of data, where each segment contains the data from one instrument. This means that the final list might not be ordered in time, but rather should be treated as an unordered list of CTD data points.
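Because the returned list is a concatenation of per-instrument segments, a caller who needs chronological order has to sort the samples. The sketch below shows one way to do that on plain dictionaries. The helper name `merge_and_sort` and the `"time"` key are assumptions made for illustration; the actual field names in the CTD samples are not stated in this documentation.

```python
def merge_and_sort(segments, time_key="time"):
    """Flatten per-instrument segments and sort the samples by time.

    segments: list of lists of dict samples (one inner list per instrument).
    time_key: hypothetical name of the timestamp field in each sample.
    """
    samples = [sample for segment in segments for sample in segment]
    return sorted(samples, key=lambda sample: sample[time_key])


# Toy data: two instruments whose samples interleave in time.
instrument_a = [{"time": 10.0, "temperature": 8.1}, {"time": 30.0, "temperature": 8.0}]
instrument_b = [{"time": 20.0, "temperature": 4.2}]
ordered = merge_and_sort([instrument_a, instrument_b])
```

Sorting by time mixes samples from different instruments (and depths), so whether this ordering is meaningful depends on the analysis.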
- Parameters:
start_datetime (datetime.datetime) – time of first sample from CTD
end_datetime (datetime.datetime) – time of last sample from CTD
location (str) – location for which data are requested. Possible choices are: * ‘oregon_inshore’ * ‘oregon_shelf’ * ‘oregon_offshore’ * ‘oregon_slope’ * ‘washington_inshore’ * ‘washington_shelf’ * ‘washington_offshore’ * ‘axial_base’
limit (int) – maximum number of data points returned in one request. The limit applies to each instrument separately. That is, the final list of data points can contain more samples than indicated by limit if data from multiple CTDs is available at the given location and time. Default is 10,000.
only_profilers (bool) – Specifies whether only data from the water column profilers should be requested. Default is False
delivery_method (str) – Specifies which delivery method is considered. For details please refer to http://oceanobservatories.org/glossary/. Options are:
- ‘auto’ (default): automatically uses the method that has data available
- ’streamed’: only considers data that are streamed to shore via cable
- ’telemetered’: only considers data that are streamed to shore via satellite
- ’recovered’: only considers data that were retrieved when the instrument was retrieved
- Returns:
ctd_data – object, where the data array is stored in the raw_data attribute. Each data sample consists of a dictionary of parameters measured by the CTD.
- Return type:
- ooipy.request.ctd_request.get_ctd_data_daily(datetime_day, location, limit=10000, only_profilers=False, delivery_method='auto')¶
Requests CTD data for the specified day and location. The day is split into 24 1-hour periods and for each 1-hour period ooipy.ctd_request.get_ctd_data is called. The data for all 1-hour periods are then concatenated and stored in an ooipy.ctd.basic.CtdProfile object.
- Parameters:
datetime_day (datetime.datetime) – Day for which CTD data are requested
only_profilers (bool) – See ooipy.ctd_request.get_ctd_data
delivery_method (str) – See ooipy.ctd_request.get_ctd_data
- Returns:
ooipy.ctd.basic.CtdProfile object, where the data array is stored in the raw_data attribute. Each data sample consists of a dictionary of parameters measured by the CTD.
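Splitting a day into 24 one-hour request windows can be sketched with the standard library alone. The helper name `hourly_windows` is hypothetical and this is not the library's internal code; it only illustrates the windows that would each be passed to a request such as get_ctd_data.

```python
from datetime import datetime, timedelta


def hourly_windows(datetime_day):
    """Return 24 (start, end) pairs covering the calendar day of datetime_day."""
    # Drop any time-of-day component so the windows start at midnight.
    midnight = datetime(datetime_day.year, datetime_day.month, datetime_day.day)
    return [
        (midnight + timedelta(hours=hour), midnight + timedelta(hours=hour + 1))
        for hour in range(24)
    ]


windows = hourly_windows(datetime(2017, 3, 10, 13, 45))
```

Each window ends exactly where the next one begins, so consecutive requests cover the day without gaps.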
Hydrophone data object¶
The ooipy.get_acoustic_data and ooipy.get_acoustic_data_LF functions return the ooipy.HydrophoneData object.
The ooipy.HydrophoneData object inherits from obspy.Trace, and methods for
computing calibrated spectrograms and power spectral densities are added.
- class ooipy.hydrophone.basic.HydrophoneData(data=array([], dtype=float64), header=None, node='')¶
Object that stores hydrophone data
- type¶
Either ‘broadband’ or ‘low_frequency’; specifies the type of hydrophone that the data is from.
- Type:
- compute_psd_welch(win='hann', L=4096, overlap=0.5, avg_method='median', interpolate=None, scale='log', verbose=True)¶
Compute power spectral density estimates of noise data using Welch’s method.
- Parameters:
win (str, optional) – Window function used to taper the data. See scipy.signal.get_window for a list of possible window functions (Default is Hann-window.)
L (int, optional) – Length of each data block for computing the FFT (Default is 4096).
overlap (float, optional) – Percentage of overlap between adjacent blocks if Welch’s method is used. Parameter is ignored if avg_time is None. (Default is 50%)
avg_method (str, optional) – Method for averaging the periodograms when using Welch’s method. Either ‘mean’ or ‘median’ (default) can be used
interpolate (float, optional) – Resolution in frequency domain in Hz. If None (default), the resolution will be sampling frequency fs divided by L. If interpolate is smaller than fs/L, the PSD will be interpolated using zero-padding
scale (str, optional) – If ‘log’ (default), the PSD is returned in logarithmic scale (dB re 1µPa^2/Hz). If ‘lin’, the PSD is returned in linear scale (1µPa^2/Hz).
verbose (bool, optional) – If true (default), exception messages and some comments are printed.
- Returns:
psd – An xarray.DataArray object that contains frequency bins and PSD values. If no noise data is available, None is returned.
- Return type:
xr.DataArray
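The relationship between fs, L, and interpolate can be worked through numerically. The sketch below assumes that a finer resolution is reached by zero-padding each block to the smallest FFT length that meets the requested resolution; that detail, and the helper name `psd_frequency_resolution`, are assumptions and not taken from the ooipy source.

```python
import math


def psd_frequency_resolution(fs, L=4096, interpolate=None):
    """Return (frequency resolution in Hz, FFT length) for a block length L.

    Hypothetical helper: zero-padding is only applied when the requested
    resolution is finer than the native resolution fs / L.
    """
    native = fs / L
    if interpolate is None or interpolate >= native:
        return native, L
    nfft = math.ceil(fs / interpolate)  # padded FFT length
    return fs / nfft, nfft


# Broadband hydrophone (Fs = 64 kHz) with the default block length.
native_resolution, native_nfft = psd_frequency_resolution(64_000)
# Requesting 1 Hz bins requires padding each block to 64000 points.
fine_resolution, fine_nfft = psd_frequency_resolution(64_000, interpolate=1.0)
```

With the defaults, a 64 kHz signal yields bins 15.625 Hz wide. Zero-padding interpolates the spectrum onto a finer grid but does not add spectral detail beyond what a block of L samples can resolve.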
- compute_spectrogram(win='hann', L=4096, avg_time=None, overlap=0.5, verbose=True, average_type='median')¶
Compute spectrogram of acoustic signal. For each time step of the spectrogram either a modified periodogram (avg_time=None) or a power spectral density estimate using Welch’s method with median or mean averaging is computed.
- Parameters:
win (str, optional) – Window function used to taper the data. See scipy.signal.get_window for a list of possible window functions (Default is Hann-window.)
L (int, optional) – Length of each data block for computing the FFT (Default is 4096).
avg_time (float, optional) – Time in seconds that is covered in one time step of the spectrogram. Default value is None and one time step covers L samples. If the signal covers a long time period it is recommended to use a higher value for avg_time to avoid memory overflows and to facilitate visualization.
overlap (float, optional) – Percentage of overlap between adjacent blocks if Welch’s method is used. Parameter is ignored if avg_time is None. (Default is 50%)
verbose (bool, optional) – If true (default), exception messages and some comments are printed.
average_type (str) – type of averaging if Welch PSD estimate is used. options are ‘median’ (default) and ‘mean’.
- Returns:
spectrogram – An xarray.DataArray object that contains time and frequency bins as well as corresponding values. If no noise data is available, None is returned.
- Return type:
xr.DataArray
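The effect of avg_time on the size of a spectrogram can be estimated before computing it. The following sketch assumes non-overlapping time steps and the usual Welch segment count within each step; both are assumptions about the layout rather than statements about the ooipy implementation, and `spectrogram_layout` is a hypothetical helper.

```python
def spectrogram_layout(n_samples, fs, L=4096, avg_time=None, overlap=0.5):
    """Return (number of time steps, FFT blocks averaged per time step)."""
    if avg_time is None:
        # One modified periodogram per block of L samples.
        return n_samples // L, 1

    samples_per_step = int(avg_time * fs)
    if samples_per_step < L:
        raise ValueError("avg_time must cover at least L samples")

    hop = int(L * (1 - overlap))  # shift between adjacent blocks
    blocks_per_step = 1 + (samples_per_step - L) // hop
    return n_samples // samples_per_step, blocks_per_step


# Five minutes of broadband data (Fs = 64 kHz).
n_samples = 300 * 64_000
steps_default = spectrogram_layout(n_samples, 64_000)
steps_one_second = spectrogram_layout(n_samples, 64_000, avg_time=1.0)
```

Under these assumptions, five minutes of 64 kHz data produce several thousand time steps without averaging but only 300 steps with avg_time=1.0, which illustrates why a larger avg_time is recommended for long signals.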
- compute_spectrogram_mp(n_process=None, win='hann', L=4096, avg_time=None, overlap=0.5, verbose=True, average_type='median')¶
Same as function compute_spectrogram but using multiprocessing. This function is intended to be used when analyzing large data sets.
- Parameters:
n_process (int, optional) – Number of processes in the pool. None (default) means that n_process is equal to the number of CPU cores.
win (str, optional) – Window function used to taper the data. See scipy.signal.get_window for a list of possible window functions (Default is Hann-window.)
L (int, optional) – Length of each data block for computing the FFT (Default is 4096).
avg_time (float, optional) – Time in seconds that is covered in one time step of the spectrogram. Default value is None and one time step covers L samples. If the signal covers a long time period it is recommended to use a higher value for avg_time to avoid memory overflows and to facilitate visualization.
overlap (float, optional) – Percentage of overlap between adjacent blocks if Welch’s method is used. Parameter is ignored if avg_time is None. (Default is 50%)
verbose (bool, optional) – If true (default), exception messages and some comments are printed.
average_type (str) – type of averaging if Welch PSD estimate is used. options are ‘median’ (default) and ‘mean’.
- Returns:
spectrogram – An xarray.DataArray object that contains time and frequency bins as well as corresponding values. If no noise data is available, None is returned.
- Return type:
xr.DataArray
- frequency_calibration(N)¶
Apply a frequency-dependent sensitivity correction to the acoustic data based on the information from the calibration sheets. Hydrophone deployments are found at https://github.com/OOI-CabledArray/deployments. Hydrophone calibration sheets are found at https://github.com/OOI-CabledArray/calibrationFiles.
- Parameters:
N (int) – length of the data segment
- Returns:
output_array – array with correction coefficient for every frequency
- Return type:
np.array
- get_asset_ID()¶
get_asset_ID returns the hydrophone asset ID for a given data sample. This data can be found here for broadband hydrophones. Since low frequency hydrophones remain constant with location and time, the {location}-{channel} string combination is returned if the hydrophone is low frequency.
- save(file_format, filename, wav_kwargs={})¶
Save hydrophone data in the specified format. Supported formats are:
- pickle - saves the HydrophoneData object as a pickle file
- netCDF - saves HydrophoneData object as netCDF. Time coordinates are not included
- mat - saves HydrophoneData object as a .mat file
- wav - calls wav_write method to save HydrophoneData object as a .wav file
- Parameters:
- Return type:
None
- wav_write(filename, norm=False, new_sample_rate=None)¶
method that stores HydrophoneData into .wav file
- Parameters:
- class ooipy.hydrophone.basic.Psd(freq, values)¶
A class used to represent a PSD object
- TODO¶
- visualize(plot_psd=True, save_psd=False, filename='psd.png', title='PSD', xlabel='frequency', xlabel_rot=0, ylabel='spectral level', fmin=0, fmax=32, vmin=20, vmax=80, figsize=(16,9), dpi=96)
Visualizes PSD estimate using matplotlib.
- save(filename='psd.json', ancillary_data=[], ancillary_data_label=[])¶
Saves PSD estimate and ancillary data in .json file.
- plot(**kwargs)¶
Redirects to ooipy.ooiplotlib.plot_psd(); please see ooipy.hydrophone.basic.plot_psd()
- save(filename='psd.json', ancillary_data=[], ancillary_data_label=[])¶
!!!!! This function will be moved into a different module in the future. The current documentation might not be accurate !!!!!
Save PSD estimates along with ancillary data (stored in dictionary) in json file.
filename (str): directory for saving the data
ancillary_data ([array like]): list of ancillary data
ancillary_data_label ([str]): labels for ancillary data used as keys in the output dictionary. Array has same length as ancillary_data array.
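How the labels pair up with the ancillary arrays can be illustrated with a short standalone function. This is a sketch under assumptions: the key names `"freq"` and `"values"` and the helper `psd_to_json` are chosen for illustration and may not match the keys written by Psd.save.

```python
import json


def psd_to_json(freq, values, ancillary_data=(), ancillary_data_label=()):
    """Serialize a PSD and labelled ancillary data into a JSON string.

    Hypothetical helper; the "freq" and "values" keys are assumed names.
    """
    if len(ancillary_data) != len(ancillary_data_label):
        raise ValueError("ancillary_data and ancillary_data_label must have the same length")

    record = {"freq": list(freq), "values": list(values)}
    # Each label becomes the dictionary key for the matching data array.
    for label, data in zip(ancillary_data_label, ancillary_data):
        record[label] = list(data)
    return json.dumps(record)


text = psd_to_json(
    freq=[0.0, 15.625],
    values=[60.1, 58.3],
    ancillary_data=[[3.2, 3.4]],
    ancillary_data_label=["wind_speed"],
)
```

Converting arrays to plain lists first matters in practice because numpy arrays are not JSON serializable.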
- visualize(plot_psd=True, save_psd=False, filename='psd.png', title='PSD', xlabel='frequency', xlabel_rot=0, ylabel='spectral level', fmin=0, fmax=32000, vmin=20, vmax=80, figsize=(16, 9), dpi=96)¶
!!!!! This function will be moved into a different module in the future. The current documentation might not be accurate !!!!!
Basic visualization of PSD estimate based on matplotlib. The function offers two options: Plot PSD in Python (plot_psd = True) and save PSD plot in directory (save_psd = True). PSDs are plotted in dB re 1µ Pa^2/Hz.
plot_psd (bool): whether or not PSD is plotted using Python
save_psd (bool): whether or not PSD plot is saved
filename (str): directory where PSD plot is saved. Use ending “.png” or “.pdf” to save as PNG or PDF file. This value will be ignored if save_psd=False
title (str): title of plot
ylabel (str): label of vertical axis
xlabel (str): label of horizontal axis
xlabel_rot (float): rotation of xlabel. This is useful if xlabel are longer strings.
fmin (float): minimum frequency (unit same as f) that is displayed
fmax (float): maximum frequency (unit same as f) that is displayed
vmin (float): minimum value (dB) of PSD.
vmax (float): maximum value (dB) of PSD.
figsize (tuple(int)): size of figure
dpi (int): dots per inch
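Since vmin and vmax are given in dB, it can help to see how a linear PSD value maps onto that scale. For a power quantity the standard conversion is 10·log10 of the linear value; the helper name `psd_lin_to_db` is hypothetical and the sketch assumes the linear PSD is in µPa^2/Hz.

```python
import math


def psd_lin_to_db(value_lin):
    """Convert a linear PSD value (µPa^2/Hz) to dB re 1 µPa^2/Hz."""
    if value_lin <= 0:
        raise ValueError("PSD values must be positive to be expressed in dB")
    return 10.0 * math.log10(value_lin)


# The default plot limits vmin=20 and vmax=80 correspond to linear values
# of 1e2 and 1e8 µPa^2/Hz.
lower = psd_lin_to_db(1e2)
upper = psd_lin_to_db(1e8)
```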
- class ooipy.hydrophone.basic.Spectrogram(time, freq, values)¶
A class used to represent a spectrogram object.
- values¶
Values of the spectrogram. For each time-frequency-bin pair there has to be one entry in values. That is, if time has length N and freq length M, values is a NxM array.
- Type:
2-D array of float
- plot(**kwargs)¶
Redirects to ooipy.ooiplotlib.plot_spectrogram(); please see ooipy.hydrophone.basic.plot_spectrogram()
- save(filename='spectrogram.pickle')¶
!!!!! This function will be moved into a different module in the future. The current documentation might not be accurate !!!!!
Save spectrogram in pickle file.
filename (str): directory where spectrogram data is saved. Ending has to be “.pickle”.
- visualize(plot_spec=True, save_spec=False, filename='spectrogram.png', title='spectrogram', xlabel='time', xlabel_rot=70, ylabel='frequency', fmin=0, fmax=32000, vmin=20, vmax=80, vdelta=1.0, vdelta_cbar=5, figsize=(16, 9), dpi=96, res_reduction_time=1, res_reduction_freq=1, time_limits=None)¶
This function will be deprecated and moved into a different module in the future. The current documentation might not be accurate.
To plot spectrograms please see ooipy.hydrophone.basic.plot_spectrogram().
Basic visualization of spectrogram based on matplotlib. The function offers two options: Plot spectrogram in Python (plot_spec = True) and save spectrogram plot in directory (save_spec = True). Spectrograms are plotted in dB re 1µ Pa^2/Hz.
plot_spec (bool): whether or not spectrogram is plotted using Python
save_spec (bool): whether or not spectrogram plot is saved
filename (str): directory where spectrogram plot is saved. Use ending “.png” or “.pdf” to save as PNG or PDF file. This value will be ignored if save_spec=False
title (str): title of plot
ylabel (str): label of vertical axis
xlabel (str): label of horizontal axis
xlabel_rot (float): rotation of xlabel. This is useful if xlabel are longer strings, for example when using datetime.datetime objects.
fmin (float): minimum frequency (unit same as f) that is displayed
fmax (float): maximum frequency (unit same as f) that is displayed
vmin (float): minimum value (dB) of spectrogram that is colored. All values below are displayed in white.
vmax (float): maximum value (dB) of spectrogram that is colored. All values above are displayed in white.
vdelta (float): color resolution
vdelta_cbar (int): label ticks in colorbar are in vdelta_cbar steps
figsize (tuple(int)): size of figure
dpi (int): dots per inch
time_limits (list): specifies xlimits on spectrogram. List contains two datetime.datetime objects
CTD data object¶
Module for CTD data objects
- class ooipy.ctd.basic.CtdData(raw_data=None, extract_parameters=True)¶
Object that stores conductivity, temperature, depth (CTD) data, and provides functions for calculating sound speed, temperature, pressure, and salinity profiles. When a CtdData object is created and extract_parameters = True (default), then temperature, pressure, salinity, and time are automatically extracted from the raw data.
- raw_data¶
list containing sample from CTD. Each sample is a dictionary containing all parameters measured by the CTD.
- temperature¶
array containing temperature samples in degrees Celsius.
- Type:
numpy.ndarray
- pressure¶
array containing pressure samples in dbar.
- Type:
numpy.ndarray
- salinity¶
array containing salinity samples in parts per thousand.
- Type:
numpy.ndarray
- depth¶
array containing depth samples in meter.
- Type:
numpy.ndarray
- density¶
array containing density samples in kg/cubic meter.
- Type:
numpy.ndarray
- conductivity¶
array containing conductivity samples in siemens/meter.
- Type:
numpy.ndarray
- sound_speed¶
array containing sound speed samples in meter/second.
- Type:
numpy.ndarray
- time¶
array containing time samples as datetime.datetime objects.
- Type:
numpy.ndarray
- sound_speed_profile¶
object for sound speed profile.
- temperature_profile¶
object for temperature profile.
- salinity_profile¶
object for salinity profile.
- pressure_profile¶
object for pressure profile.
- density_profile¶
object for density profile.
- conductivity_profile¶
object for conductivity profile.
- calc_depth_from_pressure()¶
Calculates depth from pressure array
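As an illustration of a pressure-to-depth conversion, the sketch below uses the widely cited UNESCO (1983) polynomial for seawater. It is not necessarily the formula used by calc_depth_from_pressure; the function name `depth_from_pressure` and the latitude argument with its default are assumptions made for this example.

```python
import math


def depth_from_pressure(pressure_dbar, latitude_deg=45.0):
    """Approximate depth in meters from pressure in dbar (UNESCO 1983).

    Hypothetical helper; latitude_deg only enters through the
    latitude-dependent gravity term.
    """
    p = pressure_dbar
    x = math.sin(math.radians(latitude_deg)) ** 2
    # Gravity variation with latitude and pressure.
    gravity = 9.780318 * (1.0 + (5.2788e-3 + 2.36e-5 * x) * x) + 1.092e-6 * p
    # Polynomial in pressure, evaluated in Horner form.
    numerator = (((-1.82e-15 * p + 2.279e-10) * p - 2.2512e-5) * p + 9.72659) * p
    return numerator / gravity


# Published check value: 10000 dbar at 30 degrees latitude is about 9712.653 m.
check_depth = depth_from_pressure(10000.0, latitude_deg=30.0)
```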
- calc_sound_speed()¶
Calculates sound speed from temperature, salinity and pressure array. The equation for calculating the sound speed is from: Chen, C. T., & Millero, F. J. (1977). Speed of sound in seawater at high pressures. Journal of the Acoustical Society of America, 62(5), 1129–1135. https://doi.org/10.1121/1.381646
- get_parameter(parameter)¶
Extension of get_parameter_from_rawdata. Sound speed and depth can also be requested.
- get_parameter_from_rawdata(parameter)¶
Extracts parameters from raw data dictionary.
- get_profile(max_depth, parameter)¶
Compute the profile for sound speed, temperature, pressure, or salinity over the water column.
- ntp_seconds_to_datetime(ntp_seconds)¶
Converts timestamp into datetime object.
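A conversion of this kind can be sketched with the standard library, assuming the timestamp counts seconds since the NTP epoch of 1900-01-01 00:00:00 UTC. The function name `ntp_to_utc` is hypothetical and the sketch returns a timezone-aware value, which may differ from what the library method returns.

```python
from datetime import datetime, timedelta, timezone

# Seconds between the NTP epoch (1900-01-01) and the Unix epoch (1970-01-01).
NTP_TO_UNIX_OFFSET = 2_208_988_800
NTP_EPOCH = datetime(1900, 1, 1, tzinfo=timezone.utc)


def ntp_to_utc(ntp_seconds):
    """Convert seconds since the NTP epoch into a timezone-aware UTC datetime."""
    return NTP_EPOCH + timedelta(seconds=ntp_seconds)


unix_epoch = ntp_to_utc(NTP_TO_UNIX_OFFSET)
```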
- class ooipy.ctd.basic.CtdProfile(parameter_mean, parameter_var, depth_mean, depth_var, n_samp)¶
Simple object that stores a parameter profile over the water column. For each 1-meter interval, there is one data point in the profile.
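The idea of one data point per 1-meter interval can be sketched as a simple binning step. The helper `bin_profile` is hypothetical, the output keys merely echo the constructor argument names, and the use of the population variance is an assumption; the library may bin or estimate variance differently.

```python
import math
from statistics import fmean, pvariance


def bin_profile(depths, values):
    """Group samples into 1-meter depth bins and summarize each bin."""
    bins = {}
    for depth, value in zip(depths, values):
        # Samples with depth in [k, k + 1) fall into bin k.
        bins.setdefault(math.floor(depth), []).append((depth, value))

    profile = []
    for key in sorted(bins):
        bin_depths = [d for d, _ in bins[key]]
        bin_values = [v for _, v in bins[key]]
        profile.append(
            {
                "depth_mean": fmean(bin_depths),
                "depth_var": pvariance(bin_depths),
                "parameter_mean": fmean(bin_values),
                "parameter_var": pvariance(bin_values),
                "n_samp": len(bin_values),
            }
        )
    return profile


profile = bin_profile([0.2, 0.8, 1.5], [10.0, 12.0, 9.0])
```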
- convert_to_ssp()¶
converts to numpy array with correct format for arlpy simulation
- Returns:
ssp – 2D numpy array containing sound speed profile column 0 is depth, column 1 is sound speed (in m/s)
- Return type:
numpy array
- plot(**kwargs)¶
Redirects to ooipy.ooiplotlib.plot_ctd_profile(); please see ooipy.hydrophone.basic.plot_psd()