Tidi #253

Open: wants to merge 6 commits into base: develop

Changes from 3 commits
2 changes: 1 addition & 1 deletion .zenodo.json
@@ -35,7 +35,7 @@
"orcid": "0000-0002-8191-4765"
},
{
-"affiliation":"University of Colorado at Boulder",
+"affiliation":"University of Colorado at Boulder, SW TREC",
"name": "Navarro, Luis",
"orcid": "0000-0002-6362-6575"
},
1 change: 1 addition & 0 deletions CHANGELOG.md
@@ -5,6 +5,7 @@ This project adheres to [Semantic Versioning](https://semver.org/).
## [0.1.0] - 2025-XX-XX
* New Instruments
* Mars Global Surveyor Magnetometer (MGS Mag)
* TIMED TIDI
* Documentation
* Updated controlled information review statement for clarity

8 changes: 8 additions & 0 deletions docs/supported_instruments.rst
@@ -267,3 +267,11 @@ TIMED SEE

.. automodule:: pysatNASA.instruments.timed_see
:members:

.. _timed_tidi:

TIMED TIDI
----------

.. automodule:: pysatNASA.instruments.timed_tidi
:members:
2 changes: 1 addition & 1 deletion pysatNASA/instruments/__init__.py
@@ -23,7 +23,7 @@
'icon_mighti', 'igs_gps', 'iss_fpmu', 'jpl_gps', 'maven_insitu_kp',
'maven_mag', 'maven_sep', 'mgs_mag', 'omni_hro',
'reach_dosimeter', 'ses14_gold', 'timed_guvi', 'timed_saber',
-'timed_see']
+'timed_see', 'timed_tidi']

for inst in __all__:
exec("from pysatNASA.instruments import {x}".format(x=inst))
9 changes: 8 additions & 1 deletion pysatNASA/instruments/methods/cdaweb.py
@@ -18,11 +18,13 @@
"""

import datetime as dt
import gzip
Collaborator

Is this in the standard library?

Contributor Author

As far as I understand, it is part of the Python standard library: https://docs.python.org/3/library/gzip.html

import numpy as np
import os
from packaging.version import Version as pack_ver
import pandas as pds
import requests
import shutil
import tempfile
from time import sleep
import xarray as xr
@@ -618,7 +620,7 @@ def _get_file(remote_file, data_path, fname, temp_path=None, zip_method=None):
Path to temporary directory. Must be specified if zip_method is True.
(Default=None)
zip_method : str
-The method used to zip the file. Supports 'zip' and None.
+The method used to zip the file. Supports 'zip', 'gz' and None.
If None, downloads files directly. (default=None)

Raises
@@ -645,6 +647,11 @@ def _get_file(remote_file, data_path, fname, temp_path=None, zip_method=None):
if zip_method == 'zip':
with zipfile.ZipFile(dl_fname, 'r') as open_zip:
open_zip.extractall(data_path)
elif zip_method == 'gz':
Collaborator

Are the TIDI files zipped? I don't see a 'zip_method' assigned to the instrument.

Collaborator

I found the zip method assignment.

Contributor Author

Yes, the TIDI files are gzip-compressed (Python's gzip module is built on zlib).

dest = os.path.join(data_path, fname.replace('.gz', ''))
Collaborator

What functionality or issue does gzip handle that the zipfile package does not?

Contributor Author

I tried using the zipfile package to unpack the gzip files, but it did not work because they use a different compression format. It would be best if you could take a look.

with gzip.open(dl_fname, 'rb') as open_gz:
with open(dest, 'wb') as open_file:
shutil.copyfileobj(open_gz, open_file)

elif zip_method is not None:
logger.warning('{:} is not a recognized zip method'.format(zip_method))
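
As a side note on the gzip question raised in the thread above, here is a minimal standard-library sketch of why `zipfile` cannot stand in for `gzip`: a `.gz` file is a single compressed stream rather than a zip archive. The helper name `gunzip_to` and the file name are assumptions made for illustration; they are not part of pysatNASA.

```python
import gzip
import os
import shutil
import tempfile
import zipfile


def gunzip_to(dl_fname, data_path):
    """Decompress one .gz file into data_path (hypothetical helper)."""
    base = os.path.basename(dl_fname)
    # Strip only a trailing '.gz' so the rest of the name is untouched
    if base.endswith('.gz'):
        base = base[:-3]
    dest = os.path.join(data_path, base)
    # Stream the decompressed bytes to disk, as in the hunk above
    with gzip.open(dl_fname, 'rb') as open_gz:
        with open(dest, 'wb') as open_file:
            shutil.copyfileobj(open_gz, open_file)
    return dest


with tempfile.TemporaryDirectory() as tmp:
    # Build a small gzip file to stand in for a downloaded data file
    payload = b'example payload'
    gz_name = os.path.join(tmp, 'example_file.VEC.gz')
    with gzip.open(gz_name, 'wb') as fout:
        fout.write(payload)

    # zipfile does not recognize a gzip stream as a zip archive
    is_zip = zipfile.is_zipfile(gz_name)

    out_name = gunzip_to(gz_name, tmp)
    with open(out_name, 'rb') as fin:
        recovered = fin.read()
    out_base = os.path.basename(out_name)
```

Stripping only a trailing `.gz`, rather than calling `str.replace`, is a choice made in this sketch so that a `.gz` elsewhere in a name is left alone.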
11 changes: 9 additions & 2 deletions pysatNASA/instruments/methods/timed.py
@@ -13,7 +13,8 @@

rules_url = {'guvi': 'http://guvitimed.jhuapl.edu/home_guvi-datausage',
'saber': 'https://saber.gats-inc.com/data_services.php',
-'see': 'https://www.timed.jhuapl.edu/WWW/scripts/mdc_rules.pl'}
+'see': 'https://www.timed.jhuapl.edu/WWW/scripts/mdc_rules.pl',
+'tidi': 'https://tidi.engin.umich.edu/conditions-of-use/'}

ackn_str = "".join(["This Thermosphere Ionosphere Mesosphere Energetics ",
"Dynamics (TIMED) satellite data is provided through ",
@@ -51,4 +52,10 @@
'W. K., and Woodraska, D. L. (2005),',
'Solar EUV Experiment (SEE): Mission',
'overview and first results, J. Geophys.',
-'Res., 110, A01312, doi:10.1029/2004JA010765.'))}
+'Res., 110, A01312, doi:10.1029/2004JA010765.')),
+'tidi': ' '.join(('Killeen, T. L., Skinner, W. R., Johnson,',
+'R. M., Edmonson, C. J., Wu, Q., Niciejewski,',
+'R. J., Grassl, H. J., Gell, D. A., Hansen,',
+'P. E., Harvey, J. D., Kafkalidis, J. F. (1999),',
+'TIMED Doppler Interferometer, Proc. SPIE,',
+'3756, 289–315.'))}
255 changes: 255 additions & 0 deletions pysatNASA/instruments/timed_tidi.py
@@ -0,0 +1,255 @@
# -*- coding: utf-8 -*-
"""Module for the TIMED TIDI instrument.

Supports the TIMED Doppler Interferometer (TIDI) instrument on the Thermosphere
Ionosphere Mesosphere Energetics Dynamics (TIMED) satellite data from the
NASA Coordinated Data Analysis Web (CDAWeb).

Properties
----------
platform
'timed'
name
'tidi'
tag
'profile'
'los'
'vector'
inst_id
''
'ncar'

Warnings
--------
- The cleaning parameters for the instrument are still under development.

Example
-------
::

    import datetime as dt
    import pysat
    tidi = pysat.Instrument('timed', 'tidi', tag='vector',
                            inst_id='', clean_level='None')
    tidi.download(dt.datetime(2020, 1, 30), dt.datetime(2020, 1, 31))
    tidi.load(2020, 30)

"""

import datetime as dt
import functools
import numpy as np
import pandas as pds
import xarray as xr

import pysat
from pysat.instruments.methods import general as mm_gen
from pysat.utils.io import load_netcdf

from pysatNASA.instruments.methods import cdaweb as cdw
from pysatNASA.instruments.methods import general as mm_nasa
from pysatNASA.instruments.methods import timed as mm_timed

# ----------------------------------------------------------------------------
# Instrument attributes

platform = 'timed'
name = 'tidi'
tags = {'profile': 'Level 1 TIDI data',
'los': 'Level 2 TIDI data',
'vector': 'Level 3 TIDI data'}
inst_ids = {'': ['los', 'profile', 'vector'],
            'ncar': ['vector']}

pandas_format = False

# ----------------------------------------------------------------------------
# Instrument test attributes

_test_dates = {jj: {kk: dt.datetime(2019, 1, 1) for kk in inst_ids[jj]}
for jj in inst_ids.keys()}

# ----------------------------------------------------------------------------
# Instrument methods

# Use standard init routine
def init(self, module=mm_timed, name=name):
    mm_nasa.init(self, module=module, name=name)

    # Same timing cold/warm for Michigan files.
    self.strict_time_flag = False

# No cleaning, use standard warning function instead
clean = mm_nasa.clean_warn

# ----------------------------------------------------------------------------
# Instrument functions
#
# Use the default CDAWeb and pysat methods

# Set the list_files routine
fname = ''.join(('TIDI_PB_{{year:04d}}{{day:03d}}_P????_S????_',
                 'D{{version:03d}}_R{{revision:02d}}.{ext:s}{gz:s}'))
fname_ext = {'vector': 'VEC',
             'los': 'LOS',
             'profile': 'PRF'}
fname_ncar = ''.join(('timed_windvectorsncar_tidi_',
                      '{year:04d}{month:02d}{day:02d}',
                      '????_v??.cdf'))
supported_tags = {'': {tag: fname.format(ext=fname_ext[tag], gz='')
                       for tag in tags},
                  'ncar': {'vector': fname_ncar}}
list_files = functools.partial(mm_gen.list_files,
                               supported_tags=supported_tags)
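
The doubled braces in `fname` may not be obvious at first read. The sketch below, which uses only built-in string formatting, shows the two-stage idea: the first `format` call fills `ext` and `gz` and collapses `{{...}}` to `{...}`, leaving the date and version fields to be filled later. The second stage is done by hand here, and the version and revision numbers are made-up values for illustration.

```python
fname = ''.join(('TIDI_PB_{{year:04d}}{{day:03d}}_P????_S????_',
                 'D{{version:03d}}_R{{revision:02d}}.{ext:s}{gz:s}'))

# Stage 1: fill the tag-specific pieces; doubled braces become single
local_template = fname.format(ext='VEC', gz='')
remote_template = fname.format(ext='VEC', gz='.gz')

# Stage 2: fill the date and version fields (hypothetical values)
example = local_template.format(year=2019, day=1, version=11, revision=1)

print(local_template)
print(remote_template)
print(example)
```

Keeping a single template for both the local and remote names means the only difference between them is the `.gz` suffix passed in stage 1.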

# Set the load routine
def load(fnames, tag='', inst_id=''):
"""Load TIMED TIDI data into `xarray.Dataset` and `pysat.Meta` objects.

This routine is called as needed by pysat. It is not intended
for direct user interaction.

Parameters
----------
fnames : array-like
iterable of filename strings, full path, to data files to be loaded.
This input is nominally provided by pysat itself.
tag : str
tag name used to identify particular data set to be loaded.
This input is nominally provided by pysat itself.
inst_id : str
Satellite ID used to identify particular data set to be loaded.
This input is nominally provided by pysat itself.

Returns
-------
data : xr.Dataset
An xarray Dataset with data prepared for the pysat.Instrument
meta : pysat.Meta
Metadata formatted for a pysat.Instrument object.

Raises
------
ValueError
If temporal dimensions are not consistent

Note
----
Any additional keyword arguments passed to pysat.Instrument
upon instantiation are passed along to this routine.

Examples
--------
::

inst = pysat.Instrument('timed', 'tidi',
inst_id='', tag='vector')
inst.load(2005, 179)

"""

    labels = {'units': ('Units', str), 'name': ('Long_Name', str),
              'notes': ('Var_Notes', str), 'desc': ('CatDesc', str),
              'plot': ('plot', str), 'axis': ('axis', str),
              'scale': ('scale', str),
              'min_val': ('Valid_Min', np.float64),
              'max_val': ('Valid_Max', np.float64),
              'fill_val': ('fill', np.float64)}

Collaborator

@aburrell There are a few repeated type-mismatch issues related to metadata and labels. One of the variables is unlike the others...

/Users/russellstoneback/Code/pysat/pysat/_meta.py:440: UserWarning: Metadata with type <class 'str'> does not match expected type <class 'numpy.float64'>. Dropping input for 'ut_date' with key 'Valid_Min'
warnings.warn(''.join((
/Users/russellstoneback/Code/pysat/pysat/_meta.py:440: UserWarning: Metadata with type <class 'str'> does not match expected type <class 'numpy.float64'>. Dropping input for 'ut_date' with key 'Valid_Max'
warnings.warn(''.join((

Is there a way to address these warnings?

Member

I think that the recent versions should auto-identify the types. Try running without the labels.

    # Generate custom meta translation table. When left unspecified the default
    # table handles the multiple values for fill. We must recreate that
    # functionality in our table. The targets for meta_translation should
    # map to values in `labels` above.
    meta_translation = {'FIELDNAM': 'plot', 'LABLAXIS': 'axis',
                        'ScaleTyp': 'scale', 'VALIDMIN': 'Valid_Min',
                        'Valid_Min': 'Valid_Min', 'VALIDMAX': 'Valid_Max',
                        'Valid_Max': 'Valid_Max', '_FillValue': 'fill',
                        'FillVal': 'fill', 'TIME_BASE': 'time_base'}
    if inst_id == 'ncar':
        if tag == 'vector':
            data, meta = cdw.load(fnames, tag, inst_id,
                                  pandas_format=True)
Collaborator

Suggested change
-                                  pandas_format=True)
+                                  pandas_format=True,
+                                  meta_translation=meta_translation)

@jklenzing I think we may need a meta_kwargs pass thru in cdw.load to supply the labels dict to meta.

meta_translation is also not currently supported by the pandas_format option. My suggestion may be premature.

Collaborator

Example of metadata from this data set

In [39]: tidi.meta['Epochcold']
Out[39]: 
FIELDNAM              Epoch cold
VAR_TYPE            support_data
DEPEND_0                     NaN
DEPEND_1                     NaN
LABL_PTR_1                   NaN
DISPLAY_TYPE                 NaN
FORMAT                       NaN
LABLAXIS                     NaN
SCALETYP                     NaN
units                        NaN
long_name              Epochcold
notes                  Epochcold
desc                Default time
value_min       62798371200000.0
value_max       63460972800000.0
fill                         NaN
Name: Epochcold, dtype: object

Contributor Author

This could be improved. I used a generic metadata dict and am not sure how its entries match the fields in the loaded TIDI files. I would need assistance with that.

Collaborator

No problem. Posted this to help with our pysat meeting conversation on the code. Thanks again for the pull!

            data = data.to_xarray()
Collaborator

Is the switch to xarray for data type consistency with other inst_id/tags?

Contributor Author

Thanks for looking into this. I also tested it for the NCAR tag and it worked fine, but hopefully it can be improved.

            data = data.rename(index='time')

    elif inst_id == '':
        data = []
        for fname in fnames:
            idata, meta = load_netcdf(fname, pandas_format=pandas_format,
                                      epoch_name='time', epoch_unit='s',
                                      epoch_origin='1980-01-06 00:00:00',
                                      meta_kwargs={'labels': labels},
                                      meta_translation=meta_translation,
                                      drop_meta_labels='FILLVAL',
                                      )
            data.append(idata)

Collaborator

@rstoneback, Feb 3, 2025

I'm having some issues with the 'vector' tag. I'm using a new Python install so the issue may be on my end. I can't seem to access data at the Instrument level. @jklenzing

In [16]: tidi['ut_date']
Out[16]: 
<xarray.Dataset> Size: 0B
Dimensions:  ()
Data variables:
    *empty*

In [17]: tidi.data['ut_date']
Out[17]: 
<xarray.DataArray 'ut_date' (time: 1317)> Size: 11kB
array([b'2019001', b'2019001', b'2019001', ..., b'2019002', b'2019002',
       b'2019002'], dtype=object)
Coordinates:
    time     (time) datetime64[ns] 11kB 2019-01-01T00:03:37 ... 2019-01-02T00...

Collaborator

Given the data distribution (loading data for Jan 1 includes Jan 2), multi_file_day should be set for each data set like this.

Contributor Author

Yes, that would be a better solution. I did notice that some vector files contain part of the next day for a different type, i.e., cold/warm.
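
To make the day-boundary point in this thread concrete, here is a small standard-library sketch that decodes the `ut_date` byte strings shown in the output above. It assumes they follow a year plus day-of-year (`YYYYDDD`) layout, which matches the timestamps shown but is not confirmed by the source; `parse_ut_date` is a hypothetical helper.

```python
import datetime as dt


def parse_ut_date(raw):
    """Convert a b'YYYYDDD' value to a date (assumed layout)."""
    text = raw.decode('ascii') if isinstance(raw, bytes) else str(raw)
    # %Y is the four-digit year, %j is the day of year
    return dt.datetime.strptime(text, '%Y%j').date()


# Values copied from the review output above
first = parse_ut_date(b'2019001')
last = parse_ut_date(b'2019002')

# True when a single load spans more than one calendar day
spans_two_days = first != last
print(first, last, spans_two_days)
```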

        if tag in ['vector', 'profile']:
            _dim = 'nvec' if tag == 'vector' else 'nprofs'
            alt_retrieved = data[0].alt_retrieved
            for i, idata in enumerate(data):
                idata = idata.drop_vars('alt_retrieved')
                idata = idata.assign_coords(time=idata.time)
                data[i] = idata.rename({_dim: 'time'})
            data = xr.concat(data, 'time')
            data = data.assign_coords(alt=('nalts', alt_retrieved.data))
            data = data.rename(nalts='alt')
            data = data.sortby('time')

        elif tag == 'los':
            for i, idata in enumerate(data):
                idata = idata.assign_coords(time=idata.time)
                data[i] = idata.rename(nlos='time')
Collaborator

This line, and similar, produces a warning on my system:

/Users/russellstoneback/Code/pysatNASA/pysatNASA/instruments/timed_tidi.py:205: UserWarning: rename 'nlos' to 'time' does not create an index anymore. Try using swap_dims instead or use set_index after rename to create an indexed coordinate.
  data[i] = idata.rename(nlos='time')


            hh = [t.drop_dims(['time', 'nrecs_size']) for t in data]
            names2avoid = list(hh[0].data_vars.keys())
            ff = [t.drop_dims(['time']).drop_vars(names2avoid) for t in data]
            ff = xr.concat(ff, 'nrecs_size')
            ee = [t.drop_dims(['nrecs_size']).drop_vars(names2avoid)
                  for t in data]
            ee = xr.concat(ee, 'time')
            data = xr.merge([ff, ee, hh[0]])

Collaborator

For the 'profile' and 'los' tags I'm getting an empty meta.data. 'profile' example shown.

In [25]: tidi.meta.data
Out[25]: 
Empty DataFrame
Columns: [units, long_name, notes, desc, value_min, value_max, fill, plot, axis, scale]
Index: []

In [26]: tidi.data
Out[26]: 
<xarray.Dataset> Size: 1MB
Dimensions:         (time: 1880, alt: 21)
Coordinates:
    time            (time) datetime64[ns] 15kB 2019-01-01T00:08:01 ... 2019-0...
    alt             (alt) float32 84B 70.0 72.5 75.0 77.5 ... 115.0 117.5 120.0
Data variables: (12/46)
    ms_time         (time) float32 8kB 695.0 280.0 925.0 ... 190.0 845.0 500.0
    ut_date         (time) object 15kB b'2019001' b'2019001' ... b'2019001'
    ut_time         (time) float64 15kB 4.66e+05 4.69e+05 ... 8.638e+07
    rec_index       (time) float64 15kB 3.0 1.0 2.0 ... 1.88e+03 1.879e+03

Collaborator

Of course, if the files themselves have no metadata, that is OK.

    return data, meta
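
The `epoch_unit='s'` and `epoch_origin='1980-01-06 00:00:00'` arguments in the load routine above describe the time variable as seconds counted from 6 January 1980, which is the GPS epoch. A minimal sketch of that conversion with the standard library follows. The input value is chosen here so that it lands on the first timestamp shown in the review output; it was not read from a TIDI file, and this plain offset ignores leap seconds.

```python
import datetime as dt

# Origin taken from the epoch_origin argument in the diff
EPOCH_ORIGIN = dt.datetime(1980, 1, 6)


def seconds_to_datetime(seconds):
    """Convert seconds since the origin to a naive datetime."""
    return EPOCH_ORIGIN + dt.timedelta(seconds=seconds)


# Hypothetical sample: 14240 days plus 217 seconds after the origin
example_seconds = 14240 * 86400 + 217
example_time = seconds_to_datetime(example_seconds)
print(example_time.isoformat())
```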

# Set download tags. Note that the default inst_id uses the general
# implementation, while 'ncar' uses the cdasws implementation.
url = '/pub/data/timed/tidi/{tag:s}/{{year:04d}}/'
download_tags = {'': {tag: {'remote_dir': url.format(tag=tag),
                            'zip_method': 'gz',
                            'fname': fname.format(ext=fname_ext[tag],
                                                  gz='.gz')}
                      for tag in tags.keys()},
                 'ncar': {'vector': 'TIMED_WINDVECTORSNCAR_TIDI'},
                 }

# Set the download routine
def download(date_array, tag='', inst_id='', data_path=None):
"""Download NASA TIMED/TIDI data.

This routine is intended to be used by pysat instrument modules supporting
a particular NASA CDAWeb dataset.

Parameters
----------
date_array : array-like
Array of datetimes to download data for. Provided by pysat.
tag : str
Data product tag (default='')
inst_id : str
Instrument ID (default='')
data_path : str or NoneType
Path to data directory. If None is specified, the value previously
set in Instrument.files.data_path is used. (default=None)

"""

    if inst_id in ['ncar']:
        cdw.cdas_download(date_array, tag=tag, inst_id=inst_id,
                          supported_tags=download_tags, data_path=data_path)
    else:
        cdw.download(date_array, tag=tag, inst_id=inst_id,
                     supported_tags=download_tags, data_path=data_path)

# Set the list_remote_files routine
list_remote_files = functools.partial(cdw.cdas_list_remote_files,
supported_tags=download_tags)
2 changes: 1 addition & 1 deletion pysatNASA/tests/test_methods_platform.py
@@ -20,7 +20,7 @@ class TestTIMEDMethods(object):

def setup_method(self):
"""Set up the unit test environment for each method."""
-self.names = ['see', 'saber', 'guvi']
+self.names = ['see', 'saber', 'guvi', 'tidi']
self.module = methods.timed
self.platform_str = '(TIMED)'
return