ValueError in queries with MISSING values #98

Closed
PaulJWright opened this issue May 18, 2023 · 2 comments · Fixed by #102


PaulJWright commented May 18, 2023

Describe the bug

Certain drms queries return a ValueError, e.g.:

keys = client.query('hmi.M_720s[2011.04.14_00:30:00/6h@2h]',
               key=drms.const.all)
---------------------------------------------------------------------------
ValueError                                Traceback (most recent call last)
Cell In[162], line 1
----> 1 keys = client.query('hmi.M_720s[2011.04.14_00:30:00/6h@2h]',
      2                key=drms.const.all)
      4 print(keys[['DATE__OBS','QUALITY']])

File ~/.pyenv/versions/arccnet/lib/python3.9/site-packages/drms/client.py:1072, in Client.query(self, ds, key, seg, link, convert_numeric, skip_conversion, pkeys, rec_index, n)
   1070         res_key = pd.DataFrame()
   1071     if convert_numeric:
-> 1072         self._convert_numeric_keywords(ds, res_key, skip_conversion)
   1073     res.append(res_key)
   1075 if seg is not None:

File ~/.pyenv/versions/arccnet/lib/python3.9/site-packages/drms/client.py:654, in Client._convert_numeric_keywords(self, ds, kdf, skip_conversion)
    652     if idx.any():
    653         k_idx = kdf.columns.get_loc(k)
--> 654         kdf[kdf.columns[k_idx]] = kdf[kdf.columns[k_idx]].apply(int, base=16)
    655 if k in num_keys:
    656     kdf[k] = _pd_to_numeric_coerce(kdf[k])

File ~/.pyenv/versions/arccnet/lib/python3.9/site-packages/pandas/core/series.py:4626, in Series.apply(self, func, convert_dtype, args, **kwargs)
   4516 def apply(
   4517     self,
   4518     func: AggFuncType,
   (...)
   4521     **kwargs,
   4522 ) -> DataFrame | Series:
   4523     """
   4524     Invoke function on values of Series.
   4525 
   (...)
   4624     dtype: float64
   4625     """
-> 4626     return SeriesApply(self, func, convert_dtype, args, kwargs).apply()

File ~/.pyenv/versions/arccnet/lib/python3.9/site-packages/pandas/core/apply.py:1025, in SeriesApply.apply(self)
   1022     return self.apply_str()
   1024 # self.f is Callable
-> 1025 return self.apply_standard()

File ~/.pyenv/versions/arccnet/lib/python3.9/site-packages/pandas/core/apply.py:1076, in SeriesApply.apply_standard(self)
   1074     else:
   1075         values = obj.astype(object)._values
-> 1076         mapped = lib.map_infer(
   1077             values,
   1078             f,
   1079             convert=self.convert_dtype,
   1080         )
   1082 if len(mapped) and isinstance(mapped[0], ABCSeries):
   1083     # GH#43986 Need to do list(mapped) in order to get treated as nested
   1084     #  See also GH#25959 regarding EA support
   1085     return obj._constructor_expanddim(list(mapped), index=obj.index)

File ~/.pyenv/versions/arccnet/lib/python3.9/site-packages/pandas/_libs/lib.pyx:2834, in pandas._libs.lib.map_infer()

File ~/.pyenv/versions/arccnet/lib/python3.9/site-packages/pandas/core/apply.py:133, in Apply.__init__.<locals>.f(x)
    132 def f(x):
--> 133     return func(x, *args, **kwargs)

ValueError: invalid literal for int() with base 16: 'MISSING'
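
The final frame boils down to calling `int(..., base=16)` on the placeholder string `'MISSING'`. A minimal sketch that reproduces only that step, without drms or pandas (the helper name `parse_hex` and the sample values are made up for illustration):

```python
def parse_hex(value):
    """Parse a hex string such as '0x00000400' into an int."""
    return int(value, base=16)


# Hypothetical sample values: one valid hex string and the placeholder.
values = ["0x00000400", "MISSING"]

errors = []
for v in values:
    try:
        print(v, "->", parse_hex(v))
    except ValueError as exc:
        # 'MISSING' contains characters that are not hex digits.
        errors.append(str(exc))
        print(v, "-> ValueError:", exc)
```

The second value fails with the same message as the traceback above.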

To Reproduce

In the simplest case, requesting 6 hours of data starting at 2011.04.14_00:30:00, at a 2 hour cadence, for all keys (drms.const.all) raises the above error.

import drms

# Replace the placeholder with an email address registered with JSOC.
client = drms.Client(debug=True, verbose=True, email="<email_address>")

keys = client.query('hmi.M_720s[2011.04.14_00:30:00/6h@2h]',
               key=drms.const.all)

As would perhaps be expected, the following queries complete successfully:

keys = client.query('hmi.M_720s[2011.04.14_00:30:00/6h@1h][? (QUALITY!=0) ?]',
               key=drms.const.all)

print(keys[['DATE__OBS','QUALITY']])

  DATE__OBS     QUALITY
0   MISSING  3221356544
1   MISSING  3221356544
2   MISSING  3221356544

keys = client.query('hmi.M_720s[2011.04.14_00:30:00/6h@1h][? (QUALITY=0) ?]',
               key=drms.const.all)

print(keys[['DATE__OBS','QUALITY']])

                 DATE__OBS  QUALITY
0  2011-04-14T03:34:20.00Z        0
1  2011-04-14T04:34:20.00Z        0
2  2011-04-14T05:34:20.00Z        0

However, the following query, which returns a mix of good and MISSING records, raises the same ValueError:

keys = client.query('hmi.M_720s[2011.04.14_00:30:00/6h@1h][? (QUALITY<65536) ?]',
               key=drms.const.all)

print(keys[['DATE__OBS','QUALITY']])
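
One possible explanation for why only the mixed result fails: the traceback shows the hex conversion guarded by `if idx.any():` and then applied to the whole column. The sketch below assumes that `idx` flags values starting with `0x`; that is a guess, since the line computing `idx` is not shown in the traceback. The function name and sample columns are hypothetical, and which keyword actually mixes hex strings with `MISSING` is not identified here.

```python
def convert_hex_column(values):
    """Simplified stand-in for the column-wide hex conversion."""
    # Assumption: the column is converted only if any value looks like hex.
    looks_hex = [str(v).startswith("0x") for v in values]
    if any(looks_hex):
        # The conversion is applied to every value, not just the hex ones.
        return [int(v, base=16) for v in values]
    return list(values)


all_missing = ["MISSING", "MISSING", "MISSING"]  # e.g. only bad records
all_hex = ["0x00000000", "0x00000400"]           # e.g. only good records
mixed = ["0x00000000", "MISSING"]                # good and bad records

print(convert_hex_column(all_missing))  # left untouched, no error
print(convert_hex_column(all_hex))      # converted to integers
try:
    convert_hex_column(mixed)
except ValueError as exc:
    print("ValueError:", exc)
```

Under that assumption, a column containing only `MISSING` is skipped and a column containing only hex strings converts cleanly, which would match the two successful queries above.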

Screenshots

The JSOC query of hmi.M_720s[2011.04.14_00:30:00/6h@2h] points to the issue (screenshot not included here).

System Details

==============================
sunpy Installation Information
==============================

General
#######
OS: Mac OS 13.3.1
Arch: 64bit, (arm)
sunpy: 4.1.6
Installation path: /Users/pjwright/.pyenv/versions/arccnet/lib/python3.9/site-packages/sunpy-4.1.6.dist-info

Required Dependencies
#####################
astropy: 5.2.2
numpy: 1.24.3
packaging: 23.1
parfive: 2.0.2

Optional Dependencies
#####################
asdf: 2.15.0
asdf-astropy: 0.4.0
beautifulsoup4: 4.12.2
cdflib: 0.4.9
dask: 2023.5.0
drms: 0.6.3
glymur: 0.12.5
h5netcdf: 1.1.0
h5py: 3.8.0
lxml: 4.9.2
matplotlib: 3.7.1
mpl-animators: 1.1.0
pandas: 2.0.1
python-dateutil: 2.8.2
reproject: 0.10.0
scikit-image: 0.20.0
scipy: 1.9.1
sqlalchemy: 2.0.13
tqdm: 4.65.0
zeep: 4.2.1

Installation method

pip

@PaulJWright PaulJWright changed the title ValueError in queries with MISSING values ValueError in queries with MISSING values May 18, 2023

PaulJWright commented May 18, 2023

@alasdairwilson notes that this may have been introduced between versions 0.6.2 and 0.6.3.

@nabobalis nabobalis added the Bug label May 18, 2023

PaulJWright commented Jun 9, 2023

Might look into this over the weekend... Here is the line in client.py: https://github.com/sunpy/drms/blame/8b2b8666e336bb28f3260ea6fe6f38722f96dd5b/drms/client.py#L654
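
A rough idea of what a more tolerant conversion could look like, as a sketch only. It is not necessarily how the eventual fix handles it, and `hex_to_int_or_nan` is a hypothetical helper name. Values that cannot be parsed as hex become NaN instead of raising:

```python
import math


def hex_to_int_or_nan(value):
    """Return the integer for a hex string, or NaN if it cannot be parsed."""
    try:
        return int(value, base=16)
    except (TypeError, ValueError):
        # Covers placeholders such as 'MISSING' and non-string inputs.
        return math.nan


# Hypothetical column mixing a valid hex string with the placeholder.
column = ["0xc0020000", "MISSING"]
converted = [hex_to_int_or_nan(v) for v in column]
print(converted)
```

The same helper could be passed to `Series.apply` in place of `int` with `base=16`. One trade-off is that NaN turns the resulting column into floats.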
