Reading gdp datapackage results in ValueError: Cannot cast u'1968' for <Year> #145

Closed
Fak3 opened this issue Jan 12, 2017 · 8 comments

Fak3 commented Jan 12, 2017

I just tried to use jsontableschema-pandas, and got an error from jsontableschema:

In [1]: import datapackage

In [2]: storage = datapackage.push_datapackage('http://data.okfn.org/data/core/gdp/datapackage.json', 'pandas')
---------------------------------------------------------------------------
ValueError                                Traceback (most recent call last)
<ipython-input-2-250f172059bb> in <module>()
----> 1 storage = datapackage.push_datapackage('http://data.okfn.org/data/core/gdp/datapackage.json', 'pandas')

/home/u1/.virtualenvs/pandas/lib/python2.7/site-packages/datapackage/pushpull.pyc in push_datapackage(descriptor, backend, **backend_options)
     74     for table in storage.buckets:
     75         if table in datamap:
---> 76             storage.write(table, datamap[table])
     77     return storage
     78 

/home/u1/.virtualenvs/pandas/lib/python2.7/site-packages/jsontableschema_pandas/storage.pyc in write(self, bucket, rows)
    136         # Prepare
    137         descriptor = self.describe(bucket)
--> 138         new_data_frame = mappers.descriptor_and_rows_to_dataframe(descriptor, rows)
    139 
    140         # Just set new DataFrame if current is empty

/home/u1/.virtualenvs/pandas/lib/python2.7/site-packages/jsontableschema_pandas/mappers.pyc in descriptor_and_rows_to_dataframe(descriptor, rows)
     31     index_rows = []
     32     jtstypes_map = {}
---> 33     for row in rows:
     34         values = []
     35         index = None

/home/u1/.virtualenvs/pandas/lib/python2.7/site-packages/datapackage/pushpull.pyc in values(schema, data)
     53         # TODO: review
     54         def values(schema, data):
---> 55             for item in data:
     56                 row = []
     57                 for field in schema['fields']:

/home/u1/.virtualenvs/pandas/lib/python2.7/site-packages/datapackage/resource.pyc in _iter_from_tabulator(self, table, schema)
    330                     except JsonTableSchemaException as exception:
    331                         message = 'Cannot cast %r for <%s>' % (value, field.name)
--> 332                         six.raise_from(ValueError(message), exception)
    333             yield keyed_row
    334 

/home/u1/.virtualenvs/pandas/lib/python2.7/site-packages/six.pyc in raise_from(value, from_value)
    716 else:
    717     def raise_from(value, from_value):
--> 718         raise value
    719 
    720 

ValueError: Cannot cast u'1968' for <Year>
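
The traceback ends in `six.raise_from`, which on Python 2 simply re-raises the new `ValueError` (see the `raise value` frame), so the underlying `JsonTableSchemaException` is not shown. Below is a minimal sketch of the same wrapping pattern on Python 3, where the original error stays reachable through `__cause__`. `SchemaCastError`, `cast_year`, and `read_value` are hypothetical stand-ins for illustration, not the libraries' real classes or functions.

```python
class SchemaCastError(Exception):
    """Hypothetical stand-in for the library's JsonTableSchemaException."""


def cast_year(value):
    # Stand-in cast that always rejects the value, to trigger the wrapping below.
    raise SchemaCastError('value %r does not match the declared format' % value)


def read_value(value, field_name):
    try:
        return cast_year(value)
    except SchemaCastError as exception:
        message = 'Cannot cast %r for <%s>' % (value, field_name)
        # Python 3 spelling of six.raise_from(ValueError(message), exception).
        raise ValueError(message) from exception


try:
    read_value('1968', 'Year')
except ValueError as error:
    wrapped_message = str(error)
    cause = error.__cause__  # the original exception, hidden on Python 2
    print(wrapped_message)
    print(repr(cause))
```

Inspecting `__cause__` this way (or reading the chained traceback Python 3 prints) may show why the cast was rejected, which the Python 2 output above cannot.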
Author

Fak3 commented Jan 12, 2017

Importing the country-list datapackage worked just fine.

Fak3 changed the title from "Importing core\gdp results in ValueError: Cannot cast u'1968' for <Year>" to "Reading core\gdp datapackage results in ValueError: Cannot cast u'1968' for <Year>" on Jan 12, 2017
Author

Fak3 commented Jan 12, 2017

Same error with airport-codes:

In [6]: data_url = 'http://data.okfn.org/data/core/airport-codes/datapackage.json'

In [7]: storage = datapackage.push_datapackage(data_url, 'pandas')
---------------------------------------------------------------------------
ValueError                                Traceback (most recent call last)
<ipython-input-7-e3838b8c6b1f> in <module>()
----> 1 storage = datapackage.push_datapackage(data_url, 'pandas')

/home/u1/.virtualenvs/pandas/lib/python2.7/site-packages/datapackage/pushpull.pyc in push_datapackage(descriptor, backend, **backend_options)
     74     for table in storage.buckets:
     75         if table in datamap:
---> 76             storage.write(table, datamap[table])
     77     return storage
     78 

/home/u1/.virtualenvs/pandas/lib/python2.7/site-packages/jsontableschema_pandas/storage.pyc in write(self, bucket, rows)
    136         # Prepare
    137         descriptor = self.describe(bucket)
--> 138         new_data_frame = mappers.descriptor_and_rows_to_dataframe(descriptor, rows)
    139 
    140         # Just set new DataFrame if current is empty

/home/u1/.virtualenvs/pandas/lib/python2.7/site-packages/jsontableschema_pandas/mappers.pyc in descriptor_and_rows_to_dataframe(descriptor, rows)
     31     index_rows = []
     32     jtstypes_map = {}
---> 33     for row in rows:
     34         values = []
     35         index = None

/home/u1/.virtualenvs/pandas/lib/python2.7/site-packages/datapackage/pushpull.pyc in values(schema, data)
     53         # TODO: review
     54         def values(schema, data):
---> 55             for item in data:
     56                 row = []
     57                 for field in schema['fields']:

/home/u1/.virtualenvs/pandas/lib/python2.7/site-packages/datapackage/resource.pyc in _iter_from_tabulator(self, table, schema)
    330                     except JsonTableSchemaException as exception:
    331                         message = 'Cannot cast %r for <%s>' % (value, field.name)
--> 332                         six.raise_from(ValueError(message), exception)
    333             yield keyed_row
    334 

/home/u1/.virtualenvs/pandas/lib/python2.7/site-packages/six.pyc in raise_from(value, from_value)
    716 else:
    717     def raise_from(value, from_value):
--> 718         raise value
    719 
    720 

ValueError: Cannot cast u'40.07080078' for <latitude_deg>

Author

Fak3 commented Jan 12, 2017

Same error on Python 3.

rufuspollock changed the title from "Reading core\gdp datapackage results in ValueError: Cannot cast u'1968' for <Year>" to "Reading gdp datapackage results in ValueError: Cannot cast u'1968' for <Year>" on Jan 13, 2017
roll self-assigned this on Jan 13, 2017
Member

roll commented Jan 13, 2017

Is this field format valid according to the spec?

name: "Year",
type: "date",
format: "yyyy"
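
For context, here is a minimal sketch of how a descriptor could be screened for this kind of field before loading. It assumes, from my reading of the JSON Table Schema spec at the time, that a `date` field's `format` may only be `default`, `any`, or `fmt:PATTERN` (a strptime-style pattern). `find_suspect_date_formats` is a hypothetical helper, not part of datapackage or jsontableschema.

```python
def find_suspect_date_formats(schema):
    """Return (field name, format) pairs for date fields with an unrecognised format.

    Assumption: the only accepted spellings are 'default', 'any' and 'fmt:PATTERN'.
    """
    suspects = []
    for field in schema.get('fields', []):
        if field.get('type') != 'date':
            continue
        fmt = field.get('format', 'default')
        if fmt in ('default', 'any') or fmt.startswith('fmt:'):
            continue
        suspects.append((field.get('name'), fmt))
    return suspects


# The field quoted above, written out as a Python dict.
gdp_schema = {'fields': [{'name': 'Year', 'type': 'date', 'format': 'yyyy'}]}
suspects = find_suspect_date_formats(gdp_schema)
print(suspects)  # [('Year', 'yyyy')]
```

Under that assumption, `yyyy` is flagged because it is neither a keyword nor a `fmt:`-prefixed pattern.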

Member

roll commented Jan 13, 2017

$ goodtables datapackage http://data.okfn.org/data/core/gdp/datapackage.json
DATASET
=======
{'error-count': 999, 'table-count': 1, 'time': 2.588, 'valid': False}

TABLE [1]
=========
{'datapackage': 'http://data.okfn.org/data/core/gdp/datapackage.json',
 'error-count': 999,
 'headers': ['Country Name', 'Country Code', 'Year', 'Value'],
 'row-count': 1000,
 'source': 'https://raw.githubusercontent.com/datasets/gdp/master/data/gdp.csv',
 'time': 1.948,
 'valid': False}
---------
[2,3] [non-castable-value] Row 2 has non castable value in column 3 (type: date, format: yyyy)
[3,3] [non-castable-value] Row 3 has non castable value in column 3 (type: date, format: yyyy)
...
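
The `non-castable-value` entries above can be imitated with the standard library. This is a rough sketch, not goodtables' actual implementation: `check_date_column` is a hypothetical helper, and the sample rows are made up apart from the year `1968` taken from the error message. It only shows that a year-only string fails an ISO 8601 date pattern and parses under a year-only pattern.

```python
from datetime import datetime


def check_date_column(rows, column, pattern):
    """Return (row number, column number) for values that strptime cannot parse.

    Numbering is 1-based and counts the header as row 1, mirroring the report above.
    """
    errors = []
    for offset, row in enumerate(rows):
        try:
            datetime.strptime(row[column - 1], pattern)
        except ValueError:
            errors.append((offset + 2, column))
    return errors


# Made-up sample rows in the gdp.csv column order; only '1968' comes from the error.
rows = [
    ['Exampleland', 'EXL', '1968', '1.0'],
    ['Exampleland', 'EXL', '1969', '2.0'],
]

iso_errors = check_date_column(rows, 3, '%Y-%m-%d')  # ISO 8601 date pattern
year_errors = check_date_column(rows, 3, '%Y')       # year-only pattern
print(iso_errors)   # [(2, 3), (3, 3)]
print(year_errors)  # []
```

Whether the dataset should declare a strptime-style year pattern or a different field type is a decision for the dataset maintainers; the sketch only illustrates why every row is reported.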

roll added the [0.25d] label on Jan 13, 2017
Member

pwalsh commented Jan 13, 2017

Looks like the datapackage is invalid with respect to the spec.

roll added (py) and removed (py) labels on Jan 16, 2017
Author

Fak3 commented Jan 16, 2017

Should the bug be moved to the https://github.com/datasets/gdp repo?

Member

roll commented Jan 16, 2017

MOVED
datasets/gdp#6

roll closed this as completed on Jan 16, 2017
roll added duplicate and removed current labels on Jan 16, 2017