Dashes in category/file names make retrieval difficult #11

Open
serin-delaunay opened this issue Nov 26, 2016 · 5 comments

Comments

@serin-delaunay

serin-delaunay commented Nov 26, 2016

At the moment there are categories in corpora like "film-tv" and files like "materials/abridged-body-fluids" which cannot be accessed using the standard syntax of pycorpora.category_name.file_name['key'], because - is not a legal character in Python identifiers.
I can work around this as follows:
getattr(pycorpora, 'film-tv').tv_shows['tv_shows']
pycorpora.materials.get_file('abridged-body-fluids')['abridged body fluids']
However, this isn't ideal. Either pycorpora should perform these workarounds internally (translating - to _, for instance), or corpora should restrict category and file names to identifiers that are valid in, for example, JS, Python, and C.
I've opened a similar issue in corpora: dariusk/corpora#236.
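
To make the "translate - to _" idea concrete, here is a minimal sketch of how attribute access could fall back to dashed names. It is not pycorpora's actual implementation: the `Category` class, the `build_demo_tree` helper, and the sample JSON contents are all hypothetical and exist only for illustration.

```python
import json
import tempfile
from pathlib import Path


class Category:
    """Hypothetical stand-in for a corpora category (a directory of JSON files)."""

    def __init__(self, path):
        self._path = Path(path)

    def _resolve(self, name):
        # Try the name as given first, then with underscores turned into dashes.
        for candidate in (name, name.replace("_", "-")):
            subdir = self._path / candidate
            if subdir.is_dir():
                return Category(subdir)
            json_file = self._path / (candidate + ".json")
            if json_file.is_file():
                with json_file.open(encoding="utf-8") as fh:
                    return json.load(fh)
        return None

    def __getattr__(self, name):
        # Only called when normal attribute lookup has already failed.
        if name.startswith("_"):
            raise AttributeError(name)
        found = self._resolve(name)
        if found is None:
            raise AttributeError(name)
        return found


def build_demo_tree(root):
    """Create a tiny made-up data tree that mimics the dashed names above."""
    film_tv = Path(root) / "film-tv"
    film_tv.mkdir()
    (film_tv / "tv_shows.json").write_text(
        json.dumps({"tv_shows": ["Example Show"]}), encoding="utf-8"
    )
    materials = Path(root) / "materials"
    materials.mkdir()
    (materials / "abridged-body-fluids.json").write_text(
        json.dumps({"abridged body fluids": ["example fluid"]}), encoding="utf-8"
    )


if __name__ == "__main__":
    with tempfile.TemporaryDirectory() as root:
        build_demo_tree(root)
        corpora = Category(root)
        # Dashed names become reachable through underscore attributes.
        print(corpora.film_tv.tv_shows["tv_shows"])
        print(corpora.materials.abridged_body_fluids["abridged body fluids"])
```

One limitation of this sketch: a name that mixes underscores and dashes on disk (say, a hypothetical `foo_bar-baz`) would not be found, since only the all-dashes variant is tried.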

@hugovk
Contributor

hugovk commented Nov 26, 2016

Generally speaking, it'd be good to have corpora all nice and consistent, but a great thing about that project is it gets contributions from people who aren't familiar with Git in the first place, which is already quite a hurdle.

So it's probably better to have this tool deal with it.

(It might be an idea to have a guideline to avoid dashes over in corpora. It may be worth converting existing filenames, but then it may break code already using them. And both those things are unnecessary if these tools deal with it.)

@aparrish
Owner

I merged a fix for this in #9 a few weeks ago, actually. It just hasn't made it to PyPI yet. For now, you can take advantage of the fix by installing directly from GitHub. I'll leave this open until I have a chance to make a new release, and close it when the fix is generally available.

@dariusk

dariusk commented Sep 25, 2017

@aparrish Is this on PyPI now? I took a look at it but it seems like it's still at 0.1.2, which is from before the change you referenced. But also I'm not 100% sure how to read the versioning versus the commit log.

@aparrish
Owner

@dariusk not on PyPI yet, unfortunately. I'd sorta been waiting until I'd found a good fix for #8 before pushing another PyPI release. :(

@dariusk

dariusk commented Sep 25, 2017

Ok!
