- 
                Notifications
    You must be signed in to change notification settings 
- Fork 5
Typing #160
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
base: main
Are you sure you want to change the base?
Typing #160
Conversation
For automatic generation we used MonkeyType. We run monkeytype on the pytests and generate the stubs. The information of the stubs was manually slightly simplified and applied. Then minor refactoring was necessary to pass typechecking. https://github.com/Instagram/MonkeyType
| @Trybnetic  @derNarr Finally tests also pass for python3.5 . | 
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Sorry for replying so late. I had some very busy weeks. I really like the changes and they gave me the first good impression how "typed" python code looks.
Additionally, some minor changes that should be merged into the master directly are included in this pull request.
I added some questions for clarification and some comments into the diff. I noted where imho code changes should go directly into master.
I will address the questions from the pull request comment in a different comment.
Before merging:
- the type imports should be discussed
- questions in the pull request description should be discussed
Now some overall observations follow:
Overall, the code get's more specific and less general by typing it. This is also reflected in some of the name changes of variables / labels, e. g. activations -> activation_dict, events -> cues_gen. This feels like the right naming scheme after typing the code. But at the same time it feels slightly unpythonic and reduces the readability for me a bit. I might only need to get used to this kind of written code and I see the benefits of the typing. But still the code looks less clean and more cluttered to me right now. Maybe I simply needs to get used to it.
| _, total, used, *_ = memory[0].split() | ||
| else: | ||
| total, used = '?', '?' | ||
| osinfo += "{} {}MiB/{}MiB\n".format(identifier, used, total) | 
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This of the memory printing change should go into master as well (independent of the typing). IMHO.
        
          
                pyndl/__init__.py
              
                Outdated
          
        
      |  | ||
| if uname.sysname == "Linux": | ||
| _, *lines = os.popen("free -m").readlines() | ||
| for identifier in ["Mem:", "Swap:"]: | 
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Iterate over Tuple here? i. e. ("Mem:", "Swap:")
        
          
                pyndl/activation.py
              
                Outdated
          
        
      |  | ||
|  | ||
| from numpy import ndarray | ||
| from xarray.core.dataarray import DataArray | 
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This type can also be accessed via xr.core.dataarray.DataArray therefore it would be possible to define the DataArray like:
DataArray = xr.core.dataarray.DataArray
instead of
from xarray.core.dataarray import DataArray
But the proposed version is probably cleaner. I am still a bit undecided. Any opinions on that? (The same applies to ndarray.)
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I agree.
| if text in PUNCTUATION: | ||
| words.append(text) | ||
| else: | ||
| elif text is not None: | 
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Do we have enough test coverage here? Is this the right thing to do? If yes, this should be merged into master as soon as possible as well.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
We agreed that it might be good to raise an exception here when text is None.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Now we raise a exception for null text.
        
          
                pyndl/io.py
              
                Outdated
          
        
      | from typing import Iterator, List, Optional, Tuple, Union, cast | ||
|  | ||
| import pandas as pd | ||
| from pandas.core.frame import DataFrame | 
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
See comment above about how to define / import the DataFrame class / type.
| """ | ||
| parser.addoption("--runslow", action="store_true", | ||
| help="run slow tests") | ||
| default=False, help="run slow tests") | 
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This change should go into master.
|  | ||
|  | ||
| @slow | ||
| @pytest.mark.slow | 
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This change should go into master.
| @@ -1,5 +1,5 @@ | |||
| [tox] | |||
| envlist = py{35,36}-test, checkstyle, documentation | |||
| envlist = py{35,36}-test, checkstyle, checktypes, documentation | |||
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This change should go into master.
| setenv = | ||
| MYPYPATH=./stubs/ | ||
| commands = mypy --ignore-missing-imports pyndl | ||
| ignore_outcome = True | 
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This change should go into master.
|  | ||
|  | ||
| class JobFilter(): | ||
| class JobFilterBase(): | 
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
What is the reasoning behind defining this base class? Is it to define the interface?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Yes. The type checker requires method definitions, but doesn't accept assignment already existing methods.
| Addressing some of the questions posed by David: 
 I like these kind of type aliases as they improve readability imho. 
 IMHO sometimes three layers like  
 If there is a good and nice and comparably fast way to implement the filtering, I would be really happy about any suggestions. For me this was the nicest way of writing it, but I was never really happy about this piece of code. At the same time all the other ways of implementing it -- that I explored -- felt way more obscure. 
 For me this is a mixed bag. On the one hand I would like to have a consistent easy to understand API, on the other hand especially for the events, it is really convenient, to either see them as a pandas.DataFrame, a generator of tuples, or a file path. I would really like to keep the API and even stabilize it in the direction of having these three representations for an event. What are your opinions? 
 I am fine with dropping Python 3.5 support from 2019 onwards. Therefore we can merge and create a  | 
| Union, | ||
| TypeVar, | ||
| Generic | ||
| ) | 
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
see @derNarr's comment above about indentation.
|  | ||
|  | ||
| from numpy import ndarray | ||
| from xarray.core.dataarray import DataArray | 
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
See @derNarr's comments above about how to declare types
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I don't think his suggestion can be applied here:
In the other files, numpy / xarray is already importet.
| @derNarr @Trybnetic thanks for your reviews. I'll try to implement your suggestions when i've some time left for this. @derNarr 
 I'll think about it. But probably refactoring it is beyond the scope of this PR. 
 My suggestion is not to restrict to e.g. dicts. In contrast I would define a Union Type Events which contains all of the types you describe and is accepted everywhere. Numpy does similar, usually they convert them at the beginning of a function to one consistently used type. 
 I fixed the 3.5 issues, so no need to drop 3.5 support just for typechecking. | 
| @derNarr @Trybnetic  Please have a look at the refinements. | 
Fixes #33.
Changes proposed in this pull request:
Please note, that the type annotations which I added are fare from perfect.
Therefore with this PR I would love to hear your ideas and comments to improve them before merging.
There is still a lot of dirty stuff in it (
type: ignore, ...) to get things running.Some things we should think about:
Cue = str?Iterable[Event]vsIterable[Tuple[Cues, Outcomes]]vsIterable[Tuple[List[str],List[str]]] vs ...)As a rule of thumb: We should try to remove almost all
# type: ignoreand Union and minimize the number of TypeVar usages. Then we not only get typing but also nices code + api.Feel free to improve directly in this branch, I probably cannot contribute for the next 20 days.
The PR uses a develop branch as base like suggested with git-flow pattern in #104 .
Best,
David