-
Notifications
You must be signed in to change notification settings - Fork 10
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
clean_integer_fields
cannot convert idx_...
field to integer sometimes
#111
Comments
Sorry for the late response. Thanks for raising the issue! We've had some problems related to this in #69 Generally It's really good that you're raising this because now I found this and this, which look like what we need to solve the missing value problem. I'll look into this and send a PR if it seems to work. As for having multiple values for the same index: as long as there is an array in the column, that column will have the |
Reopening because I reverted merging the PR in #115 because it introduced problems with simple queries like |
Situation:
In some of my TrialData files, there are some trials when there is one index in an
idx_...
field and other trials where there are multiple entries. For example, in my center-out behavior, there is an index for when the monkey starts holding in the center target. But if the monkey leaves the center target, the trial is on pause until he comes back, at which point theidx_ctHoldTime
value for that trial will be something like [10,50].Relatedly, there are some trials in which an event doesn't happen at all, and so I set the value for that
idx_...
field to NaN for that trial.Problem:
Because
clean_integer_fields
(used bymat2dataframe
) assumes a similar format of indices over all trials, it doesn't catch these edge case and can't convert the field to an integer. In the case of the doubled up values, they probably should be converted, and in the case of the NaNs, I suppose they should be set to some other invalid integer.Solution(?):
One solution would have been to re-code my behavior so that when the monkey leaves the center target, it would abort the trial and move on to the next one (which is unfortunately not an option for me). Relatedly, I could reorganize my TrialData structure in MATLAB to add in fake trial aborts and/or only consider the last time (neither is too appealing to me because they wouldn't capture the spirit of what the monkey was doing at the time).
Another solution might be to go trial by trial in
clean_integer_fields
rather than try to convert the whole column at once, but this strikes me as a pretty clunky solution. Still, maybe it's the simplest to implement and easy enough to do? I think the firstif
block of the function code actually does exactly this, so perhaps the second block isn't needed?For the NaN issue, I'm a bit confused. The code seems to treat
np.nan
as the "right" thing to put in for invalid indices, and yet it's not stored as an integer. Does that mean thatidx_...
fields aren't meant to be integers? The way it's currently dealt with seems to be that if the integer conversion doesn't result in the same value, it's left alone (sometimes without a warning because an array of NaNs converted to an array of ints actually doesn't throw an error)--so maybe a solution any of this is unnecessary, but it sits weirdly with me that some of myidx...
fields might be ints and some might not be.Thoughts?
The text was updated successfully, but these errors were encountered: