Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

[MRG+1] Make read_eeglab_events public #4213

Merged
merged 11 commits into from
Apr 25, 2017

Conversation

jona-sassenhagen
Copy link
Contributor

@jona-sassenhagen
Copy link
Contributor Author

Is there a way to partially share docs between functions? E.g., read_raw_eeglab calls read_eeglab_events, so they have a lot of the same kwarg docs. This simple PR is 100 LOC, most of which are copied doc strings. @Eric89GXL

@jona-sassenhagen jona-sassenhagen changed the title Make read_eeglab_events public WIP/FIX Make read_eeglab_events public Apr 20, 2017
@jona-sassenhagen jona-sassenhagen self-assigned this Apr 20, 2017
@jona-sassenhagen jona-sassenhagen changed the title WIP/FIX Make read_eeglab_events public WIP Make read_eeglab_events public Apr 20, 2017
@larsoner
Copy link
Member

larsoner commented Apr 20, 2017 via email

@codecov-io
Copy link

codecov-io commented Apr 20, 2017

Codecov Report

Merging #4213 into master will increase coverage by 0.05%.
The diff coverage is 100%.

@@            Coverage Diff             @@
##           master    #4213      +/-   ##
==========================================
+ Coverage   86.18%   86.24%   +0.05%     
==========================================
  Files         356      357       +1     
  Lines       64337    64558     +221     
  Branches     9798     9831      +33     
==========================================
+ Hits        55451    55676     +225     
  Misses       6180     6180              
+ Partials     2706     2702       -4

@jona-sassenhagen jona-sassenhagen changed the title WIP Make read_eeglab_events public MRG Make read_eeglab_events public Apr 20, 2017
@jona-sassenhagen
Copy link
Contributor Author

Ready from my end.

Copy link
Contributor

@jaeilepp jaeilepp left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Otherwise LGTM

"one event code per time point, so some events will be "
"lost. You can use the function `mne.io.read_eeglab_events`"
"to extract the full events array (which can then be "
"passed to e.g. `mne.Epoch`."
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

-> mne.Epochs. And the ending parenthesis can fit to the same line.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

And no reason to put backticks here, this is a warning not something Sphinx will try to render


Parameters
----------
eeg : str | object
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

What does scipy.io.loadmat return? Maybe use that instead of object.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It returns a dict, but really what's expected is the ['EEG'] result of that dict, which will be an np.ndarray (even if it has ndim=0, which is what one of the conditionals is for below)

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

this works only for mat['eeg'] if it contains epochs, right? Or does it work also for raw ... ?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I don't know if it works for epochs. It's intended for raw.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

okay I see. Can we raise an error then if it contains epochs?

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'm just worried that you will get an obscure error if you try to give it mat['eeg'] and it contains epochs. Does it contain the eeg.event field if it contains epochs?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

:3

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

okay cool

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@jasmainak I hope I didn't come across as snarky here, I just really appreciate you having written the EEGLAB importer in the first place :)

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

:) I think I'm not up to date with 21st century smileys yet, but no worries in any case :)

"""Create events array from EEGLAB structure.
def read_eeglab_events(eeg, event_id=None, event_id_func='strip_to_integer',
uint16_codec=None):
r"""Create events array from EEGLAB structure.
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

what does r in the beginning do?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I don't know, but it's everywhere all of a sudden! Some doc thing.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

yes, but do we really need it? I noticed a backslash and I think that was meant to be escaped, no?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

To be honest I put it there because I saw it in the other docstrings. I can remove it, but it's more consistent with the others in that file this way ...

@@ -576,19 +585,74 @@ def __init__(self, input_fname, events=None, event_id=None, tmin=0,
logger.info('Ready.')


def _read_eeglab_events(eeg, event_id=None, event_id_func='strip_to_integer'):
"""Create events array from EEGLAB structure.
def read_eeglab_events(eeg, event_id=None, event_id_func='strip_to_integer',
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

you need to update whats_new as well

Copy link
Member

@larsoner larsoner left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Otherwise LGTM.

@timmidee can you see if it would work for you?

"one event code per time point, so some events will be "
"lost. You can use the function `mne.io.read_eeglab_events`"
"to extract the full events array (which can then be "
"passed to e.g. `mne.Epoch`."
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

And no reason to put backticks here, this is a warning not something Sphinx will try to render


Parameters
----------
eeg : str | object
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It returns a dict, but really what's expected is the ['EEG'] result of that dict, which will be an np.ndarray (even if it has ndim=0, which is what one of the conditionals is for below)

@mmagnuski
Copy link
Member

looks good to me

@jona-sassenhagen
Copy link
Contributor Author

whats_new conflict.

Don't you love rebasing.

@jona-sassenhagen
Copy link
Contributor Author

Otherwise gtg?

@@ -381,6 +381,14 @@ def __init__(self, input_fname, montage, eog=(), event_id=None,

def _create_event_ch(self, events, n_samples=None):
"""Create the event channel."""
if len(set(events[:, 0])) != len(events[:, 0]):
warn("Some events overlap/occur at the same time sample. The "
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

how can this happen?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It's the problem that prompted this fix: see #3938

Copy link
Member

@jasmainak jasmainak Apr 20, 2017

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

okay @timmidee does this solve your problem?

@jasmainak
Copy link
Member

you have my +1 after @timmidee has tested it and confirmed that it works

@jasmainak
Copy link
Member

Thanks a lot @jona-sassenhagen for taking a stab at this!

@jona-sassenhagen
Copy link
Contributor Author

@timmidee do you know how to check out a branch from my repo ..?

@jona-sassenhagen
Copy link
Contributor Author

(@jasmainak for some context, Tim and I sit and work on the same floor.)

@jasmainak
Copy link
Member

(@jasmainak for some context, Tim and I sit and work on the same floor.)

ah okay! just ask him IRL then if it's too much of a hassle to pull the branch :)

@jona-sassenhagen
Copy link
Contributor Author

@timmidee I won't come to the office tomorrow I think, can you perhaps share one of the original files with me?

@larsoner
Copy link
Member

You also need to put this function in doc/python_reference.rst

@jona-sassenhagen
Copy link
Contributor Author

Works on the problematic data by @timmidee (see http://nbviewer.jupyter.org/gist/jona-sassenhagen/eb57f5c5735a838db9fc14a8d9e536be), please merge if green.

@larsoner
Copy link
Member

Comment unaddressed above, please add to python_reference.rst

@jona-sassenhagen
Copy link
Contributor Author

Argh I had forgotten to save that file before git add.

events = _read_eeglab_events(eeg, event_id=event_id,
event_id_func=event_id_func)
events = read_events_eeglab(eeg, event_id=event_id,
event_id_func=event_id_func)
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

do you read all relevant info here? don't you loose str descriptions?

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

basically what if you don't pass event_id in param can you imagine returning an event_id to know what you read ?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'm not sure I understand what you're saying: this one just returns integers ..?

If there are str events and no event_id, the (un-parseable) str events are dropped with a warning.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think you could return a dictionary mapping the string descriptions to the integers, no?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I don't see how that makes sense. The problem is the following:

  • EEGLAB events can be arbitrary strs
  • MNE stim chans or mne.Epochs(..., events, ...) must be int

The user knows what events are in the data, and in case they've forgotten, we print the str events in the warning. What we have to do is, we have to allow users to map strs to ints. So in this one, the workflow goes like this:

raw = mne.io.read_raw_eeglab(fname)
-> Warning: Events like the following will be dropped entirely:
"square", "circle". 100 event codes could not be mapped to integers.
Use the 'event_id' parameter to map such events manually.

event_id = {"square":1, "circle":2}
raw = mne.io.read_raw_eeglab(fname, event_id)
events = mne.find_events(raw)
epochs = mne.Epochs(raw, events, event_id)
for cond in event_id:
    epochs[cond].average().plot(title=cond)

Seems fine to me. Or am I missing what you're going for?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Oh, and any event not caught by event_id_func or event_id is dropped after the warning.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It is predictable indeed, but I was just thinking that it's convenient for the user to get the event_id since the string description already exists. Otherwise, they will have to construct the dictionary manually. But it's okay if you don't have the bandwidth to implement it or you think it's unnecessary.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I don't understand ... the event_id needs to exist anyways? Otherwise the user can't pass it to the function that retrieves the events (read_raw_eeglab or read_events_eeglab).

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I was thinking of an API like this:

raw = mne.io.read_raw_eeglab(fname)
events, event_id = read_events_eeglab(fname)
epochs = mne.Epochs(raw, events, event_id)

not sure if we're talking about the same thing :)

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I don't understand how that'd work. Called without additional parameters, read_raw_eeglab and read_events_eeglab will perform a many-to-one mapping: strip every key of its non-numeric parts, and use that as the value. So it will do stuff like:

'S101' -> 101
'R101' -> 101
'101' -> 101
101 -> 101
'square' -> (dropped)

To recover square, you'd have to use event_id. To have values distinguishing between 'S101' and 'R101', you'd have to use event_id too. So in any unambiguous mapping, the user will have to construct the event_id themselves before calling the function in the first place.

@timmidee
Copy link

Apologies for slowing down the process. I muted the thread to focus on other work over the weekend. Thanks again for picking this up @jona-sassenhagen!

Question I have is whether what the warning says about passing the events from read_events_eeglab() to mne.Epochs() is strictly true. Will mne.Epochs() accept overlap in the events structure?

In the init for BaseEpochs the code reads:

          events = events[selected]
          if len(np.unique(events[:, 0])) != len(events):
               raise RuntimeError('Event time samples were not unique')

This will only produce an error if both of the overlapping events are specified as time-locking events ("selected"), which is certainly not unthinkable in my specific use-case.

I.e. the warning might confuse some users' next actions.

@jona-sassenhagen
Copy link
Contributor Author

raise RuntimeError('Event time samples were not unique')

I didn't actually know about this.

Well, it does make sense, and also this concerns events after parsing event_id.
I guess that means read_eeglab_events should still warn if there are duplicate time points, right?

(Also note the events can also be passed to e.g. linear_regression_raw, which should support the overlap.)

@timmidee
Copy link

Well, it does make sense, and also this concerns events after parsing event_id.
I guess that means read_eeglab_events should still warn if there are duplicate time points, right?

Hmmmm that would seem double at first glance, could there be a case in which read_eeglab_events() would be used without using read_eeglab_events() first? Or perhaps an extra warning from read_eeglab_events() is useful if it not only states that there were duplicate time points but specifically that some functions might not accept that.

@jona-sassenhagen
Copy link
Contributor Author

jona-sassenhagen commented Apr 24, 2017

How about this for a warning when constructing the raw object:

Warning: {n} events will be dropped because they occur on the same time sample as another event. 
`mne.io.Raw` objects store events on an event channel, which cannot represent two events on the 
same sample. You can extract the original event structure using `mne.io.eeglab.read_events_eeglab`. 
Then, you can e.g. subset the extracted events for constructing epochs.

@agramfort
Copy link
Member

agramfort commented Apr 25, 2017 via email

@timmidee
Copy link

How about this for a warning when constructing the raw object:

Better!

@jona-sassenhagen
Copy link
Contributor Author

it seems you know what you're doing :)

I think @jasmainak 's and @agramfort 's idea is something we discussed and rejected when deciding on the API. In case you care, I'm laying out the rationale again:

  • for single subjects, your solution would be more convenient
  • for multiple subjects, there would be a mapping problem. It would not be easy to guarantee the returned event_ids are comparable, or that the same id is mapped to the same condition. A reliable way to map here would essentially be to either 1. create a hashing function taking arbitrary input and encoding it as an integer, so that it is guaranteed the same event_id works for multiple datasets; or 2. creating some magical thing that aligns multiple events and event_ids. These would be possible, but don't seem elegant.
    The way we're doing it here is, we're asking the user to do a bit more work up-front - creating an event_id; but they have to create the event_id anyways for epoching. And this event_id ensures multiple datasets are mappable to each other in a way that's predictable for the user.

I'm not in principle opposed to do a function like events, event_id = read_events_eeglab(fname), but I think it's a worse solution for the much more important multi-subject case.

@agramfort
Copy link
Member

agramfort commented Apr 25, 2017 via email

@agramfort agramfort changed the title MRG Make read_eeglab_events public [MRG+1] Make read_eeglab_events public Apr 25, 2017
@agramfort
Copy link
Member

LGTM MRG+1

@larsoner larsoner merged commit 9f908e7 into mne-tools:master Apr 25, 2017
@larsoner
Copy link
Member

Thanks @jona-sassenhagen

@jona-sassenhagen
Copy link
Contributor Author

Thanks all.

@jona-sassenhagen jona-sassenhagen deleted the eeglab_events_public branch April 25, 2017 13:38
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

8 participants