
FIX: whitening issue when combining runs #1902

Closed
dengemann opened this issue Mar 24, 2015 · 40 comments

@dengemann
Member

Apparently there is an issue with whitening and covariance computation when multiple SSS-processed runs are combined in which the number of components differs. Say over 10 runs we have a component range of 67-71, and we then compute the covariance on epochs that include data from all these runs. The covariance spectrum indicates a rank of 91, irrespective of the method used (e.g. empirical), but the resulting whitening clearly looks inappropriate. In contrast, when computed run-wise, the rank of the covariance reflects the number of SSS components in that run.

Any intuition? @Eric89GXL @agramfort @mshamalainen @LauriParkkonen
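
For concreteness, a minimal sketch of the rank observation (the names `cov_all` and `cov_run` are hypothetical; they stand for covariances computed over all runs vs. a single run):

```python
import numpy as np

# cov_all: covariance over epochs pooled from all runs
# cov_run: covariance over epochs from one run only (both mne.Covariance)
print(np.linalg.matrix_rank(cov_all.data))  # ~91, unexpectedly high
print(np.linalg.matrix_rank(cov_run.data))  # 67-71, matches that run's SSS rank
```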

@larsoner
Member

I take it the runs were not translated into the same head position?

@agramfort
Member

agramfort commented Mar 24, 2015 via email

@dgwakeman
Contributor

I assume the noise is so bad that SSS is necessary (personally, I avoid it and try to stick with SSPs as much as possible).

@kingjr
Member

kingjr commented Mar 24, 2015

@Eric89GXL @agramfort @dengemann
This seems plausible indeed: I get an issue as soon as I concatenate two runs, even if I estimate the covariance on the trials of the first run (although estimating the covariance on the first run alone is ok). I'll check it out. Thanks!

@kingjr
Member

kingjr commented Mar 24, 2015

Yep, head positions were not translated to a common head position... I just need to find a maxfilter licence now...

Thanks again!

@dengemann dengemann changed the title FIX: whitening / rank issue when combining FIX: whitening issue when combining Apr 16, 2015
@dengemann dengemann changed the title FIX: whitening issue when combining FIX: whitening issue when combining runs Apr 16, 2015
@dengemann
Member Author

We re-analyzed the data after applying the SSS realignment. The issue remains, though. Interestingly, the whitening looks correct if the cov and the evoked are computed on 10% of the trials across all runs. So combining across runs no longer seems to be the issue; using all trials (~1900) is. Any thoughts, ideas, anyone?

@agramfort @Eric89GXL

@kingjr
Member

kingjr commented Apr 16, 2015

@larsoner
Member

Try calculating it on 10% of the trials (190), versus 10 copies of those 10% of trials (total 1900). It's possible this is a numerical precision issue that starts appearing with accumulation or something.
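
A hedged sketch of that comparison (`epochs` is assumed to exist; the tiling goes through EpochsArray to sidestep event bookkeeping):

```python
import numpy as np
import mne

sub = epochs[::10]  # 10% of the trials (~190)
tiled = mne.EpochsArray(np.tile(sub.get_data(), (10, 1, 1)),
                        sub.info, tmin=sub.tmin)  # 10 copies (~1900 trials)
cov_sub = mne.compute_covariance(sub, tmax=0.)      # 190 trials
cov_tiled = mne.compute_covariance(tiled, tmax=0.)  # 10 x 190 trials
```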

@kingjr
Member

kingjr commented Apr 16, 2015

@Eric89GXL Copying 10% of the trials 10 times looks fine (i.e. it's a similar case to the evoked = epochs[::10].average() scenario)

@larsoner
Member

So that makes it less likely to be an accumulation error... weird.

The eigenvalue plots and covariance plots for the two links above look pixel-for-pixel identical to me, I think there was a plotting error.

@dengemann
Member Author

> The eigenvalue plots and covariance plots for the two links above look pixel-for-pixel identical to me, I think there was a plotting error.

@Eric89GXL this is because it's the same covariance in both cases. The cov seems fine; something remains to be understood about the averaging / whitening.

@larsoner
Member

I'm confused -- I thought that the whitening matrix was entirely determined by the covariance matrix (assuming the selection of channels is identical). If that's true, it implies something about your average actually being different.

@dengemann
Member Author

@Eric89GXL what we have is:

  • covariance computed on all trials
  • applied to 10% of the data --> 'correct'
  • applied to all data --> 'inappropriate'

something is really weird here.

@larsoner
Member

So the whitening matrix is identical in both of those cases, or no? I can't remember if nave shows up in there or not.

@dengemann
Member Author

> So the whitening matrix is identical in both of those cases, or no?

Yes, it should be, but it's scaled by the .nave once you apply it.

@dengemann
Member Author

It's clearly the .nave rescaling. If you force the nave to the 10% value, everything looks correct ...
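
A sketch of that experiment (assuming `epochs` and a noise covariance `cov` already exist); when applied to an evoked, the whitener is effectively scaled by sqrt(nave), which is why forcing nave changes the plot:

```python
evoked = epochs.average()        # nave ~= 1900
evoked.plot_white(cov)           # looks inappropriate

evoked.nave = len(epochs) // 10  # force nave to the 10% value (~190)
evoked.plot_white(cov)           # now looks correct
```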

@larsoner
Member

Wait -- but using 10 copies of 190 trials looks the same as using 190 trials? Using 10 copies of the 190 trials should be the same thing as using 190 trials but bumping up your nave by a factor of 10, right?

@dengemann
Member Author

> Wait -- but using 10 copies of 190 trials looks the same as using 190 trials?

yes

> Using 10 copies of the 190 trials should be the same thing as using 190 trials but bumping up your nave by a factor of 10, right?

yes.

@larsoner
Member

... but plotting 190 trials with nave bumped up by a factor of 10 should look wrong?

@dengemann
Member Author

> ... but plotting 190 trials with nave bumped up by a factor of 10 should look wrong?

It looks wrong ...
It turns out that in the copying experiment we did not update the nave.

@dengemann
Member Author

So, 10 copies of the 10% subset look wrong. So, evidence for accumulation.

@larsoner
Member

Actually, not really. I thought accumulation would be a problem if you were computing covariances; for the averaging here it should be fine (you use the same covariance). If you use 10 copies of the 10%, it should look wrong, because the nave is artificially inflated to 1900. So we're back to where we started, unfortunately.

@dengemann
Member Author

But at least we know that one can reproduce the problem by using copies; it's not about the data itself.

@larsoner
Member

I actually come to the opposite conclusion. Using 10 copies is equivalent to using 190 trials and changing nave to be 10x, which we expect to be wrong. It actually suggests to me that your data for some reason start looking consistent if you have enough trials, which is really weird (or maybe not if your baseline does have some underlying consistent activity in it). It seems to suggest that your data in that window are not effectively modeled as zero-mean Gaussian processes.

@larsoner
Member

Try creating an EpochsArray with 1900 trials of np.random.RandomState(0).randn data and do an experiment; you'll see what I mean.
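
A minimal sketch of that simulation (channel names, counts, and times are illustrative): pure Gaussian noise should whiten to a flat GFP around 1, so any deviation seen with the real data implicates the data rather than the code:

```python
import numpy as np
import mne

n_epochs, n_channels, n_times, sfreq = 1900, 102, 100, 100.
data = np.random.RandomState(0).randn(n_epochs, n_channels, n_times)

info = mne.create_info(['MEG %03d' % ii for ii in range(n_channels)],
                       sfreq, 'mag')
epochs = mne.EpochsArray(data, info, tmin=-0.5)

cov = mne.compute_covariance(epochs, tmax=0.)  # baseline covariance
evoked = epochs.average()                      # nave == 1900
evoked.plot_white(cov)                         # GFP should hover around 1
```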

@larsoner
Member

... in other words, for such simulated data, our code should work correctly. But for your data it does not. Then you can play with the EpochsArray data in different ways (e.g., add some small but non-zero component in the baseline period) and see if you can reproduce this ill behavior.

@dengemann
Member Author

Arff.

@kingjr
Member

kingjr commented Apr 17, 2015

I think @Eric89GXL is right. The data I was playing with were bandpass filtered at 0.75-35 Hz.

I retried with just a lowpass at 35 Hz plus a baseline over the first 200 ms after epoch onset (the stimulus comes 430 ms after epoch onset), and it gets a little better on the 1900-epoch average:

[image: index]
https://cloud.githubusercontent.com/assets/4881164/7197807/a47a54c2-e4e5-11e4-99f6-e5476372a6c8.png

https://dl.dropboxusercontent.com/u/3518236/report_Filter_0_35Hz.html

It's still not perfect though...
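
A sketch of that preprocessing (tmax and the variable names are assumptions; the filter and baseline values are from the comment):

```python
import mne

raw.filter(None, 35.)  # lowpass at 35 Hz, no highpass
epochs = mne.Epochs(raw, events, tmin=-0.43, tmax=0.5,
                    baseline=(-0.43, -0.23))  # first 200 ms of the epoch;
                                              # stimulus onset is t=0
```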

@dengemann
Member Author

The problem with this baselining is that it induces these late, probably spurious, components. I'm not sure this looks so much better; I think the butterfly plot used to look better before. With 1900 trials we might have to live with the fact that we find non-Gaussian processes in the baseline that cannot be whitened away. To be discussed.


@agramfort
Member

agramfort commented Apr 20, 2015 via email

what's going on here?

@dengemann
Member Author

I think we have a "funny" dataset: with ~2000 trials the SNR is so high that extremely weak, sustained, phase- and time-locked components become visible. Not 100% sure this is really the case, but it's currently the best-supported interpretation given the evidence.


@larsoner
Member

Coming back to the original point about combining SSS runs: a similar problem happens with maxwell_filter when doing movement compensation with regularization. There are actually two problems:

  1. There are some edge artifacts at head-position boundaries.

  2. The number of components varies as a function of time.

Point (1) is usually minor, but there are ways to fix it and I plan to do so soon.

Point (2) is harder but seems to be fixed quite nicely by mne.cov.regularize. Things like "shrunk" and "factor_analysis" are not as robust to such data with a time-varying number of components AFAICT.

I am tempted to recommend that people use mne.cov.regularize when dealing with data that have been processed with MNE-Python's movement compensation algorithm. Thoughts @agramfort @dengemann ?
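
A hedged sketch of that pipeline (`raw`, `head_pos`, and `epochs` are assumed to exist):

```python
import mne

# Movement compensation with MNE-Python's Maxwell filtering
raw_sss = mne.preprocessing.maxwell_filter(raw, head_pos=head_pos)

# ... epoching on raw_sss as usual ...

cov = mne.compute_covariance(epochs, tmax=0.)
cov = mne.cov.regularize(cov, epochs.info)  # defaults: mag/grad/eeg = 0.1
```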

@larsoner larsoner self-assigned this Dec 20, 2016
@agramfort
Member

agramfort commented Dec 21, 2016 via email

can you tell us more? what do you get when using auto mode? what values do you use with mne.cov.regularize?

@dengemann
Member Author

> am tempted to recommend that people use mne.cov.regularize when dealing with data that have been processed with MNE-Python's movement compensation algorithm.

Ok, time for action. We should not regress back to mne.cov.regularize. I'm optimistic that we can elaborate a solution to this problem.

> can you tell us more? what do you get when using auto mode? what values do you use with mne.cov.regularize?

yes please!

@larsoner
Member

> what do you get when using auto mode?

For data with time-varying rank or spatial structure, the ECD localization bias is about the same as when using an empirical noise covariance: 2+ cm localization bias on phantom data, versus ~1 mm bias with a regularized cov. I think it's due to small non-zero eigenvalues. In other words, I think the automated methods are producing a "correct" result, in the sense that they rightfully produce some small non-zero eigenvalues corresponding to the chunks of data that are only present part of the time. However, those small eigenvalues effectively break the whitening step. I suspect those methods are meant to be used for data with spatially stationary noise processes, or at least data with constant rank; movement-compensated data with time-varying regularization (or other data with similar time-varying structure) will likely not work.

I can try to simulate some simple data to show this effect.

> what values do you use with mne.cov.regularize?

The defaults. It probably doesn't matter much, so long as the eigenvalues are prevented from getting too small during whitening.
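
An illustrative way to see the failure mode described above (`cov` is an assumed mne.Covariance):

```python
import numpy as np

eigvals = np.linalg.eigvalsh(cov.data)[::-1]  # descending order
print(eigvals[:5])   # large values: the shared signal/noise subspace
print(eigvals[-5:])  # tiny but non-zero: components present only part
                     # of the time; 1/sqrt(eigval) blows these up during
                     # whitening
```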

@dengemann
Member Author

dengemann commented Dec 21, 2016 via email

@larsoner
Member

@dengemann you originally reported the same problem over a year and a half ago... surely you've seen this problem, too?

@dengemann
Member Author

dengemann commented Dec 21, 2016 via email

@kingjr
Member

kingjr commented Dec 21, 2016

(FWIW, and from what I remember, the shrunk covariance solved the problem in most subjects. However, the issue was originally due to distinct dev_head_t values plus a high number of trials.)
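
For reference, the 'shrunk' estimator mentioned here is available through the standard covariance API (a sketch; `epochs` assumed):

```python
import mne

cov = mne.compute_covariance(epochs, tmax=0., method='shrunk')
```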

@larsoner
Member

larsoner commented Apr 4, 2018

Moved to #3985, closing here
