
FIX: whitening issue when combining runs #1902

Closed
dengemann opened this issue Mar 24, 2015 · 40 comments

@dengemann
Member

Apparently there is an issue with whitening and covariance computation when multiple SSS-processed runs are combined in which the number of components differs. Say over 10 runs we have a component range of 67-71, and we then compute the covariance on epochs that include data from all these runs. The covariance spectrum indicates a rank of 91, irrespective of the method used (e.g. empirical), but the resulting whitening clearly looks inappropriate. In contrast, when computed run-wise, the rank of the covariance reflects the number of SSS components in that run.

Any intuition? @Eric89GXL @agramfort @mshamalainen @LauriParkkonen
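
For concreteness, a minimal sketch of the rank observation (the names `cov_all` and `cov_run` are hypothetical; they stand for covariances computed over all runs vs. a single run):

```python
import numpy as np

# cov_all: covariance over epochs pooled from all runs
# cov_run: covariance over epochs from one run only (both mne.Covariance)
print(np.linalg.matrix_rank(cov_all.data))  # ~91, unexpectedly high
print(np.linalg.matrix_rank(cov_run.data))  # 67-71, matches that run's SSS rank
```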

@larsoner
Member

I take it the runs were not translated into the same head position?

@agramfort
Member

agramfort commented Mar 24, 2015 via email

@dgwakeman
Contributor

I assume the noise is so bad that SSS is necessary (personally, I avoid it and try to stick with SSPs as much as possible).

@kingjr
Member

kingjr commented Mar 24, 2015

@Eric89GXL @agramfort @dengemann
This seems plausible indeed: I get an issue as soon as I concatenate two runs, even if I estimate the covariance on the trials of the first run (although estimating the covariance on the first run alone is ok). I'll check it out. Thanks!

@kingjr
Member

kingjr commented Mar 24, 2015

Yep, head positions were not translated to a common head position... I just need to find a maxfilter licence now...

Thanks again!

@dengemann dengemann changed the title FIX: whitening / rank issue when combining FIX: whitening issue when combining Apr 16, 2015
@dengemann dengemann changed the title FIX: whitening issue when combining FIX: whitening issue when combining runs Apr 16, 2015
@dengemann
Member Author

We re-analyzed the data after applying the SSS realignment. The issue remains, though. Interestingly, the whitening looks correct if the cov and the evoked are computed on 10% of the trials across all runs. So combining across runs no longer seems to be the issue; using all trials (~1900) is. Any thoughts, ideas, anyone?

@agramfort @Eric89GXL

@kingjr
Member

kingjr commented Apr 16, 2015

@larsoner
Member

Try calculating it on 10% of the trials (190), versus 10 copies of those 10% of trials (total 1900). It's possible this is a numerical precision issue that starts appearing with accumulation or something.
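
A hedged sketch of that comparison (`epochs` is assumed to exist; the tiling goes through EpochsArray to sidestep event bookkeeping):

```python
import numpy as np
import mne

sub = epochs[::10]  # 10% of the trials (~190)
tiled = mne.EpochsArray(np.tile(sub.get_data(), (10, 1, 1)),
                        sub.info, tmin=sub.tmin)  # 10 copies (~1900 trials)
cov_sub = mne.compute_covariance(sub, tmax=0.)      # 190 trials
cov_tiled = mne.compute_covariance(tiled, tmax=0.)  # 10 x 190 trials
```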

@kingjr
Member

kingjr commented Apr 16, 2015

@Eric89GXL Copying 10% of the trials 10 times looks fine (i.e. it's a similar case to the evoked = epochs[::10].average() scenario)

@larsoner
Member

So that makes it less likely to be an accumulation error... weird.

The eigenvalue plots and covariance plots for the two links above look pixel-for-pixel identical to me, I think there was a plotting error.

@dengemann
Member Author

> The eigenvalue plots and covariance plots for the two links above look pixel-for-pixel identical to me, I think there was a plotting error.

@Eric89GXL this is because it's the same covariance in both cases. The cov seems fine; something remains to be understood about the averaging / whitening.

@larsoner
Member

I'm confused -- I thought that the whitening matrix was entirely determined by the covariance matrix (assuming the selection of channels is identical). If that's true, it implies something about your average actually being different.

@dengemann
Member Author

@Eric89GXL what we have is:

  • covariance computed on all trials
  • applied to 10% of the data --> 'correct'
  • applied to all data --> 'inappropriate'

something is really weird here.

@larsoner
Member

So the whitening matrix is identical in both of those cases, or no? I can't remember if nave shows up in there or not.

@dengemann
Member Author

> So the whitening matrix is identical in both of those cases, or no?

Yes, it should be, but it's scaled by the .nave once you apply it.

@dengemann
Member Author

It's clearly the .nave rescaling. If you force the nave to the 10% value, everything looks correct ...
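
A sketch of that experiment (assuming `epochs` and a noise covariance `cov` already exist); when applied to an evoked, the whitener is effectively scaled by sqrt(nave), which is why forcing nave changes the plot:

```python
evoked = epochs.average()        # nave ~= 1900
evoked.plot_white(cov)           # looks inappropriate

evoked.nave = len(epochs) // 10  # force nave to the 10% value (~190)
evoked.plot_white(cov)           # now looks correct
```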

@larsoner
Member

Wait -- but using 10 copies of 190 trials looks the same as using 190 trials? Using 10 copies of the 190 trials should be the same thing as using 190 trials but bumping up your nave by a factor of 10, right?

@dengemann
Member Author

> Wait -- but using 10 copies of 190 trials looks the same as using 190 trials?

yes

> Using 10 copies of the 190 trials should be the same thing as using 190 trials but bumping up your nave by a factor of 10, right?

yes.

@larsoner
Member

... but plotting 190 trials with nave bumped up by a factor of 10 should look wrong?

@dengemann
Member Author

> ... but plotting 190 trials with nave bumped up by a factor of 10 should look wrong?

It looks wrong ...
It turns out that in the copying experiment we did not update the nave.

@dengemann
Member Author

So, 10 copies of the 10% subset look wrong. So, evidence for accumulation.

@larsoner
Member

Actually, not really. I thought accumulation would be a problem if you were computing covariances; for the averaging here it should be fine (you use the same covariance). If you use 10 copies of the 10%, it should look wrong, because the nave is artificially inflated to 1900. So we're back to where we started, unfortunately.

@dengemann
Member Author

But at least we know that one can reproduce the problem by using copies; it's not about the data itself.

@larsoner
Member

I actually come to the opposite conclusion. Using 10 copies is equivalent to using 190 trials and changing nave to be 10x, which we expect to be wrong. It actually suggests to me that your data for some reason start looking consistent if you have enough trials, which is really weird (or maybe not if your baseline does have some underlying consistent activity in it). It seems to suggest that your data in that window are not effectively modeled as zero-mean Gaussian processes.

@larsoner
Member

Try creating an EpochsArray with 1900 trials of np.random.RandomState(0).randn data and do an experiment; you'll see what I mean.
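
A minimal sketch of that simulation (channel names, counts, and times are illustrative): pure Gaussian noise should whiten to a flat GFP around 1, so any deviation seen with the real data implicates the data rather than the code:

```python
import numpy as np
import mne

n_epochs, n_channels, n_times, sfreq = 1900, 102, 100, 100.
data = np.random.RandomState(0).randn(n_epochs, n_channels, n_times)

info = mne.create_info(['MEG %03d' % ii for ii in range(n_channels)],
                       sfreq, 'mag')
epochs = mne.EpochsArray(data, info, tmin=-0.5)

cov = mne.compute_covariance(epochs, tmax=0.)  # baseline covariance
evoked = epochs.average()                      # nave == 1900
evoked.plot_white(cov)                         # GFP should hover around 1
```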

@larsoner
Member

... in other words, for such simulated data, our code should work correctly. But for your data it does not. Then you can play with the EpochsArray data in different ways (e.g., add some small but non-zero component in the baseline period) and see if you can reproduce this ill behavior.

@dengemann
Member Author

Arff.

@kingjr
Member

kingjr commented Apr 17, 2015

I think @Eric89GXL is right. The data I was playing with were bandpass filtered at 0.75-35 Hz.

I retried with just a lowpass at 35 Hz plus a baseline over the first 200 ms after epoch onset (the stimulus comes 430 ms after epoch onset), and it gets a little better on the 1900-epoch average:

[image: index]
https://cloud.githubusercontent.com/assets/4881164/7197807/a47a54c2-e4e5-11e4-99f6-e5476372a6c8.png

https://dl.dropboxusercontent.com/u/3518236/report_Filter_0_35Hz.html

It's still not perfect though...
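
A sketch of that preprocessing (tmax and the variable names are assumptions; the filter and baseline values are from the comment):

```python
import mne

raw.filter(None, 35.)  # lowpass at 35 Hz, no highpass
epochs = mne.Epochs(raw, events, tmin=-0.43, tmax=0.5,
                    baseline=(-0.43, -0.23))  # first 200 ms of the epoch;
                                              # stimulus onset is t=0
```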

@dengemann
Member Author

The problem with this baselining is that it induces these late, probably spurious, components. I'm not sure this looks so much better; I think the butterfly plot used to look better before. With 1900 trials we might have to live with the fact that we find non-Gaussian processes in the baseline that cannot be whitened away. To be discussed.


@agramfort
Member

agramfort commented Apr 20, 2015 via email

what's going on here?

@dengemann
Member Author

I think we have a "funny" dataset: with ~2000 trials the SNR is so high that extremely weak, sustained, phase- and time-locked components become visible. Not 100% sure this is really the case, but it's currently the best-supported interpretation given the evidence.


@larsoner
Member

Coming back to the original point about combining SSS runs: a similar problem happens with maxwell_filter when doing movement compensation with regularization. There are actually two problems:

  1. There are some edge artifacts at head-position boundaries.

  2. The number of components varies as a function of time.

Point (1) is usually minor, but there are ways to fix it and I plan to do so soon.

Point (2) is harder but seems to be fixed quite nicely by mne.cov.regularize. Things like "shrunk" and "factor_analysis" are not as robust to such data with a time-varying number of components AFAICT.

I am tempted to recommend that people use mne.cov.regularize when dealing with data that have been processed with MNE-Python's movement compensation algorithm. Thoughts @agramfort @dengemann ?
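
A hedged sketch of that pipeline (`raw`, `head_pos`, and `epochs` are assumed to exist):

```python
import mne

# Movement compensation with MNE-Python's Maxwell filtering
raw_sss = mne.preprocessing.maxwell_filter(raw, head_pos=head_pos)

# ... epoching on raw_sss as usual ...

cov = mne.compute_covariance(epochs, tmax=0.)
cov = mne.cov.regularize(cov, epochs.info)  # defaults: mag/grad/eeg = 0.1
```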

@larsoner larsoner self-assigned this Dec 20, 2016
@agramfort
Member

agramfort commented Dec 21, 2016 via email

can you tell us more? what do you get when using auto mode? what values do you use with mne.cov.regularize?

@dengemann
Member Author

> am tempted to recommend that people use mne.cov.regularize when dealing with data that have been processed with MNE-Python's movement compensation algorithm.

Ok, time for action. We should not regress back to mne.cov.regularize. I'm optimistic that we can elaborate a solution to this problem.

> can you tell us more? what do you get when using auto mode? what values do you use with mne.cov.regularize?

yes please!

@larsoner
Member

> what do you get when using auto mode?

For data with time-varying rank or spatial structure, the ECD localization bias is about the same as when using an empirical noise covariance: 2+ cm localization bias on phantom data, versus ~1 mm bias with a regularized cov. I think it's due to small non-zero eigenvalues. In other words, I think the automated methods are producing a "correct" result, in the sense that they rightfully produce some small non-zero eigenvalues corresponding to the chunks of data that are only present part of the time. However, those small eigenvalues effectively break the whitening step. I suspect those methods are meant to be used for data with spatially stationary noise processes, or at least data with constant rank; movement-compensated data with time-varying regularization (or other data with similar time-varying structure) will likely not work.

I can try to simulate some simple data to show this effect.

> what values do you use with mne.cov.regularize?

The defaults. It probably doesn't matter much, so long as the eigenvalues are prevented from getting too small during whitening.
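
An illustrative way to see the failure mode described above (`cov` is an assumed mne.Covariance):

```python
import numpy as np

eigvals = np.linalg.eigvalsh(cov.data)[::-1]  # descending order
print(eigvals[:5])   # large values: the shared signal/noise subspace
print(eigvals[-5:])  # tiny but non-zero: components present only part
                     # of the time; 1/sqrt(eigval) blows these up during
                     # whitening
```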

@dengemann
Member Author

dengemann commented Dec 21, 2016 via email

@larsoner
Member

@dengemann you originally reported the same problem over a year and a half ago... surely you've seen this problem, too?

@dengemann
Member Author

dengemann commented Dec 21, 2016 via email

@kingjr
Member

kingjr commented Dec 21, 2016

(FWIW, and from what I remember, the shrunk covariance solved the problem in most subjects. However, the issue was originally due to distinct dev_head_t values plus a high number of trials.)
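
For reference, the 'shrunk' estimator mentioned here is available through the standard covariance API (a sketch; `epochs` assumed):

```python
import mne

cov = mne.compute_covariance(epochs, tmax=0., method='shrunk')
```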

@larsoner
Member

larsoner commented Apr 4, 2018

Moved to #3985, closing here
