ENH: More features for ReceptiveField #3884
Hi Eric, would here be the place to suggest another addition to ReceptiveField? I've been using it a bit lately, especially for backward models (i.e. stimulus reconstruction), and there's something that (I think) is missing that would be useful. You may have seen this paper by Haufe et al. (2014), where they show how to forward-transform a backward model (for which the model weights are not easily interpretable). Ed Lalor's mTRF toolbox does this (see fig. 4E here), but unless I'm mistaken there's no quick way to do this directly in MNE/sklearn with the current code? PS: pinging @choldgraf as well since we've briefly discussed this on gitter |
it's just a matter of multiplying the coef_ by the covariance of the design:
patterns = np.dot(X.T.dot(X), coef_)
maybe you can give it a try? @larsoner and I will offer feedback for sure.
|
I agree it doesn't seem that hard to implement. The only issue I foresee is how to deal with the intercept before computing the covariance. |
You may also need to invert the covariance of Y, i.e. Sx W Sy^-1 (covariance of X, times the weights, times the pseudo-inverse of the covariance of Y):
np.cov(X.T).dot(coef_.T.dot(np.linalg.pinv(np.cov((Y - Y.mean(0)).T)))).T
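A self-contained sketch of that computation (my own toy example using a plain scikit-learn Ridge rather than ReceptiveField; the data shapes and centering are illustrative assumptions, and it simply mirrors the one-liner above):

import numpy as np
from sklearn.linear_model import Ridge

rng = np.random.RandomState(0)
X = rng.randn(1000, 20)                        # n_samples x n_features (e.g. lagged stimulus)
Y = X @ rng.randn(20, 3) + rng.randn(1000, 3)  # n_samples x n_outputs
W = Ridge(alpha=1.0).fit(X, Y).coef_           # n_outputs x n_features (the "filters")

# Haufe et al. (2014): patterns = cov(X) @ W.T @ pinv(cov(Y)), transposed back
# so that the result has the same shape as coef_.
Xc, Yc = X - X.mean(0), Y - Y.mean(0)
patterns = (np.cov(Xc.T) @ W.T @ np.linalg.pinv(np.cov(Yc.T))).T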
|
thanks for giving it a try
|
+1 for having that feature computed upon request. |
@nbara would you like to give it a shot? Feel free to ping me for a review if you like. Quick questions:
|
We should store the
We should just compute it when doing fit(...). It will be a very small cost relative to the fitting operation, I think. |
Yeah I think so too. Can we think of edge cases where we'd regret this? E.g. gigantic X matrices? Could be a silent memory issue if the object is suddenly storing an N x N matrix if N is really large... |
I would recommend adding the feature, and making it optional once we identify a clear use case where memory is a problem.
|
The covariance (what we'd need to store) is just |
that's fine w/ me - just bringing it up as worth thinking about. My concern is more that users won't necessarily realize this is happening, so it's silently storing an N x N matrix in each model object... what if somebody doesn't know about this and is doing cross-validation with 100s of models, storing the object each time? I think it'd be fine to solve this with a good mention in the docstring for now, though. |
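To put a rough number on that concern (back-of-the-envelope figures of my own, not from the thread), the matrix in question for a lagged design would be on the order of (n_features * n_delays) squared:

# Illustrative only: memory needed to store the design covariance as float64.
n_features, n_delays = 64, 200        # e.g. 64 channels, 200 time lags
n = n_features * n_delays             # 12,800 columns in the lagged design
print(n * n * 8 / 1e9)                # ~1.3 GB per stored model object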
Actually no... from what I can tell, if one chooses ... I've just taken a stab at this. Here's a gist with a function that does the job (not fully tested yet, but it seems to give sensible weights). It's not lightweight at all, since I've got to ... PS: Feel free to comment on the gist if you don't want to bloat this thread. |
It probably does not need to be reshaped. See how ... The best thing to do is probably to open a WIP PR, rather than a gist. Let's get something that works properly with unit tests, then worry about addressing the bottlenecks. |
any movement on this? @nbara wanna give a shot at a PR? |
Yes I'm happy to try, but it's likely that I'll be needing some help at some point, as I'm not sure a) how to make the code work for TimeDelayingRidge and b) what a proper unit test for this would be. |
Happy to help out :-)
|
Ok, starting to work on it now. Are we sure we want to have this computed by default in fit()? |
sounds good - I think it would need to be computed in fit. |
I guess another option would be to have a special function for this, e.g.,:
|
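The concrete example appears to have been lost above, so purely as an illustration (the function name and signature are hypothetical, not an API anyone in the thread proposed), such a standalone helper might just wrap the formula from earlier in the thread, with centering as one simple way of dealing with the intercept:

import numpy as np

def coef_to_patterns(coef, X, Y):
    # Forward-transform backward-model weights (Haufe et al., 2014).
    # coef is (n_outputs, n_features), as in sklearn's coef_.
    Xc = X - X.mean(axis=0)   # centering sidesteps the intercept issue
    Yc = Y - Y.mean(axis=0)
    patterns = np.cov(Xc.T) @ coef.T @ np.linalg.pinv(np.cov(Yc.T))
    return patterns.T         # same (n_outputs, n_features) shape as coef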
what's the cost compared to estimation of coef_? if it's small let's compute it all the time.
|
Yeah, something like this. Initially I imagined this becoming a class method, such as ...
Depends how well I manage to code this :) |
I don't have the exact use cases and constraints in mind, but just to be sure we are on the same line of thought, here is what we do for classic decoding analyses:

# don't compute patterns if not necessary
clf = Ridge()
clf.fit(X, y)

To get the patterns:

clf = mne.decoding.LinearModel(Ridge())
clf.fit(X, y)
clf.patterns_

To have this in a pipeline:

clf = make_pipeline(StandardScaler(), LinearModel(Ridge()))
clf.fit(X, y)
mne.decoding.get_coef(clf, 'patterns_', inverse_transform=True)

And with MNE wrapper compatibility, e.g.:

clf = mne.decoding.SlidingEstimator(LinearModel(Ridge()))
clf.fit(X, y)
get_coef(clf, 'patterns_')

In other words, we classically don't compute the patterns (cov, etc.) at fit time; if users want the patterns, they use the LinearModel wrapper. It would be great to have an analogous API. |
@kingjr so the idea is that the only thing that |
@choldgraf Just ran a few quick tests. Using your tutorial data, I spend an extra ~1.3 seconds in fit. |
If the current code uses Ridge, the auto- and cross-covariances need to be recalculated, so it will add time. With TDR (or a custom Ridge), where these are already stored, I doubt it will add much time.
|
I am fine exposing a patterns_ attribute in terms of API
|
@agramfort would patterns_ only be for the inverted coefficients? In that case, is the word "patterns" standard in the EEG/MEG decoding world? Just wanna make sure we're being semantically consistent here... |
yes, patterns_ is standard these days. filters_ is semantically the same
as coef_
|
So filters_ would be the same as coef_, and patterns_ would be the forward-transformed coefficients? I'm +1 on calling it patterns_. Can we think of any other situation where we'd want to do this set of operations (storing the covariance of X/y in the call to fit, and then using them to invert the coefficients)? If so, then I think that's a good case to make this another class that can wrap around ... |
whatever works but KISS
|
@choldgraf @nbara I've been thinking about it, and I think it makes more sense to use positive lags in the forward/encoding (stimulus->brain) models than negative lags. For example, when X is speech and y is brain/EEG (which it is in both of our examples), the system is causal: the brain response follows the stimulus. It's not too late to fix this as we haven't released. WDYT? |
The way I think about it is in terms of experiment time: where do those features in X live relative to y? I think the problem is that when we call them "time delays" (or lags), we're not saying "relative to what". Personally, I think it's more intuitive to use negative numbers to represent "things that happen in the past", but if the field has standards around positive numbers, I'm not going to bikeshed this one. I can see a case for both. I don't wanna get into a situation like EEG has, where half of their plots have the y-axis flipped; that is just the dumbest thing ever. :-) |
I think the standard might be to use positive numbers for causal behavior in forward models. Do you agree? I am no expert here. The Lalor paper seems to present their forward model that way, at least. |
It might be that by convention STRF plots are sometimes (probably half the time :) ) shown with negative lags, but in terms of the mathematics of the model being convolution of a signal (speech) with a kernel (STRF) to get an output (brain activation), the lags should be positive I think. |
I am not really interested in what is technically correct, as much as what is intuitively correct to most people (the vast majority of whom don't really know what convolution is mathematically). Re: standards, here's a quick lit summary:
so in short, I think you're right, we should use positive time lags :-) |
(FWIW, my initial intention was to find papers that used negative lags in their plots, but it was actually kind of difficult...including for my own paper...which tells me that I'm probably wrong in my intuition haha) |
Okay, I'll mark this one for 0.15 so we change the tmin/tmax definition, assuming nobody else has objections. I should be able to do it this week. |
+1 for positive lags. If you think of the TRF (no S) when a stimulus is a bunch of clicks (so, x = click train, y = EEG signal), then the brain response should be at positive lags, because a click causes the brain signal to wobble after the click happens. I agree that it gets a little hairier with the STRF, but positive lags still make by far the most sense to me. |
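A toy numpy sketch of that click-train argument (my own illustration, not code from the thread; the kernel shape and click rate are arbitrary): a causal system puts the response after each click, so the recovered TRF lives at positive lags.

import numpy as np

rng = np.random.RandomState(0)
n_times = 1000
x = (rng.rand(n_times) < 0.01).astype(float)    # click train
kernel = np.hanning(20)                         # toy 20-sample brain response (TRF)
y = np.convolve(x, kernel)[:n_times]            # response appears *after* each click

# Recover the TRF by least squares on a design matrix of positive lags 0..19:
lags = np.arange(20)
X = np.column_stack([np.roll(x, lag) for lag in lags])
for lag in lags:
    X[:lag, lag] = 0                            # discard samples that wrapped around
trf_est = np.linalg.lstsq(X, y, rcond=None)[0]  # ≈ kernel, peaking at positive lags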
(And sorry in advance to any early adopters...) |
@larsoner this was a good catch before 0.15...it would have been way more annoying to fix this after users had started incorporating it using the pip version. IMO when it's on |
+1 for positive lags as well |
looks like we're on the same page...in that case: https://youtu.be/U3HMZsDioB8?t=4m17s |
good we don't release often enough :)
|
@nbara are you still using ReceptiveField? |
... oh wait, looks like the |
:) yup! I even added a small section in the tutorial for this. I haven't used RF much lately, but one feature I remember missing is the ability to subsample the lags/delays, or to provide a list of lags instead of tmin/tmax. |
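A sketch of what that request could mean in practice (the dense-grid formula below is my assumption about how delays are derived from tmin/tmax/sfreq, not the actual ReceptiveField internals):

import numpy as np

# Assumed current behaviour: a dense grid of delays implied by tmin, tmax, sfreq.
tmin, tmax, sfreq = 0.0, 0.4, 100.0
dense_delays = np.arange(int(round(tmin * sfreq)), int(round(tmax * sfreq)) + 1) / sfreq

# The requested feature: pass an explicit, possibly subsampled, list of delays
# instead of the full tmin..tmax grid, e.g. keeping only every 4th delay.
sparse_delays = dense_delays[::4]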
- score method for use with GridSearchCV
- Epochs class
- plot method that is smart enough to at least deal with 1D (STA) and 2D (STRF) plotting, probably using NonUniformImage class rather than pcolormesh
- linear_regression_raw, XDawn (categorical data, "Add a bit of support for categorical data to TimeDelayedRidge?" #4940)