
Support for Ridge Regression #205

Open
Nosferican opened this issue Dec 14, 2017 · 9 comments

@Nosferican
Contributor

Allow passing a positive semi-definite matrix Γ so that the coefficient estimate becomes

β = inv(X'W * X + Γ) * X'W * y
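
A minimal sketch of that estimator in Julia, assuming `X`, `W`, `y`, and `Γ` are already formed (`ridge_coef` is a hypothetical name, not part of this package's API), and solving the system instead of calling `inv` explicitly:

```julia
using LinearAlgebra

# Generalized (Tikhonov) ridge estimate: β = (X'W X + Γ) \ (X'W y).
# Backslash-solving is cheaper and more stable than forming inv().
function ridge_coef(X::AbstractMatrix, W::AbstractMatrix,
                    y::AbstractVector, Γ::AbstractMatrix)
    XtW = X' * W
    return Symmetric(XtW * X + Γ) \ (XtW * y)
end
```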
@nalimilan
Member

Can you elaborate on the concrete changes you're suggesting? Shouldn't/couldn't this be implemented in a separate package?

@ararslan
Member

I think there already is a package with ridge regression, though I can't recall offhand which one it is

@Nosferican
Contributor Author

There is this implementation in MultivariateStats.jl, but it only covers linear models, not GLMs. I haven't seen any implementation of ridge logistic regression, for example.
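
For reference, a usage sketch of that linear-only `ridge` function (hedged from memory; check the MultivariateStats.jl docs for the exact keyword arguments):

```julia
using MultivariateStats

# ridge(X, y, r) fits linear ridge regression; r may be a scalar,
# a per-coefficient vector, or a full penalty matrix.
X = randn(100, 3)
y = X * [1.0, -2.0, 0.5] .+ 0.1 .* randn(100)
β = ridge(X, y, 0.1)  # coefficients, with the bias term appended by default
```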

@dmbates
Contributor

dmbates commented Dec 14, 2017

I'm not sure you would want that particular formula for the GLM case. As the name implies, the Iteratively Reweighted Least Squares (IRLS) algorithm iterates on the W matrix. Would a fixed value of Γ make sense?
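
For what it's worth, a fixed Γ does slot into IRLS naturally: the working weights W are re-computed every iteration while Γ stays constant. A hedged sketch for the logistic case (hypothetical helper, not this package's internals):

```julia
using LinearAlgebra

function irls_ridge_logit(X, y, Γ; maxiter = 25, tol = 1e-8)
    β = zeros(size(X, 2))
    for _ in 1:maxiter
        η = X * β
        μ = 1 ./ (1 .+ exp.(-η))               # logistic mean
        w = max.(μ .* (1 .- μ), 1e-10)         # working weights, updated each pass
        z = η .+ (y .- μ) ./ w                 # working response
        XtW = X' * Diagonal(w)
        β_new = Symmetric(XtW * X + Γ) \ (XtW * z)  # Γ held fixed
        norm(β_new - β) < tol && return β_new
        β = β_new
    end
    return β
end
```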

@Nosferican
Contributor Author

Nosferican commented Dec 16, 2017

@dmbates I think this article might have the answer (if not, its references or reaching out to the authors might). Wikipedia also has the formula used for ridge Poisson regression.

This implementation has IRLS with a ridge penalty for logistic regression.
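
If I read those formulas right, the same fixed-Γ update carries over to Poisson with a log link; only the link-specific pieces change (a sketch mirroring the logistic helper above):

```julia
# Poisson / log-link pieces for the penalized-IRLS sketch above:
poisson_μ(η) = exp.(η)                   # inverse link
poisson_w(μ) = μ                         # Var(y) = μ and dμ/dη = μ, so w = μ
poisson_z(η, y, μ) = η .+ (y .- μ) ./ μ  # working response
```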

@Nosferican
Contributor Author

Ideally, rather than implementing just ridge, implementing Bayesian GLMs would provide these features and more.

@hung-q-ngo
Contributor

Would support for logistic regression with L1 regularization be part of this issue (a special case of Bayesian GLM), or should it be a separate issue? (Here's an algorithm.)
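
For context, L1 doesn't admit the closed-form ridge update, but a proximal-gradient (ISTA-style) sketch for L1-penalized logistic regression is short (hypothetical names; the constant step size is deliberately naive):

```julia
using LinearAlgebra

# Soft-thresholding: the proximal operator of t * λ * ‖β‖₁.
soft_threshold(x, t) = sign(x) * max(abs(x) - t, 0.0)

function lasso_logit(X, y, λ; step = 1e-3, maxiter = 5_000)
    β = zeros(size(X, 2))
    for _ in 1:maxiter
        μ = 1 ./ (1 .+ exp.(-(X * β)))
        ∇ = X' * (μ .- y)            # gradient of the negative log-likelihood
        β = soft_threshold.(β .- step .* ∇, step * λ)
    end
    return β
end
```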

@hung-q-ngo
Contributor

Perhaps instead of going full-blown Bayesian, support for elastic net regularization would be great. The set of features included in H2O.ai's GLM implementation seems practical.
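
For concreteness, elastic net just mixes the two penalties, recovering ridge at α = 0 and the lasso at α = 1 (a one-line sketch):

```julia
# Elastic net penalty: λ * (α‖β‖₁ + (1 − α)/2 * ‖β‖₂²)
elastic_net_penalty(β, λ, α) = λ * (α * sum(abs, β) + (1 - α) / 2 * sum(abs2, β))
```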

@Nosferican
Contributor Author

Regularizations such as ridge and LASSO are special cases of maximum a posteriori (MAP) estimation, which in the special case of uninformative priors reduces to maximum likelihood estimation (MLE). I do have plans to develop the MAP framework eventually, as I believe most of the tools are available. However, it still requires a bit more work to generalize (e.g., using MCMC).

Ridge in particular is a strange mutant, as it requires standardizing the linear predictor, assuming normality, and a standard normal prior for all coefficients. Still, it has the nice property of being scale-free. LASSO can be used with specialized solvers, but the general cases do require some work (e.g., group LASSO and all its variants). I'm not quite sold on elastic net, but it is doable with the MAP framework.

MAP is still a quasi-Bayesian approach, in that it doesn't estimate the whole posterior distribution, but it is computationally feasible and gives you the essential aspects of Bayesian inference. After basic support for most of regression analysis lands, the other applications can be implemented as interest builds for each.
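
To make the MAP connection concrete: under a β ~ N(0, τ²I) prior, the negative log-posterior is the negative log-likelihood plus a ridge term, i.e. Γ = I/τ² in the estimator above, and a flat prior (τ² → ∞) recovers the MLE. A sketch:

```julia
# Ridge as MAP (sketch): negloglik is any GLM negative log-likelihood.
neg_log_posterior(β, negloglik, τ2) = negloglik(β) + sum(abs2, β) / (2 * τ2)
```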
