Improve consistency of model API #280

Open
nalimilan opened this issue Jun 30, 2017 · 4 comments

Comments

@nalimilan
Member

Our current StatisticalModel API is heavily inspired by R, adapted to match Julia rules (plus some improvements), but it is not really consistent (since R isn't either). Below is a summary of the situation. Names are considered OK if they are either a standard abbreviation or spelled out in full (provided the name is not too long).

Names consistent with R and OK: coef, nobs, deviance, confint, fitted, residuals, predict, aic, bic

Names that do not exist in R and are OK: aicc, r2/r², adjr2/adjr², stderr, nulldeviance

Names different from R for good reasons: dof, dof_residual (to avoid two-letter objects, limiting confusion with df = DataFrame(...))

Names not consistent with R for bad reasons, and proposed change:

  • loglikelihood -> loglik (shorter, relatively standard abbreviation, and consistent with R)
  • nullloglikelihood -> null_loglik (shorter and more readable; if changed, nulldeviance should be called null_deviance)

Names consistent with R which should be changed:

  • vcov -> cov
  • model_response -> response ("model" is redundant with multiple dispatch)
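As a rough illustration of the multiple-dispatch argument for dropping the "model" prefix, here is a minimal sketch. The types `ToyModel` and `ToyLinearModel` are hypothetical and exist only for this example; they are not part of StatsBase. The forwarding method shows one possible way to keep the old name working during a transition, not a decided plan.

```julia
# Hypothetical toy types, used only to illustrate the naming argument.
abstract type ToyModel end

struct ToyLinearModel <: ToyModel
    y::Vector{Float64}   # observed response values
end

# The short generic name is enough: dispatch on the argument type
# selects the right method, so a "model_" prefix adds no information.
response(m::ToyLinearModel) = m.y

# The old name kept as a thin forwarding method (assumption: one possible
# transition strategy; a formal deprecation could be used instead).
model_response(m::ToyModel) = response(m)

m = ToyLinearModel([1.0, 2.0, 3.0])
response(m)        # dispatches to the ToyLinearModel method
model_response(m)  # same result through the old name
```

Another package could define `response` for its own types without any clash, which is why the prefix looks redundant.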

Name to add, which does not exist in R: params, param or parameters. The choice depends on whether we accept abbreviations or prefer full names everywhere. Names that might need to change accordingly include coef, loglikelihood/nullloglikelihood, nobs, adjr2/adjr², confint and dof.

@rofinn
Member

rofinn commented Jun 30, 2017

I like all of this except that I'm not a fan of the lik convention in R (maybe like?). Also, seems a little weird to have null_loglik and null_deviance when everything else seems to skip the _. Maybe we could make null a keyword which defaults to false?
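To make the keyword idea concrete, here is a minimal sketch of what a `null` keyword defaulting to `false` might look like. The `ToyFit` type and its fields are hypothetical, and the deviance is computed against a saturated model whose log-likelihood is assumed to be 0, purely to keep the example short. This is not the actual StatsBase implementation.

```julia
# Hypothetical toy model storing precomputed log-likelihoods.
struct ToyFit
    ll::Float64       # log-likelihood of the fitted model
    ll_null::Float64  # log-likelihood of the null (intercept-only) model
end

# One function instead of two: `null=true` selects the null-model value.
loglikelihood(m::ToyFit; null::Bool=false) = null ? m.ll_null : m.ll

# Simplifying assumption: the saturated model has log-likelihood 0,
# so the deviance reduces to -2 times the log-likelihood.
deviance(m::ToyFit; null::Bool=false) = -2 * loglikelihood(m; null=null)

fit = ToyFit(-12.5, -20.0)
loglikelihood(fit)             # fitted model
loglikelihood(fit; null=true)  # null model
deviance(fit; null=true)       # null deviance under the assumption above
```

With this shape, neither `nullloglikelihood` nor `nulldeviance` needs its own name, so the question of where to put the underscore goes away.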

@rofinn
Member

rofinn commented Jun 30, 2017

This is probably a separate issue, but would it make sense to merge the StatisticalModel API into StatsModels.jl?

@nalimilan
Member Author

> I like all of this except that I'm not a fan of the lik convention in R (maybe like?). Also, seems a little weird to have null_loglik and null_deviance when everything else seems to skip the _. Maybe we could make null a keyword which defaults to false?

Yeah. I guess whether to abbreviate loglikelihood depends on what we decide to do for other abbreviations. Regarding the _, I suggested this mainly because nullloglikelihood is really weird, but OTOH it isn't used that frequently. The idea of replacing it with a parameter is certainly worth considering.

> This is probably a separate issue, but would it make sense to merge the StatisticalModel API into StatsModels.jl?

We've discussed this in the past, but I think it's still an open question. I think @kleinschmidt wanted to put only table-related functions in that package, which would mean modeling packages wouldn't even have to depend on it: it would just handle the transformation of formulas + data to a model matrix. Anyway, not really the place to discuss this.

@nalimilan
Member Author

See also JuliaStats/Roadmap.jl#4.
