Additional Methods for RegressionModel #300
Comments
Regarding |
What would the desired behavior of |
Coefnames
I think each package can define the coefnames for its own structs.
If the model does not have the method, I would prefer to leave it unimplemented. If a function wants to use a proxy such as "X1, X2, ...", it can generate it inside the function rather than having it as an implemented method for objects without the information. Alternatively, it could be the default for
Wald Test
I use it during |
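The fallback described above, where the consuming function rather than the model type supplies proxy names, could be sketched as follows. Every name here (NamedFit, BareFit, display_names, and the locally declared coef and coefnames generics) is a hypothetical stand-in for illustration, not StatsBase's actual API.

```julia
# Hypothetical stand-ins for the generic functions discussed in this thread.
function coefnames end
function coef end

# A model type that stores coefficient names.
struct NamedFit
    beta::Vector{Float64}
    names::Vector{String}
end
coef(m::NamedFit) = m.beta
coefnames(m::NamedFit) = m.names

# A model type without name information: coefnames is left unimplemented.
struct BareFit
    beta::Vector{Float64}
end
coef(m::BareFit) = m.beta

# The consumer decides on the fallback, not the model type.
function display_names(m)
    if applicable(coefnames, m)
        return coefnames(m)
    end
    return ["X$i" for i in 1:length(coef(m))]
end
```

With this layout, display_names works for both model types, while calling coefnames on a BareFit still raises a MethodError, so the missing information is never silently invented by the model itself.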
Actually, I think it would be better to return an A model type could still return a plain
I meant, could you show some examples of what the |
The simplest case could be something like:
Given the
More advanced iterations could have additional items such as the restrictions:
It need not be a dictionary, as the results could be a tuple, for example. The tricky part is developing a nice syntax to construct the restriction matrix from some parsable expression such as "X3 = 2.5 * X4 - X2", yielding (assuming X1 - X4 variables)
and checking that it is a valid construct (no redundant restrictions, etc.). For now, I think it would be best to specify joint significance by listing the variables to include and, alternatively, to pass a restriction matrix rather than a list of coefficients for more elaborate linear combinations. |
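To make the restriction-matrix idea concrete: the example restriction X3 = 2.5 * X4 - X2 rearranges to X2 + X3 - 2.5 * X4 = 0. Assuming the coefficient vector is ordered (X1, X2, X3, X4), which is an assumption about the layout, it can be written as a single row of a restriction matrix R with right-hand side r, and the F-form of the Wald statistic follows in the usual way. The symbols R, r, q, and V-hat are notation introduced here for illustration.

$$
R\beta = r, \qquad
R = \begin{bmatrix} 0 & 1 & 1 & -2.5 \end{bmatrix}, \qquad
r = 0
$$

$$
F = \frac{(R\hat\beta - r)^{\top} \left( R \hat V R^{\top} \right)^{-1} (R\hat\beta - r)}{q},
\qquad q = \operatorname{rank}(R)
$$

Here V-hat is the estimated variance-covariance matrix of the coefficients and q is the number of restrictions. Under the null hypothesis, F is compared against an F-distribution with q and the residual degrees of freedom. The "no redundant restrictions" check amounts to requiring that R has full row rank.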
Sorry, I had misread your description, I thought the I think it makes sense to start with a simple implementation, as long as it can be extended later without signature conflicts (e.g. formulas would be natural to use here). Can you have a look at what APIs other statistical software use for this test? Regarding the question of how to store coef names, I've just filed JuliaStats/StatsModels.jl#32, which is related in that it would allow accessing coefficient names from a Wald test function defined in StatsBase. |
Sounds good. I believe a We could probably have an abstract type for tests (I am working on a submodule for Econometrics.jl with tests for heteroscedasticity, multicollinearity, consistency of estimator, etc.). A sub-abstract could manage the Wald test, the likelihood-ratio test and the Lagrange multiplier tests as these three are asymptotically equivalent and tell different aspects of the same question. Stata, for example, implements these methods using the following syntax test: linear combinations,
Likelihood tests are done through lrtest. Matlab computes the Wald test by passing the arguments as matrices (see documentation). I am not entirely sure how the formulae would work most intuitively... I think the easiest would be to have (1) all but the intercept [default], (2) provide a |
Adding to this are two requests stemming from JuliaStats/GLM.jl#259. We should separate the number of rows in the matrix used for estimation from the sum of the weights in the weight vector. Current Julia behavior returns In Stata:
In R the
|
How about
|
That's what Milan suggested earlier, and I think it is the right solution. But it would have to go into an agreed-upon model API. I cross-posted here because this seems like the relevant StatsBase issue. I feel like this needs buy-in from various people who create Thanks. |
I would like to request the following methods:
model_matrix would return the design matrix as a Matrix{Float64}. This would allow various packages to access the field through a standard API instead of hunting down the structure of the structs.
coefnames would return the names of the coefficients as a Vector{Symbol}. It mirrors DataFrames.coefnames(obj::DataFrames.ModelFrame).
wald_test would return the Wald statistic for those restrictions as a Dict{Symbol,Any} with keys: Statistic (value of the test), p_value (p-value based on the F-distribution with the number of restrictions and the residual degrees of freedom), and possibly some way to display the restrictions. It would allow for a standard, automated default method rather than having each package define its own (redundant code). Currently the most useful syntax is to specify which coefficients are jointly significant (different from zero), with the default being all but the intercept, if included. However, to be comprehensive, a different syntax would have to be worked out to allow for any valid linear combination of restrictions (e.g., coef(obj)[1] == coef(obj)[2] and coef(obj)[3] == 2 * coef(obj)[4]). The rdf keyword parameter allows for degrees-of-freedom adjustments commonly used with cluster-robust variance-covariance estimators. For example,
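A minimal sketch of what such a method might compute is given here, under stated assumptions. The function name wald_statistic, its positional arguments, and the returned named tuple are all hypothetical and are not an existing StatsBase API. It takes the coefficients and their variance-covariance matrix directly instead of a model object, and it returns only the statistic and its degrees of freedom.

```julia
using LinearAlgebra

# Hypothetical helper, not an existing StatsBase function.
# Tests H0: R * beta == r and returns the F-form Wald statistic
# together with its numerator and denominator degrees of freedom.
function wald_statistic(beta::AbstractVector, V::AbstractMatrix,
                        R::AbstractMatrix, r::AbstractVector, rdf::Integer)
    q = size(R, 1)                      # number of restrictions
    rank(R) == q || throw(ArgumentError("restrictions are redundant"))
    d = R * beta - r                    # deviation from the null hypothesis
    F = dot(d, (R * V * R') \ d) / q    # quadratic form scaled by q
    return (statistic = F, df_num = q, df_den = rdf)
end

# Joint significance of every coefficient except the intercept (position 1).
beta = [0.5, 1.0, 2.0]
V = Matrix(1.0I, 3, 3)                  # identity matrix, for illustration only
R = [0.0 1.0 0.0;
     0.0 0.0 1.0]
result = wald_statistic(beta, V, R, zeros(2), 10)
```

The p-value is left out to keep the sketch free of package dependencies. With Distributions.jl it could be obtained as ccdf(FDist(result.df_num, result.df_den), result.statistic). Passing rdf explicitly is what lets a caller substitute an adjusted value, such as the one used with cluster-robust estimators.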