DataFrame not defined error when including interval keyword argument #159

Closed
mthelm85 opened this issue Oct 4, 2019 · 2 comments · Fixed by #160

Comments

mthelm85 commented Oct 4, 2019

I'm getting a DataFrame not defined error when attempting to use the interval keyword argument in the predict function.

Code to reproduce:

using DataFrames, GLM

data = DataFrame(X=[1,2,3], Y=[2,4,7])
ols = lm(@formula(Y ~ X), data)
new_data = DataFrame(X=[5])

# Without the interval kwarg

julia> predict(ols, new_data)
1-element Array{Union{Missing, Float64},1}:
 11.833333333333343

# With the interval kwarg

julia> predict(ols, new_data, interval=:prediction)
ERROR: UndefVarError: DataFrame not defined
Stacktrace:
 [1] _return_predictions(::NamedTuple{(:prediction, :lower, :upper),Tuple{Array{Float64,1},Array{Float64,2},Array{Float64,2}}}, ::BitArray{1}, ::Int64) at C:\Users\mthel\.julia\packages\StatsModels\Kz7By\src\statsmodel.jl:153
 [2] #predict#74(::Base.Iterators.Pairs{Symbol,Symbol,Tuple{Symbol},NamedTuple{(:interval,),Tuple{Symbol}}}, ::typeof(predict), ::StatsModels.TableRegressionModel{LinearModel{GLM.LmResp{Array{Float64,1}},GLM.DensePredChol{Float64,LinearAlgebra.Cholesky{Float64,Array{Float64,2}}}},Array{Float64,2}}, ::DataFrame) at C:\Users\mthel\.julia\packages\StatsModels\Kz7By\src\statsmodel.jl:166
 [3] (::getfield(StatsBase, Symbol("#kw##predict")))(::NamedTuple{(:interval,),Tuple{Symbol}}, ::typeof(predict), ::StatsModels.TableRegressionModel{LinearModel{GLM.LmResp{Array{Float64,1}},GLM.DensePredChol{Float64,LinearAlgebra.Cholesky{Float64,Array{Float64,2}}}},Array{Float64,2}}, ::DataFrame) at .\none:0
 [4] top-level scope at REPL[11]:1

nalimilan commented Oct 4, 2019

Good catch. @kleinschmidt you thought StatsModels didn't depend on DataFrames anymore, but you cheated!

It shouldn't be too hard to fix by using Tables.materializer on the input data to return the same type of table object, without depending on DataFrames. We would return a NamedTuple when a matrix is passed. I can fix this when getting rid of TableRegressionModel (#32), but we may want to apply a simpler fix before that.
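
As a rough illustration of the Tables.materializer idea, here is a minimal sketch. It is not the StatsModels implementation: `wrap_like` is a hypothetical helper name, the numbers are placeholders rather than real predictions, and handling of missing rows is left out. It assumes only that Tables.jl is installed.

```julia
using Tables

# Hypothetical helper (not part of StatsModels): return the prediction
# columns as the same kind of table as `source`, using only Tables.jl.
function wrap_like(source, prediction, lower, upper)
    cols = (prediction = prediction, lower = lower, upper = upper)
    # Tables.materializer(source) returns a function that builds a table
    # of the source's kind from any Tables.jl-compatible input.
    return Tables.materializer(source)(cols)
end

# A NamedTuple of vectors is itself a column table, so this runs
# without loading DataFrames. The values below are placeholders.
new_data = (X = [5],)
result = wrap_like(new_data, [1.0], [0.5], [1.5])
```

With a NamedTuple input the result is again a NamedTuple of columns. If DataFrames is loaded and a DataFrame is passed in, the materializer should give back a DataFrame, which is what would let the package drop the hard dependency.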

The tests probably don't catch it because they load DataFrames. Designing a test that would catch the error sounds hard.

@nalimilan nalimilan transferred this issue from JuliaStats/GLM.jl Oct 4, 2019
nalimilan commented

See #160 for the fix, and JuliaStats/GLM.jl#335 for new tests.
