
latency regressions #37915

Closed · JeffBezanson opened this issue Oct 7, 2020 · 11 comments
Labels: compiler:latency (Compiler latency), regression (Regression in behavior compared to a previous version)
Comments

@JeffBezanson (Member)

In the past month or so the time to first plot (TTFP) has gotten significantly worse for me: from about 6.2 seconds to ~7.3 seconds. I've identified two main culprits:

  1. Printf rewrite (31d7210)
    TTFP goes from 6.9 seconds to 7.9 seconds. Maybe it needs more precompile statements?

  2. LLVM upgrade (of course)
    Moving to LLVM 10 changed the TTFP from 7 seconds to ~7.4 seconds.
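For context, TTFP-style latency is dominated by first-call compilation: the first call to a method pays inference and codegen, while later calls reuse the cached native code. A Base-only sketch of the effect (`f` here is a made-up stand-in, nothing from Plots or Printf):

```julia
# First call compiles f for Vector{Float64}; the second call reuses the
# cached native code and is orders of magnitude faster.
f(x) = sum(sin, x)

t_first  = @elapsed f(rand(1000))   # includes inference + codegen
t_second = @elapsed f(rand(1000))   # already compiled

t_first > t_second
```

Precompile statements shift part of that first-call cost into package precompilation, which is why adding more of them to Printf might help here.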

@JeffBezanson added the regression (Regression in behavior compared to a previous version) and compiler:latency (Compiler latency) labels on Oct 7, 2020
@vchuravy (Member) commented Oct 7, 2020

> 2. LLVM upgrade (of course)
>    Moving to LLVM 10 changed the TTFP from 7 seconds to ~7.4 seconds.

#37842 should help with that, since many of the performance fixes made after the Rust blog post went into it.

Is there a way we could automatically track latency and compiler performance? Doing it rigorously seems a bit tricky, since it involves restarting Julia (or some other way of wiping the cache). Maybe adding sysimage creation time to BaseBenchmarks could stand in as a proxy.
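One way to make the "restart Julia" part mechanical is to spawn a fresh process per measurement. A sketch, with a hypothetical helper name (`fresh_latency` is made up, not an existing tool):

```julia
# Hypothetical helper: time a workload end to end in a fresh Julia process,
# so that restarting the process is the "cache wipe".
function fresh_latency(code::AbstractString)
    cmd = `$(Base.julia_cmd()) --startup-file=no -e $code`
    @elapsed run(cmd)
end

fresh_latency("1 + 1")   # baseline: process startup plus trivial work
# fresh_latency("using Plots; display(plot(rand(5)))")  # a TTFP-style workload
```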

@KristofferC (Member)

JuliaCI/BaseBenchmarks.jl#252 is one attempt at that.

@KristofferC (Member) commented Oct 7, 2020

> Printf rewrite (31d7210) TTFP goes from 6.9 seconds to 7.9 seconds. Maybe it needs more precompile statements?

Showoff.jl used a lot of the Grisu internals, so moving to the new Printf wasn't trivial for them; they now use https://github.com/JuliaAttic/Grisu.jl (which is of course not precompiled). Executing:

julia> using Showoff

julia> Showoff.showoff([1.0 2.0], :auto)

takes a noticeable time (~1 second) to return to the prompt (timing with @time seems to underestimate this). On e.g. 1.5 it seems to be a lot faster.

Edit: Hm, it might just be the latency of displaying the matrix that increased.
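One reason @time can underestimate this: it only covers the timed expression, while the REPL displays the returned value afterwards, and that show call may itself trigger compilation. A Base-only sketch with a made-up type (`Tick` is hypothetical) whose display code is guaranteed not to be in the sysimage:

```julia
struct Tick
    v::Float64
end
Base.show(io::IO, t::Tick) = print(io, "Tick(", t.v, ")")

A = [Tick(1.0) Tick(2.0)]

# The first text/plain display of a Matrix{Tick} compiles the show machinery
# for that type; @time on the expression that *produced* A never sees this.
t_first  = @elapsed show(IOBuffer(), MIME"text/plain"(), A)
t_second = @elapsed show(IOBuffer(), MIME"text/plain"(), A)

t_first > t_second   # the first call also pays compilation
```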

@timholy (Member) commented Oct 7, 2020

Some of that is inference time, though:

julia> using SnoopCompileCore

julia> tinf = @snoopi begin
           using Showoff
           Showoff.showoff([1.0 2.0], :auto)
       end
4-element Vector{Tuple{Float64, Core.MethodInstance}}:
 (3.600120544433594e-5, MethodInstance for eltype(::Type{Vector{Base.HasShape{1}}}))
 (0.0061800479888916016, MethodInstance for resize_nthreads!(::Vector{Vector{Grisu.Bignums.Bignum}}, ::Vector{Grisu.Bignums.Bignum}))
 (0.03189992904663086, MethodInstance for hcat(::Float64, ::Float64))
 (0.13740992546081543, MethodInstance for showoff(::Matrix{Float64}, ::Symbol))

and could be precompiled if we knew what we wanted.

EDIT: I really should start saying "preinferred" since there's a difference...
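Reports like the one above map directly onto precompile directives, in the tuple-type form that Base accepts and that generated precompile files use. A sketch with one illustrative signature from the list (whether these are the right signatures to ship is exactly the open question):

```julia
# Tuple-type form of a precompile directive: runs inference for
# hcat(::Float64, ::Float64) ahead of the first call.
# Returns true if the signature could be inferred.
precompile(Tuple{typeof(hcat), Float64, Float64})
```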

@KristofferC (Member) commented Oct 7, 2020

Top inferred:

 (0.05352616310119629, MethodInstance for _parse_colorant(::String))
 (0.06059598922729492, MethodInstance for union!(::Set{Symbol}, ::Base.KeySet{Symbol, Dict{Symbol, Any}}))
 (0.06485390663146973, MethodInstance for +(::Matrix{Measures.AbsoluteLength}, ::Matrix{Measures.AbsoluteLength}))
 (0.08184194564819336, MethodInstance for (::Base.var"#maximum##kw")(::NamedTuple{(:dims,), Tuple{Int64}}, ::typeof(maximum), ::Matrix{Measures.AbsoluteLength}))
 (0.09086298942565918, MethodInstance for recipe_pipeline!(::Plots.Plot{Plots.GRBackend}, ::Dict{Symbol, Any}, ::Tuple{Matrix{Float64}}))
 (0.1283719539642334, MethodInstance for showoff(::Vector{Float64}, ::Symbol))
 (0.13596415519714355, MethodInstance for collect_similar(::Vector{String}, ::Base.Generator{Vector{String}, Plots.var"#346#347"{typeof(Plots.convert_sci_unicode)}}))
 (0.19153594970703125, MethodInstance for apply_recipe(::Dict{Symbol, Any}, ::Type{RecipesPipeline.SliceIt}, ::Nothing, ::Matrix{Float64}, ::Nothing))
 (0.2347571849822998, MethodInstance for abstract_call_builtin(::Core.Compiler.NativeInterpreter, ::typeof(getfield), ::Vector{Any}, ::Vector{Any}, ::Core.Compiler.InferenceState, ::Int64))
 (0.522068977355957, MethodInstance for abstract_call_gf_by_type(::Core.Compiler.NativeInterpreter, ::Any, ::Vector{Any}, ::Any, ::Core.Compiler.InferenceState, ::Int64))
 (0.5291898250579834, MethodInstance for rand(::Int64, ::Int64))

rand(::Int64, ::Int64) is an interesting one; why is it so slow? And what does it mean when compiler internals like abstract_call_builtin get inferred?

@timholy (Member) commented Oct 7, 2020

I don't see those abstract_call_*s:

julia> tinf = @snoopi tmin=0.01 begin
           using Plots
           display(plot(rand(5)))
       end
28-element Vector{Tuple{Float64, Core.MethodInstance}}:
 (0.010251998901367188, MethodInstance for _any(::typeof(identity), ::Base.Generator{Vector{Plots.Series}, typeof(Plots.hascolorbar)}, ::Colon))
 (0.012759923934936523, MethodInstance for get!(::Base.var"#151#152"{Set{Symbol}}, ::Dict{Symbol, Set{Symbol}}, ::Symbol))
 (0.012845039367675781, MethodInstance for map(::Function, ::Vector{Float64}))
 (0.013175010681152344, MethodInstance for #any#678(::Function, ::typeof(any), ::Function, ::Vector{Float64}))
 (0.01339411735534668, MethodInstance for setindex!(::Dict{Any, Plots.Subplot}, ::Plots.Subplot{Plots.GRBackend}, ::Symbol))
 (0.013488054275512695, MethodInstance for setindex!(::Dict{Symbol, PlotThemes.PlotTheme}, ::PlotThemes.PlotTheme, ::Symbol))
 (0.014439105987548828, MethodInstance for setindex!(::Dict{Plots.Subplot, Any}, ::Dict{Symbol, Any}, ::Plots.Subplot{Plots.GRBackend}))
 (0.01650714874267578, MethodInstance for _mapreducedim!(::typeof(identity), ::typeof(max), ::Matrix{Measures.AbsoluteLength}, ::Matrix{Measures.AbsoluteLength}))
 (0.016971111297607422, MethodInstance for Vector{Plots.Subplot}(::Vector{Plots.Subplot{Plots.GRBackend}}))
 (0.017770051956176758, MethodInstance for _plot!(::Plots.Plot{Plots.GRBackend}, ::Dict{Symbol, Any}, ::Tuple{Vector{Float64}}))
 (0.020716190338134766, MethodInstance for _mapreduce_dim(::Function, ::Function, ::Base._InitialValue, ::Matrix{Measures.AbsoluteLength}, ::Int64))
 (0.02102208137512207, MethodInstance for get_color_palette(::ColorPalette, ::Int64))
 (0.021051883697509766, MethodInstance for sum(::Matrix{Measures.AbsoluteLength}))
 (0.025829076766967773, MethodInstance for *(::Int64, ::Vector{Measures.AbsoluteLength}))
 (0.028828144073486328, MethodInstance for apply_recipe(::Dict{Symbol, Any}, ::Vector{Float64}))
 (0.03306078910827637, MethodInstance for (::Plots.var"#gr_set_font##kw")(::NamedTuple{(:halign, :valign, :rotation, :color), Tuple{Symbol, Symbol, Int64, RGB{FixedPointNumbers.N0f8}}}, ::typeof(Plots.gr_set_font), ::Plots.Font, ::Plots.Subplot{Plots.GRBackend}))
 (0.03736686706542969, MethodInstance for gr_add_legend(::Plots.Subplot{Plots.GRBackend}, ::NamedTuple{(:w, :h, :dy, :leftw, :textw, :rightw, :xoffset, :yoffset, :width_factor), NTuple{9, Float64}}, ::Vector{Float64}))
 (0.04241609573364258, MethodInstance for maximum(::Vector{Measures.AbsoluteLength}))
 (0.044959068298339844, MethodInstance for foreach(::Function, ::Vector{Dict{Symbol, Any}}, ::Vector{Dict{Symbol, Any}}))
 (0.04762101173400879, MethodInstance for optimize_ticks_typed(::Float64, ::Float64, ::Bool, ::Vector{Tuple{Float64, Float64}}, ::Int64, ::Int64, ::Int64, ::Float64, ::Float64, ::Float64, ::Float64, ::Bool, ::Nothing))
 (0.05859208106994629, MethodInstance for +(::Matrix{Measures.AbsoluteLength}, ::Matrix{Measures.AbsoluteLength}))
 (0.07781410217285156, MethodInstance for (::Base.var"#maximum##kw")(::NamedTuple{(:dims,), Tuple{Int64}}, ::typeof(maximum), ::Matrix{Measures.AbsoluteLength}))
 (0.0782470703125, MethodInstance for warn_on_recipe_aliases!(::Plots.Plot{Plots.GRBackend}, ::Dict{Symbol, Any}, ::Symbol, ::Vector{Float64}))
 (0.13669705390930176, MethodInstance for plot(::Vector{Float64}))
 (0.13689613342285156, MethodInstance for showoff(::Vector{Float64}, ::Symbol))
 (0.1400148868560791, MethodInstance for apply_recipe(::Dict{Symbol, Any}, ::Type{RecipesPipeline.SliceIt}, ::Nothing, ::Vector{Float64}, ::Nothing))
 (0.14041399955749512, MethodInstance for collect_similar(::Vector{String}, ::Base.Generator{Vector{String}, Plots.var"#346#347"{typeof(Plots.convert_sci_unicode)}}))
 (0.17162108421325684, MethodInstance for warn_on_recipe_aliases!(::Plots.Plot{Plots.GRBackend}, ::Vector{RecipeData}, ::Symbol, ::Vector{Float64}))

The absence of _parse_colorant is JuliaGraphics/Colors.jl#445.

@KristofferC (Member) commented Oct 7, 2020

I had using Plots outside of @snoopi:

tinf = @snoopi begin
    (d = plot(rand(2,2)); display(d));
end

When I move the rand(2,2) outside of @snoopi, the MethodInstance for rand(::Int64, ::Int64) gets replaced with

 (0.5193078517913818, MethodInstance for setproperty!(::Plots.CurrentBackend, ::Symbol, ::Plots.GRBackend))

@giordano (Contributor)

julia master, with LLVM 10:

% julia-master --project=. --startup-file=no -e '@time using Plots'
  4.028043 seconds (6.51 M allocations: 470.491 MiB, 4.07% gc time, 16.62% compilation time)

with LLVM 11 (#37842):

% julia-master --project=. --startup-file=no -e '@time using Plots'
  3.392166 seconds (5.97 M allocations: 431.075 MiB, 3.46% gc time, 17.22% compilation time)

@JeffBezanson (Member, Author)

Unfortunately I don't see any improvement in the time to make the first plot itself.

@JeffBezanson (Member, Author)

For me @time display(plot(rand(10))) is ~7.4 seconds both before and after upgrading to LLVM 11.

@vtjnash closed this as completed on Jan 28, 2021
@vtjnash (Member) commented Jan 28, 2021

Plots is a moving target (and very slow right now), but we'll keep working on it.


6 participants