Improve performance #1

Open

tisztamo wants to merge 1 commit into master
Conversation

tisztamo

A possible fix for the missing performance gain.

julia> demo_sum()
Baseline:
  4.649 s (0 allocations: 0 bytes)
Catwalk defaults:
  4.210 s (5607369 allocations: 116.47 MiB)
Catwalk tuned:
  4.007 s (2775829 allocations: 55.60 MiB)

The problem was that

@inline next(rf::R_{Map}, result, input) = next(inner(rf), result, xform(rf).f(input))

was called before the Catwalked method of next, resulting in a non-jitted dynamic dispatch.
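
As a minimal sketch of the kind of dispatch I mean (toy functions only, not the actual Transducers/Catwalk code; type_instability and asint are stand-ins):

```julia
# Toy sketch: `type_instability` has a value-dependent return type, so the
# compiler cannot infer the type of its result, and the call `f(y)` inside the
# inlined wrapper is resolved by dynamic dispatch at runtime.
type_instability(x) = x > 0 ? x : Float64(x)   # Int or Float64, depending on the value
asint(x) = round(Int, x)

@inline apply_inner(f, y) = f(y)               # plays the role of next(inner(rf), ...)

function fold_demo(xs)
    acc = 0
    for x in xs
        acc += apply_inner(asint, type_instability(x))  # dynamic dispatch here
    end
    return acc
end

fold_demo(rand(-5:5, 1_000))
```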

I am not sure whether what I did is reasonable in the larger context, but I hope you can build a proper fix on top of this.

Also, the default batch size was too small, so I have increased it to 1e6; that may be larger than ideal, and more tests are needed.

When testing with @btime, the initial overhead should be small, but I see a small amount of compilation in every Catwalked run, which is why the tested runtimes have to be several seconds. I will look into that, but I like to test cold runs with @time anyway: Catwalk adds significant compilation overhead, and not measuring it seems unfair.
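
For reference, a sketch of the two measurement styles mentioned here (demo_sum being the benchmark entry point shown above):

```julia
using BenchmarkTools

# @btime reports the minimum over many warm samples, so it mostly hides
# one-time compilation; a cold @time in a fresh session includes Catwalk's
# compilation overhead in the measurement.
@btime demo_sum()
@time demo_sum()   # first call in a fresh session: compile-inclusive timing
```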

@tkf (Member) commented Mar 29, 2021

The problem was that

@inline next(rf::R_{Map}, result, input) = next(inner(rf), result, xform(rf).f(input))

was called before the Catwalked method of next, resulting in a non-jitted dynamic dispatch.

Oooh, I see. Unfortunately, the patch in this PR is not generic enough, since in general we can't assume any particular structure outer/left of OptimizeInner() (or inner/right of it, either). For example, it's reasonable to have

xs |> Filter(x -> x > 0) |> Map(type_instability) |> OptimizeInner() |> Map(asint)

(But it does help me understand the problem. Thanks!)

I'm not sure what the best strategy is, though. I think we need something like

@please_inline Transducers.next(rf::R_{OptimizeXF}, acc, @nospecialize(input)) = ...

in the Julia compiler to fully solve the problem; i.e., the compiler inlines this even though input cannot be inferred.

Meanwhile, maybe I should stop trying to support OptimizeInner(). Maybe it's still possible to support the case where the instability happens on the iterator side:

(type_instability(x) for x in xs) |> Map(asint)
#                                    ----------
#                                    JIT'ed
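
Sketched concretely with Transducers.jl (type_instability and asint are stand-in definitions, just to make the case runnable; not part of this PR):

```julia
using Transducers

type_instability(x) = x > 0 ? x : Float64(x)  # value-dependent return type
asint(x) = round(Int, x)

xs = rand(-5:5, 1_000)

# The instability lives on the iterator side; the transducer part, Map(asint),
# is the piece that could stay JIT'ed.
foldl(+, (type_instability(x) for x in xs) |> Map(asint))
```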

@tisztamo (Author) commented:

Yeah, I feared that simply eliminating that call was not the way to go...

About the compiler support: I struggle a lot with inlining, and being able to force it would yield measurable performance improvements in the original target of Catwalk.jl, but I was never brave enough to ask for it...

Forcing inlining from the call site seems a bit less risky in terms of accidental compilation overhead, and now we have a real use case. Do you think it is time to open an issue?

@tkf (Member) commented Mar 30, 2021

My guess is that many Julia programmers have wished for a forced/more controllable inlining macro at least once. I couldn't find such a request in the issue tracker, though, which is kind of strange. Maybe everyone assumed there already is one 😄. So yeah, I think it'd be nice to have an issue for this.

@tisztamo (Author) commented Jun 28, 2021

Great news, @tkf: JuliaLang/julia#41328 allows forced inlining! I have tested this case (only on non-folds, non-Catwalk sample code for now, as I have package installation issues after compiling 1.8-dev).
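
For instance, with the call-site annotation from that PR (Julia 1.8+), the caller can request inlining of a particular call; f and x here are just placeholders:

```julia
# Call-site inlining (JuliaLang/julia#41328, Julia 1.8+): the annotation applies
# to this specific call and tells the compiler to inline it regardless of its
# usual cost heuristics (when inlining is possible at all).
function apply_forced(f, x)
    @inline f(x)
end
```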

@tkf (Member) commented Jun 29, 2021

Thanks! Yeah, that's great news, especially for packages that heavily depend on higher-order functions, like Transducers.
