improve cat design / performance #49322

Merged
merged 2 commits into from
Jul 13, 2023

Conversation

vtjnash
Sponsor Member

@vtjnash vtjnash commented Apr 11, 2023

This used to make a lot of references to design issues with the SparseArrays package (#2326 / #20815), which result in a nonsensical dispatch arrangement and contribute to a slow loading experience due to the illogical Unions that must be checked by subtyping.

Requires those issues to be fixed similarly in SparseArrays first (JuliaSparse/SparseArrays.jl#384) before this can be merged.

It is hard to get a reliable measure of the exact impact, since that measurement fluctuates a bit between builds due to other factors. But we can see this uses a bit less memory now, and I had instrumented it previously to measure that this cost 0.5s of load time, and that cost went down to pretty much zero after this change.

julia> @time using OmniPackage
 26.654442 seconds (20.52 M allocations: 1.322 GiB, 9.15% gc time, 29.93% compilation time: 86% of which was recompilation) # before
 26.697070 seconds (20.38 M allocations: 1.322 GiB, 8.54% gc time, 27.20% compilation time: 85% of which was recompilation) # after
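
Because a single `@time using ...` run is noisy, one option is to repeat the measurement in fresh processes and compare the spread. The following is only a sketch: `load_times` is a hypothetical helper (not part of this PR), and each timing also includes Julia's own startup time.

```julia
# Hypothetical helper: time `using <pkg>` in fresh Julia processes.
# Each sample includes process startup, so compare runs against each other
# rather than reading the numbers as absolute load times.
function load_times(pkg::AbstractString; runs::Int = 3)
    # Base.julia_cmd() reproduces the currently running julia binary and flags
    cmd = `$(Base.julia_cmd()) --startup-file=no -e "using $pkg"`
    return [@elapsed(run(cmd)) for _ in 1:runs]
end

# Example: the fastest of several runs is usually the least noisy estimate
times = load_times("SparseArrays"; runs = 3)
println("min = ", minimum(times), " s, max = ", maximum(times), " s")
```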

@JeffBezanson JeffBezanson added domain:arrays [a, r, r, a, y, s] compiler:latency Compiler latency labels Apr 12, 2023
@vtjnash vtjnash marked this pull request as ready for review April 20, 2023 18:17
@vtjnash
Sponsor Member Author

vtjnash commented Apr 20, 2023

Seems like SparseArrays has been broken on main for a while, so we are waiting for this to be fixed before we can move ahead with this PR: JuliaSparse/SparseArrays.jl#363 (comment). It recently switched to using the JLL, but seems to still simultaneously be vendoring its own copy of the JLL, which is broken.

@rayegun
Member

rayegun commented May 5, 2023

I'm going to try to fix the SuiteSparse issue this weekend. I'm not sure when Tim Davis will release the update with my fix, but he's pretty responsive.

@ViralBShah
Member

ViralBShah commented Jul 10, 2023

#48977 is merged.

@vtjnash vtjnash force-pushed the jn/better-cats branch 3 times, most recently from 364419e to 92ec9d4 Compare July 12, 2023 19:51
@vtjnash
Sponsor Member Author

vtjnash commented Jul 12, 2023

I am a bit skeptical of these numbers, but initial measurements indicate this saves about 3 seconds of load time (15%)

 25.983725 seconds (26.35 M allocations: 1.573 GiB, 5.33% gc time, 7.27% compilation time: 65% of which was recompilation) # master
 23.550038 seconds (22.45 M allocations: 1.417 GiB, 5.00% gc time, 6.67% compilation time: 58% of which was recompilation) # PR

This used to make a lot of references to design issues with the
SparseArrays package (#2326 /
#20815), which result in a
nonsensical dispatch arrangement and contribute to a slow loading
experience due to the nonsense Unions that must be checked by subtyping.
@vtjnash vtjnash merged commit 5a922fa into master Jul 13, 2023
1 check passed
@vtjnash vtjnash deleted the jn/better-cats branch July 13, 2023 18:24
@ViralBShah ViralBShah added the backport 1.10 Change should be backported to the 1.10 release label Jul 14, 2023
@ViralBShah
Member

ViralBShah commented Jul 14, 2023

@vtjnash Is this ok to backport to 1.10? We may need to do that since the sparse hvcat changes will get pulled in with all the other SparseArrays PRs on SparseArrays master.

@vtjnash
Sponsor Member Author

vtjnash commented Jul 14, 2023

Sure. It was ready before that, but was just waiting on SparseArrays. The performance boost to loading and running seemed worth backporting.

@benlorenz
Contributor

Since we are seeing some errors when testing Oscar with julia nightly after this was merged, is it intended that this will cause anything that

  • is derived from AbstractSparseMatrix and
  • does not have a custom hcat / vcat function

to be converted to SparseMatrixCSC when using vcat or hcat?

We have a custom boolean sparse matrix (it stores only the true values and prints them as their indices) in Oscar:

julia> IM = IncidenceMatrix([[1, 3, 7], [4, 5, 6]])
2×7 IncidenceMatrix
[1, 3, 7]
[4, 5, 6]


julia> IM isa SparseArrays.AbstractSparseArray
true

julia> vcat(IM,IM)
4×7 SparseMatrixCSC{UInt8, Int64} with 12 stored entries:
 0x01       0x01                 0x01
                0x01  0x01  0x01     
 0x01       0x01                 0x01
                0x01  0x01  0x01     

Previously this went through the base implementation for abstract arrays and returned an IncidenceMatrix (e.g. on julia 1.10-alpha1):

julia> vcat(IM,IM)
4×7 IncidenceMatrix
[1, 3, 7]
[4, 5, 6]
[1, 3, 7]
[4, 5, 6]
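
For illustration, a type in this situation could keep its own result type by defining an explicit `vcat` method, which is more specific than any generic sparse fallback. This is only a minimal sketch: `RowSetMatrix` is a hypothetical stand-in, not Oscar's `IncidenceMatrix`, and whether this is the intended fix is not settled in this thread.

```julia
using SparseArrays

# Hypothetical stand-in for a package-defined sparse Boolean matrix:
# for each row it stores the column indices of the `true` entries.
struct RowSetMatrix <: SparseArrays.AbstractSparseMatrix{Bool,Int}
    rows::Vector{Vector{Int}}
    ncols::Int
end

Base.size(A::RowSetMatrix) = (length(A.rows), A.ncols)
Base.getindex(A::RowSetMatrix, i::Int, j::Int) = j in A.rows[i]

# Explicit method so that vertical concatenation keeps the custom type
# instead of falling back to a generic sparse implementation.
function Base.vcat(A::RowSetMatrix, Bs::RowSetMatrix...)
    all(B -> B.ncols == A.ncols, Bs) ||
        throw(DimensionMismatch("number of columns must match"))
    rows = copy(A.rows)
    for B in Bs
        append!(rows, B.rows)
    end
    return RowSetMatrix(rows, A.ncols)
end

IM = RowSetMatrix([[1, 3, 7], [4, 5, 6]], 7)
V = vcat(IM, IM)   # dispatches to the method above
```

An analogous `hcat` method would need to shift the column indices of the later arguments by the accumulated column count.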

@vtjnash
Sponsor Member Author

vtjnash commented Jul 17, 2023

KristofferC pushed a commit that referenced this pull request Jul 18, 2023
This used to make a lot of references to design issues with the
SparseArrays package (#2326 /
#20815), which result in a
nonsensical dispatch arrangement and contribute to a slow loading
experience due to the nonsense Unions that must be checked by subtyping.

(cherry picked from commit 5a922fa)
KristofferC added a commit that referenced this pull request Jul 24, 2023
Backported PRs:
- [x] #50411 <!-- Fix weird dispatch of * with zero arguments -->
- [x] #50202 <!-- Remove dynamic dispatch from _wait/wait2 -->
- [x] #50064 <!-- Fix numbered prompt with input only with comment -->
- [x] #50026 <!-- Store heapsnapshot files in tempdir() instead of current directory -->
- [x] #50402 <!-- Add CPU feature helper function -->
- [x] #50387 <!-- update newpages pointer after actually sweeping pages -->
- [x] #50424 <!-- avoid potential type-instability in _replace_(str, ...) -->
- [x] #50444 <!-- Optimize getfield lowering to avoid boxing in some cases -->
- [x] #50474 <!-- docs: Fix a `!!! note` which was miscapitalized -->
- [x] #50466 <!-- relax assertion involving pg->nold to reflect that it may be a bit in… -->
- [x] #50490 <!-- Fix compat annotation for italic printstyled -->
- [x] #50488 <!-- fix typo in `Base.isassigned` with `Tridiagonal` -->
- [x] #50476 <!-- Profile: Add specifying dir for `take_heap_snapshot` and handling if current dir is unwritable -->
- [x] #50461 <!-- fix typo in the --gcthreads argument description -->
- [x] #50528 <!-- ssair: Correctly handle stmt insertion at end of basic block -->
- [x] #50533 <!-- ensure internal_obj_base_ptr checks whether objects past freelist pointer are in freelist -->
- [x] #49322 <!-- improve cat design / performance -->
- [x] #50540 <!-- gc: remove over-eager assertion -->
- [x] #50542 <!-- gf: remove unnecessary assert cycle==depth -->
- [x] #50559 <!-- Expand kwcall lowering positional default check to vararg -->
- [x] #50058 <!-- Add unwrapping mechanism for triangular mul and solves -->
- [x] #50551 <!-- typeintersect: also record chained `innervars` -->
- [x] #50552 <!-- read(io, Char): fix read with too many leading ones -->
- [x] #50541 <!-- precompile: ensure globals are not accidentally created where disallowed -->
- [x] #50576 <!-- use atomic compare exchange when setting the GC mark-bit -->
- [x] #50578 <!-- gf: make method overwrite/delete an error during precompile -->
- [x] #50516 <!-- Fix visibility of assert on GCC12/13 -->
- [x] #50597 <!-- Fix memory corruption if task is launched inside finalizer -->
- [x] #50591 <!-- build: fix various makefile bugs -->
- [x] #50599 <!-- faster invalid object lookup in conservative gc -->
- [x] #50634 <!-- 🤖 [master] Bump the SparseArrays stdlib from b4b0e72 to 99c99b4 -->
- [x] #50639 <!-- Backport LLVM patches to fix various issues. -->
- [x] #50546 <!-- Revert storage of method instance in LineInfoNode -->
- [x] #50631 <!-- Shift DCE pass to optimize imaging mode code better -->
- [x] #50525 <!-- only check that values are finite in `generic_lufact` when `check=true` -->
- [x] #50587 <!-- isassigned for ranges with BigInt indices -->
- [x] #50144 <!-- Page based heap size heuristics -->


Need manual backport:
- [ ] #50595 <!-- Rename ENV variable `JULIA_USE_NEW_PARSER` -> `JULIA_USE_FLISP_PARSER` -->

Non-merged PRs with backport label:
- [ ] #50637 <!-- Remove SparseArrays legacy code -->
- [ ] #50618 <!-- inference: continue const-prop' when concrete-eval returns non-inlineable -->
- [ ] #50598 <!-- only limit types in stack traces in the REPL -->
- [ ] #50594 <!-- Disallow non-index Integer types in isassigned -->
- [ ] #50568 <!-- `Array(::AbstractRange)` should return an `Array` -->
- [ ] #50523 <!-- Avoid generic call in most cases for getproperty -->
- [ ] #50172 <!-- print feature flags used for matching pkgimage -->
@KristofferC KristofferC removed the backport 1.10 Change should be backported to the 1.10 release label Jul 24, 2023
Labels
compiler:latency Compiler latency domain:arrays [a, r, r, a, y, s]

6 participants