Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Reduce Memory Usage #626

Merged
merged 2 commits into from
Jun 21, 2024
Merged

Conversation

DanielPointon
Copy link
Contributor

@DanielPointon DanielPointon commented Mar 12, 2024

Contains the following:

Dynamic memory allocation and deallocation for Homomorphism
Dynamic memory allocation and deallocation for Cliques
Subtle bug fix for Bliss Graphs when N > 512/3 (see the old allocation of BLISS_GRAPH[i] vs the new in homos.c)
Support for Digraph homomorphism with N>512
Support for Bitarray operations with N > 512
Support for flexible bitarray lookup sizes

Logic for memory allocation is as follows:

  • All internal structures have a minimum required size which is some increasing function of either Max(Vertices(Digraph1), Vertices(Digraph2)) or Vertices(Digraph1), depending on the operation taking place
  • If Max(Vertices(Digraph1), Vertices(Digraph2)) > current size or Vertices(Digraph1) > current size , increase the current size of the structures, free and re-reallocate them.

Logic for bitarrays:

  • The first time a bitarray is created, we create lookups of size 513 containing some common calculations. If needed, these can be resized by the user. If a user requests a bitarray with size > 513, we calculate values on the fly.

@DanielPointon DanielPointon marked this pull request as draft March 12, 2024 23:35
@DanielPointon
Copy link
Contributor Author

DanielPointon commented Mar 13, 2024

MAXVERTS > 512 now supported!! :)

And I got some test results

On my branch: 824MB of memory 
On main branch: 2.4GB of memory

@DanielPointon DanielPointon changed the title Memory changes 2 [Draft] Reduce Memory Usage Mar 13, 2024
@james-d-mitchell james-d-mitchell added the refactor Label for PRs or issues related to refactoring code label Mar 14, 2024
@DanielPointon
Copy link
Contributor Author

@james-d-mitchell is there any chance you're able to link me to the configuration for the "coverage" check run on azure? It's failing with a weird error that I can't reproduce locally, and I'd like to know exactly what config the system is using so I can debug

https://dev.azure.com/jdm3/jdm3/_build/results?buildId=3162&view=logs&j=1117f030-3e3a-5822-cfd7-97e776c1f4cb

I had a look in the .github/workflows folder but couldn't see any thing useful

@james-d-mitchell
Copy link
Member

@DanielPointon DanielPointon marked this pull request as ready for review March 25, 2024 04:24
@DanielPointon DanielPointon changed the title [Draft] Reduce Memory Usage Reduce Memory Usage Mar 25, 2024
@james-d-mitchell james-d-mitchell added the rerun-ci A label for PR's to indicate that the ci should be rerun (possibly after a rebase) label Mar 27, 2024
@james-d-mitchell
Copy link
Member

Also rebased this onto main and force pushed, this should fix the ci.

@james-d-mitchell james-d-mitchell removed the rerun-ci A label for PR's to indicate that the ci should be rerun (possibly after a rebase) label Mar 28, 2024
@DanielPointon
Copy link
Contributor Author

Also rebased this onto main and force pushed, this should fix the ci.

It did, thanks! Ignoring the coverage test this looks good to go

@james-d-mitchell
Copy link
Member

Thanks @DanielPointon I think we can ignore the code coverage, unless you can see some places to add tests that test some lines that aren't tested, if you can, them please do, if not, then forget about it

@james-d-mitchell
Copy link
Member

@DanielPointon I started reviewing this but realised that there are a few issues, and so I fixed compile warnings in digraphs in general in #633, and hoped you could rebase your PR on top of those.

@DanielPointon
Copy link
Contributor Author

@DanielPointon I started reviewing this but realised that there are a few issues, and so I fixed compile warnings in digraphs in general in #633, and hoped you could rebase your PR on top of those.

Lol, 17 merge conflicts. I've rebased :) Will look at fixing compile warnings now

@DanielPointon
Copy link
Contributor Author

@DanielPointon I started reviewing this but realised that there are a few issues, and so I fixed compile warnings in digraphs in general in #633, and hoped you could rebase your PR on top of those.

Lol, 17 merge conflicts. I've rebased :) Will look at fixing compile warnings now

@james-d-mitchell I think I've fixed all the compile warnings I caused, apologies if there's anything egregious I missed, the VIP is my first time writing C :)

@james-d-mitchell
Copy link
Member

james-d-mitchell commented Apr 4, 2024

Unfortunately, the changes in this PR appear to leak memory, here's the output of valgrind:

mem_leak.txt

Thanks to @Joseph-Edwards for running valgrind.

I will fix this.

@DanielPointon
Copy link
Contributor Author

Unfortunately, the changes in this PR appear to leak memory, here's the output of valgrind:

mem_leak.txt

Thanks to @Joseph-Edwards for running valgrind.

I will fix this.

Ah, sorry! I've made a quick push to fix a few of the errors: DanielPointon@45c928a

I can revert if it gets in the way of your debugging

@Joseph-Edwards
Copy link
Collaborator

This is the valgrind output with those changes:

mem_leak_2.txt

@DanielPointon
Copy link
Contributor Author

@Joseph-Edwards thanks! @james-d-mitchell if you're okay to hold off until Monday I think I can fix this up over the weekend and save you some time.

@DanielPointon
Copy link
Contributor Author

DanielPointon commented Apr 7, 2024

...
==4226== LEAK SUMMARY:
==4226==    definitely lost: 0 bytes in 0 blocks
==4226==    indirectly lost: 0 bytes in 0 blocks
==4226==      possibly lost: 0 bytes in 0 blocks
==4226==    still reachable: 294,718 bytes in 239 blocks
==4226==         suppressed: 0 bytes in 0 blocks

@james-d-mitchell all memory leaks now fixed, essentially the issue was that some structures have a 'size' parameter associated with them which is continuously updated, but the size of underlying structures doesn't always match this, which the free functions assumed. I've also rebased on main and added the functions to manually free cliques and homomorphism data.

@james-d-mitchell
Copy link
Member

Awesome thanks @DanielPointon and @Joseph-Edwards !

@Joseph-Edwards
Copy link
Collaborator

Unfortunately, it seems like there are still memory leaks when testing standard/grahom.tst:
mem_leak_3.txt

@DanielPointon
Copy link
Contributor Author

DanielPointon commented Apr 17, 2024

@Joseph-Edwards

==3459== LEAK SUMMARY:
==3459==    definitely lost: 0 bytes in 0 blocks
==3459==    indirectly lost: 0 bytes in 0 blocks
==3459==      possibly lost: 0 bytes in 0 blocks

Fixed! Sorry about that. C is hard.

@Joseph-Edwards
Copy link
Collaborator

@DanielPointon no need to apologise! I took the liberty of Valgrinding tst/teststandard.g and that seems to be leak free too. Nice job!

@james-d-mitchell
Copy link
Member

I just rebased this (and squashed it) on to the current main branch, this was more or less difficult, and so might contain some mistakes!

@james-d-mitchell
Copy link
Member

james-d-mitchell commented Jun 20, 2024

With the changes in this PR, I'm seeing an approx. 27% 9% loss in performance in the test file tst/extreme/grahom.tst with this branch:

gap> Test("~/digraphs/tst/extreme/grahom.tst");
Digraphs package: extreme/grahom.tst
msecs: 73179
true
gap> Test("~/digraphs/tst/extreme/grahom.tst");
Digraphs package: extreme/grahom.tst
msecs: 74211
true

This was tested after I commented out the lines:

 # Wipe internal structures for homos and cliques
   DIGRAPHS_FREE_HOMOS_DATA();
   DIGRAPHS_FREE_CLIQUES_DATA();

in DIGRAPHS_StopTest. So, it seems that the loss of performance is not related to the actual de/allocation of memory.

With the main branch` I get:

gap> DigraphsTestInstall();
Digraphs package: testinstall.tst
msecs: 53
true
gap> Test("~/digraphs/tst/extreme/grahom.tst");
Digraphs package: extreme/grahom.tst
msecs: 66923
true
gap> Test("~/digraphs/tst/extreme/grahom.tst");
Digraphs package: extreme/grahom.tst
msecs: 66778
true

I'll need to investigate a bit further to see why this might be the case. I'm more or less happy to accept a small loss of performance with these changes, ~~but I'm not sure that I'd class ~27% as small. ~~

UPDATE: I added an addition test at the end of extereme/grahom.tst which accounted for most of the change in performance, I think I can live with a 9% decrease in perf!

@james-d-mitchell
Copy link
Member

@Joseph-Edwards any chance you can run this through valgrind again? I think I might have inadvertently force pushed and deleted some of Daniel's fixes :(

@Joseph-Edwards
Copy link
Collaborator

Joseph-Edwards commented Jun 20, 2024

@james-d-mitchell will the new Valgrind action suffice? If no, I'll happily run it again.

I also think I have a local copy of the old branch. If that'd be helpful to see, I can push it to my fork

@james-d-mitchell
Copy link
Member

Thanks @Joseph-Edwards, if you could push the old branch to your fork that'd be awesome!

Also the new valgrind action would be just fine, where is that exactly?

@Joseph-Edwards
Copy link
Collaborator

Pushed to https://github.com/Joseph-Edwards/Digraphs

And hopefully you should be able to see the Valgrind action in the actions tab. It should be callable somehow, but I'm not sure how off the top of my head.

@james-d-mitchell
Copy link
Member

Thanks @Joseph-Edwards I figured out how to call the valgrind job, and thanks for the push too!

This commit combines all the changes by @DanielPointon, with a few
cleanups by me on top. The aim of this commit it to make the memory
allocation more dynamic in the kernel extension of Digraphs for
computing homomorphisms and cliques. Namely, instead of allocating
memory for a bunch of fixed sized (512) arrays we only allocate the
memory required to perform any requested computations, and we
reuse that if possible for subsequent computations, and only allocate
more memory if there isn't enough allocated previously. This commit also
exposes two functions DIGRAPHS_FREE_* that can be used to release the
allocated memory for computing homomorphisms or cliques.

There might be a small performance penalty for these changes, for
example, with these changes:

gap> Test("~/digraphs/tst/extreme/cliques.tst");
Digraphs package: standard/cliques.tst
msecs: 2848
true
gap> Test("~/digraphs/tst/extreme/cliques.tst");
Digraphs package: standard/cliques.tst
msecs: 2923
true
gap> Test("~/digraphs/tst/extreme/grahom.tst");
Digraphs package: extreme/grahom.tst
msecs: 65177
true
gap> Test("~/digraphs/tst/extreme/grahom.tst");
Digraphs package: extreme/grahom.tst
msecs: 65956
true

and before:

gap> Test("~/digraphs/tst/extreme/cliques.tst");
Digraphs package: standard/cliques.tst
msecs: 2861
true
gap> Test("~/digraphs/tst/extreme/cliques.tst");
Digraphs package: standard/cliques.tst
msecs: 2928
true
gap> Test("~/digraphs/tst/extreme/grahom.tst");
Digraphs package: extreme/grahom.tst
msecs: 64881
true
gap> Test("~/digraphs/tst/extreme/grahom.tst");
Digraphs package: extreme/grahom.tst
msecs: 64997
true
@james-d-mitchell james-d-mitchell merged commit 4c8aa6d into digraphs:main Jun 21, 2024
26 checks passed
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
refactor Label for PRs or issues related to refactoring code
Projects
None yet
Development

Successfully merging this pull request may close these issues.

3 participants