Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Fix memleaks in work with struct types #1885

Merged
merged 4 commits into from
Jul 8, 2021

Conversation

viktormalik
Copy link
Contributor

@viktormalik viktormalik commented Jun 18, 2021

Using std::shared_ptr to store the inner struct in SizedType causes memleaks when working with recursive types (due to cycles in shared pointers). This PR replaces the std::shared_ptr by a raw pointer.

Fixes #1879.

Checklist
  • Language changes are updated in docs/reference_guide.md
  • User-visible and non-trivial changes updated in CHANGELOG.md
  • The new behaviour is covered by tests

src/struct.h Outdated Show resolved Hide resolved
@fbs fbs added this to the 0.13.0 milestone Jun 19, 2021
Copy link
Member

@danobi danobi left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

So I see in SemanticAnalyser::visit(Tuple &tuple) I see an unconditional CreateTuple(). Doesn't this cause us to store 10x the values in StructManager::tuples_? If so, anything we can do to avoid that?

src/struct.cpp Show resolved Hide resolved
src/types.h Outdated Show resolved Hide resolved
@viktormalik
Copy link
Contributor Author

So I see in SemanticAnalyser::visit(Tuple &tuple) I see an unconditional CreateTuple(). Doesn't this cause us to store 10x the values in StructManager::tuples_? If so, anything we can do to avoid that?

You're right, this is unnecessary. Two solutions come to mind:

  1. Only store the type in the first/last iteration of the semantic analyser, which is easy, though I'm not sure if it's correct.
  2. Change StructManager::tuples_ into a hash map, then we'd have each tuple type stored only once. The hash could be the string representation of the SizedType (obtained by operator<<), which should be fine for this purpose, I hope.

@danobi
Copy link
Member

danobi commented Jun 22, 2021

Option 1 sounds incorrect although I haven't thought about it too hard.

Option 2 sounds reasonable to me. Couple thoughts on option 2:

  • Using a string representation as key would work. But maybe also check if we can implement std::hash for SizedType so it could just be a a std::unordered_set. std::unique_ptr already implements std::hash: https://en.cppreference.com/w/cpp/memory/unique_ptr/hash . SizedType being hashable sounds generally useful to have (though I wouldn't go crazy over this to
  • Tuples types may change at each pass. We'll still be leaving behind the old, unused tuple types. Out of scope for this PR but I wonder if we can get rid of is_final_pass() altogether. It not only is not guaranteed to work (what if something takes 11 passes to resolve?) but causes subtle issues like this.

@viktormalik
Copy link
Contributor Author

  • Using a string representation as key would work. But maybe also check if we can implement std::hash for SizedType so it could just be a a std::unordered_set. std::unique_ptr already implements std::hash: https://en.cppreference.com/w/cpp/memory/unique_ptr/hash . SizedType being hashable sounds generally useful to have (though I wouldn't go crazy over this to

I implemented std::hash for SizedType so now StructManager::tuples_ is an unordered_set. It seems to work without problems but it definitely needs a detailed review.

  • Tuples types may change at each pass. We'll still be leaving behind the old, unused tuple types. Out of scope for this PR but I wonder if we can get rid of is_final_pass() altogether. It not only is not guaranteed to work (what if something takes 11 passes to resolve?) but causes subtle issues like this.

I'm not sure if getting rid of is_final_pass would help here as we leave unused types behind in cases when the type changes between passes, which would still happen.

Still, the approach of running a fixed number of 10 iterations is not very good, but that's for another discussion. We could be able to implement a more "clever" AST pass that would not re-visit nodes for which the state didn't change in 2 successive iterations. Then we could get rid of is_final_pass as we'd just run the semantic analysis until the fixpoint is found.

@danobi
Copy link
Member

danobi commented Jun 29, 2021

I'm not sure if getting rid of is_final_pass would help here as we leave unused types behind in cases when the type changes between passes, which would still happen.

So I think is_final_pass() exists b/c maps are not always declared before they are used (needs citation). So in theory, if we could figure out the exact types of all the maps before we figure out the rest of the types, we only need a single pass, as non-maps are declared + defined at the same time.

But I agree, we should discuss this somewhere else.

@fbs
Copy link
Member

fbs commented Jun 29, 2021

I originally tagged this for the 0.13 release but I'm hoping to do that release on the first. Should we try and get this in or just leave it for the next release? The leak isn't too bad

src/types.cpp Outdated Show resolved Hide resolved
src/types.cpp Outdated Show resolved Hide resolved
@danobi
Copy link
Member

danobi commented Jun 29, 2021

I originally tagged this for the 0.13 release but I'm hoping to do that release on the first. Should we try and get this in or just leave it for the next release? The leak isn't too bad

If we can get it in by the 1st that'd be nice. If not, not a huge deal IMO

@viktormalik
Copy link
Contributor Author

I originally tagged this for the 0.13 release but I'm hoping to do that release on the first. Should we try and get this in or just leave it for the next release? The leak isn't too bad

If we can get it in by the 1st that'd be nice. If not, not a huge deal IMO

Updated based on @danobi's review. Should be ready for merge if we want to get this in.

Copy link
Member

@danobi danobi left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Discussed on IRC: this PR has potential to cause subtle bugs. So better if we let this get tested through another development cycle to iron out issues.

@danobi danobi modified the milestones: 0.13.0, 0.14.0 Jun 30, 2021
CHANGELOG.md Outdated Show resolved Hide resolved
Using std::shared_ptr to store the inner struct in SizedType causes
memleaks when working with recursive types (due to cycles in shared
pointers). This commit replaces the std::shared_ptr by a raw pointer.

Since we use the same pointer to store the inner structure of tuples, we
move the tuple definitions into StructMap.
Until now, array and tuple types were compred by size only. As this may
be imprecise, compare them by the inner structure.
This allows to use SizedType and related types (such as Struct) in
std::unordered_set/map. Also will be handy once we implement type
uniqueness.
Store tuple types inside an unordered_set instead of a vector. Thanks to
this, we don't store a new tuple type in each iteration of the semantic
analyser.

This required to re-implement std::hash and std::equal_to for
std::unique_ptr<Struct>.
@viktormalik viktormalik force-pushed the struct-memleak-fix branch from db75183 to cd21f1e Compare July 2, 2021 05:48
@danobi danobi merged commit 8b32ed5 into bpftrace:master Jul 8, 2021
@viktormalik viktormalik deleted the struct-memleak-fix branch November 24, 2021 12:31
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

Mem leaks
3 participants