Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Memory leak in use of clone_for_plugins within plugins (callbacks) #159

Closed
mmghannam opened this issue Oct 21, 2024 · 11 comments · Fixed by #162
Closed

Memory leak in use of clone_for_plugins within plugins (callbacks) #159

mmghannam opened this issue Oct 21, 2024 · 11 comments · Fixed by #162

Comments

@mmghannam
Copy link
Member

mmghannam commented Oct 21, 2024

An example of the memory leak can be found in the tests of branchrule.rs in first_branching_rule

@Andful
Copy link

Andful commented Oct 21, 2024

Ah, I think I know what it is going on. It is related to Rc/Arc 😅. There is a cyclic reference. The callback is not freed because Model is not being freed. Model is not freed because there is a reference in the callback. Memory leaks are still safe in rust, but ideally it would not be there.

@mmghannam
Copy link
Member Author

mmghannam commented Oct 22, 2024

I'm not sure I agree. The order of operations should be:

  • Creating the model with ScipPtr.
  • Creating the plugin struct and passing a clone of the Model with clone_for_plugins.
  • Passing that plugin struct in a box to an include_* method, then attached to scip is a callback that would free the memory.
  • model.solve() is called.
  • Plugins callbacks are called during the solve process.
  • When solving is done, SCIP would call the free callback of the plugin then at this point the reference count should be 1.
  • The model then goes out of scope then when dropped it should decrement the reference count to 0 and call the drop of the ScipPtr inside.

Please let me know if I'm missing something here.

@Andful
Copy link

Andful commented Oct 22, 2024

Do the branching rules get freed after the model is solved?

When I have time I will test that.

@Andful
Copy link

Andful commented Oct 22, 2024

I added the line panic!("rc-count = {}", Rc::strong_count(&solved.scip)). at 993d675 (also made scip public)

#[test]
    fn first_branching_rule() {
        let model = Model::new()
            .hide_output()
            .set_longint_param("limits/nodes", 2)
            .unwrap() // only call brancher once
            .include_default_plugins()
            .read_prob("data/test/gen-ip054.mps")
            .unwrap();

        let br = FirstBranchingRule {
            model: model.clone_for_plugins(),
        };
        let solved = model
            .include_branch_rule("", "", 100000, 1000, 1., Box::new(br))
            .solve();

        assert!(solved.n_nodes() > 1);
        panic!("rc-count = {}", Rc::strong_count(&solved.scip))
    }

and it returns:

rc-count = 29

😭

@Andful
Copy link

Andful commented Oct 22, 2024

The variable and constraints should have Weak references I think

@Andful
Copy link

Andful commented Oct 22, 2024

I also added Drop to FirstBranchingRule with a panic, and it does not panic after solving.

@mmghannam
Copy link
Member Author

I added the line panic!("rc-count = {}", Rc::strong_count(&solved.scip)). at 993d675 (also made scip public)

#[test]
    fn first_branching_rule() {
        let model = Model::new()
            .hide_output()
            .set_longint_param("limits/nodes", 2)
            .unwrap() // only call brancher once
            .include_default_plugins()
            .read_prob("data/test/gen-ip054.mps")
            .unwrap();

        let br = FirstBranchingRule {
            model: model.clone_for_plugins(),
        };
        let solved = model
            .include_branch_rule("", "", 100000, 1000, 1., Box::new(br))
            .solve();

        assert!(solved.n_nodes() > 1);
        panic!("rc-count = {}", Rc::strong_count(&solved.scip))
    }

and it returns:

rc-count = 29

😭

29 is way too much 😄 thanks that's very helpful now I can dig in where does it get incremented.
What I thought is that my implementation of branchfree inside include_branch_rule is incorrect. But this seems like a bigger issue.

@mmghannam
Copy link
Member Author

I also added Drop to FirstBranchingRule with a panic, and it does not panic after solving.

You should maybe first check if the branchfree callback is called. since that's the one responsible for freeing the trait object.

@mmghannam
Copy link
Member Author

The variable and constraints should have Weak references I think

But that would mean either runtime panics or wrapping every return value in variable and constraint methods in a result. To protect against the case that the model frees the ScipPtr before. Yes it's memory safe but not the nicest for the user.

@Andful
Copy link

Andful commented Oct 23, 2024

Or an alternative is to avoid cached variables/constraints (or cache them without the Rc<Model>), as they cause cyclic references. But I think there is still a problem with branching rules being a cyclic reference...

@mmghannam
Copy link
Member Author

Or an alternative is to avoid cached variables/constraints (or cache them without the Rc<Model>), as they cause cyclic references. But I think there is still a problem with branching rules being a cyclic reference...

Avoiding caching is something I eventually want to do, then no inconsistencies would arise between the internal SCIP state and russcip. I will add an issue to track this and update this comment later.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging a pull request may close this issue.

2 participants