Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

CRM-19798 fix memory leak on EntityTag.get & other calls in the DB (generic fix) #12276

Merged
merged 1 commit into from
Jun 12, 2018

Conversation

eileenmcnaughton
Copy link
Contributor

Overview

Fix memory leak that manifests (among other places) when calling api EntityTag.get

Before

Memory in use before & after 100 iterations of EntityTag.get

Start :'49,864,896'
End :'53,503,264'

After

Memory in use before & after 100 iterations of EntityTag.get

Start :'49,851,680'
End:'49,852,168'

Technical Details

The PR adds a class destructor to the DAO that frees the mysql resource. From testing it's fine if it has already been freed it gets most code paths (there is one that it does not - see below)

I dug fairly heavily into the way the DAO cache is currently being used. It builds up queries in the global
$_DB_DATAOBJECT['RESULTS'];
From my investigations these are used for iterating through a result set rather than caching the results of a query. ie. in the case of this function

function demonstrate() {
$dao = new CRM_Core_DAO_Domain();
$dao->find(TRUE)
return $dao->locales;
}
there will be an entry remaining in
$_DB_DATAOBJECT['RESULTS']
representing the mysql results - but it can only be used from that $dao. Once we have left the function the $dao will be destroyed but the object to retrieve the result set will remain in the global, unusable but memory hogging.

If the $dao is freed before it's time then $dao->fetch() will no longer retrieve results - potentially causing unpredictable errors (this is the risk of calling CRM_Core_DAO::freeResult()) as an outer loop may be disbanded. I removed these calls as they are no longer doing any good (& can do harm)

Path still not releasing memory:

   $rule = new CRM_ACL_BAO_ACL();
   $rule->query($query);

I think that is a pretty clumsy construct & we should swap to
CRM_Core_DAO::executeQuery();

Comments

This replaces #11615

We have been running #11615 in production for several months under load. In that time we observed one issue - the free result broke down when the DAO was cloned. I have updated this with a patch that replicates that issue and fixes it.

It should be noted the free result bug ALSO happens without this patch, but in obscure & unknown circumstances. It is certainly replicable by doing basically the same sequence as in the added test bue using CRM_Core_DAO::freeResult();

I submitted various versions of this change for CI & noticed no particular difference in the duration of the jenkins job. I think I saw more difference locally but did not do multiple tests on that. Definitely no worse! I was able to replicate the memory management issue in a test & submitted the test without the changes to see if that is replication on jenkins #11616


This is an alternate methodology using __destruct on the DAO object.

Note that I have found some places bypass this - ie.

```
   $rule = new CRM_ACL_BAO_ACL();
   $rule->query($query);
```

But I think that is a pretty clumsy construct & we should swap to
CRM_Core_DAO::executeQuery();
@civibot
Copy link

civibot bot commented Jun 6, 2018

(Standard links)

@eileenmcnaughton eileenmcnaughton changed the title CRM-19798 fix memory leak on EntityTag.get CRM-19798 fix memory leak on EntityTag.get & other calls in the DB (generic fix) Jun 6, 2018
@eileenmcnaughton
Copy link
Contributor Author

@mlutfy FYI - you asked on previous version if we had prod tested -this version addresses the one (obscure) prod issue we hit

@eileenmcnaughton
Copy link
Contributor Author

(note I revived this after a stackexchange question - it had been languishing on my to-do list as we have deployed it & upstreaming needed me to take the extra clone step)

@colemanw
Copy link
Member

This looks safe, well-documented and well-tested. I'm going to merge it now since we're pre-beta for 5.4 and it will get some eyes on it before the release.

@colemanw colemanw merged commit dc080b3 into civicrm:master Jun 12, 2018
@colemanw colemanw deleted the memory_more branch June 12, 2018 12:44
davialexandre added a commit to compucorp/civihr that referenced this pull request Oct 23, 2018
…FullDetails API

This API call is an optimized way to fetch all the information for a Job Contract,
which is stored in multiple different database tables. It does that by running
a single SQL query, joining all the tables and the necessary fields.

In the resultset, in order to differentiate which field belongs to which entity,
each field is prefixed with the entity name (examplei: details__<field_name>). After
the data is fetched from the database, the `HRJobContractRevision::normalizeFullDetailsResult()`
was used to loop through all the fields in the resultset, detect to which entity they
belong to and organize the data in a proper way.

Besides the fields returned by the query, the resultset object also contains some internal
fields. In the past, all of these fields were prefixed with an underscore, but on
civicrm/civicrm-core#12276 a new `resultCopies` field was added,
and since it does not start with an underscore, the logic to filter out the internal
fields stopped working. To fix that, instead of looping through all of the fields from
the resultset, we call the `toArray()` method which will return a list of fields and
values containing only those returned by the SQL query.
davialexandre added a commit to compucorp/civihr that referenced this pull request Nov 6, 2018
…FullDetails API

This API call is an optimized way to fetch all the information for a Job Contract,
which is stored in multiple different database tables. It does that by running
a single SQL query, joining all the tables and the necessary fields.

In the resultset, in order to differentiate which field belongs to which entity,
each field is prefixed with the entity name (examplei: details__<field_name>). After
the data is fetched from the database, the `HRJobContractRevision::normalizeFullDetailsResult()`
was used to loop through all the fields in the resultset, detect to which entity they
belong to and organize the data in a proper way.

Besides the fields returned by the query, the resultset object also contains some internal
fields. In the past, all of these fields were prefixed with an underscore, but on
civicrm/civicrm-core#12276 a new `resultCopies` field was added,
and since it does not start with an underscore, the logic to filter out the internal
fields stopped working. To fix that, instead of looping through all of the fields from
the resultset, we call the `toArray()` method which will return a list of fields and
values containing only those returned by the SQL query.
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

Successfully merging this pull request may close these issues.

3 participants