
Doctrine entity population for large amount of data hits memory limit #1471

Closed
buffcode opened this issue Apr 20, 2018 · 3 comments

@buffcode

Currently the Doctrine populator (and probably the other populators as well) holds every generated entity in a variable while executing. When generating test data at real-life volumes (in terms of object count), we hit memory limits quite fast.

One possible workaround would be to replace each stored entity with a more lightweight reference/proxy object.
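
A minimal sketch of that idea, assuming the populator has access to the EntityManager and that entities expose getId(): Doctrine's getReference() returns an uninitialized proxy, so later generators could still wire up associations without keeping the full objects alive. The names here are illustrative, not the actual Populator internals.

// Illustrative only, not the current Populator implementation.
$insertedReferences = [];

foreach ($generatedEntities as $entity) {
    $manager->persist($entity);
    $manager->flush();

    $class = get_class($entity);
    $id    = $entity->getId(); // assumes an identifier getter

    // Drop the fully hydrated object from the unit of work...
    $manager->detach($entity);

    // ...and keep only an uninitialized proxy in its place. The proxy is
    // hydrated lazily, only if one of its methods is actually called.
    $insertedReferences[$class][] = $manager->getReference($class, $id);
}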

Another workaround: with DoctrineFixturesBundle one could use the --append flag to create the data in multiple batches. But splitting the generation into multiple fixtures won't help, because the associations won't pick up data from previous batches. So I cannot run --append 100 times for 1,000 products, resulting in 1,000,000 entities, and then do another run for just orders containing random (existing) products.
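
For illustration, a hedged sketch of what a follow-up fixture could look like if it read previously appended data back from the database instead of relying on the populator's in-memory list. Product, Order, and the App\Entity namespace are placeholders, and $manager is assumed to be a full EntityManager (as it is in practice under DoctrineFixturesBundle).

use App\Entity\Order;    // placeholder entity
use App\Entity\Product;  // placeholder entity
use Doctrine\Bundle\FixturesBundle\Fixture;
use Doctrine\Common\Persistence\ObjectManager;

class OrderDataFixtures extends Fixture
{
    public function load(ObjectManager $manager)
    {
        $generator = \Faker\Factory::create('de_DE');

        // Read back only the ids of products created by earlier --append runs.
        $productIds = array_column(
            $manager->createQuery('SELECT p.id FROM App\Entity\Product p')
                ->getScalarResult(),
            'id'
        );

        for ($i = 0; $i < 1000; $i++) {
            $order = new Order();
            // getReference() yields a cheap proxy; no product row is hydrated.
            $order->setProduct(
                $manager->getReference(Product::class, $generator->randomElement($productIds))
            );
            $manager->persist($order);

            if (($i + 1) % 100 === 0) {
                $manager->flush();
                $manager->clear(); // keep the unit of work small
            }
        }

        $manager->flush();
    }
}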

Any ideas?

@tomzx
Contributor

tomzx commented Apr 21, 2018

How much RAM are we talking about? Could you provide an example code snippet of your use case?

@buffcode
Author

In my case the script was aborted when it reached the VM's 2 GiB of RAM (even with php -d memory_limit=-1).

My current fixture:

use App\Entity\Organization; // adjust to the actual entity namespace
use Doctrine\Bundle\FixturesBundle\Fixture;
use Doctrine\Common\Persistence\ObjectManager;

class BaseDataFixtures extends Fixture
{
    /**
     * @var \Faker\Generator
     */
    private $generator;

    public function load(ObjectManager $manager)
    {
        $this->generator = \Faker\Factory::create('de_DE');
        $this->generator->seed(44867);

        $populator = new \Faker\ORM\Doctrine\Populator($this->generator, $manager);

        $populator->addEntity(Organization::class, 10000, [
            'createdAt' => function () {
                return $this->generator->dateTimeBetween('-30 YEARS', '-1 YEAR');
            },
            'updatedAt' => function () {
                return $this->generator->optional(0.5)->dateTimeBetween('-1 YEAR');
            },
            // fill the parentOrganization
        'parentOrganization' => function ($inserted, Organization $obj) {
            return !empty($inserted[Organization::class])
                ? $this->generator->optional(0.2)->randomElement($inserted[Organization::class])
                : null;
        }
        ]);

        $populator->execute();
    }
}

\Faker\ORM\Doctrine\Populator::execute() collects all entries in a local variable $insertedEntities in order to pass them to the other generators. There is no call to $entityManager->clear() to free memory (that operation would also break persistence of the references).
As a result, the Populator keeps 10,000 fully hydrated objects in memory. One can verify this easily by printing memory_get_usage(true) inside execute().
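
In pseudo form, the relevant part of execute() looks roughly like this (a simplified sketch based on the description above, not the verbatim library source):

// Simplified sketch, not the verbatim library code.
public function execute()
{
    $insertedEntities = [];

    foreach ($this->quantities as $class => $number) {
        for ($i = 0; $i < $number; $i++) {
            // Each new entity may reference any previously inserted one,
            // which is why the whole list has to stay available...
            $insertedEntities[$class][] = $this->entities[$class]
                ->execute($this->manager, $insertedEntities);
        }
        // ...and why there is a flush() here, but never a clear().
        $this->manager->flush();
    }

    return $insertedEntities;
}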

@pimjansen
Contributor

@buffcode is the memory limit hit due to the generation itself, or due to the size of the objects? I expect the latter, which is partly just how Doctrine works. We could offer the choice to flush in batches of a specific size instead of only once for the total set.
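
A minimal sketch of that batching choice, assuming a configurable $batchSize (not an existing Populator option); the trade-off is that cleared entities can no longer be handed to later generators as live objects and would have to be re-acquired as references:

$batchSize = 500; // assumed tuning knob

for ($i = 0; $i < $total; $i++) {
    $entity = $createNextEntity(); // stands in for the per-entity generator
    $manager->persist($entity);

    if (($i + 1) % $batchSize === 0) {
        $manager->flush();
        $manager->clear(); // detaches everything; memory stays roughly flat
    }
}

// Flush whatever remains in the final partial batch.
$manager->flush();
$manager->clear();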
