
Memory exhaust 1024M in queued export with FromQuery method #2209

Closed
3 tasks done
gabbie-newell opened this issue May 27, 2019 · 18 comments

@gabbie-newell

gabbie-newell commented May 27, 2019

Prerequisites

  • Able to reproduce the behaviour outside of your code, the problem is isolated to Laravel Excel.
  • Checked that your issue isn't already filed.
  • Checked if no PR was submitted that fixes this problem.

Versions

  • PHP version: 7.2.8
  • Laravel version: 5.8
  • Package version: 3.1

Description

Memory limit is 1024M. When I try to export a large dataset (400,000 rows) with a queue, I get a memory-exhausted error. A single php artisan queue:work process uses a large amount of memory while executing the export: about 40M at the start, and 900M+ by the end. I think the FromQuery method uses so much memory because, when it tries to append rows, it first loads (initializes) the temp file, which grows large as it fills with rows.
I found some advice here suggesting a FromIterator method instead of FromQuery, but there is no documentation and no example of how to use it.
So how can I use a FromIterator method with a queue? And why does the FromQuery method use so much memory? Is it possible to append rows without loading the whole file? Maybe it would be better to append to a text file without loading the file itself (like the Linux command echo "'First column';'Second column';\r\n'Second row';\r\n" >> export.csv), then convert it to XLSX?

Steps to Reproduce

Create a table with 400,000 rows and try to export it with the queue method:

$bulk = Auth::user()->bulks()->withCount('contacts')->findOrFail($id);

(new BulkExport($bulk))->queue('exported/'.Auth::user()->id.'/'.$bulk->id.'.xlsx')->chain([
    new NotifyUserOfCompletedExport($bulk),
]);

Expected behavior:

The job NotifyUserOfCompletedExport($bulk) should be executed.

Actual behavior:

php artisan queue:work crashes with a 1024M memory-exhausted error.

Additional Information

Here is the code of my controller:

public function export($id)
{
    $bulk = Auth::user()->bulks()->withCount('contacts')->findOrFail($id);
    (new BulkExport($bulk))->queue('exported/'.Auth::user()->id.'/'.$bulk->id.'.xlsx')->chain([
        new NotifyUserOfCompletedExport($bulk),
    ]);
    return redirect()
        ->route('bulks.show', $bulk->id)
        ->with('status', ['Your export is executing. Please wait for "Export is ready" notification with download link.']);
}

In BulkExport file:

<?php

namespace App\Exports;

use App\Contact;
use App\Bulk;
use Carbon\Carbon;
use Illuminate\Contracts\Queue\ShouldQueue;
use Maatwebsite\Excel\Concerns\FromQuery;
use Maatwebsite\Excel\Concerns\Exportable;
use Maatwebsite\Excel\Concerns\WithHeadings;
use Maatwebsite\Excel\Concerns\WithMapping;

class BulkExport implements FromQuery, WithMapping, WithHeadings, ShouldQueue
{
    use Exportable;

    /**
     * @var Bulk
     */
    private $bulk;

    public function __construct(Bulk $bulk)
    {
        $this->bulk = $bulk;
    }

    public function query()
    {
        return Contact::query()->where('bulk_id', $this->bulk->id);
    }


    public function map($contact): array
    {
        return [
            (string)$contact->number,
            (string)$contact->tries,
            $contact->date_tries ? $contact->date_tries : 'No tries yet',
            $contact->status_tries ? implode(',', str_replace([0,1,2,3,4,5], ['In process','Success','Incoming message','No answer','Busy','Error'], $contact->status_tries)) : 'In process',
            str_replace([0,1,2,3,4,5], ['In process','Success','Incoming message','No answer','Busy','Error'], $contact->status)
        ];
    }
    public function headings(): array
    {
        return [
            ['Bulk name', $this->bulk->name],
            ['Time range', 'from '.Carbon::parse($this->bulk->time_from)->format('H:i').' to '.Carbon::parse($this->bulk->time_to)->format('H:i'), 'Type', 'Basic'],
            ['Max tries', $this->bulk->max_tries, 'Tries interval',$this->bulk->tries_interval],
            ['Status', str_replace([0,1,2],['In process','Finished','Waiting'],$this->bulk->status), 'Progress', ceil($this->bulk->processed*100/$this->bulk->contacts_count).' %'],
            ['Remaining cost', $this->bulk->max_cost.' $', 'Cost', $this->bulk->finished_at?(string)$this->bulk->final_cost.' $':'Unknown'],
            ['Created at', $this->bulk->created_at, 'Finished at', $this->bulk->finished_at?$this->bulk->finished_at:'Unknown'],
            ['Finals short'],
            $this->bulk->status==1?
                [   'Success level',
                    ceil($this->bulk->contacts()->whereIn('status',[1,2])->count() * 100 / $this->bulk->contacts_count).' %',
                    'Successful',
                    (string)$this->bulk->contacts()->where('status',1)->count()
                ]
                :
                ['','Not finished yet']
            ,
            $this->bulk->status==1?
                [   'Incoming',
                    (string)$this->bulk->contacts()->where('status',2)->count(),
                    'No answer',
                    (string)$this->bulk->contacts()->where('status',3)->count()
                ]
                :
                ['','','Not finished yet']
            ,
            $this->bulk->status==1?
                [   'Busy',
                    (string)$this->bulk->contacts()->where('status',4)->count(),
                    'Send error',
                    (string)$this->bulk->contacts()->where('status',5)->count()
                ]
                :
                ['','','','Not finished yet']
            ,
            ['Finals detail'],
            ['Number', 'Tries', 'Date of tries', 'Status of tries',  'Final status']
        ];
    }
}
@MohammadWaleed

MohammadWaleed commented May 29, 2019

I'm having the same issue; I tried changing the chunk size, but it didn't help.
The memory usage keeps increasing gradually with each AppendQueryToSheet job.
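For reference, the chunk size being changed here is, as far as I recall, driven by the package config (key name from the published config/excel.php of 3.1; verify against your installed version):

```php
// config/excel.php (fragment; value is illustrative)
'exports' => [
    'chunk_size' => 1000, // rows read per queued append job
],
```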

@gabbie-newell
Author

I'm having the same issue, tried changing the chunk size.. didn't help

I tried that too; it didn't help. I'm planning to do the export via the Linux echo command to a CSV file, without the Laravel-Excel package. I will attach my results here later.

@patrickbrouwers
Member

There's currently not a whole lot we can do about the increasing memory usage. Unfortunately PhpSpreadsheet doesn't have a way to re-open a spreadsheet without loading it into memory. In Laravel-Excel 3.2 this will be slightly improved, as it will have the option to use cell caching (which decreases memory usage). We are also planning to release a package specifically designed to handle big data exports (using a queue).

@gabbie-newell
Author

gabbie-newell commented May 31, 2019

There's currently not a whole lot we can do about the increasing memory usage. Unfortunately PhpSpreadsheet doesn't have a way to re-open a spreadsheet without loading it into memory. In Laravel-Excel 3.2 this will be slightly improved, as it will have the option to use cell caching (which decreases memory usage). We are also planning to release a package specifically designed to handle big data exports (using a queue).

Unfortunately, the Laravel Storage facade doesn't support appending to a file without reading it first.
I solved my problem with the PHP function fopen in "a+b" mode, exporting only to a CSV file. It works fine and is very fast.
I'll wait for the next version of the Laravel-Excel package. I wish you all good luck with the development of the new version. Either way, the Laravel-Excel package is amazing.

This is my solution:
In controller:

public function export($id)
{
    $bulk = Auth::user()->bulks()->withCount('contacts')->findOrFail($id);
    ExportBulkHeadings::withChain([
        new ExportBulkContent($bulk),
        new ExportBulkFile($bulk),
        new NotifyUserOfCompletedExport($bulk)
    ])->dispatch($bulk)->allOnQueue('export');
    return redirect()
        ->route('bulks.show', $bulk->id)
        ->with('status', ['Your export is executing. Please wait for "Export is ready" notification with download link.']);
}

In ExportBulkContent Job:

    public function handle()
    {
        $bulk = $this->bulk;
        $this->bulk->contacts()->chunk(1000, function($contacts) use($bulk){
            ExportBulkRows::dispatch($bulk,$contacts)->onQueue('export');
        });
    }

In ExportBulkRows Job:

    public function handle()
    {
        // Open once per job (not once per row) in append mode, so the
        // existing file is never read into memory.
        $file = fopen(Storage::disk('tmp')->path($temp), 'a+b');
        foreach ($this->contacts as $contact) {
            fwrite($file,
                '"'.$contact->number.'";'
                .'"'.$contact->tries.'";'
                .($contact->date_tries ? $this->s(implode(',', $contact->date_tries)) : '"No tries yet";')
                .'"'.($contact->status_tries ? implode(',', str_replace([0,1,2,3,4,5], ['In process','Success','Recalled','No answer','Busy','Error'], $contact->status_tries)) : 'In process').'";'
                .'"'.(string)$contact->duration.'";'
                .'"'.str_replace([0,1,2,3,4,5], ['In process','Success','Recalled','No answer','Busy','Error'], $contact->status).'";'
                ."\n");
        }
        fclose($file);
    }

    public function failed(Exception $exception)
    {
        $this->bulk->user->notify(new ExportFailure($this->bulk->toArray()));
    }

    private function s($string)
    {
        return '"'.str_replace(['"',';'],'',$string).'";';
    }
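Side note: PHP's built-in fputcsv handles quoting and delimiter escaping automatically, which would make the manual quoting and the s() helper unnecessary. A minimal sketch of the same append-mode idea (the path and row values below are placeholders, not the job's real data):

```php
<?php
// Append rows to a CSV without ever loading the existing file:
// open in append mode, write the new rows, close. Memory stays
// flat no matter how large the file already is.
$path = sys_get_temp_dir() . '/export_sketch.csv'; // placeholder path
@unlink($path); // start clean for this demo

$chunks = [ // stand-ins for two chunks of contacts
    [['79991234567', '2', 'No tries yet', 'In process']],
    [['79997654321', '1', '2019-05-27 10:00', 'Success']],
];

foreach ($chunks as $rows) {
    $file = fopen($path, 'ab'); // append, binary-safe
    foreach ($rows as $row) {
        fputcsv($file, $row, ';'); // quoting/escaping handled for us
    }
    fclose($file);
}

echo substr_count(file_get_contents($path), "\n"); // 2
```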

And a single php artisan queue:work --queue=export process is enough, because it finishes so fast. With multiple php artisan queue:work --queue=export processes, my CSV file is filled with rows out of order and the dispatch chain doesn't work as I'd like. But that's another story :)

@rakishii13

On large file exports, PhpSpreadsheet causes the file to be opened every time, and this makes the server run out of memory due to the size of the file, which is similar to the issue above. You mentioned there is another package to handle large data; any idea when it would be released?

@patrickbrouwers
Member

Probably in a few weeks.

@wouterrutgers

Hi @patrickbrouwers, is there any status update on the package for large datasets?

@arondeparon

There's currently not a whole lot we can do about the increasing memory usage. Unfortunately PhpSpreadsheet doesn't have a way to re-open a spreadsheet without loading it into memory. In Laravel-Excel 3.2 this will be slightly improved, as it will have the option to use cell caching (which decreases memory usage). We are also planning to release a package specifically designed to handle big data exports (using a queue).

@patrickbrouwers I think "a few weeks" have passed by now 😅

Is there anything we can do to get 3.2 shipped? We are also stumbling upon memory issues and would really love to see if this fixes our problems.

I don't see any definition of 3.2 in the milestones, so I am wondering what would define it as "done".

@pedrommone

Hey @patrickbrouwers!

I've been searching for a way to export large volumes too. I even tried FromIterator and FromArray, and also tried the new cursor() feature from Laravel 6.x.

They all result in a memory issue. Are there any optimisations we can apply right now to fix this?

@dustingtaylor

Also looking for this. Dealing with some exports with very large data sets.

@ragingdave

To be honest, the maintainers don't seem to care about large data sets, as their own use cases don't seem to involve them. This has been an issue for over a year at this point and they haven't addressed it yet... good luck.

@rdgout

rdgout commented Mar 2, 2020

@patrickbrouwers Any update on this?

@patrickbrouwers
Member

patrickbrouwers commented Mar 2, 2020

Lack of progress is unrelated to us not caring; it's simply a lack of time to work on this project. :( That's open source, you know :)

In the meantime you can have a look at https://stefanzweifel.io/posts/lazy-collections-in-laravel-excel/ According to his benchmarks, he's able to export millions of rows without running into a memory limit. We are working on making this the default behaviour for exports in 3.2; given the lack of time, I really can't make any promises on timelines. Hope you all understand :)
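The core property of the lazy-collection approach linked above is that rows are materialized one at a time rather than all at once. The idea can be illustrated with a plain PHP generator, independent of Laravel (row contents and count here are illustrative):

```php
<?php
// A generator yields one row at a time, so peak memory stays roughly
// constant regardless of how many rows get exported.
function rows(int $count): Generator {
    for ($i = 1; $i <= $count; $i++) {
        yield [$i, "contact-$i"]; // placeholder row data
    }
}

$path = sys_get_temp_dir() . '/lazy_sketch.csv';
$out = fopen($path, 'wb');
$written = 0;
foreach (rows(100000) as $row) { // never holds 100k rows in memory at once
    fputcsv($out, $row);
    $written++;
}
fclose($out);

echo $written; // 100000
```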

@ultrono

ultrono commented Jul 31, 2023

Lack of progress is unrelated to us not caring; it's simply a lack of time to work on this project. :( That's open source, you know :)

In the meantime you can have a look at https://stefanzweifel.io/posts/lazy-collections-in-laravel-excel/ According to his benchmarks, he's able to export millions of rows without running into a memory limit. We are working on making this the default behaviour for exports in 3.2; given the lack of time, I really can't make any promises on timelines. Hope you all understand :)

That unfortunately doesn't work when dispatching the export to a queue (without a queue it works fine).

When you implement \Maatwebsite\Excel\Concerns\FromCollection and return an \Illuminate\Support\LazyCollection instance from your collection() method (i.e. ->cursor()), then use a queue to export, you'll get an exception:

Maatwebsite\Excel\QueuedWriter::exportCollection(): Return value must be of type Illuminate\Support\Collection, Illuminate\Support\LazyCollection returned

I'm using Laravel 9 with package version 3.1.48 (the latest).

Using a non-lazy collection isn't really an option here, as I'm loading a lot of data.

EDIT: after a bit of research into this issue, it appears this functionality was removed in a past commit. I've decided to keep the lazy collection and just run it as a custom background task. A shame it's not possible at present using Laravel Excel.

@patrickbrouwers
Member

patrickbrouwers commented Jul 31, 2023

Can you show me the commit that removed it?

I think in 2020 LazyCollection still extended Collection, and now it doesn't anymore.

@ultrono

ultrono commented Nov 5, 2023

Can you show me the commit that removed it?

I think in 2020 LazyCollection still extended Collection, and now it doesn't anymore.

Appears to be d3560cc#diff-eb05512b565745cc1514fe33aa948e31f9ead7e03abb8cf18a3a2a00354d2ccf, where a Collection PHP 8 return type was added. Previously the return type was declared via doc blocks.

Would a union return type fix this?

private function exportCollection(
    FromCollection $export,
    TemporaryFile $temporaryFile,
    string $writerType,
    int $sheetIndex
): Collection|LazyCollection {

@patrickbrouwers
Member

We cannot use union types at the moment, as we still support older PHP versions. I would be fine with removing the return type (and doing it via doc blocks again) for now, until a major release, if someone PRs it.
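A doc-block-only variant along those lines might look like this (a sketch against the method signature shown above, not an actual patch):

```php
/**
 * @return \Illuminate\Support\Collection|\Illuminate\Support\LazyCollection
 */
private function exportCollection(
    FromCollection $export,
    TemporaryFile $temporaryFile,
    string $writerType,
    int $sheetIndex
) {
    // body unchanged
}
```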

@sebestenyb
Contributor

@patrickbrouwers please see the PR #4044
