Adding TabularDataReader::chunkBy method
nyamsprod committed Jan 26, 2024
1 parent 1a7b889 commit 60b0062
Showing 6 changed files with 75 additions and 0 deletions.
1 change: 1 addition & 0 deletions CHANGELOG.md
@@ -8,6 +8,7 @@ All Notable changes to `Csv` will be documented in this file

- `Statement::select`
- `TabularDataReader::getRecordsAsObject`
- `TabularDataReader::chunkBy`

### Deprecated

20 changes: 20 additions & 0 deletions docs/9.0/reader/tabular-data-reader.md
@@ -472,3 +472,23 @@ $reader->matchingFirstOrFail('row=3-1;4-6'); // will throw

<p class="message-info"> Wraps the functionality of <code>FragmentFinder</code> class.</p>
<p class="message-notice">Added in version <code>9.12.0</code> for <code>Reader</code> and <code>ResultSet</code>.</p>

### chunkBy

<p class="message-notice">Added in version <code>9.15.0</code> for <code>Reader</code> and <code>ResultSet</code>.</p>

If you are working with a large CSV document and want to process it in smaller pieces, use the `chunkBy` method.
It splits the `TabularDataReader` into multiple smaller instances, each containing at most the given number of
records. The last instance may contain fewer records when the total is not a multiple of the chunk size.

```php
use League\Csv\Reader;

$reader = Reader::createFromString($csv);

foreach ($reader->chunkBy(4) as $chunk) {
    foreach ($chunk as $record) {
        // the actual record will be found here.
    }
}
```
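
To illustrate the idea independently of the library, here is a minimal sketch of the same chunking loop written as a plain PHP generator. The function name `chunkRecords` and the use of `InvalidArgumentException` are assumptions made for this example; they are not part of the `league/csv` API, which yields `TabularDataReader` instances rather than arrays.

```php
/**
 * Hypothetical helper (not part of league/csv): groups any iterable into
 * arrays of at most $length items, using the same loop shape as chunkBy.
 *
 * @param iterable<mixed> $records
 *
 * @return Generator<int, list<mixed>>
 */
function chunkRecords(iterable $records, int $length): Generator
{
    if ($length < 1) {
        throw new InvalidArgumentException('The chunk size must be a positive integer.');
    }

    $chunk = [];
    foreach ($records as $record) {
        $chunk[] = $record;
        // A full chunk is emitted as soon as it reaches the requested size.
        if (count($chunk) === $length) {
            yield $chunk;
            $chunk = [];
        }
    }

    // Emit the remaining records, if any, as a final, smaller chunk.
    if ([] !== $chunk) {
        yield $chunk;
    }
}

// Ten records split by 4 give chunks of 4, 4 and 2 records.
$sizes = [];
foreach (chunkRecords(range(1, 10), 4) as $chunk) {
    $sizes[] = count($chunk);
}
```

Because the function is a generator, nothing runs until iteration starts, and that includes the chunk-size check. The generator itself buffers only one chunk at a time.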
12 changes: 12 additions & 0 deletions src/Reader.php
@@ -319,6 +319,18 @@ public function reduce(Closure $closure, mixed $initial = null): mixed
        return ResultSet::createFromTabularDataReader($this)->reduce($closure, $initial);
    }

    /**
     * @param positive-int $length
     *
     * @throws InvalidArgument
     *
     * @return Iterator<TabularDataReader>
     */
    public function chunkBy(int $length): Iterator
    {
        return ResultSet::createFromTabularDataReader($this)->chunkBy($length);
    }

    /**
     * @param Closure(array<mixed>, array-key): bool $closure
     *
30 changes: 30 additions & 0 deletions src/ResultSet.php
@@ -160,6 +160,36 @@ public function reduce(Closure $closure, mixed $initial = null): mixed
        return $initial;
    }

    /**
     * @param positive-int $length
     *
     * @throws InvalidArgument
     *
     * @return Iterator<TabularDataReader>
     */
    public function chunkBy(int $length): Iterator
    {
        if ($length < 1) {
            throw InvalidArgument::dueToInvalidChunkSize($length, __METHOD__);
        }

        $records = [];
        $i = 0;
        foreach ($this->getRecords() as $record) {
            $records[] = $record;
            ++$i;
            if ($i === $length) {
                yield self::createFromRecords($records);
                $i = 0;
                $records = [];
            }
        }

        if ([] !== $records) {
            yield self::createFromRecords($records);
        }
    }

    public function filter(Closure $closure): TabularDataReader
    {
        return Statement::create()->where($closure)->process($this);
1 change: 1 addition & 0 deletions src/TabularDataReader.php
@@ -40,6 +40,7 @@
* @method TabularDataReader matchingFirstOrFail(string $expression) extract the first found fragment identifier of the tabular data or fail
* @method TabularDataReader|null matchingFirst(string $expression) extract the first found fragment identifier of the tabular data or return null if none is found
* @method iterable<int, TabularDataReader> matching(string $expression) extract all found fragment identifiers for the tabular data
* @method iterable<int, TabularDataReader> chunkBy(int $length) Chunk the TabularDataReader into smaller TabularDataReader instances of the given size or less.
*/
interface TabularDataReader extends Countable, IteratorAggregate
{
11 changes: 11 additions & 0 deletions src/TabularDataReaderTestCase.php
@@ -449,6 +449,17 @@ public function __construct(

        self::assertInstanceOf($class::class, $this->tabularDataWithHeader()->firstAsObject($class::class, ['observedOn', 'temperature', 'place']));
    }

    public function testChunkingTabularDataUsingTheRangeMethod(): void
    {
        self::assertCount(2, [...$this->tabularData()->chunkBy(4)]);
        foreach ($this->tabularDataWithHeader()->chunkBy(4) as $offset => $item) {
            match ($offset) {
                0 => self::assertCount(4, $item),
                default => self::assertCount(2, $item),
            };
        }
    }
}

enum Place: string