Expand on row-oriented API drawbacks
adamreeve committed Mar 6, 2022
1 parent 4a86103 commit 77104a8
Showing 1 changed file with 16 additions and 0 deletions.
16 changes: 16 additions & 0 deletions docs/RowOriented.md
@@ -22,9 +22,25 @@ for (int i = 0; i != timestamps.Length; ++i)
    }
}

// Write a new row group (pretend we have new timestamps, objectIds and values)
rowWriter.StartNewRowGroup();
for (int i = 0; i != timestamps.Length; ++i)
{
    for (int j = 0; j != objectIds.Length; ++j)
    {
        rowWriter.WriteRow((timestamps[i], objectIds[j], values[i][j]));
    }
}

rowWriter.Close();
```

Internally, ParquetSharp will build up a buffer of row values and then write each column when the file
is closed or a new row group is started.
This means all values in a row group must be stored in memory at once,
and the row values buffer must be resized and copied as it grows.
Therefore, it's recommended to use the lower-level column-oriented API if performance is a concern.
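
For comparison, here is a minimal sketch of writing the same kind of data with the column-oriented API. It assumes `timestamps` is a `DateTime[]`, `objectIds` is an `int[]` and `values` is a `float[][]`, matching the tuple written by `WriteRow` above. The class name `ColumnOrientedExample`, the method name `Write` and the column names are hypothetical and chosen only for illustration.

```csharp
using System;
using System.Linq;
using ParquetSharp;

public static class ColumnOrientedExample
{
    // Hypothetical helper: writes one row group with one typed array per column.
    public static void Write(string path, DateTime[] timestamps, int[] objectIds, float[][] values)
    {
        // Flatten the (timestamp, objectId) grid into one array per column,
        // keeping the same row order as the nested loops in the row-oriented example.
        var timestampColumn = timestamps.SelectMany(t => objectIds.Select(_ => t)).ToArray();
        var objectIdColumn = timestamps.SelectMany(_ => objectIds).ToArray();
        var valueColumn = values.SelectMany(row => row).ToArray();

        // Assumed column names; use whatever names your schema requires.
        var columns = new Column[]
        {
            new Column<DateTime>("Timestamp"),
            new Column<int>("ObjectId"),
            new Column<float>("Value"),
        };

        using var file = new ParquetFileWriter(path, columns);
        using (var rowGroup = file.AppendRowGroup())
        {
            // Columns must be written in schema order, one at a time.
            using (var writer = rowGroup.NextColumn().LogicalWriter<DateTime>())
            {
                writer.WriteBatch(timestampColumn);
            }
            using (var writer = rowGroup.NextColumn().LogicalWriter<int>())
            {
                writer.WriteBatch(objectIdColumn);
            }
            using (var writer = rowGroup.NextColumn().LogicalWriter<float>())
            {
                writer.WriteBatch(valueColumn);
            }
        }

        file.Close();
    }
}
```

This sketch still materializes each column as an array before writing, so it does not by itself reduce peak memory use. What it avoids is the intermediate buffer of row values that has to be resized and copied as rows are added.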

## Explicit column mapping

The row-oriented API allows for specifying your own name-independent/order-independent column mapping using the optional `MapToColumn` attribute.