A stream that detects tabular data (spreadsheets, DSV or JSON) and yields objects.
Supports 20+ different file formats. Spreadsheets and DSV must have a header.
npm i detect-tabular map-tabular-keys snake-case jsonstream
const detect = require('detect-tabular')
const fs = require('fs')
const keys = require('map-tabular-keys')
const snake = require('snake-case').snakeCase
const json = require('jsonstream')
fs.createReadStream('test/air_pollution_nl.xlsx')
  .pipe(detect())
  .pipe(keys(snake))
  .pipe(json.stringify())
  .pipe(process.stdout)
Tip: if you need key normalization like this, or number coercion, see tabular-stream. If you want a CLI that does multi-format conversion, check out tabular-cli.
Returns a duplex stream: give it any tabular data, get back objects. Options are passed as-is to spreadsheet-stream (if applicable).
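Because the return value is an ordinary duplex stream, it can be used with Node's stream.pipeline, which also forwards errors from every stage. The sketch below is a minimal illustration, not part of this module: collectRows is a hypothetical helper name, and the file name in the usage comment is a placeholder.

```javascript
const { pipeline, Writable } = require('stream')

// Hypothetical helper: pipe `source` through `parser` (for example detect())
// and resolve with every object the parser emits.
function collectRows (source, parser) {
  return new Promise((resolve, reject) => {
    const rows = []

    // Object-mode sink that simply remembers each row it receives.
    const sink = new Writable({
      objectMode: true,
      write (row, _encoding, next) {
        rows.push(row)
        next()
      }
    })

    // pipeline() rejects on an error from any of the three streams.
    pipeline(source, parser, sink, (err) => {
      if (err) reject(err)
      else resolve(rows)
    })
  })
}

// Assumed usage (placeholder file name):
// const detect = require('detect-tabular')
// const fs = require('fs')
// collectRows(fs.createReadStream('data.csv'), detect()).then(console.log)
```

Collecting rows into an array defeats the point of streaming for large files, so this is best kept for small inputs and tests.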
Text formats:
- DSV (CSV, TSV or anything) through csv-parser
- JSON and NDJSON through JSONStream
Binary formats, through spreadsheet-stream:
- Office Open XML (xlsx, Excel 2007 and above)
- SpreadsheetML (xml, Excel 2003)
- BIFF 5-8 (xls, Excel 95 and above)
- Open Document Format/OASIS (ods)
- SYLK
- And more.
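To give a feel for how a stream can tell these formats apart, here is a rough sketch of content sniffing on the first bytes of the input. It is an illustration only: the function name sniff and its return labels are made up here, and the module's real detection logic is not described in this README and may differ.

```javascript
// Illustrative only: classify input by its leading bytes.
// `head` is a Buffer holding the start of the file.
function sniff (head) {
  if (head.length >= 4) {
    const magic = head.readUInt32BE(0)
    // ZIP signature "PK\x03\x04": the container used by xlsx and ods.
    if (magic === 0x504b0304) return 'zip'
    // OLE2 compound file signature (D0 CF 11 E0 ...): the usual container for xls.
    if (magic === 0xd0cf11e0) return 'ole'
  }

  // Otherwise treat the input as text: drop a BOM and leading whitespace.
  const text = head.toString('utf8').replace(/^\uFEFF/, '').trimStart()
  if (text.startsWith('<')) return 'xml'
  if (text.startsWith('{') || text.startsWith('[')) return 'json'

  // Anything else is assumed to be delimiter-separated values.
  return 'dsv'
}
```

A heuristic like this can misfire (a CSV whose first cell starts with a bracket, for instance), which is one reason to rely on the module rather than on hand-rolled checks.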
NB. Because these binary formats are not streamable, spreadsheet-stream will buffer the whole file in memory. As a safeguard you can set the maxSize option (in bytes): detect({ maxSize: 1024 * 1024 }). See spreadsheet-stream for details.
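The idea behind such a limit can be sketched with a plain Transform that buffers its input and fails once a byte budget is exceeded. This is a conceptual illustration under the assumption that the guard works roughly this way. It is not spreadsheet-stream's implementation, and bufferWithLimit is a hypothetical name.

```javascript
const { Transform } = require('stream')

// Hypothetical sketch: buffer all input, but error out as soon as more
// than `maxSize` bytes have been written.
function bufferWithLimit (maxSize) {
  const chunks = []
  let size = 0

  return new Transform({
    transform (chunk, _encoding, next) {
      size += chunk.length
      if (size > maxSize) {
        // Passing an error to the callback makes the stream emit 'error'.
        return next(new Error(`Input exceeds maxSize of ${maxSize} bytes`))
      }
      chunks.push(chunk)
      next()
    },
    flush (done) {
      // Input ended within budget: hand over the complete buffer at once.
      this.push(Buffer.concat(chunks))
      done()
    }
  })
}
```

Failing early like this keeps an unexpectedly large upload from exhausting memory, at the cost of rejecting files above the chosen limit.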
With npm do:
npm install detect-tabular
MIT.
Test data © Statistics Netherlands, The Hague/Heerlen.