vweevers/detect-tabular

detect-tabular

A stream that detects tabular data (spreadsheets, dsv or json) and yields objects.
Supports 20+ different file formats. Spreadsheet and DSV input must have a header row.

Example

npm i detect-tabular map-tabular-keys snake-case jsonstream

const detect = require('detect-tabular')
const fs = require('fs')
const keys = require('map-tabular-keys')
const snake = require('snake-case').snakeCase
const json = require('jsonstream')

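// Read a spreadsheet, detect its format and emit one object per row.
fs.createReadStream('test/air_pollution_nl.xlsx')
  .pipe(detect())
  // Convert every key to snake_case.
  .pipe(keys(snake))
  // Serialize the objects as JSON.
  .pipe(json.stringify())
  .pipe(process.stdout)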

Tip: If you need normalization like this or number coercion, jump to tabular-stream. If you want a CLI that does multi-format conversion, check out tabular-cli.

API

detect([options])

Returns a duplex stream: write any tabular data to it and read back objects. Options are passed as-is to spreadsheet-stream (if applicable).
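
Because the readable side yields plain objects, it can be consumed like any other object-mode stream. The following is a minimal sketch, assuming the returned stream emits the usual `data`, `end` and `error` events. `collectRows` is a hypothetical helper for illustration, not part of detect-tabular:

```javascript
// Hypothetical helper (not part of detect-tabular): gathers every object
// emitted by an object-mode stream into an array.
function collectRows (source) {
  return new Promise((resolve, reject) => {
    const rows = []
    source.on('data', (row) => rows.push(row))
    source.on('end', () => resolve(rows))
    source.on('error', reject)
  })
}

// Usage, assuming detect-tabular is installed and data.csv exists:
//
//   const detect = require('detect-tabular')
//   const fs = require('fs')
//
//   collectRows(fs.createReadStream('data.csv').pipe(detect()))
//     .then((rows) => console.log(rows.length, 'rows'))
//     .catch(console.error)
```

Collecting everything into an array only suits small inputs. For larger files, keep piping to the next stream as in the example above.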

Supported Input Formats

Text formats:

  • DSV (delimiter-separated values, such as CSV and TSV)
  • JSON

Binary formats, through spreadsheet-stream:

  • Office Open XML (xlsx, Excel 2007 and above)
  • SpreadsheetML (xml, Excel 2003)
  • BIFF 5-8 (xls, Excel 95 and above)
  • Open Document Format/OASIS (ods)
  • SYLK
  • And more.

NB. Because these binary formats are not streamable, spreadsheet-stream will buffer the entire input in memory. As a safeguard you can set the maxSize option (in bytes): detect({ maxSize: 1024 * 1024 }). See spreadsheet-stream for details.
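
One way to act on a size limit is to route all stream errors through a single callback with Node's `stream.pipeline`. This is only a sketch: `readWithLimit` is a hypothetical helper, and how spreadsheet-stream reports an exceeded `maxSize` is an assumption here (presumably an `error` event), so check its documentation. The `detect` function is passed in as an argument so that the snippet runs without the package installed:

```javascript
const { pipeline, Writable } = require('stream')

// Hypothetical helper, not part of detect-tabular. `detect` is injected;
// in real use pass require('detect-tabular').
function readWithLimit (input, detect, maxSize, callback) {
  const rows = []

  // Object-mode sink that stores each row it receives.
  const sink = new Writable({
    objectMode: true,
    write (row, _encoding, next) {
      rows.push(row)
      next()
    }
  })

  // maxSize (in bytes) is handed to detect(), which passes its options on.
  // pipeline() forwards an error from any stage to the final callback.
  pipeline(input, detect({ maxSize }), sink, (err) => {
    if (err) return callback(err)
    callback(null, rows)
  })
}

// Usage, assuming detect-tabular is installed:
//
//   readWithLimit(fs.createReadStream('big.xlsx'), require('detect-tabular'),
//     1024 * 1024, (err, rows) => { /* handle err or rows */ })
```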

Install

With npm do:

npm install detect-tabular

License

MIT.

Test data © Statistics Netherlands, The Hague/Heerlen.