This is an epic for creating a detailed guide for implementors of the specs. The guide would be the main place for implementors to understand the nuances of the specs and how best to proceed.
User Stories
As a Developer creating an implementation I want a single guide on implementation to complement the bare specs so that I understand the nuances of the specs and how best to proceed.
Suggested API, existing implementations to look at
FAQs e.g. how to deal with INF, -INF if your language does not support it
Backwards compatibility (pre-v1 support)
As a Developer looking to create an implementation I want to know what the interface of my library should look like so that I can design it well and consistently with libraries in other languages
As a Developer looking to use one of the existing Data Package libraries I want to get up and running as quickly as possible so that I can be very productive
I want a quick walkthrough, in my language, of doing the simple things quickly
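For the FAQ item on INF and -INF in the first story above, a minimal sketch of one possible approach is shown here. It assumes the Table Schema number type permits the case-insensitive special strings NaN, INF and -INF; the function names cast_number and number_to_json are hypothetical and not part of any existing library. A language without native infinities would substitute its own sentinel values for math.inf.

```python
import math

# Special string values assumed to be permitted for the "number" type
# (matched case-insensitively).
SPECIAL_NUMBERS = {"inf": math.inf, "-inf": -math.inf, "nan": math.nan}


def cast_number(value):
    """Cast a string cell to a float, honouring INF, -INF and NaN."""
    text = value.strip()
    special = SPECIAL_NUMBERS.get(text.lower())
    if special is not None:
        return special
    return float(text)


def number_to_json(value):
    """Return a JSON-safe representation of a float.

    JSON has no literal for infinities or NaN, so they are written
    back out as the same special strings they were read from.
    """
    if math.isnan(value):
        return "NaN"
    if math.isinf(value):
        return "INF" if value > 0 else "-INF"
    return value
```

Writing the special values back as strings keeps a saved descriptor or data file valid JSON, at the cost of the reader having to cast them again on load.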
There is an implementer note about resource.url support in the Table Resource spec. I suppose we could do the same for Table Schema date/time/datetime and the fmt: prefix, since temporal formats are the most common source of mistakes I have seen. Implementations should preferably support the previous version too.
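A minimal sketch of what such backwards compatibility could look like. It rests on two assumptions that should be checked against the specs: that pre-v1 descriptors wrote temporal patterns with a fmt: prefix (for example fmt:%d/%m/%Y) where v1 uses the bare pattern, and that a pre-v1 resource.url maps directly onto path. All function names are hypothetical.

```python
from datetime import date, datetime

# Assumption: pre-v1 temporal formats carried this prefix before the pattern.
LEGACY_FORMAT_PREFIX = "fmt:"


def normalize_temporal_format(fmt):
    """Strip the legacy prefix so old and new descriptors share one code path."""
    if fmt.startswith(LEGACY_FORMAT_PREFIX):
        return fmt[len(LEGACY_FORMAT_PREFIX):]
    return fmt


def parse_date(value, fmt="default"):
    """Parse a date cell using a strptime-style pattern or the ISO default."""
    pattern = normalize_temporal_format(fmt)
    if pattern == "default":
        pattern = "%Y-%m-%d"
    return datetime.strptime(value, pattern).date()


def normalize_resource(resource):
    """Return a copy of the resource with a legacy 'url' moved to 'path'."""
    result = dict(resource)
    if "path" not in result and "url" in result:
        result["path"] = result.pop("url")
    return result
```

Normalizing once at load time means the rest of the library only ever sees the current form of the descriptor. The "any" format is deliberately left out of this sketch.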
FAQs
Dereferencing schemas before validation: Should we also have a note here about dereferencing? That is, if schema is a url-or-path, it should be dereferenced before descriptor validation (a separate question is that the spec doesn't touch on the concept of validation at all).
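The dereferencing step could be sketched as follows, using only the standard library. The helper names load_json and dereference_schema are hypothetical, and the sketch assumes a string-valued schema is either an http(s) URL or a path relative to the descriptor's directory.

```python
import json
import os
from urllib.parse import urlparse
from urllib.request import urlopen


def load_json(reference, base_path="."):
    """Load JSON from an http(s) URL or from a path relative to base_path."""
    if urlparse(reference).scheme in ("http", "https"):
        with urlopen(reference) as response:
            return json.loads(response.read().decode("utf-8"))
    with open(os.path.join(base_path, reference), encoding="utf-8") as handle:
        return json.load(handle)


def dereference_schema(resource, base_path="."):
    """Return a copy of the resource with a string 'schema' replaced by its content.

    Resources whose schema is already an inline object are returned unchanged.
    """
    result = dict(resource)
    if isinstance(result.get("schema"), str):
        result["schema"] = load_json(result["schema"], base_path)
    return result
```

Running this before validation means the validator only ever has to handle schema objects, never references.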
Interface design and code walkthrough
Walkthrough - simple elegant code for each of these (could do as one big code block or as separate ones -- maybe easier together as you can reuse earlier pages)
Creating / loading a data package from a url, path and getting the data
Tabular and Non-Tabular (e.g. GeoJSON)
Bonus: ?? inline data (create from descriptor with inline data)
Create a data package (from descriptor or null) and set properties and then save
Create a data package with CSV and guess schema and then save ...
Load a Tabular Data Package and then save to a database ...
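The "guess schema and then save" step in the list above could look roughly like this. It is a standalone sketch using only the standard library, not the API of any existing Data Package library; guess_type, infer_schema, build_descriptor and save_descriptor are hypothetical names, and the type guessing covers only integer, number and string.

```python
import csv
import io
import json


def guess_type(values):
    """Guess a Table Schema type for a column of string cells."""
    def all_cast(cast):
        try:
            for value in values:
                cast(value)
        except ValueError:
            return False
        return True

    if all_cast(int):
        return "integer"
    if all_cast(float):
        return "number"
    return "string"


def infer_schema(csv_text):
    """Build a schema from CSV text whose first row is the header."""
    rows = list(csv.reader(io.StringIO(csv_text)))
    header, body = rows[0], rows[1:]
    fields = []
    for index, name in enumerate(header):
        column = [row[index] for row in body]
        fields.append({"name": name, "type": guess_type(column)})
    return {"fields": fields}


def build_descriptor(name, csv_path, csv_text):
    """Wrap one CSV file in a data package descriptor with a guessed schema."""
    return {
        "name": name,
        "resources": [{
            "name": "data",
            "path": csv_path,
            "profile": "tabular-data-resource",
            "schema": infer_schema(csv_text),
        }],
    }


def save_descriptor(descriptor, target="datapackage.json"):
    """Write the descriptor to disk as JSON."""
    with open(target, "w", encoding="utf-8") as handle:
        json.dump(descriptor, handle, indent=2)
```

A real implementation would likely sample rows rather than read the whole file, and would cover the remaining types (boolean, date and so on).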
We recommend library implementors support an interface similar to the following:
Python
# $ pip install datapackage==1.0.0a4
from datapackage import DataPackage

# With datapackage-v1 [WIP]:
# - validate and update logic should be synced with JavaScript version
# - added functions like add/get/remove_resource etc

# Remote tabular
dataPackage = DataPackage('https://raw.githubusercontent.com/frictionlessdata/datapackage-py/master/tests/fixtures/datapackage/datapackage.json')
for item in dataPackage.resources[0].table.read(keyed=True):
    print('City %s has an id %s' % (item['city'], item['id']))

# Local tabular
dataPackage = DataPackage('datapackage/datapackage.json')
for item in dataPackage.resources[0].table.read(keyed=True):
    print('City %s has an id %s' % (item['city'], item['id']))

# Inline tabular
dataPackage = DataPackage({
    'name': "datapackage",
    'resources': [
        {
            'name': "data",
            'path': ["https://raw.githubusercontent.com/frictionlessdata/datapackage-py/master/tests/fixtures/datapackage/data.csv"],
            'profile': "tabular-data-resource",
            'dialect': {
                'quoteChar': "|"
            },
            'schema': {
                'fields': [
                    {
                        'name': "id",
                        'type': "integer"
                    },
                    {
                        'name': "city",
                        'type': "string"
                    }
                ]
            }
        }
    ]
})
for item in dataPackage.resources[0].table.read(keyed=True):
    print('City %s has an id %s' % (item['city'], item['id']))

# Create from scratch and update datapackage
dataPackage = DataPackage({})
dataPackage.descriptor['name'] = 'datapackage'
dataPackage.descriptor['description'] = 'Good data package'
dataPackage.descriptor['resources'] = [{
    'name': 'cities',
    'profile': 'tabular-data-resource',
    'path': ["https://raw.githubusercontent.com/frictionlessdata/datapackage-py/master/tests/fixtures/datapackage/data.csv"],
    'schema': {
        'fields': [
            {
                'name': "id",
                'type': "integer"
            },
            {
                'name': "city",
                'type': "string"
            }
        ]
    }
}]
for item in dataPackage.resources[0].table.read(keyed=True):
    print('City %s has an id %s' % (item['city'], item['id']))

# Non-tabular datapackage
dataPackage = DataPackage({
    'name': 'geojson',
    'resources': [
        {
            'name': 'point',
            'data': {
                "type": "Feature",
                "geometry": {
                    "type": "Point",
                    "coordinates": [125.6, 10.1]
                },
                "properties": {
                    "name": "Dinagat Islands"
                }
            }
        }
    ]
})
print(dataPackage.resources[0].source['type'])  # Feature

# Load a Tabular Data Package and then save to a database
# This API is WIP - https://github.com/frictionlessdata/datapackage-py/issues/132
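Since the library API for saving to a database is still a work in progress, the following is only a standalone illustration of the idea using the standard library's sqlite3 module. It is not the datapackage API: save_table and the SQL_TYPES mapping are hypothetical, and rows are assumed to be keyed dictionaries like those returned by table.read(keyed=True) above.

```python
import sqlite3

# Hypothetical mapping from Table Schema types to SQLite column types;
# anything not listed falls back to TEXT.
SQL_TYPES = {
    "integer": "INTEGER",
    "number": "REAL",
    "boolean": "INTEGER",
    "string": "TEXT",
}


def save_table(connection, table_name, schema, rows):
    """Create a table from a schema and insert keyed rows into it.

    Table and column names are interpolated into the SQL text, so this
    sketch is only suitable for trusted descriptors.
    """
    names = [field["name"] for field in schema["fields"]]
    columns = ", ".join(
        '"{}" {}'.format(field["name"], SQL_TYPES.get(field["type"], "TEXT"))
        for field in schema["fields"]
    )
    connection.execute('CREATE TABLE "{}" ({})'.format(table_name, columns))
    placeholders = ", ".join("?" for _ in names)
    connection.executemany(
        'INSERT INTO "{}" VALUES ({})'.format(table_name, placeholders),
        [tuple(row[name] for name in names) for row in rows],
    )
    connection.commit()


# Usage with an in-memory database and two already-cast rows.
connection = sqlite3.connect(":memory:")
schema = {"fields": [{"name": "id", "type": "integer"},
                     {"name": "city", "type": "string"}]}
rows = [{"id": 1, "city": "london"}, {"id": 2, "city": "paris"}]
save_table(connection, "cities", schema, rows)
```

Deriving the column types from the schema keeps the database table consistent with the descriptor, which is the main point of storing a Tabular Data Package this way.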
JavaScript
// ES6 with async/await
// $ npm install datapackage@latest
// $ node7 --harmony-async-await example.js
const DataPackage = require('datapackage').DataPackage
// With tableschema-v1 [WIP]:
// - no need to await resource.table
// - resource.table.read({keyed: true})
// https://github.com/frictionlessdata/tableschema-js/pull/69
// Remote tabular
async function example1() {
  // Load will throw an error on invalid descriptor
  const dataPackage = await DataPackage.load('https://raw.githubusercontent.com/frictionlessdata/datapackage-py/master/tests/fixtures/datapackage/datapackage.json')
  const table = await dataPackage.resources[0].table
  // Read will throw an error if the data does not comply with the schema
  const data = await table.read(true)
  // Declare the loop variables so the code also runs in strict mode
  for (const {id, city} of data) {
    console.log(`City ${city} has an id ${id}`)
  }
}
// Local tabular
async function example2() {
  // Load will throw an error on invalid descriptor
  const dataPackage = await DataPackage.load('datapackage/datapackage.json')
  const table = await dataPackage.resources[0].table
  // Read will throw an error if the data does not comply with the schema
  const data = await table.read(true)
  for (const {id, city} of data) {
    console.log(`City ${city} has an id ${id}`)
  }
}
// Inline tabular
async function example3() {
  // Load will throw an error on invalid descriptor
  const dataPackage = await DataPackage.load({
    name: "datapackage",
    resources: [
      {
        name: "data",
        path: ["https://raw.githubusercontent.com/frictionlessdata/datapackage-py/master/tests/fixtures/datapackage/data.csv"],
        profile: "tabular-data-resource",
        dialect: {
          quoteChar: "|"
        },
        schema: {
          fields: [
            {
              name: "id",
              type: "integer"
            },
            {
              name: "city",
              type: "string"
            }
          ]
        }
      }
    ]
  })
  const table = await dataPackage.resources[0].table
  // Read will throw an error if the data does not comply with the schema
  const data = await table.read(true)
  for (const {id, city} of data) {
    console.log(`City ${city} has an id ${id}`)
  }
}
// Create from scratch and update datapackage
async function example4() {
  // In non-strict mode an invalid descriptor can be provided
  const dataPackage = await DataPackage.load({resources: []}, {strict: false})
  dataPackage.descriptor.name = 'datapackage'
  dataPackage.descriptor.description = 'Good data package'
  dataPackage.update()
  dataPackage.addResource({
    name: 'cities',
    profile: 'tabular-data-resource',
    path: ["https://raw.githubusercontent.com/frictionlessdata/datapackage-py/master/tests/fixtures/datapackage/data.csv"],
    schema: {
      fields: [
        {
          name: "id",
          type: "integer"
        },
        {
          name: "city",
          type: "string"
        }
      ]
    }
  })
  // Check for errors in case the updated descriptor is not valid
  if (!dataPackage.valid) {
    for (const error of dataPackage.errors) {
      console.log(error)
    }
  }
  const table = await dataPackage.resources[0].table
  // Read will throw an error if the data does not comply with the schema
  const data = await table.read(true)
  for (const {id, city} of data) {
    console.log(`City ${city} has an id ${id}`)
  }
}
// Non-tabular datapackage
// TODO: it doesn't work because of pending path/data change in spec
async function example5() {
  // Load will throw an error on invalid descriptor
  const dataPackage = await DataPackage.load({
    name: 'geojson',
    resources: [
      {
        name: 'point',
        data: {
          "type": "Feature",
          "geometry": {
            "type": "Point",
            "coordinates": [125.6, 10.1]
          },
          "properties": {
            "name": "Dinagat Islands"
          }
        }
      }
    ]
  })
  console.log(dataPackage.resources[0].source.type) // Feature
}
example1()
example2()
example3()
example4()
example5()
@pwalsh yes and I read it before I wrote this issue 😉 - this issue reflects the things I think are missing from that implementors guide. I could be mistaken on this so please comment against the items in the description 😄 (and add items).
Acceptance criteria
Tasks
fmt: support #436
Analysis
Existing Work on a Guide
Stack reference - https://github.com/frictionlessdata/stack#frictionless-data-stack
For implementors - http://specs.frictionlessdata.io/implementation/
Neither of these provides simple examples.
Backwards Compatibility
There is an implementer note about resource.url support in the Table Resource spec. I suppose we could do the same for Table Schema date/time/datetime and the fmt: prefix, since temporal formats are the most common source of mistakes I have seen. Implementations should preferably support the previous version too.
FAQs