provide function to accept data in a flat relational csv format #149

Closed
EE2dev opened this issue Sep 4, 2019 · 4 comments

Comments

@EE2dev
EE2dev commented Sep 4, 2019

With the stratify function, the data can be provided as CSV in a parent,child format, one node per row. Often, however, the data resides in a flat "relational" format, e.g.:

all,continent,country,population
World,Asia,,4436
World,Asia,China,1420
World,Asia,India,1369
World,Africa,,1216
World,Europe,,739
World,North America,,579
World,North America,USA,329
World,South America,,423
World,Oceania,,38

Since it's not trivial to convert this data into the supported CSV format, it would be great to provide a function that converts data in this representation into a hierarchy.

@mbostock
Member

mbostock commented Sep 4, 2019

You can do this using d3.rollups:

rollups = d3.rollups(data, v => d3.sum(v, d => d.population), d => d.continent, d => d.country)
root = d3.hierarchy([null, rollups], ([, value]) => value)
    .sum(([, value]) => value)
    .sort((a, b) => b.value - a.value)

Live example here: https://observablehq.com/@mbostock/2019-h-1b-employers

Related #140.
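
To make the shape of the rollups result easier to see without loading d3, here is a minimal plain-JavaScript sketch of the same grouping. `nestedRollup` is a hypothetical stand-in written for illustration, not d3's implementation; it only mirrors the nested [key, value] pairs that d3.rollups returns. The rows are a subset of the sample data from the issue, with population assumed to be already parsed to numbers.

```javascript
// Hypothetical stand-in for d3.rollups (illustration only, not d3's code).
// Groups rows by each key function in turn and reduces the innermost groups.
function nestedRollup(rows, reduce, ...keys) {
  if (keys.length === 0) return reduce(rows);
  const [key, ...rest] = keys;
  const groups = new Map();
  for (const row of rows) {
    const k = key(row);
    if (!groups.has(k)) groups.set(k, []);
    groups.get(k).push(row);
  }
  // Emit [key, value] pairs, where value is either a nested array or the reduced value.
  return Array.from(groups, ([k, group]) => [k, nestedRollup(group, reduce, ...rest)]);
}

// Subset of the issue's sample rows; population assumed already numeric.
const data = [
  { continent: "Asia", country: "", population: 4436 },
  { continent: "Asia", country: "China", population: 1420 },
  { continent: "Asia", country: "India", population: 1369 },
  { continent: "Africa", country: "", population: 1216 },
];

const rollups = nestedRollup(
  data,
  (v) => v.reduce((sum, d) => sum + d.population, 0),
  (d) => d.continent,
  (d) => d.country
);
```

In the issue's sample, rows with an empty country appear to hold continent totals, so a later leaf sum would count Asia's 4436 alongside China and India. Filtering those rows out first is one way around that, assuming the country rows are the values you want summed.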

@mbostock
Member

mbostock commented Sep 4, 2019

Here’s a quick example with your data:

https://observablehq.com/d/83cabf4257e8b8cc

@mbostock mbostock closed this as completed Sep 4, 2019
@EE2dev
Author

EE2dev commented Sep 7, 2019

Thanks for the quick answer and the great examples. But just to clarify - I haven't been clear in the description of the issue:

My suggestion was to offer a function that can load this kind of data (wide vs. narrow format).

If you have the hierarchy (all -> continent -> country) represented in this format:

all,continent,country,population,likeit
World,Asia,China,1420,yes
World,Asia,India,1369,yes

It is not straightforward to load it into d3.hierarchy.
This format has the following properties:
a) Each row reflects not just one node but all the nodes from root to leaf.
b) Each node can have not just one value but several data attributes attached to it (population and likeit in this case).
c) The individual values in that format might not be unique, so a separate key has to be created, with one of the values serving as the displayed name of the node.
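
As a rough illustration of properties a) to c), the following sketch walks each wide row from root to leaf and builds a nested object directly. `wideToTree` and the `name`/`children`/`data` field names are hypothetical choices made for this example, not part of d3. Because children are looked up only within their parent, equal names under different parents stay separate nodes.

```javascript
// Hypothetical helper (not a d3 function): builds a nested tree from wide rows.
// `levels` lists the hierarchy columns from root to leaf.
function wideToTree(rows, levels) {
  const root = { name: "root", children: [] };
  for (const row of rows) {
    let node = root;
    for (const level of levels) {
      const name = row[level];
      if (name === "" || name == null) break; // row stops above the leaf level
      // Look up the child only under the current parent (property c).
      let child = node.children.find((c) => c.name === name);
      if (!child) {
        child = { name, children: [] };
        node.children.push(child);
      }
      node = child;
    }
    // Attach the whole row to the deepest node it describes (property b).
    if (node !== root) node.data = row;
  }
  return root;
}

const rows = [
  { all: "World", continent: "Asia", country: "China", population: 1420, likeit: "yes" },
  { all: "World", continent: "Asia", country: "India", population: 1369, likeit: "yes" },
];
const tree = wideToTree(rows, ["all", "continent", "country"]);
```

The resulting object uses a `children` array on every node, so it should be usable as input to d3.hierarchy(tree) with the default children accessor.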

So if one has this kind of representation, one option is to convert it to the long form (ignoring possible property c here) to obtain:

parent,key,all,continent,country,population,likeit
,root,,,,,
root,World,,,,,
World,Asia,,,,,
Asia,China,World,Asia,China,1420,yes
Asia,India,World,Asia,India,1369,yes

Then you can apply d3.stratify() and are ready to go.
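
The wide-to-long conversion described above could be sketched roughly as follows. `wideToLong` is a hypothetical helper, and building each id from the path of names joined with "/" is an assumption made here to deal with property c); it requires that no name contains the separator. Unlike the table above, this sketch adds no extra "root" row: the first level (World) gets an empty parentId and acts as the root.

```javascript
// Hypothetical helper (not a d3 function): converts wide rows into long rows.
// Each output row has a path-based id, a parentId, a display name and,
// for the deepest node of each input row, the original row as `data`.
function wideToLong(rows, levels, sep = "/") {
  const byId = new Map();
  for (const row of rows) {
    let parentId = "";
    for (const level of levels) {
      const name = row[level];
      if (name === "" || name == null) break; // row stops above the leaf level
      const id = parentId === "" ? name : parentId + sep + name;
      if (!byId.has(id)) byId.set(id, { id, parentId, name });
      parentId = id;
    }
    // parentId now holds the id of the deepest node this row describes.
    if (parentId !== "") byId.get(parentId).data = row;
  }
  return Array.from(byId.values());
}

const wide = [
  { all: "World", continent: "Asia", country: "China", population: 1420, likeit: "yes" },
  { all: "World", continent: "Asia", country: "India", population: 1369, likeit: "yes" },
];
const long = wideToLong(wide, ["all", "continent", "country"]);
```

The `id` and `parentId` field names match the default accessors of d3.stratify, so something like d3.stratify()(long) should accept this output, given that exactly one row ends up with an empty parentId.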

@nvelden

nvelden commented Oct 17, 2022

I have been struggling with this as well since the change from .nest() to .group(). See my question on Stack Overflow:
https://stackoverflow.com/questions/74098888/formatting-flat-json-for-tree-graph-in-d3-v7
