Add preprocessor for Langmuir collection CSV files #667

Merged 1 commit on Oct 9, 2019

@mark-dce mark-dce commented Oct 9, 2019

The preprocessor accepts existing Langmuir CSVs, which have one file per row,
and returns a new CSV in which each work and each fileset has its own row.

  • All of the rows for a single work are grouped together
  • Each group begins with a row containing the work-level metadata
  • All files associated with a fileset are listed in the same row
  • Filesets are listed in sequence order and given appropriate labels
  • Blank lines in the source CSV are ignored
  • The preprocessor adds a deduplication_key field
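
For readers who want a feel for the regrouping described above, here is a minimal Python sketch. It is not the code merged in this PR. The column names (`work_id`, `title`, `sequence`, `filename`), the `Page N` label format, the `;` separator for files, and using the work identifier as the `deduplication_key` are all assumptions made for illustration.

```python
import csv
import io


def preprocess(source_text):
    """Regroup a one-file-per-row CSV into work rows followed by fileset rows.

    All column names and output conventions here are hypothetical.
    """
    reader = csv.DictReader(io.StringIO(source_text))
    works = {}  # work_id -> {"title": str, "filesets": {sequence: [filenames]}}
    for row in reader:
        # Ignore blank rows (every cell empty or whitespace)
        if not any(isinstance(v, str) and v.strip() for v in row.values()):
            continue
        work = works.setdefault(
            row["work_id"], {"title": row["title"], "filesets": {}}
        )
        # Files sharing a sequence number belong to the same fileset
        work["filesets"].setdefault(int(row["sequence"]), []).append(row["filename"])

    out = io.StringIO()
    writer = csv.writer(out, lineterminator="\n")
    writer.writerow(["deduplication_key", "type", "title", "label", "files"])
    for work_id, work in works.items():
        # Each group starts with the work-level metadata row
        writer.writerow([work_id, "work", work["title"], "", ""])
        # Filesets follow in sequence order, one row each
        for sequence in sorted(work["filesets"]):
            files = ";".join(work["filesets"][sequence])
            writer.writerow([work_id, "fileset", "", f"Page {sequence}", files])
    return out.getvalue()


source = (
    "work_id,title,sequence,filename\n"
    "w1,Album,2,b.tif\n"
    "\n"
    "w2,Letter,1,c.tif\n"
    "w1,Album,1,a.tif\n"
    "w1,Album,1,a.jpg\n"
)
result = preprocess(source)
```

In this sketch the rows for `w1` end up together even though they are interleaved with `w2` in the source, and both files with sequence `1` land in a single fileset row.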

@mark-dce mark-dce force-pushed the langmuir-preprocessor branch from 9db00f0 to 478fcd8 Compare October 9, 2019 08:59
@mark-dce mark-dce force-pushed the langmuir-preprocessor branch from 478fcd8 to 0a3ed7f Compare October 9, 2019 09:03
@little9 little9 self-requested a review October 9, 2019 14:36
@little9 little9 merged commit 51d7b31 into master Oct 9, 2019
@little9 little9 deleted the langmuir-preprocessor branch October 9, 2019 14:37