🪐 spaCy Project: Dependency Parsing (Penn Treebank)

📋 project.yml

The project.yml defines the data assets required by the project, as well as the available commands and workflows. For details, see the spaCy projects documentation.

⏯ Commands

The following commands are defined by the project. They can be executed using spacy project run [name]. Commands are only re-run if their inputs have changed.

| Command | Description |
| --- | --- |
| install | Install dependencies |
| corpus | Convert the data to spaCy's format |
| vectors | Convert, truncate and prune the vectors |
| train | Train the full pipeline |
| evaluate | Evaluate on the test data and save the metrics |
| clean | Remove intermediate files |
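
As a rough illustration of invoking these commands from a script, the sketch below builds the argument list for `spacy project run [name]`. The helper name `build_run_command` and the `KNOWN_NAMES` list are hypothetical and exist only for this example; the names themselves are copied from this README.

```python
import sys

# Command and workflow names copied from this README (assumed to be complete).
KNOWN_NAMES = ["install", "corpus", "vectors", "train", "evaluate", "clean", "all"]


def build_run_command(name):
    """Return the argv list for `python -m spacy project run <name>`.

    Hypothetical helper: it only builds the argument list and runs nothing.
    """
    if name not in KNOWN_NAMES:
        raise ValueError(f"Unknown command or workflow: {name!r}")
    # Using sys.executable keeps the call inside the current Python environment.
    return [sys.executable, "-m", "spacy", "project", "run", name]
```

The resulting list can be passed to `subprocess.run(..., check=True)` from the project directory, which should behave like typing the command in a shell.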

⏭ Workflows

The following workflows are defined by the project. They can be executed using spacy project run [name] and will run the specified commands in order. Commands are only re-run if their inputs have changed.

| Workflow | Steps |
| --- | --- |
| all | install → vectors → corpus → train → evaluate |
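
The idea of running steps in order and skipping those whose inputs are unchanged can be sketched as follows. This is a simplified illustration of the general technique (comparing content hashes of input files between runs), not spaCy's actual implementation; `file_digest`, `run_workflow`, and the in-memory `state` dictionary are all hypothetical names chosen for the example.

```python
import hashlib
from pathlib import Path


def file_digest(path):
    """Return a hex digest of the file's contents."""
    return hashlib.md5(Path(path).read_bytes()).hexdigest()


def run_workflow(steps, state):
    """Run steps in order, skipping any whose inputs are unchanged.

    steps: list of (name, input_paths, action) tuples, where action is a
           zero-argument callable.
    state: dict mapping step name to {path: digest} from the previous run;
           it is updated in place.
    Returns the names of the steps that were actually executed.
    """
    executed = []
    for name, inputs, action in steps:
        current = {str(p): file_digest(p) for p in inputs}
        if state.get(name) == current:
            continue  # inputs match the last recorded run, so skip
        action()
        state[name] = current
        executed.append(name)
    return executed
```

In this sketch a step runs the first time it is seen and again whenever any of its input files changes. A real runner would also persist `state` to disk between invocations and would likely track outputs as well as inputs.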

🗂 Assets

The following assets are defined by the project. They can be fetched by running spacy project assets in the project directory.

| File | Source | Description |
| --- | --- | --- |
| assets/PTB_SD_3_3_0/train.gold.conll | Local | Training data (not publicly available, so you have to add the file yourself) |
| assets/PTB_SD_3_3_0/dev.gold.conll | Local | Development data (not publicly available, so you have to add the file yourself) |
| assets/PTB_SD_3_3_0/test.gold.conll | Local | Test data (not publicly available, so you have to add the file yourself) |
| assets/vectors.zip | URL | GloVe vectors |
| assets/orth_variants.json | URL | A file containing orth variants for data augmentation |
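
Because the three Penn Treebank files have to be added by hand, it may help to check that they are in place before running the workflow. The following is a minimal sketch under that assumption; the helper `missing_local_assets` is hypothetical, and the paths are copied from the assets listed above.

```python
from pathlib import Path

# Local (non-downloadable) assets, as listed in this README.
LOCAL_ASSETS = [
    "assets/PTB_SD_3_3_0/train.gold.conll",
    "assets/PTB_SD_3_3_0/dev.gold.conll",
    "assets/PTB_SD_3_3_0/test.gold.conll",
]


def missing_local_assets(project_dir="."):
    """Return the relative paths of local assets not found under project_dir."""
    root = Path(project_dir)
    return [rel for rel in LOCAL_ASSETS if not (root / rel).is_file()]
```

An empty list means all three files are present; otherwise the returned paths show which files still need to be copied into the project directory.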