Provides basic functions to read a ContentMine CProject and its CTrees into Python data structures.
Its main use is to read in all results.xml files created by ami and to relate them to papers and their metadata.
Visualization and network analysis in factnet.py are no longer supported or maintained in this package, as they are not part of the core functionality, and will be removed.
- You can install it from PyPI with
pip install pycproject
- Another option is to install it into a virtualenv:
source activate YOURVIRTUALENV
pip install pycproject
- Alternatively, download and unzip the source from GitHub, change into the unzipped folder, then run
python setup.py build
python setup.py install
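Whichever route you choose, a quick way to confirm that the package ended up on the import path is a check like the one below. This is only a sketch using the standard library; `is_installed` is a hypothetical helper name, and `pycproject` is assumed to be the import name because that is what the usage examples in this README import from.

```python
import importlib.util


def is_installed(package_name):
    """Return True if a top-level package can be found on the import path."""
    return importlib.util.find_spec(package_name) is not None


if __name__ == "__main__":
    # "pycproject" is the import name used in this README's examples.
    print("pycproject importable:", is_installed("pycproject"))
```

If this reports False inside a virtualenv, the most likely cause is that pip installed into a different environment than the one running the check.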
If your CProject is in PATH/TO/CPROJECT/CPROJECTNAME, call the script with
python3 pycproject/convert2elasticdump.py --raw PATH/TO/CPROJECT --name CPROJECTNAME --output PATH/TO/OUTPUTFOLDER
You can then read the generated ContentMine project with
from pycproject.readctree import CProject
MYPROJECT = CProject("path_to_cproject", "cproject_name")
You can work with a pandas DataFrame after creating it with
df = MYPROJECT.get_dataframe()
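To make concrete what kind of data is being collected, here is a minimal standard-library sketch of gathering results.xml entries from a project folder. It does not use pycproject and is not its implementation. It assumes the usual ContentMine layout CPROJECT/CTREE/results/PLUGIN/TYPE/results.xml and that each file holds `result` elements with attributes such as `pre`, `exact` and `post`. The helper names `collect_results` and `build_demo_project` are hypothetical, and the demo project is made up so that the example runs without real data.

```python
import tempfile
import xml.etree.ElementTree as ET
from pathlib import Path


def collect_results(cproject_dir):
    """Return one dict per <result> element found below a CProject folder."""
    rows = []
    # Assumed layout: CPROJECT/CTREE/results/PLUGIN/TYPE/results.xml
    for xml_path in sorted(Path(cproject_dir).glob("*/results/*/*/results.xml")):
        parts = xml_path.relative_to(cproject_dir).parts
        ctree, plugin, result_type = parts[0], parts[2], parts[3]
        for result in ET.parse(str(xml_path)).getroot().iter("result"):
            row = {"ctree": ctree, "plugin": plugin, "type": result_type}
            row.update(result.attrib)  # assumed attributes: pre / exact / post
            rows.append(row)
    return rows


def build_demo_project(root):
    """Write a tiny, made-up CProject so the sketch runs without real data."""
    xml_file = (
        Path(root) / "PMC0000001" / "results" / "species" / "binomial" / "results.xml"
    )
    xml_file.parent.mkdir(parents=True)
    xml_file.write_text(
        '<results title="binomial">'
        '<result pre="infection with " exact="Aedes aegypti" post=" mosquitoes"/>'
        "</results>",
        encoding="utf-8",
    )


if __name__ == "__main__":
    with tempfile.TemporaryDirectory() as tmp:
        build_demo_project(tmp)
        for row in collect_results(tmp):
            print(row["ctree"], row["plugin"], row["type"], row["exact"])
```

Each row ties a fact back to the CTree (paper) it came from, which is the relation the package is meant to provide. A list of dicts in this shape can also be passed directly to `pandas.DataFrame` if you want to explore it in tabular form.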