Previous Work

Nithin Krishna edited this page Aug 6, 2016 · 2 revisions

We baselined the previous work, identified its issues, and improved on it. In that work, we had built the described pipeline and run it on a small subset of the data set (tree-polar-dd). We had established a working content extraction and evaluation pipeline. We built visualizations that explored the spatial and temporal diversity of documents, with the ability to drill down to specific concepts. We were able to find a reasonable correlation between trends from our insights and real-world data.

Issues identified from previous work

  • Poor extraction quality, where quality means relevance to the polar domain.
  • Lack of a mechanism to gauge document relevance with respect to the polar domain.
  • Explored only a small chunk of the dataset.

You can find our demo here.