- About: arXiv dataset and metadata of 1.7M+ scholarly papers across STEM
- Dataset URL: https://www.kaggle.com/Cornell-University/arxiv
- Task: Build NLP applications on top of the arXiv dataset, including but not limited to Question Answering, Knowledge Graph, Visualization, Recommendation, and Survey systems.
- 1: SWI-Prolog 8.2.1 reference manual
- URL: https://www.swi-prolog.org/download/stable/doc/SWI-Prolog-8.2.1.pdf
- 2: SWI-Prolog Semantic Web Library 3.0
- URL: https://www.swi-prolog.org/pldoc/doc_for?object=section(%27packages/semweb.html%27)
- 3: Build an interesting application with the SWI-Prolog Semantic Web Library.
- Task: Translate one of the reference manuals (1 or 2) and give examples for the important predicates, or do option 3.
- 1: Natural Language Processing for Prolog Programmers (Michael A. Covington)
- URL: http://www.covingtoninnovations.com/books/NLPPP.pdf
- 2: Prolog and Natural-Language Analysis (Fernando Pereira, et al.)
- URL: http://www.mtome.com/Publications/PNLA/prolog-digital.pdf
- Task: Translate one of the books and extract source code files for the book examples.
- About: Watson is a famous IBM NLP project for open-domain Question Answering.
- Reference URL: http://brenocon.com/watson_special_issue/
- Task: Build a miniWatson for a specific domain: Computer Science, Mathematics, ..., or your preferred domain.
- Task: Build an AI Writer that can pass the Turing test.
- Task tips: Collect as many computer science research papers as possible and extract the different sections (abstract, introduction, model, experiment, discussion, conclusion, etc.) from them. Construct a large sentence bank from the extracted section contents. Build n-gram models over sentences (instead of words). Build sentence similarity measures to help find similar sentences. Given an input sentence and a section tag (for example: 'abstract' or 'introduction'), generate the whole section by following it with similar sentences. The generated sections should pass the Turing test.
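The sentence-bank idea in the tips above can be sketched in a few lines of Python. This is only a minimal illustration under stated assumptions, not a reference solution: the function names (`tokenize`, `cosine_similarity`, `build_successors`, `generate_section`) are hypothetical, the similarity measure is plain bag-of-words cosine, and the "n-gram model over sentences" is reduced to a bigram table that maps each sentence to the sentences that followed it in some paper. One such table would be built per section tag.

```python
import math
import re
from collections import Counter, defaultdict


def tokenize(sentence):
    """Lowercase a sentence and split it into alphanumeric word tokens."""
    return re.findall(r"[a-z0-9]+", sentence.lower())


def cosine_similarity(a, b):
    """Cosine similarity between the bag-of-words vectors of two sentences."""
    va, vb = Counter(tokenize(a)), Counter(tokenize(b))
    dot = sum(va[w] * vb[w] for w in va.keys() & vb.keys())
    norm_a = math.sqrt(sum(c * c for c in va.values()))
    norm_b = math.sqrt(sum(c * c for c in vb.values()))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


def build_successors(sections):
    """Sentence-level bigram table: sentence -> list of sentences that followed it.

    `sections` is a list of sections, each a list of sentences in order
    (assumed to be already extracted and tagged, e.g. all abstracts).
    """
    successors = defaultdict(list)
    for sentences in sections:
        for current, following in zip(sentences, sentences[1:]):
            successors[current].append(following)
    return successors


def generate_section(seed, successors, max_sentences=5):
    """Greedy generation: find the bank sentence most similar to the current
    sentence, emit the sentence that followed it, and repeat."""
    output = [seed]
    current = seed
    used = set()  # avoid reusing the same bank sentence twice
    for _ in range(max_sentences - 1):
        candidates = [s for s in successors if s not in used]
        if not candidates:
            break
        nearest = max(candidates, key=lambda s: cosine_similarity(current, s))
        used.add(nearest)
        current = successors[nearest][0]
        output.append(current)
    return output


# Toy sentence bank for the 'abstract' tag (made-up sentences).
abstracts = [[
    "We propose a new parsing model.",
    "The model uses neural networks.",
    "Experiments show strong results.",
]]
table = build_successors(abstracts)
print(" ".join(generate_section("We present a new parsing model.", table, 3)))
```

A realistic system would likely replace the bag-of-words similarity with a stronger sentence representation and sample among several successors instead of always taking the first one.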
- Task: Build a knowledge graph from given research papers.
- Task tips: Collect as many computer science research papers as possible and extract the different sections (abstract, introduction, model, experiment, discussion, conclusion, etc.) from them. Identify the important concepts, entities, and relations in the extracted section contents. Build knowledge graphs from the entities and relations. Build a visualization system for the KG to give interactive demonstrations.
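As a rough illustration of the entity/relation step above, the sketch below extracts (head, relation, tail) triples with two hand-written surface patterns and stores them as an adjacency list. Everything here is an assumption made for the example: the patterns, the relation names `is_a` and `uses`, and the helper names are hypothetical, and a real system would use a trained extractor instead of regular expressions. The `to_dot` helper emits Graphviz DOT text as one possible input to a visualization tool.

```python
import re
from collections import defaultdict

# Hypothetical surface patterns: (compiled regex, relation name).
PATTERNS = [
    (re.compile(r"^(?P<head>.+?) is an? (?P<tail>.+?)\.?$", re.I), "is_a"),
    (re.compile(r"^(?P<head>.+?) uses (?P<tail>.+?)\.?$", re.I), "uses"),
]


def normalize(text):
    """Lowercase an entity mention and drop a leading article."""
    return re.sub(r"^(?:a|an|the)\s+", "", text.strip().lower())


def extract_triples(sentences):
    """Return (head, relation, tail) for each sentence matching a pattern."""
    triples = []
    for sentence in sentences:
        for pattern, relation in PATTERNS:
            match = pattern.match(sentence.strip())
            if match:
                triples.append(
                    (normalize(match["head"]), relation, normalize(match["tail"]))
                )
                break  # first matching pattern wins
    return triples


def build_graph(triples):
    """Adjacency list: head entity -> list of (relation, tail entity)."""
    graph = defaultdict(list)
    for head, relation, tail in triples:
        graph[head].append((relation, tail))
    return graph


def to_dot(triples):
    """Serialize the triples as a Graphviz DOT digraph."""
    lines = ["digraph KG {"]
    for head, relation, tail in triples:
        lines.append(f'  "{head}" -> "{tail}" [label="{relation}"];')
    lines.append("}")
    return "\n".join(lines)


# Made-up example sentences.
sentences = [
    "BERT is a language model.",
    "BERT uses self-attention.",
    "This sentence has no relation",
]
triples = extract_triples(sentences)
print(to_dot(triples))
```

Normalizing mentions before inserting them keeps "BERT" and "bert" on the same node, which matters once triples from many papers are merged.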
- Task: Generate text knowledge from visual content (images or video).
- Task tips: Collect as much visual content (images or videos) as possible. Build a generative model (a "human eye") from the visual content. The generator should pass the Turing test.
- Task: Generate thought paths for any concept pair.
- Task tips: Simulate the functions of the human brain. Given any pair of concepts, generate a sequence of thoughts that connects them. The thought path should be reasonable and should form a story that can pass the Turing test.
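One simple baseline for connecting a concept pair, offered only as a sketch, is a breadth-first search over a concept association graph: the shortest chain of associations between the two concepts serves as the skeleton of the thought path. The graph below is made-up toy data and `thought_path` is a hypothetical helper name; turning the chain into a readable story is left open.

```python
from collections import deque


def thought_path(graph, start, goal):
    """Shortest chain of associations from `start` to `goal`, or None.

    `graph` maps each concept to the concepts it is associated with.
    """
    if start == goal:
        return [start]
    parents = {start: None}  # also serves as the visited set
    queue = deque([start])
    while queue:
        node = queue.popleft()
        for neighbor in graph.get(node, ()):
            if neighbor in parents:
                continue
            parents[neighbor] = node
            if neighbor == goal:
                # Walk back through the parent links to rebuild the path.
                path = [goal]
                while parents[path[-1]] is not None:
                    path.append(parents[path[-1]])
                return path[::-1]
            queue.append(neighbor)
    return None


# Toy association graph (made-up data).
concepts = {
    "coffee": ["caffeine", "morning"],
    "caffeine": ["alertness"],
    "morning": ["sunrise"],
    "alertness": ["programming"],
    "programming": [],
}
print(" -> ".join(thought_path(concepts, "coffee", "programming")))
```

Breadth-first search returns a path with the fewest hops, which is not necessarily the most plausible story; weighting edges by association strength would be a natural next step.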