GoQU: a Generator of QUestion: approaching question generation using Deep Learning

This repository contains the final project developed for the Natural Language Processing course of the Master's degree in Artificial Intelligence at the University of Bologna.

Data

The dataset we used to train, develop and test our Question Generation (QG) network is the Stanford Question Answering Dataset (SQuAD) version 1.1, a collection of question-answer pairs derived from Wikipedia articles. The dataset was preprocessed to better accommodate the needs of our implementation.
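As an illustration of the kind of preprocessing involved, the sketch below flattens the nested SQuAD v1.1 JSON layout (articles, paragraphs, question-answer pairs) into one flat record per question. It is a minimal sketch and not the code in data_generator.py: the function name `flatten_squad`, the record keys and the tiny inline `SAMPLE` are hypothetical, and only the nested field names follow the public v1.1 schema.

```python
from typing import Dict, List

CONTEXT = "Bologna is a city in Italy."

# Hypothetical one-question sample that mimics the SQuAD v1.1 layout.
SAMPLE = {
    "version": "1.1",
    "data": [
        {
            "title": "Bologna",
            "paragraphs": [
                {
                    "context": CONTEXT,
                    "qas": [
                        {
                            "id": "sample-0",
                            "question": "Where is Bologna?",
                            "answers": [
                                {
                                    "text": "Italy",
                                    # Character offset of the answer in the context.
                                    "answer_start": CONTEXT.index("Italy"),
                                }
                            ],
                        }
                    ],
                }
            ],
        }
    ],
}


def flatten_squad(squad: Dict) -> List[Dict]:
    """Return one record per question, keeping only the first listed answer."""
    records = []
    for article in squad["data"]:
        for paragraph in article["paragraphs"]:
            for qa in paragraph["qas"]:
                # A question may list several reference answers; this sketch
                # assumes the first one is enough for question generation.
                answer = qa["answers"][0]
                records.append(
                    {
                        "id": qa["id"],
                        "title": article["title"],
                        "context": paragraph["context"],
                        "question": qa["question"],
                        "answer": answer["text"],
                        "answer_start": answer["answer_start"],
                    }
                )
    return records


records = flatten_squad(SAMPLE)
```

With the real dataset, the same function could be applied to the dictionary returned by `json.load` on the downloaded training or development file.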

Project Details

This project tackles the Question Generation task using the ideas introduced in the paper by Du et al. [1]. It does so by implementing our revisited version of the model proposed in 2017, exploiting newer technologies and the TensorFlow framework provided by Google. The purpose of this project is purely educational, and we do not claim any credit for the great work done by Du et al.

Model Architecture

Folder structure (WIP)

├── goqu.py
│
├── GoQUreport.pdf
│
├── requirements.txt
│
├──  configs
│   └── config.py   - this file contains the configurations for the project.
│
├──  data   - this folder contains the data for the training and some additional files useful for further operations.
│
├── models
│   ├── eval
│   │   ├── eval_metrics.py     - this file contains the metrics used for the evaluation.
│   │   └── evaluator.py        - this file contains the class used for evaluation.
│   ├── layers
│   │   ├── decoder.py          - this file contains the decoder layer.
│   │   ├── encoder.py          - this file contains the encoder layer.
│   │   └── masking.py          - this file contains the custom masking layer.
│   ├── trainers
│   │   ├── keras_tuner.py      - this file contains the code for the automatic tuning.
│   │   ├── trainer.py          - this file contains the class used for training.
│   │   └── metrics.py          - this file contains the metrics used for evaluating training.
│   ├── weights             - this folder contains the pre-trained weights from Colab.
│   ├── loss.py             - this file contains the loss used by the model.
│   └── callbacks.py        - this file contains the classes used as callbacks.
│
├── data_loader
│   └── data_generator.py   - this file contains the dataset methods for loading and processing it.
│
└── utils   - this folder contains utility methods useful for complementary operations
     ├── dirs.py
     ├── embeddings.py
     └── utils.py

Technologies and Frameworks

Frameworks:

Platforms

Google Colaboratory

Configurations and environments

The config.py file contains all the configurations needed by the project. The environment can be created with conda by running the command:

$ conda create --name <env> --file requirements.txt

Versioning

We used Git for versioning.

Future Work

Possible improvements to this project include:

encoding additional information in the embedding dimension, i.e. concatenating each word vector with its NER and POS tags to augment the information given to the network,
adopting a more sophisticated decoding strategy in the final step, such as beam search decoding, instead of temperature sampling,
using contextual word embeddings,
using a different, possibly more sophisticated, model.
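To make the decoding improvement mentioned above more concrete, the sketch below contrasts temperature sampling with beam search on a toy next-token model. It is a minimal, framework-free sketch rather than the project's decoder: `step_fn`, `toy_step`, the three-token vocabulary and its probabilities are all assumptions made up for illustration, and a real decoder would obtain the logits from the trained network. No length normalization is applied, which is a simplification.

```python
import math
import random
from typing import Callable, List, Optional, Sequence, Tuple

# Hypothetical stand-in for the decoder: token prefix -> one logit per vocabulary item.
StepFn = Callable[[Sequence[int]], Sequence[float]]


def log_softmax(logits: Sequence[float], temperature: float = 1.0) -> List[float]:
    """Numerically stable log-softmax of logits divided by the temperature."""
    scaled = [x / temperature for x in logits]
    m = max(scaled)
    log_z = m + math.log(sum(math.exp(x - m) for x in scaled))
    return [x - log_z for x in scaled]


def temperature_sample(step_fn: StepFn, eos: int, max_len: int,
                       temperature: float = 1.0,
                       rng: Optional[random.Random] = None) -> List[int]:
    """Draw one token at a time from the temperature-scaled distribution."""
    rng = rng or random.Random()
    tokens: List[int] = []
    for _ in range(max_len):
        probs = [math.exp(lp) for lp in log_softmax(step_fn(tokens), temperature)]
        token = rng.choices(range(len(probs)), weights=probs, k=1)[0]
        tokens.append(token)
        if token == eos:
            break
    return tokens


def beam_search(step_fn: StepFn, eos: int, max_len: int,
                beam_width: int = 3) -> Tuple[float, List[int]]:
    """Keep the beam_width best prefixes by cumulative log-probability."""
    beams: List[Tuple[float, List[int]]] = [(0.0, [])]
    finished: List[Tuple[float, List[int]]] = []
    for _ in range(max_len):
        candidates = []
        for score, tokens in beams:
            for token, lp in enumerate(log_softmax(step_fn(tokens))):
                candidates.append((score + lp, tokens + [token]))
        candidates.sort(key=lambda c: c[0], reverse=True)
        beams = []
        for score, tokens in candidates[:beam_width]:
            # Hypotheses that emitted EOS are set aside and no longer expanded.
            (finished if tokens[-1] == eos else beams).append((score, tokens))
        if not beams:
            break
    finished.extend(beams)  # include hypotheses cut off at max_len
    return max(finished, key=lambda c: c[0])


def toy_step(tokens: Sequence[int]) -> List[float]:
    """Made-up model over the vocabulary {0: EOS, 1, 2}; returns log-probabilities."""
    if not tokens:
        probs = [0.01, 0.59, 0.40]
    elif tokens[-1] == 2:
        probs = [0.98, 0.01, 0.01]
    else:
        probs = [0.30, 0.40, 0.30]
    return [math.log(p) for p in probs]


greedy_score, greedy_tokens = beam_search(toy_step, eos=0, max_len=5, beam_width=1)
beam_score, beam_tokens = beam_search(toy_step, eos=0, max_len=5, beam_width=2)
sampled_tokens = temperature_sample(toy_step, eos=0, max_len=5,
                                    temperature=0.7, rng=random.Random(0))
```

In this toy setting a beam width of 1 behaves like greedy decoding and commits to the locally most likely first token, while a width of 2 keeps the alternative alive long enough to find a complete sequence with higher overall probability. Temperature sampling, by contrast, returns a different sequence for each random seed.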

Bibliography

1. Learning to Ask: Neural Question Generation for Reading Comprehension (Du et al., ACL 2017)

License

This project is licensed under the Apache License 2.0 - see the LICENSE file for details.
