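- This project follows the approach of Yoon Kim's paper *Convolutional Neural Networks for Sentence Classification* and uses a convolutional neural network (CNN) for sentence-level sentiment analysis.
- Project structure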
  - `data/*`: JSON data
  - `parse_data`: generates the JSON data files
  - `cnn.py`: the CNN model
  - `train.py`: training script
  - `test.py`: test script
- Parameters

      ALLOW_SOFT_PLACEMENT=True
      BATCH_SIZE=50
      CHECKPOINT_EVERY=100
      DEV_DATA_FILE=./data/dev.json
      DROPOUT_KEEP_PROB=0.5
      EMBEDDING_DIM=300
      EVALUATE_EVERY=100
      FILTER_SIZES=3,4,5
      L2_REG_LAMBDA=3.0
      LOG_DEVICE_PLACEMENT=False
      NUM_CHECKPOINTS=5
      NUM_EPOCHS=200
      NUM_FILTERS=100
      TEST_DATA_FILE=./data/test.json
      TRAIN_DATA_FILE=./data/train.json
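How `FILTER_SIZES` and `NUM_FILTERS` combine can be sketched in a few lines. This is a minimal illustration, not the code in `train.py`: the `Config` class and both helper functions are hypothetical names, and the sketch assumes the model concatenates one max-pooled value per filter, as in Kim's architecture.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class Config:
    """Hypothetical container for some of the hyperparameters listed above."""
    batch_size: int = 50
    dropout_keep_prob: float = 0.5
    embedding_dim: int = 300
    filter_sizes: str = "3,4,5"  # comma-separated, like FILTER_SIZES
    l2_reg_lambda: float = 3.0
    num_epochs: int = 200
    num_filters: int = 100


def parse_filter_sizes(spec: str) -> List[int]:
    """Turn a string such as "3,4,5" into [3, 4, 5]."""
    return [int(part) for part in spec.split(",") if part.strip()]


def pooled_feature_width(cfg: Config) -> int:
    """Width of the vector fed to the classifier after max-over-time pooling.

    Assumption: every filter contributes exactly one pooled value, and there
    are num_filters filters per window size.
    """
    return len(parse_filter_sizes(cfg.filter_sizes)) * cfg.num_filters


if __name__ == "__main__":
    cfg = Config()
    print(parse_filter_sizes(cfg.filter_sizes))  # [3, 4, 5]
    print(pooled_feature_width(cfg))             # 300
```

With the default values, three window sizes times 100 filters give a 300-dimensional sentence representation.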
- Model data: https://cloud.tsinghua.edu.cn/d/e3da1c00a9e84a5d9132/
- Train with dropout.

      $ python test.py --checkpoint_dir="./runs/1577169494/checkpoints/"
      Total number of test examples: 2210
      Accuracy: 0.400905
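For readers unfamiliar with the `DROPOUT_KEEP_PROB` setting, the following is a rough pure-Python sketch of inverted dropout. It assumes the model uses the standard formulation (keep each unit with probability `keep_prob` and rescale the survivors) and that evaluation runs with a keep probability of 1.0; the function name `dropout` is illustrative and is not taken from `cnn.py`.

```python
import random
from typing import List, Optional


def dropout(values: List[float], keep_prob: float,
            rng: Optional[random.Random] = None) -> List[float]:
    """Inverted dropout on a list of activations.

    Each value is kept with probability keep_prob and scaled by
    1 / keep_prob, so its expected value is unchanged; dropped values
    become 0.0.
    """
    if not 0.0 < keep_prob <= 1.0:
        raise ValueError("keep_prob must be in (0, 1]")
    if rng is None:
        rng = random.Random()
    return [v / keep_prob if rng.random() < keep_prob else 0.0 for v in values]


if __name__ == "__main__":
    features = [1.0, 2.0, 3.0, 4.0]
    # Training: roughly half of the features are zeroed, the rest doubled.
    print(dropout(features, keep_prob=0.5, rng=random.Random(0)))
    # Evaluation: keep_prob=1.0 leaves the features untouched.
    print(dropout(features, keep_prob=1.0))
```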
- Train with a hidden size of 256 or 512 (set through `NUM_FILTERS`).
  - Change parameter: `NUM_FILTERS = 256`

        $ python test.py --checkpoint_dir="./runs/1577171119/checkpoints/"
        Total number of test examples: 2210
        Accuracy: 0.40905
  - Change parameter: `NUM_FILTERS = 512`

        $ python test.py --checkpoint_dir="./runs/1577172661/checkpoints/"
        Total number of test examples: 2210
        Accuracy: 0.39819
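To put the `NUM_FILTERS` runs in perspective, it can help to count how many convolution parameters each setting implies. The sketch below is an estimate under stated assumptions: each filter spans its window size times the full embedding dimension and carries one bias, as in Kim's architecture, and `conv_parameter_count` is a hypothetical helper rather than part of this repository.

```python
from typing import Sequence


def conv_parameter_count(filter_sizes: Sequence[int], embedding_dim: int,
                         num_filters: int) -> int:
    """Weights plus biases of the convolution layer.

    Assumption: a filter of window `size` has size * embedding_dim weights
    and a single bias, and there are num_filters filters per window size.
    """
    total = 0
    for size in filter_sizes:
        weights = size * embedding_dim * num_filters
        biases = num_filters
        total += weights + biases
    return total


if __name__ == "__main__":
    for num_filters in (100, 256, 512):
        count = conv_parameter_count([3, 4, 5], 300, num_filters)
        print(num_filters, count)
    # 100 360300
    # 256 922368
    # 512 1844736
```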
- Train with a different number of hidden layers. (The number of hidden layers should be set to 1 and 3.)
  - Change parameter: `FILTER_SIZES = 3`

        $ python test.py --checkpoint_dir="./runs/1577174708/checkpoints/"
        Total number of test examples: 2210
        Accuracy: 0.384163
  - Change parameter: `FILTER_SIZES = 3,4`

        $ python test.py --checkpoint_dir="./runs/1577175634/checkpoints/"
        Total number of test examples: 2210
        Accuracy: 0.39819
  - Change parameter: `FILTER_SIZES = 3,4,5`

        $ python test.py --checkpoint_dir="./runs/1577169494/checkpoints/"
        Total number of test examples: 2210
        Accuracy: 0.400905
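What each entry in `FILTER_SIZES` contributes can be illustrated with a toy convolution followed by max-over-time pooling. This is a pure-Python sketch, not the code in `cnn.py`: the function names are hypothetical, the embeddings are 2-dimensional toy values, and the ReLU activation is an assumption.

```python
from typing import List

Vector = List[float]


def conv_max_pool(sentence: List[Vector], kernel: List[Vector],
                  bias: float = 0.0) -> float:
    """Slide one filter over the sentence, apply ReLU, keep the maximum.

    `sentence` holds one embedding per word; `kernel` holds one weight
    vector per word in the window, so len(kernel) is the filter size.
    """
    window = len(kernel)
    if len(sentence) < window:
        raise ValueError("sentence is shorter than the filter window")
    activations = []
    for start in range(len(sentence) - window + 1):
        total = bias
        for offset in range(window):
            word = sentence[start + offset]
            total += sum(w * x for w, x in zip(kernel[offset], word))
        activations.append(max(0.0, total))  # ReLU (assumed activation)
    return max(activations)                  # max-over-time pooling


def sentence_features(sentence: List[Vector],
                      kernels: List[List[Vector]]) -> List[float]:
    """Concatenate the pooled output of every filter into one feature list."""
    return [conv_max_pool(sentence, kernel) for kernel in kernels]


if __name__ == "__main__":
    sentence = [[1.0, 0.0], [0.0, 1.0], [1.0, 1.0], [0.0, 0.0]]
    size_2_filter = [[1.0, 0.0], [0.0, 1.0]]
    size_3_filter = [[1.0, 1.0], [1.0, 1.0], [1.0, 1.0]]
    print(sentence_features(sentence, [size_2_filter, size_3_filter]))  # [2.0, 4.0]
```

A filter of size `h` sees `n - h + 1` positions in an `n`-word sentence, so adding a window size adds a new group of pooled features rather than a deeper layer.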
- Train with pre-trained word embeddings. (We supply GloVe pre-trained word embeddings with 300 dimensions for your experiments, and you can explore the model performance with the same dimension without pre-trained word embeddings.)

      $ python test.py --checkpoint_dir="./runs/1577022248/checkpoints/"
      Total number of test examples: 2210
      Accuracy: 0.414027
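One common way to use pre-trained GloVe vectors is to build an embedding matrix with one row per vocabulary word. The sketch below assumes the GloVe text format (a word followed by space-separated values on each line) and initializes missing words with small random values; the function names, the toy 3-dimensional data, and the `[-0.25, 0.25]` range are illustrative assumptions, not details from `train.py`.

```python
import io
import random
from typing import Dict, Iterable, List


def read_glove(lines: Iterable[str]) -> Dict[str, List[float]]:
    """Parse GloVe-style text lines into a word -> vector mapping."""
    vectors: Dict[str, List[float]] = {}
    for line in lines:
        parts = line.rstrip().split(" ")
        if len(parts) < 2:
            continue  # skip blank or malformed lines
        vectors[parts[0]] = [float(x) for x in parts[1:]]
    return vectors


def build_embedding_matrix(vocab: List[str], vectors: Dict[str, List[float]],
                           dim: int, seed: int = 0) -> List[List[float]]:
    """One row per vocabulary word.

    Uses the pre-trained vector when one of the right size exists, otherwise
    a small random vector (an assumed fallback for unknown words).
    """
    rng = random.Random(seed)
    matrix = []
    for word in vocab:
        vec = vectors.get(word)
        if vec is None or len(vec) != dim:
            vec = [rng.uniform(-0.25, 0.25) for _ in range(dim)]
        matrix.append(list(vec))
    return matrix


if __name__ == "__main__":
    # Toy 3-dimensional stand-in for a real 300-dimensional GloVe file.
    toy_file = io.StringIO("good 0.1 0.2 0.3\nbad -0.1 -0.2 -0.3\n")
    vectors = read_glove(toy_file)
    matrix = build_embedding_matrix(["good", "bad", "meh"], vectors, dim=3)
    print(matrix[0])  # [0.1, 0.2, 0.3]
```

The resulting matrix would then serve as the initial value of the model's embedding layer, with `EMBEDDING_DIM` matching the GloVe dimension.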