Fix the output of cnn in DSSM. #534

Merged (4 commits), Dec 12, 2017
4 changes: 2 additions & 2 deletions dssm/README.cn.md
@@ -131,7 +131,7 @@ def create_cnn(self, emb, prefix=''):

conv_3 = create_conv(3, self.dnn_dims[1], "cnn")
conv_4 = create_conv(4, self.dnn_dims[1], "cnn")
-return conv_3, conv_4
+return paddle.layer.concat(input=[conv_3, conv_4])
```

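The CNN takes a sequence of word vectors, captures the key information of the original sentence through convolution and pooling, and finally outputs a semantic vector (which can be regarded as a sentence vector).
@@ -263,7 +263,7 @@ Pairwise Rank reuses the DNN structure above; the same source computes similarity against two targets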

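## Training

-You can run `python train.py -y 0 --model_arch 0` directly, using the sample data in the `./data/classification` directory, to test whether the classification FC model can be trained out of the box.
+You can run `python train.py -y 0 --model_arch 0 --class_num 2` directly, using the sample data in the `./data/classification` directory, to test whether the classification FC model can be trained out of the box.

Other model structures can also be customized from the command line; for the full list of command-line arguments, run `python train.py --help`.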

6 changes: 3 additions & 3 deletions dssm/README.md
@@ -107,7 +107,7 @@ def create_cnn(self, emb, prefix=''):

conv_3 = create_conv(3, self.dnn_dims[1], "cnn")
conv_4 = create_conv(4, self.dnn_dims[1], "cnn")
-return conv_3, conv_4
+return paddle.layer.concat(input=[conv_3, conv_4])
```

The CNN accepts the sequence of word embeddings, processes it with convolution and pooling, and finally outputs a semantic vector.
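The effect of the changed return value can be sketched without PaddlePaddle. The snippet below is a toy, pure-Python stand-in, not the code in `network_conf.py`: `conv_pool` is a hypothetical helper that applies each filter to every window of `context_len` word vectors and max-pools over positions, with no padding, bias, or activation. It only illustrates that concatenating the two pooled outputs yields a single vector that downstream layers can consume, whereas the previous tuple return handed them two separate values.

```python
def conv_pool(seq, context_len, filters):
    """Toy sequence convolution followed by max pooling (hypothetical helper).

    seq:      list of word vectors (each a list of numbers)
    filters:  list of flat weight lists, each of length context_len * dim
    Assumes len(seq) >= context_len; no padding, bias, or activation.
    Returns one pooled value per filter.
    """
    # Flatten every window of `context_len` consecutive word vectors.
    windows = [
        [x for vec in seq[i:i + context_len] for x in vec]
        for i in range(len(seq) - context_len + 1)
    ]
    # Dot each filter with each window, then keep the maximum over positions.
    return [
        max(sum(w * x for w, x in zip(f, win)) for win in windows)
        for f in filters
    ]


def create_cnn(seq, filters_3, filters_4):
    conv_3 = conv_pool(seq, 3, filters_3)
    conv_4 = conv_pool(seq, 4, filters_4)
    # Concatenate so the caller receives one vector rather than a tuple.
    return conv_3 + conv_4


# Five 2-dimensional "word vectors" and a few hand-picked filters.
seq = [[1, 0], [0, 1], [1, 1], [2, 0], [0, 2]]
filters_3 = [[1] * 6, [0] * 6]
filters_4 = [[1] * 8]
print(create_cnn(seq, filters_3, filters_4))  # [6, 0, 7]
```

The output width is the number of width-3 filters plus the number of width-4 filters, so any layer placed after the CNN has to be sized for the concatenated vector.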
@@ -240,12 +240,12 @@ The example of this format is as follows.

## Training

-We use `python train.py -y 0 --model_arch 0` with the data in `./data/classification` to train a DSSM model for classification. The parameters to execute the script `train.py` can be found by execution `python infer.py --help`. Some important parameters are:
+We use `python train.py -y 0 --model_arch 0 --class_num 2` with the data in `./data/classification` to train a DSSM model for classification. The parameters to execute the script `train.py` can be found by execution `python infer.py --help`. Some important parameters are:

- `train_data_path` Training data path
- `test_data_path` Test data path, optional
- `source_dic_path` Source dictionary path
-- `target_dic_path` 目Target dictionary path
+- `target_dic_path` Target dictionary path
- `model_type` The type of loss function of the model: classification 0, sort 1, regression 2
- `model_arch` Model structure: FC 0, CNN 1, RNN 2
- `dnn_dims` The dimension of each layer of the model; the default is `256,128,64,32`, i.e. 4 layers.
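As a rough illustration of how the options used above could be parsed, here is a minimal `argparse` sketch. It is not the actual `train.py`: the pairing of `-y` with `--model_type`, the `choices` lists, the `--class_num` default, and the `parse_dims` helper are all assumptions made for this example. Only the flag names, the `0/1/2` codes, and the `256,128,64,32` default come from the README text.

```python
import argparse


def parse_dims(text):
    """Turn a comma-separated string such as '256,128,64,32' into a list of ints."""
    return [int(part) for part in text.split(",")]


def build_parser():
    # Hypothetical stand-in for train.py's command line, for illustration only.
    parser = argparse.ArgumentParser(description="DSSM training options (sketch)")
    parser.add_argument("-y", "--model_type", type=int, choices=[0, 1, 2],
                        required=True,
                        help="loss type: classification 0, sort 1, regression 2")
    parser.add_argument("--model_arch", type=int, choices=[0, 1, 2],
                        required=True,
                        help="model structure: FC 0, CNN 1, RNN 2")
    parser.add_argument("--class_num", type=int, default=None,
                        help="number of classes (classification only)")
    parser.add_argument("--dnn_dims", type=parse_dims,
                        default=parse_dims("256,128,64,32"),
                        help="comma-separated layer dimensions")
    return parser


# Equivalent to: python train.py -y 0 --model_arch 0 --class_num 2
args = build_parser().parse_args(["-y", "0", "--model_arch", "0", "--class_num", "2"])
print(args.model_type, args.model_arch, args.class_num, args.dnn_dims)
```

Parsing `dnn_dims` into a list at the command-line boundary keeps the rest of the code free of string handling; whether the real script does this is not stated in the diff.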
6 changes: 3 additions & 3 deletions dssm/network_conf.py
@@ -146,12 +146,12 @@ def create_conv(context_len, hidden_size, prefix):
pool_bias_attr=ParamAttr(name=key + "_pool.b"))
return conv

-logger.info("create a sequence_conv_pool which context width is 3")
+logger.info("create a sequence_conv_pool whose context width is 3.")
conv_3 = create_conv(3, self.dnn_dims[1], "cnn")
-logger.info("create a sequence_conv_pool which context width is 4")
+logger.info("create a sequence_conv_pool whose context width is 4.")
conv_4 = create_conv(4, self.dnn_dims[1], "cnn")

-return conv_3, conv_4
+return paddle.layer.concat(input=[conv_3, conv_4])

def create_dnn(self, sent_vec, prefix):
# if more than three layers, than a fc layer will be added.