diff --git a/dssm/README.cn.md b/dssm/README.cn.md
index e1bd3cab89..4191d0fc0e 100644
--- a/dssm/README.cn.md
+++ b/dssm/README.cn.md
@@ -131,7 +131,7 @@ def create_cnn(self, emb, prefix=''):
 
     conv_3 = create_conv(3, self.dnn_dims[1], "cnn")
     conv_4 = create_conv(4, self.dnn_dims[1], "cnn")
-    return conv_3, conv_4
+    return paddle.layer.concat(input=[conv_3, conv_4])
 ```
 
 CNN 接受词向量序列,通过卷积和池化操作捕捉到原始句子的关键信息,最终输出一个语义向量(可以认为是句子向量)。
@@ -263,7 +263,7 @@ Pairwise Rank复用上面的DNN结构,同一个source对两个target求相似
 
 ## 执行训练
 
-可以直接执行 `python train.py -y 0 --model_arch 0` 使用 `./data/classification` 目录里的实例数据来测试能否直接运行训练分类FC模型。
+可以直接执行 `python train.py -y 0 --model_arch 0 --class_num 2` 使用 `./data/classification` 目录里的实例数据来测试能否直接运行训练分类FC模型。
 
 其他模型结构也可以通过命令行实现定制,详细命令行参数请执行 `python train.py --help`进行查阅。
 
diff --git a/dssm/README.md b/dssm/README.md
index 8148ea6557..23f55b55cc 100644
--- a/dssm/README.md
+++ b/dssm/README.md
@@ -107,7 +107,7 @@ def create_cnn(self, emb, prefix=''):
 
     conv_3 = create_conv(3, self.dnn_dims[1], "cnn")
     conv_4 = create_conv(4, self.dnn_dims[1], "cnn")
-    return conv_3, conv_4
+    return paddle.layer.concat(input=[conv_3, conv_4])
 ```
 
 CNN accepts the word sequence of the embedding table, then process the data by convolution and pooling, and finally outputs a semantic vector.
@@ -240,12 +240,12 @@ The example of this format is as follows.
 
 ## Training
 
-We use `python train.py -y 0 --model_arch 0` with the data in `./data/classification` to train a DSSM model for classification. The paremeters to execute the script `train.py` can be found by execution `python infer.py --help`. Some important parameters are:
+We use `python train.py -y 0 --model_arch 0 --class_num 2` with the data in `./data/classification` to train a DSSM model for classification. The parameters of the script `train.py` can be found by executing `python train.py --help`. Some important parameters are:
 
 - `train_data_path` Training data path
 - `test_data_path` Test data path, optional
 - `source_dic_path` Source dictionary path
-- `target_dic_path` 目Target dictionary path
+- `target_dic_path` Target dictionary path
 - `model_type` The type of loss function of the model: classification 0, sort 1, regression 2
 - `model_arch` Model structure: FC 0,CNN 1, RNN 2
 - `dnn_dims` The dimension of each layer of the model is set, the default is `256,128,64,32`,with 4 layers.
diff --git a/dssm/network_conf.py b/dssm/network_conf.py
index 135a00bf6f..8cd4b6f008 100644
--- a/dssm/network_conf.py
+++ b/dssm/network_conf.py
@@ -146,12 +146,12 @@ def create_conv(context_len, hidden_size, prefix):
                 pool_bias_attr=ParamAttr(name=key + "_pool.b"))
             return conv
 
-        logger.info("create a sequence_conv_pool which context width is 3")
+        logger.info("create a sequence_conv_pool whose context width is 3.")
         conv_3 = create_conv(3, self.dnn_dims[1], "cnn")
-        logger.info("create a sequence_conv_pool which context width is 4")
+        logger.info("create a sequence_conv_pool whose context width is 4.")
         conv_4 = create_conv(4, self.dnn_dims[1], "cnn")
 
-        return conv_3, conv_4
+        return paddle.layer.concat(input=[conv_3, conv_4])
 
     def create_dnn(self, sent_vec, prefix):
         # if more than three layers, than a fc layer will be added.
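
The recurring change above is the same one-line fix applied to `network_conf.py` and to the code snippets quoted in both READMEs: `create_cnn` now returns a single semantic vector built with `paddle.layer.concat` instead of the tuple `(conv_3, conv_4)`. A minimal, self-contained sketch of that pattern, assuming the legacy PaddlePaddle v2 API used by this repo (the data-layer name, dictionary size, and layer widths below are illustrative, not taken from `network_conf.py`):

```python
import paddle.v2 as paddle

paddle.init(use_gpu=False, trainer_count=1)

# Word-id sequence -> embedding sequence (dictionary size and widths are illustrative).
words = paddle.layer.data(
    name="source_words",
    type=paddle.data_type.integer_value_sequence(10000))
emb = paddle.layer.embedding(input=words, size=256)


def create_cnn(emb, hidden_size=128):
    """Tri-gram and four-gram conv-pool branches over the same embedded sequence."""
    conv_3 = paddle.networks.sequence_conv_pool(
        input=emb, context_len=3, hidden_size=hidden_size)
    conv_4 = paddle.networks.sequence_conv_pool(
        input=emb, context_len=4, hidden_size=hidden_size)
    # The fix: return one concatenated vector rather than a (conv_3, conv_4) tuple,
    # so the downstream fully connected stack receives a single layer as input.
    return paddle.layer.concat(input=[conv_3, conv_4])


sent_vec = create_cnn(emb)  # width 2 * hidden_size, fed to create_dnn in network_conf.py
```

Returning a tuple would force the caller to handle two layers where one is expected; concatenating the two pooled outputs keeps the CNN branch interchangeable with the FC and RNN branches selected by `--model_arch`.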