Add optimized decoder for the deployment of DS2 #139

kuke · 2017-06-29T02:09:42Z

Implement the CTC beam search decoder in C++ to speed up decoding. Given the same parameters, this optimized decoder produces results identical to those of the Python prototype decoder and runs about 3x faster in a single thread (measured with Python's time module).
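For readers unfamiliar with the algorithm, here is a minimal Python 3 sketch of textbook CTC prefix beam search. It is not the C++ implementation from this PR: it omits the language model, the cutoff pruning and the prefix trie, and it assumes the blank symbol is the last index of each probability row. The function names are illustrative only.

```python
import math
from collections import defaultdict

NEG_INF = float("-inf")


def log_add(a, b):
    """Return log(exp(a) + exp(b)), treating -inf as probability zero."""
    if a == NEG_INF:
        return b
    if b == NEG_INF:
        return a
    hi, lo = max(a, b), min(a, b)
    return hi + math.log1p(math.exp(lo - hi))


def ctc_prefix_beam_search(probs_seq, vocabulary, beam_size, blank_id=None):
    """Return the most probable string for one utterance (no language model)."""
    if blank_id is None:
        blank_id = len(vocabulary)  # assumption: blank is the last index
    # prefix (tuple of ids) -> (log P of paths ending in blank,
    #                           log P of paths ending in non-blank)
    beams = {(): (0.0, NEG_INF)}
    for probs in probs_seq:
        next_beams = defaultdict(lambda: (NEG_INF, NEG_INF))
        for prefix, (p_b, p_nb) in beams.items():
            for idx, p in enumerate(probs):
                if p <= 0.0:
                    continue
                log_p = math.log(p)
                if idx == blank_id:
                    # A blank keeps the prefix unchanged.
                    nb, nnb = next_beams[prefix]
                    total = log_add(p_b, p_nb) + log_p
                    next_beams[prefix] = (log_add(nb, total), nnb)
                    continue
                new_prefix = prefix + (idx,)
                nb, nnb = next_beams[new_prefix]
                if prefix and prefix[-1] == idx:
                    # A repeated character only extends paths ending in blank ...
                    next_beams[new_prefix] = (nb, log_add(nnb, p_b + log_p))
                    # ... otherwise it collapses into the same prefix.
                    sb, snb = next_beams[prefix]
                    next_beams[prefix] = (sb, log_add(snb, p_nb + log_p))
                else:
                    total = log_add(p_b, p_nb) + log_p
                    next_beams[new_prefix] = (nb, log_add(nnb, total))
        ranked = sorted(next_beams.items(),
                        key=lambda kv: log_add(*kv[1]), reverse=True)
        beams = dict(ranked[:beam_size])
    best = max(beams.items(), key=lambda kv: log_add(*kv[1]))[0]
    return "".join(vocabulary[i] for i in best)
```

Unlike best-path decoding, the search sums the probability of all alignments that collapse to the same prefix, which is why it can prefer a string even when the single most probable alignment is all blanks.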

To get closer to real-time decoding in real-world deployment, the beam width can be tuned. An experiment on 100 samples illustrates the effect of beam size on WER and decoding speed; here are some results:

The results show that when beam size < 200, the average decoding time per sample stays within 1 s without a significant degradation in WER. Setting the beam size in this range therefore lets decoding in deployment finish within an acceptable time.
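An experiment of this kind could be scripted roughly as follows. This is only a sketch in Python 3, not the script used for the measurements above: `decode_fn` is a hypothetical callable taking a probability sequence and a beam size, and `word_error_rate` is a plain word-level edit distance divided by the reference length.

```python
import time


def average_decode_time(decode_fn, samples, beam_sizes):
    """Return {beam_size: mean seconds per sample} for a decoder callable."""
    results = {}
    for beam_size in beam_sizes:
        start = time.perf_counter()
        for probs_seq in samples:
            decode_fn(probs_seq, beam_size)
        elapsed = time.perf_counter() - start
        results[beam_size] = elapsed / len(samples)
    return results


def word_error_rate(reference, hypothesis):
    """Word-level Levenshtein distance divided by the reference length."""
    ref, hyp = reference.split(), hypothesis.split()
    if not ref:
        raise ValueError("reference must contain at least one word")
    # prev[j] holds the edit distance between ref[:i-1] and hyp[:j].
    prev = list(range(len(hyp) + 1))
    for i, ref_word in enumerate(ref, start=1):
        curr = [i]
        for j, hyp_word in enumerate(hyp, start=1):
            cost = 0 if ref_word == hyp_word else 1
            curr.append(min(prev[j] + 1,          # deletion
                            curr[j - 1] + 1,      # insertion
                            prev[j - 1] + cost))  # substitution or match
        prev = curr
    return prev[-1] / len(ref)
```

Running `average_decode_time` over a list of beam sizes and computing `word_error_rate` on the decoded text for each gives the two curves needed to pick a beam size.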

lcy-seso · 2017-07-05T04:42:40Z

As a next step, would it be possible to add this to Paddle as a new Layer, for example CtcDecodingLayer?

kuke · 2017-07-06T03:34:34Z

@lcy-seso Yes, that is feasible. In fact we already plan to do so: once the decoder has been sufficiently optimized, it will be moved into Paddle.

pkuyym · 2017-07-06T04:52:39Z

Putting it into PaddlePaddle requires taking the Language Model into account.

lcy-seso · 2017-07-06T04:54:16Z

TensorFlow's CTCDecoder Layer doesn't need a language model, does it? We could leave a callback interface for it.

pkuyym · 2017-07-06T04:55:17Z

This version does need one. In TF the LM is only used for reranking at the end, whereas here it has to be taken into account during decoding.

kuke · 2017-07-06T13:06:28Z

It is only that our current decoding procedure takes the language model into account. Added to Paddle as a plain decoder, it can be fully equivalent to TensorFlow's.
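The two strategies discussed in this exchange can be contrasted with a small Python 3 sketch. Everything here is hypothetical and only meant to illustrate the idea of a scorer callback: `ext_scorer` is assumed to be a callable that takes a list of words and returns the log probability of the last word given the preceding ones, while `alpha` and `beta` play the roles of the language model weight and the word count weight.

```python
import math


def lm_bonus(prefix, new_char, ext_scorer, alpha, beta):
    """Extra log-score applied during decoding when new_char ends a word."""
    if ext_scorer is None or new_char != " ":
        return 0.0
    words = prefix.split()
    if not words:
        return 0.0
    return alpha * ext_scorer(words) + beta


def rerank(candidates, ext_scorer, alpha, beta):
    """Pick the best (text, ctc_log_prob) hypothesis after decoding is done."""
    def total(item):
        text, ctc_log_prob = item
        words = text.split()
        lm = sum(ext_scorer(words[:i + 1]) for i in range(len(words)))
        return ctc_log_prob + alpha * lm + beta * len(words)
    return max(candidates, key=total)[0]


# Toy unigram scorer used only to exercise the two functions above.
TOY_UNIGRAM = {"the": 0.5, "cat": 0.4, "cot": 0.1}


def toy_scorer(words):
    return math.log(TOY_UNIGRAM.get(words[-1], 1e-6))
```

With `lm_bonus`, the language model can change which prefixes survive in the beam; with `rerank`, it can only reorder hypotheses that the acoustic scores alone already kept. Passing `ext_scorer=None` reduces the first variant to a plain decoder.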


kuke

Updated. Please continue to review

kuke · 2017-09-15T14:42:01Z

deep_speech_2/infer.py

@@ -84,14 +84,16 @@ def infer():
        use_gru=args.use_gru,
        pretrained_model_path=args.model_path,
        share_rnn_weights=args.share_rnn_weights)
+
+    vocab_list = [chars.encode("utf-8") for chars in data_generator.vocab_list]


kuke · 2017-09-15T14:43:13Z

deep_speech_2/decoders/swig/ctc_decoders.h

+ *     num_processes: Number of threads for beam search.
+ *     cutoff_prob: Cutoff probability for pruning.
+ *     cutoff_top_n: Cutoff number for pruning.
+ *     ext_scorer: External scorer to evaluate a prefix.


pkuyym · 2017-09-16T06:17:32Z

deep_speech_2/decoders/swig/path_trie.h

@@ -0,0 +1,68 @@
+#ifndef PATH_TRIE_H
+#define PATH_TRIE_H
+#pragma once


If we have #pragma once here, I think there's no need for #ifndef #define #endif.

pkuyym · 2017-09-16T06:24:33Z

deep_speech_2/decoders/decoder_deprecated.py

            prob_idx = sorted(prob_idx, key=lambda asd: asd[1], reverse=True)
            cutoff_len, cum_prob = 0, 0.0
            for i in xrange(len(prob_idx)):
                cum_prob += prob_idx[i][1]
                cutoff_len += 1
                if cum_prob >= cutoff_prob:
                    break
+            cutoff_len = min(cutoff_top_n, cutoff_top_n)


You could move the cutoff_len check into the for loop as a stop condition, e.g. if (cum_prob >= cutoff_prob or cutoff_len >= threshold) break.
I think min(cutoff_top_n, cutoff_top_n) is a typo.

It is not necessary.

Corrected
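The reviewer's suggestion of folding both limits into a single stop condition can be sketched as below. This is a Python 3 illustration, not the code that was merged; the function name `prune_candidates` is made up, and the defaults mirror the `cutoff_prob=1.0` and `cutoff_top_n=40` values shown in the diff.

```python
def prune_candidates(probs, cutoff_prob=1.0, cutoff_top_n=40):
    """Return the (index, prob) pairs kept for one time step, best first."""
    ranked = sorted(enumerate(probs), key=lambda item: item[1], reverse=True)
    kept, cum_prob = [], 0.0
    for index, prob in ranked:
        kept.append((index, prob))
        cum_prob += prob
        # Stop once enough probability mass or enough candidates are kept.
        if cum_prob >= cutoff_prob or len(kept) >= cutoff_top_n:
            break
    return kept
```

For example, `prune_candidates([0.5, 0.3, 0.1, 0.1], cutoff_prob=0.75)` keeps only the first two entries, since they already cover 0.8 of the probability mass.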

pkuyym · 2017-09-16T06:29:14Z

deep_speech_2/decoders/decoder_deprecated.py

@@ -232,8 +228,8 @@ def ctc_beam_search_decoder_batch(probs_split,
    pool = multiprocessing.Pool(processes=num_processes)
    results = []
    for i, probs_list in enumerate(probs_split):
-        args = (probs_list, beam_size, vocabulary, blank_id, cutoff_prob, None,
-                nproc)
+        args = (probs_list, beam_size, vocabulary, blank_id, cutoff_prob,


Please add a comment clarifying why the global ext_nproc_scorer is used instead of passing ext_nproc_scorer to ctc_beam_search_decoder.

Will add the comment in a later PR.

pkuyym · 2017-09-16T06:31:37Z

deep_speech_2/decoders/swig/ctc_decoders.cpp

+  for (size_t i = 0; i < num_time_steps; ++i) {
+    VALID_CHECK_EQ(probs_seq[i].size(),
+                   vocabulary.size() + 1,
+                   "The shape of probs_seq does not match with "


The message lacks information, e.g. that the size of probs_seq should equal the size of the vocabulary plus one.

Modified. This macro now reports where the error occurs along with the explicit expression.

pkuyym · 2017-09-16T06:33:47Z

deep_speech_2/decoders/swig/ctc_decoders.cpp

+  for (size_t i = 0; i < num_time_steps; ++i) {
+    double max_prob = 0.0;
+    size_t max_idx = 0;
+    for (size_t j = 0; j < probs_seq[i].size(); j++) {


j++ --> ++j
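The loop under review selects the most probable index at each time step. For reference, a minimal Python 3 sketch of greedy (best path) CTC decoding built on that step is shown here. It is not a translation of the C++ function: the name `ctc_greedy_decode` is illustrative, and it assumes the blank symbol is the last index, so each row must have `len(vocabulary) + 1` entries, matching the shape check discussed earlier in this review.

```python
def ctc_greedy_decode(probs_seq, vocabulary, blank_id=None):
    """Best-path decoding: argmax per step, merge repeats, drop blanks."""
    if blank_id is None:
        blank_id = len(vocabulary)  # assumption: blank is the last index
    for step, probs in enumerate(probs_seq):
        if len(probs) != len(vocabulary) + 1:
            raise ValueError(
                "step %d has %d probabilities, expected %d"
                % (step, len(probs), len(vocabulary) + 1))
    # Index of the largest probability at every time step.
    best_ids = [max(range(len(p)), key=p.__getitem__) for p in probs_seq]
    chars = []
    prev = None
    for idx in best_ids:
        # Emit a character only when it differs from the previous step
        # and is not the blank symbol.
        if idx != prev and idx != blank_id:
            chars.append(vocabulary[idx])
        prev = idx
    return "".join(chars)
```

A blank between two identical characters is what allows a doubled letter to survive the merge step.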

pkuyym · 2017-09-16T07:07:07Z

deep_speech_2/decoders/swig/ctc_decoders.cpp

+  }  // end of loop over time
+
+  // compute aproximate ctc score as the return score
+  for (size_t i = 0; i < beam_size && i < prefixes.size(); ++i) {


mark here, this can be removed.

pkuyym · 2017-09-16T07:08:51Z

deep_speech_2/decoders/swig/ctc_decoders.h

@@ -0,0 +1,75 @@
+#ifndef CTC_BEAM_SEARCH_DECODER_H_


Please use either #pragma once or #ifndef #define #endif consistently.

pkuyym · 2017-09-16T07:13:08Z

deep_speech_2/decoders/swig/path_trie.h

+#include <utility>
+#include <vector>
+
+using FSTMATCH = fst::SortedMatcher<fst::StdVectorFst>;


Don't use using declarations in a header file.

pkuyym · 2017-09-16T07:17:54Z

deep_speech_2/decoders/swig/setup.sh

@@ -0,0 +1,21 @@
+#!/bin/bash


Change to #! /usr/bin/env bash; see https://stackoverflow.com/questions/16365130/the-difference-between-usr-bin-env-bash-and-usr-bin-bash

pkuyym · 2017-09-16T07:20:12Z

deep_speech_2/decoders/swig/setup.sh

@@ -0,0 +1,21 @@
+#!/bin/bash
+
+if [ ! -d kenlm ]; then


please add error checking.

Will append later

xinghai-sun

Great job. Thanks!

xinghai-sun · 2017-09-16T05:02:30Z

deep_speech_2/README.md

@@ -176,6 +176,7 @@ Data augmentation has often been a highly effective technique to boost the deep

 Six optional augmentation components are provided to be selected, configured and inserted into the processing pipeline.

+### Inference


Remove L179

xinghai-sun · 2017-09-16T05:04:28Z

deep_speech_2/decoders/decoder_deprecated.py

                            cutoff_prob=1.0,
+                            cutoff_top_n=40,


Why add cutoff_top_n?

It's a parameter used for the Mandarin vocabulary cutoff.

xinghai-sun · 2017-09-16T05:20:52Z

deep_speech_2/decoders/swig/decoder_utils.h

+const float NUM_FLT_MIN = std::numeric_limits<float>::min();
+
+// check if __A == _B
+#define VALID_CHECK_EQ(__A, __B, __ERR)          \


Could you consider using GLOG instead for simplicity?

GLOG conflicts with macro definitions in openfst, so an improved macro is used here instead.

xinghai-sun · 2017-09-16T05:21:56Z

deep_speech_2/decoders/swig/path_trie.h

+#ifndef PATH_TRIE_H
+#define PATH_TRIE_H
+#pragma once
+#include <fst/fstlib.h>


Does the include order follow the Google coding style?

xinghai-sun · 2017-09-16T05:27:12Z

deep_speech_2/decoders/swig/ctc_decoders.cpp

+  std::vector<size_t> max_idx_vec;
+  for (size_t i = 0; i < num_time_steps; ++i) {
+    double max_prob = 0.0;
+    size_t max_idx = 0;


Add const std::vector<double>& probs_seq_step = probs_seq[i];
Using probs_seq_step afterwards would be a little faster.
But I'm not sure whether the compiler already does this implicitly.

xinghai-sun · 2017-09-16T07:16:19Z

deep_speech_2/decoders/swig/ctc_decoders.h

+    std::vector<std::string> vocabulary,
+    const double cutoff_prob = 1.0,
+    const size_t cutoff_top_n = 40,
+    Scorer *ext_scorer = NULL);


NULL --> nullptr

xinghai-sun · 2017-09-16T07:16:29Z

deep_speech_2/decoders/swig/ctc_decoders.h

+    const size_t num_processes,
+    double cutoff_prob = 1.0,
+    const size_t cutoff_top_n = 40,
+    Scorer *ext_scorer = NULL);


xinghai-sun · 2017-09-16T07:32:15Z

deep_speech_2/decoders/swig/scorer.cpp

+  if (dictionary != nullptr) delete static_cast<fst::StdVectorFst*>(dictionary);
+}
+
+void Scorer::load_LM(const char* filename) {


xinghai-sun · 2017-09-16T07:34:43Z

deep_speech_2/decoders/swig_wrapper.py

+                  language model when alpha = 0.
+    :type alpha: float
+    :param beta: Parameter associated with word count. Don't use word
+                count when beta = 0.


Be careful with the indentation.

xinghai-sun · 2017-09-16T07:36:17Z

deep_speech_2/examples/librispeech/run_infer.sh

@@ -21,9 +21,10 @@ python -u infer.py \
 --num_conv_layers=2 \
 --num_rnn_layers=3 \
 --rnn_layer_size=2048 \
--alpha=0.36 \


Please also update this in examples/tiny.

kuke

Followed comments. Please continue to review

kuke · 2017-09-17T13:38:39Z

deep_speech_2/README.md

@@ -176,6 +176,7 @@ Data augmentation has often been a highly effective technique to boost the deep

 Six optional augmentation components are provided to be selected, configured and inserted into the processing pipeline.

+### Inference


kuke · 2017-09-17T13:39:37Z

deep_speech_2/decoders/decoder_deprecated.py

                            cutoff_prob=1.0,
+                            cutoff_top_n=40,


It's a parameter used for the Mandarin vocabulary cutoff.

kuke · 2017-09-17T13:42:22Z

deep_speech_2/decoders/decoder_deprecated.py

            prob_idx = sorted(prob_idx, key=lambda asd: asd[1], reverse=True)
            cutoff_len, cum_prob = 0, 0.0
            for i in xrange(len(prob_idx)):
                cum_prob += prob_idx[i][1]
                cutoff_len += 1
                if cum_prob >= cutoff_prob:
                    break
+            cutoff_len = min(cutoff_top_n, cutoff_top_n)


It is not necessary.

Corrected

kuke · 2017-09-17T13:43:14Z

deep_speech_2/decoders/decoder_deprecated.py

@@ -232,8 +228,8 @@ def ctc_beam_search_decoder_batch(probs_split,
    pool = multiprocessing.Pool(processes=num_processes)
    results = []
    for i, probs_list in enumerate(probs_split):
-        args = (probs_list, beam_size, vocabulary, blank_id, cutoff_prob, None,
-                nproc)
+        args = (probs_list, beam_size, vocabulary, blank_id, cutoff_prob,


Will add the comment in a later PR.

kuke · 2017-09-17T13:43:28Z

deep_speech_2/decoders/swig/ctc_decoders.cpp

+#include "decoder_utils.h"
+#include "path_trie.h"
+
+std::string ctc_greedy_decoder(


kuke · 2017-09-17T14:24:24Z

deep_speech_2/decoders/swig/setup.sh

@@ -0,0 +1,21 @@
+#!/bin/bash
+
+if [ ! -d kenlm ]; then


Will append later

kuke · 2017-09-17T14:24:32Z

deep_speech_2/decoders/swig_wrapper.py

+                  language model when alpha = 0.
+    :type alpha: float
+    :param beta: Parameter associated with word count. Don't use word
+                count when beta = 0.


kuke · 2017-09-17T14:24:45Z

deep_speech_2/examples/librispeech/run_infer.sh

@@ -21,9 +21,10 @@ python -u infer.py \
 --num_conv_layers=2 \
 --num_rnn_layers=3 \
 --rnn_layer_size=2048 \
--alpha=0.36 \


kuke · 2017-09-17T14:27:32Z

deep_speech_2/decoders/swig/scorer.h

+  void reset_params(float alpha, float beta);
+
+  // make ngram
+  std::vector<std::string> make_ngram(PathTrie *prefix);


It seems there would be an error when using a const reference.

kuke · 2017-09-17T14:42:01Z

deep_speech_2/decoders/swig/ctc_decoders.cpp

+        ext_scorer->fill_dictionary(true);
+      }
+      auto fst_dict = static_cast<fst::StdVectorFst *>(ext_scorer->dictionary);
+      fst::StdVectorFst *dict_ptr = fst_dict->Copy(true);


This should not be a simple pointer assignment; it returns a different match type depending on the state:
http://www.openfst.org/doxygen/fst/html/matcher_8h_source.html#l00041

pkuyym · 2017-09-18T06:51:42Z

Great job! LGTM

kuke · 2017-09-18T12:14:50Z

Resolve #277

fanlu · 2017-10-14T04:27:27Z

@kuke I ran sh setup.sh in the swig directory
and it reported an error:
Install decoders ...
decoder_utils.h:55: Error: Syntax error in input(1).
running install
The installation eventually succeeded:
Processing dependencies for swig-decoders==0.1
Finished processing dependencies for swig-decoders==0.1
But running python -c "import swig_decoders" still fails:
Traceback (most recent call last):
File "<string>", line 1, in <module>
ImportError: No module named swig_decoders

kuke · 2017-10-14T04:46:22Z

decoder_utils.h:55: Error: Syntax error in input(1)
This error occurs because the installed version of swig is too old. Please upgrade swig first, then reinstall the decoders.
@fanlu

fanlu · 2017-10-14T07:04:09Z

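@kuke Thank you very much; I'm not familiar with swig. I have two more questions. 1. Running it on a Mac reports this error:
openfst-1.6.3/src/include/fst/types.h:19:10: fatal error: 'cstdint' file not found
2. Also, regarding Chinese: at line 122 of ctc_beam_search_decoder.cpp there is c == space_id. Chinese has no spaces, so how is the language model's transition probability added?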

lcy-seso · 2017-10-14T07:07:46Z

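@fanlu Hi! Since this PR has already been merged and is not easy for others to find through search, could you open an issue and collect all the problems you ran into there?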

fanlu · 2017-10-14T07:12:56Z

@lcy-seso OK

Yibing Liu added 2 commits June 29, 2017 10:05

add initial files for deployment

724b0fb

add deploy.py

348d6bb

kuke changed the title ~~Add optimized decoder for deployment~~ Add optimized decoder for the deployment of DS2 Jun 29, 2017

Yibing Liu added 2 commits June 29, 2017 15:56

Merge branch 'develop' into ctc_decoder_deploy

59b4b87

fix bugs

a506198

kuke force-pushed the ctc_decoder_deploy branch from d6c3028 to a506198 Compare July 5, 2017 04:14

Yibing Liu added 2 commits July 5, 2017 15:07

Merge branch 'develop' into ctc_decoder_deploy

903c300

code cleanup for the deployment decoder

1ca3814

add setup and README for deployment

34f98e0

kuke requested review from xinghai-sun and pkuyym July 6, 2017 11:52

enable loading language model in multiple format

ae05535

kuke force-pushed the ctc_decoder_deploy branch from 4a3b6e9 to ae05535 Compare July 11, 2017 07:26

kuke force-pushed the ctc_decoder_deploy branch 2 times, most recently from ac18ee5 to fff62dc Compare July 27, 2017 02:04

Yibing Liu added 2 commits July 31, 2017 12:12

change probs' computation into log scale & add best path decoder

4e5b345

refine the interface of decoders in swig

908932f

kuke force-pushed the ctc_decoder_deploy branch from 14b16ac to 908932f Compare August 3, 2017 03:58

kuke and others added 6 commits August 3, 2017 14:46

Delete swig_decoder.py

ac3a49c

reorganize cpp files

9ff48b0

refine wrapper for swig and simplify setup

32047c7

add the support of parallel beam search decoding in deployment

f41375b

Refactor scorer and move utility functions to decoder_util.h

a96c650

Merge branch 'ctc_decoder_deploy' of https://github.com/kuke/models i…

3441148

…nto ctc_decoder_deploy

Yibing Liu added 2 commits September 15, 2017 22:30

refine by following review comments

0bda37c

append some changes

902c35b

kuke commented Sep 15, 2017


kuke mentioned this pull request Sep 15, 2017

vocabulary should contain blank to make the decoder API easy to use. #254

Closed

Yibing Liu added 2 commits September 16, 2017 11:04

Merge branch 'develop' of upstream into ctc_decoder_deploy

bb35363

expose param cutoff_top_n

15728d0

pkuyym requested changes Sep 16, 2017


xinghai-sun requested changes Sep 16, 2017


Yibing Liu added 2 commits September 17, 2017 19:05

adjust scorer's init & add logging for scorer & separate long functions

e6740af

format varabiables' name & add more comments

8c5576d

xinghai-sun approved these changes Sep 17, 2017


kuke commented Sep 17, 2017


Yibing Liu added 3 commits September 18, 2017 12:48

Merge branch 'develop' of upstream into ctc_decoder_deploy

bcc236e

adjust to pass ci

98d35b9

specify clang_format to ver3.9

d7a9752

pkuyym approved these changes Sep 18, 2017


Yibing Liu added 2 commits September 18, 2017 16:20

disable the make output of libsndfile in setup

cc2f91f

use cd instead of pushd in setup.sh

f1cd672

kuke force-pushed the ctc_decoder_deploy branch from 47dc47b to f1cd672 Compare September 18, 2017 11:32

Yibing Liu added 2 commits September 18, 2017 22:18

Merge branch 'develop' of upstream into ctc_decoder_deploy

cfecaa8

pass unittest for deprecated decoders

9db0d25

kuke merged commit 17ebb40 into PaddlePaddle:develop Sep 18, 2017

fanlu mentioned this pull request Oct 14, 2017

Some issues with ctc_decoder #376

Closed
