Add native JavaScript(TypeScript) to replace translation.py #243

huan · 2019-03-07T07:08:49Z

In our translation example, we use a python script to generate the model, train, and save the model to file for future use.

I think it will be more clear to use a native JavaScript script to do this task, so I rewrite it from Python to TypeScript.

Train the model and save it to file in JavaScript demostrated as the following:

$ yarn train /dist/fra.txt 
yarn run v1.12.3
$ node -r ts-node/register translation.ts --artifacts_dir dist/resources /home/zixia/git/javascript-concist-chit-chat/data/fra.txt
2019-03-07 15:02:07.198157: I tensorflow/core/platform/cpu_feature_guard.cc:141] Your CPU supports instructions that this TensorFlow binary was not compiled to use: SSE4.1 SSE4.2 AVX AVX2 FMA
Number of samples: 10000
Number of unique input tokens: 69
Number of unique output tokens: 93
Max sequence length for inputs: 16
Max sequence length for outputs: 59
Saved metadata at:  dist/resources/metadata.json
Orthogonal initializer is being called on a matrix with more than 2000 (262144) elements: Slowness may result.
Orthogonal initializer is being called on a matrix with more than 2000 (262144) elements: Slowness may result.
Epoch 1 / 20
eta=0.0 =============================================================================================================================> 
59356ms 7419us/step - loss=1.11 val_loss=1.10 
Epoch 2 / 20
eta=0.0 =============================================================================================================================> 
60262ms 7533us/step - loss=0.944 val_loss=1.04 
Epoch 3 / 20
eta=0.0 =============================================================================================================================> 

...
...
...

63610ms 7951us/step - loss=0.529 val_loss=0.631 
Epoch 17 / 20
eta=0.0 =============================================================================================================================> 
64396ms 8050us/step - loss=0.521 val_loss=0.618 
Epoch 18 / 20
eta=0.0 =============================================================================================================================> 
61433ms 7679us/step - loss=0.510 val_loss=0.611 
Epoch 19 / 20
eta=0.0 =============================================================================================================================> 
61209ms 7651us/step - loss=0.499 val_loss=0.600 
Epoch 20 / 20
eta=0.0 =============================================================================================================================> 
61208ms 7651us/step - loss=0.492 val_loss=0.590

Cheers.

CC @wangtz

BTW: I'd like to suggest let's open the ISSUE entry for this repo because the examples might have many problems need to be discussed.

This change is

huan · 2019-03-09T03:51:50Z

translation/translation.ts

+  */
+  // Define an input sequence and process it.
+  const encoder_inputs = tf.layers.input({
+    shape: [null, num_encoder_tokens] as number[],


See: tensorflow/tfjs-layers#492 (comment)
@davidsoergel @caisq

nsthorat · 2019-03-11T17:43:35Z

Once tensorflow/tfjs-node#209 goes in, we'll have a 1.0 for node.js.

caisq

Thank you, @huan ! I think it's very exciting that the training of the translation model can be made working in tfjs-node/tfjs-node-gpu. Please see my first round of comments below.

Reviewable status: 1 change requests, 0 of 1 approvals obtained (waiting on @caisq and @huan)

translation/build-resources.sh, line 68 at r2 (raw file):

orthogonal

Nice!

translation/package.json, line 18 at r2 (raw file):

Quoted 6 lines of code…

    "argparse": "^1.0.10",
    "invert-kv": "^2.0.0",
    "mkdirp": "^0.5.1",
    "readline": "^1.3.0",
    "vega-embed": "^3.0.0",
    "zip-array": "^1.0.1"

I think these should all go into devDependencies (except vega-embed). These newly added dependencies are used only for training in Node.js. We don't want the to become a part of the web bundle.

translation/TRAIN.javascript.md, line 22 at r2 (raw file):

=============================================>·

nit: These copy-and-pasted progress bar are not very nice because they are too long and will cause line wrapping on many screens. The width of the progress bar in tfjs-node/tfjs-node-gpu actually scales with the width of the terminal. Can you shorten them to <= 80 chars. You don't need to run the training job again. Just do string replacement in this file.

translation/TRAIN.javascript.md, line 324 at r2 (raw file):

Va !

nit: Can you remove these ugly-looking tabs at the end of the lines? Same below.

translation/translation.ts, line 208 at r1 (raw file):

Previously, huan (Huan LI (李卓桓)) wrote…

See: tensorflow/tfjs-layers#492 (comment)
@davidsoergel @caisq

Ack.

translation/translation.ts, line 16 at r2 (raw file):

import fs     from 'fs'

style nit: Please follow the style of other examples:

No alignment through extra whitespace
No space after { and before }.

translation/translation.ts, line 27 at r2 (raw file):

'@tensorflow/tfjs-node

We want to parameterize this training script, so that the user can switch to CUDA GPU-based training by simply adding a --gpu flag.
You can achieve this by adding tfjs-node-gpu as a devDependency (in package.json) and properly configuring argparse here.
You can follow the example in https://github.com/tensorflow/tfjs-examples/blob/master/date-conversion-attention/train.js#L129

translation/translation.ts, line 33 at r2 (raw file):

Does it make sense to explicitly type the return value of this function and the functions below?

translation/translation.ts, line 48 at r2 (raw file):

= 0

style nit: Google JS/TS style requires semicolons at the end of every line.

translation/translation.ts, line 85 at r2 (raw file):

.redu

nit: hanging indentation should be four spaces relative to the beginning of the previous line.

translation/translation.ts, line 180 at r2 (raw file):

num_encoder_tokens:

style nit: Google JS style recommends lowerCamelCase. Please change the variable names throughout.
https://google.github.io/styleguide/jsguide.html#naming-local-variable-names

translation/translation.ts, line 184 at r2 (raw file):

/**

nit: doc string should be above the function. See other .js files in the tfjs-example for examples.

translation/translation.ts, line 265 at r2 (raw file):

decode_sequence

Like above, Google JS style recommends loweCamelCase function names. Please make changes throughout.

huan · 2019-03-12T04:23:29Z

@nsthorat That's right. I found that the tfjs-node does not work with the current tfjs 1.0. I'll upgrade this example to TFJS 1.0 when they getting compatible stably.

@caisq Thanks for the review, I'll follow your reviews and modify this PR as you requested later.

…us line

huan · 2019-03-12T06:15:19Z

Hi @caisq, I had modified my PR as your request, except that I did not add the explicitly type for the return of function because they can be referenced automatically from the return statement inside the function.

Please let me know if there need any other modifications, thanks.

caisq

Thanks! I just have some remaining comments. Please address them and then we'll be ready to merge the PR.

Reviewable status: complete! 1 of 1 approvals obtained (waiting on @huan)

translation/README.md, line 9 at r3 (raw file):

TRAIN DEMO

nit: --> Training Demo

translation/TRAIN.python.md, line 54 at r3 (raw file):

nit: Remove the tabs at the end of these lines. Same elsewhere throughout.

translation/translation.ts, line 31 at r3 (raw file):

async function readData (
  dataFile: string,
) {

nit: Put these on the same line. We do line breaks like this only when the line width would exceed 80 chars.

translation/translation.ts, line 42 at r3 (raw file):

nit: Don't do horizontal alignment like this. It's against Google JS style.
https://google.github.io/styleguide/jsguide.html#formatting-horizontal-alignment

translation/translation.ts, line 397 at r3 (raw file):

Huan: be aware that the Node need a `file://` prefix to local filename

If this is just a note to yourself, please remove it.

…ranslation-js

huan · 2019-03-13T05:40:06Z

@caisq Thanks for the additional reviews!

I had followed your reviews and updated the PR as well.

caisq

Reviewable status: complete! 1 of 1 approvals obtained (waiting on @huan)

translation/translation.ts, line 2 at r4 (raw file):

/**

Sorry I missed it in the earlier reviews. But please add the same copyright and license header as in the other .js files in tfjs-examples.

huan · 2019-03-13T14:46:55Z

@caisq Done.

huan added 2 commits March 7, 2019 15:01

Add native JavaScript(TypeScript) to replace translation.py

b06d646

Add train log output messages for python and javascript

de5bb8c

huan commented Mar 9, 2019

View reviewed changes

huan added 5 commits March 10, 2019 00:38

Tensorflow.js 1.0 have supported orthogonal initializer

7950ea6

Tensorflow.js 1.0 have supported orthogonal initializer

696bd1f

JavaScript in 100 epochs

429e5cb

TFJS 1.0 is not stable, revert back

bfa5bbb

JS need more epochs, try default 200

ca65eeb

nsthorat requested a review from caisq March 11, 2019 17:42

caisq requested changes Mar 12, 2019

View reviewed changes

huan added 8 commits March 12, 2019 13:38

Move to devDependencies

ed22157

Remove TABs and shorten the bar length

c916e04

Fix lint for {} and spaces

2fa4d09

Add --gpu argument

92cd91a

Add semicolons

5670a5f

hanging indentation: 4 spaces relative to the beginning of the previo…

906cd64

…us line

Change all variable names to lowerCamelCase

9650670

Doc string should be above the function

163b08f

huan force-pushed the translation-js branch from b3d92e4 to 163b08f Compare March 12, 2019 06:10

huan added 2 commits March 12, 2019 14:12

Resolve conflic

03c240f

Merge branch 'master' into translation-js

cbfcd18

caisq approved these changes Mar 12, 2019

View reviewed changes

huan added 3 commits March 13, 2019 13:32

Follow reviews from @caisq

8f3317e

Merge branch 'translation-js' of github.com:huan/tfjs-examples into t…

955b588

…ranslation-js

Update to work under tfjs 1.0

29c7a63

caisq approved these changes Mar 13, 2019

View reviewed changes

Add standard license to TS file

df97b72

caisq approved these changes Mar 13, 2019

View reviewed changes

caisq added 2 commits March 13, 2019 10:47

Update translation.ts

4fd2492

Merge branch 'master' into translation-js

5e6e0c6

caisq merged commit abca94c into tensorflow:master Mar 13, 2019

caisq mentioned this pull request Mar 14, 2019

lstm translation example under node.js #157

Closed

huan added a commit to huan/tensorflow-handbook-javascript that referenced this pull request May 7, 2020

link to my js seq2seq implementation (tensorflow/tfjs-examples#243)

803301d

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Uh oh!

Add native JavaScript(TypeScript) to replace translation.py #243

Add native JavaScript(TypeScript) to replace translation.py #243

Uh oh!

huan commented Mar 7, 2019 •

edited

Loading

Uh oh!

huan Mar 9, 2019

Uh oh!

nsthorat commented Mar 11, 2019

Uh oh!

caisq left a comment

Uh oh!

huan commented Mar 12, 2019

Uh oh!

huan commented Mar 12, 2019

Uh oh!

caisq left a comment

Uh oh!

huan commented Mar 13, 2019

Uh oh!

caisq left a comment

Uh oh!

huan commented Mar 13, 2019

Uh oh!

Uh oh!

Add native JavaScript(TypeScript) to replace translation.py #243

Add native JavaScript(TypeScript) to replace translation.py #243

Uh oh!

Conversation

huan commented Mar 7, 2019 • edited Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Uh oh!

huan Mar 9, 2019

Choose a reason for hiding this comment

Uh oh!

nsthorat commented Mar 11, 2019

Uh oh!

caisq left a comment

Choose a reason for hiding this comment

Uh oh!

huan commented Mar 12, 2019

Uh oh!

huan commented Mar 12, 2019

Uh oh!

caisq left a comment

Choose a reason for hiding this comment

Uh oh!

huan commented Mar 13, 2019

Uh oh!

caisq left a comment

Choose a reason for hiding this comment

Uh oh!

huan commented Mar 13, 2019

Uh oh!

Uh oh!

huan commented Mar 7, 2019 •

edited

Loading