Skip to content

Add native JavaScript(TypeScript) to replace translation.py #243

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Merged
merged 23 commits into from
Mar 13, 2019

Conversation

huan
Copy link
Contributor

@huan huan commented Mar 7, 2019

In our translation example, we use a python script to generate the model, train, and save the model to file for future use.

I think it will be more clear to use a native JavaScript script to do this task, so I rewrite it from Python to TypeScript.

Train the model and save it to file in JavaScript demostrated as the following:

$ yarn train /dist/fra.txt 
yarn run v1.12.3
$ node -r ts-node/register translation.ts --artifacts_dir dist/resources /home/zixia/git/javascript-concist-chit-chat/data/fra.txt
2019-03-07 15:02:07.198157: I tensorflow/core/platform/cpu_feature_guard.cc:141] Your CPU supports instructions that this TensorFlow binary was not compiled to use: SSE4.1 SSE4.2 AVX AVX2 FMA
Number of samples: 10000
Number of unique input tokens: 69
Number of unique output tokens: 93
Max sequence length for inputs: 16
Max sequence length for outputs: 59
Saved metadata at:  dist/resources/metadata.json
Orthogonal initializer is being called on a matrix with more than 2000 (262144) elements: Slowness may result.
Orthogonal initializer is being called on a matrix with more than 2000 (262144) elements: Slowness may result.
Epoch 1 / 20
eta=0.0 =============================================================================================================================> 
59356ms 7419us/step - loss=1.11 val_loss=1.10 
Epoch 2 / 20
eta=0.0 =============================================================================================================================> 
60262ms 7533us/step - loss=0.944 val_loss=1.04 
Epoch 3 / 20
eta=0.0 =============================================================================================================================> 

...
...
...

63610ms 7951us/step - loss=0.529 val_loss=0.631 
Epoch 17 / 20
eta=0.0 =============================================================================================================================> 
64396ms 8050us/step - loss=0.521 val_loss=0.618 
Epoch 18 / 20
eta=0.0 =============================================================================================================================> 
61433ms 7679us/step - loss=0.510 val_loss=0.611 
Epoch 19 / 20
eta=0.0 =============================================================================================================================> 
61209ms 7651us/step - loss=0.499 val_loss=0.600 
Epoch 20 / 20
eta=0.0 =============================================================================================================================> 
61208ms 7651us/step - loss=0.492 val_loss=0.590 

Cheers.

CC @wangtz

BTW: I'd like to suggest let's open the ISSUE entry for this repo because the examples might have many problems need to be discussed.


This change is Reviewable

*/
// Define an input sequence and process it.
const encoder_inputs = tf.layers.input({
shape: [null, num_encoder_tokens] as number[],
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@nsthorat nsthorat requested a review from caisq March 11, 2019 17:42
@nsthorat
Copy link

Once tensorflow/tfjs-node#209 goes in, we'll have a 1.0 for node.js.

Copy link
Collaborator

@caisq caisq left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thank you, @huan ! I think it's very exciting that the training of the translation model can be made working in tfjs-node/tfjs-node-gpu. Please see my first round of comments below.

Reviewable status: 1 change requests, 0 of 1 approvals obtained (waiting on @caisq and @huan)


translation/build-resources.sh, line 68 at r2 (raw file):

orthogonal

Nice!


translation/package.json, line 18 at r2 (raw file):

Quoted 6 lines of code…
    "argparse": "^1.0.10",
    "invert-kv": "^2.0.0",
    "mkdirp": "^0.5.1",
    "readline": "^1.3.0",
    "vega-embed": "^3.0.0",
    "zip-array": "^1.0.1"

I think these should all go into devDependencies (except vega-embed). These newly added dependencies are used only for training in Node.js. We don't want the to become a part of the web bundle.


translation/TRAIN.javascript.md, line 22 at r2 (raw file):

=============================================>·

nit: These copy-and-pasted progress bar are not very nice because they are too long and will cause line wrapping on many screens. The width of the progress bar in tfjs-node/tfjs-node-gpu actually scales with the width of the terminal. Can you shorten them to <= 80 chars. You don't need to run the training job again. Just do string replacement in this file.


translation/TRAIN.javascript.md, line 324 at r2 (raw file):

Va !

nit: Can you remove these ugly-looking tabs at the end of the lines? Same below.


translation/translation.ts, line 208 at r1 (raw file):

Previously, huan (Huan LI (李卓桓)) wrote…

See: tensorflow/tfjs-layers#492 (comment)
@davidsoergel @caisq

Ack.


translation/translation.ts, line 16 at r2 (raw file):

import fs     from 'fs'

style nit: Please follow the style of other examples:

  • No alignment through extra whitespace
  • No space after { and before }.

translation/translation.ts, line 27 at r2 (raw file):

'@tensorflow/tfjs-node

We want to parameterize this training script, so that the user can switch to CUDA GPU-based training by simply adding a --gpu flag.
You can achieve this by adding tfjs-node-gpu as a devDependency (in package.json) and properly configuring argparse here.
You can follow the example in https://github.com/tensorflow/tfjs-examples/blob/master/date-conversion-attention/train.js#L129


translation/translation.ts, line 33 at r2 (raw file):

)

Does it make sense to explicitly type the return value of this function and the functions below?


translation/translation.ts, line 48 at r2 (raw file):

 = 0

style nit: Google JS/TS style requires semicolons at the end of every line.


translation/translation.ts, line 85 at r2 (raw file):

.redu

nit: hanging indentation should be four spaces relative to the beginning of the previous line.


translation/translation.ts, line 180 at r2 (raw file):

num_encoder_tokens:

style nit: Google JS style recommends lowerCamelCase. Please change the variable names throughout.
https://google.github.io/styleguide/jsguide.html#naming-local-variable-names


translation/translation.ts, line 184 at r2 (raw file):

/**

nit: doc string should be above the function. See other .js files in the tfjs-example for examples.


translation/translation.ts, line 265 at r2 (raw file):

decode_sequence

Like above, Google JS style recommends loweCamelCase function names. Please make changes throughout.

@huan
Copy link
Contributor Author

huan commented Mar 12, 2019

@nsthorat That's right. I found that the tfjs-node does not work with the current tfjs 1.0. I'll upgrade this example to TFJS 1.0 when they getting compatible stably.

@caisq Thanks for the review, I'll follow your reviews and modify this PR as you requested later.

@huan
Copy link
Contributor Author

huan commented Mar 12, 2019

Hi @caisq, I had modified my PR as your request, except that I did not add the explicitly type for the return of function because they can be referenced automatically from the return statement inside the function.

Please let me know if there need any other modifications, thanks.

Copy link
Collaborator

@caisq caisq left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks! I just have some remaining comments. Please address them and then we'll be ready to merge the PR.

Reviewable status: :shipit: complete! 1 of 1 approvals obtained (waiting on @huan)


translation/README.md, line 9 at r3 (raw file):

TRAIN DEMO

nit: --> Training Demo


translation/TRAIN.python.md, line 54 at r3 (raw file):

!

nit: Remove the tabs at the end of these lines. Same elsewhere throughout.


translation/translation.ts, line 31 at r3 (raw file):

async function readData (
  dataFile: string,
) {

nit: Put these on the same line. We do line breaks like this only when the line width would exceed 80 chars.


translation/translation.ts, line 42 at r3 (raw file):

    f

nit: Don't do horizontal alignment like this. It's against Google JS style.
https://google.github.io/styleguide/jsguide.html#formatting-horizontal-alignment


translation/translation.ts, line 397 at r3 (raw file):

Huan: be aware that the Node need a `file://` prefix to local filename

If this is just a note to yourself, please remove it.

@huan
Copy link
Contributor Author

huan commented Mar 13, 2019

@caisq Thanks for the additional reviews!

I had followed your reviews and updated the PR as well.

Copy link
Collaborator

@caisq caisq left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Reviewable status: :shipit: complete! 1 of 1 approvals obtained (waiting on @huan)


translation/translation.ts, line 2 at r4 (raw file):


/**

Sorry I missed it in the earlier reviews. But please add the same copyright and license header as in the other .js files in tfjs-examples.

@huan
Copy link
Contributor Author

huan commented Mar 13, 2019

@caisq Done.

@caisq caisq merged commit abca94c into tensorflow:master Mar 13, 2019
huan added a commit to huan/tensorflow-handbook-javascript that referenced this pull request May 7, 2020
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

3 participants