Conversation
- Remove `params` and `prefix` arguments for MXNet 2 and update parameter sharing implementation
- Remove `Block.name_scope()` for MXNet 2
- Remove `self.params.get()` and `self.params.get_constant()`
@zheyuye @hymzoque Would you also help review?
self.mlm_decoder[-1].share_parameters(word_embed_params)
self.backbone_model.token_type_embed.share_parameters(token_type_embed_params)
self.backbone_model.token_pos_embed.share_parameters(token_pos_embed_params)
self.backbone_model.embed_layer_norm.share_parameters(embed_layer_norm_params)
From the code above, it looks like we can share weights in two ways: 1. direct `weight = weight` assignment; 2. `share_parameters`. Is that correct? Do we need consistency on this part?
Yes, both are supported. The first way (`weight = weight`) works well if you only need to replace a single parameter and you know its location. The second way is more general and doesn't require users to specify the position of the weight; they just pass a dictionary containing the weights with proper names.
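To make the distinction concrete, here is a minimal stdlib-only sketch of the two styles; the `Dense` class below is a toy stand-in, not the real Gluon `Block` API, and the string "parameters" are placeholders for `Parameter` objects:

```python
class Dense:
    """Toy stand-in for a Gluon block holding named parameters."""

    def __init__(self, name):
        self.weight = f"{name}.weight"  # placeholder for a Parameter object
        self.bias = f"{name}.bias"

    def collect_params(self):
        # return this block's parameters keyed by name
        return {"weight": self.weight, "bias": self.bias}

    def share_parameters(self, shared):
        # adopt every parameter whose name appears in `shared`
        for key, value in shared.items():
            setattr(self, key, value)
        return self

src, dst = Dense("src"), Dense("dst")

# Style 1: direct attribute assignment. Fine when you replace a single
# parameter and you know exactly where it lives.
dst.weight = src.weight

# Style 2: share_parameters. Pass a dict keyed by parameter name; the
# caller never needs to know the parameter's position inside `dst`.
dst.share_parameters(src.collect_params())

assert dst.weight == src.weight and dst.bias == src.bias
```

The second style is what the diff above uses, since embedding and layer-norm weights sit at different depths of the backbone.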
Thanks @leezu for the revision. This is generally good, but it may require some extra effort on the conversion toolkits, which depend heavily on the prefix: https://github.com/leezu/gluon-nlp/blob/a79101da3a40d5212e419fa1f46a40e9ad3e7eb3/scripts/conversion_toolkits/convert_tf_hub_model.py#L134-L166
params=embed_layer.collect_params(
    '(.*_embed|.*_inter_proj)'))
div_val=div_val)
layer_with_shared_proj_embed.share_parameters(embed_layer.collect_params('(.*_embed|.*_inter_proj)'))
It seems that we can't locate these parameters since prefixes are removed:
https://github.com/leezu/gluon-nlp/blob/a79101da3a40d5212e419fa1f46a40e9ad3e7eb3/src/gluonnlp/layers.py#L916-L917
It's because the regex needs to be updated. It contains the `_` character, which is no longer used as a separator and should be replaced by `\.`.
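The regex update can be illustrated with plain `re`. The parameter names below are hypothetical stand-ins: MXNet 1.x composed names from a `_`-joined prefix, while MXNet 2 uses `.`-separated attribute paths:

```python
import re

# Hypothetical parameter names under the two naming schemes.
mx1_names = ["adaptive0_embed0_weight", "adaptive0_inter_proj0_weight"]
mx2_names = ["adaptive.embed0.weight", "adaptive.inter_proj0.weight"]

def select(names, regex):
    # keep the names matched by the selection regex (anchored at the start,
    # as re.match is)
    pattern = re.compile(regex)
    return [n for n in names if pattern.match(n)]

old = r'(.*_embed|.*_inter_proj)'    # written for the '_'-joined scheme
new = r'(.*\.embed|.*\.inter_proj)'  # '\.' instead of '_' before each name

assert select(mx1_names, old) == mx1_names  # old regex fit the old scheme
assert select(mx2_names, old) == []         # but silently matches nothing now
assert select(mx2_names, new) == mx2_names  # updated regex matches again
```

The middle assertion is the dangerous case: an outdated selection regex fails silently rather than raising an error.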
@dmlc/gluon-nlp-committers let's halt other merges in the numpy branch to yield to this change.
tests/test_models.py (Outdated)
net.hybridize()
num_params, num_fixed_params = count_parameters(net.collect_params())
assert num_params > 0
@pytest.mark.parametrize('name', list_backbone_names())
I moved to sequential testing intentionally, because running multiple tests in parallel may cause memory issues.
In that case we may mark it as serial. Having a single large test makes it very hard to reproduce failures of specific models, because the test will always run all models. It's not a good development experience.
Maybe we should remove the forward test of xlmr, which is too large.
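The per-model-but-serial shape suggested above could look like the sketch below. The backbone names, `count_parameters`, and the `serial` marker are stand-ins (a `serial` marker needs matching runner configuration, e.g. registration in `pytest.ini` or a plugin that honors it); the point is one parametrized case per model so a single model's failure is reproducible, while the marker keeps cases from running concurrently and exhausting memory:

```python
import pytest

# Hypothetical stand-ins for gluonnlp's list_backbone_names() and the
# model construction inside the real test.
BACKBONE_NAMES = ["model_tiny", "model_small"]

def count_parameters(params):
    # toy version: sum placeholder parameter sizes
    num_params = sum(v for v in params.values())
    num_fixed_params = 0
    return num_params, num_fixed_params

@pytest.mark.serial  # run cases one at a time to limit peak memory
@pytest.mark.parametrize("name", BACKBONE_NAMES)
def test_backbone_forward(name):
    # placeholder for: net = get_backbone(name); net.collect_params()
    params = {f"{name}.weight": 4, f"{name}.bias": 2}
    num_params, _ = count_parameters(params)
    assert num_params > 0
```

Running `pytest -k model_tiny` then exercises exactly one model, which a single monolithic loop over all backbones cannot do.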
@@ -120,6 +120,7 @@ def test_adaptive_embedding(vocab_size, cutoffs, embed_size, units, div_val):
[1000, None, 1.0]])
@pytest.mark.parametrize('embed_size', [128])
@pytest.mark.parametrize('in_units', [16])
# TODO This test even passes without sharing the parameters. It needs to be improved.
Why?
If that's the case, we should revise the test (maybe in a later PR).
Yes, we should revise the test in a later PR. I just noticed that the test passed even when I disabled the parameter sharing, or when the parameter sharing was handled wrongly (such as using an invalid regex in `collect_params`).
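A plausible mechanism for the silent pass, sketched here with a stdlib-only imitation of `collect_params` (the parameter names are hypothetical): a selection regex that matches nothing yields an empty dict, so `share_parameters` would share nothing, and a test that only checks outputs cannot tell the difference.

```python
import re

def collect_params(params, select):
    # mimic Gluon's collect_params(select): keep names matching the regex
    pattern = re.compile(select)
    return {k: v for k, v in params.items() if pattern.match(k)}

params = {"embed0.weight": "w0", "inter_proj0.weight": "w1"}

# A stale regex written for '_'-joined names matches nothing against
# '.'-separated names; "sharing" this selection silently shares an
# empty set of parameters.
stale = collect_params(params, r'(.*_embed|.*_inter_proj)')
assert stale == {}
```

A stronger test would assert directly that the two blocks hold the same `Parameter` objects after sharing, instead of only asserting on forward outputs.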
We need to investigate the test after this gets merged. I'm also having issues with parameter sharing in the layout PR, so I need to wait for this one.
@sxjscience is this performance as expected?
Yes, we need to check "best_exact" and "best_f1".
@szha @sxjscience Yes, the
@zheyuye generally the scripts can be updated by replacing the
It can be done in a separate PR if we decide to keep the conversion scripts in the release (which may require adding tests).
@leezu This sounds great. Let's leave the conversion scripts alone for now and revisit them with some useful test cases once this PR is merged.
@leezu It's expected. I propose to merge this in.
Codecov Report
@@            Coverage Diff             @@
##             numpy    #1261     +/-   ##
==========================================
+ Coverage    82.52%   82.53%   +0.01%
==========================================
  Files           38       38
  Lines         5500     5446      -54
==========================================
- Hits          4539     4495      -44
+ Misses         961      951      -10
commit 35a586676036f627bffd0d3c753c6cd0a70d63cf Author: ZheyuYe <zheyu.ye1995@gmail.com> Date: Fri Jul 17 10:10:14 2020 +0800
Squashed commit of the following: roughly forty commits by ZheyuYe (Jun 28 to Jul 15, 2020) covering CharTokenizer, lowercase handling, run_squad and test updates, re-uploads of xlmr and roberta, a get_pretrained fix, SQuAD inference with and without horovod, multiply_grads addition and removal, a ModelForQABasic fix, clip_grad_global_norm with zero max_grad_norm, AdamW hyper-parameter updates, mobilebert fixes, layer-wise decay, topk and index_update, multiprocessing and fixes for the wiki, bookcorpus, and openwebtext pipelines, the gluon_electra_small_owt upload, and train_transformer revisions; together with the upstream commits:
commit 70a1887 Author: Leonard Lausen <lausen@amazon.com> Update for Block API (dmlc#1261)
commit ea9152b Author: Xingjian Shi <xshiab@connect.ust.hk> Fixes to make the CI more stable (dmlc#1265)
commit a646c34 Author: ht <wawawa@akane.waseda.jp> [FEATURE] update backtranslation and add multinomial sampler (dmlc#1259)
commit 83e1f13 Author: Leonard Lausen <lausen@amazon.com> Use Amazon S3 Transfer Acceleration (dmlc#1260)
commit cd48efd Author: Leonard Lausen <lausen@amazon.com> Update codecov action to handle different OS and Python versions (dmlc#1254)
commit 689eba9 Author: Sheng Zha <szha@users.noreply.github.com> [CI] AWS batch job tool for GluonNLP (Part I) (dmlc#1251)
commit e06ff01 Author: Leonard Lausen <lausen@amazon.com> Pin mxnet version range on CI (dmlc#1257)
CI will pass once apache/mxnet#18619 is merged
Please review the API changes. Scripts are not updated by this PR, but at least some will be updated and included here after verification of fine-tuning performance and NMT training. Note that it's not required to re-generate the parameter files.
Thanks to @acphile for his hard work on the Gluon API refactor on the MXNet side (apache/mxnet@cb54a4a).