This repository was archived by the owner on Jan 15, 2024. It is now read-only.

Add layout + compute_layout support: TransformerNMT, BERT, ALBERT, ELECTRA, MobileBERT, RoBERTA, XLMR #1258

Merged
67 commits merged on Jul 29, 2020

sxjscience
Member

@sxjscience sxjscience commented Jul 8, 2020

@MoisesHer @dmlc/gluon-nlp-team

This PR adds two additional flags to backbone models to enhance the computational speed and usability.

  • layout
    The layout of the inputs + outputs, where
    • NT: (batch_size, sequence_length, ...)
    • TN: (sequence_length, batch_size, ...)
  • compute_layout
    The layout of the inner computation. By default, we have the "auto" option, in which GluonNLP will determine the best layout based on heuristics.
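
As a rough illustration of the two input/output layouts, switching between NT and TN amounts to swapping the first two axes. This is a NumPy sketch, not GluonNLP code; the helper name `convert_layout` is hypothetical.

```python
import numpy as np


def convert_layout(data, src_layout, dst_layout):
    """Hypothetical helper: convert an array between the "NT" and "TN" layouts."""
    valid = ("NT", "TN")
    if src_layout not in valid or dst_layout not in valid:
        raise ValueError("layout must be one of {}".format(valid))
    if src_layout == dst_layout:
        return data
    # NT <-> TN only differs in the order of the first two axes.
    return np.swapaxes(data, 0, 1)


batch_size, seq_length, units = 2, 5, 8
x_nt = np.zeros((batch_size, seq_length, units))  # NT: (batch_size, sequence_length, ...)
x_tn = convert_layout(x_nt, "NT", "TN")           # TN: (sequence_length, batch_size, ...)
print(x_nt.shape, x_tn.shape)
```

Because only the axis order changes, a model can accept one layout at its boundary and use the other internally, which is the separation the `layout` and `compute_layout` flags describe.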

The technical reason why layouts may matter is as follows (it is also documented in this code comment):

# 2. Calculate the attention weights
# Score: (L_query, B, N, C_Q) X (L_mem, B, N, C_Q) --> (B, N, L_query, L_mem)
# This layout structure can be implemented very efficiently because B, N are consecutive
# to each other. To have a clear picture of what's happening, we may consider the
# (i, j)th element of the output
# out[i, j, :, :] = query[:, i, j, :] X key[:, i, j, :].T, which is just one GEMM call
# We can thus implement the whole kernel via a single call of batched GEMM with stride.

When the layouts of memory and query are "TNC", they have the shape:

  • query: (L_query, B, N, C_Q)
  • memory: (L_mem, B, N, C_Q)

One step in the AttentionCell is to obtain the multi-head attention scores, which can be written as follows:

(L_query, B, N, C_Q) X (L_mem, B, N, C_Q) -> (B, N, L_query, L_mem)

This layout structure can be implemented very efficiently because B, N are consecutive to each other.

To have a clear picture of what's happening, we may consider the (i, j)th element of the output, i.e.,

out[i, j, :, :] = query[:, i, j, :] X key[:, i, j, :].T

This is just one GEMM call. We can thus implement the whole kernel via a single call of batched GEMM by correctly specifying the strides.
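
The equivalence described above can be checked with a small NumPy sketch. It is only an illustration under assumed toy sizes: `np.einsum` stands in for the strided batched GEMM, and this is not how the MXNet kernel itself is written.

```python
import numpy as np

# Toy sizes, chosen only for illustration.
L_query, L_mem, B, N, C_Q = 3, 4, 2, 5, 6
rng = np.random.default_rng(0)
query = rng.standard_normal((L_query, B, N, C_Q))  # TN layout: (L_query, B, N, C_Q)
key = rng.standard_normal((L_mem, B, N, C_Q))      # TN layout: (L_mem, B, N, C_Q)

# Single batched contraction over the channel axis:
# (L_query, B, N, C_Q) X (L_mem, B, N, C_Q) -> (B, N, L_query, L_mem)
scores = np.einsum("qbnc,mbnc->bnqm", query, key)

# Reference: one matrix multiplication (GEMM) per (batch, head) pair.
reference = np.empty((B, N, L_query, L_mem))
for i in range(B):
    for j in range(N):
        reference[i, j] = query[:, i, j, :] @ key[:, i, j, :].T

print(np.allclose(scores, reference))

# query[:, i, j, :] is a strided view of the original buffer, not a copy,
# which is why a GEMM that accepts strides needs no explicit transpose.
view = query[:, 0, 0, :]
print(np.shares_memory(view, query), view.strides)
```

Each `(i, j)` slice is a plain 2-D matrix addressed through strides, so the per-slice products can be issued as one batched call instead of `B * N` separate ones.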

Also, in fairseq, the inner computation of the TransformerEncoder uses the TN layout:

https://github.com/pytorch/fairseq/blob/108bb2560b1ec01524ba723bc7c69186875afa0a/fairseq/models/transformer.py#L399-L407

After this PR, these models will have the layout flag:

  • TransformerEncoder, TransformerDecoder, TransformerNMTModel
  • BertModel
  • AlbertModel
  • ElectraModel
  • MobileBertModel
  • RobertaModel
  • XLMRModel
  • Test cases

@codecov

codecov bot commented Jul 13, 2020

Codecov Report

Merging #1258 into numpy will increase coverage by 1.35%.
The diff coverage is 89.17%.

@@            Coverage Diff             @@
##            numpy    #1258      +/-   ##
==========================================
+ Coverage   82.56%   83.92%   +1.35%     
==========================================
  Files          38       41       +3     
  Lines        5534     6157     +623     
==========================================
+ Hits         4569     5167     +598     
- Misses        965      990      +25     
Impacted Files                           Coverage Δ
setup.py                                 0.00% <ø> (ø)
src/gluonnlp/models/transformer_xl.py    82.52% <66.66%> (-0.20%) ⬇️
src/gluonnlp/models/xlmr.py              88.23% <70.00%> (+1.35%) ⬆️
src/gluonnlp/models/mobilebert.py        87.72% <81.67%> (+6.37%) ⬆️
src/gluonnlp/models/electra.py           80.78% <84.68%> (+1.83%) ⬆️
src/gluonnlp/attention_cell.py           79.52% <86.66%> (-0.39%) ⬇️
src/gluonnlp/models/roberta.py           93.67% <87.50%> (+4.89%) ⬆️
src/gluonnlp/models/bert.py              94.86% <92.96%> (+9.93%) ⬆️
src/gluonnlp/utils/testing.py            94.11% <93.47%> (-3.03%) ⬇️
src/gluonnlp/models/albert.py            95.47% <93.80%> (-1.22%) ⬇️
... and 17 more

@sxjscience
Member Author

Now I'm waiting for #1261. We added two flags:

  • layout
    It means the layout of the input + output
  • compute_layout
    It means the layout of the inner computation. By default, we will always use compute_layout=auto, which will automatically determine the best compute layout based on our heuristic.

@sxjscience sxjscience changed the title from "[WIP] Add layout support of all backbones + Transformer" to "[WIP] Add layout support of all backbone models + Transformer for NMT" Jul 13, 2020
# Also, we can not use string here due to https://github.com/rbgirshick/yacs/issues/26
cfg.VERSION = 1
cfg.freeze()
return cfg
Member Author

@sxjscience sxjscience Jul 13, 2020

@zheyuye I'm moving the cfgs into the file to make it consistent with Transformer + RoBERTA.

Member

ok, I am going to revise other models

@sxjscience sxjscience requested review from MoisesHer and zheyuye July 13, 2020 21:00
@sxjscience sxjscience changed the title from "Add layout support of Transformer for NMT, BERT, ALBERT, ELECTRA" to "Add layout + compute_layout support. First round models: Transformer for NMT, BERT, ALBERT, ELECTRA" Jul 27, 2020
@sxjscience sxjscience changed the title from "Add layout + compute_layout support. First round models: Transformer for NMT, BERT, ALBERT, ELECTRA" to "Add layout + compute_layout support: TransformerNMT, BERT, ALBERT, ELECTRA, MobileBERT, RoBERTA, XLMR" Jul 28, 2020
@sxjscience
Member Author

@dmlc/gluon-nlp-committers @MoisesHer @zheyuye Should be ready for review.

@szha szha requested review from eric-haibin-lin and szhengac July 28, 2020 19:32
@sxjscience sxjscience merged commit 3c87457 into dmlc:numpy Jul 29, 2020
zheyuye added a commit to zheyuye/gluon-nlp that referenced this pull request Jul 29, 2020
commit 232e0b6
Author: ZheyuYe <zheyu.ye1995@gmail.com>
Date:   Thu Jul 30 01:05:17 2020 +0800

    update

commit 995e5d7
Author: ZheyuYe <zheyu.ye1995@gmail.com>
Date:   Thu Jul 30 01:01:56 2020 +0800

    fix

commit 9623240
Author: ZheyuYe <zheyu.ye1995@gmail.com>
Date:   Thu Jul 30 00:52:17 2020 +0800

    fix

commit d9c4140
Author: ZheyuYe <zheyu.ye1995@gmail.com>
Date:   Wed Jul 29 23:07:10 2020 +0800

    fix transformer

commit e49fbe1
Author: ZheyuYe <zheyu.ye1995@gmail.com>
Date:   Wed Jul 29 22:18:12 2020 +0800

    update

commit 1f75b26
Author: ZheyuYe <zheyu.ye1995@gmail.com>
Date:   Wed Jul 29 22:04:08 2020 +0800

    test bart

commit 5bab516
Author: ZheyuYe <zheyu.ye1995@gmail.com>
Date:   Wed Jul 29 21:34:47 2020 +0800

    fix cfg

commit 6c62a29
Merge: 3366cf3 033214e
Author: ZheyuYe <zheyu.ye1995@gmail.com>
Date:   Wed Jul 29 21:33:10 2020 +0800

    Merge remote-tracking branch 'upstream/numpy' into bart

commit 033214e
Author: Xingjian Shi <xshiab@connect.ust.hk>
Date:   Wed Jul 29 00:36:57 2020 -0700

    [Numpy] Fix SQuAD + Fix GLUE downloading (dmlc#1280)

    * Update run_squad.py

    * Update run_squad.py

    * Update prepare_glue.py

commit 3c87457
Author: Xingjian Shi <xshiab@connect.ust.hk>
Date:   Tue Jul 28 18:03:21 2020 -0700

    Add layout + compute_layout support: TransformerNMT, BERT, ALBERT, ELECTRA, MobileBERT, RoBERTA, XLMR (dmlc#1258)

    * Add layout support

    * fix test

    * Update transformer.py

    * Update transformer.py

    * Update README.md

    * try to add set_layout

    * update test case

    * fix

    * update

    * update

    * update

    * Update bert.py

    * fix bug

    * update

    * Update test_models_bert.py

    * Update tokenizers.py

    * add compute layout

    * Update xlmr.py

    * Update test_models_bert.py

    * revise test cases

    * Update layers.py

    * move jieba to try import

    * fix

    * Update transformer.py

    * fix

    * Update bert.py

    * Update setup.py

    * Update test_models_bert.py

    * Update test_models_bert.py

    * fix

    * update

    * Revise

    * Update electra.py

    * Update electra.py

    * Update test_models_electra.py

    * fix

    * fix bug

    * Update test_models_albert.py

    * add more testcases

    * fix

    * Update albert.py

    * Update albert.py

    * fix bug

    * fix testcase

    * Update test_models_electra.py

    * Update bert.py

    * update

    * Update test_models_electra.py

    * Update mobilebert.py

    * Update mobilebert.py

    * update mobilebert

    * Update test_models_mobilebert.py

    * Update mobilebert.py

    * fix bug

    * Update roberta.py

    * fix roberta

    * update

    * update

    * fix import

    * fix bug

    * update

    * reduce test workloads

    * address comment

    * address comment

commit 4d43f82
Author: Sheng Zha <szha@users.noreply.github.com>
Date:   Mon Jul 27 20:21:00 2020 -0700

    add subversion/wget to docker, add readme (dmlc#1279)

commit d76897b
Author: phile <phile_999@126.com>
Date:   Tue Jul 28 10:10:13 2020 +0800

    Add embedding related methods in numpy version (dmlc#1263)

    * A draft for embedding

    * fix embed_loader

    * add hyperbolic space and some updates

    * revise evaluation

    * fix

    * simple fixes

    * move l2norm to op.py

    * new features

    * fix

    * update

    * add tests, update

    * newline
zheyuye added a commit to zheyuye/gluon-nlp that referenced this pull request Jul 29, 2020
commit 510d991
Author: ZheyuYe <zheyu.ye1995@gmail.com>
Date:   Thu Jul 30 02:33:22 2020 +0800

    test

commit 1b5fa7b
Author: ZheyuYe <zheyu.ye1995@gmail.com>
Date:   Thu Jul 30 01:48:01 2020 +0800

    fix comment1

commit 6533601
Author: ZheyuYe <zheyu.ye1995@gmail.com>
Date:   Thu Jul 30 01:27:44 2020 +0800

    fix comment

commit a8853f9
Author: ZheyuYe <zheyu.ye1995@gmail.com>
Date:   Thu Jul 30 01:10:06 2020 +0800

    Squashed commit of the following:

zheyuye added a commit to zheyuye/gluon-nlp that referenced this pull request Jul 30, 2020
commit 9e1ffde
Author: ZheyuYe <zheyu.ye1995@gmail.com>
Date:   Thu Jul 30 11:42:01 2020 +0800

    todo

commit 9a7c343
Author: ZheyuYe <zheyu.ye1995@gmail.com>
Date:   Thu Jul 30 10:53:15 2020 +0800

    revert gelu

commit 0425346
Author: ZheyuYe <zheyu.ye1995@gmail.com>
Date:   Thu Jul 30 10:49:52 2020 +0800

    re-upload bart

commit 516ae84
Author: ZheyuYe <zheyu.ye1995@gmail.com>
Date:   Thu Jul 30 03:32:35 2020 +0800

    use_qkv_bias for transformer

commit 9d60cda
Author: ZheyuYe <zheyu.ye1995@gmail.com>
Date:   Thu Jul 30 03:17:28 2020 +0800

    classifier_activation

sxjscience pushed a commit that referenced this pull request Jul 30, 2020
* init

* fix convert roberta

* rename TransformerNMTModel as TransformerModel

* update bart

* fix

* fix

* update init

* add layernorm_embedding for transformer

* convert script

* encoder

* fix

* fix vocab

* fix roberta

* fix

* fix electra

* add conversion bash for roberta and xlmr

* ELECTRA SETUP

* convert bart decoder

* fix

* update

* testing output

* remove arange_like for embeddings

* fix

* update

* use_pooler for bart

* fix

* upload params for bart

* add test_models_bart

* fix cfg

* test bart

* update

* fix transformer

* fix comment

* use xavier for embedding initializer