[shardformer] merge shardformer to main #4152
Commits on Jun 26, 2023
[shardformer] init shardformer code structure (#3731)
* init shardformer code structure
* add implement of sharder (inject and replace)
* add implement of replace layer to colossal layer
* separate different layer policy, add some notion
* implement 1d and 2d slicer, can tell col or row
* fix bug when slicing and inject model
* fix some bug; add inference test example
Commit 604a213
[shardformer]: Feature/shardformer, add some docstring and readme (#3816)
* init shardformer code structure
* add implement of sharder (inject and replace)
* add implement of replace layer to colossal layer
* separate different layer policy, add some notion
* implement 1d and 2d slicer, can tell col or row
* fix bug when slicing and inject model
* fix some bug; add inference test example
* add share weight and train example
* add train
* add docstring and readme
* add docstring for other files
* pre-commit
Commit ffacf0f
Commit 69d3daa
[shardformer] refactored the user api (#3828)
* [shardformer] refactored the user api
* polish code
Commit 0470f1b
[shardformer] update readme with modules implement doc (#3834)
* update readme with modules content
* remove img
Commit 051e970
[shardformer] add Dropout layer support different dropout pattern (#3856
Commit 3e840f7
Commit bf9c2fd
[shardformer] add gpt2 policy and modify shard and slicer to support (#3883)
* add gpt2 policy and modify shard and slicer to support
* remove unused code
* polish code
Commit 551fec3
[shardformer] Align bert value (#3907)
* add bert align test, fix dist loss bug
* forward and backward align
* add ignore index
* add shardformer CI
* add gather_output optional for user in shardconfig
* update readme with optional gather_output
* add dist crossentropy loss test, remove unused files
* remove unused file
* remove unused file
* rename the file
* polish code
Commit e5bc7e3
[shardformer] Unit test (#3928)
* fix bug in slicer, add slicer unit test
* add dropout test
* use pid as dropout seed
* update dropout test with local pattern
* add todo
Commit 661dc3b
[shardformer] Add dropout layer in shard model and refactor policy api (#3949)
* add dist dropout in model
* update docstring and bert policy with dropout
* refactor basepolicy and sharded, update bert
* update format
* update gpt2 policy
* update bert policy
* remove unused code
* update readme for new policy usage
Commit 702513a
[shardformer] support llama model using shardformer (#3969)
adjust layer attr
Commit 17d1607
Commit e849d1b
[Shardformer] Downstream bert (#3979)
* add dist dropout in model
* update docstring and bert policy with dropout
* refactor basepolicy and sharded, update bert
* update format
* update gpt2 policy
* update bert policy
* remove unused code
* update readme for new policy usage
* add downstream model of bert
* remove unused code
Commit 735e44b
[shardformer] fix an error in readme (#3988)
* fix an error in readme
* simplify code
Commit 73cacb7
Commit 45a3110
[shardformer] Refactor shardformer api (#4001)
* fix an error in readme
* simplify code
* refactor shardformer
* add todo
* remove slicer
* resolve code review
Commit 18396e7
[shardformer] integrated linear 1D with dtensor (#3996)
* [shardformer] integrated linear 1D with dtensor
* polish code
Commit 579b617
Commit bdc405e
Commit 2c366e3
Commit eaa46d7
Commit 60eb380
Commit 90e1a0a
Commit 38ceded
[shardformer] fix bert and gpt downstream with new api (#4024)
* fix bert downstream with new api
* remove comment line
Commit c982769
Commit b2c5dd0
Commit 8219d96
[shardformer] add gpt2 test and layer class refactor (#4041)
* add gpt2 test and layer class refactor
* add dropout in gpt2 policy
Commit 0113097
[shardformer] adapted T5 and LLaMa test to use kit (#4049)
* [shardformer] adapted T5 and LLaMa test to use kit
* polish code
Commit ac3aef3
Commit e5d4a87
support kit use for bert/gpt test (#4055)
* support kit use for bert test
* support kit test for gpt2
Commit d5d9178
[shardformer] support module saving and loading (#4062)
* [shardformer] support module saving and loading
* polish code
Commit 9436f73
[shardformer] add linearconv1d test (#4067)
* add linearconv1d test
* add linearconv1d test
Commit 8108c35
Commit a484c71
[shardformer] Add layernorm (#4072)
* add layernorm to bert
* add layernorm test
* add layernorm test with load state dict
* add use_mixedfusedLN in shard config
* refactor policy to support fused_layernorm
Commit 12801e8
[test] fixed tests failed due to dtensor change (#4082)
* [test] fixed tests failed due to dtensor change
* polish code
Commit d88844c
Commit 4e0db99
Commits on Jun 27, 2023
[shardformer] shardformer support opt models (#4091)
* [shardformer] shardformer support opt models
* [shardformer] shardformer support opt models, fix
* [shardformer] shardformer support opt models, fix
* [shardformer] shardformer support opt models, fix
Commit a7433a0
Commits on Jun 28, 2023
[shardformer] support vision transformer (#4096)
* first v of vit shardformer
* keep vit
* update
* vit shard add vitattention vitlayer
* update num head shard para
* finish test for vit
* add new_model_class & postprocess
* add vit readme
* delete old files & fix the conflict
* fix sth
Commit ad604f7
Commit 8b0930c
Commits on Jun 30, 2023
Commit 92e669e
Commit 8d3f077
Commit 60d2cad
Commit 26ecfd7
[shardformer] write an shardformer example with bert finetuning (#4126)
* [shardformer] add benchmark of shardformer
* [shardformer] add benchmark of shardformer
Commit b6f4e05
Commits on Jul 3, 2023
[shardformer] refactored some doc and api (#4137)
* [shardformer] refactored some doc and api
* polish code
Commit 1b4a901
Commits on Jul 4, 2023
[shardformer] made tensor parallelism configurable (#4144)
* [shardformer] made tensor parallelism configurable
* polish code
Commit f8dcf9d
Commit d1db043
Commit dd9fe39