BERTIN Project T5X training files
- Clone this repo and `cd` into it.
- Clone https://github.com/google-research/t5x inside it under the name `t5x_repo` and install it in editable mode.
- Symlink `t5x_repo/t5x` to `t5x` in the cloned folder of this repo.
- Install the dependencies: jax for TPU and seqio (the latter from its repo).
- Run `run.sh`.
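The setup steps above can be sketched as a few shell functions. This is a minimal sketch, not the project's official installer: the function names (`clone_t5x`, `link_t5x`, `install_deps`) are hypothetical, the jax TPU install line and the seqio repository URL (https://github.com/google/seqio) are assumptions based on the usual upstream instructions, and everything is assumed to run from the root of this repo after you have cloned it.

```shell
#!/usr/bin/env bash
set -euo pipefail

# Clone T5X into ./t5x_repo and install it in editable mode.
clone_t5x() {
  git clone https://github.com/google-research/t5x t5x_repo
  python3 -m pip install -e ./t5x_repo
}

# Expose the T5X package as ./t5x through a relative symlink.
# -n and -f let the function be re-run without nesting a link inside the old one.
link_t5x() {
  ln -sfn t5x_repo/t5x t5x
}

# Install jax for TPU and seqio from its repository.
# Assumption: the find-links URL and the seqio repo URL are the upstream defaults.
install_deps() {
  python3 -m pip install "jax[tpu]" \
    -f https://storage.googleapis.com/jax-releases/libtpu_releases.html
  python3 -m pip install "git+https://github.com/google/seqio"
}
```

With the functions defined, calling `clone_t5x`, `link_t5x` and `install_deps` in that order, followed by `bash run.sh`, mirrors the list above. Keeping each step in its own function makes it easy to re-run just the one that failed.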
Lists of checkpoints can be found at:
- https://console.cloud.google.com/storage/browser/t5-data
- https://console.cloud.google.com/storage/browser/scenic-bucket
If you run into segmentation faults when writing checkpoints to the buckets, the cause might be `tensorstore` version 0.1.18. As a temporary fix, try using version 0.1.14 instead.
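The downgrade can be done with pip. The snippet below is a sketch under the assumption that `tensorstore` was installed with pip into the same Python environment that runs the training; the variable and function names are hypothetical.

```shell
# Version suggested above as a temporary workaround.
TENSORSTORE_VERSION="0.1.14"
TENSORSTORE_REQ="tensorstore==${TENSORSTORE_VERSION}"

# Install the pinned version, then print what pip reports as installed.
pin_tensorstore() {
  python3 -m pip install "$TENSORSTORE_REQ"
  python3 -m pip show tensorstore
}
```

Run `pin_tensorstore` after the other dependencies are installed, since a later install may pull in a newer `tensorstore` again.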