Inference speed between distilled and parent model is almost the same #11257
Unanswered
probavee
asked this question in
Help: Model Advice
Replies: 1 comment
For a more accurate comparison, train with the … The speed of the other components in the pipeline is similar no matter which transformer model is used, so you won't see a 2x difference in the whole pipeline. If you want to test just the …
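The reply above suggests timing the transformer in isolation rather than the whole pipeline. A minimal sketch of how one could do that with spaCy v3's `select_pipes` context manager (the model name and texts in the usage note are illustrative, not from the discussion):

```python
# Sketch: compare the full pipeline against the transformer component alone.
# Assumes a spaCy v3 pipeline; model names/paths below are illustrative.
import time

def time_pipeline(nlp, texts, only=None):
    """Return seconds taken to run nlp.pipe over texts.

    If `only` is a list of component names, every other component is
    disabled first via spaCy's select_pipes context manager."""
    def run():
        start = time.perf_counter()
        for _ in nlp.pipe(texts):
            pass
        return time.perf_counter() - start

    if only is None:
        return run()
    disabled = [name for name in nlp.pipe_names if name not in only]
    with nlp.select_pipes(disable=disabled):
        return run()

# Usage sketch (requires the pipelines to be installed):
# import spacy
# nlp = spacy.load("fr_core_news_trf")
# texts = ["Ceci est une phrase de test."] * 200
# print(time_pipeline(nlp, texts, only=["transformer"]))  # transformer only
# print(time_pipeline(nlp, texts))                        # whole pipeline
```

Comparing the two numbers shows how much of the total time the non-transformer components account for, which is where a 2x transformer speed-up gets diluted.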
Hello!
I trained DistilCamemBERT using a spaCy project for parsing and tagging, hoping for a lighter and faster model, but there is almost no speed difference compared with
fr_core_news_trf
which is based on CamemBERT. However, the DistilBERT paper suggests a 2x speed-up. I ran some benchmarks on the same text using different approaches.
My environment:
- Jupyter notebook served with JupyterLab in a Docker container
- GPU: Tesla P100-PCIE-16GB
- CUDA: 11.6
- spaCy 3.4.1

DistilCamemBERT pipeline:
["transformer","tagger","morphologizer","trainable_lemmatizer","parser"]
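For reference, a benchmark like the one described above can be reproduced with a small throughput helper. This is a sketch; it relies only on `nlp.pipe`, and the model names and texts in the usage note are illustrative (the path to the distilled model is hypothetical):

```python
# Sketch: measure pipeline throughput in words per second.
# Works with any loaded spaCy pipeline via nlp.pipe.
import time

def words_per_second(nlp, texts, batch_size=32):
    """Process texts with nlp.pipe and return throughput in words/sec."""
    start = time.perf_counter()
    n_words = sum(len(doc) for doc in nlp.pipe(texts, batch_size=batch_size))
    elapsed = time.perf_counter() - start
    return n_words / elapsed

# Usage sketch (requires spaCy and both pipelines installed):
# import spacy
# texts = ["Le chat dort sur le canapé."] * 500
# for name in ("fr_core_news_trf", "./distilcamembert-model"):  # path illustrative
#     nlp = spacy.load(name)
#     print(name, words_per_second(nlp, texts))
```

Running both pipelines over the same texts with the same batch size gives a like-for-like comparison, which helps separate model speed from notebook or infrastructure noise.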
(I don't know why it spikes: Google Cloud infra, spaCy, or the notebook.)
Then for bigger documents:

So I wanted to know if someone has clues on why it isn't twice as fast, and where those spikes come from.
Also, is the transformer component's complexity expected to be quadratic?
Finally, can the training config influence inference speed?
Thank you!