Replies: 6 comments 14 replies
-
I am midway through writing up a whitepaper that has more detail. If you have any specific questions, please ask.
-
@neonbjb Hello! As far as I understand, each of the models is trained separately. Could you clarify what the inputs and targets are during training for each model, and which loss functions are used?
-
To support another language, which models need to be retrained, and which can be left unchanged?
-
Sorry, meant "very similar architecture".
-
Hello! It's me again :) Do I understand correctly that the VAE you use picks the codebook index by argmax, in the style of VQ-VAE, rather than using the Gumbel-Softmax relaxation as is done in DALL-E?
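
For anyone following along, here is a rough sketch of the two options being contrasted in this question. It is only an illustration and is not taken from the Tortoise or DL-Art-School code; the function names, the toy codebook, and the temperature value are all made up for the example. The VQ-VAE-style lookup makes a hard choice of the nearest codebook entry (an argmin over distance, which is the same as an argmax over negative distance), while the Gumbel-Softmax relaxation returns soft weights over all entries.

```python
import math
import random


def vq_nearest_index(z, codebook):
    """VQ-VAE-style hard lookup: index of the codebook vector closest to z."""
    dists = [sum((zi - ci) ** 2 for zi, ci in zip(z, code)) for code in codebook]
    # argmin over squared distance == argmax over negative squared distance
    return min(range(len(codebook)), key=dists.__getitem__)


def gumbel_softmax_weights(logits, temperature, rng):
    """Gumbel-Softmax relaxation: soft, sampled weights over codebook entries."""
    noisy = []
    for logit in logits:
        # Clamp u away from 0 and 1 so both logarithms stay finite.
        u = min(max(rng.random(), 1e-12), 1.0 - 1e-12)
        gumbel = -math.log(-math.log(u))  # Gumbel(0, 1) sample
        noisy.append((logit + gumbel) / temperature)
    # Numerically stable softmax over the perturbed logits.
    m = max(noisy)
    exps = [math.exp(v - m) for v in noisy]
    total = sum(exps)
    return [e / total for e in exps]


# Hypothetical toy data, chosen only for illustration.
codebook = [[0.0, 0.0], [1.0, 1.0], [2.0, 0.0]]
z = [0.9, 1.2]

hard_index = vq_nearest_index(z, codebook)  # nearest entry is index 1

rng = random.Random(0)
soft_weights = gumbel_softmax_weights([0.1, 2.0, -1.0], temperature=1.0, rng=rng)
```

The hard lookup is not differentiable, so VQ-VAE-style training typically relies on a straight-through estimator, whereas the relaxed weights can be backpropagated through directly and become closer to one-hot as the temperature is lowered.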
-
Hello! I wondered why "codes or latents" is written in several places in your architecture diagrams. As far as I understand, you first trained the diffusion model on the pure VAE outputs and then on latents obtained from the GPT model (judging by the config that you provided in DL-Art-School), so how does it work?
-
Could you be more specific about the drawings that describe the architecture here?
https://nonint.com/2022/04/25/tortoise-architectural-design-doc/