- Llama 3
    - [top 5 in Lmsys, but also tied for first in English](https://x.com/lmsysorg/status/1782483701710061675?s=46&t=90xQ8sGy63D2OtiaoGJuww)
    - [karpathy notes](https://x.com/karpathy/status/1781028605709234613), [HN](https://news.ycombinator.com/item?id=40077533)
- Cohere Command R+: [@cohere](https://twitter.com/cohere/status/1775878850699808928) released Command R+, a 104B parameter model with 128k context length, open weights for non-commercial use, and strong multilingual and RAG capabilities. It's available on the [Cohere playground](https://twitter.com/cohere/status/1775878883268509801) and [Hugging Face](https://twitter.com/osanseviero/status/1775882744792273209). [Aidan tweet](https://twitter.com/aidangomez/status/1775878606108979495)
    - **Optimized for RAG workflows**: Command R+ is [optimized for RAG](https://twitter.com/aidangomez/status/1775878606108979495), with multi-hop capabilities to break down complex questions and strong tool use. It's integrated with [@LangChainAI](https://twitter.com/cohere/status/1775931339361149230) for building RAG applications (a minimal grounded-chat sketch follows this list of model releases).
    - **Multilingual support**: Command R+ has [strong performance](https://twitter.com/seb_ruder/status/1775882934542533021) across 10 languages including English, French, Spanish, Italian, German, Portuguese, Japanese, Korean, Arabic, and Chinese. Its tokenizer is [efficient for non-English text](https://twitter.com/JayAlammar/status/1775928159784915229).
    - [became the #6 model on lmsys](https://twitter.com/lmsysorg/status/1777630133798772766?t=90xQ8sGy63D2OtiaoGJuww) and top open model, beating Mistral Large and Qwen, but behind Claude Sonnet, Gemini Pro, and GPT-4 Turbo
- Mixtral 8x22B
    - https://news.ycombinator.com/item?id=40064736
- Phi-3 ([HN, Technical Report](https://news.ycombinator.com/item?id=40127806), [Sébastien Bubeck short video](https://twitter.com/SebastienBubeck/status/1782627991874678809))
    - phi-3-mini: a 3.8B model trained on 3.3T tokens that rivals Mixtral 8x7B and GPT-3.5
    - phi-3-medium: a 14B model trained on 4.8T tokens, scoring 78% on MMLU and 8.9 on MT-bench
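Since the Command R+ item above highlights RAG-grounded chat, here is a minimal sketch of what that looks like, assuming the Cohere Python SDK's `chat` endpoint with its `documents` parameter (the document titles and snippets are invented for illustration; check Cohere's docs for the current API surface):

```python
# Hedged sketch: grounded (RAG-style) chat with Command R+ via the Cohere Python SDK.
# Assumes `pip install cohere` and a COHERE_API_KEY env var; document contents are made up.
import os
import cohere

co = cohere.Client(os.environ["COHERE_API_KEY"])

docs = [
    {"title": "Release notes", "snippet": "Command R+ is a 104B model with a 128k context window."},
    {"title": "Availability", "snippet": "Command R+ weights are on Hugging Face for non-commercial use."},
]

resp = co.chat(
    model="command-r-plus",
    message="What is Command R+'s context length, and where can I try it?",
    documents=docs,  # the model grounds its answer in these snippets
)

print(resp.text)  # answer text
print(getattr(resp, "citations", None))  # grounded replies typically cite back into `docs`
```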
- udio music https://twitter.com/udiomusic/status/1778045322654003448?t=6FDPaNxZcbSsELal6Sv7Ug
    - [comedy dialogue, sports analysis, commercials, radio broadcasts, asmr, nature sounds](https://x.com/mckaywrigley/status/1778867824217542766?s=46&t=6FDPaNxZcbSsELal6Sv7Ug)
- [sonauto as well](https://news.ycombinator.com/item?id=39992817): a more controllable AI music creator
    - From the Sonauto team on HN: "Others do music generation by training a Vector Quantized Variational Autoencoder like Descript Audio Codec (https://github.com/descriptinc/descript-audio-codec) to turn music into tokens, then training an LLM on those tokens. Instead, we ripped the tokenization part off and replaced it with a normal variational autoencoder bottleneck (along with some other important changes to enable insane compression ratios). This gave us a nice, normally distributed latent space on which to train a diffusion transformer (like Sora). Our diffusion model is also particularly interesting because it is the first audio diffusion model to generate coherent lyrics!" (a toy sketch of this recipe follows this list)
- [Reka Core/Flash/Edge](https://publications.reka.ai/reka-core-tech-report.pdf)
- [Infinity AI: first AI-generated YC AI demo day](https://x.com/snowmaker/status/1775598317399060687)
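The Sonauto note above describes an architecture (a continuous VAE bottleneck instead of VQ tokens, with a diffusion transformer trained on that latent space) that is easier to see in code. Below is a toy PyTorch sketch of that general recipe; every module, dimension, and the simple linear noise schedule are invented for illustration and are not Sonauto's actual implementation:

```python
# Toy sketch of "continuous VAE latents + diffusion transformer" for audio (not Sonauto's code).
import torch
import torch.nn as nn
import torch.nn.functional as F

class AudioVAE(nn.Module):
    """Tiny 1-D conv VAE: waveform -> continuous latent sequence (no codebook/quantizer)."""
    def __init__(self, latent_dim=64, hop=256):
        super().__init__()
        self.encoder = nn.Conv1d(1, 2 * latent_dim, kernel_size=hop, stride=hop)  # predicts mean & logvar
        self.decoder = nn.ConvTranspose1d(latent_dim, 1, kernel_size=hop, stride=hop)

    def encode(self, wav):                                # wav: (B, 1, T)
        mean, logvar = self.encoder(wav).chunk(2, dim=1)
        z = mean + torch.randn_like(mean) * torch.exp(0.5 * logvar)  # reparameterization trick
        return z                                          # (B, latent_dim, T // hop)

class LatentDiT(nn.Module):
    """Tiny diffusion transformer that predicts the noise added to VAE latents."""
    def __init__(self, latent_dim=64, d_model=256, n_layers=4):
        super().__init__()
        self.in_proj = nn.Linear(latent_dim, d_model)
        self.t_embed = nn.Sequential(nn.Linear(1, d_model), nn.SiLU(), nn.Linear(d_model, d_model))
        block = nn.TransformerEncoderLayer(d_model, nhead=8, batch_first=True)
        self.blocks = nn.TransformerEncoder(block, n_layers)
        self.out_proj = nn.Linear(d_model, latent_dim)

    def forward(self, z_noisy, t):                        # z_noisy: (B, L, latent_dim), t: (B,)
        h = self.in_proj(z_noisy) + self.t_embed(t[:, None])[:, None, :]
        return self.out_proj(self.blocks(h))

def diffusion_training_step(vae, dit, wav):
    """Encode audio to latents (VAE assumed pretrained/frozen), add noise, predict the noise."""
    with torch.no_grad():
        z = vae.encode(wav).transpose(1, 2)               # (B, L, latent_dim)
    t = torch.rand(z.size(0))                             # noise level in [0, 1]
    noise = torch.randn_like(z)
    z_noisy = (1 - t[:, None, None]) * z + t[:, None, None] * noise  # simple linear interpolation schedule
    return F.mse_loss(dit(z_noisy, t), noise)

vae, dit = AudioVAE(), LatentDiT()
loss = diffusion_training_step(vae, dit, torch.randn(2, 1, 256 * 128))  # two fake 32k-sample clips
print(loss.item())
```

In the real pipeline the VAE would be trained first on reconstruction and then frozen, with the diffusion transformer trained on its latents; conditioning on lyrics or style (the part behind Sonauto's coherent-lyrics claim) would enter as extra inputs and is omitted here.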
- [Augment - 252m seed](https://techcrunch.com/2024/04/24/eric-schmidt-backed-augment-a-github-copilot-rival-launches-out-of-stealth-with-252m/)
- [XAI seeking 4b](https://www.bloomberg.com/news/articles/2024-04-11/elon-musk-s-xai-seeks-up-to-4-billion-to-compete-with-openai)
- [Nvidia acquires Run:ai for ~$700m](https://news.ycombinator.com/item?id=40144235)
## Learning

- llm.c release - [karpathy explanation](https://x.com/karpathy/status/1778153659106533806)
- Thom Wolf - [how to train LLMs in 2024](https://youtu.be/2-SPH9hIKT8?si=wqYrDbhvgJUT2zHP)
- [Building A GPU from scratch](https://x.com/MajmudarAdam/status/1783304235909877846)
## discussion

- soumith v fchollet https://x.com/fchollet/status/1776319511807115589