Commit 609b7ba

Author: swyx
manual vault backup: 2024-06-29 - 6 files
Affected files: Monthly Notes/Apr 2024 notes.md, Monthly Notes/Jun 2024 notes.md, Monthly Notes/May 2024 notes.md, Resources/Good AI Podcasts and Newsletters.md, Resources/Understanding Transformers.md, stub notes/OpenAI notes.md
Parent commit: 37cda03

6 files changed: +38 −4 lines

Monthly Notes/Apr 2024 notes.md

+7 −1
@@ -42,11 +42,13 @@
- Llama 3
- [top 5 in Lmsys, but also tied for first in English](https://x.com/lmsysorg/status/1782483701710061675?s=46&t=90xQ8sGy63D2OtiaoGJuww)
- [karpathy notes](https://x.com/karpathy/status/1781028605709234613), [HN](https://news.ycombinator.com/item?id=40077533)
- Cohere Command R+: [@cohere](https://twitter.com/cohere/status/1775878850699808928) released Command R+, a 104B parameter model with 128k context length, open weights for non-commercial use, and strong multilingual and RAG capabilities. It's available on the [Cohere playground](https://twitter.com/cohere/status/1775878883268509801) and [Hugging Face](https://twitter.com/osanseviero/status/1775882744792273209). [Aidan tweet](https://twitter.com/aidangomez/status/1775878606108979495)
- **Optimized for RAG workflows**: Command R+ is [optimized for RAG](https://twitter.com/aidangomez/status/1775878606108979495), with multi-hop capabilities to break down complex questions and strong tool use. It's integrated with [@LangChainAI](https://twitter.com/cohere/status/1775931339361149230) for building RAG applications.
- **Multilingual support**: Command R+ has [strong performance](https://twitter.com/seb_ruder/status/1775882934542533021) across 10 languages including English, French, Spanish, Italian, German, Portuguese, Japanese, Korean, Arabic, and Chinese. Its tokenizer is [efficient for non-English text](https://twitter.com/JayAlammar/status/1775928159784915229).
- [became the #6 model on lmsys](https://twitter.com/lmsysorg/status/1777630133798772766?t=90xQ8sGy63D2OtiaoGJuww) and top open model, beating mistral large and qwen, but behind claude sonnet, gemini pro, and gpt4t
- Mistral 8x22B
- https://news.ycombinator.com/item?id=40064736
- Phi-3 ([HN, Technical Report](https://news.ycombinator.com/item?id=40127806), [sebastian bubeck short video](https://twitter.com/SebastienBubeck/status/1782627991874678809?ref_src=twsrc%5Etfw%7Ctwcamp%5Etweetembed%7Ctwterm%5E1782627991874678809%7Ctwgr%5E507304ee4fbb7b0a8a9c60b9bb5711109bde1d41%7Ctwcon%5Es1_&ref_url=https%3A%2F%2Fwww.emergentmind.com%2Fpapers%2F2404.14219))
- phi-3-mini: 3.8B model trained on 3.3T tokens rivals Mixtral 8x7B and GPT-3.5
- phi-3-medium: 14B model trained on 4.8T tokens w/ 78% on MMLU and 8.9 on MT-bench
@@ -89,7 +91,8 @@
- udio music https://twitter.com/udiomusic/status/1778045322654003448?t=6FDPaNxZcbSsELal6Sv7Ug
- [comedy dialogue, sports analysis, commercials, radio broadcasts, asmr, nature sounds](https://x.com/mckaywrigley/status/1778867824217542766?s=46&t=6FDPaNxZcbSsELal6Sv7Ug)
- [sonauto as well](https://news.ycombinator.com/item?id=39992817): A more controllable AI music creator
- "Others do music generation by training a Vector Quantized Variational Autoencoder like Descript Audio Codec (https://github.com/descriptinc/descript-audio-codec) to turn music into tokens, then training an LLM on those tokens. Instead, we ripped the tokenization part off and replaced it with a normal variational autoencoder bottleneck (along with some other important changes to enable insane compression ratios). This gave us a nice, normally distributed latent space on which to train a diffusion transformer (like Sora). Our diffusion model is also particularly interesting because it is the first audio diffusion model to generate coherent lyrics!" (from the Sonauto HN thread; a minimal sketch of the two approaches follows after this list)
- [Reka Core/Flash/Edge](https://publications.reka.ai/reka-core-tech-report.pdf)
- [Infinity AI: first AI-generated YC demo day](https://x.com/snowmaker/status/1775598317399060687)

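The Sonauto quote above contrasts two recipes: discrete codec tokens modeled by an autoregressive LM, versus continuous VAE latents modeled by a diffusion transformer. A minimal sketch of the two shapes involved, with toy dimensions and module choices that are my own assumptions (not Sonauto's actual architecture; timestep/text conditioning omitted):

```python
import torch
import torch.nn as nn

def block(d):
    # tiny transformer trunk shared by both sketches
    return nn.TransformerEncoder(
        nn.TransformerEncoderLayer(d_model=d, nhead=4, batch_first=True), num_layers=2)

class TokenLM(nn.Module):
    """Recipe A ("what others do"): a codec/VQ-VAE turns audio into discrete tokens,
    and an autoregressive LM predicts the next token."""
    def __init__(self, vocab=1024, d=256):
        super().__init__()
        self.embed = nn.Embedding(vocab, d)
        self.body = block(d)
        self.head = nn.Linear(d, vocab)

    def forward(self, tokens):                 # tokens: (batch, time) integer codes
        return self.head(self.body(self.embed(tokens)))

class LatentDenoiser(nn.Module):
    """Recipe B (Sonauto-style, per the note): keep a continuous VAE latent space and
    train a diffusion transformer to denoise latents instead of predicting tokens."""
    def __init__(self, latent=64, d=256):
        super().__init__()
        self.inp = nn.Linear(latent, d)
        self.body = block(d)
        self.out = nn.Linear(d, latent)        # predicts the noise added to the latents

    def forward(self, noisy_latents):          # (batch, time, latent) continuous values
        return self.out(self.body(self.inp(noisy_latents)))

print(TokenLM()(torch.randint(0, 1024, (2, 16))).shape)   # torch.Size([2, 16, 1024])
print(LatentDenoiser()(torch.randn(2, 16, 64)).shape)     # torch.Size([2, 16, 64])
```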
@@ -100,9 +103,12 @@
- [Augment - 252m seed](https://techcrunch.com/2024/04/24/eric-schmidt-backed-augment-a-github-copilot-rival-launches-out-of-stealth-with-252m/)
- [XAI seeking 4b](https://www.bloomberg.com/news/articles/2024-04-11/elon-musk-s-xai-seeks-up-to-4-billion-to-compete-with-openai)
- [Nvidia acquires RunAI for ~$700m](https://news.ycombinator.com/item?id=40144235)
## Learning
- llm.c release - [karpathy explanation](https://x.com/karpathy/status/1778153659106533806)
- Thom Wolf - [how to train LLMs in 2024](https://youtu.be/2-SPH9hIKT8?si=wqYrDbhvgJUT2zHP)
- [Building A GPU from scratch](https://x.com/MajmudarAdam/status/1783304235909877846)
## discussion
- soumith v fchollet https://x.com/fchollet/status/1776319511807115589

Monthly Notes/Jun 2024 notes.md

+10 −1
@@ -23,20 +23,26 @@
## open models
- [Mamba-2 release](https://goombalab.github.io/blog/2024/mamba2-part1-model/)
- https://arxiv.org/abs/2405.21060
- https://x.com/_albertgu/status/1797651223035904355
- https://x.com/tri_dao/status/1797650443218436165
- [stable diffusion 3 medium](https://stability.ai/news/stable-diffusion-3-medium)
## open tooling
- [plandex](https://github.com/plandex-ai/plandex) - a reliable and developer-friendly AI coding agent in your terminal. It can plan out and complete large tasks that span many files and steps.
- [R2R - rag2riches](https://github.com/SciPhi-AI/R2R) - a prod-ready RAG (Retrieval-Augmented Generation) engine with a RESTful API. R2R includes hybrid search, knowledge graphs, and more.
## other launches
- etched launch: https://www.etched.com/announcing-etched
- Sohu is the world’s first transformer ASIC. One 8xSohu server replaces 160 H100 GPUs.
- By specializing, Sohu gets unprecedented performance. One 8xSohu server can serve over 500,000 Llama 70B tokens per second.
- luma ai dream machine https://news.ycombinator.com/item?id=40670096
- arc-agi benchmark, [arc prize](https://news.ycombinator.com/item?id=40648960)
- one solution got 71% or 50% https://x.com/bshlgrs/status/1802766374961553887
- [huggingface open llm leaderboard v2](https://huggingface.co/spaces/open-llm-leaderboard/blog)
- gsm8k, truthfulqa are contaminated in instruction datasets
@@ -58,6 +64,7 @@
- Assuming 100% utilization, a self-hosted Llama-3 8B-Instruct model on EKS costs about $17 per 1M tokens, vs about $1 per 1M tokens for ChatGPT on the same workload.
- Choosing to self-host on your own hardware can bring the cost under $0.01 per 1M tokens, but takes ~5.5 years to break even.
- [A Picture is Worth 170 Tokens: How Does GPT-4o Encode Images?](https://www.oranlooney.com/post/gpt-cnn/)
- Here’s a [fact](https://openai.com/api/pricing/): GPT-4o charges 170 tokens to process each `512x512` tile used in high-res mode. At ~0.75 tokens/word, this suggests a picture is worth about 227 words—only a factor of four off from the traditional saying. (quick arithmetic check after this list)
- forcing AI onto us
- msft recall default https://news.ycombinator.com/item?id=40610435
- and delay https://news.ycombinator.com/item?id=40683210
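A quick arithmetic check of the picture-is-worth-N-words bullet above. The 170 tokens per `512x512` tile and ~0.75 tokens/word figures come from that bullet; the 85-token per-image base cost and the naive tiling rule are assumptions based on OpenAI's published high-detail pricing, so treat the helper as a rough sketch rather than the exact formula:

```python
import math

TOKENS_PER_TILE = 170    # per the bullet / linked pricing note
TOKENS_PER_WORD = 0.75   # rough conversion used in the bullet
BASE_TOKENS = 85         # assumed fixed per-image overhead in high-detail mode

# "a picture is worth ~227 words": words represented by one 512x512 tile
print(TOKENS_PER_TILE / TOKENS_PER_WORD)        # ~226.7

def highres_image_tokens(width, height):
    # naive estimate: split into 512x512 tiles, ignoring OpenAI's resize rules
    tiles = math.ceil(width / 512) * math.ceil(height / 512)
    return BASE_TOKENS + TOKENS_PER_TILE * tiles

print(highres_image_tokens(1024, 1024))         # 4 tiles -> 765 tokens
```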
@@ -66,3 +73,5 @@
- [talaria tool](https://buttondown.email/ainews/archive/ainews-talaria-apples-new-mlops-superweapon-4066/)
- perplexity - forbes attribution issue
- [books4 dataset](https://web.archive.org/web/20240519104217/https://old.reddit.com/r/datasets/comments/1cvi151/ai_books4_dataset_for_training_llms_further/)
- [together ai mixture of agents](https://www.together.ai/blog/together-moa)
- Our reference implementation, Together MoA, significantly surpasses GPT-4o (57.5%) on AlpacaEval 2.0 with a score of 65.1%, using only open-source models. While Together MoA achieves higher accuracy, it does come at the cost of a slower time to first token; reducing this latency is an exciting future direction for this research.
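The mixture-of-agents recipe is straightforward to sketch: several "proposer" models each draft an answer, then an "aggregator" model synthesizes the drafts into the final response (Together's version stacks multiple such layers of specific open models). A minimal single-layer sketch; `call_llm` and the model names are hypothetical placeholders, not Together's API:

```python
# Minimal single-layer mixture-of-agents sketch. `call_llm` is a hypothetical
# stand-in for whatever chat-completion client you use; model names are examples.
from typing import Callable, List

def mixture_of_agents(prompt: str,
                      call_llm: Callable[[str, str], str],
                      proposers: List[str],
                      aggregator: str) -> str:
    # 1. each proposer model drafts an independent answer
    drafts = [call_llm(model, prompt) for model in proposers]

    # 2. the aggregator sees the prompt plus all drafts and synthesizes a final answer
    numbered = "\n\n".join(f"[Response {i+1}]\n{d}" for i, d in enumerate(drafts))
    aggregate_prompt = (
        "You are given a user request and several candidate responses from other models.\n"
        "Write a single, higher-quality response that synthesizes their strengths and "
        "fixes their errors.\n\n"
        f"User request:\n{prompt}\n\nCandidate responses:\n{numbered}"
    )
    return call_llm(aggregator, aggregate_prompt)

if __name__ == "__main__":
    # stubbed client so the sketch runs without any API
    fake = lambda model, prompt: f"<{model} answer to: {prompt[:30]}...>"
    print(mixture_of_agents("Explain MoA briefly.", fake,
                            proposers=["open-model-a", "open-model-b", "open-model-c"],
                            aggregator="open-model-aggregator"))
```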

Monthly Notes/May 2024 notes.md

+15 −1
@@ -36,13 +36,25 @@
- [Consistency LLM: converting LLMs to parallel decoders accelerates inference 3.5x](https://hao-ai-lab.github.io/blogs/cllm/)
## launches
- [ChatGPT UI for rabbit holes with reader pane - a9.io](https://delve.a9.io/)
## Funding
- scale ai: **Scale AI raises $1B at $13.8B valuation in round led by Accel**: [@alexandr_wang](https://twitter.com/alexandr_wang/status/1792905417065914858)
- suno ai: **Suno raises $125M to enable anyone to make music with AI**: [@suno_ai_](https://twitter.com/suno_ai_/status/1792922276683297162)
## Discussions
- [LLM evaluation tools/vendors that you really like that are useful in domain-specific contexts?](https://x.com/HamelHusain/status/1787944186714759419)
I'm looking for vendors that have ALL of the below:
1. Observability
2. Allow you to write your own assertions and LLM judges
3. Bootstrap the creation of #2 automatically with LLMs
4. Measure the alignment between humans (through an annotation queue) and capture critiques, also helping to generate more kinds of tests like #2
5. Prompt, data and model versioning
- [The future of foundation models is closed source](https://x.com/absoluttig/status/1793001830110380313)
- centralized vs decentralized
- Despite recent progress and endless cheerleading, open-source AI will become a financial drain for model builders, an inferior option for developers and consumers, and a risk to national security. Closed-source models will create far more economic and consumer value over the next decade.
@@ -51,9 +63,11 @@
- https://twitter.com/DrJimFan/status/1786054643568517261
- [Consistency LLM - parallel decoders accelerates inference 3.5x](https://news.ycombinator.com/item?id=40302201)
- [Google Gemini's impending Context Caching](https://news.ycombinator.com/item?id=40364220)
- [Consistency Large Language Models: A Family of Efficient Parallel Decoders](https://hao-ai-lab.github.io/blogs/cllm/): converting LLMs to parallel decoders accelerates inference 3.5x (toy sketch at the end of this section)
- [shunyu yao phd defense](https://twitter.com/ShunyuYao12/status/1789058769982550031)
- "Language Agents: From Next-Token Prediction to Digital Automation" https://ysymyth.github.io/papers/Dissertation-finalized.pdf
- Talk (WebShop, SWE-bench, ReAct, ToT, CoALA, and on the future of agents):
- https://ysymyth.github.io/papers/Dissertation-finalized.pdf
- [fineweb dataset](https://huggingface.co/spaces/HuggingFaceFW/blogpost-fineweb-v1): a new, large-scale (15-trillion tokens, 44TB disk space) dataset for LLM pretraining. FineWeb is derived from 96 CommonCrawl snapshots and produces better-performing LLMs than other open pretraining datasets.
- learning
- [Llama 3 implemented in pure NumPy](https://docs.likejazz.com/llama3.np/)
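The Consistency LLM items above rest on one idea worth spelling out: greedy decoding is the fixed point of "token i = f(tokens before i)", so you can initialize all n future tokens at once and refresh them in parallel (Jacobi iteration) until nothing changes; CLLM's actual contribution is fine-tuning the model so this converges in very few refreshes. A toy sketch with a deterministic stand-in for the model, not the paper's code:

```python
# Toy Jacobi-style parallel decoding. `next_token` is a hypothetical deterministic
# stand-in for greedy LM decoding; a real CLLM trains the model so the parallel
# guesses converge in only a handful of iterations.

def next_token(prefix):
    # stand-in "model": next token is a simple function of the prefix
    return (sum(prefix) * 31 + len(prefix)) % 100

def autoregressive(prompt, n):
    seq = list(prompt)
    for _ in range(n):
        seq.append(next_token(seq))          # n strictly sequential calls
    return seq[len(prompt):]

def jacobi(prompt, n, max_iters=50):
    guess = [0] * n                          # guess all n future tokens at once
    for it in range(max_iters):
        # refresh every position in parallel from the current guess
        new = [next_token(list(prompt) + guess[:i]) for i in range(n)]
        if new == guess:                     # fixed point = same output as greedy decoding
            return new, it + 1
        guess = new
    return guess, max_iters

if __name__ == "__main__":
    prompt, n = [1, 2, 3], 8
    print(autoregressive(prompt, n))
    print(jacobi(prompt, n))                 # same tokens, converges in <= n+1 refreshes
```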

Resources/Good AI Podcasts and Newsletters.md

+1 −1
@@ -24,7 +24,7 @@ some of these are on youtube too, i dont really bother separating them. ⭐ rep
- [Eye on AI](https://open.spotify.com/show/5aFnCGDhpL5bGr2uHy4bB5) - Weekly analysis at the intersection of artificial intelligence and industry. (less technical but great guest backlog)
- [AI Jason](https://youtu.be/pJwR5pv0_gs?si=BdXjIX1mEik-Lbpz) - new frequent ai project breakdowns channel
- [Matthew Berman](https://www.youtube.com/@matthew_berman) - short explainer videos of AI Engineering related projects and news
- smaller creators worth noting: [Algorithmic Simplicity](https://www.youtube.com/watch?v=N6Piou4oYx8) (explanations of archs), [Umar Jamil (standard concept teaching channel, very technical)](https://www.youtube.com/@umarjamilai?app=desktop) and [Daniel Bourke (livestream paper replication)](https://www.youtube.com/@danielbourkearxiv2821?app=desktop), [Efficient NLP (good short paper/technique explainers)](https://www.youtube.com/@EfficientNLP), [Trelis Research](https://www.youtube.com/watch?v=ae2lbmtTY5A)
- [/r/LocalLlama list has 23 recommendations](https://www.reddit.com/r/LocalLLaMA/comments/1atycgd/which_localllama_focused_yt_channels_do_you_follow/)
- Companies
- ⭐ [The Cognitive Revolution](https://www.cognitiverevolution.ai/) - Nathan Labenz - great new pod

Resources/Understanding Transformers.md

+4
@@ -57,6 +57,7 @@ The Illustrated Transformer - [https://jalammar.github.io/illustrated-transform
- Reminder that my deep learning course [@unige_en](https://twitter.com/unige_en) is entirely available on-line. 1000+ slides, ~20h of screen-casts. [https://fleuret.org/dlc/](https://t.co/6OVyjPdwrC)
- https://e2eml.school/transformers.html Transformers from Scratch

https://www.jvoderho.com/blog.html?blogid=Transformer%20as%20a%20general%20purpose%20computer

https://news.ycombinator.com/item?id=35712334
The Illustrated Transformer is fantastic, but I would suggest that those going into it really should read the previous articles in the series to get a foundation to understand it more, plus later articles that go into GPT and BERT, here's the list:
@@ -76,6 +77,9 @@ Visualizing A Neural Machine Translation Model (Mechanics of Seq2seq Models With
[3] [https://arxiv.org/abs/2301.05062](https://arxiv.org/abs/2301.05062)

[3Blue1Brown visualizing attention](https://www.3blue1brown.com/lessons/attention) https://news.ycombinator.com/item?id=40035514
The Illustrated BERT, ELMo, and co. (How NLP Cracked Transfer Learning) - [https://jalammar.github.io/illustrated-bert/](https://jalammar.github.io/illustrated-bert/)

The Illustrated GPT-2 (Visualizing Transformer Language Models) - [https://jalammar.github.io/illustrated-gpt2/](https://jalammar.github.io/illustrated-gpt2/)

stub notes/OpenAI notes.md

+1
@@ -56,6 +56,7 @@
## sama
- intentionally did not give chatgpt a human name - want to distance from “person-ness” but accept that others will do it https://overcast.fm/+SusunrACk/12:51
- [not fired by YC](https://news.ycombinator.com/item?id=40521657)
## mainstream puff pieces
