This repository is my project for learning LLMs from first principles. Please read the 'thank you's' section below for credits to the sources that helped me.
The goal is not to use any AI-generated code or dependencies for the actual inference & training logic. Instead, I intend to write every single line myself, after fully understanding why it's written that way.
❌ Not allowed to be touched by coding agents:
- Inference logic
- Training logic
- LLM harness
✅ Allowed to be touched by coding agents:
- Unit tests
- Infrastructure scripts like chart generation
I know JS is not optimal for heavy linear algebra and mathematics, but I'm here to learn as fast as possible, and JS is the language I'm most comfortable in. Hence the choice to write it in JS.
A guaranteed working version lives on the `timmy` tag (`git checkout timmy`).
```
pnpm describe timmy
Parameter count: 6.6K
Transformer count: 2
Attention head count: 2
Hidden dimensions size: 16
```
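For a rough intuition of where that ~6.6K figure could come from, here is a back-of-the-envelope sketch (not code from this repo). It assumes a character-level vocabulary of roughly 27 tokens, no bias terms, and a 4x-wide MLP per block; the attention head count doesn't change the total, since the heads just split the hidden dimension.

```js
// Back-of-the-envelope parameter estimate for a tiny decoder-only transformer.
// The vocab size and the "no bias terms" simplification are assumptions made
// for illustration; they are not read from this repository's code.

function estimateParams({ layers, hiddenDim, vocabSize }) {
  // Token embedding table (vocabSize x hiddenDim).
  const embedding = vocabSize * hiddenDim;

  // Per block: Q, K, V and output projections -> 4 * d^2 weights.
  const attention = 4 * hiddenDim * hiddenDim;

  // Per block: 4x-wide MLP, up- and down-projection -> 8 * d^2 weights.
  const mlp = 8 * hiddenDim * hiddenDim;

  // Per block: two layer norms, each with a scale and a shift vector.
  const norms = 4 * hiddenDim;

  return embedding + layers * (attention + mlp + norms);
}

// Only `layers` and `hiddenDim` come from `pnpm describe timmy`;
// the ~27-token character-level vocabulary is a guess.
console.log(estimateParams({ layers: 2, hiddenDim: 16, vocabSize: 27 })); // 6704, in the ballpark of 6.6K
```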
I've written about the behavior and process of training the model here: http://its.beer/thoughts/training-timmy.
My main source of understanding is 3Blue1Brown's video series:
- Inference videos
- Training videos
- Gradient descent
- Backprop part 1 & part 2
On top of that, AI coding agents helped me out a bunch. They were not allowed to write any inference or training code, or to give me answers directly, but were instructed to guide me to them through hints and TDD.
