5/20/2026 · 6 min read
— Ali Buğatekin
When the news of Andrej Karpathy joining Anthropic broke, my first thought was "another researcher changing sides." Then I read what he was going to do and stopped.
His mission: **use Claude to accelerate Claude's own pre-training research.**
AI is going to be used to make its own training more efficient. That sentence is worth re-reading.
Karpathy announced the move on his own X account:
> I've joined Anthropic. I think the next few years at the frontier of LLMs will be especially formative. I am very excited to join the team here and get back to R&D.
>
> — Andrej Karpathy, on X

At Anthropic he'll work under team lead **Nick Joseph**, building a brand-new group inside Joseph's team that uses Claude itself to accelerate the next round of pre-training. Joseph publicly welcomed him, saying Karpathy is "perfectly suited for this exact challenge."
To understand why this move is loud, it helps to look at where Karpathy has been:
In the same post he added that he isn't walking away from education either:
> I remain deeply passionate about education and plan to resume my work on it in time.
>
> — Andrej Karpathy

For now, he's pausing Eureka Labs to get back to frontier LLM research. That signal alone matters: someone who founded an education startup considers it important enough to step back to the model-building front.
Pre-training is the stage where a model acquires its knowledge. Fed by the internet, books, and code repositories, it's also the most expensive phase — weeks of compute on huge GPU clusters.
Claude learns how to write, how to reason, and how to produce code at this stage. **Accelerating it means accelerating everything.**
Anthropic's stance is this: instead of more compute, **AI-assisted research** can make this phase more efficient.
Karpathy will build the new group that turns that idea into practice, sitting inside Nick Joseph's pre-training team. They'll study which parts of the pre-training pipeline Claude itself can make more efficient.
> Not compute, but intelligence is the differentiator.
>
> — Anthropic's implicit thesis

Smarter research processes instead of infinite GPUs. Against OpenAI and Google, this opens a different front from the funding-and-hardware race.
This is where my head spins — in a good way.
If Claude is used to accelerate its own training, a better Claude comes out the other end. That better Claude can be used in the next round of pre-training. And the loop continues.
This isn't a theoretical scenario. Karpathy's mission is precisely to build the **practical version** of this loop.
Which decisions are the researchers making, and which is Claude making? That question gets harder to answer by the day.
I'm not inside this loop. What pre-training teams do has no direct bearing on my day-to-day work.
But here's what I notice: **the line between model development and application development is blurring.**
Yesterday it was "take the model, build an app on top." Now the model is being used to improve its own foundation. This doesn't change my work in the short term. But in the long run, which tools I use and how fast the model evolves — all of it depends on how quickly this loop spins.
Small footnote: Karpathy also coined the term "vibe coding." Now he's moving into the foundations of the very model that makes vibe coding possible — a pleasant little loop of its own.
Karpathy's move to Anthropic reads like a career headline. But the question underneath is more lasting: **what does it mean for AI to take part in its own training?**
The tempo is shifting. Models are improving faster. And what feeds that speed is finally becoming visible.