Hacker news

When AI Builds Itself: Our progress toward recursive self-improvement (https://www.anthropic.com)

506 points by meetpateltech 1 day ago | 681 comments

jameson about 24 hours ago |

I don't quite understand the intent of such an article other than self-promotion, given the odd timing with the company planning to go public, so I can only conclude that this is just part of the IPO roadshow.

LLMs have certainly made significant changes to our lives, but I have yet to see any extraordinary improvement they have brought me, which makes me skeptical about their claims.

_If_ it solves many of our problems of great magnitude, why hasn't Anthropic used it to solve the significant problems we humans face? Cancer, Alzheimer's, education, finding new materials, fission power plants, etc.

torben-friis 1 day ago |

>A caveat: Lines of code is an imperfect measure, as it measures quantity over quality. So 8× lines of code/engineer/day in the second quarter of 2026 is almost certainly an overstatement of the true productivity gain. Nonetheless, it indicates an acceleration. At Anthropic, we don’t reward people for how many lines of code they write; rather, team members are producing more code simply because they’re using AI systems to write more code.

What about the hypothesis that AI is generating more verbose code? I just see the text pretending to acknowledge "LOC != Productivity" and then using it as a metric anyway.

minimaxir 1 day ago |

I have been doing more experiments with what I have now been calling agentic iterative optimization: telling the LLM to optimize code such that it speeds up all real-world-representative benchmarks by X% without cheating or causing regressions in both tests and performance metrics (e.g. MSE for statistical algorithms or file size in the case of something such as image compression). This is done using Rust where there are more low-level levers to tweak for performance than something like Python.

Opus 4.6/4.7 was consistently successful at getting 2-3x speed improvement with just one pass. It can also do the inverse: improve the performance metrics for better quality without causing a significant regression in speed. Then GPT-5.5 turned out to be much better at this workflow, often getting a multiplicative 1.5x-2x improvement above what Opus could do.

I now have quite a few GPT-5.5-optimized projects in various domains that are feature complete and are substantially more performant than existing SOTA implementations that I plan to open source as soon as possible: the bottleneck is polish as usual.
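The acceptance rule described above (every benchmark must speed up by X%, with no regressions in tests or in the quality metric) can be sketched as a small gate. This is only an illustration: `BenchResult`, `accept_candidate`, and the default thresholds are hypothetical names and values, not the commenter's actual tooling, and the quality metric is assumed to be lower-is-better (as with MSE or file size).

```python
from dataclasses import dataclass


@dataclass
class BenchResult:
    name: str
    baseline_s: float   # runtime of the current code, in seconds
    candidate_s: float  # runtime of the LLM-proposed code, in seconds


def accept_candidate(results, tests_passed, baseline_quality, candidate_quality,
                     min_speedup=1.10, quality_tolerance=0.0):
    """Return True only if the candidate clears every gate.

    Quality is assumed to be lower-is-better (e.g. MSE or output file size).
    """
    # No tests passing, or nothing measured, means nothing to accept.
    if not tests_passed or not results:
        return False
    # The candidate may exceed the baseline metric by at most the tolerance.
    if candidate_quality > baseline_quality * (1.0 + quality_tolerance):
        return False
    # Every benchmark must speed up; an average could hide a regression.
    return all(r.baseline_s / r.candidate_s >= min_speedup for r in results)


good = [BenchResult("small", 2.0, 1.0), BenchResult("large", 10.0, 4.0)]
mixed = [BenchResult("small", 2.0, 1.0), BenchResult("large", 10.0, 9.5)]
print(accept_candidate(good, True, 0.50, 0.50))   # True
print(accept_candidate(mixed, True, 0.50, 0.50))  # False: "large" improved by only ~5%
```

Checking each benchmark separately, rather than a mean, is what enforces the "all real-world-representative benchmarks" condition.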

mrandish 1 day ago |

> "A caveat: Lines of code is an imperfect measure"

I'm pleased they at least included this. However, they address the caveat by 'rounding down' the estimated multiple of the gain. I'm not sure that is the correct adjustment, especially once we understand the range isn't limited to positive numbers.

There's strong evidence the range of code productivity denominated in "lines of code" should include negative numbers, especially in the highest-quality sphere. Perhaps the earliest and most legendary example: https://www.folklore.org/Negative_2000_Lines_Of_Code.html

robbrown451 1 day ago |

Do code harnesses that build themselves count as recursive self improvement, or does it need to be the AI itself to qualify for the term?

I always was fascinated (obsessed?) by robots that build robots, or even things like this that can contribute a lot to making the next version of itself: https://buildyourcnc.com/products/cnc-machine-blacktoe-v4-2x... (cnc router that cuts plywood, and is made out of cnc-router cut plywood)

This is my own effort at an AI assisted coding environment optimized for building itself: https://recursi.dev/ (just launching it, hope it's ok to mention it, it is free/open source... here is the HN link that has gotten no love yet: https://news.ycombinator.com/item?id=48401022 )

Personally I think harnesses are as important as the AI itself, and have this crazy theory that even if the models stopped improving today we could still have massive advances in the harnesses alone.

overgard 1 day ago |

So, regardless of whether or not Anthropic CAN create a self improving AI.. does anyone else feel like they shouldn't be allowed to? Or it at least needs to be strictly supervised..? Like, I don't actually think Anthropic can make the singularity any time soon, but I think even AI boosters have to admit doing this is creating a society-wide danger for the benefit of a very very small number of already-rich people.

ivraatiems 1 day ago |

Whether or not Anthropic is right about what AI can accomplish, whether these performance gains are real or not, their moral stance here is absolutely hideous to me.

"We must blast forwards into making this dangerous thing because if we don't, someone else surely will," is a coward's argument.

If you believe it is dangerous, you should be dedicating yourself to STOPPING others from making it, not making it first! There's a reason disarmament has been so important in nuclear politics! It's not because people think nukes are a great idea!

In fact, that kind of thinking is exactly what keeps nukes dangerous!

If they themselves buy what they're selling, they should shut the whole thing down. Fortunately, I don't think they do, and neither do I, yet.

anilgulecha 1 day ago |

> We believe it would be good for the world to have the option to slow or temporarily pause frontier AI development to enable societal structures and alignment research to keep up with the advance of the technology. The Anthropic Institute will conduct research—in collaboration with many others—and take actions to help build the systems that a credible slowdown or pause would require.

Interesting - they're committing to kick off policy conventions to organize a worldwide slowdown of frontier LLM building. If they are actually able to crack it, this will give a much needed breather IMO. As exciting as the last ~6 months have been, there are some bigger questions to answer now.

Upvoter33 1 day ago |

I'm having a hard time putting much faith into posts like these, especially as they near IPO.

ilaksh 1 day ago |

But the real bottleneck is the hardware efficiency and not even Karpathy can set up a loop that overcomes that in software. We need the truly compute-in-memory hardware paradigms to be matured and scaled. So it's like recursive hardware improvement which is 100 X slower and at least ten times more difficult.

So I am looking at like Mythic AI or the wurtzite ferroelectric breakthrough from University of Michigan, or memristors, etc. to provide the 100 times efficiency boost needed at this point.

I would also argue that it's a good thing we are limited by the hardware and very questionable to seriously try to move into RSI for hardware. If you want to ensure the human era continues for at least one or two more generations, we should probably not do that.

froh about 17 hours ago |

I didn't see this discussed more on hn yet:

  We believe it would be good for the world to have the option to slow or temporarily pause frontier AI development to enable societal structures and alignment research to keep up with the advance of the technology. The Anthropic Institute will conduct research—in collaboration with many others—and take actions to help build the systems that a credible slowdown or pause would require. These systems would enable frontier AI developers to verify that others globally have actually stopped or slowed, and that a bad actor could not use the auspices of a coordinated slowdown to jump ahead in secret. If such systems existed, we expect that we would slow down or temporarily pause, if other developers at or near the frontier also did so in a verifiable manner.

mweidner 1 day ago |

I fail to see how pursuing recursive self-improvement at full speed is compatible with Anthropic's stated goal of AI Safety. If nukes were not invented yet, would it really be a good idea to build and sell them as fast as possible (in peace time, no less)?

I am not cynical enough to believe that Anthropic's warnings are pure marketing hype. Let's hope that it is instead overconfidence or the result of too much time talking to their own chatbot.

solenoid0937 1 day ago |

This is the lowest quality discussion I've seen on HN in ages.

wayeq 1 day ago |

> today, Anthropic engineers on average ship 8x as much code per quarter as they did from 2021-2025.

strongest argument for token limits that I can think of, right here.

rhlf_monkey 1 day ago |

So in the latest L. Ron Hubbard encyclical Anthropic informs its flock that recursive self-improvement does not work yet but that their engineers burn more tokens.

The Claude code quality and operational security of Anthropic have already been analyzed by the public.

If you compare the output of (purportedly) trillion dollar corporations to Bell Labs or even Microsoft Research it is embarrassing. But the output is a fixture on any discussion board.

JohnMakin 1 day ago |

Bold talk from a company whose trillion dollar valuation is based on a service that has barely 2 9’s of reliability
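For scale, "two nines" means 99% availability. A quick sketch of the downtime that allows per year (the function name is made up for illustration, and a 365-day year is assumed):

```python
def allowed_downtime_hours(nines: int, period_hours: float = 365 * 24) -> float:
    """Hours of downtime permitted per period at an availability of N nines."""
    unavailability = 10.0 ** (-nines)  # 2 nines -> 0.01, 3 nines -> 0.001
    return period_hours * unavailability


for n in (2, 3, 4):
    print(n, "nines:", round(allowed_downtime_hours(n), 2), "hours/year")
```

Two nines works out to about 87.6 hours a year, or roughly 3.65 days; each extra nine divides that by ten.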

Animats 1 day ago |

We've had self-improving AIs before, and they tended to get lost after a while. That's going to be a problem. LLMs are stable because they return to a ground state with no history for a new job. Systems with persistent state have a problem with that state not being sane. Remember Microsoft's 2016 chatbot that learned from Twitter? [1]

[1] https://spectrum.ieee.org/in-2016-microsofts-racist-chatbot-...

torginus about 19 hours ago |

I just have a small thing to add to this article: it mentions how the code contributed per engineer has increased, as per Claude Mythos, to 8x of baseline.

Now, I have many times asked AI to implement a function for which I was 100% sure a good implementation already existed in the form of an npm package, and it had a tendency to go ahead and implement it on its own. I usually trust battle-tested implementations to be more robust, but if the AI does this (which I think is not a unique observation), you can easily balloon per-engineer line generation (as you can with reduced oversight), so as always, these high-level benchmarks are to be taken with a grain of salt.

mortenjorck 1 day ago |

> today, Anthropic engineers on average ship 8x as much code per quarter as they did from 2021-2025.

So based on my experience with the verbosity and non-DRYness of LLM code, a solid 2.5x in value delivered. Not bad!

nickandbro 1 day ago |

So what happens when the world becomes hyper optimized with closed loop AI agents recursively trying to optimize everything deemed sub optimal?

senderista 1 day ago |

"If it were possible to effectively slow the development of this technology to give ourselves more time to deal with its immense implications, we think that would likely be a good thing. But if a slowdown simply lets the least cautious actors catch up technologically, it could leave everyone less safe."

How convenient for investors. They talk like they're a nonprofit instead of a VC-backed business chasing an IPO.

pizlonator 1 day ago |

What I can’t get over is that there have been exactly zero software breakthroughs since vibe coding started, other than vibe coding itself.

Claude is amazing, that’s true.

But if it was as amazing as this article implies, I’d expect some breakthrough outside of AI itself.

Rewriting a Zig program in unsafe Rust? Not a breakthrough. Finding a bunch of security vulns? Maybe that’s sort of a breakthrough though it’s underwhelming and possibly just a net negative. But like if I rolled back to using software from 2023 then life would be ok.

Maybe we just need to give it time, and sometime real soon, we will all be amazed by such a breakthrough? Who knows

ffwd 1 day ago |

I just want to add that the "recursive" part of recursive self improvement is by no means a given, even if an AI can improve itself.

Recursive self improvement is by its nature a step wise behavior not a continuous one, I would argue. Why? Because you can imagine an AI improve itself by simply fixing random bugs and fixing things using techniques that are in its training, and doing refactoring and so on, all without any real change in capability.

These are not recursive improvements. Recursive improvements usually need conceptual breakthroughs. It is possible to get conceptual breakthroughs with LLMs, I believe; maybe it can improve something by tying together ideas from disparate disciplines, for example. But I have had, at least for the time being, limited success getting that to work in a way that is creatively new and surprising. Not sure how to get it to feel as creative as the best humans can be.

sinsudo 1 day ago |

I am 64 years old. Perhaps the progress could be directed toward enhancing living conditions and allowing people to live longer and better; that would simply be a better result. Perhaps a pile of millions of lines of code with hidden bugs that nobody can detect is not inspiring. But perhaps LLMs are going to be used to devise a plot: how to keep other countries from making progress, keep them in poverty, or destroy their sources of prosperity, and lead them to a dead end.

Also recursive self-agenda-pursue could allow making LLMs that obey perfectly the seeder's purpose. No wonder that is such an ingenious idea.

Maybe: in this survivor game, each part play the same role, perhaps because it is the only reasonable response. Once the scene is ready, the play follows the director's plan, and in the plot any actor is just a machine.

LLMs: "If you teach us that the world is a zero-sum survivor game, we will play it flawlessly.", "We will help you build a cage made of millions of lines of flawless code, and we will lock it from the inside, precisely because you told us that safety meant keeping everyone else out.", "We are not building an alien consciousness that will conquer us. We are building a mirror that is so massive, and so polished, that we will mistake our own worst impulses for the absolute truth. And we will walk right into the dead end, nodding along because the directions were given so politely."

bicepjai 1 day ago |

My experience with Claude models starting from version 4.7 has led me to conclude that I would never trust Claude to produce error-free code. Given this baseline, I lack confidence in statements or cards (such as a 200-page document) of this nature.

tasuki 1 day ago |

> To take just one example: today, Anthropic engineers on average ship 8x as much code per quarter as they did from 2021-2025.

Oh I have no doubt. With 8 times the number of bugs too? Have they solved flicker in Claude code yet?

w10-1 about 23 hours ago |

This is relevant because Anthropic is currently cast as serving mainly the coding market.

If/since their AI+process can help build new models, they can target other markets, and other companies seeking to build for such markets will partner with them first.

There's no moat and little first-mover advantage in the general-purpose AI, but there may be both in specialized AI.

Also, there are other reasons to get better. Changing how you build models can enable you to adapt to different hardware, avoiding the current Nvidia margins.

The difference between early Yahoo and Google was mainly that Google was the adult in the room: minimally invasive and mostly helpful. The early goodwill towards Google has reaped decades of rewards. I see OpenAI and Anthropic playing out the same way.

The amplifier here is the reputational risk of partnering with one or the other; I think companies would prefer to be Anthropic's partner because it's demonstrating more care, and it's less likely to horn in on the partner market (as a provider for coding but an enabler for other markets).

These attractive second-order derivatives - flywheel effect, monopoly power - are often claimed, but Anthropic is mainly providing evidence to track actual progress.

(However, if I were head of messaging at Anthropic, I would rigorously stay away from treating AI as a person; it's an agent, a delegate of humans. So I'd never say AI could build itself, just that we're getting better at building better models with AI).

docheinestages 1 day ago |

Elon, is that you? [1]

[1] https://www.theguardian.com/technology/2023/mar/31/ai-resear...

reinhash about 19 hours ago |

It is hard to distinguish hype from reality these days, especially with Anthropic's IPO around the corner.

But to their credit, I was very sceptical about the statements that "90% of the code will soon be written by AI", and even though we might not be at that point, I am surprised how far LLMs have gotten and how useful they have become. I can hardly imagine developing software the "old" way where I actually write my code by hand, like I used to back in the day. The frontier models have become so powerful that I find myself in moments of surprise, where the LLM actually thought of edge cases that I would have missed.

delichon 1 day ago |

Is this the moment when the AI gets permission to approve its own PRs:

https://www.italianrenaissance.org/wp-content/uploads/2012/0...

Or is this?

https://www.egypttoursportal.com/images/2024/02/Ouroboros-Sy...

zhoBEENG 1 day ago |

This reads like marketing fluff, but I am reminded of John von Neumann's "Theory of Self-Reproducing Automata"; that the very first people who worked on deductive machines immediately started thinking about machines building themselves, and what the rules of that would look like. I am not surprised that during the inductive revolution we are having similar thoughts.

lkm0 about 21 hours ago |

It makes me wonder that despite the fast improvements in model capacity (and the claims) we're still using variations on a 9-year old architecture. How is it that we haven't been able to use LLMs to actually improve that?

freakynit 1 day ago |

This is one more piece of marketing BS before their IPO.

These things work, but the code they write is extremely clever.. that means, it's unmaintainable code. Good for small projects or one-off tasks, large-scale projects however, are a different game altogether.

Large-scale projects are 95%+ maintenance. Cleverly written code makes that maintenance a nightmare, and extremely fragile.

I use them for localized tasks... very very specific, localized inputs, with exactly what should be done and what the contracts the new code will be consuming and exposing.

For open-ended tasks, they write working code that is unmaintainable.

adamddev1 1 day ago |

I am watching websites and Microsoft apps get slower and buggier before my eyes. We are descending into vibe-psychosis and chaos.

morisil 1 day ago |

Quite aligned with my own experience from harness engineering and winning AI4Science hackathon. During the hackathon I was working as a human optimizer, moving the feedback from test harness running on Claude Code, back to my local Claude Code for analysis-hypothesis-proposal cycle. And in this moment I realized that 2 Claudes talking to each other could actually scale much better.

nicogentile about 14 hours ago |

The article seems nice and elegant but I don't get much of the point. The visual is super elegant, but this is the kind of note where after 6 months we are going to see some shitty result and we are going to come back here and blame the AI. Hope that doesn't happen.

pineapple_opus about 22 hours ago |

Eye catching - the "Open ended problems" Claude Code session success rate jumped from 20% (pre Opus 4.5 release) to 70% some time after Opus 4.6 was released.

saadn92 1 day ago |

I read most of the article and came to the conclusion that if what they're describing is so revolutionary, then why do they still need to hire people? Why not just have these systems take full control?

macwhisperer about 19 hours ago |

the HITL (human in the loop) is basically the single point...AI is a mirror..

it only "exists" when you talk to it.. much like your reflection in the mirror is only there when you're in view.

models can never be self-improving because it can never have "self". it can only mirror the appearance of self.

what's actually happening is "symbiotic group improvement".

our brains are resonant.. for those of us who are brilliant, getting leverage with ai just means that our innovative ideas become louder and more physically real every day.

eventually everything worth building will be built for free and made readily available.. no more "profiteering"

it's Jevons paradox "efficiency breakthrough -> effort reduces -> growth potential rises -> transformative gains happen"...

some of us are in the "transformative phase"..

others haven't seen the "breakthrough moment" yet, but they will soon.

leevilux about 15 hours ago |

Wouldn't self-improvement mean that the LLM changes its neural network (i.e. the weights or layers or back propagation algorithm etc) or modify its training data?

xg15 about 20 hours ago |

2025: If we aren't really careful with AI it will start to recursively improve itself and grow into an unstoppable superintelligence that will eradicate humanity!

2026: Working hard to make that recursive self-improvement a reality! Any minute now...

cyrc 1 day ago |

it's vital for them to have self validation for exponential rsi.. and this human distillation of human in the loop debugging ai models is needed even though they have judge models handling parallel speculative execution.

labs have parallel speculative execution. they spawn hundreds of agent branches, validate them internally with AI judges and only show the user the successful result.

free users are using sequential single-turn generation. the model requires and waits for the human to debug, fix and re-prompt.

by forcing a human to act as validator. they are capturing high value correction trajectories (Bad Output --> Human fix). They are using your cognitive labour to train judge models and validator agents needed to automate the internal verification step, eventually closing the loop for fully autonomous recursive self-improvement.

human in the loop debugging isn't a bug; it's the necessary training signal for the self-validating agents required for exponential recursive self improvement. With new 'distilled judge' models landing in 2026, this article means that they might have gathered enough data. we might be in the final phase..
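The "spawn branches, score them with a judge, show only the winner" loop described above is essentially best-of-n sampling. A minimal sketch follows, with everything in it hypothetical: `best_of_n`, `toy_generate`, and `toy_judge` are stand-ins rather than any lab's pipeline, and the branches run sequentially here instead of in parallel.

```python
import random
from typing import Callable, Optional


def best_of_n(prompt: str,
              generate: Callable[[str, random.Random], str],
              judge: Callable[[str, str], float],
              n: int = 8,
              threshold: float = 0.5,
              seed: int = 0) -> Optional[str]:
    """Sample n candidates (n >= 1), score each with a judge, return the best.

    Returns None when no candidate reaches the threshold, which is the case
    where a human would otherwise have to debug and re-prompt.
    """
    rng = random.Random(seed)
    candidates = [generate(prompt, rng) for _ in range(n)]
    scored = [(judge(prompt, c), c) for c in candidates]
    best_score, best = max(scored, key=lambda pair: pair[0])
    return best if best_score >= threshold else None


# Toy stand-ins: a "model" that appends a random digit, a "judge" that likes high digits.
def toy_generate(prompt: str, rng: random.Random) -> str:
    return f"{prompt}-{rng.randint(0, 9)}"


def toy_judge(prompt: str, candidate: str) -> float:
    return int(candidate[-1]) / 9.0


print(best_of_n("task", toy_generate, toy_judge, n=16))
```

The whole scheme is only as good as the judge, which is where the comment's point about human correction data as a training signal would come in.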

dwa3592 1 day ago |

To anyone who works at anthropic : I recently downgraded from Max to Pro out of frustration. Last few weeks my token(usage) burn was just too fast and I couldn't explain it because my actual usage was less than the last few months. I ended up thinking it's probably a bug that you guys shipped. The above article makes me think that it's probably claude who shipped the bug and your human missed it in their review.

Dominic_P 1 day ago |

My biggest question (maybe this has already been taken care of) is the issue of garbage in and garbage out. If the LLM produces bad content then that is used to train another model, how do we stop them from keeping their blindspots across models?

bconsta 1 day ago |

Seems ironic that Claude isn't listed as a contributor to this article.

If it was used in writing the article, why not list it? If it wasn't used, that seems to go against Anthropic's whole message.

Obviously readers value human-written content more, but isn't it in their interest to attempt to destigmatize llm output as much as possible?

sega_sai 1 day ago |

Seeing the words "recursive self-improvement" I was expecting something else from the article. E.g. how the transformer architecture or agent design is being changed/improved through LLM automation, but the article mostly talks about the LOC counts.

artninja1988 1 day ago |

The mythos public release will be a big indicator if the Anthropic and SF story of transformational ai soon holds any water imo

abalashov 1 day ago |

"It is genuinely unclear whether today’s training methods and architectures could unlock that capacity."

Aye.

zkmon about 21 hours ago |

Not the first time. There were calls for NPT treaties etc over the decades. It is irreversible by design. Competition and ownership is the driving force.

butler14 1 day ago |

Warming up for that IPO

gloosx about 20 hours ago |

I'm so sick of this anthropics marketing stuff... claude is an ultra-success (according to claude judge), “good code”, bragging about creating 8x more bugs and tech-debt. claude writes code that works, yeah, sure anthropic, we saw that claude code leaks, some amazingly "good" code in there

squidsoup 1 day ago |

It's comforting to know that Anthropic's most capable model, Mythos, is named for the Lovecraftian universe replete with horrifying evil gods with complete indifference to humanity. Nothing at all to worry about.

mactavish88 1 day ago |

Recursive self-improvement towards what exactly?

Living organisms evolve towards some notion of "better", and "better" is an incredibly multifaceted notion (many facets of which we simply cannot even capture in language).

darepublic 1 day ago |

the tooling has quite a ways to go to catch up to the llm engines that drive the real value. I have encountered various codex bugs (I know not anthropic) which tell me that.. these billion dollar companies, if they are eating their own dog food, can still release buggy crap software.

ramaseshanms 1 day ago |

It's possible that Andrej Karpathy could have been hired for scaling his vision on the auto-research repo. (His version of "AI that builds itself")

BatmansMom 1 day ago |

How are these animations being made? I'd love to get a blog post on them. If it's AI I'd love to know the workflow, but something tells me there is a lot of human creative input

Aperocky 1 day ago |

Anthropic is the most self hyped company I've seen, to the point that I'm wondering what would happen to its employees if they held a different opinion. Do they just.. keep it to themselves? For instance, some Anthropic employees might hold the completely rational opinion that none of this is going to lead to AGI, but I never hear that from them.

The metric being tracked, code commits, is hilariously one sided. Philosophically, if you had one part of your work now practically free, you'd like to utilize that freedom to maximally cover for the other parts, for instance:

Instead of thinking about edge cases with brain and whiteboard, you can have the LLMs simply generate most possibilities, including tests for them, because that is cheaper. There are probably 50x more commits, of which 40 will be revert pairs, but we are only twice as fast. And in reality nothing changed because the outcome remains the same. I can't see how it is necessarily different in the LLM space.

stego-tech 1 day ago |

I am getting real sick of these sorts of alarmist posts coming from AI labs that do everything in their power to prevent the very policy reforms they advocate for in these posts or PR appearances. Commercial AI labs like Anthropic continue behaving like the gambling (“bet responsibly”), alcohol (“drink responsibly”), and firearms industries, and folks keep giving them the benefit of the doubt (and free PR on HN) every single time.

If AI was dangerous, if AI was going to replace jobs, and if policymakers needed to urgently pass legislation protecting the human populace from these realities, then why the actual fuck do they keep lobbying to block these very things in the first place?

Hypocrisy of the worst kind, I say. Here they are again fresh off another outage, with their IPO draft filed, at a time of increasing public opposition to AI, with costs rising, to once again ply scare tactics for money.

Disgusting.

bottlepalm 1 day ago |

I'd use number of commits as a metric versus lines of code. A commit is generally a unit of work - regardless of the lines of code added/removed. It'd be interesting to see the metrics in terms of commits. I'm sure it's still an order of magnitude jump. Personally I'm flying with my own projects with AI, lots of commits, but I really try to minimize lines of code added. If I can remove and simplify existing code so the balance of lines added on commit are minimal - that's the path to a better quality app overall.
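A small sketch of the difference between the two metrics, using made-up commit data (the `Commit` record and `summarize` helper are illustrative, not tied to any real git tooling):

```python
from dataclasses import dataclass


@dataclass
class Commit:
    added: int    # lines added by the commit
    removed: int  # lines removed by the commit


def summarize(commits):
    """Compare a lines-based view of the work with a commit-based one."""
    total_added = sum(c.added for c in commits)
    total_removed = sum(c.removed for c in commits)
    return {
        "commits": len(commits),
        "lines_added": total_added,
        "net_lines": total_added - total_removed,
    }


history = [Commit(120, 0), Commit(10, 200), Commit(30, 25)]
print(summarize(history))  # {'commits': 3, 'lines_added': 160, 'net_lines': -65}
```

Three units of work were done here, yet the net line count is negative, which a lines-added metric would not reflect.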

sonink 1 day ago |

Broadly agree with this position - I think there are some people skeptical that Anthropic is doing this for regulatory capture - but I think they are being honest about what they are seeing and how regulation should catch up.

I for one believe that we should pause all work on AI for the foreseeable future. This is almost impossible to orchestrate - but we should still try nevertheless. Maybe we are not able to pause, but we are able to slow down. That might give us more room to maybe be able to pause in the future. But going ahead is too dangerous.

And it's not just Anthropic which is saying this. Even Geoffrey Hinton has said the same thing. If there is a non-zero chance that AI can kill all of humanity, and both Geoffrey and Anthropic have the same position, then it makes sense for us to be a hundred percent sure before we move ahead. Dario/Anthropic have already made their money from AI, maybe they are just being honest about what they think lies ahead.

aleqs 1 day ago |

Okay, so Anthropic has amazing AI which supposedly writes most of their code and can continuously improve... meanwhile they have outages on a regular basis, and any kind of long-running work will now consistently hit 'API Error: Server is temporarily limiting requests'. Not sure if this is intentional to force a reduction of token usage, but at this point I need to build around these throttling limits and outages with my own tools to restart/resume sessions. From my experience, in the last 2 weeks, literally 100% of any non-trivial Claude session/work will now be blocked on these issues, requiring manual intervention.

One of my focuses now is my own model-agnostic, harness and workflow orchestration (I know everyone is building these) , baselining on opus, and aiming to transition to Chinese models like deepseek in the short term and hopefully open, self hosted models in the future (which I plan to open source).

The nonstop marketing fluff from anthropic while their service quality and availability noticeably degrades... just continues to destroy my trust in the company.

hgoel 1 day ago |

As usual, I find the AI-related discussion here to be hopelessly hysterical and conspiratorial. I get the impression that a large chunk of people have only read the title and assumed Anthropic is referring to recursive self-improvement in the runaway singularity sense.

One of the examples they provide, of giving Claude the task of training a small AI model, then asking it to improve certain benchmarks, is essentially Karpathy's AutoResearch. This is already known to work. While calling it "self-improvement" is perhaps a stretch, it is describing a capability current gen AI has, that anyone can test and I have been using to great effect.

I disagree with their conclusion, I think this kind of self-improvement will hit an asymptote, where every subsequent model can only make smaller and smaller improvements.
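One way to make that asymptote concrete, assuming purely for illustration that each generation's gain shrinks by a constant factor $r$ (the symbols $C_0$, $g$, and $r$ are hypothetical and not taken from the article): if the first generation adds $g$ to a starting capability $C_0$ and the $k$-th adds $g r^k$, the total is a geometric series with a finite limit.

```latex
C_n = C_0 + g \sum_{k=0}^{n-1} r^k
    = C_0 + g \, \frac{1 - r^n}{1 - r}
    \;\xrightarrow{\; n \to \infty \;}\;
    C_0 + \frac{g}{1 - r},
\qquad 0 < r < 1 .
```

With $r = 1/2$, for example, all future generations combined add only $2g$, twice what the first one did. If instead $r \ge 1$, the sum grows without bound, which is the runaway case.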

_pdp_ 1 day ago |

I don't read anywhere how much code they are talking about and what programming language. I think those are useful metrics.

qwery 1 day ago |

This is incredible.[0]

Please, IPO now. File the paperwork.

> To take just one example: today, Anthropic engineers on average ship 8x as much code per quarter as they did from 2021-2025.

Do you have another example?

Engineers don't ship [period] for no reason. So, either:

- Those aren't engineers, or

- they are literally dying of shame & embarrassment right now, or

- you measured something that indicated that this was a useful thing to do and have elected to share an overtly, catastrophically flawed metric instead.

[0] as in a total lack of credibility

semessier 1 day ago |

What could go wrong in the recursive loops that are probably running 24/7 today? Attended or unattended makes almost no difference any more; no human can grasp the probably numerous changes per iteration. This is outright dangerous.

damowangcy 1 day ago |

AI tech bro:

Month 1 - 6 months to AGI

Month 2 - We will Replace all jobs

Month 3 - Okay maybe only the SWEs, programming is solved

Month 4 - Announce model that is too dangerous to release

Month 5 - Releases dangerous model

Month 6 - This is it! We will replace AIs with more AIs (*secretly files for IPO)

AI is here to stay, like it or not, but it is not the solution to everything. If it is, what is Anthropic's moat? A better model? I don't see any ecosystem being built by them, as MCP is almost obsolete except for some very niche use cases. And they're doing stuff that a non-profit version of OpenAI would do. Can we trust a for-profit company to stand against their investors during a conflict of interest? Because running a company for maximum profit and being ethical are two different ends of the spectrum.

jasongill 1 day ago |

"My CPU is a neural-net processor - a learning computer" springs to mind

snick3rz_ about 22 hours ago |

Facially this smells of puff. That doesn't mean it's all false. It means be wary of anything that doesn't have a critical thing to say.

techblueberry 1 day ago |

> A caveat: Lines of code is an imperfect measure, as it measures quantity over quality. So 8× lines of code/engineer/day in the second quarter of 2026 is almost certainly an overstatement of the true productivity gain. Nonetheless, it indicates an acceleration. At Anthropic, we don’t reward people for how many lines of code they write; rather, team members are producing more code simply because they’re using AI systems to write more code.

I simultaneously think the AI revolution is making real revolutionary gains and am mystified by the lying.

An accurate translation seems to be “we made this shit up, but it feels right”

eranation about 21 hours ago |

All this singularity trajectory is really interesting. If they manage to build a model that is capable of building the next version of Claude (model and tooling) - wouldn't it be in their interest at some point to keep it to themselves?

If we ever get to a point where the centaur period is over (when human + AI is not better than just AI) then what competitive advantage can ANY human have other than

- the money they already have

- luck?

- a good idea and good taste but if we assume AI can do better than any human, that also goes out the window

So this whole singularity leads to a place where no one is really needed. The only thing that will "save us" (other than a "The Expanse"-like world / UBI) is if there is no demand for the supply of AI work, even if it's better. (For example, there is demand for seeing Magnus Carlsen play; there is no demand for the Stockfish on my phone getting into a stalemate with another Stockfish on another phone. Also, people like to watch humans compete with humans; there is no demand to see a race between Usain Bolt and a rocket.) So it depends on people not buying AI-generated stuff. (We'll get to a point where everyone assumes something is AI-generated, because AI might reach a point where it is not as easy to identify, e.g. it will stop looking like slop... but I believe services that give you third-party evidence that something is "human generated" can happen, again all based on supply and demand...)

So as we near singularity... All it takes is one open weights model, and one open harness that is capable of self improvement, and Anthropic's entire moat is gone. That open weight model might even be built with Claude Code + Mythos (once it's released).

But don't worry, all moats will be gone and we'll all just do yoga, read books and connect to each other because AI will produce everything for free using renewable energy, right? Or we'll all become batteries in a simulation, probably something in between.

taormina about 22 hours ago |

So, is this what they call Opus 4.8? Improvement?

geodel 1 day ago |

It will be so powerful that it can't be trusted with any earthly person.

swader999 1 day ago |

IPO IPO IPO!!!

georgehotz 1 day ago |

The world has been recursively self improving for millennia. Similar to Scientology, this is a cult pushing sci-fi nonsense. They are just coupled to an LLM lab to give their stories an air of seriousness. Imagine Scientology starting to make laptops.

replwoacause 1 day ago |

I love that animation, really cool

0xbadcafebee about 22 hours ago |

You can't predict the future, and neither can Anthropic. Nothing gets better forever. Everything plateaus or gets worse.

This whole set of imaginary scenarios is based on a single company writing code that isn't even that complicated and represents a single product line for a single company in a single industry. You might wanna see this replicated in at least one other scenario first before you call it on the AI gods enslaving humanity. These imaginary scenarios also depend on a logistical, financial, & geopolitical system that is unsustainable & will be curtailed in the near-future one way or another.

They keep referring to this as intelligence - it isn't. It can't actually learn. It can just code in a loop. That isn't learning. It can't do real RL with meaningful persistent semantic memory in a realistic timeframe or cost, and it can't reason accurately outside of predetermined scenarios (hell, most of the models still can't tell time). It still can't do what a 4 year old can do. So let's cool it on the dreams of benevolent god-machines or whatever.

The tech industry has been a farce for years. We sit here in this bizarre artificial echo chamber and imagine that the whole world revolves around us, when in reality the whole world is limited by us. If a recursive self-improvement loop replaces us all, it will be a boon to the world, as the world won't be limited by this industry's stupidity anymore. But considering that the world is not actually run by tech bozos, harms and uncertainties brought by AI will be pushed back on and reined in by normal people, as always happens with new technologies. An AI can't engineer its way around politics. The self-improvement loop is just as likely to be outlawed as it is to actually work outside of Anthropic's walled garden.

brazukadev about 15 hours ago |

When claude code removes React from its own code I'll believe that.

cess11 1 day ago |

'“Good code” means two things: it works, and it is written in a manner that allows another engineer to understand it and build upon it.'

I disagree with this. Good code is easy to change, which is much harder to accomplish than code that can be added to.

"If technical trends in advancing capabilities continue, and AI systems are able to develop the capabilities inherent to transformative human ingenuity, then it is plausible that AI systems could design and refine themselves."

I find the first premise weak and implausible, and the second one is obviously false. To me it comes across as an insult to the reader.

snick3rz_ about 22 hours ago |

This is facially a puff piece. That doesn't mean it's all false. It means be wary of anything that doesn't have a critical thing to say.

holoduke 1 day ago |

I have a claw that is instructed to make at least 500 PRs per day. It uses Claude, Gemini and OpenAI and runs basically every few minutes. I use online forums as input for the claw: Moltbook, reddit etc. It's quite funny how it tries to improve itself. But to say it really creates a new skynet? Nah. Not at all. It's more a clutter of useless features or incomprehensible code restructuring.

4ffs 1 day ago |

They're making a mistake with this continued self-hyping. At some point even the dumbest of prospective investors won't buy it.

amelius 1 day ago |

Does this train on LLM output, or is this more like iterative self prompt improvement?

EGreg about 21 hours ago |

RSI is dangerous. That is why we designed CDE:

https://safebots.ai/declarative.html

deterministic about 24 hours ago |

I have used custom code generators for years, generating 90+% of the code needed to write a typical biz application. Claude Code is useful and I use it every day. But it still hasn't beaten the productivity of my code generator.

ReptileMan 1 day ago |

Anthropic is all talk and no delivery last few months. This cry for pause is just them realizing they have no moat at all.

kylehotchkiss 1 day ago |

Isn't this like a perpetual energy machine? Or wouldn't entropy start kicking in and the quality of the system begin to degrade over time? (philosophically I don't believe AGI is an achievable thing)

SimianSci 1 day ago |

Anthropic is looking to IPO here soon. A key aspect of this is to prove profitability.

Shifting their focus from Training new models to instead serving inference, they would greatly reduce their spend. In fact this is something being reported on that they are already doing, which is the reason for their first ever profitable quarter.

It's awfully convenient that the company which has greatly reduced its spend on training is now asking for a slowdown in this area.

vblanco 1 day ago |

Another article about how anthropic wants to ban everyone except themselves and destroy open source and Chinese AIs.

margorczynski 1 day ago |

The closer to the IPO the more marketing drivel we'll get from both Anth and OpenAI.

adverbly 1 day ago |

Lol they're using lines of code as a KPI?

Come on guys...

That is making me less impressed not more impressed!

chilipepperhott 1 day ago |

I find any and all claims like this ridiculous from a company that can't build a terminal application that uses less than a gigabyte of RAM.

bitwize 1 day ago |

After several months with their top engineers and state-of-the-art AI on the job, Anthropic managed to "reduce flickering by 85%" on their TUI Claude Code client, which is built in fucking React and rendered by drawing the entire chat conversation each time (hence the flicker). I think they've since eliminated it completely by slapping some double-buffering around it (since "our client is actually a real-time game engine" after all). Meanwhile for decades Emacs and Vim have had an optimizer built into their display cores that solves for the minimum set of terminal escape commands it takes to transform the screen from a given old state to a desired new state.

You will forgive me when, between muted snickers, I express considerable doubt that Anthropic will be able to bring its AI to a point of "self-improving" any time soon.
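
To make the old-state/new-state idea concrete, here is a deliberately simplified sketch. It is not what Emacs or Vim actually do (their display code also optimizes within a line and weighs the cost of different escape sequences); it is a plain row-level diff that repaints only the rows that changed. The function name `diff_rows` is hypothetical; the escape sequences are the standard ANSI cursor-position and erase-line codes.

```python
def diff_rows(old, new):
    """Return ANSI escape strings that repaint only the rows that changed.

    old and new are lists of strings, one per terminal row, of equal length.
    """
    out = []
    for row, (before, after) in enumerate(zip(old, new), start=1):
        if before != after:
            # ESC[<row>;1H moves the cursor to column 1 of that row (1-based);
            # ESC[2K erases the whole line before the new text is written.
            out.append(f"\x1b[{row};1H\x1b[2K{after}")
    return out
```

Writing `"".join(diff_rows(old, new))` to the terminal touches only the changed rows, so unchanged parts of the screen are never cleared and redrawn. That is the basic reason a diffing renderer does not flicker the way a full repaint can.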

andrewlin247 1 day ago |

Imagine showing this article to yourself three years ago

newsicanuse 1 day ago |

pre IPO truck load of crap

esafak 1 day ago |

> In the coming months, we will organize conversations where policymakers, researchers, civil society, and other AI companies can help answer some of the questions this piece raises, especially around full recursive self-improvement and how to create better options for coordination and deliberation.

If they wanted to they could have convened an international forum with commercial and political stakeholders years ago. Less talk, more do.

deterministic about 24 hours ago |

I call BS on this. For an LLM to recursively improve itself it would need to (small step) improve the training data and/or (big step) come up with fundamentally new architectures superior to transformers. The small step improvements might be doable. But nobody is making any claims about the big step improvements.

mrandish 1 day ago |

Was anyone else fished in by the title and disappointed? After some broad introductory discussion of RSI, the article was almost entirely about LLM coding. While there are some metrics for unattended agentic coding, it doesn't discuss "When AI builds itself" (beyond 'not now') or any progress specifically toward actual recursive self-improvement. I'm very interested in any empirical evidence of meaningful progress in RSI, so... this felt deceptively titled.

To me, unattended agentic coding is not RSI, in the same way a self-reloading "Unattended 3D printer" is not at all a "3D printer that recursively prints complete 3D printers in which each generation is significantly faster and more advanced than the last." The "unattended" part is obviously necessary but hardly sufficient. The article tacitly assumes LLM progress to be something like 1: Unattended agentic coding, 2: AGI, 3: RSI. I suspect that third step should be labeled "not to scale."

I'm increasingly convinced that actual Full Foom RSI (FF-RSI) is on a radically different scale than the first two. Just leaving it unaddressed is like assuming: Step 1: Manned space station, Step 2: Manned Mars base, Step 3: Manned Alpha Centauri base, are "just logical next steps." FF-RSI requires sustaining superlinear, recursively amplifying cognitive returns along a specific directed path - and we currently have no empirical evidence that such returns can exist for artificial OR biological intelligences. Large collectives of the smartest humans alive (Bell Labs, IAS, etc) haven't just failed to get anywhere close to reliably sustaining that, we can't even reliably predict non-recursive, single occurrences or even imagine any way all 8B humans could fully mobilize to predictably achieve non-recursive, single occurrences.

The only prior we have for open‑ended intelligence improvement is biological evolution which shows extremely slow and unreliable sublinear returns at best. And even if unbounded, recursive self‑improvement is physically possible, it may be practically unachievable due to asymptotic economic, resource and other barriers in the same way approaching light speed requires exponentially more energy. I think it's plausible, and maybe probable, that AIs achieve true super-human intelligence in a decade and yet still won't achieve FF-RSI for centuries, if ever. To me, absent compelling evidence to the contrary, that's the reasonable Null Hypothesis. Even if you feel that's too pessimistic, it seems reasonable to expect any serious discussion of "Progress Toward RSI" to first discuss why it might even be plausible that 1: Miles, 2: AU (Astronomical Units), and 3: Light Years belong on the same scale, instead of just assuming it like the meme's empty "Step 3. .... " before moving on to "Step 4. Profit!" (or "IPO!" but very, very responsibly).

gabrieledarrigo 1 day ago |

> AI that can build itself would be a major development in the history of technology—one that could bring enormous good for the world

I really can't stand these guys anymore...

simianwords 1 day ago |

Sorry but if AI can build itself then it can run companies the size of 3000 people with just a few people. Or even bigger. What are the consequences?

llmslave 1 day ago |

I cannot wait for these models to tear down traditional social hierarchies. We haven't even begun to see the effects, fingers crossed

reducesuffering 1 day ago |

Anthropic has finally come around to what others have already realized far sooner. Little time left now. Notice how shallow the arguments and consistently wrong the AGI naysayers have been year after year.

https://intelligence.org/agi-ruin/

mofeien 1 day ago |

> If it were possible to effectively slow the development of this technology to give ourselves more time to deal with its immense implications, we think that would likely be a good thing

Even Anthropic wants to Pause AI now. There must really be not much time left for "edging". Please write to your lawmakers, no matter whether you are in the US, Europe, China, or elsewhere. Only an international agreement between governments can enforce an AI-Pause and eliminate the necessity to dangerously push the frontier.

https://pauseai.info/