Hacker news

Cursor's latest “browser experiment” implied success without evidence (https://embedding-shapes.github.io)

660 points by embedding-shape 1 day ago | 288 comments

pavlov about 23 hours ago |

The comment that points out that this week-long experiment produced nothing more than a non-functional wrapper for Servo (an existing Rust browser) should be at the top:

https://news.ycombinator.com/item?id=46649046

paulus_magnus2 1 day ago |

The blog[0] is worded rather conservatively but on Twitter [2] the claim is pretty obvious and the hype effect is achieved [2]

CEO stated "We built a browser with GPT-5.2 in Cursor"

instead of

"by dividing agents into planners and workers we managed to get them busy for weeks creating thousands of commits to the main branch, resolving merge conflicts along the way. The repo is 1M+ lines of code but the code does not work (yet)"

[0] https://cursor.com/blog/scaling-agents

[1] https://x.com/kimmonismus/status/2011776630440558799

[2] https://x.com/mntruell/status/2011562190286045552

[3] https://www.reddit.com/r/singularity/comments/1qd541a/ceo_of...

embedding-shape 1 day ago |

I'm eager to find out if this was actually successfully compiled at one point (otherwise how did they get the screenshots?), so I'm running `cargo check` for each of the last 100 commits to see if anything works. Will update here with the results once it's ready.

Edit: As mentioned, I ran `cargo check` on each of the last 100 commits, and it seems every single one of them failed in some way: https://gist.github.com/embedding-shapes/f5d096dd10be44ff82b...
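
For anyone who wants to repeat this kind of check, here is a minimal Python sketch of walking the last N commits and running `cargo check` on each. It is not the script used for the gist above; the helper names (`git`, `current_ref`, `check_history`, `summarize`) are made up for illustration, and it assumes `git` and `cargo` are on the PATH and that it runs against a throwaway clone, since it force-checks-out each commit.

```python
import subprocess


def git(repo, *args):
    """Run a git command inside `repo` and return its stdout as text."""
    done = subprocess.run(
        ["git", "-C", repo, *args],
        check=True, capture_output=True, text=True,
    )
    return done.stdout.strip()


def current_ref(repo):
    """Return the checked-out branch name, or the commit hash if HEAD is detached."""
    try:
        return git(repo, "symbolic-ref", "--quiet", "--short", "HEAD")
    except subprocess.CalledProcessError:
        return git(repo, "rev-parse", "HEAD")


def check_history(repo, n=100, command=("cargo", "check")):
    """Map each of the last `n` commit hashes to True if `command` exits with 0 there."""
    start = current_ref(repo)
    results = {}
    try:
        for commit in git(repo, "rev-list", f"--max-count={n}", "HEAD").split():
            # --force discards local changes (e.g. a rewritten Cargo.lock),
            # which is why this should only run on a throwaway clone.
            git(repo, "checkout", "--quiet", "--force", "--detach", commit)
            done = subprocess.run(list(command), cwd=repo, capture_output=True)
            results[commit] = done.returncode == 0
    finally:
        # Always return to where we started, even if a step above raised.
        git(repo, "checkout", "--quiet", "--force", start)
    return results


def summarize(results):
    """Return (passed, failed) counts for a results mapping."""
    passed = sum(1 for ok in results.values() if ok)
    return passed, len(results) - passed
```

Usage would look like `summarize(check_history("/path/to/throwaway-clone"))`, where the path is a placeholder. Passing a different `command` tuple (for example `("cargo", "build")`) checks a stricter condition at a higher cost per commit.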

deng 1 day ago |

If you look at the original Cursor post, they say they are currently running similar experiments, for instance, this Excel clone:

https://github.com/wilson-anysphere/formula

The Actions overview is impressive: there have been 160,469 workflow runs, of which 247 succeeded. The workflows are failing because they have exceeded their spending limit. Of course, the agents couldn't care less.
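
To put those two numbers in proportion, a quick back-of-the-envelope calculation in Python. The function name `success_rate` is only illustrative; the inputs are the figures quoted in the comment above.

```python
def success_rate(succeeded, total):
    """Return the fraction of runs that succeeded."""
    return succeeded / total


rate = success_rate(247, 160_469)
print(f"{rate:.3%}")                    # prints 0.154%
print(f"about 1 in {round(1 / rate)}")  # prints about 1 in 650
```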

ankit219 about 23 hours ago |

Like it or not, it's a fundraising strategy. They have followed it multiple times (e.g. earlier vague posts about how much code their in-house model is writing, online RL, lines of code, etc.) and it was less vague before. They released a model and did not give us the exact benchmarks or even tell us the base model for the same. This is not to imply there is no substance behind it, but they are not as public about their findings as one would like them to be. Not a criticism, just an observation.

wilsonzlin about 13 hours ago |

Hey, Wilson here, author of the blog post and the engineer working on this project. I've been reading the responses here and appreciate the feedback. I've posted some follow up context on Twitter/X[0], which I'll also write here:

The repo is a live incubator for the harness. We are actively researching the behavior of collaborative long running agents, and may in the future make the browser and other products this research produces more consumable by end users and developers, but it's not the goal for now. We made it public as we were excited by the early results and wanted to share; while far off from feature parity with the most popular production browsers today, we think it has made impressive progress in the last <1 week of wall time.

Given the interest in trying out the current state of the project, I've merged a more up-to-date snapshot of the system's progress that resolves issues with builds and CI. The experimental harness can occasionally leave the repo in an incomplete state but does converge, which was the case at the time of the post.

I'm here to answer any further questions you have.

[0] https://x.com/wilsonzlin/status/2012398625394221537?s=20

Snuggly73 about 24 hours ago |

The latest commit now builds and runs (at least on my Mac). It’s tragically broken and the code is…dunno…something. 3m lines of something.

I couldn’t make it render the apple page that was on the Cursor promo. Maybe they’ve used some other build.

geooff_ 1 day ago |

I think the original post was just headline bait. There is such a fast news cycle around AI that many people would take "Thousands of AI agents collaborate to make a web browser" at face value.

jadenpeterson about 18 hours ago |

For my 11th or 12th birthday, I got a pet porcupine and I was ecstatic. It was my first pet, and I spent hours researching what they eat, what habitats they like, etc. I carefully curated my room to accommodate him (him being 'Sonic'), even keeping it clean for the first time in forever so I wouldn't lose him amidst the mess of soiled undergarments and such. He loved it, and I loved him. Of course, it made no difference when my uncle sat on him on Christmas morning. We rushed him to the vet, but they told us his scans showed fractures on several vertebrae or something like that. We took him home, and waited for him to die, but the waiting was too painful. I'll spare the details, but what transpired next involved my dad, his shovel, and a lot of tears.

About an hour later, we got a call from the vet - they'd misread the scan, and Sonic was gonna be fine. I think I was traumatized at the time, but the whole thing later became an inside joke (?) for my family - "Don't kill your porcupine before the vet calls" (a la "Don't count your chickens before they hatch").

I guess my point, as it pertains to Cursor, its AI offerings, and other corporations in the space is that we shouldn't jump the gun before a reasonable framework exists to evaluate such open-ended technologies. Of course Cursor reported this as a success, the incentive structure demands they do so. So remember - don't kill your porcupine before the vet calls.

elzbardico about 5 hours ago |

I think that the companies that have the mindset "Let's give engineers tools that can leverage their strengths and eliminate toil" have way more success than those scammy "get-rich-fast let's automate software development and stop paying those sv salaries, invest in us!!!" gigs like Cursor and Devin.

Their whole attitude leads to them wasting time with those Wile E. Coyote plans instead of building good products like Amp.

nindalf 1 day ago |

The CEO said

> It's 3M+ lines of code across thousands of files. The rendering engine is from-scratch in Rust with HTML parsing, CSS cascade, layout, text shaping, paint, and a custom JS VM.

"From scratch" sounds very impressive. "custom JS VM" is as well. So let's take a look at the dependencies [1], where we find

- html5ever

- cssparser

- rquickjs

That's just Servo [2], a Rust-based browser initially built by Mozilla (and now maintained by Igalia [3]) but with extra steps. So this supposed "from scratch" browser is just calling out to code written by humans. And after all that it doesn't even compile! It's just plain slop.

[1] - https://github.com/wilsonzlin/fastrender/blob/main/Cargo.tom...

[2] - https://github.com/servo/servo

[3] - https://blogs.igalia.com/mrego/servo-2025-stats/

thedelanyo 1 day ago |

These are stories that exist solely to sell shovels and would cause one uninformed CEO to lay off actual humans.

chaosprint about 23 hours ago |

I really doubt this marketing approach is effective. Isn't this just shooting themselves in the foot? My actual experience with Cursor has been: their design is excellent and the UX is great—it handles frontend work reasonably well. But as soon as you go deeper, it becomes very prone to serious bugs. While the addition of Claude's new models has helped somewhat, the results are still not as good as Google's Antigravity (despite its poor UX and numerous bugs). What's worse, even with this much-hyped Claude model, you can easily blow through the $20 subscription limit in just a few days. Maybe they're betting on models becoming 10x better and 10x cheaper, but that seems unlikely to happen anytime soon.

nubskr about 10 hours ago |

That's actually the state of autonomous coding in 2026, scale the output, skip the verification.

utopiah about 4 hours ago |

That's kind of hilarious (...ly sad) to read knowing that I have on my desk https://browser.engineering so I literally went the opposite direction some months ago.

Not only did I actually build a Web browser myself, from scratch (OK, of course with a working OS and Python and its libraries ;) but mine did work! It took me a few hours, maybe a few days if I add it all together. Not only did it work (namely, I browsed my own website with it), but I had fun with it (!), I learned quite a bit (including the provable fact that I can indeed build a Web browser, woohoo!), and I did it on... I want to say a few kilowatts at most, including my computer (obviously) but also myself and the food I ate along the way.

So... to each their own ¯\_(ツ)_/¯

orourke about 8 hours ago |

I feel that getting anywhere into the neighborhood of “kind of working” for a project like this is noteworthy and a huge milestone. Maybe a better headline would be, however: Agents almost create a working browser.

motbus3 about 9 hours ago |

If it just forked Chromium because it found it on the web, it would also claim it made a browser from scratch. An LLM does not know. It is not a person, it is a thing, just an algorithm.

7777777phil 1 day ago |

I wonder who they actually tried to impress with that? People who understand and appreciate the difficulty of building a browser from scratch would surely be interested to understand what you (or your Agent) did to a degree that they would understand if you didn’t.

ares623 about 20 hours ago |

Can’t help but draw parallels to what working with AI feels like. Your coworker opens a giant impressive-looking PR and marks it ready for review. Meanwhile it’s up to someone else on the team to do the actual work of checking. And the PR author gets patted on the back by management for being forward-thinking and proactive while everyone else is “nitpicky” and holding progress back.

solid_fuel about 16 hours ago |

This is par for the course with this AI slop. Most of the big claims about LLM productivity have completely lacked any backing evidence. Big claims require big evidence, but all I've seen so far is loud assertions and pathetic results.

DeathArrow about 12 hours ago |

So they prove that if you have enough money to burn you can use AI to generate terabytes of useless junk?

Who would have thought of that?

josefritzishere 1 day ago |

Key phrase: "They never actually claim this browser is working and functional." This is what most AI "successes" turn out to be when you apply even a modicum of scrutiny.

Pinus 1 day ago |

I haven’t studied the project that this is a comment on, but: The article notices that something that compiles, runs, and renders a trivial HTML page might be a good starting point, and I would certainly agree with that when it’s humans writing the code. But is it the only way? Instead of maintaining “builds and runs” as a constant and varying what it does, can it make sense to have “a decent-sized subset of browser functionality” as a constant and varying the “builds and runs” bit? (Admittedly, that bit does not seem to be converging here, but I’m curious in more general terms.)

heliumtera about 22 hours ago |

Making it compile will considerably decrease productivity. PR number go up

m00dy 1 day ago |

The Cursor CEO got grilled on HN for a good reason.

Matthyze 1 day ago |

Out of curiosity, what is the most difficult thing about building a browser?

ironbound about 22 hours ago |

Devin 2.0

mikojan about 23 hours ago |

Dear god please let AI get forever stuck at this point because it would be so funny

sidgarimella about 16 hours ago |

there’s a curve where something of a conservative middle in AI marketing stunts are held to a higher level of criticism than headlines on either side

devmor about 13 hours ago |

I am just so utterly tired of AI companies lying about everything, constantly without end.

The things that modern machine learning can do are absolutely incredible, mindblowing and have myriad uses. But this culture of startup scams to siphon money out of the economy and into the bank accounts of a few investment firms and a couple "visionaries" has just turned what should be an exciting field full of technical advancement into a deluge of mental sewage that's constantly pumped into our faces.

noosphr about 23 hours ago |

If this is what makes the AI bubble pop I'll laugh so hard.

emp17344 1 day ago |

This is why AI skeptics exist. We’re now at the point where you can make entirely unsubstantiated claims about AI capability, and even many folks on HN will accept it with a complete lack of discernment. The hype is out of control.

lifetimerubyist about 23 hours ago |

> company claims they "built a browser" from scratch

> looks inside

> completely useless and busted

30 billion dollar VS Code fork, everyone. When do we start looking at these people for what they are: snake oil salesmen?

They slop laundered the FOSS Servo code into a broken mess and called it a browser, but dumbasses with money will make line go up based on lies. EFF right off.

LegitShady about 18 hours ago |

AI hype is just lying until you get caught

user432678 1 day ago |

Are you telling me AI bros are lying about their products? No way that ever happened…

AIorNot about 24 hours ago |

Lesson 1:

Always take any pronouncement from an AI company (heavily dependent on VC and public sentiment on AI) with a heavy grain of salt.

hype over reality

I’m building an AI startup myself and I know that world, and it's full of hypesters and hucksters unfortunately. Also, social media communication + low attention span + AI slop communication is a blight upon today's engineering culture.

xcvxvdf 1 day ago |

[dead]

logicallee about 20 hours ago |

(this has been fixed)

jonathanstrange about 20 hours ago |

I think it's only a matter of time until this becomes reality. It's almost inevitable.

My prediction last year was already that in the distant future, more than 10 years out, operating systems will create software on the fly. It will be a basic function of computers. However, there might remain a need for stable, deterministic software; the two human-machine interaction models can live together. There will be a need for software that does exactly what one wants in a dumb way, and there will be a need for software that does complex things on the fly in an overall less reliable, ad hoc way.

ryanisnan about 23 hours ago |

The amount of negativity in the original post was astounding.

People were making all sorts of statements like:

- “I cloned it and there were loads of compiler warnings”

- “the commit build success rate was a joke”

- “it used 3rd party libs”

- “it is AI slop”

What they all seem to be just glossing over is how the project unfolded: without human intervention, using computers, in an exceptionally accelerated time frame, working 24hr/day.

If you are hung up on commit build quality, or code quality, you are completely missing the point, and I fear for your job prospects. These things will get better; they will get safer as the workflows get tuned; they will scale well beyond any of us.

Don’t look at where the tech is. Look where it’s going.