Hacker news

Top
New
Past
Ask
Show
Jobs

AI coding is gambling (https://notes.visaint.space)

347 points by speckx 3 days ago | 428 comments | View on ycombinator

itsgrimetime 3 days ago |

All of this new capability has made me realize that the reason i love programming _isn't_ the same as the OP. I used to think (and tell others) that I loved understanding something deeply, wading through the details to figure out a tough problem. but actually, being able to will anything I can think of into existence is what I love about programming. I do feel for the people who were able to make careers out of falling in love w/ and getting good at picking problems & systems apart, breaking them down, and understanding them fully. I respect the discipline, curiosity, and intellect they have. but I also am elated w/ where things are at/going. this feels absurd to say, but I finally feel like I'm _good_ at programming, which is insane, because I literally haven't written a line of code myself in months, but having tools that can finally match the speed my ideas come to me is intoxicating

watzon 3 days ago |

I think this article makes a valid point. However, if AI coding is considered gambling, then being a project manager overseeing multiple developers could also be seen as a form of gambling to a certain degree. In reality, there isn't much difference between the two. AI models are non-deterministic, and humans are also non-deterministic. You could assign the same task to two different developers and end up with entirely different results.

Terr_ 3 days ago |

I'd emphasize that prompting LLMs to generate code isn't just metaphorical gambling in the sense of "taking a risk", the scary part is the more-literal gambling involving addictive behaviors and how those affect the way the user interacts with the machine and the world.

Heck, this technology also offers a parasocial relationship at the same time! Plopping tokens into a slot-machine which also projects a holographic "best friend" that gives you "encouragement" would fit fine in any cyberpunk dystopia.

FL4TLiN3 3 days ago |

In my corner of the world, average software developers at Tokyo companies, not that many people are actually using Claude Code for their day-to-day work yet. Their employers have rolled it out and actively encourage adoption, but nobody wants to change how they work.

This probably won't surprise anyone familiar with Japanese corporate culture: external pressure to boost productivity just doesn't land the same way here. People nod, and then keep doing what they've always done.

It's a strange scene to witness, but honestly, I'm grateful for it. I've also been watching plenty of developers elsewhere get their spirits genuinely crushed by coding agents, burning out chasing the slot machine the author describes. So for now, I'm thankful I still get to see this pastoral little landscape where people just... write their own code.

copypaper 3 days ago |

You got to know when to Ship it,

Know when to Re-prompt,

Know when to Clear the Context,

And know when to RLHF.

You never trust the Output,

When you’re staring at the diff view,

There’ll (not) be time enough for Fixing,

When the Tokens are all spent.

dzink 3 days ago |

It’s variable rewards and even with large models the same question can lead to dramatically different answers. Possibly because they route your request through different models. Possibly because the model has more time to dig through the problem. Nonetheless we have some illusion of control over the output (you we wouldn’t be playing it) but it is just the quality of the model itself that leads to better outcomes - not your input. If you can’t let go of the feeling thought, it’s definitely addictive. And as I look back, it’s a fast iteration on the building cycle we had before AI. But the brain really likes low latency - it is addicted to the fast reward for its actions. So AI, if it gets fast enough (sub 400ms) it will likely become irreversibly addictive to humans in general, as the brain will see is at part of itself. Hope it has our interest at heart by then.

thisisbrians 3 days ago |

It is and will always be about: 1) properly defining the spec 2) ensuring the implementation satisfies said spec

minimaxir 3 days ago |

The gambling metaphor often applied to vibecoding implies that the outcome cannot be fully controlled or influenced, such as a slot machine. Opus 4.5 and beyond show that it not only can be very much can be influenced, but also it can give better results more consistently with the proper checks and balances.

comboy 3 days ago |

Fascinating how HN is torn about vibe coding still. Everybody pretty much agrees that it works for some use cases, yet there is a flamewar (I mean, cultured, HN-type one) every time. People seem to be more comfortable in a binary mindset.

some_random 3 days ago |

How often do you have to win before it's no longer gambling?

selixe_ 3 days ago |

I think "gambling" is a bit too strong, but there is a real shift in how we evaluate correctness. With traditional coding, you reason step by step and with AI-assisted code, you're often validating outputs after the fact.

The risk isn't randomness per se it's over trusting something that looks correct. The skill ceiling is moving from "can you write it" to "can you reliably verify it"

macinjosh 3 days ago |

I disagree. I have a successful software product that I vibe coded using claude code starting last June. It does something novel and useful that wasn't yet offered on the App Store or any app on Android.

I am not going to say what it is because all of the AI haters will immediately flock to leave it bad reviews and overwhelm my support systems with bad faith requests (something that has already happened).

I've been writing software for 25 years, I know what I am doing. Every bug I shipped was my fault either because I didn't test well enough or I did not possess enough platform knowledge to know myself the right way to do things. "Unknown unknowns"

But I have also learned better ways to do things and fixed every bug using AI tools. I don't read the code. I may scan it to gain context and then tweak a single value myself, but beyond that I don't write or read code anymore.

Its not a magical few shot prompt then reap profits machine. I just feel like a solopreneur ditch digger who just got a lease on a new CAT excavator. I can get work done faster I can also do damage faster if I am not careful.

Beyond this concern,

mpalmer 3 days ago |

I do not think "AI coding" - as distinct from the human who drives it - is gambling. More like a delayed footgun for the uneducated. I don't mean that disparagingly, but I do mean it literally.

    I’ve certainly been spending more time coding. But is it because it’s making me more efficient and smarter or is it because I’m just gambling on what I want to see?

Is this really a difficult question to answer for oneself? If you can't tell if you're learning anything, or getting more confident describing what you want, I would suggest that you cannot be thinking that deeply about the code you're producing.

    Am I just pulling the lever until I reach jackpot?

And even then, will you know you've won?

At the very least, a gambler knows when they have hit jackpot. Here, you start off assuming you've won the jackpot every time, and maybe there'll be an unpleasant surprise down the line. Maybe that's still gambling, but it's pretty backwards.

undefined 3 days ago |

undefined

jsLavaGoat 3 days ago |

Everything is "fast, cheap, good--pick two." This is no different.

wolandomny 3 days ago |

Obviously the following isn't a completely original take, but it's worth stating that AI coding is just a fundamentally different job than "traditional" or "manual" coding. The previous job was to spec something out to a comfortable degree without spending all of your time on a spec when there are so many unknowns that will come up during the engineering stage. Then, the job was to engineer at a snail's pace (compared to today) and adjust the spec.

Now, the job is to nail the spec and test HARD against that spec. Let the AI develop it and question it along the way to make sure it's not repeating itself all over the place (even this I'm sure is super necessary anymore...). Find a process that helps you feel comfortable doing this and you can get the engineering part done at lightning speed.

Both jobs are scary in different ways. I find this way more fun, however.

aderix 3 days ago |

Sometimes I feel that subsidising these packages (vs cost via API) is meant to make more and more people increasingly addicted

cmiles8 3 days ago |

It’s like any powerful tool. If you use it right it’s amazing. If you get careless or don’t watch it closely you’ll get hurt really badly.

Overall I’m a fan, but yes there are things to watch for. It doesn’t replace skilled humans but it does help skilled humans work faster if used right.

The labor replacement story is bullshit mostly, but that doesn’t mean it’s all bad.

darrinm 3 days ago |

I hear it a lot but this gambling analogy breaks when you look at actual outcomes. If you went to Vegas and after a few pulls on a one-armed bandit could _reliably_ walk away with the jackpot we wouldn’t even call it gambling anymore.

simonw 3 days ago |

Assigning work to an intern is gambling: they're inherently non-deterministic and it's a roll of the dice whether the work they do will be good enough or you'll have to give them feedback in order to get to what you need.

abcde666777 3 days ago |

Bespoke suits are still a thing. Meaning where the highest quality is desired or valued, things still get handmade, and the rest of the time it happens on a factory line.

I suppose what's happening with software development is we're exploring where the line between the two is going to land. It's pretty clear that something like a simple and generic website can be reliably vibecoded, but on the other extreme I wouldn't expect the software for something like a space shuttle to be vibe coded due to the stringent safety requirements.

CodingJeebus 3 days ago |

For me, the feedback loop accelerating the way that AI now permits is so addictive in my day-to-day flows. I've had a really hard time stepping away from work at a reasonable hour because I get dopamine hits seeing Claude build things so fast.

Addiction and recovery is part of my story, so I've done quite a bit of work around that part of my life. I don't gamble, but I can confidently say that using LLMs has been an incredible boost in my productivity while completely destroying my good habits around setting boundaries, not working until 2AM, etc.

In that sense, it feels very much like gambling.

undefined 3 days ago |

undefined

7777332215 3 days ago |

The problem with AI coding is that you no longer own the foundational tools.

LetsGetTechnicl 3 days ago |

Yes, that's literally how LLM's work, they're probabilistic.

rustyhancock 3 days ago |

Life is full of variable reward schemes. Probably why we evolved to be so enamoured by them.

Sometimes I think we put the Carr before the horse. We gamble because evolution promotes that approach.

Yes I could go for the reliable option. But taking a punt is worth a shot if the cost is low.

The cost of AI is low.

What is a problem is people getting wrapped up in just one more pull of the slot machine handle.

I use AI often. But fairly often I simply bin its reponse and get to work on my own. A decent amount of the time I can work with the response given to make a decent result.

Sometimes, rarely, it gives me what I need right off the bat.

QuiEgo 3 days ago |

If I were going to make a gambling analogy, I'd imagine a casino game as follows: draw a card from a standard playing card deck. If it's a spade you lose. Otherwise you win.

For any one hand you may win or lose, but on average you should still take the "gamble" every time, because the odds are so good you'll win. That's what using AI is like right now.

This is a new state of things - a year ago, the analogy would have been reversed.

wolttam 3 days ago |

I used to write code by hand.

AI has removed some of the tedium, and freed up more of my bandwidth to think about the problems I’m trying to solve and what the actual best ways to solve those problems are.

Only once I have a good feel for the problem I am solving do I go to the AI for help implementing.

My style of prompting usually leads to code that is very close to what I would have manually typed. I review it and tweak it until it is effectively identical to what I would have typed.

The speed up is significant. YMMV.

zzzeek 3 days ago |

coding with an LLM works if the model you are following is: you have the role of architect and/or senior developer, and you have the smartest junior programmer in the world working for you. You watch everything it does, check its conclusions, challenge it, call it out on things it didnt get quite right

it's really extremely similar to working with a junior programmer

so in this post, where does this go wrong?

> I am not your average developer. I’ve never worked on large teams and I’ve barely started a project from scratch. The internet is filled with code and ideas, most of it freely available for you to fork and change.

Because this describes a cut-and-paster, not a software architect. Hence the LLM is a gambling machine for someone like this since they lack the wisdom to really know how to do things.

There's of course a huge issue which is that how are we going to get more senior/architect programmers in the pipeline if everyone junior is also doing everything with LLMs now. I can't answer that and this might be the asteroid that wipes out the dinosaurs....but in the meantime, if you DO know how to write from scratch and have some experience managing teams of programmers, the LLMs are super useful.

PaulHoule 3 days ago |

I think somebody like Nate Silver might say “everything is gambling” if you really pressed them.

A big theme of software development for me has been finishing things other people couldn’t finish and the key to that is “control variance and the mean will take care of itself”

Alternately the junior dev thinks he has a mean of 5 min but the variance is really 5 weeks. The senior dev has mean of 5 hours and a variance of 5 hours.

Retr0id 3 days ago |

> But now either the AI can handle it or it can pretend to handle it. Frankly it's pretending both times, but often it's enough to get the result we need.

This has been how I think about it, too. The success rates are going up, but I still view the AI as an adversary that is trying to trick me into thinking it's being useful. Often the act is good enough to be actually useful, too.

tonymet 3 days ago |

As always, scope the changes to no larger than you can verify. AI changes the scale, but not the strategy.

Now you have more resources to test, reduce permissions scope, to build a test bench & procedure. All of the excuses you once had for not doing the job right are now gone.

You can write 10k + lines of test code in a few minutes. What is the gamble? The old world was a bigger gamble.

yoyohello13 3 days ago |

I was just thinking about this. I was reading those tweets about the SV party were people were going home early to “check on their agents” or the “token anxiety” people are having over whether they are optimizing their agent usage. This is all giving me addiction vibes. Especially at the end of the day it seems like there is not much to show for it.

AbstractH24 2 days ago |

To an extent entrusting anyone to do something other than yourself is gambling.

But you can't survive alone on an island. So you need to determine where to make gambles and where not.

anal_reactor 3 days ago |

An idea just occurred to me: why not tell AI to code in Coq? AFAIK the selling point of that language is that if it compiles, then it's guaranteed to work. It's just that it's PITA to write code in Coq, but AI won't get annoyed and quit.

amw-zero 3 days ago |

So is human coding.

cjlm 3 days ago |

Totally agree, wrote something similar last year: https://cjlm.ca/posts/it-feels-like-gambling/

ryoshu 3 days ago |

Like video gaming, but similar.

wagwang 3 days ago |

> I divide my tasks into good for the soul and bad for it. Coding generally goes into good for the soul, even when I do it poorly.

Lmk how you feel when you're constantly build integrations with legacy software by hand.

halotrope 3 days ago |

idk it works for me it build stuff that would have taken weeks in hours ymmv

__MatrixMan__ 3 days ago |

Inductive reasoning of any kind (e.g. the scientific method) is gambling.

rvz 3 days ago |

It is indeed gambling. You are spending more tokens hoping that the agent aligns with your desired output from your prompt. Sometimes it works, sometimes it doesn't.

Watching vibe gamblers hooked onto coding agents who can't solve fizz buzz in Rust are given promotional offers by Anthropic [0] for free token allowances that are the equivalent in the casino of free $20 bets or free spins at the casino to win until March 27, 2026.

The house (Anthropic) always wins.

[0] https://support.claude.com/en/articles/14063676-claude-march...

Gagarin1917 3 days ago |

Trying to decide whether to refinance now or not feels like gambling too. Yet it’s financially beneficial to make some bet.

Defining “Gambling” like isn’t really helpful.

luckydata 3 days ago |

it's gambling until you learn how to set up proper harnesses then it just becomes normal administration. It's no different than running a team, humans make mistakes too, that's why we have CI pipelines, automated testing etc... AI assisted coding "JUST" requires you to be extra good at that part of the job.

mika-el 3 days ago |

depends on bet size. small scoped tasks with tight specs — agents are reliable. "build this feature" with no constraints — yeah that's gambling. I am 90% positive most agent failures I see are from vague task definitions, not model limitations. basically the fix is better scoping not better models

rob_c 3 days ago |

So.

Is.

Life.

You've discovered probability, there was an 80% change of that. Roll a dice and do not pass go.

Again. The output from llm is a probable solution, not right, not wrong.

dwa3592 3 days ago |

few thoughts on this- it's not gambling if the most expected outcome actually occurs.

It also depends on what you're coding with;

- If you're coding with opus4.6, then it's not gambling for a while.

- If you'r coding with gemini3-flash, then yeah.

One thing I have noticed though is- you have to spend a lot of tokens to keep the error/hallucination rate low as your codebase increases in size. The math of this problem makes sense; as the code base has increased, there's physically more surface where something could go wrong. To avoid that you have to consistently and efficiently make the surface and all it's features visible to the model. If you have coded with a model for a week and it has produced some code, the model is not more intelligent after that week- it still has the same layers and parameters, so keeping the context relevant is a moving target as the codebase increases (and that's why it probably feels like gambling to some people).

fittingopposite 3 days ago |

Background image makes the website fairly hard/unpleasant to read (in mobile view)

nativeit 3 days ago |

I have had very similar experiences. I am not a professional software developer, but have been a Linux sysadmin for over a decade, a web developer for much longer than that, and generally know enough to hack on other people’s projects to make them suit my own purposes.

When I have Claude create something from scratch, it all appears very competent, even impressive, and it usually will build/function successfully…on the surface. I have noticed on several occasions that Claude has effectively coded the aesthetics of what I want, but left the substance out. A feature will appear to have been implemented exactly as I asked, but when I dig into the details, it’s a lot of very brittle logic that will almost certainly become a problem in future.

This is why I refuse to release anything it makes for me. I know that it’s not good enough, that I won’t be able to properly maintain it, and that such a product would likely harm my reputation, sooner or later. What frightens me is there are a LOT of people who either don’t know enough to recognize this, or who simply don’t care and are looking for a quick buck. It’s already getting significantly more difficult to search for software projects without getting miles of slop. I don’t know how this will ultimately shake out, but if it’s this bad at the thing it’s supposedly good at, I can only imagine the kinds of military applications being leveraged right now…

cadamsdotcom 3 days ago |

> My job went from connecting these two things being the hard and reward part, to just mopping up how poorly they’ve been connected.

That’s only half of the transition.

The other half - and when you know you’ve made it through the “AI sux” phase - is when you learn to automate the mopping up. Give the agent the info it needs to know if it did good work - and if it didn’t do good work, give it information so it knows what to fix. Trust that it wants to fix those things. Automate how that info is provided (using code!) and suddenly you are out of the loop. The amount of code needed is surprisingly small and your agent can write it! Hook a few hundred lines of script up to your harness at key moments, and you will never see dumb AI mistakes again (because it fixed them before presenting the work to you, because your script told it about the mistakes while you were off doing something else)

Think of it like linting but far more advanced - your script can walk the code AST and assess anything, or use regex - your agent will make that call when you ask for the script. If the script has an exit code of 2, stderr is shown to the agent! So you (via your script) can print to stderr what the agent did wrong - what line, what file, wha mistake.

It’s what I do every day and it works (200k LOC codebase, 99.5% AI-coded) - there’s info and ideas here: https://codeleash.dev/docs/code-quality-checks

This is just another technique to engineer quality outcomes; you’re just working from a different starting point.

asaiacai 3 days ago |

it's the perfect drug. You don't know how to code something up. Ask AI to implement it. It's broken? Ask AI to fix it for you. Will people become unable to fix things without it?

NickNaraghi 3 days ago |

It's only "gambling" for now...

The odds of success feel like gambling. 60%, or 40%, or worse. This is downstream of model quality.

Soon, 80%, 95%, 99%, 99.99%. Then, it won't be "gambling" anymore.

hodder 3 days ago |

Depending on anyone for anything is gambling.

apf6 3 days ago |

Hiring a human is gambling too.

samschooler 3 days ago |

I think there are levels to this.

- One shot or "spray and pray" prompt only vibe coding: gambling.

- Spec driven TDD AI vibe coding: more akin to poker.

- Normal coding (maybe with tab auto complete): eating veggies/work.

Notably though gambling has the massive downside of losing your entire life and life savings. Being in the "vibe coding" bucket's worse case is being insufferable to your friends and family, wasting your time, and spending $200/month on a max plan.

6thbit 3 days ago |

Come on now. I pull the slot machine every time I ask my coworker Digbert to work on a ticket.

Will Digbert be able to handle it or will he pretend to handle it? Or will he handle it in a way that it will break again in six weeks and will evolve into his full time job for a year?

If this is gambling, middle management has been gambling for too long.

batuhandumani 3 days ago |

Life is a gamble

undefined 3 days ago |

undefined

lasgawe 3 days ago |

haha.. I agree with the points mentioned in the article. Literally every model does this. It feels like this even with skills and other buzzword files

apitman 3 days ago |

himata4113 3 days ago |

I really hate when people write about the AI of the past, opus 4.6 and gpt 5.4 [not as much imo, it's really boring and uncreative] have increased in capabilities so much that it's honestly mind numbing compared to what we had LESS than a year ago.

Opus specifically from 4.1 to 4.5 was such a major leap that some take it for granted, it went from getting stuck in loops, generally getting lost constantly, needing so so much attention to keep it going to being able to get a prompt, understand it from minimal context and produce what you wanted it to do. Opus 4.6 was a slight downgrade since it has issues with respecting what the user has to say.

undefined 3 days ago |

undefined

1234letshaveatw 3 days ago |

Is using a calculator gambling?

undefined 3 days ago |

undefined

1970-01-01 3 days ago |

"60% of the time, it works every time"

xnx 3 days ago |

...and the payouts are fantastic.

artursapek 3 days ago |

“hiring people is gambling”

pugchat 1 day ago |

[dead]

eu_93 about 7 hours ago |

[dead]

jackfranklyn 3 days ago |

[dead]

KnowFun 3 days ago |

[dead]

leontloveless 3 days ago |

[dead]

scm7k 3 days ago |

[dead]

webagent255 3 days ago |

[dead]

devcraft41 3 days ago |

[dead]

taintlord 3 days ago |

[dead]

webagent255 2 days ago |

[dead]

ratrace 3 days ago |

[dead]

Iamkkdasari74 3 days ago |

[dead]

balinha_8864 3 days ago |

[dead]

balinha_8864 3 days ago |

[dead]

WyldeDany23 3 days ago |

[dead]

x4v13r_112046 3 days ago |

[dead]

Iamkkdasari74 3 days ago |

[dead]

lokimoon 3 days ago |

[dead]

irarrazaval26 3 days ago |

[flagged]

extr 3 days ago |

I mean, this completely falls apart when you're trying to do something "real". I am building a trading engine right now with Claude/Codex. I have not written a line of code myself. However I care deeply about making sure everything works well because it's my money on the line. I have to weight carefully the prospect of landing a change that I don't fully understand.

Sometimes I can get away with 3K LoC PRs, sometimes I take a really long time on a +80 -25 change. You have to be intellectually honest with yourself about where to spend your time.

vermilingua 3 days ago |

Not only is it gambling, it has the full force of the industry that built the attention market behind it. I find it extremely hard to believe that these tools have not been optimised to keep developers prompting the same way tiktok keeps people scrolling.

bensyverson 3 days ago |

This "slot machine" metaphor is played out. If you're just entering a coin's worth of information and nudging it over and over in the hopes of getting something good, that's a you problem, not a Claude problem.

If, on the other hand, you treat it like a hyper-competent collaborator, and follow good project management and development practices, you're golden.

DiscourseFan 3 days ago |

When a code doesn't compile, it doesn't kill anyone. But if a Waymo suddenly veers off the road, it creates a real threat. Waymos had to be safer than real human drivers for people to begin to trust them. Coding tools did not have to be better than humans for them to be adopted first. Its entirely possible for a human to make a catastrophic error. I imagine in the future, it will be more likely that a human makes such errors, just like its more likely that a human will make more errors driving a car.

CraftingLinks 3 days ago |

I see whole teams pushed by c- level going full in with spec driven + tdd development. The devs hate it because they are literally forbidden to touch a single line if code. but the results speak for themselves, it just works and the pressure has shifted to the product people to keep up. The whole tooling to enable this had to be worked out first. All Cursor and extreme use of a tool called Speckit, connected to Notion to pump documentation and Jira.

post-it 3 days ago |

> But this doesn't really resemble coding. An act that requires a lot of thinking and writing long detailed code.

Does it? It did in the past. Now it doesn't. Maybe "add a button to display a colour selector" really is the canonical way to code that feature, and the 100+ lines of generated code are just a machine language artifact like binary.

> But it robs me of the part that’s best for the soul. Figuring out how this works for me, finding the clever fix or conversion and getting it working. My job went from connecting these two things being the hard and reward part, to just mopping up how poorly they’ve been connected.

Skill issue. Two nights ago, I used Claude to write an iOS app to convert Live Photos into gifs. No other app does it well. I'm going to publish it as my first app. I wouldn't have bothered to do it without AI, and my soul feels a lot better with it.