208 points by cgwu 3 days ago | 47 comments
Nevermark 3 days ago |
jackfranklyn 3 days ago |
Proprietary training data for foundation models? Sure, that's a real moat - until someone figures out how to generate synthetic equivalents or a new architecture makes your dataset less relevant.
But the more interesting moat is often contextual data - the stuff that accumulates from actual usage. User preferences, correction patterns, workflow-specific edge cases. That's much harder to replicate because it requires the product to be useful enough that people keep using it.
The catch is you need to survive long enough to accumulate it, which usually means having some other differentiation first. Data as a moat is less of a starting position and more of a compounding advantage once you've already won the "get people to use this thing" battle.
jackfranklyn 3 days ago |
In my domain, every user correction teaches the system something new about how actual businesses operate vs how you assumed they did when you wrote the first version. Six months of real usage with real corrections creates something a competitor can't just replicate by having more compute or a bigger training set.
The tricky part is that this kind of moat is invisible until you try to build the same thing. From the outside it looks simple. From the inside you're sitting on thousands of learned exceptions that make the difference between "works on demos" and "works on real data."
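The "thousands of learned exceptions" idea can be sketched as a corrections layer that sits in front of the generic model: accumulated user fixes are consulted first, and the model's guess is only a fallback. This is a minimal illustration, not anyone's actual implementation; all names and example values here are hypothetical.

```python
# Sketch of a "learned exceptions" layer: corrections accumulated from
# real usage override the generic model path. All names hypothetical.

class CorrectionStore:
    def __init__(self):
        self._fixes = {}  # input signature -> corrected output

    def record(self, signature, corrected_output):
        """Store a user correction, keyed by the input it applies to."""
        self._fixes[signature] = corrected_output

    def lookup(self, signature):
        return self._fixes.get(signature)

def classify(record, store, model_guess):
    """Prefer a learned exception; fall back to the model's guess."""
    fix = store.lookup(record)
    if fix is not None:
        return fix
    return model_guess(record)

store = CorrectionStore()
# Six months of usage = many calls like this one:
store.record("ACME Corp invoice", "net-60")

print(classify("ACME Corp invoice", store, lambda r: "net-30"))  # net-60
print(classify("Unseen vendor", store, lambda r: "net-30"))      # net-30
```

The store itself is the moat: a competitor with the same model but an empty `CorrectionStore` reproduces only the fallback behavior.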
light_triad 3 days ago |
Also, as foundation models improve, today's "hard to solve" problems become tomorrow's "easy to solve" problems.
weinzierl 3 days ago |
I don't know about prospecting, but "answer support tickets accurately"? Seriously, this must be ironic, right?
netdevphoenix 3 days ago |
ralusek 3 days ago |
inb4 "then why do Meta's models still suck?"
whatever1 3 days ago |
burntcaramel 3 days ago |
- Which brands do people trust?
- Which people do people in power trust?
You can have all the information in the world but if no one listens to you then it’s worthless.
NiloCK 3 days ago |
The biggest data hoarders now compress their data into oracles whose job is to say whatever to whoever - leaking an ever-improving approximation of the data back out.
DeepSeek was a big early example of adversarial distillation, but it seems inevitable to me that frontier models can and will always be siphoned off in order to produce reasonably strong fast-follow grey market competition.
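The siphoning mechanism described here can be shown with a toy distillation sketch: a "student" is fit purely on the oracle's visible outputs, so whoever serves the model leaks an approximation of what it was trained on. This is a deliberately minimal illustration under made-up functions, not how any real lab distills models.

```python
# Toy distillation-by-API: the student sees only the teacher's answers,
# never the underlying data, yet ends up approximating the teacher.
# All names here are hypothetical.

def teacher(x):
    # Stand-in for a frontier model's API: the only surface competitors see.
    return "positive" if x % 2 == 0 else "negative"

def distill(teacher_fn, queries):
    """Build a student from (query, teacher answer) pairs alone."""
    return {q: teacher_fn(q) for q in queries}

def student(model, x, default="unknown"):
    return model.get(x, default)

# Siphon the teacher through its public interface.
model = distill(teacher, range(100))
print(student(model, 42))  # agrees with the teacher on queried inputs
print(student(model, 7))
```

A real student would generalize beyond the queried inputs (here it returns `"unknown"` off-distribution), but the economics are the same: every answer the oracle gives out is training signal for a fast follower.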
andy99 3 days ago |
With code generation, you don't see what's wrong right away; it's only later in the project lifecycle that you pay for it. Writing looks good on a skim, but is embarrassingly bad once you start actually reading it.
With some things (slides, apparently) you notice right away how crappy they are.
I don't think it's just better training data; I think LLMs apply largely the same kind of zeal to every task. The difference is which places coherent nonsense ends up being acceptable.
I’m actually a big LLM proponent and see a bright future, but believe a critical assessment of how they work and what they do is important.
Hrun0 2 days ago |
jongjong 3 days ago |
Companies always try to make it seem like data is valuable. Attention is valuable. With attention, you get the data for free. What they monetize is attention; data plays a small part in optimizing the sale of ads, but attention is the important commodity.
Why else are celebrities so well paid?
dangoodmanUT 3 days ago |
PeterStuer 2 days ago |
adverbly 3 days ago |
You get some anecdotal evidence and immediately post a hot take claiming to have discovered a new invariant?
I guess a bunch of us, including myself, have taken the engagement bait here, but does it really take somebody saying something stupid to start a conversation on something?
CuriouslyC 3 days ago |
niemandhier 2 days ago |
The law even demands that the data is machine readable.
The only real moat is your own observational data.
guelo 3 days ago |
Vertical integration.
Horizontal integration.
Cross- and/or mass-relationship integration.
Individual relationship investment/artifacts.
Reputation for reliability, stability, or any other desired dimension.
Constant visibility in the news (good, neutral, sometimes even bad!)
A consistent attractive story or narrative around the brand.
A consistent selective story or narrative around the brand. People prefer products designed for "them".
On the dark side: intimidation. Ruthless competition, acquisitions, lawsuits, a reputation for dominance, famously deep pockets.
Keeping someone is easier. Tiny things hold onto people: an underlying model that delivers results with fewer irritations, glitches, and hoops. Low- to no-configuration installs and operation. Windows that open, and other actions that happen, instantly. Simple attention to good design can create fierce loyalty in those for whom design or friction downgrades feel like torture.
Obviously, many more moats in the physical world.