7 points by Anon84 about 15 hours ago | 2 comments
malcolmgreaves 28 minutes ago
westurner about 14 hours ago
> Which statistical models disclaim that their output is insignificant if used with non-independent features? Naive Bayes [...]
Ironic, then, because if transformers are Bayesian networks, we're already using Bayesian networks with non-independent features.
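The independence problem is easy to demonstrate: Naive Bayes multiplies per-feature likelihoods as if they were independent, so a perfectly correlated (duplicated) feature double-counts its evidence. A minimal sketch (my own toy numbers, not from the thread):

```python
# Toy two-class Naive Bayes: posterior P(A | features) under the
# naive independence assumption. Duplicating a feature (perfect
# correlation) multiplies its likelihood in twice, inflating the odds.

def nb_posterior(likelihoods_a, likelihoods_b, prior_a=0.5):
    """Posterior for class A given per-feature likelihoods for A and B."""
    pa, pb = prior_a, 1 - prior_a
    for la in likelihoods_a:
        pa *= la
    for lb in likelihoods_b:
        pb *= lb
    return pa / (pa + pb)

# One informative feature: P(x|A)=0.8, P(x|B)=0.4 -> 2:1 odds, ~0.667
single = nb_posterior([0.8], [0.4])
# The same feature counted twice via a correlated copy -> 4:1 odds, 0.8
doubled = nb_posterior([0.8, 0.8], [0.4, 0.4])
print(single, doubled)
```

The model reports more confidence after seeing the duplicate, even though no new information arrived — which is exactly the "disclaimer" the quoted comment is pointing at.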
From "Quantum Bayes' rule and Petz transpose map from the minimum change principle" (2025) https://news.ycombinator.com/item?id=45074143 :
> Petz recovery map: https://en.wikipedia.org/wiki/Petz_recovery_map :
> In quantum information theory, a mix of quantum mechanics and information theory, the Petz recovery map can be thought of as a quantum analog of Bayes' theorem
But there aren't yet enough qubits for quantum LLMs: https://news.ycombinator.com/item?id=47203219#47250262
"Transformer is a holographic associative memory" (2025) https://news.ycombinator.com/item?id=43028710#43029899
NNs are as close to continuous as we can get with discrete computing. They’re flexible and adaptable and can contain many “concepts.” But their chief strength is also their chief weakness: these “concepts” are implicit. I wonder if we can get a hybrid architecture that has the flexibility of NNs while retaining discrete concepts like a knowledge base does.
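One shape such a hybrid could take (my assumption, not the commenter's design): use continuous similarity to select among discrete, named entries in a knowledge base, so the retrieved "concept" stays explicit and inspectable. A toy sketch with made-up vectors:

```python
# Hypothetical hybrid sketch: continuous embeddings choose among
# DISCRETE, named concepts via nearest-neighbor cosine lookup.
import math

# Toy knowledge base: concept name -> (made-up) embedding vector.
KB = {
    "cat": [0.9, 0.1, 0.0],
    "dog": [0.8, 0.2, 0.1],
    "car": [0.0, 0.1, 0.9],
}

def cosine(a, b):
    dot = sum(x * y for x, y in zip(a, b))
    na = math.sqrt(sum(x * x for x in a))
    nb = math.sqrt(sum(y * y for y in b))
    return dot / (na * nb)

def lookup(query_vec):
    """Continuous similarity picks a discrete, named concept."""
    return max(KB, key=lambda name: cosine(KB[name], query_vec))

print(lookup([0.0, 0.2, 0.9]))  # nearest concept by cosine similarity
```

This is roughly the retrieval half of retrieval-augmented setups: the matching is soft, but the output is a symbol you can audit, unlike a concept smeared across network weights.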