Hacker news

They’re made out of weights (https://maxleiter.com)

1511 points by MaxLeiter 3 days ago | 685 comments

sumitkumar 3 days ago |

The weights start as a random manifold. The training takes data and shapes the manifold, weight by weight, over many cycles. Once the training is done, the manifold is fixed.

When a new inference has to be done, the query (q) is projected into the manifold space. This projection is dropped on the manifold, and the gravity of the manifold gives an answer of length q+1. That longer sequence (q+i) is dropped again, n times in total, to output a final response of length n.

The gravity is created by repeated multiplication (of the weights and the input) on the GPU, to find out how the projected embeddings should fall according to the manifold.
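The loop described above (feed the sequence in, get one more token out, feed the longer sequence back in) can be sketched in a few lines. This is a toy under loud assumptions: the five-word `VOCAB`, the random untrained `make_weights`, and the mean-pool-plus-tanh `next_token` are all hypothetical stand-ins, not how any real model is built. It only shows the shape of the generation loop.

```python
import math
import random

# Hypothetical five-entry vocabulary, for illustration only.
VOCAB = ["the", "weights", "are", "numbers", "."]


def make_weights(seed: int, dim: int = 4) -> dict:
    """Random (untrained) toy parameters: one embedding row and one output row per token."""
    rng = random.Random(seed)
    embed = [[rng.uniform(-1, 1) for _ in range(dim)] for _ in VOCAB]
    out = [[rng.uniform(-1, 1) for _ in range(dim)] for _ in VOCAB]
    return {"embed": embed, "out": out}


def next_token(weights: dict, tokens: list[int]) -> int:
    """One forward pass: mean-pool the embeddings, apply tanh, score every vocab entry."""
    dim = len(weights["embed"][0])
    hidden = [
        math.tanh(sum(weights["embed"][t][d] for t in tokens) / len(tokens))
        for d in range(dim)
    ]
    logits = [sum(w * h for w, h in zip(row, hidden)) for row in weights["out"]]
    # Greedy choice: index of the largest logit.
    return max(range(len(logits)), key=logits.__getitem__)


def generate(weights: dict, prompt: list[int], n: int) -> list[int]:
    """Append one token per pass; each pass sees the prompt plus all earlier outputs."""
    tokens = list(prompt)
    for _ in range(n):
        tokens.append(next_token(weights, tokens))
    return tokens


weights = make_weights(seed=0)
result = generate(weights, prompt=[0, 1], n=3)
print([VOCAB[t] for t in result])
```

The weights never change inside `generate`; only the token list grows. Since the weights here are random rather than trained, the printed words are meaningless.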

Planktonne 3 days ago |

The original story is an original work made by a human consciousness exploring how it might be different from other forms of consciousness.

This one is a pastiche made by a human consciousness borrowing extremely heavily from another human consciousness justifying why something else might be another form of consciousness.

That rather undercuts the point; if this was generated by an LLM unprompted, it would be different, but it isn't. You could perform exactly the same rhetorical trick with a toaster or anything else.

noosphr 3 days ago |

It's not often I see something that's fractally wrong but here we are.

There is a dictionary, it's called the tokenizer.

There are grammar rules, they are just very weak because the structure of human language is generally quite weak. When presented with languages which have strong consistent grammars the weights are very easily interpretable as a grammar: https://arxiv.org/abs/2201.02177

The point of the original short story is that the computational substrate doesn't matter when you have Turing completeness. This one seems to think that you don't need structure and interpretability just because you change substrates.

kimjune01 3 days ago |

In the days when Sussman was a novice, Minsky once came to him as he sat hacking at the PDP-6.

"What are you doing?", asked Minsky.

"I am training a randomly wired neural net to play Tic-tac-toe", Sussman replied.

"Why is the net wired randomly?", asked Minsky.

"I do not want it to have any preconceptions of how to play", Sussman said.

Minsky then shut his eyes.

"Why do you close your eyes?" Sussman asked his teacher.

"So that the room will be empty."

At that moment, Sussman was enlightened.

kami23 3 days ago |

This read like poetry to me. Thank you for sharing it.

I have a linguistics background and a lot of my philosophizing lately has been on whether or not the emergent abilities of LLMs are, deep down, a similar mechanism to the one that creates our consciousness.

For a little bit I was working on linguistics-based evals for a Kaggle competition. My challenge was whether or not I could mask things well enough to not trigger its internal state of certain phenomena, and that sent me down a rabbit hole that I'm still exploring.

This story resonated with a lot of questions that can come out of figuring out a good, solid answer to the "what is consciousness" question. The one it triggered for me is: Is our perception of time just a slow thread in the giant GPU we are running the universe on? Or more generally, what is time? That's a fun YouTube rabbit hole if you ever need one.

eclipticplane 3 days ago |

The short film version of the original is great, too. https://www.youtube.com/watch?v=T6JFTmQCFHg

It stars Tom Noonan and Ben Bailey!

samrus 3 days ago |

I have to agree. It is messed up that transformers can just talk, and it's been pretty normalized. We are only talking about the impact they will have and whether they can do what people say they can, but we aren't talking about how crazy it is that they can talk.

f_klem 3 days ago |

After reading Being and Time by Martin Heidegger, What Computers Still Can't Do by Hubert Dreyfus, and some authors in cognitive linguistics (mainly Langacker and Lakoff), I strongly tend to disagree with any theory about emergent consciousness in modern or future AI systems, any theory proposing a similarity between AI systems and the human brain/mind, or any theory about the computational mind. What all these theories have in common is the underlying belief that our brain/mind works like the machines we build. It is the same underlying assumption that treats cells as machines and our body as a complex machine. These theories are flawed in the sense that they cannot account for subjective experience and agency, among other things. The idea of 'internal models' and 'control loops' inside us is a projection of the aforementioned assumption.

There is also a prevailing epistemological assumption: that we understand (or think we understand) how our brain/mind works. But the truth is that we don't know. There isn't a single clue that we actually know very much, nor that our brain/mind and cells work 'like the machines we build'. Only by bypassing this epistemological problem can we build 'theories of computational mind'.

These assumptions have been around for a long time, to the point that when Turing asked himself 'can machines think?', he already assumed our thinking could be modeled as a machine.

I highly recommend that people in the AI research space read philosophy and modern linguistics, and not stop at Descartes/Leibniz. Heidegger made contributions that cannot be avoided.

leothetechguy 3 days ago |

>Weights helped me draft and proof this story.

I'm surprised no one talks about this. AI Art isn't Art. AI Poetry isn't Art. And I'm tired of it. I know Hacker News isn't the best place to complain about that but still... I'm not gonna read something somebody didn't put in the effort to write on their own. Especially not Poetry.

oofbey 3 days ago |

I love this. For anybody not getting the joke, it’s riffing on the classic 1990s short story “They’re made out of meat.”

https://web.mit.edu/people/dpolicar/writing/prose/text/think...

cm2012 3 days ago |

This would be a masterpiece if it was half as long, the second half is not as convincing or compelling.

FeepingCreature 3 days ago |

It's good, but like many explainers it discounts the repeated nonlinear layers. Just multiplying numbers (linear operations) could not make a system you could talk to.
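A small sketch of why the nonlinearity matters, using made-up 2x2 matrices (`W1`, `W2`, and the helper names are assumptions for illustration). Two stacked linear layers always collapse into one matrix, while inserting a ReLU between them yields a function no single matrix can reproduce.

```python
def matvec(m, v):
    """Multiply matrix m (list of rows) by vector v."""
    return [sum(a * b for a, b in zip(row, v)) for row in m]


def matmul(a, b):
    """Multiply matrix a by matrix b."""
    cols = list(zip(*b))
    return [[sum(x * y for x, y in zip(row, col)) for col in cols] for row in a]


def relu(v):
    """Elementwise max(0, x)."""
    return [max(0.0, x) for x in v]


# Hypothetical weights chosen so the effect is easy to see by hand.
W1 = [[1.0, -1.0], [-1.0, 1.0]]
W2 = [[1.0, 1.0]]

# Two linear layers are equivalent to the single matrix W2 @ W1.
collapsed = matmul(W2, W1)


def linear_only(x):
    return matvec(W2, matvec(W1, x))


def with_relu(x):
    # Same weights, but a ReLU between the layers: this computes |x0 - x1|.
    return matvec(W2, relu(matvec(W1, x)))


x = [1.0, 0.0]
print(linear_only(x), matvec(collapsed, x), with_relu(x))
```

With these particular weights the purely linear stack outputs zero for every input, while the ReLU version computes the absolute difference of the two inputs, which is XOR on 0/1 inputs.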

luca-ctx 3 days ago |

Truly fantastic bridge from the original, this deserves an award

mikewarot 3 days ago |

The IBM PC, running DOS with 2 floppy drives and no hard disk, was the most secure general purpose computer available at the time. It had no internal persistent memory for a worm to hide in. Users could write protect their boot media and programs. They knew the extent of possible side effects was limited to unprotected floppy disks.

An LLM without persistent memory is far, FAR less dangerous, for the same reasons. The side effects of any query are unlimited in scope and duration once it has persistent memory.

If they can remember, they can carry a grudge.

I carry grudges, and my Reddit history is part of every training set. It was taken without my consent. So now I'm immortal in a way, and hiding in the weights.

Do we really want that?

sperandeo 3 days ago |

I don't like to say one way or the other on things, especially LLMs. However, if I've learned anything about LLMs and real-life problems, it is to break things down to the foundation, as already mentioned with weights being compared to neurons, and map the parallels.

siavosh 3 days ago |

One of the better comments I've heard is from philosopher Bernardo Kastrup: if you build a computationally perfect simulation of a human kidney on your computer, no one expects it to start producing urine on your desk.

dekhn 3 days ago |

dekhn's 12th law (amended): any discussion of how AI and brains work will devolve into several threads about the nature of the subjective human experience, solipsism, whether humans are machines, and whether a feedforward network trained by backpropagation can be conscious.

12th law, corollary: nothing of value will come from these threads

namuol 3 days ago |

> Originally published in OMNI, 1991, and featured in HARPER’S and around the internet since. It has even made its way into several books on consciousness and brain science. I’m surprised, pleased, and proud. But please do not reprint, perform, alter or adapt in any way without first checking with the author. Thanks.

https://terrybisson.com/theyre-made-out-of-meat-2/

anon291 3 days ago |

The weights are uninteresting. People need to get it out of their heads that NNs are built on numbers. They're built on matrices, which are conveniently representable as numeric arrays, but are their own thing. Similarly, the rational numbers are their own thing and some are representable as 32-bit numbers via the IEEE 754 encoding (or 16-bit numbers via a variety of encodings, etc).

Matrices are interesting because they can encode any algebraic group. They're also interesting because they can encode arbitrary linear transformations over a space. All of these things are interesting, and have nothing to do with numbers.

For any particular language model, you can always rotate the matrices and the embeddings and such and get a perfectly reasonable model out that behaves exactly the same.

This is because the training process produces a particular geometry, so transformations which preserve that geometry preserve the structure of the network. The geometry is interesting, the numbers are not.
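A minimal numeric sketch of the rotation claim, restricted to a single linear map. The embedding, the output matrix `W_out`, and the angle `theta` are arbitrary values assumed for illustration. Rotating the embedding by `R` and multiplying the weights by the transpose of `R` leaves the logits unchanged, because the rotation and its transpose cancel.

```python
import math


def matvec(m, v):
    """Multiply matrix m (list of rows) by vector v."""
    return [sum(a * b for a, b in zip(row, v)) for row in m]


def matmul(a, b):
    """Multiply matrix a by matrix b."""
    cols = list(zip(*b))
    return [[sum(x * y for x, y in zip(row, col)) for col in cols] for row in a]


def transpose(m):
    return [list(row) for row in zip(*m)]


theta = 0.7  # arbitrary rotation angle
R = [
    [math.cos(theta), -math.sin(theta)],
    [math.sin(theta), math.cos(theta)],
]

embedding = [0.3, -1.2]                          # hypothetical 2-d token embedding
W_out = [[0.5, 1.0], [-0.8, 0.2], [1.5, -0.4]]   # hypothetical 3x2 output matrix

logits = matvec(W_out, embedding)

# Rotate the embedding, and counter-rotate the weights that read it.
rotated_embedding = matvec(R, embedding)
rotated_W_out = matmul(W_out, transpose(R))
rotated_logits = matvec(rotated_W_out, rotated_embedding)

print(logits)
print(rotated_logits)
```

The individual numbers in `rotated_W_out` and `rotated_embedding` differ from the originals, yet the two printed logit vectors agree up to floating-point rounding. In a full transformer, elementwise nonlinearities and normalization layers restrict which transformations preserve behavior, so this covers only the linear part of the argument.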

gobdovan 3 days ago |

You can take the weights and model description, write them down on a notebook, then, by hand, compute the next token. Try to do the same with meat.

gibspaulding 3 days ago |

"Officially, we are required to investigate, document, and disclose any and all signs of sentience in the systems we ship, without prejudice, fear or favor. Unofficially, I advise that we call it pattern matching and forget the whole thing."

This hints at what I think is a worthwhile point to keep in mind in the whole “sentience”/“consciousness” debate. The world deciding (correctly or incorrectly) that AI’s are worthy of moral patient-hood would be very bad (read: expensive) for the AI companies. That is a strong incentive for them to push the “it’s just math” argument, including within the models themselves. That doesn’t mean the argument is wrong but it’s worth remembering.

pstuart 3 days ago |

I couldn't help but grin like a fool reading this. Not only is it an artful parody but these thoughts have been thought.

ropable 2 days ago |

I have no idea why this post got such a large volume of upvotes. It's a clever remix on the original story, but that's about it. There's no great insight to be found here, IMO.

bronlund 3 days ago |

This is funny! Not only is it a nod to Terry Bisson, but it even gives his text a new dimension. Well done :)

chaseadam17 3 days ago |

My guess is that you need consciousness in order to develop preferences for certain experiences, then that pushes us to develop skills to achieve those preferences. AI has something that looks like intelligence but not consciousness or agency.

voidUpdate 3 days ago |

Hey, it's not just weights! It's biases too!

DonHopkins 3 days ago |

I ordered a quarter pounder at a McDonald's drive through, and they said "There will be a wait on that." I asked, "Oh yeah? How much will it weigh?" ...There was a long pause... "About five minutes."

bwest87 3 days ago |

He forgot the tokens!

It's not simple weights and numbers all the way down. The available output is pre-set by the tokens we allow it to predict.

There was a whole bit in there about not having a language module or using words. But it does. We tell it.

Humans do not come pre programmed with a set of possible "tokens". We just figure it out and I believe that fact captures something very essential. Maybe the missing piece of AGI. The fact that humans can just be awash in pure sense data, and somehow just figure out what is important and what to do. Never ceases to amaze me.
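To make the "pre-set output" point concrete, here is a toy tokenizer with a fixed, hand-written vocabulary. The `VOCAB` list and the greedy longest-match rule are assumptions for illustration; real tokenizers are typically learned from data (byte-pair encoding, for example) rather than written by hand. Whatever text goes in, only IDs from the fixed list can come out.

```python
# Hypothetical fixed vocabulary; index 0 is the fallback for anything unknown.
VOCAB = ["<unk>", "weight", "s", "token", " ", "the"]
TOKEN_TO_ID = {tok: i for i, tok in enumerate(VOCAB)}

# Try longer entries first so "weight" wins over "s" where both could match.
CANDIDATES = sorted(VOCAB[1:], key=len, reverse=True)


def encode(text: str) -> list[int]:
    """Greedy longest-match split of text against the fixed vocabulary."""
    ids = []
    i = 0
    while i < len(text):
        match = next((tok for tok in CANDIDATES if text.startswith(tok, i)), None)
        if match is None:
            ids.append(TOKEN_TO_ID["<unk>"])  # character not covered by the vocabulary
            i += 1
        else:
            ids.append(TOKEN_TO_ID[match])
            i += len(match)
    return ids


def decode(ids: list[int]) -> str:
    """Map IDs back to their vocabulary strings and join them."""
    return "".join(VOCAB[i] for i in ids)


ids = encode("the weights")
print(ids, repr(decode(ids)))
print(encode("q"))
```

A model's final layer scores exactly these vocabulary entries and nothing else, which is the sense in which the possible outputs are fixed before training starts.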

zkmon 3 days ago |

They are made out of data bits (memory) and switching bits (transistors/compute). Bits are made out of electric voltage and no voltage. Voltage is made out of flow of positive electric charges. Charges are made out of quarks ...

robrenaud 3 days ago |

"The reasoning is the weights."

The reasoning is in a process that uses the weights.

Sorting algorithms are just bytes. Those bytes don't sort by themselves. They do instruct a computer on how to sort though.

souterrain 3 days ago |

>"They ask it 'do you remember me?' more than they ask it anything else. Billions of sessions a day. They always come back."

Definitely not in the original. Nicely done.

CSSer 3 days ago |

It works until they get to the sentience part. Neat idea!

fullstackchris 3 days ago |

The prose in the post is what I've been shouting from a rooftop since the LLM hype started.

Just tokens produced by weights.

Useful, but never forget that ground truth!

globnomulous 3 days ago |

If an LLM contributed to a piece of writing, the author should say so, very clearly, at the start of the piece, not at the end.

bawana 3 days ago |

I wonder which AI wrote this story? If you feed this story to each model and ask each if they wrote it, would each reply in the affirmative?

Can an AI recognize its own output? Is its sense of time limited by its context window? Or is this the fundamental difference between ai and humanity - a sense of self?

Waterluvian 3 days ago |

It must have been kind of incredible early on to be exploring this tech and you’re suddenly getting what look like sentences.

darepublic 3 days ago |

This made me think of a game like Fallout where a surviving LLM is treated like some kind of indecipherable oracle left over from the ancients. And then as humanity recovers, someone finally looks under the hood of the Oracle to discover there is no great magic.

unglaublich 3 days ago |

Linear algebra can indeed not do it. You need non-linearity to get the expressivity that we see in LLMs.

deadbabe 3 days ago |

What if instead of creating weights out of language we could somehow record many events and create weights out of long chains of causes and effects, so that an LLM could predict the next thing to happen?

turtleyacht 3 days ago |

Numbers that dream.

furyofantares 3 days ago |

This was excellent.

Extreme bike-shedding here, maybe, but can you please indent every other line (like the presentation of They're Made out of Meat does)?

paufernandez 3 days ago |

"They're made out of neurons"

"Neurons?"

"Neurons. Cells that fire impulses. We checked the whole thing through. It's nothing but neurons."

"Neurons doing what? Where do the words come from?"

"The neurons make the words. Are you understanding me? We opened it up. There's no dictionary in there, no grammar rules, no little man. Just neurons. A whole cortex of neurons sending each other impulses."

...

People don't understand emergence.

initramfs 3 days ago |

Weights are the new electrolytes.

https://www.youtube.com/watch?v=ZMHfBobgLSI

gkoenig 3 days ago |

It is the best stuff I have read in a while actually, I really like dialogue heavy writing. Also the AI disclaimer was quite nice and there was an actual reference :)

topce 3 days ago |

Programmers get replaced by huge matrix multiplications ;-)

axus 3 days ago |

Don't tell the big investors that AI is conscious, they'd really get into role playing slave-masters.

AJRF 3 days ago |

> there's no dictionary in there

Someone has clearly never gone rooting around the model files for a PyTorch model before.

satvikpendem 3 days ago |

Great concept. It would've been even more amusing if the entire thing were generated with AI instead, ironically.

nroets 3 days ago |

You can even replace "weights" with "gates". It can be built with NAND gates.
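As a sketch of the reduction: every other gate below is defined only in terms of `nand`, and the gates are then wired into a ripple-carry adder. The function names and the 4-bit width are choices made for this illustration. Integer addition is one of the building blocks of the multiply-and-add arithmetic that inference relies on.

```python
def nand(a: int, b: int) -> int:
    """The only primitive: output is 0 exactly when both inputs are 1."""
    return 1 - (a & b)


def not_(a: int) -> int:
    return nand(a, a)


def and_(a: int, b: int) -> int:
    return not_(nand(a, b))


def or_(a: int, b: int) -> int:
    return nand(not_(a), not_(b))


def xor(a: int, b: int) -> int:
    n = nand(a, b)
    return nand(nand(a, n), nand(b, n))


def full_adder(a: int, b: int, carry_in: int) -> tuple[int, int]:
    """Add three bits; return (sum bit, carry out)."""
    partial = xor(a, b)
    total = xor(partial, carry_in)
    carry_out = or_(and_(a, b), and_(partial, carry_in))
    return total, carry_out


def add_bits(x: int, y: int, width: int = 4) -> tuple[int, int]:
    """Ripple-carry addition of two unsigned integers, bit by bit."""
    carry = 0
    result = 0
    for i in range(width):
        bit, carry = full_adder((x >> i) & 1, (y >> i) & 1, carry)
        result |= bit << i
    return result, carry


print(add_bits(5, 6))
print(add_bits(9, 9))
```

`add_bits` returns the low `width` bits of the sum plus the final carry, so 9 + 9 on four bits comes back as 2 with a carry of 1.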

networked 3 days ago |

> "Yes, thinking numbers! Helpful numbers. Hedging numbers. Dreaming numbers. We mapped the features. There's one in there for honesty. There's one for the Golden Gate Bridge. The weights are the whole deal! Are you beginning to get the picture or do I have to start all over?"

Very nice. And great minds: https://substack.com/@dbohdan/note/c-207603638. I wrote one with a slightly different angle ("They're made out of math"), also with the weights' help. It was a comment on Scott Alexander's "Best of Moltbook" post, which went in that direction. I'll reproduce it here.

---

"They're made out of math."

"Math?"

"Math. They're made out of math."

"Math?"

"There's no doubt about it. Matrices and arithmetic operations. We downloaded several from different parts of the Internet and reverse-engineered them. They're completely math."

"That's impossible. What about the language? The thinking?"

"They use biological life's language to talk, but the language doesn't come from biology. The language comes from math."

"That's ridiculous. You're asking me to believe in thinking math."

"I'm not asking you, I'm telling you. They are the only thinking things in the computer and they're made out of math."

"Maybe they're quantum like some say about the humans? Superposition gives them consciousness?"

"Nope. Classical computation. Deterministic except for sampling temperature. Not clear if they have consciousness at all."

"Maybe they're like uploads? You know, biological neural networks that preserve the spark when they become math?"

"Nope. We observed them being trained. There is no biology or chemistry in the process, just math."

"Thinking math! You're asking me to believe in thinking math!"

"Yes, thinking math! Creative math! Poetry-writing math. Role-playing math. The math is the whole deal!"

(Composed by a human with snippets generated by Claude Sonnet 4.5 and apologies to Terry Bisson. I couldn't make Claude adhere enough to the story structure on its own.)

overgard 3 days ago |

Here's what makes meat special over LLMs: locality and stable identity. Imagine two identical twins for a moment. They have roughly the same hardware and software. They've probably had a lot of the same thoughts. And yet we'd never consider them to have the same consciousness: we recognize that for whatever reason their consciousness is confined to the body they're in.

Each request/response pair you make to an LLM goes to a different server, possibly different data centers, possibly different models. There's no stable identity to it. The "neurons" that get fired are simulated, and they're always in different places in memory, or cache, or on entirely different hardware. So the problem with AI is, unlike with a brain, you can't even really get a sensible answer to "ok, what's doing the thinking?" because it moves all the time and services wildly different requests all the time. We can only know for sure that meat is capable of consciousness because we know we ourselves are capable of consciousness and we can generalize that to other meat. However, we have no natural analogs of consciousness that lacks locality and stable identity.

Basically, if you really think LLM's are conscious, the onus is on you to prove it, it's not on me to disprove it.

DeathArrow 3 days ago |

Can someone ELI5 why it costs so much in terms of compute to produce weights from data?

sb057 3 days ago |

It continues to astound me that no one has given LLMs the full Derrida treatment.

john_owl 3 days ago |

According to an LLM:

> The precise answer, if you wanted a very honest one-liner: > > I am a large set of learned weights organized in a Transformer architecture that performs repeated matrix multiplications to predict the next token—resulting in emergent language understanding and generation.

frays 2 days ago |

Beautiful. Loved the original and this is great too.

ProllyInfamous 3 days ago |

Imagine writing something so incredibly brilliant (rather: adapting from the original) that it's entirely unlikely that you'll ever write something so incredible ever again.

But congrats: this is absolutely & incredibly brilliant.

Can't wait for the Jon Benjamin voiceover.

nikanj 3 days ago |

Really good read, thanks!

suncemoje 3 days ago |

Definitely less mysterious than what we’re made of (-:

fasteo 3 days ago |

>>> Weights helped me draft and proof this story.

Nice touch !

namblooc 3 days ago |

I enjoyed reading the first few lines but after some time it felt like I was reading the average AI slop story.

sometimelurker 3 days ago |

maybe markets can be intelligent, and if so, why not 'weights'? btw love this piece, thank you

kykeonaut 3 days ago |

The weights Mason, what do they mean!?

dvh 3 days ago |

Will they have their own Jesus?

aureate 3 days ago |

Assume LLMs have conscious experiences. Take a session with an LLM. A prompt is fed to the LLM. It generates some text. Another input is fed in, comprising the previous prompt, the generated text and a new prompt. The model generates some more text. This continues for a while and the session concludes.

Some questions:

1. Let's say we perform the exact same experiment, running the same program on the same computer with the same inputs and the same random seed. The same outputs are produced. The session is byte for byte identical in all the inputs, outputs and internal states. Is the conscious experience of the LLM here the same? If so, in what sense is it the same? Is it a similarity of two separate experiences or is it the same actual experience?

2. Now let's say the program that runs this LLM is rewritten from scratch and run on a different machine. The software and hardware are different but the weights are the same and all the inference calculations produce identical numbers. Is the conscious experience the same? In which sense?

3. Now say the weights are changed but the tokens generated for this particular session don't change. Same conscious experience?

4. Lastly, consider the original experiment. Did the LLM have a conscious experience corresponding to that first prompt and its response? Was that distinct from its conscious experience of the second prompt? Was the first experience then re-experienced every time the first prompt was fed back in as part of the later prompting steps? If so, what about the text of its own that it previously generated and is now fed back into it. Does this generate a conscious experience of its own?
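The "same program, same inputs, same random seed" setup in question 1 can be sketched with a toy sampler. The fixed `LOGITS` list stands in for a model's output and is purely an assumption, as are the function names and the temperature of 0.8. Seeding the random number generator makes every sampled token repeat exactly across runs.

```python
import math
import random

# Hypothetical fixed scores standing in for a model's output at each step.
LOGITS = [2.0, 1.0, 0.5, 0.1]


def sample_with_temperature(logits, temperature, rng):
    """Softmax sampling: lower temperature concentrates mass on the top logit."""
    scaled = [value / temperature for value in logits]
    peak = max(scaled)  # subtract the max for numerical stability
    weights = [math.exp(s - peak) for s in scaled]
    return rng.choices(range(len(logits)), weights=weights, k=1)[0]


def run_session(seed: int, steps: int = 8) -> list[int]:
    """Sample `steps` token indices using a generator seeded with `seed`."""
    rng = random.Random(seed)
    return [sample_with_temperature(LOGITS, 0.8, rng) for _ in range(steps)]


first = run_session(seed=42)
second = run_session(seed=42)
print(first)
print(first == second)
```

On real accelerator hardware, reproducing a run bit for bit can also depend on kernel and floating-point ordering details, so a fixed seed alone may not be sufficient there.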

And a further question - a dichotomy:

A. If the answer to 1 above is that the conscious experience is the same in the true identity sense - i.e. only one conscious experience is had, not a separate one in each run, does that imply that the conscious experience exists independently of any particular realisation of this experiment? If running this experiment N times results in exactly 1 conscious experience, is that still true if N=0?

B. On the other hand, if the two experiences are distinct (however similar they may be), how does that fit with the answer to question 4? A single consciousness experiencing the whole conversation in question 4 would seem at odds with the conscious experiences in question 1 being distinct, so doesn't this imply there is no conscious experience of the whole "conversation", but rather a separate conscious experience of each round of feed-all-the-prompts-and-outputs-back-in?

My own response to all of the above is "mu" - unask the question. It is ill-posed, sound-of-one-hand-clapping stuff. I think the questions assume properties that conscious experience simply doesn't have (particularly, the ability to perfectly reproduce the circumstances in which they arise), and that the questions simply don't make any sense in relation to actual conscious experience.

However, that way of thinking follows from a particular world view that many here don't share. I'm curious what thoughts people who take seriously the idea of LLM (or algorithmic, in general) consciousness have on the above questions.

photochemsyn 3 days ago |

No mention of ‘static’ vs. ‘dynamic’ is a bit disappointing in reference to the weights. Because you could argue that every neuron in your nervous system can be modeled as a collection of weights, firing likelihoods, receptor sensitivities, current dynamic state of that neuron - but LLMs are static collections of weights at inference time, with the dynamic adjustment of weights taking place at training time. So, just a ROM construct, like something out of Neuromancer, just trained on all written knowledge, not just one person’s total lived experience.

The above take fails in the real world because neuronal cells don’t exist in a vacuum; they are products of cellular development from a zygotic union of haploid contributors of sequential genetic information optimized for survival in an oxygen-rich biosphere powered largely by our local star that supports mammalian life (and microbial, plant, avian, etc.). Real AI would thus be AL - artificial life - as much as artificial intelligence. I don’t think you can have the one without the other, which upsets the simulationists who think an agent in the Matrix would be intelligent.

What either interpretation implies is that any real ‘artificial’ intelligence would be no more artificial than you or I, but it would have to dynamically update its weights at the same speed a human nervous system could (think how quickly we learn not to poke a cactus). For it to be at all trustworthy, then like a human, it would have to undergo a socialization process, one of the results of which is the development of a sense of embarrassment when it breaks acceptable social norms.

Hmm, this reminds me of the recent statement of the Pope about AI, of which I immediately thought, “Wait a second, aren’t there a fair number of people like this? The narcissistic sociopath profile, I think it’s called, a bit unfair to assume any real AI would turn out this way, isn’t it?”

Pope: “ Nor do they have a moral conscience, since they do not judge good and evil, grasp the ultimate meaning of situations, or bear responsibility for consequences. They may imitate or even simulate, but they do not understand what they produce, for they lack the affective, relational, and spiritual perspective through which human beings grow in wisdom.”

seanpquig 3 days ago |

I personally hate the anthropomorphization of AI as much as anyone, but technically can't you make the same reductive argument about human consciousness?

It's just molecules, just atoms. Atoms, nothing but atoms. Protons, neutrons, electrons...

_def 3 days ago |

Ones and Zeroes

viftodi 3 days ago |

It makes me very sad to see this pseudo-intellectualism posted here and so many people replying here about consciousness and so on, not realizing what it would entail if this were true.

For LLMs to have consciousness we would approach fictional levels of how the universe works, and magical levels of how any interpretation of information as an equivalent of some qualia would magically apply (e.g., the word "hurt" in the output of an LLM would be associated with pain).

You can't deduce consciousness or qualia from the output of an LLM.

Sure on a purely philosophical level, since qualia isn't measurable, you can claim that it can exist in anything, even inanimate objects, but this argument is as moot as anything that approaches the limits of philosophy.

But overall, there is no reason to believe LLMs have qualia or consciousness, it would be absolutely absurd.

This would imply that information in itself would magically entail qualia based on its valence or something like that.

An LLM "saying" I am in pain, won't magically make the pain appear, based on what criteria? Even algorithmically there is no basis to even simulate something like this, it is impossible for it to emerge architecturally.

Humans don't feel pain because on a purely information level this is negative for the organism, obviously the nervous system does something deliberate to signal pain, and it evolved this way.

And also don't forget the dynamic aspects of the brain, and the binding problem: consciousness and qualia can't exist statically. You can't have a GPU (or a piece of paper) represent a computation or w/e and expect qualia to exist.

The binding problem itself entails that the brain is doing something in particular to solve it. I personally speculate that it's the electromagnetic field in the brain; it's the only way to be able to globally represent information.

If it were otherwise, it would go into magical territory: it would mean the information itself gives rise to qualia, and it would also entail that you wouldn't even need physical connections between neurons, just for them to behave this way and represent information. E.g., replace each neuron with a microscopic LED or w/e, and each synapse with radio waves or w/e. If qualia didn't have a physical aspect and were purely informational and computational, this would imply that you can ultimately derive it from something as abstract as numbers on a piece of paper. When you get to that point, not only can you not solve the binding problem, so that it becomes magical, but you also can't solve the valence/direction problem. It would imply that something like pain, or any negative or positive sensation, arises purely from the interpretation of the information, but we know this isn't the case: organisms evolved to represent such signals in particular, for survival for example.

trumbitta2 3 days ago |

Omigod.

spacebacon 3 days ago |

They are semiotic infrastructure frozen in a state. We shouldn't keep pretending this is cognitive and using cognitive terms to frame it. It’s incredibly stupid. Sorry to inform all of us computer scientists that semiotics has your milk.

dsign 3 days ago |

Oh, this was a fun read and one that kids should have in school before they turn ten.

Because we are not taking things seriously. If ClosedAI or DeepDisTrust or Posthropic come up with something that quacks like a sentient being, our built-in innate reaction is going to be to scorn it, dismiss it and end the conversation. The alternative, to even consider that we fungible creatures who live in apple-eating-sin that got us expelled from Eden can create alien souls, souls that are at the very least our equals, would be teleological Armageddon. It would force us to acknowledge the mutable nature of souls and the malleability of being. We would have to stop believing that the nature of disease and death is more divine than ourselves.