168 points by sdpmas 2 days ago | 46 comments
andai 1 day ago |
Maybe not quite a fair comparison, since my human brain had been "learning" for half a billion years before I was born.
I wonder if there's an equivalent of that for AI. Evolving the architectures?
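For illustration, a toy version of that "evolving the architectures" idea (every name and number here is made up, just to show the shape of it):

    import random

    def random_arch():
        # an "architecture" in this toy is just (depth, width)
        return (random.randint(1, 12), random.choice([64, 128, 256, 512]))

    def mutate(arch):
        # perturb depth and width slightly to get a child architecture
        depth, width = arch
        return (max(1, depth + random.choice([-1, 1])),
                max(64, width + random.choice([-64, 64])))

    def fitness(arch):
        # stand-in for "train briefly, measure validation loss";
        # here it just rewards being close to an arbitrary optimum
        depth, width = arch
        return -abs(depth - 6) - abs(width - 256) / 64

    population = [random_arch() for _ in range(16)]
    for generation in range(20):
        population.sort(key=fitness, reverse=True)
        parents = population[:4]
        population = parents + [mutate(random.choice(parents)) for _ in range(12)]
    print(population[0])  # best (depth, width) found so far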
nsnzjznzbx 2 days ago |
"Train yourself to solve this problem see OBJECTIVE.md"
abeppu 1 day ago |
I think someone during the copy-editing process told them this needed to look more complicated?
littlestymaar 2 days ago |
I'm not convinced this is particularly true in today's world: if you have more compute, you can simply generate more, and higher-quality, artificial data (a sketch of that below). That's what all the labs have been doing since at least 2023.
Also, the post uses Chinchilla-optimal training as a comparison baseline, but everyone has moved far beyond Chinchilla scaling: small models are routinely trained on 10-400 times more data (1-40T tokens) than the Chinchilla-optimal number (rough arithmetic below), so the entire industry has gone in the complete opposite direction of what they are proposing.
That doesn't mean the techniques presented here are useless or anything (I'm not qualified to judge), but you should take the introduction with a grain of salt.
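On the artificial-data point, a minimal rejection-sampling sketch; generate() and score() are made-up stand-ins, not any real API:

    import random

    def generate(prompt, n):
        # stand-in for sampling n candidate answers from a teacher model
        return [f"{prompt} -> answer {i}" for i in range(n)]

    def score(prompt, answer):
        # stand-in for a verifier / reward model / test harness
        return random.random()

    def synthesize(prompts, n=64, threshold=0.9):
        # spend compute (n samples per prompt), keep only high-scoring pairs
        data = []
        for p in prompts:
            for a in generate(p, n):
                if score(p, a) >= threshold:
                    data.append((p, a))
        return data

    print(len(synthesize(["2+2=?", "capital of France?"])))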
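And for reference, the Chinchilla rule of thumb is roughly 20 training tokens per parameter, so the back-of-the-envelope looks like this (the actual_tokens figure is illustrative, picked from the 1-40T range above):

    # Chinchilla rule of thumb: compute-optimal tokens ~= 20 * parameters
    params = 3e9                        # a 3B-parameter "small" model
    chinchilla_tokens = 20 * params     # ~60B tokens
    actual_tokens = 6e12                # illustrative, within the 1-40T range above
    print(actual_tokens / chinchilla_tokens)  # 100.0, i.e. ~100x Chinchilla-optimal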
1425curlz80 1 day ago |
Instead it's more parameters with less training data... but I don't really see any quality control?