180 points by surprisetalk 4 days ago | 251 comments
airstrike 4 days ago |
mekpro 4 days ago |
They’re using security concerns to mask their inability to deliver the model at scale, while still trying to maintain their lead over OpenAI. As a result, they’ve chosen to release it privately under the banner of an “ethical” rollout.
ianm218 4 days ago |
- Valkey/Redis port here: https://github.com/ianm199/valdr (passes ~99% of the single-node test suite; real prod features like replication/clustering/HA are early or not implemented)
- Further-along port of Lua 5.1-5.5: https://github.com/ianm199/lua-rs-port/tree/main
- I have a less developed nginx version that would be the north star
- These projects are very alpha at the moment
If anyone is interested in getting involved in this or has done similar experiments, I'd love to collaborate! There is so much variation in how you can run these large-scale agent fleets that I don't think anyone has a perfect system yet.
mentalgear 4 days ago |
If society can't trust banks and other institutions to safely control their data, what follows?
Do we collectively switch off the internet?
waffleiron 4 days ago |
[1] https://www.anthropic.com/news/statement-department-of-war :
> But using these systems for mass domestic surveillance is incompatible with democratic values.
827a 4 days ago |
aspectop 4 days ago |
aliljet 4 days ago |
strictnein 4 days ago |
I mention this because if you're frustrated that you can't access it, you're not alone. Even with our company's heft and a security org that is very well known in the industry, we're getting nowhere.
tantalor 4 days ago |
3asdkab 4 days ago |
These entities will now give all their IP to an American company that only promises not to spy on Americans.
Subsequently, the NSA can audit the leaked sources manually and find real exploits.
yalogin 4 days ago |
bushido 4 days ago |
Will likely give them time to expand capacity as well. And make them harder to dislodge in these orgs.
fontain 4 days ago |
The only trend Mythos continues is Anthropic’s trend of warning that disaster is always 6 to 12 months away.
merrvk 4 days ago |
yanis_t 4 days ago |
I'm afraid the usual mantra that "we just need more scale", which worked well for attracting investment, is not working anymore: bigger models provide marginal improvements while naturally getting much more expensive to run.
Is this why both Anthropic and OpenAI are rushing for IPOs this year?
CephalopodMD 4 days ago |
cassianoleal 4 days ago |
philipwhiuk 4 days ago |
cmxch 4 days ago |
iamniels 4 days ago |
catigula 4 days ago |
andrewjneumann 4 days ago |
No comparison to human teams, and I’m sure that $1 million in tokens was used by humans, in a team. So like most AI, they’ve developed a tool that capable people can use to be better, but unlike most tools, they’re claiming this to be outright magic. The magic is the hype train.
jofzar 4 days ago |
I mean, most Nasdaq tech companies would be in 13+ countries. Why are they writing this like it's a big number? It's hilariously small.
aplthrowaway67 4 days ago |
jb_briant 4 days ago |
Step 2: offer to test it, but only for the biggest companies in the world
Step 3: onboard those big players on your tooling and product
Step 4: profit
This is genius.
maipen 4 days ago |
andai 4 days ago |
frays 4 days ago |
3sk_ask8 4 days ago |
- They still claim 10000 issues, but they found only one in curl.
- They did not find rsync issues but Claude rather introduced rsync issues.
- Facebook is a member of this cult program but Mythos did not find the account takeover flaw.
- Mythos did not find the issues in Anthropic's own Bun rewrite.
They will not release Mythos because it would be exposed as a fraud before the IPO.
cyanydeez 4 days ago |
testfrequency 4 days ago |
mrbonner 4 days ago |
> We've used it at work
> it is... not as hype as everyone is concerned about
> I'd argue the framework around it for security scanning is the more useful side of the tool; it definitely doesn't take a huge model to get all the issues it flagged on our systems
> For us, it absolutely flooded us with noise
> I mean hundreds if not thousands of false positives or minor issues or not applicable
> For every one reasonable issue
> The biggest issue it created was that the execs treated every issue it produced like a drop-everything-and-fix-it type of deal
> I'm talking company-wide, drop all things: "we need to patch nginx because this module that no one uses and is disabled by default has this RCE vulnerability™"
> Or "all ec2 AMIs need to be upgraded because it flagged a version-specific docker vulnerability"; it flagged every single machine with docker regardless of whether the actual vulnerability was relevant
> Vulnerability was with a very specific Auth plugin configuration you could enable with docker and specifically the Mosley docker compatible tool, but it is clear it only knew there was a vulnerability in docker, not if it was applicable or not
> Meanwhile, not a single peep from it about dirtyfrag and friends, btw, despite it allowing for container escape
> Idk, I was underwhelmed with the quality of the reporting it gave really. If the company allowed me to get information about all the infrastructure in our entire organisation to run Claude over it repeatedly looking for recent CVEs I'm sure I could produce the same results...