802 points by bookstore-romeo 3 days ago | 273 comments
frankling_ 3 days ago |
swiftcoder 3 days ago |
Is a mid-to-high engineering salary outlandish for a CEO of what is likely to be a fairly major non-profit? Even non-profits have to be somewhat competitive when it comes to salary, and the ideal candidate is likely someone who would be balancing this against a tenured position at a major university.
halperter 3 days ago |
whiplash451 2 days ago |
Google "sorted out" a messy web with PageRank. Academic papers link to each other. What prevents us from building a ranking from there?
I'm conscious I might be over-simplifying things, but curious to see what I am missing.
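For what it's worth, the mechanics of that idea fit in a few lines. Below is a minimal sketch of PageRank by power iteration over a citation graph, not anything arXiv actually runs. The function name, the damping value of 0.85, the iteration count, and the toy paper IDs are all assumptions made up for illustration.

```python
def pagerank(citations, damping=0.85, iterations=50):
    """Rank papers by power iteration over a citation graph.

    citations maps each paper ID to the list of paper IDs it cites.
    """
    # Collect every paper that appears as a citer or as a cited work.
    papers = set(citations)
    for cited in citations.values():
        papers.update(cited)
    papers = sorted(papers)
    n = len(papers)
    rank = {p: 1.0 / n for p in papers}

    for _ in range(iterations):
        # Rank held by papers that cite nothing is spread evenly over all papers.
        dangling = sum(rank[p] for p in papers if not citations.get(p))
        new_rank = {p: (1.0 - damping) / n + damping * dangling / n for p in papers}
        for citer, cited in citations.items():
            if cited:
                # Each citing paper splits its damped rank among the works it cites.
                share = damping * rank[citer] / len(cited)
                for target in cited:
                    new_rank[target] += share
        rank = new_rank
    return rank


# Hypothetical toy graph: the paper IDs are made up for illustration.
toy_citations = {
    "A": ["C"],
    "B": ["C"],
    "C": ["D"],
    "D": [],
}
scores = pagerank(toy_citations)
ranking = sorted(scores, key=scores.get, reverse=True)
```

Note that a paper nobody has cited yet only ever receives the baseline score, so a ranking like this has nothing to say about a brand-new preprint.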
krick 2 days ago |
And, FWIW, I do think that arXiv truly has a vast potential to be improved. It is currently in a position to change the whole process of how research results are shared, yet it is still, as others have said, only a PDF hosting service. And since the universities couldn't break out of the whole Elsevier & co. scam despite the internet having existed for 30 years, to me, breaking free from the university affiliation sounds like a good thing.
But, of course, I am talking only about the possibilities being out there. I know nothing about the people in charge of the whole endeavor, and ultimately it depends only on them whether it sails or sinks.
psalminen 3 days ago |
beezle 1 day ago |
APS and BNL Host XXX e-Print Archive Mirror Feb. 1, 2000
The APS is establishing, in cooperation with Brookhaven National Laboratory, the first electronic mirror in the United States for the Los Alamos e-Print Archive.
Today, its landing page describes it this way: "arXiv is a free distribution service and an open-access archive for nearly 2.4 million scholarly articles in the fields of [long list]. Materials on this site are not peer-reviewed by arXiv."
Well, that's a large part of the problem. A lot of the stuff there now will never see a journal (even of dubious quality) and there is limited filtering of what new submissions will be stored. GIGO.
Best thing ArXiv could do is go back to their roots - limit the fields and return to preprint only. Spin off the comp sci stuff for sure to someone else along with all its headaches.
fixed: url
lifeisstillgood 2 days ago |
Then getting peer reviewed is a harder process, but one can see some form of credit on the site coming from doing a decent reviewer's job.
I suspect I am missing a lot of nuance …
taormina 2 days ago |
tokai 2 days ago |
It's especially problematic because while arXiv loves to claim to be working for open science, they don't default to open licensing. Many of the publications they host are not Open Access, and are read-access only. So there is definitely the potential to close things off at some point in the future, when some CEO needs to increase value.
OutOfHere 3 days ago |
tamimy 1 day ago |
ide0666 2 days ago |
dataflow 3 days ago |
Could they not have made it into some legal structure that puts universities at the top? Say, with a bunch of universities owning shares that comprise the entirety of the ownership of arXiv, but that would allow arXiv to independently raise funds?
contubernio 2 days ago |
A setup as a US-based "non-profit" is worrisome, if only because 300K is an obscene salary even in a for-profit setting. That the US-based posters can't see this is evidence of the basic problem which is that the US, both left and right, has been taken over by a neoliberal feudal antidemocratic nativist mindset that is anathema to the sort of free interchange of ideas that underlay the ArXiv's development in the hands of mathematicians and physicists now swept aside and ignored by machine learning grifters and technicians who program computers.
hereme888 2 days ago |
asimpleusecase 3 days ago |
MetaMonk 2 days ago |
bonoboTP 3 days ago |
Any change to the basic premise will be a negative step.
They should just be boring, quiet, unopinionated, neutral background infrastructure.
tornikeo 3 days ago |
AccessScan 2 days ago |
shevy-java 3 days ago |
I am wary of that. IMO the business model is damaged therein. You can say in 2022 we had 27; bankrupt in 2030.
juped 2 days ago |
arXiv is doomed. It was nice while it lasted.
Garlef 3 days ago |
You need your favourite academic gatekeeper (= thesis advisor) to vouch for you in order to be allowed to upload.
Then AI slop gets flagged and the shame spreads through the graph. And flaggings need to have evidence attached that can again be flagged.
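As a rough sketch of how flags might spread through such a vouching graph: the snippet below walks from a flagged author up through the people who vouched for them, handing each hop a smaller penalty. Everything here is hypothetical (the `spread_shame` name, the decay factor of 0.5, the example names), and it leaves out the part about flags themselves carrying evidence that can be flagged.

```python
from collections import deque


def spread_shame(vouched_by, flagged_author, penalty=1.0, decay=0.5):
    """Propagate a flag's penalty from an author up the chain of vouchers.

    vouched_by maps each author to the list of people who vouched for them.
    The decay factor is an assumption: each hop away from the flagged
    author receives a smaller share of the penalty.
    """
    shame = {flagged_author: penalty}
    queue = deque([flagged_author])
    while queue:
        author = queue.popleft()
        for voucher in vouched_by.get(author, []):
            # Breadth-first order means each person is reached by the shortest chain.
            if voucher not in shame:
                shame[voucher] = shame[author] * decay
                queue.append(voucher)
    return shame


# Hypothetical vouching chain, made up for illustration.
vouches = {"student": ["advisor"], "advisor": ["department_head"]}
result = spread_shame(vouches, "student")
```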
Aerolfos 3 days ago |
OpenAI shows exactly how well that works and what that kind of governance does to a company and to its support of science and the commons.
TL;DR, it's fucked.
vedantxn 2 days ago |
adamnemecek 3 days ago |
jeremie_strand 2 days ago |
hirako2000 2 days ago |
I read a dozen papers a month, typically on arxiv, never from paywalled journals. I find the quality on par. But maybe I'm missing something.
losvedir 2 days ago |
Oh, wait.
Peteragain 3 days ago |
Drblessing 2 days ago |
pugchat about 9 hours ago |
stefantalpalaru 2 days ago |
ryguz 2 days ago |
bobokaytop 3 days ago |
Ghengeaua 3 days ago |
unit149 3 days ago |
eastern-sun 2 days ago |
tgtracing 3 days ago |
ACCount37 2 days ago |
I had to tell my AI to set up an MCP for "fetch while bypassing arXiv's rate limit" so that it doesn't burn 40k tokens looking for workarounds every time it wants to look at a paper and gets hit with a "sorry, meatbags only" wall.
Very annoying, given how relevant arXiv papers are for ML specifically, and how many papers there are. Can't "human flesh search" through all of them to pick the relevant ones for your work, and they just had to insist on making it harder for AIs to do it too.
davnicwil 3 days ago |
That is, it's not readily parseable, and it really gives an insider-term vibe, like this isn't for you if you don't already know what it means or how you should read or say it. It sort of reminds me of the overuse of Latin and Latinate terms generally in the old professions and, well, the academy.
Just always struck me as being somewhat at odds with the goal.
The vacuum that arXiv originally filled was one of a glorified PDF hosting service with just enough of a reputation to allow some preprints to be cited in a formally published paper, and with just enough moderation to not devolve into spam and chaos. It has also been instrumental in pushing publishers towards open access (i.e., to finally give up).
Unfortunately, over the years, arXiv has become something like a "venue" in its own right, particularly in ML, with some decently cited papers never formally published and "preprints" being cited left and right. Consider the impression you get when seeing a reference to an arXiv preprint vs. a link to an author's institutional website.
In my view, arXiv fulfills its function better the less power it has as an institution, and I thus have exactly zero trust that the split from Cornell is driven by that function. We've seen the kind of appeasement prose from their statement and FAQ [1] countless times before, and it's now time for the usual routine of snapshotting the site to watch the inevitable amendments to the mission statement.
"What positive changes should users expect to see?" - I guess the negative ones we'll have to see for ourselves.
[1] https://tech.cornell.edu/arxiv/