Hacker news

Castor: CERN Advanced STORage Manager (https://castor.web.cern.ch)

64 points by naves 1 day ago | 31 comments

tempay 1 day ago |

I’m a little confused by this submission. CASTOR is the old system, which was replaced by the CERN Tape Archive (CTA) around 2020: https://cta.web.cern.ch/cta/

This is mentioned on the page but it’s easy to miss.

For the current status of tape storage at CERN see: https://indico.cern.ch/event/1471803/contributions/6967379/a...

For reference, most disk storage for physics data uses an in-house solution called EOS: https://eos-web.web.cern.ch/eos-web/

john_strinlai 1 day ago |

looks like the image on the right is broken, but it is supposed to be: https://cta.web.cern.ch/cta/assets/images/namespace_statisti...

(looks like this submission uses https://castor.web.cern.ch/content/home.html instead of https://castor.web.cern.ch/castor/; the second link does not have the broken image)

pezezin 1 day ago |

Fun fact: CERN sells old data tapes as souvenirs, I got myself one of the old LHC tapes :)

_pferreir_ 1 day ago |

As others have said, CASTOR has been discontinued, and replaced with CTA:

https://gitlab.cern.ch/cta/CTA

Its memory is still alive in CTA, however:

https://gitlab.cern.ch/cta/CTA/-/blob/main/catalogue/TapeSea...

Davidbrcz 1 day ago |

I was an intern at CERN in the mid-2010s and worked on this!

Melatonic 1 day ago |

This is actually super useful for real world stuff. Thanks for this.

Tape is boring, but when an intern / AI / tectonic plate accidentally destroys your database setup, it is a huge lifesaver.

Anybody know what these fancy Oracle tapes are? Is it just their implementation of a regular standard?

perlgeek 1 day ago |

"Castor" was the name of a storage system used for transporting nuclear waste in Germany. There were quite a few protests against shipping nuclear waste through the country.

Wouldn't have been my choice for a software project :-)

adev_ 1 day ago |

A few historical additions for anybody interested:

- CASTOR at CERN also had a disk-centric derivative named DPM (Disk Pool Manager) that helped power the LHC Computing Grid (WLCG) for multiple decades before being deprecated.

- Interestingly, DPM had an architecture quite similar to the original Google File System, even though it was developed completely separately: one metadata node, multiple disk nodes, and a design aimed at write-once-read-many with only partial POSIX semantics.

- The LHC Computing Grid is an association of research centers, each with its own infrastructure. As such, they had (historically) many different storage systems with different protocols and interfaces.

- To unify this madness, an attempt at a "standard" protocol was made in the 2000s: the SRM protocol (Storage Resource Manager). In pure XKCD fashion, it went as badly as you can imagine. It tried to rely on the tech of the time (XML, SOAP, WSDL) and is a textbook case of terrible protocol design (bloated, slow, weak consistency, massive server overhead, stupidly complex to implement, and quite insecure). The spec is worth a read if you want a good laugh [1].

- After 20 years of struggle, SRM was eventually dropped for a more pragmatic and ad hoc solution based on HTTP + xrootd [2]. EOS itself uses xrootd quite extensively (unless this has changed).

- The history of computing at CERN is interesting overall because it is a pretty good reflection of the evolution of computing and of the "tech fashions" associated with it.

[1]: https://sdm.lbl.gov/srm-wg/doc/SRM.spec.v2.1.1.html

[2]: https://xrootd.org/

boznz 1 day ago |

The various CERN web pages such as this were a treasure trove of information when I was working on my last novel. I actually included a few paragraphs on Castor, thinking of using it as a side plot, but my editor cut the plot out along with a few other technical niceties. Sigh!

dokyun 1 day ago |

Wonder how this compares to Venti[1]. It looks a lot more complicated (not really a good thing).

[1]: https://doc.cat-v.org/plan_9/4th_edition/papers/venti/

mrlonglong 1 day ago |

They now have over an exabyte worth of data on tapes.
