Hacker news

Cessation of public development of Kefir C compiler (https://kefir.protopopov.lv)

141 points by f311a 5 days ago | 134 comments

kator 5 days ago |

> Yet, this shift made me re-evaluate the open source code publishing. Prior to that, I have been positive about free and open software, and considered this to be the default mode for work such as kefir. I did not require any justifications from myself to publish something. Now, however, I feel more and more that the main beneficiaries of my unpaid work are companies scraping the internet to train large language models. Currently accepted status quo in this area goes against my own intentions in licensing this work under GNU GPLv3. Publication has ceased to be the "null hypothesis" for me, and requires explicit mental justification which I am not able to provide.

I feel this pain. One of my small donation-driven sites has been destroyed by crawlers that just ignore robots.txt and burn the site into the ground.

Sort of jokingly, I proposed an update to the "spam fax" law:

https://www.karlbunch.com/random/website-protection-act/

keyle 5 days ago |

> This project in particular has been unconcerned with new coding practices so far, primarily, because I derive pleasure from hand-written implementations of my ideas, and believe that overcoming challenges the hard way is the main value I get from it.

This is 100% the same for me. Outside of work, where speed is more important than quality and I work with people who use AI, I don't use AI at all on my own projects. It poisons the mind and the soul. OK, that sounds dramatic, but I felt down up until the point where I started hand-writing everything again. Software engineering is still fun and powerful, and to hell with where the world is going.

binaryturtle 5 days ago |

I'm also very hesitant to release any new works (code, artworks, etc.) to the public. I usually release code under the GPL or AGPL, but I don't think either of those choices is properly respected by the AI crawlers or by the subsequent "mixing into" the models.

Multiple times I got partially broken "citations" of GPL-licensed code out of the models as answers to basic research questions (aka prompts) w/o any mention of the original license applied to the code. Just adding some random bugs every 10th line doesn't make it not a direct derivative. Image generators happily generated Sonics or Bart Simpsons (w/o directly prompting for that either). No mention that those are copyrighted characters either.

rgoulter 5 days ago |

Seems to me LLMs have changed some things. I'm not sure how it's best put, but it used to be:

- Seeing code (or a blog post or whatever) meant seeing the result of effort where thought had gone into it. The writer put in the effort so the reader didn't have to.

- There'd be some level of attachment to what you've put effort into.

With LLMs, that's undermined: it's easy to produce thoughtless imitations. Code or comments where thought didn't go into it. So, seeing some result isn't an indication of skill, but also not even an indication thought went into it.

I guess there's still something lost if someone isn't going to share code they've put thought into. -- But on the other hand, if it's just for me & I don't have to share it with a wider audience, getting LLMs to write out code isn't so expensive.. so code itself isn't necessarily something to value so much.

rurban 5 days ago |

One of the very few small compilers that pass the full gcc torture tests. But for me kefir is good enough as the reference small compiler. Not as fast as tcc, but more correct.

genxy 5 days ago |

Surprised no one has yet linked to the source https://sr.ht/~jprotopopov/kefir/

altmanaltman 5 days ago |

What a well-rounded, nicely written announcement that touches on all parts of the argument without any rage baiting or flexing. It would be easy to just ramble against AI and how it's the end of the world, but the author focused on a point that's not even related to the use or misuse of AI in software, but rather to how we have made it acceptable that large corporate companies can skirt copyright without any issue and make rivers of money with it. This problem extends not only to coding but to other industries as well.

Max-Ganz-II 5 days ago |

I put my site behind a username/password wall, to block LLM bots.

RetroTechie 5 days ago |

So how big is the community around this project?

If it's a one-person show, would closing it up effectively kill it? Or (re?)turn it into a hobby project developed at a snail's pace?

If some community exists: fork coming up?

nianderwallace 4 days ago |

People in other professions are jumping on this bandwagon - Tony Gilroy decided not to publish Andor TV show scripts to prevent AI companies using them for training.

see https://variety.com/2025/tv/news/andor-creator-refuses-publi...

turtleyacht 5 days ago |

It was nice hearing about it. If this is a healthy direction for the project, then so be it. At least source to previous versions is still available.

bjourne 5 days ago |

People taking your work and not giving anything back was ALWAYS the risk you took when writing free software. LLM training doesn't change that much. That the US military is no doubt using gcc to compile embedded software for their ICBMs surely irks the GNU people. But you can't have it any other way. "You can only use my software for good things" just is not consistent with "free software".

kazinator 5 days ago |

I'm finding it hard to be motivated to continue on language dev work. I feel it may also have to do with AI. Not so much the predatory aspect of it, like this author, but something else: shall we say, certain revelations about the nature of the target audience.

fithisux 5 days ago |

Same situation some time ago with Solar assembler.

ryanshrott 5 days ago |

The gcc torture tests are no joke. I skimmed them once thinking I’d write a toy C compiler. Thousands of test cases covering edge cases I’d never even thought about. Respect to anyone who gets through the full suite.

sneak 5 days ago |

> I also do not want my future work to be exploited for naught in commercial purposes.

Other people using your code to enrich their lives or businesses doesn't exploit you in any way, as it doesn't cost you a thing. This is irrational.

Rochus 5 days ago |

I have many GPL projects (e.g. https://github.com/rochus-keller/Oberon, https://github.com/rochus-keller/Luon, https://github.com/rochus-keller/Micron) and spend a significant amount of time on them. GPL has always explicitly permitted commercial use; that's a feature, not a bug, dating back to Stallman's original vision. Any person or company can use my code (or Kefir code) under the terms of the GPL, just as I use code given away by companies under GPL or even more liberal licenses for free. That's the deal. GPL is a license explicitly designed to maximize use, so it doesn't make sense to object to a specific form of use. The claim that AI companies are somehow violating GPL by training on GPL code is legally baseless (I studied law here in Switzerland and had lectures about international IP law), and the FSF itself has not claimed otherwise. Even if it were prohibited, it would be a copyright enforcement problem, and not a reason to stop publishing. I don't know Kefir, but it looks like a great (even optimizing) compiler. So it's really a pity that its development is no longer open source.