Hacker News

25 Years of Eggs (https://www.john-rush.com)

280 points by avyfain 5 days ago | 75 comments

cheschire about 22 hours ago |

Absolutely loved the article, the process, and the results. Hated the price.

You could pay a human to read receipts at 1 every 30 seconds (that’s slow!) for $15/hr (twice the US federal minimum wage!). Add tax and overhead ($15 × 1.35) and it comes out to $20.25/hr over 5 hours: $101 all in.

Sure, sure, a human solution doesn’t scale. But this sort of project makes me feel like we haven’t quite hit the industrialization moment that I thought we had.
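The back-of-the-envelope estimate above can be checked with a short sketch. The receipt count of 600 is an assumption: it is what 5 hours at one receipt every 30 seconds implies, and the comment does not state it directly. The function and parameter names are made up for illustration.

```python
def human_transcription_cost(receipts, seconds_per_receipt=30,
                             hourly_wage=15.0, overhead=1.35):
    """Return (hours, loaded hourly rate, total cost) for manual keying."""
    hours = receipts * seconds_per_receipt / 3600  # seconds -> hours
    loaded_rate = hourly_wage * overhead           # wage plus tax and overhead
    return hours, loaded_rate, hours * loaded_rate


# Assumed workload: 600 receipts, i.e. 5 hours at 30 seconds each.
hours, loaded_rate, total = human_transcription_cost(600)
print(f"{hours:.1f} h at ${loaded_rate:.2f}/h = ${total:.2f}")
```

Changing `seconds_per_receipt` or `overhead` shows how sensitive the roughly $101 figure is to those guesses.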

ProllyInfamous about 20 hours ago |

>Everyone needs a rewarding hobby. I’ve been scanning all of my receipts since 2001. I never typed in a single price - just kept the images. I figured someday the technology to read them would catch up, and the data would be interesting.

This is perhaps among the best openers I've ever read.

[spoiler: the tech caught up, the data is interesting]

I read a lot. This article, entirely.

egeozcan about 22 hours ago |

I usually avoid shallow comments but I feel like this time it has to be said as a conversation starter: That's a lot of eggs!

Also, ignoring the benefits of subscriptions, an estimate on the order of thousands of dollars for extracting egg prices still makes me feel like we aren't "there" yet. This should have been a problem with a much more efficient solution given the advancements in AI, data analysis, and OCR. I am sort of disillusioned.

rkagerer about 6 hours ago |

10 years ago I wrote a reconciliation tool in VBA in Excel. I scan all the (mostly thermal-printed) receipts and it matches them to credit card charges. I always envisioned incorporating OCR to automatically extract the totals, but the libraries were never good enough for my taste (and I've used industry-leading ones in work settings that process millions of reads a day).

So instead, I made a very simple UI where you just key in the amount (literally 5 keystrokes on average per image) and it finds the matching charge (or hit enter to instantly cycle through all matches). I've done bookkeeping/taxes that way for a decade and keying has never been the bottleneck.

Recently I realized Amazon accounts for around a third of my credit card charges, by volume (yikes!). Unfortunately their transactions are more difficult to reconcile as portions of orders are charged piecemeal as they ship. Further, their webpage that is supposed to list your credit card charges with the matching order numbers is broken (lots of data missing - have reproduced and filed a bug report with their exec team which is still being worked on a month later).

So I wrote another tool. You download your order data and invoices via a personal data request, and it goes out and reconciles all of them. I wind up with a nice spreadsheet I can scroll around in, and whenever the cursor hits a row with an Amazon charge all the paperwork along with a generated order summary (granular down to the shipments and items) comes up on the screen to the right.

Pretty slick. And took less time to code up than his vibecoded project (but hats off to him anyway, sounds like a nice little project to hone your AI skills on). Sometimes these simple little bespoke tools are a far superior "productivity force multiplier" than fancy, generic commercial equivalents.
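The key-in-the-amount workflow described earlier in this comment might look roughly like the sketch below. It is not the commenter's tool (that one is VBA in Excel): the `Charge` class, `find_matches`, `reconcile`, and the sample data are all hypothetical names and values chosen for illustration. Amounts are held as `Decimal` so that cents compare exactly.

```python
from dataclasses import dataclass
from datetime import date
from decimal import Decimal
from typing import List, Optional


@dataclass
class Charge:
    posted: date
    merchant: str
    amount: Decimal
    receipt: Optional[str] = None  # receipt image matched to this charge, if any


def find_matches(charges: List[Charge], keyed_amount: str) -> List[Charge]:
    """Return unreconciled charges whose amount equals the keyed-in total."""
    amount = Decimal(keyed_amount)
    return [c for c in charges if c.receipt is None and c.amount == amount]


def reconcile(charges: List[Charge], keyed_amount: str,
              receipt_image: str, pick: int = 0) -> Optional[Charge]:
    """Attach a receipt to one matching charge; `pick` cycles through candidates."""
    matches = find_matches(charges, keyed_amount)
    if not matches:
        return None
    chosen = matches[pick % len(matches)]
    chosen.receipt = receipt_image
    return chosen


# Made-up sample data: two charges share the same amount.
charges = [
    Charge(date(2024, 3, 1), "GROCER", Decimal("23.17")),
    Charge(date(2024, 3, 2), "HARDWARE", Decimal("8.40")),
    Charge(date(2024, 3, 5), "GROCER", Decimal("23.17")),
]
first = reconcile(charges, "23.17", "scan_0001.png")
second = reconcile(charges, "23.17", "scan_0002.png")
missing = reconcile(charges, "99.99", "scan_0003.png")
print(first.posted, second.posted, missing)
```

Once a charge has a receipt attached it drops out of later searches, so duplicate amounts resolve one at a time; the `pick` argument stands in for the "hit enter to cycle" behavior.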

PaulHoule about 19 hours ago |

I am amused that in the classic 1955 Asimov story

https://en.wikipedia.org/wiki/Franchise_(short_story)

the protagonist is interviewed as a one-man "focus group" in lieu of a national election and one of the questions he is asked is "What do you think about the price of eggs?" and he said roughly "I have no idea, my wife does the shopping."

JumpCrisscross about 8 hours ago |

Expensive eggs are a political choice. Canada has eggs [1]. Mexico, too [2]. Meanwhile we have Tyson notching record profits [3] while facing zero antitrust scrutiny.

[1] https://www.npr.org/2025/03/18/nx-s1-5330454/egg-shortages-r...

[2] https://www.globalproductprices.com/rankings/egg_prices/

[3] https://farmaction.us/farm-action-calls-for-an-investigation...

ismailmaj about 20 hours ago |

I don't know why people mess with Tesseract in 2026; attention-based OCRs (and more recently VLMs) have outperformed any LSTM-based approach since at least 2020.

My guess is that it's the entry-point to OCR and the internet is flooded by that, just like pandas for data processing.

rendaw about 1 hour ago |

Okay, so this is good for tracking egg price changes (I guess? It was $1,591).

But if you put this into your accounting spreadsheet or whatever, you'd be off by a few cents all over the place, your account balances wouldn't match up. Then what do you do?

I've been looking into this and 96% isn't great. The solution is digital receipts... which are still being blocked by industry interests etc etc.

krogenx about 8 hours ago |

A bit of a shill comment but… I have chickens and have been tracking egg production in an app that I’ve built, a livestock manager of sorts called Manger.

Looking at my data, since we’ve had our first egg 743 days ago, our hens have produced 9,393 eggs, or an average of just above a dozen a day.

The app can also count chickens, since each chicken has a UHF RFID.

https://m.youtube.com/watch?v=_iGn_pZ3IkY

ttul about 11 hours ago |

Tokens consumed: 1.6 billion

Estimated token cost: $1,591

Wow.

PowerElectronix about 22 hours ago |

Inflation-adjusted data just tells us that either eggs have been outpacing the CPI for 25 years or the actual CPI is way higher than what the BLS calculates.

rdiddly about 17 hours ago |

This is the perfect job for AI, in that it's handling work the human didn't care enough to do manually. Although of course I don't care either. No value judgment there, just an observation. Imagine a place - a field let's say, part of a farm, long ago, but it had a road built through it, and thereby became a non-place, a patch of ground nobody dwells in or pays attention to or cares about, because when they're on it they're always heading somewhere else. The AI phenomenon is like that.

EdNutting about 20 hours ago |

The AI writing of the article made me give up halfway through. It’s a neat idea but the writing style of these AI models is brain-grating, especially when it’s the wrong style choice for this kind of technical report.

hbarka about 14 hours ago |

It’s so exciting to read more and more articles like this, using LLMs to discover clever solutions. I mean how many of us have dreamed of scanning years of receipts, waiting for that moment when you know a DIY solo application is at hand. I’m not being sarcastic, I too have a drawer full of Costco receipts which to me are data waiting for insight, not just crinkly paper. It’s more than being clever, it’s the realization of using a device not as a tool, but an equal partner who can suggest what tools and approaches to use. The end product of the LLM is not the point (although it can produce it better than ever), it’s the way an LLM can elevate messy knowledge work. A single person can now say that analysis knows no bounds.

eeixlk about 21 hours ago |

Apart from the comical cost of extracting this data from paper receipts, is it more likely that stores will publish their product costs over time so trends can be observed, or will they be more like gas stations where no prices are listed? I have no idea why a box of Cheerios costs $7 for processed oats, but I see millions of reasons to obscure that data.

gib444 about 22 hours ago |

> Estimated token cost $1,591

I can assume this person does in fact NOT need to worry about the price of eggs?

jtwaleson about 11 hours ago |

Hmm, I've been sending receipts straight into Gemini 3 Flash and it handles them just fine. No need for this whole pipeline and definitely MUCH cheaper. Am I missing something?

MarceliusK about 19 hours ago |

Overall this feels less like a quirky egg project and more like a blueprint for how messy real-world data pipelines are going to look going forward

tkgally about 22 hours ago |

I haven't tried it with receipts, but I've gotten excellent OCR results with Gemini 3.0 and now 3.1 on some challenging texts: handwritten letters I couldn't fully decipher myself, vertically printed Japanese texts with tiny furigana readings next to the kanji, a 19th century book in English with extensive use of italics and small caps. Gemini is good at extracting text and formatting from complex layouts, and it might work with egg receipts, too.

flurb about 21 hours ago |

Great article through and through. The total number of places you've bought eggs at made me feel a tad depressed though: 4 places where you lived at or spent a longer time, 5 you traveled to *.

I tend to grow bored of a location after a year or two, though I'm certainly in the minority.

* Of course you didn't buy eggs every time you traveled somewhere, so probably not the entire truth.

sgbeal about 22 hours ago |

> Estimated token cost $1,591

> Confirmed egg receipts 589

> Total egg spend captured $1,972

> Total eggs 8,604

...

> I can’t wait to see what 30 years of eggs looks like.

At $2.70 per receipt, I'd be in no hurry to find out!
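For anyone who wants to check the per-receipt figure, here is a small sketch using only numbers quoted in this thread (the 1.6 billion token count comes from another comment). The variable names are illustrative, and the totals are the article's estimates as quoted, not independently verified.

```python
# Figures as quoted in the thread.
token_cost_usd = 1591          # estimated token cost
confirmed_receipts = 589       # confirmed egg receipts
egg_spend_usd = 1972           # total egg spend captured
total_eggs = 8604              # total eggs
tokens_consumed = 1.6e9        # tokens consumed

cost_per_receipt = token_cost_usd / confirmed_receipts
token_cost_per_egg = token_cost_usd / total_eggs
price_paid_per_egg = egg_spend_usd / total_eggs
cost_per_million_tokens = token_cost_usd / (tokens_consumed / 1e6)

print(f"token cost per receipt:        ${cost_per_receipt:.2f}")
print(f"token cost per egg:            ${token_cost_per_egg:.3f}")
print(f"average price paid per egg:    ${price_paid_per_egg:.3f}")
print(f"token cost per million tokens: ${cost_per_million_tokens:.2f}")
```

Dividing by confirmed egg receipts is the harshest framing; the pipeline presumably processed many non-egg receipts too, so the cost per scanned receipt would be lower.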

smcg about 16 hours ago |

Many states passed requirements for cage free eggs that went into effect by end of 2024 so that has had some effect on prices.

dinohlm about 14 hours ago |

The most surprising thing about this whole story is that he's been scanning all his receipts for the past 25 years. I've never heard of anyone doing this before and don't really know why you would want to.

Still, it made for a somewhat interesting exploration of AI techniques.

Metacelsus about 17 hours ago |

And if the price reflected the externalities of factory farming, eggs would be even more expensive!

s1mn about 14 hours ago |

I'm such a sucker for a good, data-driven article. Love this.

brcmthrowaway about 15 hours ago |

Question: Do big chat providers tool call a dedicated OCR, or is it part of the LLM?

BoredPositron about 21 hours ago |

There is a reason why receipt transcription is still the task with the highest demand on Mechanical Turk.

DeathArrow about 20 hours ago |

Without 25 years of photographing receipts, weeks of agents coding, and billions of tokens spent, I can predict that egg prices increased and that the graph of my egg consumption over time is concave, partly because my income has risen and partly because, while all prices get inflated, eggs are still cheaper than other sources of protein. And I did it in less than 1 microsecond.

I will use them tokens to be able to afford more eggs.

twinpost_rules about 5 hours ago |

[dead]