31 points by acc_10000 4 days ago | 13 comments | View on ycombinator
acc_10000 4 days ago |
ata-sesli 1 day ago |
stephantul 1 day ago |
KeplerBoy 1 day ago |
If that's your bar for quality, I'll think less of your code. I can't help it.
Also your saxpy example seems to be daxpy. s and d are short for single or double precision.
alephnerd 1 day ago |
Plenty of people on HN wish to bury their head under the sand, but this highlights how critical it is becoming to be both a good engineer and adept at using agentic tooling within your development lifecycle.
ironkernel lets you write element-wise expressions with a Python decorator, compiles them to a Rust expression tree at definition time, and executes via rayon on all cores. ~2k lines of Rust, ~500 lines of Python.
The win is expression fusion: NumPy evaluates `where(x > 0, sqrt(abs(x)) + sin(x), 0)` as 5 passes with 4 temporaries. ironkernel fuses into 1 pass, zero temporaries, and skips dead branches (no NaN from sqrt of negatives). 2.25x NumPy on compound expressions at 10M elements. For BLAS ops like SAXPY, NumPy is faster — ironkernel doesn't call BLAS.
Early stage: f64 only, 1-D only, expression subset only (intentional — parallel safety guarantee). Numba warm is 3.2x faster (LLVM JIT vs interpreter).