(github.com)

46 points mousomashakel | 2 comments | 17 May 25 04:39 UTC

Hey HN, I’ve built Fahmatrix, a minimal, fast Java library for working with tabular data — inspired by Python’s pandas, but designed for performance and simplicity on the JVM.

After working extensively with Python’s data stack, I often ran into limitations related to speed, especially in larger or long-running data workflows. So I built Fahmatrix from scratch to offer similar APIs for manipulating CSVs, performing summary statistics, slicing rows/columns, and more — but all in Java.

Features:

Lightweight and dependency-free

CSV/TSV import with auto-headers

Series/DataFrame structures (like pandas)

describe(), mean(), stdDev(), percentile() and more

Fast parallel operations on numeric columns

Java 17+ support
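
To make the feature list concrete, here is a minimal sketch of what header-aware CSV loading plus mean, standard deviation, and percentile over a numeric column can look like using only the JDK. It does not use Fahmatrix's API: the class `ColumnStatsSketch` and its methods `readNumericCsv`, `mean`, `stdDev`, and `percentile` are hypothetical names chosen for illustration. It also assumes unquoted, all-numeric cells, a sample (n - 1) standard deviation, and linear interpolation between ranks for percentiles; the library may make different choices.

```java
import java.util.Arrays;
import java.util.LinkedHashMap;
import java.util.Map;

// Illustrative only: plain JDK code, NOT Fahmatrix's actual API.
final class ColumnStatsSketch {

    // Parses simple comma-separated text (no quoted fields, numeric cells only)
    // into named columns, treating the first line as the header row.
    static Map<String, double[]> readNumericCsv(String csv) {
        String[] lines = csv.strip().split("\\R");
        String[] headers = lines[0].split(",");
        double[][] cols = new double[headers.length][lines.length - 1];
        for (int r = 1; r < lines.length; r++) {
            String[] cells = lines[r].split(",");
            for (int c = 0; c < headers.length; c++) {
                cols[c][r - 1] = Double.parseDouble(cells[c].trim());
            }
        }
        Map<String, double[]> columns = new LinkedHashMap<>();
        for (int c = 0; c < headers.length; c++) {
            columns.put(headers[c].trim(), cols[c]);
        }
        return columns;
    }

    // Arithmetic mean, computed with a parallel stream; NaN for an empty column.
    static double mean(double[] values) {
        return Arrays.stream(values).parallel().average().orElse(Double.NaN);
    }

    // Sample standard deviation (divides by n - 1); NaN for fewer than two values.
    static double stdDev(double[] values) {
        if (values.length < 2) return Double.NaN;
        double m = mean(values);
        double sumSq = Arrays.stream(values).parallel()
                .map(v -> (v - m) * (v - m))
                .sum();
        return Math.sqrt(sumSq / (values.length - 1));
    }

    // Percentile for p in [0, 100], linearly interpolating between the two nearest ranks.
    static double percentile(double[] values, double p) {
        if (values.length == 0) return Double.NaN;
        double[] sorted = values.clone();   // leave the caller's array untouched
        Arrays.parallelSort(sorted);
        double rank = (p / 100.0) * (sorted.length - 1);
        int lo = (int) Math.floor(rank);
        int hi = (int) Math.ceil(rank);
        return sorted[lo] + (rank - lo) * (sorted[hi] - sorted[lo]);
    }

    public static void main(String[] args) {
        Map<String, double[]> table = readNumericCsv("price,qty\n1,10\n2,20\n3,30\n4,40\n");
        double[] price = table.get("price");
        System.out.println("mean   = " + mean(price));
        System.out.println("stdDev = " + stdDev(price));
        System.out.println("median = " + percentile(price, 50));
    }
}
```

Parallel streams only pay off on large columns; for a handful of rows the sequential versions are usually faster, so a real library would likely switch on column size.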

Docs: https://moustafa-nasr.github.io/Fahmatrix/

GitHub: https://github.com/moustafa-nasr/fahmatrix

I’d love feedback from the Java and data communities — especially if you’ve ever wanted a simple dataframe utility in Java without needing full-scale ML libraries.

Happy to answer any questions!

1. skanga [17 May 25 05:57 UTC] No.44012325

>>44012074 (OP) #

What about Tablesaw or Apache Arrow? How does this compare?

replies(1): >>44019074 #

2. mousomashakel [18 May 25 04:48 UTC] No.44019074

>>44012325 (TP) #

Good question. I’ll publish benchmarks soon, but the core difference is that Fahmatrix is pure Java with no JNI and deliberately minimal, which makes it a good fit for small projects or environments like Android. Tablesaw and Arrow are more powerful, but heavier. Fahmatrix aims to be the “just enough” middle ground.


Show HN: Fahmatrix – A Lightweight, Pandas-Like DataFrame Library for Java