94 points jpiech | 2 comments

Process large (e.g. 4GB+) data sets in a spreadsheet.

Load multi-GB, 32-million-row files in seconds and work with them without crashes, using up to about 500GB of RAM.

Load, edit in place, split, merge, and clean CSV/text files with up to 32 million rows and 1 million columns.

Use your Python functions as UDF formulas that can return images and entire CSV files to GS-Calc.
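
For example, such a UDF could be an ordinary Python function. A minimal sketch follows; the function name and signature are hypothetical, and the GS-Calc-specific registration and calling convention is not shown:

    # Hypothetical UDF body: a plain Python function. How GS-Calc discovers
    # and calls it is product-specific and not shown here.
    def weighted_average(values, weights):
        """Return the weighted average of two equally long lists of numbers."""
        total_weight = sum(weights)
        if total_weight == 0:
            return 0.0
        return sum(v * w for v, w in zip(values, weights)) / total_weight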

Use a set of statistical pivot data functions.

Use solver functions with virtually no limit on the number of variables.

Create and display all popular chart types with millions of data points instantly.

Suggestions for improvements are welcome (and often implemented quite quickly).

tianqi ◴[] No.43799435[source]
It is an interesting tool. I've been struggling with Office Excel's inability to open large files. I always work with CSV in Python, and if a client must review the data in Excel, I take a random sample to generate a smaller file, then explain to the client that we can't open the whole thing in Excel. This really doesn't feel like a modern way of working.
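
For that sampling step, a minimal sketch with pandas (file names and the sample size are made up, and this assumes the full CSV fits in memory):

    import pandas as pd

    df = pd.read_csv("full_export.csv")                    # full data set
    sample = df.sample(n=100_000, random_state=42)         # reproducible random sample
    sample.to_csv("client_review_sample.csv", index=False)

For files too large for memory, the same idea can be applied chunk by chunk with read_csv(..., chunksize=...).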

"a slow, old pc with 8GB RAM"

By the way, this struck me as a bit of period humour: oh god, it has 8GB RAM. Cheers to the good old days.

replies(2): >>43800204 #>>43801703 #
1. samzub ◴[] No.43801703[source]
Fully agree. I have found that above 300k rows Excel struggles even on a good laptop. Not to mention the Python integration in MS Excel, which is so unbearably slow that it is much better to perform the calculations outside of Excel first.
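
As a sketch of "calculations outside of Excel first": pre-aggregate with pandas and hand Excel only the small result (file and column names are made up; writing .xlsx assumes an engine such as openpyxl is installed):

    import pandas as pd

    df = pd.read_csv("transactions.csv")                            # raw data, millions of rows
    summary = df.groupby("region", as_index=False)["amount"].sum()  # aggregate outside Excel
    summary.to_excel("summary_for_excel.xlsx", index=False)         # small file Excel opens easily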

I am sold on the website's look and the license model!

replies(1): >>43802013 #
2. microflash ◴[] No.43802013[source]
These days I use DuckDB to read massive Excel files. DuckDB now ships with a nice local UI, and it also works beautifully with DataGrip, my preferred database IDE. With SQL, it just becomes a matter of applying some good old elbow grease to do whatever analysis I want.
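
A minimal sketch of that workflow with the duckdb Python package (file name and query are made up; CSV works out of the box via read_csv_auto, while reading .xlsx directly requires a DuckDB extension):

    import duckdb

    result = duckdb.sql("""
        SELECT category, COUNT(*) AS n_rows, AVG(amount) AS avg_amount
        FROM read_csv_auto('big_export.csv')
        GROUP BY category
        ORDER BY n_rows DESC
    """).df()   # materialize the result as a pandas DataFrame

    print(result.head())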