
173 points daviducolo
daviducolo ◴[] No.43333994[source]
You can read my blog post about the project at https://dev.to/daviducolo/introducing-krep-building-a-high-p...
replies(4): >>43335277 #>>43335647 #>>43337748 #>>43339241 #
geocar ◴[] No.43335277[source]
Hi David.

    $ (for x in `seq 1 100000`; do echo 'I am a Test Vector HeLlO World '"$x"; done) > /dev/shm/krep_tmp
Best of three runs shown:

    $ time ./krep -i hello /dev/shm/krep_tmp
    Found 43721 matches
    Search completed in 0.0017 seconds (2017.44 MB/s)
    Search details:
      - File size: 3.52 MB
      - Pattern length: 5 characters
      - Using AVX2 acceleration
      - Case-insensitive search
    real        0m0,005s
    user        0m0,001s
    sys         0m0,004s
    $ time ./krep HeLlO /dev/shm/krep_tmp
    Found 82355 matches
    Search completed in 0.0014 seconds (1259.72 MB/s)
    Search details:
      - File size: 1.71 MB
      - Pattern length: 5 characters
      - Using AVX2 acceleration
      - Case-sensitive search
    real        0m0,004s
    user        0m0,003s
    sys         0m0,004s
    $ time ./krep -i "HeLlO World" /dev/shm/krep_tmp
    Found 99958 matches
    Search completed in 0.0021 seconds (1700.54 MB/s)
    Search details:
      - File size: 3.52 MB
      - Pattern length: 11 characters
      - Using AVX2 acceleration
      - Case-insensitive search
    real        0m0,005s
    user        0m0,002s
    sys         0m0,004s
    $ time ./krep "I am a Test Vector HeLlO World" /dev/shm/krep_tmp
    Found 3964 matches
    Search completed in 0.0149 seconds (235.83 MB/s)
    Search details:
      - File size: 3.52 MB
      - Pattern length: 30 characters
      - Using AVX2 acceleration
      - Case-sensitive search
    real        0m0,016s
    user        0m0,015s
    sys         0m0,001s
    $ time ./krep -i "I am a Test Vector hello World" /dev/shm/krep_tmp
    Found 3964 matches
    Search completed in 0.0178 seconds (197.70 MB/s)
    Search details:
      - File size: 3.52 MB
      - Pattern length: 30 characters
      - Using AVX2 acceleration
      - Case-insensitive search
    real        0m0,021s
    user        0m0,017s
    sys         0m0,004s
Benchmark with fgrep (the first run was good enough):

    $ time fgrep -ci hello /dev/shm/krep_tmp
    100000
    real        0m0,003s
    user        0m0,003s
    sys         0m0,000s
    $ time fgrep -ci "I am a Test Vector hello World" /dev/shm/krep_tmp
    100000
    real        0m0,010s
    user        0m0,009s
    sys         0m0,000s
    $ time fgrep -c "I am a Test Vector HeLlO World" /dev/shm/krep_tmp
    100000
    real        0m0,005s
    user        0m0,004s
    sys         0m0,001s
The CPU (model name) is an Intel(R) Core(TM) i9-10900K CPU @ 3.70GHz. There's 40 GB of RAM free and 10 cores doing nothing. The shell is in a cpuset. This is on commit 95ed1853b561396c8a8bcbbdd115ed6273848e3f (HEAD -> main, origin/main, origin/HEAD). gcc is 13.3.0-6ubuntu2~24.04.

tl;dr: krep produces obviously wrong results, and does so more slowly than fgrep.
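
For anyone who wants to check the expected counts independently: every generated line contains "HeLlO" exactly once, so both the number of matching lines and the number of occurrences should be 100000, which is what fgrep reports above and what none of the krep runs report. The sketch below rebuilds the file in a temporary directory (a hypothetical path, not /dev/shm) and counts both ways. It uses `grep -F` rather than `fgrep`, since recent GNU grep releases flag `fgrep` as obsolescent. It assumes krep's "Found N matches" figure is meant to count occurrences.

```shell
# Build the same 100000-line test file in a temporary directory (hypothetical path).
tmp=$(mktemp -d)
haystack="$tmp/krep_tmp"
for x in $(seq 1 100000); do
  echo "I am a Test Vector HeLlO World $x"
done > "$haystack"

# Count matching lines (this is what `grep -c` reports).
lines=$(grep -F -c 'HeLlO' "$haystack")

# Count individual occurrences (assumption: this is what a "matches" figure means).
# `tr -d ' '` strips the padding some `wc` implementations add.
occurrences=$(grep -F -o 'HeLlO' "$haystack" | wc -l | tr -d ' ')

# Case-insensitive line count, for comparison with the -i runs.
lines_ci=$(grep -F -i -c 'hello' "$haystack")

echo "lines=$lines occurrences=$occurrences lines_ci=$lines_ci"
```

Because each line holds exactly one occurrence, the line count and the occurrence count agree here, so the difference between `-c` and a per-occurrence count cannot explain the krep numbers.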

replies(2): >>43335306 #>>43340892 #
burntsushi ◴[] No.43335306[source]
Consider using a bigger haystack. Your timings are so short that you're mostly just measuring the overhead of running a process.

This is relevant to krep because it spawns threads to search files (I guess for files over 1MB?).

This does not mean your benchmark is worthless. It just means you can't straightforwardly generalize from it.
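
One way to get a bigger haystack, as a rough sketch: write the numbered test lines once and repeat that block many times, so the expected match count is still known in advance. The function name `make_haystack` and its arguments are made up for illustration, and the demo values are deliberately small.

```shell
# make_haystack N_LINES N_COPIES OUT (hypothetical helper):
# write N_LINES numbered test lines, repeated N_COPIES times, to OUT.
make_haystack() {
  n_lines=$1
  n_copies=$2
  out=$3
  block=$(mktemp)
  for x in $(seq 1 "$n_lines"); do
    echo "I am a Test Vector HeLlO World $x"
  done > "$block"
  : > "$out"                      # truncate the output file
  for i in $(seq 1 "$n_copies"); do
    cat "$block" >> "$out"
  done
  rm -f "$block"
}

# Small demo values; raise them for a real benchmark.
demo=$(mktemp)
make_haystack 1000 50 "$demo"

# Each line contains the needle once, so the expected count is lines x copies.
expected=$((1000 * 50))
actual=$(grep -F -c 'HeLlO' "$demo")
echo "expected=$expected actual=$actual"
```

For an actual measurement, something like `make_haystack 100000 100 /dev/shm/krep_big` gives a file of a few hundred MB, and each tool can then be run under `time` several times so that process startup stops dominating the result.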

replies(3): >>43335857 #>>43336268 #>>43340461 #
fanf2 ◴[] No.43336268[source]
The incorrect results are far more important than the times!
replies(1): >>43337371 #
1. burntsushi ◴[] No.43337371[source]
I agree.