Show HN: TokenDagger – A tokenizer faster than OpenAI's Tiktoken

(github.com)

TokenDagger is a drop-in replacement for OpenAI’s Tiktoken (the tokenizer behind Llama 3, Mistral, GPT-3.*, etc.). It’s written in C++17 with thin Python bindings, keeps the exact same BPE vocab/special-token rules, and focuses on raw speed.

I’m teaching myself LLM internals by re-implementing the stack from first principles. Profiling Tiktoken’s Python/Rust implementation showed that a lot of time was spent on regex matching. Most of my perf gains come from a) using a faster JIT-compiled regex engine; and b) simplifying the algorithm to avoid regex matching for special tokens altogether.
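To illustrate point b), here is a minimal Python sketch of splitting text around special tokens with plain substring search instead of a regex. It is not TokenDagger's actual code: the function name `split_on_special_tokens` and the example token strings are assumptions made for illustration, and a real implementation would likely use a faster multi-pattern search.

```python
def split_on_special_tokens(text, special_tokens):
    """Split text into (chunk, is_special) pairs using str.find, with no regex.

    Hypothetical helper for illustration; not part of TokenDagger or Tiktoken.
    """
    pieces = []
    pos = 0
    while pos < len(text):
        # Find the earliest special token at or after pos.
        # On a tie in position, prefer the longest token.
        best_idx, best_tok = -1, None
        for tok in special_tokens:
            if not tok:
                continue  # skip empty strings, which would never advance pos
            idx = text.find(tok, pos)
            if idx == -1:
                continue
            if (best_tok is None or idx < best_idx
                    or (idx == best_idx and len(tok) > len(best_tok))):
                best_idx, best_tok = idx, tok
        if best_tok is None:
            # No special token remains: the rest is ordinary text.
            pieces.append((text[pos:], False))
            break
        if best_idx > pos:
            pieces.append((text[pos:best_idx], False))
        pieces.append((best_tok, True))
        pos = best_idx + len(best_tok)
    return pieces
```

The ordinary chunks would then go through the usual pre-tokenization regex and BPE merge, while the special chunks map directly to their reserved token IDs.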

Benchmarking code is included. Notable results show:

- 4x faster code sample tokenization on a single thread.
- 2-3x higher throughput when tested on a 1GB natural language text file.
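For readers who want to run a similar comparison themselves, a throughput measurement can be sketched roughly as follows. This is not the benchmark code shipped with the project: `measure_throughput` is a hypothetical helper, and `encode` stands for whatever tokenizer callable is being compared.

```python
import time


def measure_throughput(encode, texts, repeats=3):
    """Return best-of-N throughput in MB/s for an encode(text) callable.

    Hypothetical helper for illustration; `encode` is any function that
    takes a string, such as a tokenizer's encode method.
    """
    total_bytes = sum(len(t.encode("utf-8")) for t in texts)
    best = float("inf")
    for _ in range(repeats):
        start = time.perf_counter()
        for t in texts:
            encode(t)
        best = min(best, time.perf_counter() - start)
    if best <= 0:
        return float("inf")  # timer resolution too coarse for this input
    return total_bytes / best / 1e6


# Usage with a stand-in "tokenizer" (whitespace splitting):
mb_per_s = measure_throughput(str.split, ["hello world"] * 1000)
```

Taking the best of several runs reduces noise from caches and background load; passing the same `texts` to each tokenizer keeps the comparison fair.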

npalli ◴[30 Jun 25 13:13 UTC] No.44422888[source]▶

>>44422480 (OP) #

Kudos. I think (in the short term at least) there is a large amount of perf optimization to be found by coding parts of the whole AI/ML infrastructure in C++ like this one, not as a rewrite (god no!) but as drop-in fixes for key bottlenecks. Anytime I see someone (seems Chinese engineers are good at this) put something out in C++, there's a good chance some solid engineering tradeoffs have been made and a dramatic improvement will be seen.

replies(4): >>44424382 #>>44424572 #>>44424990 #>>44427963 #

1. matthewolfe ◴[30 Jun 25 15:16 UTC] No.44424382[source]▶

>>44422888 #

Agreed. A former mentor of mine told me a nice way of viewing software development:

1. Make it work. 2. Make it fast. 3. Make it pretty.

Transformers & LLMs have been developed to a point where they work quite well. I feel as though we're at a stage where most substantial progress is being made on the performance side.

replies(3): >>44424439 #>>44425934 #>>44426074 #

2. diggan ◴[30 Jun 25 15:22 UTC] No.44424439[source]▶

>>44424382 (TP) #

Heh, seems the people I've been learning from have been biased away from beauty, as I know it as "Make It Work, Make It Right, Make It Fast".

replies(5): >>44424671 #>>44425051 #>>44425719 #>>44428459 #>>44428747 #

3. abybaddi009 ◴[30 Jun 25 15:44 UTC] No.44424671[source]▶

>>44424439 #

What's the difference between make it work and make it right? Aren't they the same thing?

replies(4): >>44424691 #>>44424757 #>>44424773 #>>44424931 #

4. stavros ◴[30 Jun 25 15:47 UTC] No.44424691{3}[source]▶

>>44424671 #

Yeah, if it's not right, it doesn't work.

replies(3): >>44424765 #>>44424906 #>>44425794 #

5. robertfw ◴[30 Jun 25 15:53 UTC] No.44424757{3}[source]▶

>>44424671 #

Making it work can be a hacky, tech-debt-laden implementation. Making it right involves refactoring/rewriting with an eye towards maintainability, testability, etc.

6. darknoon ◴[30 Jun 25 15:54 UTC] No.44424765{4}[source]▶

>>44424691 #

In ML, often it does work to a degree even if it's not 100% correct. So getting it working at all is all about hacking b/c most ideas are bad and don't work. Then you'll find wins by incrementally correcting issues with the math / data / floating point precision / etc.

7. ◴[30 Jun 25 15:55 UTC] No.44424773{3}[source]▶

>>44424671 #

8. DSingularity ◴[30 Jun 25 16:06 UTC] No.44424906{4}[source]▶

>>44424691 #

Not true. Things can work with hacks. Your standards might consider it unacceptable to have hacks. So you can have a “make it right” stage.

9. gopalv ◴[30 Jun 25 16:08 UTC] No.44424931{3}[source]▶

>>44424671 #

> make it work and make it right?

My mentor used to say it is the difference between a screw and glue.

You can glue some things together and prove that it works, but eventually you learn that anytime you had to break something to fix it, you should've used a screw.

It is a trade-off in coupling: the glue binds tightly over the entire surface, but a screw concentrates the loads, so it needs maintenance to stay tight.

You only really know which is "right" if you test it to destruction.

All of that advice probably sounds dated now; even in materials science the glue might be winning (see the Tesla bumper or Lotus Elise bonding videos - every screw is extra grams).

10. kevindamm ◴[30 Jun 25 16:20 UTC] No.44425051[source]▶

>>44424439 #

I've usually heard/said it as

  1. Make it
  2. Make it work
  3. Make it work better

(different circumstances have different nuances about what "better" means, it isn't always performance optimization; some do substitute "faster" for "better" here, but I think it loses generality then).

replies(1): >>44430314 #

11. gabrielhidasy ◴[30 Jun 25 17:17 UTC] No.44425719[source]▶

>>44424439 #

I always heard the "Make it Right" as "Make it Beautiful", where Right and Beautiful would mean "non-hacky, easily maintainable, easily extendable, well tested, and well documented"

12. gabrielhidasy ◴[30 Jun 25 17:24 UTC] No.44425794{4}[source]▶

>>44424691 #

Depends on your definition of "right" and "work". It could be a big ball of mud that always returns exactly the required response (so it 'works'), but be hellishly hard to change and very picky about dependencies and environment (so it's not 'right').

replies(1): >>44425846 #

13. stavros ◴[30 Jun 25 17:29 UTC] No.44425846{5}[source]▶

>>44425794 #

Nope, it's right, but it's not pretty.

14. binarymax ◴[30 Jun 25 17:38 UTC] No.44425934[source]▶

>>44424382 (TP) #

The Huggingface transformers lib is currently undergoing a refactor to get rid of cruft and make it more extensible, hopefully with some perf gains.

15. jotux ◴[30 Jun 25 17:51 UTC] No.44426074[source]▶

>>44424382 (TP) #

A similar concept dates back to 30 BC: https://en.wikipedia.org/wiki/De_architectura

Firmitas, utilitas, venustas - Strong, useful, and beautiful.

16. matthewolfe ◴[30 Jun 25 22:11 UTC] No.44428459[source]▶

>>44424439 #

Fair chance I'm remembering it wrong :D

17. mindcrime ◴[30 Jun 25 22:48 UTC] No.44428747[source]▶

>>44424439 #

I've always heard it (and said it) as:

  1. Make it work
  2. Make it correct
  3. Make it fast

18. acosmism ◴[01 Jul 25 03:34 UTC] No.44430314{3}[source]▶

>>44425051 #

i like this version best
