3883 points kuroguro | 4 comments
z92 ◴[] No.26296975[source]
Also note that the way he fixed it, strlen only caches the last call and returns quickly on an immediate second call with the same string.
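To make the "caches only the last call" behaviour concrete, here is a minimal sketch. It is not the actual patch; `cached_strlen` and its two static variables are hypothetical names, and the cache is keyed on the pointer value alone.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical single-entry cache: remembers only the most recent string. */
static const char *cached_ptr = NULL;
static size_t      cached_len = 0;

/* Returns strlen(s), skipping the scan when s is the same pointer as in
   the previous call. Any call with a different pointer evicts the entry. */
static size_t cached_strlen(const char *s) {
    assert(s != NULL);
    if (s == cached_ptr) {
        return cached_len;      /* fast path: immediate repeat call */
    }
    cached_ptr = s;             /* slow path: scan once and remember */
    cached_len = strlen(s);
    return cached_len;
}
```

A cache like this is only safe if the string is not modified or freed between calls, and it is not thread-safe as written. It helps only in the narrow pattern where the same buffer is measured over and over.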

Another reason why C-style null-terminated strings suck. Use a class or structure that stores both the string pointer and its length.
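A rough sketch of what such a structure could look like in plain C. The type and function names (`counted_str`, `counted_str_init`, and so on) are made up for illustration and do not come from any particular library.

```c
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical counted-string type. */
typedef struct {
    char  *data;  /* heap buffer, kept NUL-terminated for C interop */
    size_t len;   /* number of bytes, excluding the terminator */
} counted_str;

/* Build from a C string. This is the only place the bytes are scanned.
   Returns 0 on success, -1 if allocation fails. */
static int counted_str_init(counted_str *s, const char *src) {
    assert(s != NULL && src != NULL);
    size_t n = strlen(src);
    char *buf = malloc(n + 1);
    if (buf == NULL) {
        return -1;
    }
    memcpy(buf, src, n + 1);    /* copy the terminator too */
    s->data = buf;
    s->len  = n;
    return 0;
}

/* O(1): reads the stored field instead of walking to the terminator. */
static size_t counted_str_len(const counted_str *s) {
    return s->len;
}

/* Release the buffer and reset the fields. */
static void counted_str_free(counted_str *s) {
    free(s->data);
    s->data = NULL;
    s->len  = 0;
}
```

Keeping the terminator in the buffer means `s.data` can still be handed to any function that expects a C string, while length queries never rescan the bytes.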

I have seen other programs where strlen was gobbling up 95% of execution time.

replies(3): >>26297107 #>>26299247 #>>26299304 #
1. ip26 ◴[] No.26297107[source]
Could this be worked into a compiler/stdlib from the back-end? Could a compiler/stdlib quietly treat all strings as a struct of {length,string} and redefine strlen to just fetch the length field? Perhaps setting a hook to transparently update "length" when "string" is updated is not trivial.

Edit: hah, I'm decades late to the party, here we go:

Most modern libraries replace C strings with a structure containing a 32-bit or larger length value (far more than were ever considered for length-prefixed strings), and often add another pointer, a reference count, and even a NUL to speed up conversion back to a C string. Memory is far larger now, such that if the addition of 3 (or 16, or more) bytes to each string is a real problem the software will have to be dealing with so many small strings that some other storage method will save even more memory (for instance there may be so many duplicates that a hash table will use less memory). Examples include the C++ Standard Template Library std::string...

https://en.wikipedia.org/wiki/Null-terminated_string

replies(3): >>26297286 #>>26298102 #>>26298111 #
2. toast0 ◴[] No.26297286[source]
I don't think you could do it transparently, because C code is expected to be able to pass the tail of a character array by doing &s[100] or s + 100, etc. I don't think it would be easy to catch all of those and turn them into a string fragment reference.
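For illustration, an explicit "string fragment reference" might look like the sketch below. `str_view` and its helpers are hypothetical names. The point is that taking a tail has to go through a function that adjusts the length, which a bare `s + 100` in existing code would not do.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical non-owning fragment: a pointer into someone else's buffer
   plus the number of bytes that belong to the fragment. */
typedef struct {
    const char *ptr;
    size_t      len;
} str_view;

/* Wrap a whole C string; the single scan happens here. */
static str_view str_view_from_cstr(const char *s) {
    assert(s != NULL);
    str_view v = { s, strlen(s) };
    return v;
}

/* Counterpart of `s + offset`, with the length carried along.
   An offset past the end is clamped, giving an empty fragment. */
static str_view str_view_tail(str_view v, size_t offset) {
    if (offset > v.len) {
        offset = v.len;
    }
    str_view t = { v.ptr + offset, v.len - offset };
    return t;
}
```

Taking a tail is O(1) and the result still knows its length, but every call site has to use `str_view_tail` instead of raw pointer arithmetic, which is why this is hard to retrofit invisibly.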

From C++ class, std::string was easy enough to use everywhere, and you just call foo.c_str() when you need to send it to a C library. But that may drag in a lot of assumptions about memory allocation and whatnot. Clearly, we don't want to allocate when taking 6 minutes to parse 10 megs of JSON! :)

3. ◴[] No.26298102[source]
4. magicalhippo ◴[] No.26298111[source]
If only C had introduced an actual string type...