3883 points kuroguro | 1 comments
ufo ◴[] No.26297612[source]
The part that puzzles me the most was this comment about sscanf:

> To be fair I had no idea most sscanf implementations called strlen so I can’t blame the developer who wrote this.

Is this true? Is sscanf really O(N) on the size of the string? Why does it need to call strlen in the first place?

replies(1): >>26298300 #
JdeBP ◴[] No.26298300[source]
I think that the author hasn't checked them all. Even this isn't checking them all.

The musl C library's sscanf() does not do this, but it does call memchr() on limited substrings of the input string as it refills its input buffer, so it's not entirely free of this behaviour.

* https://git.musl-libc.org/cgit/musl/tree/src/stdio/vsscanf.c

The sscanf() in Microsoft's C library does this because it all passes through a __stdio_common_vsscanf() function which uses length-counted rather than NUL-terminated strings internally.

* https://github.com/tpn/winsdk-10/blob/master/Include/10.0.16...

* https://github.com/huangqinjin/ucrt/blob/master/inc/corecrt_...

The GNU C library does something similar, using a FILE structure alongside a special "operations" table, with a __rawmemchr() call in the initialization.

* https://github.com/bminor/glibc/blob/master/libio/strops.c#L...

* https://github.com/bminor/glibc/blob/master/libio/strfile.h#...

The FreeBSD C library does not use a separate "operations" table, but its vsscanf() still calls strlen() on the input string when it sets up its FILE structure.

* https://github.com/freebsd/freebsd-src/blob/main/lib/libc/st...

A glib summary is that sscanf() in these implementations has to set up state on every call that fscanf() has the luxury of keeping around over multiple calls in the FILE structure. They're setting up special nonce FILE objects for each sscanf() call, and that involves finding out how long the input string is every time.

It is food for thought. How much could life be improved if these implementations exported the way to set up these nonce FILE structures from a string, and callers used fscanf() instead of sscanf()? How many applications are scanning long strings with lots of calls to sscanf()?
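
For concreteness, the pattern in question looks roughly like the sketch below. The helper name count_ints_sscanf is hypothetical and the input format (whitespace-separated integers) is an assumption for illustration. On an implementation that measures the string on every call, each iteration would re-scan the whole remaining input before parsing one token, which is where the quadratic cost would come from.

```c
#include <stdio.h>

/* Hypothetical helper: counts whitespace-separated integers in `s`
 * by calling sscanf() once per token. */
int count_ints_sscanf(const char *s)
{
    int count = 0;
    int value;
    int consumed;

    /* %n stores how many characters this call has consumed, so the
     * caller can advance past the token it just parsed. */
    while (sscanf(s, "%d%n", &value, &consumed) == 1) {
        s += consumed;   /* step over the parsed token */
        count++;
    }
    return count;
}
```

The code is correct as written; the concern is only how much hidden work each sscanf() call may do on a long input.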

replies(6): >>26298762 #>>26298773 #>>26300532 #>>26301737 #>>26307663 #>>26352655 #
1. froh ◴[] No.26352655[source]
fmemopen() has been standard since POSIX.1-2008.

just wrap the string in a FILE, explicitly setting the buffer size to strlen(s), use fscanf() in the loop, and fasten your seatbelts...

https://pubs.opengroup.org/onlinepubs/9699919799/functions/f...
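
A minimal sketch of that idea, assuming a POSIX system and an input of whitespace-separated integers. The helper name sum_ints is made up for illustration. The string is measured once, wrapped with fmemopen(), and then read with fscanf() in a loop, so the stream position carries over between calls.

```c
#ifndef _POSIX_C_SOURCE
#define _POSIX_C_SOURCE 200809L   /* ask for the fmemopen() declaration */
#endif
#include <stdio.h>
#include <string.h>

/* Hypothetical helper: sums the whitespace-separated integers in `s`.
 * Returns how many integers were parsed, or -1 if the stream could
 * not be opened. */
int sum_ints(const char *s, long *sum)
{
    int count = 0;
    long value;
    size_t len = strlen(s);   /* the only length scan of the input */
    FILE *f;

    *sum = 0;
    if (len == 0)
        return 0;             /* fmemopen() may reject a zero-size buffer */

    /* fmemopen() takes a non-const pointer; mode "r" only reads from it. */
    f = fmemopen((void *)s, len, "r");
    if (f == NULL)
        return -1;

    /* The FILE keeps the read position between fscanf() calls. */
    while (fscanf(f, "%ld", &value) == 1) {
        *sum += value;
        count++;
    }
    fclose(f);
    return count;
}
```

Whether this is actually faster depends on the C library, so it is worth measuring before relying on it.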