SIMD can work quite well here too. With 128-bit SIMD (baseline on both x86-64 and AArch64), counting newlines takes at most 8 loop iterations: each iteration compares 16 bytes against the newline character and counts the matches, with an lzcnt on the last iteration. Counting characters works similarly: assuming valid UTF-8, a single comparison tests whether a byte starts a new char.
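A rough sketch of the counting part in Rust, in case it helps. The names `count_chunk` and `count_lines_and_chars` are made up for illustration. It uses SSE2 intrinsics on x86-64 and falls back to a plain scalar loop elsewhere (a NEON version is not shown). It only produces totals, so the lzcnt step is left out, and it assumes the input is valid UTF-8.

```rust
// Counts newlines and char-start bytes in one 16-byte chunk (x86-64, SSE2).
#[cfg(target_arch = "x86_64")]
fn count_chunk(chunk: &[u8]) -> (u32, u32) {
    use std::arch::x86_64::*;
    assert_eq!(chunk.len(), 16);
    // SAFETY: SSE2 is part of the x86-64 baseline, and the assert above
    // guarantees 16 readable bytes for the unaligned load.
    unsafe {
        let v = _mm_loadu_si128(chunk.as_ptr() as *const __m128i);
        // 0xFF in every lane that holds '\n'.
        let nl = _mm_cmpeq_epi8(v, _mm_set1_epi8(b'\n' as i8));
        // UTF-8 continuation bytes are 0x80..=0xBF, i.e. -128..=-65 as i8,
        // so a byte starts a char exactly when it is > -65 (signed).
        let start = _mm_cmpgt_epi8(v, _mm_set1_epi8(-65));
        (
            (_mm_movemask_epi8(nl) as u32).count_ones(),
            (_mm_movemask_epi8(start) as u32).count_ones(),
        )
    }
}

// Scalar fallback for other targets; same comparisons, one byte at a time.
#[cfg(not(target_arch = "x86_64"))]
fn count_chunk(chunk: &[u8]) -> (u32, u32) {
    let (mut nl, mut start) = (0u32, 0u32);
    for &b in chunk {
        nl += (b == b'\n') as u32;
        start += ((b as i8) > -65) as u32;
    }
    (nl, start)
}

/// Returns (newline count, char count) for `buf`, assuming valid UTF-8.
pub fn count_lines_and_chars(buf: &[u8]) -> (usize, usize) {
    let (mut lines, mut chars) = (0usize, 0usize);
    let mut chunks = buf.chunks_exact(16);
    // A 128-byte buffer needs exactly 8 trips through this loop.
    for chunk in &mut chunks {
        let (nl, start) = count_chunk(chunk);
        lines += nl as usize;
        chars += start as usize;
    }
    // Leftover bytes (fewer than 16) are handled one at a time.
    for &b in chunks.remainder() {
        lines += (b == b'\n') as usize;
        chars += ((b as i8) > -65) as usize;
    }
    (lines, chars)
}
```

Calling `count_lines_and_chars(text.as_bytes())` on a `&str` satisfies the valid-UTF-8 assumption for free. The char test is one signed compare because every non-continuation byte (ASCII or a lead byte) lands above -65 when reinterpreted as `i8`.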