There are places for clever hand code, even in C, even in the modern world. Data interchange is very much not one of them. Just don't do this. If you want .ini, use TOML. Use JSON if you don't. Even YAML is OK. Those with a penchant for pain like XML. And if you have convinced yourself your format must be binary (you're wrong, it doesn't), protobufs are there for you.
But absolutely, positively, never write a parser unless your job title is "programming language author". Use a library for this, even if you don't use libraries for anything else.
[1] Fine fine, lexer. We are nitpicking, after all.
If CR is used correctly on Windows, then its behaviour is already covered by the LF case (as required for POSIX systems), and if CR is used incorrectly then you end up with all kinds of weird edge cases. So you’re much better off just jumping over that character entirely.
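A minimal sketch of "jumping over" CR, assuming the whole input is already in a mutable buffer. The name `strip_cr` is made up for illustration and does not come from any library.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical helper: remove every '\r' from buf[0..len) in place
 * and return the new length. LF bytes are left untouched, so a line
 * count based on LF is the same before and after. */
size_t strip_cr(char *buf, size_t len)
{
    assert(buf != NULL || len == 0);

    size_t out = 0;
    for (size_t in = 0; in < len; in++) {
        if (buf[in] != '\r')
            buf[out++] = buf[in];   /* keep everything except CR */
    }
    return out;
}
```

Note the trade-off: a lone CR that was meant as a line break (classic Mac OS style) is dropped too, which silently joins two lines.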
If the format is not sensitive to additional empty lines, then converting every CR to LF in place is likely a safer approach, or else using a tokenizer that coalesces all sequential CR/LF characters into a single EOL token.
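The coalescing tokenizer could look roughly like this. It is only a sketch: the token kinds, the `struct token` layout and `next_token` are invented names, and it assumes the format has just two kinds of token, text runs and line ends.

```c
#include <assert.h>
#include <stddef.h>

enum tok_kind { TOK_TEXT, TOK_EOL, TOK_END };

struct token {
    enum tok_kind kind;
    const char *start;  /* points into the caller's buffer */
    size_t len;         /* number of bytes covered by the token */
};

static int is_eol(char c) { return c == '\r' || c == '\n'; }

/* Hypothetical tokenizer step: return the token starting at *pos in
 * buf[0..len) and advance *pos past it. Any run of CR and LF bytes,
 * in any order, becomes a single TOK_EOL. */
struct token next_token(const char *buf, size_t len, size_t *pos)
{
    assert(buf != NULL && pos != NULL && *pos <= len);

    struct token t = { TOK_END, buf + *pos, 0 };
    if (*pos == len)
        return t;

    size_t begin = *pos;
    if (is_eol(buf[begin])) {
        t.kind = TOK_EOL;
        while (*pos < len && is_eol(buf[*pos]))
            (*pos)++;
    } else {
        t.kind = TOK_TEXT;
        while (*pos < len && !is_eol(buf[*pos]))
            (*pos)++;
    }
    t.len = *pos - begin;
    return t;
}
```

Because blank lines vanish into the one EOL token, counting EOL tokens no longer gives the original line number, which is why this only suits formats that don't care about that.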
I write a lot of software that parses control protocols, and the differences between firmware from a single manufacturer across different devices are astonishing! I find it shocking how many of them have no delimiters or packet length at all.
If you’re targeting iMacs or the Commodore 64, then sure, it’s something to be mindful of. But I’d wager you’d have bigger compatibility problems before you even get to line endings.
Are there any other edge cases regarding CR that I’ve missed? Or are you thinking ultra defensively (from a security standpoint)?
That said, I do like your suggestion of treating CR like LF where the schema isn’t sensitive to line numbering. Unfortunately, for my use case line numbering does matter somewhat, so it would be good to understand whether I have a ticking time bomb.