
349 points dgl | 1 comments
10000truths ◴[] No.44502931[source]
This is a big problem with using ad-hoc DSLs for config - there's often no formal specification for the grammar, and so the source of truth for parsing is spread between the home-grown serialization implementation and the home-grown deserialization implementation. If they get out of sync (e.g. someone adds new grammar to the parser but forgets to update the writer), you end up with a parser differential, and tick goes the time bomb. The lesson: have one source of truth, and generate everything that relies on it from that.
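
To make "one source of truth" concrete, here is a minimal sketch in Python. It is a toy, not git's config code: the `ESCAPES` table and the `encode`/`decode` names are hypothetical, chosen only for illustration. The point is that the decoder's table is derived from the encoder's, so a new escape cannot be added to one side and forgotten on the other.

```python
# Hypothetical escape table: the single source of truth for both directions.
ESCAPES = {"\\": "\\\\", '"': '\\"', "\n": "\\n", "\r": "\\r", "\t": "\\t"}
# Derived from ESCAPES rather than written by hand: escape letter -> raw character.
UNESCAPES = {escaped[1]: raw for raw, escaped in ESCAPES.items()}


def encode(value: str) -> str:
    """Always quote the value and escape every character listed in ESCAPES."""
    return '"' + "".join(ESCAPES.get(ch, ch) for ch in value) + '"'


def decode(text: str) -> str:
    """Inverse of encode(); rejects anything encode() could not have produced."""
    if len(text) < 2 or text[0] != '"' or text[-1] != '"':
        raise ValueError("value must be quoted")
    body = text[1:-1]
    out = []
    i = 0
    while i < len(body):
        ch = body[i]
        if ch == "\\":
            i += 1
            if i >= len(body) or body[i] not in UNESCAPES:
                raise ValueError("bad escape sequence")
            out.append(UNESCAPES[body[i]])
        elif ch == '"':
            raise ValueError("unescaped quote inside value")
        else:
            out.append(ch)
        i += 1
    return "".join(out)
```

A round-trip check such as `decode(encode(v)) == v` over awkward inputs (trailing `\r`, embedded quotes, backslashes) is a cheap guard on top of this, whatever the real format looks like.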
replies(3): >>44503902 #>>44504346 #>>44507893 #
xorcist ◴[] No.44507893[source]
The problem here isn't that the parser was updated. The parser and writer did what they did for a reason that made sense historically but wasn't what the submodule system expected. The submodule system is a bit "tacked on" to the git design, and it's not the first time that particular abstraction has cracked.

Every file format is underspecified in some way when you use it on enough platforms and protocols, unless the format is actually a reference implementation, and we've had enough problems with those. There's a reason IETF usually demands two independent implementations.

Similar problems can affect case-insensitive filesystems, or arise when moving data between locales that affect Unicode normalization. It's not surprising that an almost identical CVE was reported just one year ago.
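
A rough illustration of that class of collision, in Python. This is only a sketch: the `collide` helper is hypothetical, and real filesystems apply their own folding and normalization rules, which need not match NFC plus `casefold()`.

```python
import unicodedata


def collide(a: str, b: str) -> bool:
    """True if two distinct names become equal after normalization and case folding."""
    def key(s: str) -> str:
        # Assumption for illustration: NFC normalization followed by case folding.
        return unicodedata.normalize("NFC", s).casefold()

    return a != b and key(a) == key(b)


# Distinct as code point sequences, but potentially the same name on some systems.
print(collide("caf\u00e9", "cafe\u0301"))  # precomposed vs. combining accent
print(collide(".git", ".GIT"))             # differs only by case
```

Any tool that compares names as raw strings while the underlying platform compares them under rules like these has the same kind of differential.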

Be careful what you wish for. They could have used YAML instead of INI for the config format, and we would have had more security issues over the years, not fewer.

replies(1): >>44508216 #
1. account42 ◴[] No.44508216[source]
No, the writer encodes values in such a way that they are read back as different values. This underlying issue is absolutely an encoder/decoder mismatch and has nothing to do with submodules.
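
For anyone who wants to see what such a mismatch looks like in miniature, here is a toy sketch in Python. It is not git's code: `write_value` and `read_value` are hypothetical, and the assumption that the reader drops a trailing carriage return (to tolerate CRLF files) is made purely for illustration.

```python
def write_value(value: str) -> str:
    # Toy writer: quotes only when leading/trailing spaces would otherwise be lost.
    # It does not consider a trailing "\r" to be worth protecting.
    if value != value.strip(" "):
        return '"' + value + '"'
    return value


def read_value(line: str) -> str:
    # Toy reader: drops one trailing CR (assumed CRLF tolerance), then unquotes.
    if line.endswith("\r"):
        line = line[:-1]
    if len(line) >= 2 and line[0] == line[-1] == '"':
        return line[1:-1]
    return line.strip(" ")


def round_trips(value: str) -> bool:
    """True if writing a value and reading it back yields the same value."""
    return read_value(write_value(value)) == value


print(round_trips("sub/path"))    # ordinary value survives
print(round_trips(" padded "))    # quoting protects the spaces
print(round_trips("sub/path\r"))  # trailing CR is written raw, then stripped on read
```

Each function is defensible on its own; the bug only exists in the pair, which is why a property test over the round trip finds it and reviewing either side alone does not.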