
185 points thunderbong | 1 comments | | HN request time: 0s | source
kazinator ◴[] No.43647015[source]
Fun fact: you can stick a null byte into the shebang line to terminate it, as an alternative to the newline.

It's possible to have a scripting language support extra command line arguments after the null byte, which is less disruptive to the syntax than recognizing arguments from a second line.

I.e.

  #!/path/to/interpreter --arg<NUL>--more --args<LF>

Or

  #!/usr/bin/env interpreter<NUL>--all --args<LF>

On some OS's, you only get one arg: everything after the space, to the end of the line, is one argument.

When we stick a <NUL> there, that argument stops there; but our interpreter can read the whole line, including the <NUL>, up to the <LF>, and then extract additional arguments between the <NUL> and the <LF>.

https://www.nongnu.org/txr/txr-manpage.html#N-74C247FD

The interpreter could get the arguments in other ways, like from a second line after the hash bang line. But with the null hack, all the processing revolves around just the one hash bang line. You can retrofit this logic into an interpreter that already knows how to ignore the hash bang line, without doing any work beyond getting it to load the line properly with the embedded nul, and extract the arguments. You don't have to alter the syntax to specially recognize a hash bang continuation line.
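
For illustration only, here is a minimal Python sketch of the interpreter-side step described above: read the first line in binary mode and pull the extra arguments out from between the <NUL> and the <LF>. The function names are made up for this example, and splitting the tail on whitespace is an assumption; this is not how TXR or any particular interpreter implements it.

```python
def read_first_line(path):
    """Read the hash bang line as raw bytes, embedded NUL included."""
    with open(path, "rb") as f:
        return f.readline()  # stops at the first LF, not at a NUL


def split_shebang_line(line):
    """Split a hash bang line into (part before NUL, extra args after NUL).

    Hypothetical helper: the extra arguments are assumed to be
    whitespace-separated, which is a simplification.
    """
    if not line.startswith(b"#!"):
        raise ValueError("not a hash bang line")
    body = line[2:].rstrip(b"\r\n")          # drop the line terminator
    before_nul, nul, after_nul = body.partition(b"\0")
    extra = after_nul.split() if nul else []  # no NUL means no extra args
    return before_nul.decode(), [arg.decode() for arg in extra]
```

With the second example line from above, split_shebang_line(b"#!/usr/bin/env interpreter\0--all --args\n") gives "/usr/bin/env interpreter" as the part before the null, and ["--all", "--args"] as the extra arguments.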

CalChris ◴[] No.43647221[source]
Less fun fact: you can't substitute a <cr><nl> for <nl>.

I had a Perl script (way) back in the day that came from a Windows system and it wouldn't work on Linux. After I figured out <cr><nl> was causing the problem, I worked out what bin_script (might have been in bin_misc) was doing wrong. bin_script sees "/bin/perl<cr>" and then fails to find that interpreter.

So I proposed a one-line change which fixed the glitch and posted it to LKML … and promptly got yelled at by Alan Cox for breaking compatibility. I dunno if the null byte breaks the same compatibility. Chapter and verse weren't cited.

1. kazinator ◴[] No.43647775[source]
Null de facto works, and it's almost certainly a consequence of the kernel treating the result of extracting the argument as a C string. For instance, it might actually be scanning past the NUL and earnestly finding the newline. Even if that entire datum is copied into the argument vector and passed to the interpreter, the interpreter will only see the argument up to the null terminator, due to it being a C string.

About the only way it could break would be if the kernel used a string function to look for the newline, like a range-limited form of strchr, and then aborted the hash bang dispatch with an error upon not finding the newline, rather than accepting that the argument is delimited by a null.

I tested it on various platforms like MacOS, Solaris, some BSDs, Cygwin, Linux. Far from exhaustive, but good coverage of the modern desktop and server landscape.

The null byte would have fixed your Perl script without having to convert the line endings; the argument would have been delimited, in spite of the line ending in <CR><LF>.
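
To make the C string reasoning concrete, here is a rough Python model of it, not actual kernel code. It assumes the simplest behaviour discussed in this thread: the line is cut at the newline, everything after the first space is one argument, and each piece is then read as a C string. The names c_string and model_shebang_dispatch are invented for the sketch; real kernels also trim whitespace and enforce length limits, which are ignored here.

```python
import ctypes


def c_string(data):
    """Return the bytes a C consumer would see: everything before the first NUL."""
    return ctypes.create_string_buffer(data).value


def model_shebang_dispatch(first_line):
    """Hypothetical model: return (interpreter, argument) as C strings."""
    body = first_line[2:].split(b"\n", 1)[0]   # scan up to the newline
    interp, _, arg = body.partition(b" ")      # one argument after the space
    return c_string(interp), c_string(arg)     # a NUL cuts each piece short
```

Under this model, b"#!/bin/perl\r\n" yields the interpreter b"/bin/perl\r", which is the failure described upthread, while b"#!/bin/perl\0\r\n" yields b"/bin/perl", because the null ends the string before the <CR>.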