Demystifying the #! (Shebang): Kernel Adventures

(crocidb.com)

189 points thunderbong | 3 comments | 10 Apr 25 18:21 UTC


kazinator ◴[10 Apr 25 18:55 UTC] No.43647015[source]▶

Fun fact: you can stick a null byte into the shebang line to terminate it, as an alternative to the newline.

It's possible to have a scripting language support extra command line arguments after the null byte, which is less disruptive to the syntax than recognizing arguments from a second line.

I.e.

  #!/path/to/interpreter --arg<NUL>--more --args<LF>

  #!/usr/bin/env interpreter<NUL>--all --args<LF>

On some OSes, you only get one arg: everything after the space, up to the end of the line, is passed as a single argument.

When we stick a <NUL> there, that argument stops there; but our interpreter can read the whole line, including the <NUL>, up to the <LF>, and then extract additional arguments between the <NUL> and the <LF>.
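To make the single-argument behavior concrete, here is a rough Python model of how such a kernel might build argv from the first line. It is a simplified sketch, not any particular kernel's code: the function name `model_kernel_argv` is made up for illustration, and the rule "split once on spaces or tabs, stop at the first NUL or LF" is an assumption matching the behavior described above.

```python
import re


def model_kernel_argv(first_line: bytes, script_path: str) -> list[str]:
    """Hypothetical model: interpreter, at most ONE argument, then the script."""
    if not first_line.startswith(b"#!"):
        raise ValueError("not a hash bang line")
    # Assumption: the line ends at the first LF or the first NUL, whichever comes first.
    line = first_line.split(b"\n", 1)[0].split(b"\0", 1)[0]
    body = line[2:].strip(b" \t")
    # Split only once: everything after the interpreter path is a single argument.
    parts = re.split(rb"[ \t]+", body, maxsplit=1)
    argv = [parts[0].decode("utf-8", "replace")]
    if len(parts) == 2 and parts[1]:
        argv.append(parts[1].decode("utf-8", "replace"))
    argv.append(script_path)
    return argv


# Without the NUL, both options arrive glued together as one argument.
print(model_kernel_argv(b"#!/path/to/interpreter --arg --more\n", "./script"))
# With the NUL, the argument stops at "--arg"; the rest is invisible at this stage.
print(model_kernel_argv(b"#!/path/to/interpreter --arg\0--more --args\n", "./script"))
```

In this model the text after the NUL never reaches argv, so it is up to the interpreter to reread the line and pick it up.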

https://www.nongnu.org/txr/txr-manpage.html#N-74C247FD

The interpreter could get the arguments in other ways, like from a second line after the hash bang line. But with the null hack, all the processing revolves around just the one hash bang line. You can retrofit this logic into an interpreter that already knows how to ignore the hash bang line, without doing any work beyond getting it to load the line properly with the embedded NUL, and extract the arguments. You don't have to alter the syntax to specially recognize a hash bang continuation line.
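The interpreter side of this can be sketched in a few lines of Python. This is only an illustration under assumptions, not how TXR or any other interpreter actually does it: the names `extra_shebang_args` and `read_extra_args` are hypothetical, and splitting the extra arguments on plain whitespace is a simplification.

```python
def extra_shebang_args(first_line: bytes) -> list[str]:
    """Return the arguments found between the first NUL and the LF of a #! line."""
    if not first_line.startswith(b"#!"):
        return []
    line = first_line.split(b"\n", 1)[0]
    _head, sep, tail = line.partition(b"\0")
    if not sep:
        return []  # ordinary hash bang line, no NUL present
    # Assumption: extra arguments are separated by whitespace, with no quoting.
    return tail.decode("utf-8", "replace").split()


def read_extra_args(script_path: str) -> list[str]:
    """Read the first line of the script in binary mode, so the NUL survives."""
    with open(script_path, "rb") as f:
        return extra_shebang_args(f.readline())


print(extra_shebang_args(b"#!/usr/bin/env interpreter\0--all --args\n"))
```

Reading in binary mode matters here: the line has to be loaded with the embedded NUL intact rather than through a C-string style routine that would stop at it.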

replies(3): >>43647221 #>>43649695 #>>43649823 #

1. ElectricalUnion ◴[11 Apr 25 02:09 UTC] No.43649823[source]▶

>>43647015 #

Yeah, OpenBSD says no to that.

> If during parsing lines in the script, ksh finds a NUL byte on the line, it should abort ("syntax error: NUL byte unexpected").

https://www.undeadly.org/cgi?action=article;sid=202409241057...

replies(2): >>43651203 #>>43657048 #

2. hnlmorg ◴[11 Apr 25 06:58 UTC] No.43651203[source]▶

>>43649823 (TP) #

I think the GP was talking about kernel parsing and how to stuff additional parameters into a hypothetical new scripting language. Whereas you’re talking about ksh specifically, which has its own specific parsing rules.

3. kazinator ◴[11 Apr 25 18:45 UTC] No.43657048[source]▶

>>43649823 (TP) #

OpenBSD is one of the platforms I tested the Hash Bang Null Hack on.

ksh is being successfully run, and is able to read a line of the script and find the null byte.

However, ksh rejecting it means we couldn't use the trick with ksh, for example to get it to ignore a <CR> in the hash bang line of a script that has <CR><LF> line endings. (Something discussed in a sibling subthread, in regard to a Perl script that failed due to a trailing <CR> in the hash bang line.)
