
439 points david927 | 1 comments

What are you working on? Any new ideas which you're thinking about?
ashdnazg ◴[] No.44417426[source]
I'm writing a decompiler for Turbo Pascal 3.0, to reverse engineer an educational game from the 80s.

Since TP 3.0 does no optimisations, and looking at the progress so far (~25% decompiled), it seems like matching decompilation should be achievable.

If/when I get to 100%, I hope to make the process of annotating the result (Func13_var_2_2 is hardly an informative variable name) into a community project.

replies(2): >>44417634 #>>44417687 #
simmons ◴[] No.44417687[source]
Neat! I sometimes play around with the idea of reverse engineering and transcompiling a tiny game that was probably written in Turbo Pascal 4.0. Maybe 4.0 supported optimizations, but this program seems to have been compiled in a debug mode. (At least, it shows no sign of optimization, and it has the default {$S+} stack overflow check at the start of every function.) The lack of optimization makes it (and perhaps other programs written in Turbo Pascal) a really attractive artifact to experiment with transcompiling. Once I noticed that only the first segment was the actual game, and that the other three segments corresponded to standard units used for I/O (etc.), which could be harder to analyze, I realized I could just omit those segments and replace them with new functions suitable for the transcompilation target. Maybe some day I'll get around to finishing it.

Good luck!

replies(2): >>44420572 #>>44488848 #
1. ashdnazg ◴[] No.44420572[source]
Thank you!

It's similar with Turbo Pascal 3.0, but there's only one segment since it's a good old COM file. The compiler just copies its own first ~10000 bytes, comprising the standard library, and splices the compiled result to the end.
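
To make that layout concrete, here is a minimal Python sketch of splitting such a COM image into the copied runtime prefix and the compiled program. The boundary `RUNTIME_SIZE` is a placeholder assumption: the ~10000 figure is approximate, and the real value would have to be read off the actual binary. `split_com` is a hypothetical helper name, not part of any existing tool.

```python
# DOS loads a .COM image at offset 0x100 of its segment (after the PSP).
COM_LOAD_OFFSET = 0x100

# ASSUMPTION: placeholder size of the runtime library prefix. The thread only
# says "~10000 bytes"; the exact boundary must be found by inspection.
RUNTIME_SIZE = 10000


def split_com(image: bytes, runtime_size: int = RUNTIME_SIZE):
    """Split a COM image into (runtime, program, program_start).

    program_start is the in-memory offset (within the single segment)
    at which the compiled program's first byte ends up after loading.
    """
    if len(image) < runtime_size:
        raise ValueError("image is smaller than the assumed runtime prefix")
    runtime = image[:runtime_size]
    program = image[runtime_size:]
    program_start = COM_LOAD_OFFSET + runtime_size
    return runtime, program, program_start
```

Keeping the boundary as a parameter means the same helper can be reused once the true size of the prefix is known.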

I can see how this makes transcompilation relatively straightforward, although the real mode 16-bit code is a bit unpleasant with all the segment stuff going on, so you might as well just decompile :D. It's very possible that similar instructions will be emitted in 3.0 and 4.0 for the same source input.
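
For anyone unfamiliar with the "segment stuff": in 8086 real mode a physical address is formed as segment * 16 + offset, so many different segment:offset pairs alias the same byte. A small Python sketch (the function name is made up for illustration, and the 20-bit wrap models the original 8086 rather than later CPUs with the A20 line enabled):

```python
def linear_address(segment: int, offset: int) -> int:
    """Physical address for a real-mode segment:offset pair.

    physical = segment * 16 + offset, wrapped to 20 bits as on the 8086.
    """
    if not (0 <= segment <= 0xFFFF and 0 <= offset <= 0xFFFF):
        raise ValueError("segment and offset must be 16-bit values")
    return ((segment << 4) + offset) & 0xFFFFF


# Two different pairs that refer to the same physical byte.
a = linear_address(0x1234, 0x0010)
b = linear_address(0x1235, 0x0000)
```

A transcompiler has to track this aliasing whenever pointers cross segments, which is part of why a single-segment COM file is the easier case.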

My program also has stack-checking calls everywhere before function calls. I think people using Pascal weren't that worried about performance to begin with, so they didn't bother disabling the checks.