
1901 points l2silver | 2 comments

Maybe you've created your own AR program for wearables that shows the definition of a word when you highlight it IRL, or you've built a personal calendar app for your family to display on a monitor in the kitchen. Whatever it is, I'd love to hear it.
boricj ◴[] No.35738758[source]
I've modified Ghidra in order to unlink pieces of an executable back into relocatable object files.

To keep things simple, source code files are compiled into object files, which are linked into an executable. Object files have sections (named arrays of bytes), symbols (either defined as an offset within a section or undefined) and relocations (requests to patch an offset within a section with the final address of a symbol), while executable files only have sections. The linker takes all the object files, lays out the sections in memory, applies the relocations and writes out an executable file without the symbols or relocations.
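
To make the linking step concrete, here is a minimal toy sketch in Python. It is an illustration only, not code from Ghidra or from any real linker: the `ObjectFile` and `Relocation` types, the single section per object file, and the 32-bit absolute little-endian patch format are all assumptions chosen to keep it short.

```python
from dataclasses import dataclass, field


@dataclass
class Relocation:
    offset: int  # where in the section the 4-byte patch goes
    symbol: str  # name of the symbol whose final address is needed


@dataclass
class ObjectFile:
    name: str
    data: bytes  # one section per object file (toy simplification)
    symbols: dict = field(default_factory=dict)      # defined symbols: name -> offset
    relocations: list = field(default_factory=list)  # list of Relocation


def link(objects, base=0x1000):
    """Lay out sections, resolve symbols, apply relocations, return the image."""
    # 1. Lay out every section back to back, starting at the base address.
    placement = {}
    address = base
    for obj in objects:
        placement[obj.name] = address
        address += len(obj.data)

    # 2. Turn section-relative symbol offsets into final addresses.
    symtab = {}
    for obj in objects:
        for name, offset in obj.symbols.items():
            symtab[name] = placement[obj.name] + offset

    # 3. Patch each relocated word with the final address of its symbol.
    image = bytearray()
    for obj in objects:
        data = bytearray(obj.data)
        for rel in obj.relocations:
            data[rel.offset:rel.offset + 4] = symtab[rel.symbol].to_bytes(4, "little")
        image += data

    # The result keeps only bytes: symbols and relocations are gone.
    return bytes(image)


main_o = ObjectFile("main.o", bytes(8), {"main": 0}, [Relocation(4, "helper")])
helper_o = ObjectFile("helper.o", bytes(4), {"helper": 0})
image = link([main_o, helper_o])
```

In this sketch the returned image carries no trace of `helper` as a name, only its address baked into the bytes, which is the information an unlinker has to recover.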

With Ghidra I can reverse-engineer an executable and recreate symbols, data types and references between symbols. Then, with my modifications I can recreate relocations with that information and, once a range of addresses has been fully processed, I can select it and export it as a relocatable ELF object file.
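
The reverse direction can be sketched in the same toy style. This is not the exporter from the Ghidra fork: the `unlink` function, its dictionary-based inputs, and the assumption that every reference is a plain 32-bit absolute little-endian pointer are hypothetical simplifications. It also always stores a zero addend, which a real exporter would not necessarily do.

```python
def unlink(image, base, symbols, references, start, end):
    """Cut [start, end) out of a linked image as a toy relocatable object.

    symbols:    name -> absolute address (recovered by reverse-engineering)
    references: address of a 4-byte pointer -> absolute address it points to
    """
    data = bytearray(image[start - base:end - base])

    # Symbols inside the range become defined, as section-relative offsets.
    defined = {name: addr - start
               for name, addr in symbols.items() if start <= addr < end}

    # Every reference inside the range becomes a relocation against a symbol.
    name_at = {addr: name for name, addr in symbols.items()}
    relocations = []
    for ref_addr, target in sorted(references.items()):
        if start <= ref_addr < end:
            offset = ref_addr - start
            data[offset:offset + 4] = bytes(4)  # erase the baked-in address
            relocations.append((offset, name_at[target]))

    # Targets outside the range are left for the linker to resolve.
    undefined = sorted({name for _, name in relocations} - set(defined))
    return {"data": bytes(data), "defined": defined,
            "undefined": undefined, "relocations": relocations}


image = bytes(4) + (0x1008).to_bytes(4, "little") + bytes(4)
obj = unlink(image, 0x1000,
             {"main": 0x1000, "helper": 0x1008},
             {0x1004: 0x1008},
             start=0x1000, end=0x1008)
```

The sketch raises `KeyError` if a reference points at an address with no symbol, which mirrors the requirement that a range be fully processed before it is exported.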

Why? This allows me to extract parts of an executable as object files and reuse them by linking them with my own source code; I don't need to fully reverse-engineer these extracted parts, I just have to identify every relocation that was originally in that part. I can also divide and conquer my way to decompiling an executable by splitting it into multiple object files and recreating its source code one object file at a time, like the Ship of Theseus.

So far it works on everything I've tested it with, and I've been meaning to write a series of articles to explain the process in detail, but writing quality technical articles with illustrations on a topic this esoteric is very hard.

  - My Ghidra fork: https://github.com/boricj/ghidra/tree/feature/elfrelocatebleobjectexporter
  - My initial prototype in Jython (has a readme): https://github.com/boricj/ghidra-unlinker
Note: this works only with 32-bit MIPS, little endian, statically-linked executables. It can be made to work with other architectures by writing a relocation synthesizer for it, but so far I only care about decompiling PlayStation 1 games.
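
As a rough idea of what one small piece of a relocation synthesizer for 32-bit MIPS might look like, the sketch below recovers an `R_MIPS_26` relocation from a `j` or `jal` instruction. It is not taken from the Ghidra fork or the Jython prototype: the function name, the `symbols_by_address` dictionary and the returned dictionary layout are hypothetical, and it always stores a zero addend. Only the instruction encoding and the relocation type number come from the MIPS architecture and its ELF ABI.

```python
R_MIPS_26 = 4  # relocation type number from the MIPS ELF ABI


def synthesize_jump_relocation(address, word, symbols_by_address):
    """Return a toy R_MIPS_26 relocation for a j/jal instruction, or None.

    address: address of the instruction in the loaded executable
    word:    the 32-bit instruction, already decoded as an integer
    """
    opcode = word >> 26
    if opcode not in (0x02, 0x03):  # 0x02 = j, 0x03 = jal
        return None

    # The 26-bit field holds the target divided by 4; the top 4 bits of the
    # target come from the address of the delay slot instruction.
    target = ((address + 4) & 0xF0000000) | ((word & 0x03FFFFFF) << 2)

    name = symbols_by_address.get(target)
    if name is None:
        return None  # the jump target has not been identified as a symbol yet

    return {
        "address": address,
        "type": R_MIPS_26,
        "symbol": name,
        "patched_word": word & 0xFC000000,  # keep the opcode, zero the target
    }


symbols = {0x80010008: "helper"}
reloc = synthesize_jump_relocation(0x80010000, 0x0C004002, symbols)
```

Pointers built from `lui`/`addiu` pairs need the paired `R_MIPS_HI16` and `R_MIPS_LO16` relocations instead, which this sketch does not attempt.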
replies(3): >>35738851 #>>35741037 #>>35741151 #
1. j-krieger ◴[] No.35741037[source]
Amazing. Do you have any intention of opening a merge request to get this into Ghidra? Or maybe of offering it as a plugin?
replies(1): >>35741495 #
2. boricj ◴[] No.35741495[source]
I tried to upstream some of the refactorings and modifications needed to support this, but they were rejected [1]. I don't blame the Ghidra project for this decision; my modifications are fairly intrusive (modifying the relocation table after the initial load, extensive refactoring of the ELF support code...) and my workflow is essentially unproven in public.

By that I mean I have no documentation, no series of technical articles describing this process and no public, non-trivial project to demonstrate it in real life. I do have a decompilation project that uses this successfully [2], but it's currently private and nowhere near finished.

Also, I only wrote a relocation synthesizer for statically-linked, 32-bit, little-endian MIPS ELF. That's a fairly obscure platform; I'd expect most people to care about mainstream instruction sets like x86_64 or ARM64.

If you can suggest a forum where people would be interested in this, I can drop a message there and answer more in-depth questions if you want. So far I've worked on this all on my own and I'm kinda out of the loop from the rest of the reverse-engineering community.

[1] https://github.com/NationalSecurityAgency/ghidra/pull/5010#i...

[2] https://news.ycombinator.com/item?id=35739949