-2000 Lines of code

(www.folklore.org)
506 points xeonmc | 1 comments
1. rottc0dd No.44385121
Hi,

I think I have mentioned this before on HN too. I am not from a CS background and learnt the trade while doing the job, even the basic stuff.

We have a project that tries to reify live objects into human-readable form. The final representation is complicated, with a lot of types, while the initial representation is much simpler.

To make the output readable, wherever there are common or similar data nodes, we have to compare them and try to combine them, i.e. find places that can be made into methods and find the relevant arguments for all the calls (kind of).

The initial implementation did the transformation into the final form first and only then started the comparison. So the comparison had to deal with all the different combinations of types in the final representation, which made the whole thing very complex. It had been maintained by generations of engineers, and nobody had a clear idea of how it worked.

Then I read about how hashmaps are implemented (yep, I am that dumb) and it was a revelation. So we did the following things:

1. We created a hash of the skeleton, the part of a node that has to stay the same through the whole set of comparisons and transformations of the "common nodes" (which can be thought of as something like methods or arguments), and only compared nodes with matching skeleton hashes.

2. We created a separate layer that does the comparison and creates the common nodes on the initial primitive form, and then does the transformation as a second layer (so the comparison doesn't have to deal with all the types in the final representation).

3. Don't type. Yes. Data is the simplest abstraction, and if your logic can be made into data or some properties, please do yourself a favor and make it so. We found a lot of places where weird class hierarchies could be converted into data properties.
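
To make points 1 and 2 a bit more concrete, here is a minimal sketch, not the project's actual code. Everything in it is an assumption made for illustration: the `Node` shape, the idea that a "skeleton" is just the kinds and nesting with leaf values dropped, and the hypothetical names `skeleton`, `leaves`, `combine` and `render`. Nodes are kept as plain data with a `kind` property rather than a class per type, in the spirit of point 3.

```python
from collections import defaultdict
from dataclasses import dataclass
from typing import Any


@dataclass(frozen=True)
class Node:
    kind: str                          # a plain data property, not a subclass
    value: Any = None                  # leaf payload; ignored by the skeleton
    children: tuple["Node", ...] = ()


def skeleton(node: Node) -> tuple:
    """Shape of a node: kinds and nesting only, no leaf values."""
    return (node.kind, tuple(skeleton(c) for c in node.children))


def leaves(node: Node) -> list:
    """Leaf values in depth-first order; these become the call arguments."""
    if not node.children:
        return [node.value]
    return [v for c in node.children for v in leaves(c)]


def combine(nodes: list[Node]) -> tuple[dict, list]:
    """Pass 1 (primitive form): bucket by skeleton, turn repeats into calls."""
    buckets = defaultdict(list)
    for n in nodes:
        buckets[skeleton(n)].append(n)     # the dict hashes the skeleton tuple

    methods, out = {}, []
    for n in nodes:
        key = skeleton(n)
        if len(buckets[key]) > 1:          # shape seen more than once
            name = methods.setdefault(key, f"make_{len(methods)}")
            out.append(("call", name, leaves(n)))
        else:
            out.append(("inline", n))
    return methods, out


def render(item: tuple) -> str:
    """Pass 2 (final form): only ever sees the already-combined output."""
    if item[0] == "call":
        _, name, args = item
        return f"{name}({', '.join(map(repr, args))})"
    return f"<inline {item[1].kind}>"


def point(x: int, y: int) -> Node:
    return Node("obj", children=(Node("int", x), Node("int", y)))


nodes = [point(1, 2), point(3, 4), Node("str", "hi")]
methods, out = combine(nodes)
print([render(i) for i in out])
# ['make_0(1, 2)', 'make_0(3, 4)', '<inline str>']
```

Only nodes that land in the same bucket are ever treated as candidates for combining, and the rendering pass never has to compare anything, which is roughly the separation described above.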

Basically, it is a dumb multi-pass decompiler.

That did not just speed up the process, it also resulted in much more readable and understandable abstractions and code. I do not know if this is widely useful, but it helped in one project. There is no silver bullet, but types were the actual problem for us, and so we solved it this way.