    412 points xfeeefeee | 19 comments
    1. kleiba ◴[] No.43749288[source]
    I've been using a shitty streaming website whose player interrupts the playback of a video at irregular intervals and presents a cryptic error message. I've started looking into the JavaScript code to see if I can't code up a work-around mechanism (basically debugging their garbage implementation), and of course (why actually?) their player code is also obfuscated.

    And I've gotta say, employing an AI assistant has proven to be an invaluable help in trying to understand obfuscated code. It's actually really cool to take a function of gobbledegook JavaScript and ask the AI to rewrite it in a more canonical and easily understandable way, with inline comments. Of course, there are flaws every now and then, but the ability to do this has been such a game changer for reverse engineering, IMO.

    I can even ask it to take a guess at finding better variable/function names, and the AI can infer from the code (maybe it has seen the unobfuscated libraries during training?) what this code is actually doing at a high level and turn something like e.g(e.g) into player.initialize(player.state), which is nothing short of amazing.

    So for anyone doing similar work, I cannot recommend highly enough having an AI agent as another tool in your tool belt.
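
    To illustrate the kind of rewrite described above, here is a minimal, made-up sketch (not taken from any real player): a minified function next to a readable version of the sort one might ask an assistant to produce, plus a quick spot check that the two agree. The names `e`, `countAbove`, and `agreeOn` are hypothetical, and agreeing on a few inputs is only a sanity check, not a proof of equivalence.

```javascript
// Made-up minified input, similar in style to what a bundler emits.
function e(t,n){for(var r=0,o=0;o<t.length;o++)t[o]>n&&r++;return r}

// Readable rewrite with guessed names and comments (hypothetical output).
function countAbove(values, threshold) {
  let count = 0;
  for (const value of values) {
    // Count only the entries strictly greater than the threshold.
    if (value > threshold) {
      count++;
    }
  }
  return count;
}

// Spot check: run both versions on the same inputs and compare the results.
function agreeOn(cases) {
  return cases.every(([values, threshold]) =>
    e(values, threshold) === countAbove(values, threshold)
  );
}

console.log(agreeOn([[[1, 5, 9], 4], [[], 0], [[3, 3, 3], 3]]));
```

    Running both versions side by side like this is a cheap way to catch the occasional flaw in a rewritten function before relying on it.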

    replies(4): >>43749332 #>>43750153 #>>43750771 #>>43758666 #
    2. lukan ◴[] No.43749332[source]
    Which AI agents did you use?
    replies(1): >>43749368 #
    3. kleiba ◴[] No.43749368[source]
    I've tried different ones, they all seem to do a great job.
    replies(4): >>43749490 #>>43749664 #>>43752083 #>>43757415 #
    4. klabetron ◴[] No.43749490{3}[source]
    Out of curiosity (as someone disappointingly new to prompt engineering), what’s an example prompt you used with some success?
    replies(3): >>43750158 #>>43750239 #>>43753794 #
    5. sureIy ◴[] No.43749664{3}[source]
    Could you name a couple?
    6. saagarjha ◴[] No.43750153[source]
    Is it truly obfuscated, or just minified?
    replies(1): >>43751235 #
    7. esseph ◴[] No.43750158{4}[source]
    Ask questions. Be disappointed in the outcomes.

    Ask more questions. Get some right answers. Repeat.

    Make question asking muscle get swole.

    8. nurettin ◴[] No.43750239{4}[source]
    Actually knowing the subject and presenting insights gives me much better results than simply asking it to do what I mean.
    9. poincaredisk ◴[] No.43750771[source]
    I'm surprised by this. As a professional reverse engineer I've actually found LLMs to be terrible at deobfuscation of JS (especially in the context of JS malware). But maybe my requirements are higher and it's actually OK for occasional use against weak packers?
    replies(2): >>43751627 #>>43754398 #
    10. johann8384 ◴[] No.43751235[source]
    Well, the code in the article was obfuscated, and the article gives several specific examples.
    replies(1): >>43754973 #
    11. Bilal_io ◴[] No.43751627[source]
    I've used it for small files and it did very well prettifying, naming the variables and adding comments for context. But I can imagine it doing a bad job with large files.
    12. ImPostingOnHN ◴[] No.43752083{3}[source]
    next up is using AI to obfuscate it better in the first place, and then the terrible code gets scraped and used in further training, with an arms race ensuing, until all code on the internet is unintelligible but somehow works and can only be maintained by a specific AI that has a particularly encoded form of insanity
    13. Loughla ◴[] No.43753794{4}[source]
    For help with prompt engineering, take a graduate-level grant writing course. It teaches you how to ask the right questions to get answers from humans and how to break down complicated processes into bite-size pieces; really usable for LLMs.
    replies(1): >>43755274 #
    14. ctoth ◴[] No.43754398[source]
    Have you seen this?

    https://github.com/jehna/humanify

    What they do is ground the LLM to the AST with Babel to ensure you still get the same shape of AST out of your deobfuscation pass. Probably this tool could be cleaned up, made to work with multiple LLM and parser backends, have its prompts improved, &c.
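
    As a rough illustration of the "same shape" idea, here is a minimal sketch that does not use Babel or humanify at all. It assumes a much weaker stand-in for an AST comparison: tokenize both versions, replace every non-keyword identifier with a placeholder, and require the remaining token sequences to match. The names `KEYWORDS`, `TOKEN`, `shapeOf`, and `sameShape` are hypothetical, and the tokenizer ignores comments, template literals, and regex literals.

```javascript
// Assumption: a small keyword list is enough for this sketch; it is not exhaustive.
const KEYWORDS = new Set([
  "break", "case", "catch", "class", "const", "continue", "default", "delete",
  "do", "else", "finally", "for", "function", "if", "in", "instanceof", "let",
  "new", "return", "switch", "this", "throw", "try", "typeof", "var", "void",
  "while", "true", "false", "null",
]);

// Matches, in order: identifiers, numbers, simple quoted strings,
// and finally any other single non-space character (punctuation).
const TOKEN = /[A-Za-z_$][\w$]*|\d+(?:\.\d+)?|"(?:[^"\\]|\\.)*"|'(?:[^'\\]|\\.)*'|\S/g;

// Reduce source text to its "shape": every renameable identifier becomes "ID".
function shapeOf(source) {
  return (source.match(TOKEN) || []).map((tok) =>
    /^[A-Za-z_$]/.test(tok) && !KEYWORDS.has(tok) ? "ID" : tok
  );
}

// A rename is accepted only if nothing but identifier names changed.
function sameShape(original, renamed) {
  const a = shapeOf(original);
  const b = shapeOf(renamed);
  return a.length === b.length && a.every((tok, i) => tok === b[i]);
}

console.log(sameShape("e.g(e.g)", "player.initialize(player.state)"));
console.log(sameShape("function a(b){return b+1}", "function inc(n){return n+2}"));
```

    A check like this rejects a rewrite in which the model silently changed a literal or dropped a call, which is the failure mode that grounding against the real AST is meant to rule out more rigorously.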

    replies(1): >>43780986 #
    15. saagarjha ◴[] No.43754973{3}[source]
    I mean the JavaScript the LLM reversed for them
    16. specialist ◴[] No.43755274{5}[source]
    Heh. Probably also useful should a djinn ever grant you three wishes.
    17. titaphraz ◴[] No.43757415{3}[source]
    > they all seem to do a great job

    Yeah right.

    18. pcwalton ◴[] No.43758666[source]
    I tried ChatGPT 4o to help me reverse engineer some game code with the symbols missing and the results were quite disappointing. To say it had a tendency to hallucinate is an understatement. It didn't have any clue what was going on.

    For me, those AI tools are much better at saving me time looking up documentation when doing simple things where they have examples of the exact code pattern I'm looking for in their training set. ChatGPT is great at writing one-off Blender scripts for me to give to artists, for instance.

    19. rfoo ◴[] No.43780986{3}[source]
    This is a great idea! But it's more about having LLMs give function and variable names than having an LLM deobfuscate. The (traditional) deobfuscation steps (e.g. unpacking, de-flattening, de-virtualization, etc.) are done by 100% precise, human-made Babel plugins and are totally unrelated to an LLM.