412 points xfeeefeee | 14 comments
1. xfeeefeee ◴[] No.43747922[source]
The fascinating process of reverse engineering this VM is detailed here.

TikTok uses a custom virtual machine (VM) as part of its obfuscation and security layers. This project includes tools to:

Deobfuscate webmssdk.js, which contains the virtual machine.

Decompile TikTok’s virtual machine instructions into readable form.

Script Inject: Replace webmssdk.js with the deobfuscated VM injector.

Sign URLs: Generate signed URLs, which can be used to perform auth-based requests, e.g. posting comments.

replies(2): >>43748699 #>>43754044 #
2. noduerme ◴[] No.43748699[source]
Is calling a massive embedded JS obfuscator a "VM" a bit of a stretch? Ultimately it's not translating anything to a lower-level language.

Still, I had no idea. This is really taking JS obfuscation to the next level.

One kind of wonders, what is the purpose of that level of obfuscation? The naive take is that obfuscation is usually to protect intellectual property... but this is client-side code that wouldn't give away anything about their secret sauce algorithm.

replies(3): >>43748760 #>>43748939 #>>43748965 #
3. throwaway48476 ◴[] No.43748760[source]
VM obfuscation is a common technique for malware developers.

The VM term is applied because the obfuscator creates a custom instruction set and executes custom byte code. This is generated per build.
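
To make "custom instruction set, generated per build" concrete, here is a minimal sketch of the idea. It is written in Python for brevity and is not TikTok's interpreter: the four mnemonics, the `make_build` helper, and the seeds are all made up for illustration. Each "build" shuffles which number stands for which instruction, so the same program is encoded differently but produces the same result.

```python
import random

# Hypothetical mnemonic set for illustration; not TikTok's actual instruction set.
MNEMONICS = ["PUSH", "ADD", "MUL", "HALT"]


def make_build(seed):
    """Return a per-build mapping from mnemonic to a shuffled opcode number."""
    codes = list(range(len(MNEMONICS)))
    random.Random(seed).shuffle(codes)
    return dict(zip(MNEMONICS, codes))


def assemble(program, opcodes):
    """Encode (mnemonic, operand...) tuples into a flat list of integers."""
    bytecode = []
    for mnemonic, *operands in program:
        bytecode.append(opcodes[mnemonic])
        bytecode.extend(operands)
    return bytecode


def run(bytecode, opcodes):
    """Interpret bytecode using the opcode numbering of the matching build."""
    decode = {code: name for name, code in opcodes.items()}
    stack, pc = [], 0
    while True:
        op = decode[bytecode[pc]]
        pc += 1
        if op == "PUSH":
            # The operand is stored inline, right after the opcode.
            stack.append(bytecode[pc])
            pc += 1
        elif op == "ADD":
            b, a = stack.pop(), stack.pop()
            stack.append(a + b)
        elif op == "MUL":
            b, a = stack.pop(), stack.pop()
            stack.append(a * b)
        elif op == "HALT":
            return stack.pop()


# (2 + 3) * 4, written against mnemonics rather than raw numbers.
PROGRAM = [("PUSH", 2), ("PUSH", 3), ("ADD",), ("PUSH", 4), ("MUL",), ("HALT",)]

build_a = make_build(seed=1)
build_b = make_build(seed=2)
result_a = run(assemble(PROGRAM, build_a), build_a)
result_b = run(assemble(PROGRAM, build_b), build_b)
```

Both builds compute the same value, but anyone reading the raw integers has to recover the opcode table for that specific build first, which is where the obfuscation value comes from.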

replies(1): >>43750450 #
4. MonkeyClub ◴[] No.43748939[source]
> Is calling a massive embedded JS obfuscator a "VM" a bit of a stretch? Ultimately it's not translating anything to a lower-level language.

From the Repo's README:

"TikTok is using a full-fledged bytecode VM, if you browse through it, it supports scopes, nested functions and exception handling. This isn't a typical VM and shows that it is definitely sophisticated."
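
For a rough idea of what "exception handling" means inside a bytecode interpreter, here is a small Python sketch. It is an assumption-laden toy, not the README's VM: the `TRY`, `END_TRY`, `DIV`, and `RET` instructions and the `program` helper are hypothetical, and scopes and nested functions are left out. The interpreter keeps a stack of active handlers, and when an instruction faults it unwinds the operand stack and jumps to the handler address.

```python
class VMError(Exception):
    """Fault raised by an instruction in this toy VM."""


def run(code):
    """Execute a list of (op, arg) pairs with a stack of try-handlers."""
    stack = []
    handlers = []  # each entry: (handler_pc, operand stack depth at TRY)
    pc = 0
    while pc < len(code):
        op, arg = code[pc]
        pc += 1
        try:
            if op == "PUSH":
                stack.append(arg)
            elif op == "TRY":
                # Register a handler address for the protected region.
                handlers.append((arg, len(stack)))
            elif op == "END_TRY":
                handlers.pop()
            elif op == "DIV":
                b, a = stack.pop(), stack.pop()
                if b == 0:
                    raise VMError("division by zero")
                stack.append(a // b)
            elif op == "RET":
                return stack.pop()
            else:
                raise ValueError(f"unknown op {op!r}")
        except VMError as err:
            if not handlers:
                raise  # no handler installed: the fault escapes the VM
            handler_pc, depth = handlers.pop()
            del stack[depth:]       # unwind operands pushed inside the region
            stack.append(str(err))  # hand the error to the handler code
            pc = handler_pc


def program(a, b):
    """Hypothetical bytecode for: try { return a / b } catch (e) { return e }."""
    return [
        ("TRY", 6),        # 0: handler starts at index 6
        ("PUSH", a),       # 1
        ("PUSH", b),       # 2
        ("DIV", None),     # 3
        ("END_TRY", None), # 4
        ("RET", None),     # 5: normal path returns the quotient
        ("RET", None),     # 6: handler returns the error message
    ]


ok_result = run(program(6, 3))
caught_result = run(program(1, 0))
```

The point is only that try/catch becomes data (a handler address plus a stack depth) managed by the interpreter loop, which is one of the features that separates a bytecode VM from simple string-scrambling obfuscation.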

replies(1): >>43750161 #
5. userbinator ◴[] No.43748965[source]
You are replying to a comment that looks extremely unhuman.
replies(1): >>43749770 #
6. codetrotter ◴[] No.43749770{3}[source]
It looks like OP filled out the text area alongside with the URL when submitting the post.

HN takes that text and turns it into a comment. I’ve seen it happen before.

The unfortunate outcome of that IMO is that sometimes text that makes sense as a description of a submission feels a bit out of place as a comment due to how it is worded. And these comments sometimes then end up getting downvoted.

I wouldn’t be completely sure it was not human written. Even though it feels a bit weird to read it as a comment.

replies(2): >>43751361 #>>43752302 #
7. noduerme ◴[] No.43750161{3}[source]
But that's basically an emulator of a VM, isn't it? It's like rewriting the Flash AVM2 into JS... it's still running in JS whereas the original VM was C++. It could JIT compile stuff but only because it literally was reserving memory that could overflow, and (semi-technical take here) from that advantage, of being closer to the metal, flowed all of the flaws in AVM2 that precipitated most of Adobe's woes with Flash. A VM implant in a web page that uses a plugin like Java or Flash, to get around running browser-sandboxed code, which can take over physical memory, is far different from just emulating a VM in Javascript. I wouldn't call writing a ton of opcodes in JS, which resolved to JS functions, a "virtual machine", because it isn't reserving anything or doing anything that Javascript can't do. Someone correct me here if I'm wrong... this is just heavy-duty obfuscation.

Also, one major purpose of a VM is to improve performance over what's available in the browser. If you use that as a measurement, this clearly doesn't fit that goal.

replies(2): >>43751749 #>>43751957 #
8. noduerme ◴[] No.43750450{3}[source]
I appreciate you making the distinction that anything which creates a custom instruction set is thus a VM. I think that's the way a lot of people here who are currently at my throat seem to define it, so I'm glad you put it in clear terms. I would define it as a custom instruction set plus some sort of plug-in that allows those opcodes to be run closer to the metal than the language they're written in. FWIW I'd call this thing more of an obfuscation framework. But maybe I'm just a dino. I am really glad you made this comment, though. It clarified for me why so many people went bananas when I said this wasn't a VM.
9. ◴[] No.43751749{4}[source]
10. gruez ◴[] No.43751957{4}[source]
>But that's basically an emulator of a VM, isn't it?

Emulators and VMs aren't mutually exclusive.

>Also, one major purpose of a VM is to improve performance over what's available in the browser. If you use that as a measurement, this clearly doesn't fit that goal.

And from your other comment:

>I would define it as a custom instruction set plus some sort of plug-in that allows those opcodes to be run closer to the metal than the language they're written in.

A virtual machine just means a machine that's virtual. All the other expectations you apply on top of it (e.g. "improve performance over what's available in the browser") are totally irrelevant. The JVM clearly doesn't make Java code faster than running it natively, but nobody denies it's a virtual machine. The same goes for VMware products ("VM" is literally in the name!), which execute x86 code but are further away from "the metal" that they run on.

11. xfeeefeee ◴[] No.43752302{4}[source]
> It looks like OP filled out the text area alongside with the URL when submitting the post. HN takes that text and turns it into a comment.

Yeah, this is exactly what happened, but I decided to keep it rather than delete and filled it out more with the synopsis from the repo.

Looking back at it, it really does look like an AI bulleted summary. I probably should have noted that the last part was indeed a quotation.

12. dmitrygr ◴[] No.43754044[source]
What is the purpose of you posting a bad ChatGPT summary of the original post?
replies(2): >>43755035 #>>43757853 #
13. xfeeefeee ◴[] No.43755035[source]
I quoted the synopsis from the readme thinking it would be helpful.
14. pests ◴[] No.43757853[source]
It was the submission statement he entered when submitting this post. It was detached as a comment, and I don't think it is AI.