isn't this more like a port of `html5ever` from rust to python using LLM, as opposed to creating something "new" based on the test suite alone?
if yes, wouldn't be the distinction rather important?
isn't this more like a port of `html5ever` from rust to python using LLM, as opposed to creating something "new" based on the test suite alone?
if yes, wouldn't be the distinction rather important?
The first iteration of the project created a library from scratch, from the tests all the way to 100% test coverage. So even without the second iteration, it's still possible to create something new.
In an attempt to speed it up, I (with coding agent) rewrote it again based on html5ever's code structure. It's far from a clean port, because it's heavily optimized Rust code, that isn't possible to port to Python (Rust marcos). And it still depended on a lot of iteration and rerunning tests to get it anywhere.
I'm not pushing any agenda here, you're free to take what you want from it!
It looks to me like this is the last commit before the rewrite: https://github.com/EmilStenstrom/justhtml/tree/989b70818874d...
The commit after that is https://github.com/EmilStenstrom/justhtml/commit/7bab3d2 "radical: replace legacy TurboHTML tree/handler stack with new tokenizer + treebuilder scaffold"
It also adds this document called html5ever_port_plan.md: https://github.com/EmilStenstrom/justhtml/blob/7bab3d22c0da0...
Here's the Codex CLI transcript I used to figure this out: https://gistpreview.github.io/?53202706d137c82dce87d729263df...