ASK HN: How to engineer a JavaScript to Python migration?

Context: I was tasked with migrating a legacy workflow system (Broadcom CA Workflow Automation) to Airflow.

There are some jobs that contain rather simple JavaScript snippets, and I was trying to design a first prototype that simply takes the JS parts and runs them in a transpiler.

In this respect, I found a couple of packages that could be leveraged:

- js2py: https://github.com/PiotrDabkowski/Js2Py
- mini-racer: https://github.com/bpcreech/PyMiniRacer

Yet, both seem to be abandoned packages that might not be suitable for usage in production.

Therefore, I was thinking about parsing JavaScript's abstract syntax trees and translating them to Python, whereas a colleague suggested I set up an LLM pipeline.

How much of an overkill that might be? Has anyone else ever dealt with a JavaScript-to-Python migration and could share heads-ups on strategies or pitfalls to avoid?

1. gostsamo ◴[14 Mar 25 08:30 UTC] No.43360622[source]▶

>>43360552 (OP) #

Hi, is this a monolith or a series of scripts? if the latter, isolate each script enough that it could be run with python and then do it for all the scripts. if it is a monolith, consider splitting it and working on the pieces one by one.

an old package on prod is not an issue if it is isolated and you have a plan for change, so if you have to transpile, do it and then work from the bottom.

replies(1): >>43360647 #

2. ninocan ◴[14 Mar 25 08:36 UTC] No.43360647[source]▶

>>43360622 #

It's a bunch of scripts, a few lines long. But again, I haven't seen the totality of them and there might be edge cases with longer scripts.

Thanks for your hint!

replies(1): >>43360690 #

3. austin-cheney ◴[14 Mar 25 08:39 UTC] No.43360659[source]▶

>>43360552 (OP) #

> How much of an overkill that might be?

It sounds like a complete waste of time. If you are talking about small code snippets then simply write new original Python to replace them.

replies(1): >>43360970 #

4. gostsamo ◴[14 Mar 25 08:47 UTC] No.43360690{3}[source]▶

>>43360647 #

if we are talking small scripts, just rewrite in python and make sure that they are called with the right interpreter. use ai if you are not acquainted with the language. ask your managers if they might want to use another language which could be compiled and save on compute time. though not popular, nim is python-like, and should do the job admirably.

5. ninocan ◴[14 Mar 25 09:38 UTC] No.43360970[source]▶

>>43360659 #

Yep, I thought about that... Still, there are a few hundred workflows to migrate, so I was looking for a systematic approach

replies(1): >>43378236 #

6. antiobli ◴[14 Mar 25 13:17 UTC] No.43362354[source]▶

>>43360552 (OP) #

Working on something similar but in reverse (Python to JavaScript). A few tips:

- If the scripts are primarily by the same author, it's likely they copy-pasted a lot of their functions throughout their scripts. Do something like a "Find All" in the repository holding the scripts for the function name. Regex will help a lot in this, and help you see if a function ever has variations on number of arguments, slight name changes or misspells, etc.

- Refine an AI prompt as you convert scripts over and over.

- AI (ChatGPT, Copilot, etc.) is going to be the best automation you can get for this, because often times conversions don't match up one-to-one in a language. Especially if your scripts use npm, you might not find one-to-one matches in PyPI. AI refactors will also force you to examine the conversions.

- Understand what tech debt could be introduced in a one-to-one conversion. There may be some language workarounds that make a lot more sense to introduce over exact conversions of workaround functions for the language. I've had to work with several legacy Python scripts that called functions that can be better written in your target language. For example, one of our functions was conditionally_get_keys(level1, level2, level3) to kind of recursively get a nested value. This didn't need to be a function when I rewrote it in JS, rather I just wrote the variable as something like `city = user.location?.city`. No need to one-to-one convert a function that can be better written with a JS language feature. You'll probably encounter this when you can more succinctly write list manipulations with slicing (e.g. `steppedList = fooList[::2]` instead of converting a function necessary in JS to do the same thing).
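
For the JavaScript-to-Python direction, the same idea can be sketched as follows. This is only an illustration: the `user`, `anonymous`, and `foo_list` values are made up, and the chained `.get()` pattern assumes plain dicts whose intermediate values are never explicitly `None`.

```python
# Hypothetical data, invented for this example.
user = {"name": "Ada", "location": {"city": "Turin"}}
anonymous = {"name": "Guest"}

# Rough Python counterpart of `user.location?.city` for plain dicts:
# chain .get() with an empty-dict default instead of porting a helper.
# Caveat: this breaks if "location" is present but set to None.
city = user.get("location", {}).get("city")
missing_city = anonymous.get("location", {}).get("city")

# Slicing replaces a hand-written "every second element" loop.
foo_list = [10, 11, 12, 13, 14]
stepped_list = foo_list[::2]
```

Neither line needs a ported helper function, which is the kind of one-to-one conversion the last bullet warns against.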

Sorry if you have a time-crunch with converting things, but I would recommend a more hands-on conversion strategy, leveraging AI to do the basic conversion and then manually testing, debugging the solutions.

7. barrkel ◴[16 Mar 25 10:14 UTC] No.43377932[source]▶

>>43360552 (OP) #

I did a reasonably big rewrite from JavaScript (Nashorn, long story) to Kotlin/JVM recently (with 60x speedup and elimination of huge variance in runtime).

Keys to success in a larger scale translation:

- don't redesign anything, do a port (see also, Typescript compiler to Go port)

- leverage LLMs interactively: per chunk (e.g. function), copy the old code into a comment in the new code, then use LLM completion to quickly fill out the translation

- get something basic up and running ASAP that you can test, ideally data-driven (inputs, expectations) tuples, that you can write scaffolds for execution of the old and the new code

- for every method / control flow ported, add tests that target the newly added code, validating it does the same as the old code
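
A data-driven harness of the kind described in the last two bullets might look roughly like this in Python. Everything here is hypothetical: `normalize_job_name` stands in for one ported snippet, and the expectations are assumed to have been recorded by running the old JavaScript on the same inputs.

```python
import json

# Hypothetical ported function; assume the JS original trimmed and upper-cased a name.
def normalize_job_name(raw):
    return raw.strip().upper()

# (input, expected) pairs. Storing them as JSON keeps them usable from
# both the old JavaScript side and the new Python side.
CASES_JSON = """
[
  {"input": "  nightly-export ", "expected": "NIGHTLY-EXPORT"},
  {"input": "etl", "expected": "ETL"},
  {"input": "", "expected": ""}
]
"""

def run_cases(func, cases):
    """Return a list of (input, expected, actual) for every mismatch."""
    failures = []
    for case in cases:
        actual = func(case["input"])
        if actual != case["expected"]:
            failures.append((case["input"], case["expected"], actual))
    return failures

failures = run_cases(normalize_job_name, json.loads(CASES_JSON))
```

An empty `failures` list means the port agrees with the recorded behaviour on every case; each newly ported function would get its own case file.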

Some of this may be less applicable to scripts or harder to apply to imperative code, for which you might want to spend time converting side-effecting actions into data that can be asserted on (e.g. instead of performing commands, emit a list of commands); do this refactoring on the old code before porting.
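
As a rough sketch of the "emit a list of commands" idea, assuming a made-up cleanup step (the function name, the tuple layout, and the address are all invented for illustration):

```python
# Hypothetical step: instead of deleting files and sending mail directly,
# the function returns a description of what it would do.
def plan_cleanup(files, max_age_days, notify):
    commands = []
    for name, age_days in files:
        if age_days > max_age_days:
            commands.append(("delete", name))
    if commands:
        # The count is computed before this tuple is appended.
        commands.append(("email", notify, f"{len(commands)} files removed"))
    return commands

plan = plan_cleanup(
    [("a.log", 40), ("b.log", 3), ("c.log", 31)],
    max_age_days=30,
    notify="ops@example.com",
)
```

The returned plan is plain data, so the old and the new implementation can be compared with a simple equality check, and a separate, small executor performs the actual side effects.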

Don't get tempted into doing refactorings as you go. When you notice an opportunity to refactor, create a bug for it. What you don't want to do is build up a list of transformations that increases the more code you port, and makes finishing everything harder and harder.

replies(2): >>43378286 #>>43387714 #

8. Alifatisk ◴[16 Mar 25 10:18 UTC] No.43377941[source]▶

>>43360552 (OP) #

Sounds like a job for an llm honestly, just make sure to have tests if you are going this route

9. dvh ◴[16 Mar 25 10:21 UTC] No.43377954[source]▶

>>43360552 (OP) #

I once needed to convert a 2000-line Excel formula to PHP, but PHP linters suck and are generally not helpful, so I converted it to JS first and then I just added $ signs in front of variable names, made a few minor tweaks, and it worked. It was easier than going directly to PHP.

replies(1): >>43378017 #

10. harvey9 ◴[16 Mar 25 10:38 UTC] No.43378017[source]▶

>>43377954 #

Do you literally mean Excel formula and not VBA? That's mind-blowing.

replies(1): >>43378313 #

11. nextts ◴[16 Mar 25 10:40 UTC] No.43378026[source]▶

>>43360552 (OP) #

Semantics are gonna get you. Especially if they use idiomatic stuff like (!x && x == y) and rely on JS type coercion.

In this sense an LLM or hand crafted approach may win out.

Also, the APIs will likely be different.
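
To make the coercion point concrete, here is a minimal sketch of how JavaScript truthiness differs from Python's. The helper name `js_truthy` is made up, and it only covers values that map naturally onto Python built-ins.

```python
import math

def js_truthy(value):
    """Approximate JavaScript truthiness for values that ended up in Python."""
    if value is None or value is False:
        return False  # null/undefined and false
    if isinstance(value, (int, float)) and not isinstance(value, bool):
        # 0, -0 and NaN are falsy in JS
        return not (value == 0 or (isinstance(value, float) and math.isnan(value)))
    if isinstance(value, str):
        return value != ""  # only the empty string is falsy
    return True  # arrays and objects are always truthy in JS, even when empty

# A literal port of `if (!items) {...}` changes meaning for empty lists:
empty_is_truthy_in_js = js_truthy([])
empty_is_truthy_in_python = bool([])
```

A snippet that tests `!x` on an array or object therefore needs a deliberate decision during the port, not a mechanical `not x`.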

12. ben30 ◴[16 Mar 25 10:51 UTC] No.43378053[source]▶

>>43360552 (OP) #

I’d take a step back and get clarification on the scope of your task. See my comment from the other day about how you can use Claude to help with that.

https://news.ycombinator.com/item?id=43163011

replies(1): >>43378063 #

13. ben30 ◴[16 Mar 25 10:53 UTC] No.43378063[source]▶

>>43378053 #

Example output from your question:

## Core Questions for Migration Scope Clarification

1. *What exactly needs to be preserved?*
   - Business outcomes only, or exact implementation details?
   - Current scheduling patterns or can they be optimized?

2. *What's the true scale?*
   - Number of workflows needing migration
   - Complexity spectrum of the JavaScript snippets
   - Frequency and criticality of each workflow

3. *What are the real constraints?*
   - Timeline requirements
   - Available expertise (JavaScript, Python, Airflow)
   - Downtime tolerance during transition

4. *What's the maintenance plan?*
   - Who will support the migrated workflows?
   - What documentation needs to be created?
   - How will knowledge transfer occur?

5. *What's the verification strategy?*
   - How will you validate correct migration?
   - What tests currently exist or need to be created?
   - What defines "successful" migration?

6. *What's unique to your environment?*
   - Custom integrations with other systems
   - Special CA Workflow features being utilized
   - Environmental dependencies

7. *What's the true purpose of this migration?*
   - Cost reduction, technical debt elimination, feature enhancement?
   - Part of larger modernization or standalone project?
   - Strategic importance versus tactical necessity

8. *What approaches have been eliminated and why?*
   - Complete Python rewrite
   - Containerized JavaScript execution
   - Hybrid approaches

9. *What would happen if this migration didn't occur?*
   - Business impact
   - Technical debt consequences
   - Opportunity costs

10. *Who are the true stakeholders?*
    - Who relies on these workflows?
    - Who can approve changes to functionality?
    - Who will determine "success"?

Answering these questions before diving into implementation details will save significant time and reduce the risk of misaligned expectations.

14. LunaSea ◴[16 Mar 25 11:18 UTC] No.43378153[source]▶

>>43360552 (OP) #

I would simply dockerize the Airflow tasks and keep them in JS as-is.

Then you write the short DAG description in Python but make the task executor launch the Docker containers.

And then you're done.

replies(1): >>43378556 #

15. throwxbnk ◴[16 Mar 25 11:20 UTC] No.43378159[source]▶

>>43360552 (OP) #

It is not necessary to automate it. Do it manually, say it requires a lot of attention and care and bill the hours.

If they force an LLM on you, say that you are thrilled to use an LLM, give it a try and arrive at the conclusion that unfortunately LLMs are not up to the task. Bill the hours needed to arrive at that conclusion.

There's no need for software engineers to automate themselves away. Look at lawyers, they know how to bill and protect themselves.

replies(1): >>43378423 #

16. huem0n ◴[16 Mar 25 11:26 UTC] No.43378187[source]▶

>>43360552 (OP) #

If you have any async JS, that's going to seriously complicate things. There's no AST mapping for that (python async is not the same).

Pitfalls to watch out for? Tons of them. Comparison is very different, modulus is different, .sort is different, object destructuring doesn't map nicely to python, lambdas won't map nicely to python, promises won't map to python. Labelled loops won't map nicely to python.

If your JS snippets are truly simple, just LLM translate and manually check. They're pretty good at the simple stuff.
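
Two of the pitfalls listed above, modulus and the default `.sort`, can be shown in a few lines. The helper names `js_mod` and `js_default_sort` are made up for this sketch, and the sort helper is only an approximation for integers and ASCII strings.

```python
import math

def js_mod(a, b):
    """JS `a % b`: the result takes the sign of the dividend.

    Returns a float. JS yields NaN for b == 0, whereas math.fmod raises
    ValueError, so that case needs separate handling.
    """
    return math.fmod(a, b)

def js_default_sort(items):
    """Approximate JS `arr.sort()` without a comparator.

    JS compares the string forms of the elements, so numbers are not
    sorted numerically.
    """
    return sorted(items, key=str)

python_result = -7 % 3       # Python: sign follows the divisor
js_result = js_mod(-7, 3)    # JS: -7 % 3 is -1
js_sorted = js_default_sort([10, 9, 1, 100])
```

Both differences are silent: a literal translation runs without errors and simply produces different numbers or a different order.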

replies(1): >>43378231 #

17. KolmogorovComp ◴[16 Mar 25 11:34 UTC] No.43378218[source]▶

>>43360552 (OP) #

It's hard to give proper advice without knowing what order of magnitude of LOC you are talking about.

18. egeozcan ◴[16 Mar 25 11:37 UTC] No.43378231[source]▶

>>43378187 #

Random idea: Couldn't Babel translate async code to callbacks?

replies(1): >>43380684 #

19. simonw ◴[16 Mar 25 11:38 UTC] No.43378236{3}[source]▶

>>43360970 #

LLMs are absolutely the right thing to look at for migrating hundreds of "simple" workflows like this.

The hard work will be validating that the code they write for you is exactly right. You would have to do that if you wrote the code yourself, too. The LLMs will accelerate the writing-the-code part but the manual QA work will still be on you: https://simonwillison.net/2025/Mar/11/using-llms-for-code/#y...

20. anonzzzies ◴[16 Mar 25 11:51 UTC] No.43378286[source]▶

>>43377932 #

> - don't redesign anything, do a port (see also, Typescript compiler to Go port)

> Don't get tempted into doing refactorings as you go.

I would say those are the most important. We did so many migrations in the past 30 years and the only ones that went ok were the ones that held to these rules. If you don't, you are rapidly stuck in a lot of pain and probably you won't be able to get out.

replies(2): >>43378348 #>>43391057 #

21. JimDabell ◴[16 Mar 25 11:52 UTC] No.43378287[source]▶

>>43360552 (OP) #

Unless you’re dealing with a lot of third-parties who can’t port their code, all of this seems like overkill. Just port the workflows to Python instead of trying to transpile them.

If you have an ecosystem to keep compatibility with, I would look at compiling the JavaScript to WASM and running the WASM from Python, or some kind of sandboxing to continue running the JavaScript as-is.

22. jhfdhsldhdlflj ◴[16 Mar 25 11:53 UTC] No.43378294[source]▶

>>43360552 (OP) #

1. Ensure there are tests for EVERYTHING important on the JS side of things.
2. Port the tests (if necessary -- if there is a REST interface, just use the same tests).
3. Port parts of the code and run against the tests.

This way you have an accurate idea of how your code is working before and after the port.

23. anonzzzies ◴[16 Mar 25 11:57 UTC] No.43378313{3}[source]▶

>>43378017 #

A lot of insurers etc. do their work in Excel; I've seen 10000s of 'lines' of formulas in 1000s of sheets needing to be translated into Java. Every few years, most of them try one of those 'why can't we just run Excel on the backend?' projects with one of those tools, commercial or not, spend a bunch of money, find it's a crap idea (scalability, maintenance etc.) and then port it to a 'real language'.

24. nailer ◴[16 Mar 25 12:04 UTC] No.43378348{3}[source]▶

>>43378286 #

Do add TODO comments about proposed refactorings for later though.

25. andybak ◴[16 Mar 25 12:16 UTC] No.43378423[source]▶

>>43378159 #

Or - do the best job you can with whatever tools you think will help and sleep better at night.

replies(1): >>43378451 #

26. 43h5zqt ◴[16 Mar 25 12:21 UTC] No.43378451{3}[source]▶

>>43378423 #

Even the Python2 => Python3 translation with the automated tool "2to3" wasn't a great success. Why would Javascript => Python even work?

The best tool is a manual translation.

replies(1): >>43378907 #

27. willquack ◴[16 Mar 25 12:35 UTC] No.43378542[source]▶

>>43360552 (OP) #

Check out PythonMonkey [1], it's an actively maintained project which embeds the SpiderMonkey JS engine inside a Python library. It reuses the same memory buffers whenever possible and allows for pretty impressive interop like executing functions back and forth [2].

At my last job we used PythonMonkey to port our complex distributed computing JS Library to Python enabling us to reuse all the code and keep almost all the performance.

[1] https://pythonmonkey.io/ and https://github.com/Distributive-Network/PythonMonkey
[2] https://distributive.network/jobs/python-monkey

28. viceconsole ◴[16 Mar 25 12:39 UTC] No.43378556[source]▶

>>43378153 #

This was my immediate thought. Just because Airflow is written in Python doesn't mean the tasks you're running need to be in Python.

Separate the concerns: migrate the task orchestration to Airflow (or whatever) while keeping the actual Javascript task code largely unchanged.

29. viraptor ◴[16 Mar 25 12:45 UTC] No.43378596[source]▶

>>43360552 (OP) #

Airflow allows you to run node/bun/whatever from your python step. Unless you're really hurting with performance, do you need to port those things at all?
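
A minimal sketch of what such a step could look like, assuming Node.js is installed on the worker and that each script (hypothetically) reads a JSON payload from `process.argv[2]` and prints a single JSON document. The function names are made up; the `runner` parameter exists only so the wrapper can be tested without Node.

```python
import json
import subprocess

def build_node_command(script_path, payload):
    """Command line for running one untouched JS snippet with Node.js."""
    return ["node", script_path, json.dumps(payload)]

def run_js_task(script_path, payload, runner=subprocess.run):
    """Run the script and parse its stdout as JSON.

    Assumption: the script prints exactly one JSON document on stdout.
    A non-zero exit code raises CalledProcessError because of check=True.
    """
    completed = runner(
        build_node_command(script_path, payload),
        capture_output=True,
        text=True,
        check=True,
    )
    return json.loads(completed.stdout)
```

A function like this could be called from the body of a Python task, so the orchestration moves to Airflow while the JavaScript stays as it is.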

replies(1): >>43379901 #

30. from-nibly ◴[16 Mar 25 12:55 UTC] No.43378653[source]▶

>>43360552 (OP) #

How to translate JavaScript to Python is a bike shed.

The thing that really matters is how are you going to ship this?

You should figure out if there is a way it can be delivered incrementally.

Make sure it's easy to roll back from new to old on as small a chunk as possible.

Make sure rollbacks and deploys don't require manual futzing.

Make sure it's easy for outside people to KNOW the status of things without asking you.

Make sure you have a way to coordinate with feature devs on when it's OK to work on a specific chunk.

Make sure you can test if things are working after you deploy a change.

After that you'll probably come up with like 30 ways to translate the code and use all of them until you find one that's actually tolerable.

31. TZubiri ◴[16 Mar 25 12:58 UTC] No.43378676[source]▶

>>43360552 (OP) #

Line by line, don't overthink it.

Programmers have an unhealthy aversion to repetitive tasks. Sometimes you just have to do work-work. Happens all the time in other industries:

clock in at 9 do the same thing for 2 hours, take a break, do the same thing for 2 hours, lunch, 2 hours break, 2 hours, go home.

Repeat this for weeks if necessary, you can plan it out and predict when it will be done, if need be ask for more resources.

32. andybak ◴[16 Mar 25 13:27 UTC] No.43378907{4}[source]▶

>>43378451 #

I think many people will disagree with you on this. (and just to clarify - automated/manual is not a binary choice)

33. michaelrpeskin ◴[16 Mar 25 14:26 UTC] No.43379311[source]▶

>>43360552 (OP) #

Was it Larry Wall that said "it's easier to port a shell than a shell script"?

I've done similar inter-language changes, and I have always found it easier to not change the language of the business logic. (Unless the change gives you something really big - for me I often port stuff to numpy because I need the vectorized code, but that's only for very specific problems).

If I had this task, the place where my brain would go would be to find a way to compile JS into C and then use C calling conventions to call the functions from Python. Keep the JS code around so that if you need to change anything, you change it in JS and then recompile to C.

I don't know the JS space very well, but can you get a JS interpreter that lives in Python? That way you can call JS functions from Python?

I don't like transpiling; there are always enough differences between the languages that something bad happens. When I've run into issues like this, since I'm an "old guy", I tend to try to get everything into C calling conventions and use that as my base interface.

Worst case, there has to be some good JS interpreter that can give you a C interface that you could call from Python. So you'd have Python -> C -> JS and your business logic can still live in JS (if your port is because of efficiency and you need compiled code, then you can ignore me.)

replies(1): >>43379888 #

34. willquack ◴[16 Mar 25 15:42 UTC] No.43379888[source]▶

>>43379311 #

> can you get a JS interpreter that lives in Python

The PythonMonkey library is a full JS interpreter running in the same Python process. It allows for JS functions to be called from Python and vice versa

35. viraptor ◴[16 Mar 25 15:44 UTC] No.43379901[source]▶

>>43378596 #

A comment would be useful about what people disagree with so much. OP starts with "I was tasked with migrating a legacy workflow system..." rather than tasked with converting the code, so it may be useful to bring other alternatives to the table.

36. rs186 ◴[16 Mar 25 17:33 UTC] No.43380684{3}[source]▶

>>43378231 #

You could if you want your codebase to be an unmanageable mess.

37. waldrews ◴[16 Mar 25 18:31 UTC] No.43381067[source]▶

>>43360552 (OP) #

There are type system gotchas. My favorites have to do with integers in place of JS's double-only numerics, the treatment of undefined and nulls, especially in arrays.
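
The integer side of this can be illustrated with a short sketch. The helper `to_int32` is a made-up name; it mimics the 32-bit wraparound that JS bitwise operators apply, which Python's arbitrary-precision integers do not.

```python
# JS numbers are IEEE 754 doubles: integers above 2**53 lose precision,
# while Python ints stay exact.
big = 2**53 + 1
as_js_number = float(big)            # what a JS engine would store
precision_lost = int(as_js_number) != big

def to_int32(n):
    """Mimic the 32-bit wraparound of JS bitwise operators such as `|` and `<<`."""
    n = int(n) & 0xFFFFFFFF
    return n - 0x100000000 if n >= 0x80000000 else n

js_shift = to_int32(1 << 31)         # JS: 1 << 31 is -2147483648
py_shift = 1 << 31                   # Python: 2147483648
```

Snippets that build IDs, hashes, or bit flags are where this tends to show up, so they deserve explicit test cases.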

LLMs are just fine at AST translation, though they might inject their quirky preferences if you don't watch them carefully. Suggestion: use LLMs, but interactively, to translate one script at a time, starting with the simpler ones. Tell the LLM to explicitly call out potential issues. Manually review each, and incorporate the issues and preferences you learn into the prompt. If all goes well, the process will quickly converge on a cut and paste job, but don't be tempted to fully automate it if it's a hundred scripts -- different matter if it's thousands.

38. mooreds ◴[17 Mar 25 12:21 UTC] No.43387714[source]▶

>>43377932 #

> for every method / control flow ported, add tests that target the newly added code, validating it does the same as the old code

Nice, and as a bonus you end up with a well tested system. Can't speak highly enough of data driven testing for this kind of system. Gives you such confidence.

> When you notice an opportunity to refactor, create a bug for it.

Did you have success in getting time to revisit all these bugs? Did you get pressure to fix them along the way (in either codebase)?

replies(1): >>43396330 #

39. datadrivenangel ◴[17 Mar 25 17:58 UTC] No.43391057{3}[source]▶

>>43378286 #

110% this. Resist the urge to make changes until everything is moved over. Any system 'enhancements' may also be viewed as bugs/defects and reduce trust, requiring lengthier validation.

40. barrkel ◴[18 Mar 25 06:31 UTC] No.43396330{3}[source]▶

>>43387714 #

I think data driven tests are underused by engineers generally. One advantage that shines in a porting scenario is that they can be language agnostic.

There was pressure from a reviewer in one area where the code could obviously be improved and I pushed back fairly hard in principle, but this was our first big project together and we were building trust, so I compromised in some leaf functions that presented the same API.
