
17 points Hashex129542 | 3 comments

I'm developing a programming language. The keywords and features are mostly based on Swift 5, with some additions:

1. An async function can be called from a non-async function without any async/await keyword. If you want to block the main thread, you use block_main(): block_main() /* operations */ unblock_main()

2. A protocol can inherit from other protocol(s), and a protocol can conform to a class, as in Swift.

3. No `let`, only `var`; the compiler can optimize further.

4. A condition shorthand: if (a == (10 || 20 || 30) || b == a) && c { } (see the sketch after this list)

5. The asterisk is replaced by an `x` operator for multiplication.
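
As a minimal illustration of item 4 (assuming `a == (10 || 20 || 30)` means "a equals any of 10, 20 or 30"; the variable values here are made up for the example), this is what the shorthand would presumably desugar to in ordinary Python:

    a, b, c = 20, 5, True

    # Proposed shorthand:  if (a == (10 || 20 || 30) || b == a) && c { }
    # Presumed expansion:
    if (a == 10 or a == 20 or a == 30 or b == a) and c:
        print("condition holds")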

What features have you found useful, or what do you need, in a programming language?

ActorNightly ◴[] No.42201001[source]
I'm going to save you time and describe the optimal programming language that anyone actually wants, no matter what they say:

People want to be able to write either Python or JavaScript (i.e. the two most widely used languages), and have a compiler with a language model (it doesn't have to be large) on the back end that spits out optimal assembly code, or IR for LLVM.

It's already possible to do this with LLMs directly from the source code (although converting to C usually yields better results than going straight to assembly), but those models are overkill and too slow for real compilation work. The actual compiler just needs a specifically trained model that reads in bytecode (or the output of the lexer) and does the conversion; that model should be much smaller because the token space is far smaller.
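
To make the "reads in bytecode" part concrete, here is a small Python sketch (an assumption about the input side only; no such trained model exists here) that flattens a function's CPython bytecode into the kind of token sequence such a model would consume:

    import dis

    def dot(xs, ys):
        total = 0
        for x, y in zip(xs, ys):
            total += x * y
        return total

    # Flatten the bytecode into a token stream; a hypothetical trained
    # model would map sequences like this to C or LLVM IR.
    tokens = []
    for ins in dis.get_instructions(dot):
        tokens.append(ins.opname)
        if ins.argrepr:
            tokens.append(ins.argrepr)

    print(tokens)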

Not only do you get super easy adoption, since nobody has to learn a new language, you also get the advantage of all the existing libraries on PyPI/npm, which can be converted to optimal native code.

If you manage to get this working, and make it modular, widespread use will inevitably lead to the community copying the approach for other languages. Then you can write in any language you want and have it all be fast in the end.

And, with transfer learning, the compiler will only get better. For example, it will start to recognize parallelizable work that it can offload to the GPU or implement with AVX instructions. It can also make things memory safe automatically, without the user having to specify it.
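
For a sense of the kind of pattern such a compiler might learn to spot, here is a small Python example (NumPy stands in for the SIMD/GPU target purely for illustration; the transformation itself is the speculative part): an element-wise loop with no cross-iteration dependencies, and the vectorized form it could be lowered to:

    import numpy as np

    def scale_add_loop(xs, ys, k):
        # Naive element-wise loop: each iteration is independent, so it is
        # a candidate for AVX vectorization or GPU offload.
        out = [0.0] * len(xs)
        for i in range(len(xs)):
            out[i] = k * xs[i] + ys[i]
        return out

    def scale_add_vectorized(xs, ys, k):
        # The same computation expressed as one vector operation.
        return k * np.asarray(xs, dtype=float) + np.asarray(ys, dtype=float)

    print(scale_add_loop([1, 2, 3], [4, 5, 6], 2.0))
    print(scale_add_vectorized([1, 2, 3], [4, 5, 6], 2.0))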

replies(3): >>42201027 #>>42201547 #>>42205824 #
duped ◴[] No.42205824[source]
I find it dubious that an ML model would outperform the existing compilers (AOT or JIT) for Python and JS, which have many engineer-years invested in their design and testing.

I find it even more dubious that someone would want something that could hallucinate while generating machine code. The difficulty with optimizing compiler passes is not writing code that appears to be "better" or "faster" but guaranteeing that it is correct in all possible contexts.

replies(1): >>42208466 #
1. ActorNightly ◴[] No.42208466[source]
This would be a much different training task from what LLMs do. The reference to it being possible with large LLMs is just proof that it can be done.

The reason it's different is that you are working with a finite set of token sequences, and you will be training the model on every value of that set, because it's fairly small. So hallucination won't be a problem.
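
As a rough check on the "fairly small" claim, CPython's bytecode vocabulary can be enumerated directly (the exact count varies by Python version):

    import dis

    # dis.opmap maps every defined opcode name to its numeric value, so the
    # instruction vocabulary of CPython bytecode is small and enumerable.
    print(len(dis.opmap))           # number of distinct opcodes
    print(sorted(dis.opmap)[:10])   # a few example opcode names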

Even without ML, it's a lengthy but tractable task to build a Python-to-C translator. Once you unroll things like classes, list comprehensions, generators, etc., you end up with basically the same rough structure of code minus memory allocation. And for the latter, it's a process of semantic analysis to figure out how to allocate memory, which is very deterministic. Then you have your C compiler as it exists. Put the two together, and you basically have a much faster Python without any dynamic memory handling.
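
As a concrete example of the "unrolling" step (function names made up for illustration), a list comprehension and its hand-desugared loop already share the rough structure a C translation would have, before any decision about how the output list's memory is allocated:

    def evens_squared(xs):
        return [x * x for x in xs if x % 2 == 0]

    def evens_squared_unrolled(xs):
        # The same comprehension as an explicit loop: roughly the shape a
        # Python-to-C translator would emit, minus memory allocation details.
        out = []
        for x in xs:
            if x % 2 == 0:
                out.append(x * x)
        return out

    assert evens_squared(range(6)) == evens_squared_unrolled(range(6))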

The advantage of doing it through ML is that once you do the initial setup of the training set and set up the pipeline to train the compiler, integrating any new pattern recognition into the compiler becomes trivial.

replies(1): >>42208686 #
2. duped ◴[] No.42208686[source]
> Once you unroll things like classes, list comprehensions, generators, etc., you end up with basically the same rough structure of code minus memory allocation.

No, you don't, and that's why there are many engineer-years invested in designing AOT and JIT compilers for JS and Python.

If you write C like Python you get Python but slower.
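
A small Python example of why the straight translation is harder than it looks (the function is made up to illustrate the point): a single + is resolved at runtime from the operands' types, so a faithful C version has to carry Python's dynamic dispatch with it, which is exactly the "Python but slower" outcome:

    def add(a, b):
        # One '+' in the source, but the behaviour is chosen at runtime
        # from the operands' types, not at compile time.
        return a + b

    print(add(1, 2))           # integer addition
    print(add(1.5, 2.25))      # float addition
    print(add("py", "thon"))   # string concatenation
    print(add([1], [2, 3]))    # list concatenation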

> The advantage of doing it through ML is that once you do the initial setup of the training set and set up the pipeline to train the compiler, integrating any new pattern recognition into the compiler becomes trivial.

Except this has already been done, so what advantage does ML bring? Other than doing it again, but worse, and possibly incorrectly?

replies(1): >>42240992 #
3. ActorNightly ◴[] No.42240992[source]
Existing AOT/JIT compilers would literally be the tooling used to train the model, with additional optimizations along the way. The issue is that manually removing all the runtime stuff from the generated code is just cumbersome right now, but with ML, it's just a matter of having enough examples.

The advantage is that you get native, optimized code as if someone had written it in C directly, plus the ability to automatically generate code to offload to the GPU as people expand training with higher-level pattern recognition.

I addressed the incorrectness part already: stochastic output doesn't matter when your domain is finite.