jank is C++

(jank-lang.org)
252 points Jeaye | 17 comments
1. almostgotcaught ◴[] No.44535360[source]
i commented on reddit (and got promptly downvoted) but since i think jank's author is around here (and hopefully is receptive to constructive criticism): the CppInterOp approach to cpp interop is completely janky (no pun intended). the approach literally string munges cpp and then parses/interprets it to emit ABI compliant calls. there's no reason to do this except that libclang currently doesn't support any other way. that's not jank's fault but it could be "fixed" in libclang. at a minimum you could use https://github.com/llvm/llvm-project/blob/main/clang/lib/Cod... to emit the code based on clang ast. at a maximum would be to use something like

https://github.com/Mr-Anyone/abi

or this if/when it comes to fruition

https://discourse.llvm.org/t/llvm-introduce-an-abi-lowering-...

to generate ABI compliant calls/etc for cpp libs.

note, i say all this with maximum love in my heart for a language that would have first class cpp interop - i would immediately become jank's biggest proponent/user if its cpp interop were robust.

EDIT: for people wanting/needing receipts, you can skim through https://github.com/compiler-research/CppInterOp/blob/main/li...

replies(2): >>44535427 #>>44535628 #
2. wk_end ◴[] No.44535427[source]
> the CppInterOp approach to cpp interop is completely janky (no pun intended). the approach literally string munges cpp and then parses/interprets it to emit ABI compliant calls.

So, I agree that this sounds janky as heck. My question is: besides sounding janky as heck, is there something wrong with this? Is it slow/unreliable?

replies(1): >>44535464 #
3. almostgotcaught ◴[] No.44535464[source]
i mean it's as prone to error as any other thing that relies on string munging. it's probably not that much slower than the alternative i proposed - because the trampolines/wrappers are jitted and then reused - but it's just not robust enough that i would ever imagine building a prod system on top of it (eg using cppyy in prod) let alone baking it into my language/runtime.
replies(2): >>44535491 #>>44536478 #
4. refulgentis ◴[] No.44535491{3}[source]
The delta between the title and the content gave me extreme pause, thanks for sharing that there's, uh, worse problems.

I'm a bit surprised I've seen two articles about jank here the last 2 days if these are exemplars of the technical approach and communication style. Seems like that wouldn't be enough to get on people's radars.

replies(2): >>44535589 #>>44535694 #
5. actionfromafar ◴[] No.44535589{4}[source]
Given how the world works, that might mean we will all sit and curse Jank instead of cursing Node. :)
6. Jeaye ◴[] No.44535628[source]
Hey! I'm here and receptive.

I completely agree that Clang could solve this by actually supporting my use case. Unfortunately, Clang is very much designed for standalone AOT compilation, not intertwined with another IR generating mechanism. Furthermore, Clang struggles to handle some errors gracefully which can get it into a bad state.

I have grown jank's fork of CppInterOp quite significantly, in the past quarter, with the full change list being here: https://gist.github.com/jeaye/f6517e52f1b2331d294caed70119f1... Hoping to get all of this upstreamed, but it's a lot of work that is not high priority for me right now.

I think, based on my experience in the guts of CppInterOp, that the largest issue is not the C++ code generation. Basically any code generation is some form of string building. You linked to a part of CppInterOp which is constructing C++ functions. What's _actually_ wrong with that, in terms of robustness? The strings are generated not based on arbitrary user input, but based on Clang QualTypes and Decls. i.e. you need valid Clang values to actually get there anyway. Given that the ABI situation is an absolute mess, and that jank is already using Clang's JIT C++ compiler, I think this is a very viable solution.

However, in terms of robustness, I go back to Clang's error handling, lack of grace, and poor tooling for use cases like this. Based on my experience, _that_ is what will cause robustness issues.

Please don't take my response as unreceptive or defensive. I really do appreciate the discussion and if I'm saying something wrong, or if you want to explain further, please do. For alternatives, you linked to https://github.com/Mr-Anyone/abi which is 3 months old and has 0 stars (and so I assume 0 users and 0 years of battle testing). You also linked to https://discourse.llvm.org/t/llvm-introduce-an-abi-lowering-... which I agree would be great, _if/when it becomes available_.

So, out of all of the options, I'll ask clearly and sincerely: is there really a _better_ option which exists today?

CppInterOp is an implementation detail of jank. If we can replace C++ string generation with more IR generation and a portable ABI mechanism, _and_ if Clang can provide the sufficient libraries to make it so that I don't need to rely on C++ strings to be certain that my template specializations get the correct instantiation, I am definitely open to replacing CppInterOp. From all I've seen, we're not there yet.

replies(2): >>44535716 #>>44535916 #
7. Jeaye ◴[] No.44535694{4}[source]
Which particular delta between the title and the content gave you extreme pause?
replies(1): >>44535871 #
8. almostgotcaught ◴[] No.44535716[source]
> which is 3 months old and has 0 stars (and so I assume 0 users and 0 years of battle testing)

ah my bad i meant to link to this one https://github.com/scrossuk/llvm-abi

which inspired the gsoc.

> is there really a _better_ option which exists today?

today the "best in class" approach is swift's which fully (well tries to) model cpp AST and do what i suggested (emitting code directly):

https://github.com/swiftlang/swift/blob/c09135b8f30c0cec8f5f...

replies(1): >>44535835 #
9. Jeaye ◴[] No.44535835{3}[source]
There are upsides to this approach. Coupling Swift's AST with Clang's AST will allow for the best codegen, for sure.

However, the huge downside to this approach, which cannot be overlooked, is that Clang (not libclang) is not designed to be a library. It doesn't have the backward compatibility of a library. Swift (i.e. Apple) is already deep into developing Clang, and so I'm sure they can afford the cost of keeping up with the breaking changes that happen on every Clang release. For a solo dev, I'm not yet sure this is actually viable, but I will give it more consideration.

However, I think that raising alarms at C++ codegen is unwarranted. As I said before, basically any query builder or codegen takes some form of string generation. The way we make those safe is to add types in front of them, so we're not just formatting user strings into other strings. That's exactly what CppInterOp does, where the types added are Clang QualTypes and Decls.
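
A minimal C++ sketch of the "types in front of strings" idea from the comment above. Everything named here is hypothetical and invented for illustration: `TypeDesc` and `FnDesc` stand in for Clang QualTypes and Decls, and `emit_wrapper` is not CppInterOp's or Clang's API. The uniform `void(void **args, void *ret)` wrapper shape is also an assumption. The point is only that the generated string is assembled from already-analyzed, typed values and never from raw user text.

```cpp
#include <cassert>
#include <cstddef>
#include <string>
#include <vector>

// Hypothetical stand-in for an analyzed type (a Clang QualType in the thread).
struct TypeDesc {
  std::string spelling;  // e.g. "int", "double"
};

// Hypothetical stand-in for an analyzed function declaration (a Clang Decl).
struct FnDesc {
  std::string qualified_name;  // e.g. "math::add"
  TypeDesc return_type;
  std::vector<TypeDesc> params;
};

// Builds C++ source for a wrapper with one uniform signature, so that any
// wrapped function can be called through the same function-pointer type.
std::string emit_wrapper(FnDesc const &fn, std::string const &wrapper_name) {
  assert(!fn.qualified_name.empty() && !wrapper_name.empty());

  // Each argument is read back out of the untyped array with its known type.
  std::string call = fn.qualified_name + "(";
  for (std::size_t i = 0; i < fn.params.size(); ++i) {
    if (i != 0) {
      call += ", ";
    }
    call += "*static_cast<" + fn.params[i].spelling + " *>(args[" +
            std::to_string(i) + "])";
  }
  call += ")";

  // Non-void results are stored through the typed view of `ret`.
  std::string body;
  if (fn.return_type.spelling == "void") {
    body = call + ";";
  } else {
    body = "*static_cast<" + fn.return_type.spelling + " *>(ret) = " + call + ";";
  }
  return "extern \"C\" void " + wrapper_name +
         "(void **args, void *ret) {\n  " + body + "\n}\n";
}
```

Passing a descriptor for a hypothetical `int math::add(int, int)` yields a wrapper whose body is a single typed call. The C++ compiler that later consumes the string still type-checks that call against the real declaration.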

replies(2): >>44536865 #>>44538335 #
10. refulgentis ◴[] No.44535871{5}[source]
It said "jank is C++", which I assumed would be explaining that jank compiles down to C++ or something similar, i.e. there is a layer of abstraction between jank and C++, but it effectively "works like" C++.

On re-read, I recognize where it is used in the article:

"jank is C++. There is no runtime reflection, no guess work, and no hints. If the compiler can't find a member, or a function, or a particular overload, you will get a compiler error."
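
For readers less familiar with C++, the quoted property (a missing member is a compile error, not a runtime failure) can be sketched in plain C++ as follows. `Point`, `has_member_x`, and `read_x` are hypothetical names for illustration only and say nothing about how jank implements its checks.

```cpp
#include <cassert>
#include <type_traits>
#include <utility>

struct Point {
  int x;
};

// Primary template: assume the member does not exist.
template <typename T, typename = void>
struct has_member_x : std::false_type {};

// Chosen only when `std::declval<T &>().x` is a valid expression.
template <typename T>
struct has_member_x<T, decltype(void(std::declval<T &>().x))> : std::true_type {};

// Both facts are established by the compiler before the program ever runs.
static_assert(has_member_x<Point>::value, "Point has a member named x");
static_assert(!has_member_x<int>::value, "int has no member named x");

int read_x(Point const &p) { return p.x; }

// Uncommenting the next line makes the build fail ("no member named 'y'"),
// which is the behavior the quote describes:
// int read_y(Point const &p) { return p.y; }
```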

I assume other interop scenarios don't pull this off*, thus it is distinctive. Additionally, I'm not at all familiar with Clojure, sadly, but it also sounds like there's some special qualities there ("I think that this is an interesting way to start thinking about jank, Clojure, and static types")

Now I'll riff and just write out the first 3-5 titles that come to mind with that limited understanding:

- Implementing compile-time verifiable C++ interop in jank

- Sparks of C++ interop: jank, Clojure, & verifying interop before runtime

- jank's progress on C++ interop

- Safe C++ interop lessons from jank

* for example, I write a lot of Dart day to day and rely on Dart's "FFI" implementation to call C++, which, now that I'm thinking about it, only works because there's a code generator that creates "Dart headers" (my term) for the C++ libraries. I could totally footgun and call arbitrary functions that don't exist.

replies(1): >>44535919 #
11. rjsw ◴[] No.44535916[source]
I think that some packages that generate Python bindings for C++ use Clang to do it as well.
12. Jeaye ◴[] No.44535919{6}[source]
My reasoning is this:

jank is written in C++. Its compiler and runtime are both in C++. jank can compile to C++ directly (or LLVM IR). jank can reach into C++ seamlessly, which includes reaching into its own compiler/runtime. Thus, the boundary between what is C++ and what is Clojure is gone, which leaves jank as being both Clojure and C++.

Achieving this singularity is a milestone for jank and, I think, is worthy of the title.

replies(1): >>44541586 #
13. Jeaye ◴[] No.44536478{3}[source]
> i mean it's as prone to error as any other thing that relies on string munging.

This is misleading. Having done a great deal of both (as jank also supports C++ codegen as an alternative to IR), if the input is a fully analyzed AST, generating IR is significantly more error prone than generating C++. Why? Well, C++ is statically typed and one can enable warnings and errors for all sorts of issues. LLVM IR has a verifier, but it doesn't check that much. Handling references, pointers, closures, ABI issues, and so many more things ends up being a huge effort for IR.

For example, want to access the `foo.bar` member of a struct? In IR, you'll need to access foo, which may require loading it if it's a reference. You'll need to calculate the offset to `bar`, using GEP. You'll need to then determine if you're returning a reference to `bar` or if a copy is happening. Referencing will require storing a pointer, whereas copying may involve a lot more code. If we're generating C++, though, we just take `foo` and add a `.bar`. The C++ compiler handles the rest and will tell us if we messed anything up.
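
The `foo.bar` example above can be sketched in C++ like this. `Foo`, `emit_member_access`, `bar_by_reference`, and `bar_by_copy` are hypothetical names chosen for illustration, not jank's code. The generator side is plain concatenation; the two small functions show the reference-versus-copy distinction that the C++ compiler resolves from the same `foo.bar` expression, which an IR generator would have to spell out itself (load, GEP, store or copy).

```cpp
#include <cassert>
#include <string>

struct Foo {
  int bar;
};

// Generating C++ for a member access: take the object expression and
// append the member name. Offsets, loads, and copies are left to the
// C++ compiler that later consumes this string.
std::string emit_member_access(std::string const &object_expr,
                               std::string const &member) {
  assert(!object_expr.empty() && !member.empty());
  return object_expr + "." + member;
}

// Same source expression, two meanings picked by the compiler from context:
// here `foo.bar` produces a reference to the member inside `foo`...
int &bar_by_reference(Foo &foo) { return foo.bar; }

// ...and here it produces a copy of the member's current value.
int bar_by_copy(Foo const &foo) { return foo.bar; }
```

Writing through the result of `bar_by_reference` changes the original object, while the result of `bar_by_copy` is independent of it. In both cases a mistyped member name is reported by the compiler.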

If you're going to hand wave and say anything that's building strings is error prone and unsafe, regardless of how richly typed and thoroughly analyzed the input is, the stance feels much less genuine.

14. almostgotcaught ◴[] No.44536865{4}[source]
> For a solo dev, I'm not yet sure this is actually viable, but I will give it more consideration.

look i'm not trying to shit on your project - i promise - i know calling you out like this publicly almost requires a political kind of response (i probably shouldn't have done it). i agree with you that as a solo dev you can't (shouldn't) solve this problem - you have enough on your plate making jank great for your core users (who probably don't really care about cpp).

> As I said before, basically any query builder or codegen takes some form of string generation.

i mean this is a tautology on the level of "everything can be represented as strings". yes that's true but types (as you mention) are important and all i'm arguing is that it's much more robust to start with types and end with types instead of starting with strings and ending with types.

anyway you don't need to keep addressing my complaints - you have enough on your plate.

15. caim ◴[] No.44538335{4}[source]
You're right. Always good to remember that Apple was and still is the main company behind LLVM.

Swift was built and is maintained by the same team that worked on LLVM.

Swift also has its own fork of LLVM, and LLVM has a lot of built-in features designed for Swift, like calling conventions and async transformation.

The number of features Swift has and keeps releasing while also maintaining its own LLVM version is just not something you can do without a lot of money and years of accumulated expertise.

replies(1): >>44539205 #
16. almostgotcaught ◴[] No.44539205{5}[source]
> still is the main company behind LLVM.

lol people really say whatever comes to their mind around here don't they? I'm pretty sure all of the companies associated with these targets would strongly disagree with you

https://github.com/llvm/llvm-project/tree/main/llvm/lib/Targ...

17. quuxplusone ◴[] No.44541586{7}[source]
FWIW, I saw that the title was false (after all, Jank and C++ are two different things), but I assumed it was playing on the snowclone "Are we _X_ yet?" and therefore the blog post was going to be explaining why the answer to "Is Jank C++ yet?" should be "Yes, Jank is C++ now."