I'm curious: what are these other languages that can do these things? I read HN regularly but don't recall them. Or maybe that includes things like Java's annotation processing, which is so clunky that I wouldn't classify it as equivalent.
Annotations themselves are pretty great and, AFAIK, are most widely used with reflection or bytecode rewriting instead. I get that the maintainers dislike macro-like capabilities, but the reality is that many of the nice libraries/facilities Java has (e.g. transparent spans) just aren't possible without AST-like modifications. So the maintainers don't provide first-class support for rewriting, and they hold their noses as popular libraries do it.
Closely related, I'm pretty excited to muck with the new class file API that just went GA in 24 (https://openjdk.org/jeps/484). I don't have experience with it yet, but I have high hopes.
It’s beautiful to implement an incredibly fast serde in like 10 lines without requiring other devs to annotate their packages.
I wouldn’t include Rust on that list if we’re speaking of compile time and compile time type abilities.
Last time I tried it, Rust’s const expression system was pretty limited. Rust’s macro system is likewise very weak.
Primarily, you can only get type info by directly passing the type definition to a macro, which is how derive and the like work.
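To make the point about passing the type definition concrete, here is a minimal sketch using a declarative macro rather than a real derive (which would need its own proc-macro crate). The macro name with_field_info and the types Foo, Bar, and Baz are made up for illustration. The macro is handed the struct definition directly, so it can record each field's name and how its type is spelled, but nothing about what Bar or Baz contain.

```rust
// Hypothetical macro for illustration: it re-emits the struct it is given and
// records each field's name plus the spelling of its type as written.
macro_rules! with_field_info {
    (struct $name:ident { $($field:ident : $ty:ty),* $(,)? }) => {
        #[allow(dead_code)]
        struct $name { $($field: $ty),* }

        impl $name {
            // (field name, type as written) pairs; this is all the macro can see.
            fn field_info() -> Vec<(&'static str, &'static str)> {
                vec![$((stringify!($field), stringify!($ty))),*]
            }
        }
    };
}

// Defined outside the macro invocation, so the macro never sees these bodies.
#[allow(dead_code)]
struct Bar { id: u32 }
#[allow(dead_code)]
struct Baz;

with_field_info! {
    struct Foo {
        bar: Bar,
        baz: Baz,
    }
}

fn main() {
    for (field, ty) in Foo::field_info() {
        println!("{field}: {ty}");
    }
}
```

Foo::field_info() yields ("bar", "Bar") and ("baz", "Baz"); the macro has no way to learn that Bar has an id field, because that definition was never passed to it.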
Note that more intrusive changes -- including not only bytecode-rewriting agents, but also the use of those AST-modifying "libraries" (really, languages) -- require command-line flags that tell you that the semantics of code may be impacted by some other code that is identified in those flags. This is part of "integrity by default": https://openjdk.org/jeps/8305968
Now, should they do anything they please? Definitely not, but they can. That's why there's a (serious) macro which runs your Python code, and a (joke, in the sense that you should never use it, not that it wouldn't work) macro which replaces your running compiler with a different one so that code which is otherwise invalid will compile anyway...
How so? Rust procedural macros operate at the token-stream level while being able to tap into the parser, so I struggle to think of what they can't do, aside from limitations on the syntax of the macro.
If you have a derive macro for
#[derive(MyTrait)]
struct Foo {
    bar: Bar,
    baz: Baz,
}
then your macro can see that it references Bar and Baz, but it can't know anything about how those types are defined. Usually, the way to get around it is to define some trait on both Bar and Baz, which your Foo struct depends on, but that still only gives you access to that information at runtime, not when evaluating your macro.
Another case would be something like
#[my_macro]
fn do_stuff() -> Bar {
    let x = foo();
    x.bar()
}
Your macro would be able to see that you call the functions foo() and Something::bar(), but it wouldn't have the context to know the type of x.
And even if you did have the context to be able to see the scope, you probably still aren't going to reimplement rustc's type inference rules just for your one macro.
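The trait-based workaround for the derive case can be sketched roughly as follows. This is hand-written code standing in for what a hypothetical #[derive(Describe)] might expand to; the Describe trait, its describe() function, and the fields of Bar and Baz are all assumptions for illustration. The impl for Foo only names Bar and Baz and defers to their impls, so the field details surface when the program runs, not while the macro is expanding.

```rust
// Hypothetical trait that a derive macro would target.
trait Describe {
    fn describe() -> String;
}

#[allow(dead_code)]
struct Bar { id: u32 }
#[allow(dead_code)]
struct Baz { name: String }

// Stand-ins for what the derive might emit for the leaf types,
// where the macro does see the field list.
impl Describe for Bar {
    fn describe() -> String { "Bar { id: u32 }".to_string() }
}
impl Describe for Baz {
    fn describe() -> String { "Baz { name: String }".to_string() }
}

#[allow(dead_code)]
struct Foo { bar: Bar, baz: Baz }

// Stand-in for the expansion on Foo: the macro only knows the names `Bar`
// and `Baz`, so the generated code asks those types to describe themselves.
impl Describe for Foo {
    fn describe() -> String {
        format!("Foo {{ bar: {}, baz: {} }}", Bar::describe(), Baz::describe())
    }
}

fn main() {
    println!("{}", Foo::describe());
}
```

The cost is the one described above: every type Foo mentions has to opt in to the trait, and the macro expanding on Foo still cannot branch on what Bar actually contains.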
Scala (for example) is different: any AST node is tagged with its corresponding type that you can just ask for, along with any context to expand on that (what fields does it have? does it implement this supertype? are there any relevant implicit conversions in scope?). There are both up- and downsides to that (personally, I do quite like the locality that Rust macros enforce, for example), but Rust macros are unquestionably weaker.
A much much better system would be one that lets you write vanilla Rust code to manipulate either the token stream or the parsed AST.
The integrity by default JEPs are really about trying to reduce developers depending upon JDK/JRE implementation details, for example, sun.misc.Unsafe. From the JEP:
"In short: The use of JDK-internal APIs caused serious migration issues, there was no practical mechanism that enabled robust security in the current landscape, and new requirements could not be met. Despite the value that the unsafe APIs offer to libraries, frameworks, and tools, the ongoing lack of integrity is untenable. Strong encapsulation and the restriction of the unsafe APIs — by default — are the solution."
If you're dependent on something like ClassFileTransformer, -javaagent, or setAccessible, you'll just set a command-line flag. If you're not, it's because you're already doing this through other means like a custom ClassLoader or a build step.
That depends on the language specification. The Java spec dictates what code a Java compiler must accept and must reject. Any "mucking with AST" that changes that is, by definition, not Java. For example, many Lombok programs are clearly not written in Java because the Java spec dictates that a Java compiler (with or without annotation processors) must reject them.
In Scheme or Clojure, user-defined AST transformations are very much part of the language.
> The integrity by default JEPs are really about trying to reduce developers depending upon JDK/JRE implementation details
I'm one of the JEP's authors, and it concerns multiple things. In general, it concerns being able to make guarantees about certain invariants.
> If you're not, it's because you're already doing this through other means like a custom ClassLoader or a build step.
Custom class loaders fall within integrity by default, as their impact is localised. Build step transforms also require an explicit run of some executable. The point of integrity by default is that any possibility of breaking invariants that the spec wishes to enforce must require some visible, auditable step. This is to specifically exclude invariant-breaking operations by code that appears to be a regular library.
Token manipulation code is frequently full of syn! macro hell. So even token manipulation is only kind of normal Rust code.
I feel like we're talking right past one another. The ultimate reality is that annotation processors are pretty terrible for implementing functionality that a lot of Java developers depend upon. You could say annotation processors "weren't designed for that", but then you're just agreeing with me. This is sad, because arguably something quite similar to annotation processors could make the jobs of all of these developers a lot easier, instead of having them fall back to other mechanisms.
If your concern is integrity by default, why not just add yet another flag for can-muck-with-the-ast-annotation-processors? Or we can continue with the status quo.
There is such a flag (or, rather, a set of flags), and that's exactly what the Lombok compiler uses to change javac to compile Lombok sources rather than Java sources.
However, we think there are much better solutions to the problem those languages try to solve than allowing AST manipulation.
> However, we think there are much better solutions
I'd like to hear more. Can I discuss this further with you in a more appropriate venue than this forever thread?
Those are traditionally offered in Java in the form of bytecode transformation rather than AST transformations, as the notion of "compile time" in Java is not as clear as it is in, say, Zig; Project Leyden will make it even more vague, as it will allow caching JIT output from one run to the next.
> Can I discuss this further with you in a more appropriate venue than this forever thread?
Sure, you can email me at the email address I use on the JDK mailing lists (e.g. loom-dev).
And we've come full circle. I think they're traditionally written as bytecode transformations because the entire pipeline for both writing and using many kinds of program transformations in bytecode is far simpler, more accessible, and more performant than implementing and executing a source-to-source compiler that feeds into another Java compiler.
That said, there are also times you wish to perform transforms on programs for which you don't have access to source, in which case your hand is forced. Ideally, you would be able to write many classes of transforms agnostic to that context.
> Sure
Thanks!