Weird that there has been no significant adoption of Mojo. It has been quite some time since it was released, and everyone is still using PyTorch. Maybe the license issue is a much bigger deal than people realize.
replies(10):
https://cuda.juliagpu.org/stable/tutorials/introduction/#Wri...
With KernelAbstractions.jl you can actually target CUDA and ROCm:
https://juliagpu.github.io/KernelAbstractions.jl/stable/kern...
For Python (or rather Python-like), there is also Triton (and probably others):