Yeah! Most statements translate to a single instruction, and you pick the registers yourself, so even though performance hasn't been a priority, I think it'll tend to be pretty good. It does make some simplifications in favor of safety and clarity over optimization. For example, setting a register to zero generates a copy with a 32-bit immediate rather than the xor trick most assembly programmers would use.
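To put a rough number on the zeroing example, here's a small sketch comparing the two standard x86 encodings. It assumes the copy lowers to the usual `mov eax, imm32` form and uses `eax` as the target. The helper names are made up for illustration, and this is not Mu's actual translator output.

```python
import struct

def mov_eax_imm32(value):
    # Hypothetical helper: opcode B8 is "mov eax, imm32",
    # followed by the immediate as 4 little-endian bytes.
    return bytes([0xB8]) + struct.pack("<I", value & 0xFFFFFFFF)

def xor_eax_eax():
    # Hypothetical helper: opcode 31 is "xor r/m32, r32";
    # the ModRM byte C0 selects eax as both operands.
    return bytes([0x31, 0xC0])

copy_zero = mov_eax_imm32(0)
xor_zero = xor_eax_eax()

print("copy:", copy_zero.hex(" "), "->", len(copy_zero), "bytes")
print("xor: ", xor_zero.hex(" "), "->", len(xor_zero), "bytes")
```

So the cost is about three extra bytes per zeroing. In exchange, the copy reads like any other assignment and leaves the flags alone, whereas xor modifies them.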
I haven't tried to implement C, but I can't immediately think of a reason it wouldn't work. Forks are strongly encouraged; if you decide to try implementing one, I'd love to contribute.
My first thought was “can this lead to easier implementation of a higher-level language?” and it seems Mu is exactly that? What’s performance like?