The info everyone is missing is the code density comparison with ARM. RISC-V is more efficient and has about 10% denser code, which translates to more instructions fitting in I-cache, less memory pressure, and ultimately better performance and battery life. On the long-term roadmap, that's a win for RISC-V.
> RISC-V is more efficient and has about 10% denser code, which translates to more instructions fitting in I-cache, less memory pressure, and ultimately better performance and battery life. On the long-term roadmap, that's a win for RISC-V.
Only in the most extreme cases.
1) Battery life isn't dominated by run current for the vast majority of embedded devices. Sleep current dominates (most cases) or peripheral current dominates (RF transmit/receive, for example). You try to dial down how often you wake up until the energy you burn while awake drops below the energy you burn while asleep.
2) RAM is expensive; flash not so much. Code space isn't the issue--a 10% difference almost certainly not. Relatedly: this is why I expect you won't see 64 bits making a lot of inroads into embedded--doubling RAM consumption is expensive on embedded.
That's a pretty damn good argument. It's 10% ahead of the best ISAs, which took decades of development. Just think of how adoption would be affected if it were 30% worse.
In other words: not only is it better in terms of royalties and ecosystem, it's also better at everything else. Isn't that terrific?
I read the GP as talking about how code size was always a weakness of RISC, and seemingly the largest one.
And here it is, compared against a classical CISC platform and a hybrid one highly optimized for code size, and winning. Which just makes RISC-V even more awesome than any non-optimized design beating the incumbents would be.
Is it a core advantage? Maybe. But smaller code size has beneficial effects on silicon cost. Choice 1: if you can benchmark the same on important workloads with a 10% smaller I-cache, make the die smaller. Manufacturing costs go down with a greater-than-square-law effect with die area. Choice 2: use the die area freed up to put more functional units in the same space.
Core advantage? I will let others debate that. Significant: surely.
Smaller code size makes your caches more effective. L1 instruction cache is size limited because it's on a critical timing path. Increasing its size limits your operating frequency.
Code size for RV32IMAC is still pretty mediocre with the current GCC/RISCV compiler, and the standard library it uses by default is pretty sub-optimal. I know they're working on it, and it's clear they're making quick progress, but it's not easy at the moment. On the last project I worked on, I had to abandon ABI conventions and hand-craft large chunks of code.
As of early 2016, with the GCC port at that time, RV32GC was as dense as Thumb, and RV64GC was denser than AArch64 and every other major 64-bit ISA, including AMD64. RV64G (no C) was in some extreme cases up to 50% larger than AArch64 (due to inlined memcpy and memset, which are a bit larger without compressed instructions), but usually around the same (the exception being MIPS64, which is way larger than the other 64-bit ISAs, probably because of exposed delay slots). [0]
There's some indication that density should have increased somewhat since then, but I haven't looked at it myself.
That's why I have a lot more faith in RISC-V's ability to take on relatively high-end embedded tasks than lower-end ones. I'd expect compression to be too expensive, transistor-wise, for many roles where you'd use an ARM Cortex M2 or such, and program memory is at a premium in those places.
It's not the kind of compression you might be thinking of. It's just 16-bit "shortcuts" for some of the common 32-bit instructions. The impact in gate count should be minimal. In a lot of these applications you'll have the code in on-chip non-volatile memory which means reducing code size may also reduce chip area.
I think with relatively little increase in gate count you could also make some sequences of two 16-bit instructions execute simultaneously, which could yield nice performance improvements for micro-controller cores.
Also, you might be surprised at how "big" many micro-controllers are becoming these days.
> I'd expect compression to be too expensive, transistor wise, for many roles where you'd use an ARM Cortex M2 or such...
Decoding the "compressed" instructions is actually pretty straightforward; it doesn't add much complexity to a design. The ARM Cortex-M0+/M3/M4 implement a similar (but more complex) "compressed" instruction set called Thumb, and comparable RISC-V cores available from SiFive are smaller, faster, and more efficient.
You can get a sense of the cost by looking at PicoRV32 [0], a very small RISC-V core by the venerable Clifford Wolf, and the logic it adds when configured with the COMPRESSED_ISA option.
> ...and program memory is at a premium in those places.
Program memory is one thing, but on processors of all sizes, code size has a big impact on performance in common types of program.
I think this is wrong. Certainly the GCC toolchain spits out some remarkably mediocre code. RISC-V compressed is generally on par with Thumb-2, and where they differ, Thumb-2 seems a tiny bit denser.
If you compare GCC/ARM with GCC/RISCV the difference isn't too great, but even the IAR ARM compiler gives you noticeable improvements over GCC/RISCV. And ARM's compilers are actually quite good with respect to code size; MUCH better than GCC/RISCV (or even GCC/ARM).
That being said, were I to add some custom instructions, I would COMPLETELY prefer to do it with RISC-V than with ARM.
(Though the gcc/riscv toolchain is getting better pretty quickly.)
Andrew Waterman's PhD thesis "Design of the RISC-V Instruction Set Architecture" ("Why Develop a New Instruction Set?") has a nice comparison of ISA encoding and density of RISC-V, MIPS, SPARC, Alpha, ARMv7/8, Thumb, OpenRISC, and x86/x86-64.