To some extent Fermi looks like more of the same, only bigger and faster.
What's really interesting about it is that they've added L1 and L2 caches, moved to a unified address space so you can pass pointers around, and put some effort into making branches less expensive. I haven't yet spotted whether it supports recursion, but it's implied that it does. I expect to see a lot more GPU raytracing projects popping up in the near future thanks to this...
I'd be very interested to see how this compares to Larrabee, if/when it actually comes out.
The big changes are certainly the support for indirect jump instructions and for exception handling. That's quite a big leap in GPU programmability. The only piece still missing is fully coherent caches.