I've looked hard at using Amazon's GPU clusters, and the math just doesn't work out.
If you're working on applications that need the GPU regularly, you can build a system with 4 GTX 580s for about $3,000, and one of those systems will outperform 2, maybe 3, AWS GPU instances, which will run you about $1,000 per month each. The ownership number does not include data center, power, etc., but I still think buying is the better value if you'll be using it a lot.
Now, if you're running GPU jobs sporadically, AWS may make sense, but you should look at this carefully: it's not the same value relationship as hosting web servers on AWS (which I'm generally a proponent of).
Although, to be fair, that may change if they really do pass on some of their savings from this deal to the user.
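To make the arithmetic concrete, here is a rough break-even sketch using only the figures quoted above ($3,000 for the 4x GTX 580 box, about $1,000 per month per AWS GPU instance, one box standing in for 2 or 3 instances). The function name and the `monthly_operating_cost` parameter are my own additions for illustration; it defaults to zero because the ownership number above leaves out power and hosting.

```python
def breakeven_months(hardware_cost, cloud_cost_per_month, monthly_operating_cost=0.0):
    """Months of continuous use after which buying is cheaper than renting.

    monthly_operating_cost is a hypothetical knob for power/hosting,
    which the $3,000 figure above does not include.
    """
    monthly_saving = cloud_cost_per_month - monthly_operating_cost
    if monthly_saving <= 0:
        return float("inf")  # renting never costs more, so buying never pays off
    return hardware_cost / monthly_saving


HARDWARE_COST = 3000.0           # 4x GTX 580 system, figure from the post
AWS_PER_INSTANCE_MONTH = 1000.0  # approximate figure from the post

for instances_replaced in (2, 3):
    months = breakeven_months(
        HARDWARE_COST, instances_replaced * AWS_PER_INSTANCE_MONTH
    )
    print(f"replacing {instances_replaced} instances: "
          f"break-even after {months:.1f} months")
```

With these numbers the box pays for itself within a couple of months of continuous use; plugging in a realistic power and hosting cost stretches that out, and sporadic use stretches it out much further.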
6 GB on a single GPU is very tempting, and the single precision performance per GPU on the K10 (theoretically 2.2 TFLOPS) is higher than anything else you can get on the market. Buying one of those would cost you $3,500-$4,000. Buying two plus a motherboard and a Xeon would probably cost you close to $10,000. If you want to scale that up into a cluster, the cost per machine goes up and the man-hours spent become a factor.
Amazon offers a heavy usage deal which comes out to ~$11,000 per instance per year if used 24x7x365.
You could argue that the cluster or machine you set up would be useful for more than a year. This is true to an extent, but at the current rate of development GPUs become obsolete rather quickly, and suddenly having a cluster in the cloud sounds more appealing than going through the process of updating your machines every 18-20 months.
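As a rough check on that trade-off, the sketch below spreads the ~$10,000 purchase price over a few possible lifetimes and compares it with the ~$11,000 per year heavy usage figure. It assumes, purely for illustration, that one instance is comparable to the two-K10 box, which nothing here establishes. The "headroom" it prints is how much per month you could spend on power, hosting, and admin time before owning costs as much as renting.

```python
def amortized_monthly_cost(purchase_cost, lifetime_months):
    """Purchase price spread evenly over the months the hardware stays in use."""
    return purchase_cost / lifetime_months


OWNED_BOX_COST = 10_000.0       # two K10s + motherboard + Xeon, rough figure from the post
CLOUD_COST_PER_YEAR = 11_000.0  # heavy usage deal at 24x7x365, figure from the post

cloud_monthly = CLOUD_COST_PER_YEAR / 12

for lifetime in (12, 18, 20):
    owned_monthly = amortized_monthly_cost(OWNED_BOX_COST, lifetime)
    headroom = cloud_monthly - owned_monthly
    print(f"{lifetime} months: owned ${owned_monthly:,.0f}/mo, "
          f"cloud ${cloud_monthly:,.0f}/mo, headroom ${headroom:,.0f}/mo")
```

This leaves out resale value and any price drops on either side, so treat it as a way to frame the question rather than an answer.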
Yeah, I did a few trial runs on the cluster GPU instances maybe 4 months ago. I found that, while the GPUs themselves were really quite fast, moving data in and out of the GPU was not. Maybe Amazon will focus on increasing bandwidth to the GPU for the new boxes.
To be fair, there should not be a lot of data transfers to and from the GPU. Moving larger chunks of data (instead of many smaller ones) when you are running out of memory, and using asynchronous streams to overlap data transfers with compute, would increase performance.
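Here is a toy cost model of why both suggestions help. It is not a benchmark: the latency and bandwidth constants are made-up illustrative values, not measurements from AWS or any particular card, and the function names are my own. Each copy is modeled as a fixed overhead plus bytes over bandwidth, and overlap is modeled as an idealized two-stage pipeline where the copy of one chunk runs while the previous chunk is being computed.

```python
def transfer_time(total_bytes, n_transfers, latency_s, bandwidth_bytes_per_s):
    """Total copy time: fixed per-transfer overhead plus bytes over bandwidth."""
    return n_transfers * latency_s + total_bytes / bandwidth_bytes_per_s


def pipeline_time(transfer_s, compute_s, n_chunks, overlap):
    """Time to copy and process n_chunks equal chunks.

    Without overlap, copies and compute simply add up. With overlap
    (idealized async streams), the copy of chunk i+1 hides behind the
    compute of chunk i, so only the slower stage sets the pace.
    """
    if not overlap:
        return transfer_s + compute_s
    t = transfer_s / n_chunks
    c = compute_s / n_chunks
    return t + (n_chunks - 1) * max(t, c) + c


# Illustrative assumptions only, not measured values.
LATENCY_S = 10e-6    # assumed fixed overhead per copy
BANDWIDTH = 5e9      # assumed host-to-device bandwidth, bytes/s
TOTAL_BYTES = 2**30  # 1 GiB of input data

many_small = transfer_time(TOTAL_BYTES, 1_000_000, LATENCY_S, BANDWIDTH)
few_large = transfer_time(TOTAL_BYTES, 16, LATENCY_S, BANDWIDTH)
print(f"1,000,000 small copies: {many_small:.3f} s")
print(f"16 large copies:        {few_large:.3f} s")

serial = pipeline_time(few_large, 0.5, 16, overlap=False)
overlapped = pipeline_time(few_large, 0.5, 16, overlap=True)
print(f"no overlap: {serial:.3f} s, with overlap: {overlapped:.3f} s")
```

In real CUDA code the overlap part needs pinned host memory and asynchronous copies issued on separate streams; the model only shows where the time goes.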
Because the current series (GTX 680) is severely stunted for CUDA compared to the GTX 580. The single precision performance of the 680 barely beats the 580. The double precision performance on the GTX series sucks in general, but NVIDIA actually made it twice as bad going from the 580 to the 680. (Benchmarks linked at bottom).
The reasoning may have been to focus the GTX series more on gaming. Or it could be something more sinister: pushing more people towards their costlier Tesla line. Considering that they came out with the K10, which has terrible double precision performance but incredible single precision performance, I think they are heading towards multiple Tesla lines and want to push the GTX series away from serious GPGPU computing.