We would've preferred not to write these things; if you knew me, you'd know I'm not a "build things because we can" type of person. We spent a week tuning PostgreSQL trying to get it fast enough and couldn't (because of the ACID guarantees we were trying to relax). We then hired an outside DB consultancy for a handful of hours, since they claimed they could do it and a few hours of consulting is cheaper than us writing a DB. They ultimately came back and said "SQL isn't the right choice for this."
I can see how you could make your comment without this context, but really, I promise you SQL didn't work AT ALL. This is primarily due to the type of data we were putting in there: we needed aggregates like totals, averages, standard deviation, etc. To get those, the data has to be inserted first, and we were querying it on the fly, in real time, at a pretty high rate. That drove the CPU on the SQL servers way up while they were also serving hundreds of thousands of insert/update ops per second. We could have solved this by adding read slaves and all that, but then we're talking expensive CapEx for something we were confident we could fix with some C in a week or two.
Enter: the two servers linked in my parent comment.
We sacrificed durability, accuracy, and some level of safety (if a server crashes before the next accumulation period, which runs every X seconds, you lose that window of data) for raw speed. We calculated the acceptable tolerances (max % deviation from the true value) and tuned our data structures to match (bloomd and hlld both expose these tunables). We then ran both servers in parallel with our existing SQL solution for a few weeks to verify the numbers were correct. Finally, we switched over.
Like I said, when we built these we were doing maybe 450,000 metrics per second. That was two years ago. They're handling much, much more than that now, and both hlld and bloomd run on a single m3.medium with no persistence (they aggregate, then issue a single large UPDATE to SQL) with no issues.
The amortized cost of developing the servers (human labor), along with the thousands saved in server costs and the effectively zero ops overhead (they never go down, they're stateless, etc.), has been well worth it.
Very well put. I think people who have only ever built SQL-based solutions tend to underestimate the labor and ops overhead those solutions carry for problems that don't fit the typical SQL use cases. It's nice to read about a long-running example like this.
If the data is partitionable, a main-memory database should be able to deliver aggregates in real time without much pain. If you're writing to disk at 1M tx/sec, though, yeah, that could perhaps be a minor bottleneck. I'm not quite sure what this has to do with SQL, though.
Of course a purpose-built data structure will outperform a general solution - that goes without saying. Dropping HA/persistence makes it even easier. And if the data structure is simple enough, writing and maintaining the code yourself may pay off.
Hope that makes our decision a little clearer.