Well, I'm sure there are specific use cases where Riemann would be preferable to a generic web server. But for most developers in most situations, it is a no-brainer to choose an HTTP-based protocol with an off-the-shelf HTTPD server over a 10x slower proprietary system.
Er, I don't mean to be contrarian, but Riemann is anything but proprietary. I'm an unemployed OSS developer. Every bit of code from clients to dashboard to server to integration tools to the website is open source: http://aphyr.github.com/riemann/. I'm not trying to sell anything. I'm just building a tool to solve a problem I faced in developing distributed systems.
Second... I guess I can reiterate. Riemann is not an HTTP server. It's an event-stream driven monitoring system. The protocol uses existing standards (e.g. protobufs), is simple to implement, and the community has written clients for many languages: http://riemann.io/clients.html.
As an aside, I do plan on adding an HTTP interface to Riemann, but HTTP processing (and using JSON for serialization) comes with certain unavoidable costs in bandwidth, memory and latency. It'll fill a complementary space to the existing TCP and UDP interfaces.
Sorry, by "proprietary" I meant "custom". I admire people who have the skills and dedication to build OSS; this was just the wrong word to use.
I completely agree that for specific uses Riemann is great. Your post, though, was about performance/throughput, and so was my comment. Streaming messages/events over the network is an old problem with well-known limitations, and that is what my comment was about.
Did you even read what aphyr wrote before posting this? It detailed that unlike HTTP, which in general supplies one request per message (GET, POST, etc), a Riemann message contains potentially hundreds of different requests that must be processed individually.
You can put as many "events" in the body of an HTTP POST request as you wish. What really matters in distributed messaging systems, from the performance point of view, is the number of distinct messages per second.
And if a system designer wants to send a stream of "events" to another system to be acted upon, and cares about throughput (which is assumed here, given the title of the post), then that designer is likely to choose the faster protocol (e.g. HTTP) over a custom one, especially since its ubiquity and universal support also make it more flexible.
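To make the batching point concrete, here is a minimal sketch of packing many events into a few POST bodies. The event fields, the batch size of 100, and the helper names `batches` and `post_bodies` are illustrative assumptions, not part of any real API; nothing is sent over the network.

```python
import json
from typing import Dict, Iterator, List

Event = Dict[str, object]


def batches(events: List[Event], size: int) -> Iterator[List[Event]]:
    """Yield consecutive slices of at most `size` events."""
    if size < 1:
        raise ValueError("size must be at least 1")
    for start in range(0, len(events), size):
        yield events[start:start + size]


def post_bodies(events: List[Event], size: int) -> List[bytes]:
    """Serialize each batch as one JSON array, i.e. one HTTP message body."""
    return [json.dumps(batch).encode("utf-8") for batch in batches(events, size)]


# Hypothetical events; the field names are only for illustration.
events = [{"host": "web1", "service": "www", "metric": i} for i in range(250)]
bodies = post_bodies(events, 100)

# 250 events travel in 3 distinct messages (100 + 100 + 50).
print(len(events), "events in", len(bodies), "messages")
```

The number of distinct messages drops with the batch size, but the receiver still has to process every event inside each body individually.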
They're completely different kinds of software, at least as I understand them. Riemann is about pushing lots of little hashmaps (events) through a DAG of streaming functions; HTTP is about synchronous gets/puts/updates/deletes to a tree of resources.
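As a rough illustration of "little hashmaps through a DAG of streaming functions", here is a toy sketch in Python. It is not Riemann's API (Riemann is configured in Clojure); the names `where`, `with_fields`, `collect` and `submit` and the event fields are all assumptions made up for this example.

```python
from typing import Callable, Dict, List

Event = Dict[str, object]
Stream = Callable[[Event], None]


def where(predicate: Callable[[Event], bool], *children: Stream) -> Stream:
    """Pass an event on to the child streams only if the predicate holds."""
    def stream(event: Event) -> None:
        if predicate(event):
            for child in children:
                child(event)
    return stream


def with_fields(fields: Event, *children: Stream) -> Stream:
    """Pass a copy of the event, with extra fields merged in, to the children."""
    def stream(event: Event) -> None:
        updated = {**event, **fields}
        for child in children:
            child(updated)
    return stream


def collect(sink: List[Event]) -> Stream:
    """Terminal stream: append every event it receives to a list."""
    def stream(event: Event) -> None:
        sink.append(event)
    return stream


everything: List[Event] = []
alerts: List[Event] = []

# Two top-level streams; the second one is a small chain of nested streams.
streams: List[Stream] = [
    collect(everything),
    where(lambda e: e.get("state") == "critical",
          with_fields({"tags": ["page"]}, collect(alerts))),
]


def submit(event: Event) -> None:
    """Push one event into every top-level stream."""
    for stream in streams:
        stream(event)


submit({"host": "web1", "service": "www", "state": "ok"})
submit({"host": "web2", "service": "www", "state": "critical"})
print(len(everything), len(alerts))
```

Events only flow forward through the functions; there is no resource to GET or DELETE afterwards, which is the contrast with HTTP drawn above.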
If you are trying to make sense of Riemann in HTTP terms, sending an event to Riemann might look like POST /streams, with a body containing a single JSON object. There's no notion of GET, PUT, or DELETE though--the state inside Riemann streams has no name or external representation.
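For illustration only, such a request could be put together like this. The host `riemann.example.com`, the event fields, and the helper `build_event_request` are assumptions; the sketch only builds the request object and does not send it, since the HTTP POST path is described here as a hypothetical mapping rather than an existing interface.

```python
import json
import urllib.request

def build_event_request(base_url: str, event: dict) -> urllib.request.Request:
    """Build (but do not send) a POST /streams request with one JSON event."""
    body = json.dumps(event).encode("utf-8")
    return urllib.request.Request(
        url=base_url.rstrip("/") + "/streams",
        data=body,
        headers={"Content-Type": "application/json"},
        method="POST",
    )


# Hypothetical host and event, used only to show the shape of the request.
request = build_event_request(
    "http://riemann.example.com",
    {"host": "web1", "service": "www", "state": "critical"},
)
print(request.get_method(), request.full_url)
```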
There are other components in Riemann which can be expressed as HTTP resources--the index, which is used for tracking the most recent event for a given host and service, and the pubsub system for example. Those have HTTP APIs for making a query (GET /index?q=service = "www" and state = "critical"), and a websocket variant which streams down updates for that query to you.
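Since that query string contains spaces and quotes, a client would need to URL-encode it. A small sketch, assuming a hypothetical host `riemann.example.com` and a made-up helper `index_query_url`; it only builds and decodes the URL and makes no request.

```python
from urllib.parse import parse_qs, urlencode, urlsplit

def index_query_url(base_url: str, query: str) -> str:
    """Return the GET /index URL with the query expression URL-encoded."""
    return base_url.rstrip("/") + "/index?" + urlencode({"q": query})


# The query text is the one quoted above; the host is an assumption.
url = index_query_url(
    "http://riemann.example.com",
    'service = "www" and state = "critical"',
)

# Decoding the URL again recovers the original query expression.
decoded = parse_qs(urlsplit(url).query)["q"][0]
print(url)
print(decoded)
```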
But as far as a general replacement, I'd say no, it doesn't make any sense. This is more akin to... a slow, insanely flexible, less complete version of Esper than an HTTP server.
Don't let the afterhours haters -- sorry, hackers -- get you down. Riemann looks to be a fantastic piece of open-source software for scratching a particular itch.
Coincidentally, I was just working on a tracing system to dump data to Riemann while eating HTTP logs and/or handling live requests from browsers. It seems to be just what we need to aggregate, monitor and graph our trace data. Thanks!
Thank you. :) If you have any questions, feel free to hop on Freenode #riemann and I'll do my best to help out.
BTW, you're not the first to wonder about streaming events directly to Riemann from client browsers. I... don't recommend it, just because I don't have the time to appropriately guarantee Riemann's performance and security characteristics as an internet-facing service (yet), but adding an HTTP POST path to (ws-server) is definitely on my list. Even if the HTTP+JSON interface is much slower than the TCP/UDP interfaces, I think it'll be plenty useful for many deployments, especially those making requests from JS.