Well, I'm sure there are specific use cases where Riemann would be preferable to...

aphyr · on Feb 1, 2013

Er, I don't mean to be contrarian, but Riemann is anything but proprietary. I'm an unemployed OSS developer. Every bit of code from clients to dashboard to server to integration tools to the website is open source: http://aphyr.github.com/riemann/. I'm not trying to sell anything. I'm just building a tool to solve a problem I faced in developing distributed systems.

Second... I guess I can reiterate. Riemann is not an HTTP server. It's an event-stream driven monitoring system. The protocol uses existing standards (e.g. protobufs), is simple to implement, and the community has written clients for many languages: http://riemann.io/clients.html.

As an aside, I do plan on adding an HTTP interface to Riemann, but HTTP processing (and using JSON for serialization) comes with certain unavoidable costs in bandwidth, memory and latency. It'll fill a complementary space to the existing TCP and UDP interfaces.

trekkin · on Feb 1, 2013

Sorry, by "proprietary" I meant "custom". I admire people who have the skills and dedication to build OSS; that was just the wrong word to use.

I completely agree that for specific uses Riemann is great. Your post, though, was about performance/throughput, and so was my comment. Streaming messages/events over the network is an old problem with well-known limitations, and that is what I was addressing.

aba_sababa · on Feb 1, 2013

Please don't criticize what he's built before you even begin to understand it.

rbranson · on Feb 1, 2013

Did you even read what aphyr wrote before posting this? It explained that, unlike HTTP, which in general carries one request per message (GET, POST, etc.), a Riemann message can contain hundreds of different requests that must each be processed individually.

trekkin · on Feb 1, 2013

You can put as many "events" in the body of an HTTP POST request as you wish. What really matters in distributed messaging systems, from the performance point of view, is the number of distinct messages per second.

And if a system designer wants to send a stream of "events" to another system to be acted upon, and cares about throughput (which is assumed here, given the title of the post), then that designer is likely to choose a faster protocol over a custom one, especially if it is also more flexible thanks to its ubiquity and universal support (e.g. HTTP).

jeremyjh · on Feb 1, 2013

Is that why HTTP is always used for high-throughput network services like MySQL, Memcached or Redis? Oh wait, it isn't.

sgrove · on Feb 1, 2013

What similarities do you see between the two pieces of software? For what task would you replace Riemann with httpd? Genuinely curious, though slightly skeptical.

aphyr · on Feb 1, 2013

They're completely different kinds of software, at least as I understand them. Riemann is about pushing lots of little hashmaps (events) through a DAG of streaming functions; HTTP is about synchronous gets/puts/updates/deletes to a tree of resources.
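
To make the "hashmaps through a DAG of streaming functions" idea a bit more concrete, here is a rough Python sketch. It is not Riemann's API (Riemann streams are Clojure functions); the names `fan_out`, `where`, and `tap` are made up for illustration, and the events are invented sample data.

```python
from typing import Callable, Dict, List

Event = Dict[str, object]          # an event is just a small hashmap
Stream = Callable[[Event], None]   # a stream is a function that consumes events


def fan_out(*children: Stream) -> Stream:
    """Pass each event to every child stream, in order."""
    def stream(event: Event) -> None:
        for child in children:
            child(event)
    return stream


def where(pred: Callable[[Event], bool], *children: Stream) -> Stream:
    """Pass an event on to the children only if pred(event) is true."""
    downstream = fan_out(*children)

    def stream(event: Event) -> None:
        if pred(event):
            downstream(event)
    return stream


def tap(sink: List[Event]) -> Stream:
    """Leaf stream that records every event it receives."""
    def stream(event: Event) -> None:
        sink.append(event)
    return stream


all_events: List[Event] = []
critical: List[Event] = []

# A tiny DAG: every event is recorded, and critical ones are also
# routed to a second leaf.
streams = fan_out(
    tap(all_events),
    where(lambda e: e.get("state") == "critical", tap(critical)),
)

for event in [
    {"host": "a", "service": "www", "state": "ok", "metric": 0.1},
    {"host": "b", "service": "www", "state": "critical", "metric": 9.5},
]:
    streams(event)
```

Note that nothing here is addressable from outside: events go in at the top and flow through, and there is no resource to fetch or update.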

If you are trying to make sense of Riemann in HTTP terms, sending an event to Riemann might look like POST /streams, with a body containing a single JSON object. There's no notion of GET, PUT, or DELETE though--the state inside Riemann streams has no name or external representation.
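
As a sketch of what that analogy might look like from a client, the snippet below builds (but does not send) such a request with Python's standard library. The `/streams` path is only the hypothetical mapping described above, not an endpoint Riemann is known to expose, and the host name, helper name, and event fields are placeholders.

```python
import json
import urllib.request


def build_event_post(base_url: str, event: dict) -> urllib.request.Request:
    """Build a POST <base_url>/streams request carrying one JSON event.

    The /streams path is hypothetical; it only illustrates the analogy.
    """
    body = json.dumps(event).encode("utf-8")
    return urllib.request.Request(
        url=base_url + "/streams",
        data=body,
        headers={"Content-Type": "application/json"},
        method="POST",
    )


# Placeholder host and invented event fields.
event = {"host": "web1", "service": "www", "state": "ok", "metric": 0.25}
req = build_event_post("http://riemann.example", event)
```

Calling `urllib.request.urlopen(req)` would actually send it, assuming something were listening at that address.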

There are other components in Riemann which can be expressed as HTTP resources--the index, which is used for tracking the most recent event for a given host and service, and the pubsub system for example. Those have HTTP APIs for making a query (GET /index?q=service = "www" and state = "critical"), and a websocket variant which streams down updates for that query to you.
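
Roughly, the index behaves like a map keyed by (host, service) that keeps only the latest event. The Python below is a toy model under that assumption, not Riemann's implementation: the `Index` class and `index_query_url` helper are invented names, and the host is a placeholder. It also shows that the query string has to be URL-encoded before it goes into the `q` parameter.

```python
from typing import Callable, Dict, List, Tuple
from urllib.parse import parse_qs, urlencode, urlsplit

Event = Dict[str, object]


class Index:
    """Toy index: remembers the most recent event per (host, service)."""

    def __init__(self) -> None:
        self._latest: Dict[Tuple[object, object], Event] = {}

    def update(self, event: Event) -> None:
        # A newer event for the same host and service replaces the old one.
        self._latest[(event["host"], event["service"])] = event

    def search(self, pred: Callable[[Event], bool]) -> List[Event]:
        return [e for e in self._latest.values() if pred(e)]


def index_query_url(base_url: str, query: str) -> str:
    """Build a GET /index?q=... URL with the query properly encoded."""
    return base_url + "/index?" + urlencode({"q": query})


index = Index()
index.update({"host": "a", "service": "www", "state": "ok"})
index.update({"host": "a", "service": "www", "state": "critical"})
index.update({"host": "b", "service": "db", "state": "ok"})

# Local equivalent of: service = "www" and state = "critical"
hits = index.search(
    lambda e: e["service"] == "www" and e["state"] == "critical"
)

query = 'service = "www" and state = "critical"'
url = index_query_url("http://riemann.example", query)
```

Here `hits` holds a single event for host "a", because the second update replaced the first; decoding the `q` parameter of `url` gives back the original query text.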

But as far as a general replacement, I'd say no, it doesn't make any sense. This is more akin to... a slow, insanely flexible, less complete version of Esper than an HTTP server.

bkirkbri · on Feb 1, 2013

Don't let the afterhours haters -- sorry, hackers -- get you down. Riemann looks to be a fantastic piece of open-source software for scratching a particular itch.

Coincidentally, I was just working on a tracing system to dump data to Riemann while eating HTTP logs and/or handling live requests from browsers. It seems to be just what we need to aggregate, monitor and graph our trace data. Thanks!

aphyr · on Feb 1, 2013

Thank you. :) If you have any questions, feel free to hop on Freenode #riemann and I'll do my best to help out.

BTW, you're not the first to wonder about streaming events directly to Riemann from client browsers. I... don't recommend it, just because I don't have the time to appropriately guarantee Riemann's performance and security characteristics as an internet-facing service (yet), but adding an HTTP POST path to (ws-server) is definitely on my list. Even if the HTTP+JSON interface is much slower than the TCP/UDP interfaces, I think it'll be plenty useful for many deployments, especially those making requests from JS.