I have been working full-time on Kudu since its early development. As others hav...

I have been working full-time on Kudu since its early development. As others have mentioned, Arrow and Kudu are quite different. Despite the controversial-sounding title of Daniel Abadi's article, his content was actually reasonable and his conclusion in the final paragraph of the article is worth reading. In summary, he acknowledges that in-memory and on-disk columnar formats have different goals and both have their place (Arrow being an in-memory format).

Apache Kudu is much more than a file format - it is a columnar distributed storage engine. One way to think of Kudu is as mutable Parquet, but really it's a database backend that integrates with Impala and Spark for SQL, among other systems. It's fault tolerant, manages partitioning for you, secure, and much more. For a quick introduction to Kudu you can check out this short slide deck I put together over a year ago... it's a bit dated but a good overview: https://www.slideshare.net/MichaelPercy3/intro-to-apache-kud...

For more up-to-date information, follow the Apache Kudu Blog at http://kudu.apache.org/blog/ or follow the official Apache Kudu twitter account @ApacheKudu.