Under the Hood: How Rust Helps Keep Kentik’s Performance on High
Summary
We post a lot on our blog about our advanced network analytics platform, use cases, and the ROI we deliver to service providers and enterprises globally. However, today’s post is for our fellow programmers, as we go under Kentik’s hood to discuss Rust.
Rust is a new programming language that is getting a lot of attention for its performance, safety, and design. Simply put: Rust is a big part of how Kentik ingests Gigabits per second representing over a Petabit per second of Internet traffic, and stores 100 TB of network flow data each day. The performance and reliability of our platform relies on software written in Rust, and we benefit from the robust ecosystem of open source libraries available on crates.io.
On the ingest side, Kentik’s high-performance host and sensor agent captures raw network traffic and converts it into kflow, our internal Cap’n Proto-based flow record format. In addition to basic data like source and destination IP address, port, protocol, etc., we collect network performance metrics like TCP connection setup latency, retransmitted and out-of-order packet counts, and window size. We also utilize nom, an excellent Rust parser combinator library, for high-performance decoding of application layer protocols like DHCP, DNS, and HTTP.
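To give a feel for the combinator style that nom enables, here is a hand-rolled sketch in plain Rust. This is not nom's actual API or our production code; it just illustrates the core idea: small parsers that consume a prefix of the input and return the remaining bytes alongside the parsed value, composed into larger parsers.

```rust
// Sketch of the parser-combinator style: each parser takes input bytes
// and returns (remaining input, parsed value) or an error.
type PResult<'a, T> = Result<(&'a [u8], T), &'static str>;

// Parse a big-endian u16 from the front of the input.
fn be_u16(input: &[u8]) -> PResult<u16> {
    match input {
        [a, b, rest @ ..] => Ok((rest, u16::from_be_bytes([*a, *b]))),
        _ => Err("need at least 2 bytes"),
    }
}

// Compose two small parsers into one that reads a UDP-style
// source/destination port pair.
fn port_pair(input: &[u8]) -> PResult<(u16, u16)> {
    let (rest, src) = be_u16(input)?;
    let (rest, dst) = be_u16(rest)?;
    Ok((rest, (src, dst)))
}

fn main() {
    // 0x0035 = port 53 (DNS); 0xC000 = port 49152.
    let bytes = [0x00, 0x35, 0xC0, 0x00, 0xDE, 0xAD];
    let (rest, (src, dst)) = port_pair(&bytes).unwrap();
    assert_eq!((src, dst), (53, 49152));
    assert_eq!(rest, &[0xDE, 0xAD]);
    println!("src={src} dst={dst}");
}
```

nom provides these building blocks (and many more) ready-made, along with streaming support and far better error handling.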
Rust’s memory ownership model allows us to share the underlying packet capture buffer with the entire processing pipeline while ensuring that no reference into the buffer outlives the packet data. As a result, we were able to implement very efficient zero-copy parsers that perform minimal allocation.
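A minimal sketch (not Kentik's actual code) of what this looks like: a parsed record borrows directly from the capture buffer instead of copying out of it, and the borrow checker guarantees the record cannot outlive the buffer.

```rust
// A parsed packet that borrows from the capture buffer; the lifetime
// 'a ties it to the buffer, so no copying or allocation is needed.
struct Packet<'a> {
    payload: &'a [u8],
}

fn parse<'a>(buf: &'a [u8]) -> Option<Packet<'a>> {
    // Pretend the first 4 bytes are a header; the payload is the rest.
    if buf.len() < 4 {
        return None;
    }
    Some(Packet { payload: &buf[4..] })
}

fn main() {
    let buf = vec![0u8; 64]; // stand-in for a packet-capture buffer
    let pkt = parse(&buf).unwrap();
    assert_eq!(pkt.payload.len(), 60);
    // Uncommenting the next line is a compile error: the buffer cannot
    // be dropped while `pkt` still borrows from it.
    // drop(buf);
    println!("payload bytes: {}", pkt.payload.len());
}
```

The safety guarantee is enforced entirely at compile time, so it costs nothing at runtime.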
On the datastore side, we’ve recently rolled out a new backend disk storage format written in Rust that delivers improved performance and higher storage density. And our query layer utilizes a HyperLogLog extension, also written in Rust, for high-performance cardinality queries. Rust’s performance and low memory use are key benefits here.
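For readers unfamiliar with HyperLogLog: it estimates the number of distinct items in a stream using a few kilobytes of fixed-size state. Below is a toy version in plain Rust (standard library only); our actual extension is considerably more sophisticated. Each item is hashed, the top bits of the hash select a register, and the register keeps the maximum number of leading zeros seen in the remaining bits.

```rust
use std::collections::hash_map::DefaultHasher;
use std::hash::{Hash, Hasher};

const P: u32 = 10;          // 2^10 = 1024 registers
const M: usize = 1 << P;

struct Hll {
    regs: [u8; M],
}

impl Hll {
    fn new() -> Self {
        Hll { regs: [0; M] }
    }

    fn insert<T: Hash>(&mut self, item: &T) {
        let mut h = DefaultHasher::new();
        item.hash(&mut h);
        let x = h.finish();
        let idx = (x >> (64 - P)) as usize;            // top P bits pick a register
        let rank = (x << P).leading_zeros() as u8 + 1; // rank of the remaining bits
        self.regs[idx] = self.regs[idx].max(rank);
    }

    fn estimate(&self) -> f64 {
        let m = M as f64;
        let sum: f64 = self.regs.iter().map(|&r| 2f64.powi(-(r as i32))).sum();
        let raw = 0.7213 / (1.0 + 1.079 / m) * m * m / sum;
        let zeros = self.regs.iter().filter(|&&r| r == 0).count();
        if raw <= 2.5 * m && zeros > 0 {
            // Small-range (linear counting) correction.
            m * (m / zeros as f64).ln()
        } else {
            raw
        }
    }
}

fn main() {
    let mut hll = Hll::new();
    for i in 0..100_000u64 {
        hll.insert(&i);
    }
    let est = hll.estimate();
    println!("estimate: {est:.0}");
    // With 1024 registers the standard error is ~3.25%.
    assert!((est - 100_000.0).abs() / 100_000.0 < 0.15);
}
```

The whole sketch needs only 1 KB of register state regardless of how many items it sees, which is why the approach suits cardinality queries over massive flow datasets.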
These components are distributed as shared libraries with a C API and linked into various parts of our distributed storage engine. We’ve taken this same approach with libkflow, a library for generating and sending kflow records to Kentik. However, libkflow is written in Go, and the resulting shared library is large because it contains the full runtime, garbage collector, etc.
Additionally, when the Go runtime is initialized it creates many threads, which fails spectacularly when linked into a program that then forks. Go also has long-outstanding bugs that cause the runtime to segfault when linked into a static binary. We’ve had to develop and maintain patches to the Go runtime that work around these issues.
None of this is meant as an attack on Go. The majority of Kentik’s backend systems are written in Go and we’re happy with that choice. However, Rust code is easier to embed as a dynamic or static library since there is no runtime or garbage collector, and no overhead calling Rust functions from C or vice versa. Rust’s design combined with LLVM’s powerful optimizer also often produces code with superior performance, less memory use, and no garbage collection overhead.
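A brief sketch of why embedding is so simple: a Rust function exported with `#[no_mangle]` and `extern "C"` is directly callable from C, with no runtime to initialize, no extra threads, and no garbage collector. The function name here is hypothetical, not part of libkflow's actual API.

```rust
// Exported with the C ABI; a C caller would declare it as:
//   uint32_t kflow_checksum(const uint8_t *data, size_t len);
// Building with crate-type = ["cdylib"] (or ["staticlib"]) produces a
// library a C program can link against directly.
#[no_mangle]
pub extern "C" fn kflow_checksum(data: *const u8, len: usize) -> u32 {
    // Safety: the caller must pass a valid pointer/length pair, exactly
    // as it would for any C function taking a buffer.
    let bytes = unsafe { std::slice::from_raw_parts(data, len) };
    bytes.iter().fold(0u32, |acc, &b| acc.wrapping_add(b as u32))
}

fn main() {
    // Calling it from Rust here just to demonstrate the behavior.
    let buf = [1u8, 2, 3, 4];
    assert_eq!(kflow_checksum(buf.as_ptr(), buf.len()), 10);
    println!("checksum ok");
}
```

There is no runtime handshake on either side of the call: the compiled function is an ordinary symbol in the library, which is exactly why forking, static linking, and embedding all behave the way a C programmer expects.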
Rust is a critical component in the Kentik software stack, and it absolutely delivers on its promise of performance and safety while being a joy to program in. If you’re a developer who has been curious about Rust, we’d encourage you to check it out. And, of course, if you’re curious about Kentik, trial it here.