Flows vs. packet captures for network visibility
Summary
Packet captures are a great option for troubleshooting network issues and performing digital forensics. But are they a good option for always-on visibility, considering flow data gives us the vast majority of the information we need for normal network operations?
Recently, I saw some discussion online about how flow data, like NetFlow and sFlow, doesn’t provide enough network visibility compared to doing full packet captures. The idea was that unless you’re doing full packet captures, you’re not doing visibility right. Because I’ve used packet captures so many times in my career, I admit there’s a part of me that wants to agree with this. But I’ve been eyeball-deep in how network visibility works for the last two years, so I really can’t agree.
Benefits of network flow data
In my experience, the level of visibility we get with flow data is, most of the time, exactly what we need for what we’re trying to do, even when you take sampling into account. We get information about both ends of a conversation, application information, ports, protocols, path information, QoS activity, and BGP info. We can detect anomalies, recognize traffic patterns, and get information beyond just headers, such as DNS info. So there’s really a lot we’re getting without capturing, processing, and storing every single packet.
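To make the sampling point concrete, here is a minimal Python sketch of how a collector might scale 1-in-N sampled counters back up to an estimate of the real traffic. The SampledFlow record, the estimate_totals helper, and all of the numbers are made up for illustration; real collectors handle this with more care, such as per-exporter sampling rates.

```python
from dataclasses import dataclass


@dataclass
class SampledFlow:
    """Hypothetical flow record holding counters for sampled packets only."""
    src: str
    dst: str
    dst_port: int
    protocol: str
    sampled_packets: int
    sampled_bytes: int


def estimate_totals(flow, sampling_rate):
    """Scale sampled counters by a 1-in-N sampling rate to estimate totals."""
    return flow.sampled_packets * sampling_rate, flow.sampled_bytes * sampling_rate


# Illustrative values: 12 sampled packets at an assumed 1-in-1000 sampling rate.
flow = SampledFlow("10.0.0.5", "192.0.2.10", 443, "tcp",
                   sampled_packets=12, sampled_bytes=15_600)
est_packets, est_bytes = estimate_totals(flow, sampling_rate=1000)
print(f"~{est_packets} packets, ~{est_bytes} bytes")
```

The estimate gets better as a flow gets bigger, which is why sampling tends to be fine for traffic patterns and capacity questions and weaker for spotting tiny, short-lived conversations.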
Security and compliance concerns with packet captures
And, of course, we’re also avoiding the compliance and privacy problems that come up when we start cracking open payloads, such as HIPAA violations or, for some law enforcement agencies, chain-of-custody concerns.
Other issues with packet capture vs. flow
Remember that to capture literally every single packet on the network, we’re talking about installing taps to collect and aggregate the traffic and then forward it to a separate collector. That means an additional parallel network, typically a very expensive one, to buy and maintain.
Ain’t nobody got time for that.
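For a rough sense of scale, here is a back-of-the-envelope Python sketch of how much storage a single always-on capture point would chew through. The 10 Gbps link and 30% average utilization are assumptions picked purely for illustration, and the math ignores pcap file overhead, compression, and retention policies.

```python
SECONDS_PER_DAY = 86_400


def capture_bytes_per_day(link_gbps, avg_utilization):
    """Bytes written per day when capturing every packet on one link."""
    bits_per_second = link_gbps * 1e9 * avg_utilization
    return bits_per_second / 8 * SECONDS_PER_DAY


# Assumed link: 10 Gbps running at 30% average utilization.
tb_per_day = capture_bytes_per_day(10, 0.3) / 1e12
print(f"{tb_per_day:.1f} TB per day")  # prints "32.4 TB per day"
```

That is one link for one day. Multiply by the number of capture points and the number of days you need to retain, and the parallel network starts to look like the cheap part.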
A cool thing about flow is that you actually do have the ability to extract the relevant parts of the packet, create aggregated counters, and export that data through a protocol light enough that even the wimpiest processors can handle it. And if you really need to, there are mechanisms that allow the export of actual packet contents along with the interfaces they were seen on. Cisco has “Packet section” for NetFlow v9, and sFlow has the ability to export the leading bytes of each sampled packet up to a configurable length. And what’s crazy is that some devices will let you do that without sampling, and sampling is usually the biggest gripe I hear about flow.
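The "extract the relevant parts and create aggregated counters" idea can be sketched in a few lines of Python. This is only an illustration of the concept, not how any particular exporter implements it: the dictionary-based packet format and the aggregate_flows helper are hypothetical, and real flow caches also deal with timeouts, TCP flags, and export templates.

```python
from collections import defaultdict


def aggregate_flows(packets):
    """Collapse packet headers into per-flow packet and byte counters."""
    flows = defaultdict(lambda: {"packets": 0, "bytes": 0})
    for pkt in packets:
        # The classic 5-tuple identifies a flow; the payload is never stored.
        key = (pkt["src"], pkt["dst"], pkt["sport"], pkt["dport"], pkt["proto"])
        flows[key]["packets"] += 1
        flows[key]["bytes"] += pkt["length"]
    return dict(flows)


# Illustrative packet headers (addresses and sizes are made up).
packets = [
    {"src": "10.0.0.5", "dst": "192.0.2.10", "sport": 51514, "dport": 443,
     "proto": "tcp", "length": 1500},
    {"src": "10.0.0.5", "dst": "192.0.2.10", "sport": 51514, "dport": 443,
     "proto": "tcp", "length": 1500},
    {"src": "10.0.0.7", "dst": "192.0.2.53", "sport": 40000, "dport": 53,
     "proto": "udp", "length": 80},
]

flows = aggregate_flows(packets)
for key, counters in flows.items():
    print(key, counters)
```

Three packets become two small records here. On a real network the ratio is far more dramatic, which is where the savings over storing every packet come from.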
Encryption and packet captures
Another issue is that today we’re often working with TLS 1.3, which removed static RSA key exchange, so we need keys for every single encrypted session to look at payloads. The problem is a lot of devices just don’t support exporting session keys, and those keys are critical for decrypting payloads. So that means you end up storing a bunch of packets with encrypted payloads you can’t do anything with. For many scenarios, that’s totally pointless, especially considering you can get a lot of the same metadata from flows.
When should full packet captures be used?
But what about those times when full packet captures do make sense?
Orgs in some highly regulated industries are required to keep full packet captures, so whether it’s useful or cost-prohibitive is pretty much irrelevant if they want to stay in compliance.
From a troubleshooting perspective, you can’t beat a pcap. Assuming it’s traffic from your internal network, you can reconstruct an entire TCP conversation. I bet many an engineer would admit to running Wireshark on a lonely Friday night just to mess around with reconstructing a VoIP call.
And security folks might look at individual packets to do deep forensic analysis after a breach.
So clearly, there are specific use cases for full packet captures, but those are the exceptions and not the norm, at least not for most network operations teams. For those infrequent times when you need to perform granular troubleshooting or network forensic analysis, you can spin up your favorite tool and run an ad-hoc, temporary, and targeted packet capture.
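As one way to keep a capture ad hoc, temporary, and targeted, here is a small Python sketch that builds a tcpdump command bounded by a filter and a packet count. The build_capture_command helper, the interface name, the host, and the output file are all hypothetical; the tcpdump options used (-i, -c, -s, -w) are standard, but check them against the version on your system before relying on this.

```python
import shlex


def build_capture_command(interface, bpf_filter, max_packets, out_file, snaplen=0):
    """Build a tcpdump command that stops on its own after max_packets."""
    return [
        "tcpdump",
        "-i", interface,          # capture on one interface only
        "-c", str(max_packets),   # exit after this many packets
        "-s", str(snaplen),       # 0 means tcpdump's default snapshot length
        "-w", out_file,           # write raw packets to a pcap file
        bpf_filter,               # capture filter keeps the pcap targeted
    ]


# Illustrative values: one host, one port, and a hard cap on packet count.
cmd = build_capture_command("eth0", "host 192.0.2.10 and tcp port 443",
                            5000, "incident.pcap")
print(shlex.join(cmd))
```

The command is only printed here, not executed. Running it would typically need elevated privileges, and the packet cap means the capture ends by itself instead of quietly filling a disk.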
NetFlow vs. pcap in the real world
I love pcaps as much as the next packet herder, but they just don’t make sense as the primary, always-on visibility method in most daily network operations. Most of the time, flow data gives us the cost-effective and useful visibility we’re looking for.
In a perfect world, we could all afford to keep every packet and it would take 0 ms to query out the data we want. But in the real world, we pcap where we have to, and everywhere else we extract the stuff we really need via flows.