How much does RPKI ROV reduce the propagation of invalid routes?
Summary
Our analysis from earlier this year estimated that the majority of internet traffic now goes to routes covered by ROAs and is thus eligible for the protection that RPKI ROV offers. This analysis takes the next step in understanding RPKI ROV deployment by measuring the rejection of invalid routes.
Earlier this year, Job Snijders and I published an analysis that estimated the proportion of internet traffic destined for BGP routes with ROAs. The conclusion was that the majority of internet traffic goes to routes covered by ROAs and is thus eligible for the protection that RPKI ROV offers.
However, ROAs alone are useless if only a few networks are rejecting invalid routes. The next step in understanding where we stand with RPKI ROV deployment is to measure how widespread the rejection of invalid routes is.
It’s a tricky question, but one way to explore this is to measure the impact on propagation when a route is evaluated as invalid. The more networks that reject invalids, the smaller the propagation of those invalid routes. Reduced invalid route propagation reduces the damage caused by route leaks or inadvertent hijacks.
So the question is, how much does being evaluated as invalid reduce a route’s propagation?
To answer this question, we can make use of thousands of BGP routes that are persistently invalid due to ROA misconfigurations. Using a public BGP dataset like Routeviews, we can measure how far persistently invalid routes propagate on the internet as compared to their RPKI-valid and RPKI-not-found (i.e. no ROA) sibling routes. We can also look at what happens to the propagation of individual routes as their statuses change from valid to invalid.
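For readers who want the three states spelled out, here is a minimal sketch of route origin validation in the spirit of RFC 6811. It is not the tooling used for this analysis. The `Roa` tuple and the `rov_state` function are names made up for illustration, and the sketch ignores edge cases such as AS_SET origins.

```python
import ipaddress
from typing import Iterable, NamedTuple


class Roa(NamedTuple):
    """Hypothetical, simplified ROA record: one prefix, one max length, one ASN."""
    prefix: str
    max_length: int
    asn: int


def rov_state(route_prefix: str, origin_asn: int, roas: Iterable[Roa]) -> str:
    """Classify a route as 'valid', 'invalid' or 'not-found'."""
    route = ipaddress.ip_network(route_prefix)
    covered = False
    for roa in roas:
        roa_net = ipaddress.ip_network(roa.prefix)
        # A ROA covers the route only if the route sits inside the ROA prefix.
        if route.version != roa_net.version or not route.subnet_of(roa_net):
            continue
        covered = True
        # A covering ROA matches if the origin agrees and the route is not
        # more specific than the ROA's max length allows.
        if roa.asn == origin_asn and route.prefixlen <= roa.max_length:
            return "valid"
    # Covered but never matched is invalid; no covering ROA at all is not-found.
    return "invalid" if covered else "not-found"


# Documentation prefixes and private-use ASNs, not real routes.
example_roas = [Roa("192.0.2.0/24", 24, 64500)]
print(rov_state("192.0.2.0/24", 64500, example_roas))
print(rov_state("192.0.2.0/24", 64501, example_roas))
print(rov_state("198.51.100.0/24", 64500, example_roas))
```

The persistently invalid routes discussed here are the ones that land in the second case: a covering ROA exists, but the origin or prefix length does not match it.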
Histogram time!
These days, the IPv4 table consists of over 900,000 routes while the IPv6 table has a little more than 150,000 routes. If we take each route and count how many Routeviews vantage points have that route in their routing tables, we can make a histogram showing how many routes are seen by how many vantage points. The count of vantage points can serve as a measure of a route’s propagation: the more vantage points, the more propagation.
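The counting step can be sketched in a few lines. This assumes the BGP table dumps have already been parsed into (vantage point, prefix) pairs. That input shape and the name `propagation_histogram` are assumptions for illustration, not the format of the Routeviews data itself.

```python
from collections import Counter, defaultdict


def propagation_histogram(observations):
    """observations: iterable of (vantage_point, prefix) pairs.

    Returns (propagation, histogram), where propagation maps each prefix to
    the number of distinct vantage points carrying it, and histogram maps a
    vantage-point count to the number of prefixes with that count.
    """
    seen_by = defaultdict(set)
    for vantage_point, prefix in observations:
        # A set ignores repeated sightings from the same vantage point.
        seen_by[prefix].add(vantage_point)
    propagation = {prefix: len(vps) for prefix, vps in seen_by.items()}
    histogram = Counter(propagation.values())
    return propagation, histogram


# Toy input using documentation prefixes and made-up vantage point names.
toy = [
    ("vp1", "192.0.2.0/24"),
    ("vp2", "192.0.2.0/24"),
    ("vp1", "198.51.100.0/24"),
]
propagation, histogram = propagation_histogram(toy)
print(propagation)
print(sorted(histogram.items()))
```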
Here are histograms for the two address families. The peak of globally routed prefixes (those seen by nearly all vantage points) sits at around 295 vantage points for IPv4 and 240 for IPv6 (the lower number reflects the smaller number of IPv6 vantage points in the Routeviews dataset).
If we decompose each of these histograms by RPKI ROV status, we arrive at the following histograms which illustrate the difference in propagation that invalids experience as compared to the alternatives. The distributions for IPv4 (left) and IPv6 (right) are colored by RPKI status: RPKI-not-found (blue), RPKI-valid (orange) and RPKI-invalid (green).
From these histograms, we can see that invalid routes rarely, if ever, experience propagation greater than half that experienced by RPKI-valid and RPKI-not-found routes. In fact, many experience propagation significantly less than half, but the amount of reduction depends on a number of factors including the upstreams involved in transiting the prefixes. Nonetheless, it is evident that RPKI ROV dramatically reduces the propagation of invalid routes.
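One rough way to put a number on the gap between the distributions is to compare the median vantage-point count for each status. The sketch below assumes each route has already been reduced to a (status, vantage-point count) pair. The function name and the sample counts are made up for illustration and are not measurements from this analysis.

```python
from collections import defaultdict
from statistics import median


def median_propagation_by_status(routes):
    """routes: iterable of (rov_status, vantage_point_count) pairs."""
    by_status = defaultdict(list)
    for status, count in routes:
        by_status[status].append(count)
    return {status: median(counts) for status, counts in by_status.items()}


# Made-up counts, only to show the shape of the computation.
sample = [
    ("valid", 290), ("valid", 300),
    ("not-found", 295),
    ("invalid", 100), ("invalid", 140),
]
medians = median_propagation_by_status(sample)
# Ratio of invalid propagation to valid propagation for the sample data.
print(medians["invalid"] / medians["valid"])
```

A median hides the shape of the distribution, so it complements the histograms rather than replacing them.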
The plight of the invalid
On any given day, there are hundreds of BGP routes that change RPKI ROV states due to changes in either their ROAs or their origin. Let’s take a look at a couple of recent examples of routes that became invalid and how that affected how far they propagated.
A common BGP error that never seems to go away is the single-digit BGP hijack caused by a simple router misconfiguration. My friend Anurag Bhatia wrote about this phenomenon back in 2013 and it was also discussed in the recent NANOG talk by Aftab Siddiqui of MANRS. These errors occur when a network engineer attempts to prepend an AS three times, for example, but instead ends up prepending the number 3 to the AS path. The result is that it appears as though the prestigious Massachusetts Institute of Technology (AS3) is hijacking the route when in reality it was a simple misconfig.
Well, when you add RPKI ROV to the mix, this error has more than a superficial impact. In the example below, AS210974 changed how it announced 212.192.2.0/24 on August 4, 2022. It began prepending the number 3 to its AS path. Because there was a ROA for this prefix, the change also caused the route to become invalid, leading to a significant drop in propagation.
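To see why a stray 3 changes the ROV outcome, recall that the origin of a route is the rightmost ASN in its AS path. The sketch below is illustrative only: the upstream ASN 64500 is a private-use placeholder, the exact shape of both paths is an assumption, and so is the idea that the ROA authorizes AS210974 as the origin.

```python
def origin_as(as_path):
    """Return the origin AS, i.e. the rightmost ASN in the AS path."""
    return as_path[-1]


def origin_is_authorized(as_path, authorized_origins):
    """True if the path's origin is among the ASNs the ROA authorizes."""
    return origin_as(as_path) in authorized_origins


UPSTREAM = 64500          # hypothetical upstream, not the real one
AUTHORIZED = {210974}     # assumption: the ROA lists AS210974 as the origin

# Intended: the AS prepends its own number three extra times.
intended = [UPSTREAM, 210974, 210974, 210974, 210974]
# Misconfigured: the literal number 3 is prepended instead, so AS3 ends up
# in the origin position.
misconfigured = [UPSTREAM, 210974, 3]

print(origin_as(intended), origin_is_authorized(intended, AUTHORIZED))
print(origin_as(misconfigured), origin_is_authorized(misconfigured, AUTHORIZED))
```

In the intended path the origin still matches the ROA, while in the misconfigured one it does not, which is what turns a cosmetic mistake into an invalid route.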
This is depicted below in Kentik’s BGP visualization, which reports on the percentage of BGP vantage points that have routes of each origin in their tables. When AS3 becomes the origin due to the misconfig (red), the percentage of vantage points carrying this route drops by half as numerous backbone providers, including Cogent (AS174), GTT (AS3257), Lumen (AS3356), and Tata (AS6453), stop accepting this route from Bezeq (AS8551).
In another recent case, 103.169.138.0/23 changed origins on August 11, 2022, from AS142343 to AS38758. Because the new origin isn’t listed as an authorized origin AS in the ROA for this route, 103.169.138.0/23 became invalid. As a result, propagation dropped to less than half of what it was at the beginning of the day as Cogent (AS174) and Hurricane Electric (AS6939) stopped accepting the invalid route from Telin (AS7713).
Incidents like these are becoming more commonplace as RPKI adoption increases and the rate of misconfigurations remains steady. All the more reason to use route monitoring that reports on the RPKI ROV validity of your routes. Operators can monitor their routes using Kentik or the open source package BGPAlerter.
Conclusion
Based on the analysis above, the evaluation of a route as invalid reduces its propagation by between one half and two thirds. Given that the majority of internet traffic now flows towards routes with ROAs, this offers a significant degree of protection for the internet in the event of a routing leak or other inadvertent BGP hijack.
Just recently, Zayo (AS6461) announced that they would begin rejecting invalid routes. By doing so, they will join the ranks of the other tier-1 backbone networks like Arelion, NTT, Cogent, Lumen, and GTT that reject invalid routes. It will be fascinating to see how much the above distributions change when this announced change takes effect.
At this time, Zayo will only be rejecting invalids coming from peers (not customers). Since most of their biggest peers are already dropping invalids, the overall impact may be subtle. Having all of the internet’s largest backbone networks dropping invalids would be a substantial achievement in routing security, and we’re almost there!