L. Gao, “On Inferring Autonomous System Relationships in the Internet,” IEEE/ACM Transactions on Networks, V. 9, N. 6, (Dec. 2001), pp. 733-745. [PDF]
Business relationships among the ASes in the Internet are kept secret, similar to any other business deals. The only thing available to outsiders are the BGP route announcements that can be accumulated and read from different vantage points. This paper presents heuristics to infer the different business relationships (customer-provider, peer-to-peer, and sibling-to-sibling) seen in the Internet with a very high accuracy (99.1% in some cases).
The inference procedure depends on rational selective export rules for announcing routes to customers, providers, peers, and siblings; e.g., peers never transits for each other, siblings always transits for each other, and customer-provider share a one-way transit from provider-to-customer direction and not vice versa. Based on this observation, the authors also define the Valley-Free property of routes, which says that
- A provider-to-customer edge can be followed by only provider-to-customer or sibling-to-sibling edges, and
- A peer-to-peer edge can be followed by only provider-to-customer or sibling-to-sibling edges.
Essentially there are some patterns of different types of routes, and the authors exploit their presence to identify different relationships. They also assume that larger ASes have higher degree and provider ASes are bigger than their customer ASes.
There are 3 phases to the proposed inference algorithm. First, all AS-pairs are coarsely classified into customer-provider and sibling relationships. Next, all AS-pairs that cannot have peering relationships are identified. Finally, peering relationships are identified from the remaining connected AS-pairs based on the heuristic that their comparative sizes remain within a factor R.
The authors used RouteViews-collected BGP routing table entries and using their algorithms found AS relationships, which were validated by secret AT&T data. By changing the approximation parameters they managed to achieve 99.1% accuracy. It should be noted that they were mostly wrong-inferring sibling relationships. They postulate that this error can happen due to several possible reasons: router configurations errors, misconfiguration of export rules, unusual AS relationships, and possible problem with the proposed heuristics, but they do not elaborate on that.
The contribution is great and important, but somehow the way its written makes a reader want not to read it. Unfortunately, it is unnecessarily long and artificially complicated. Also some of the mathematical stuff seemed not-so-necessary. When a few lines can clearly explain something, why bother using equations, mathematical symbols, theorems, lemmas just to scare people away!!! Plus, this paper should’ve been read with other BGP related papers.