Over the past decade, communication technologies have given rise to a wide range of online services for both individuals and organizations via the Internet and other interconnected networks. Network routing protocols play a critical role in these networks to effectively move traffic from any source to any destination.

The core routing decisions on the Internet are made by the Border Gateway Protocol (BGP), as specified in IETF RFC 4271, which uses routing tables to determine reachability among autonomous systems and make route selections. BGP generally prefers the route with the shortest AS path to a destination; however, it does not guarantee that the route is optimal in terms of performance (e.g., latency, loss) or cost, as shown in the following figure.
[Figure: BGP_figure1]
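
As a rough illustration of the gap between path length and performance (the AS paths and latencies below are invented for the example, not measured data), a BGP-style comparison picks the shortest AS path even when a longer path performs better:

```python
# Illustrative only: BGP-style selection on AS-path length ignores performance.
# The AS paths and latencies below are invented for this example.
routes = [
    {"provider": "A", "as_path": [64500, 64510], "latency_ms": 95},
    {"provider": "B", "as_path": [64501, 64520, 64530], "latency_ms": 40},
]

bgp_choice = min(routes, key=lambda r: len(r["as_path"]))         # shortest AS path
performance_choice = min(routes, key=lambda r: r["latency_ms"])   # lowest latency

print("BGP picks provider", bgp_choice["provider"])                   # A (2 AS hops, 95 ms)
print("Best-performing is provider", performance_choice["provider"])  # B (3 AS hops, 40 ms)
```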

Internap’s Managed Internet Route Optimizer™ (MIRO) was specifically designed to overcome this problem by evaluating different path characteristics to create performance metrics that are used to select the best routes for Internap customers.

What is MIRO?

MIRO is a highly engineered, distributed system whose functionality can be separated into four core subsystems: Route Collection and Injection, Traffic Estimation, Performance Measurement, and Route Optimization. The following is a brief summary of each subsystem:

Route Collection and Injection – MIRO actively learns the full BGP tables (prefixes) announced by each provider so that it is aware of the different routes available to each destination. There are different ways to learn this information, including direct BGP sessions with the edge routers or SNMP queries. This subsystem is also in charge of updating routes (moving routes) by telling the routers which provider is preferred for each prefix.
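
As a simplified sketch of the idea (not MIRO’s actual implementation; the providers and prefixes are hypothetical), the collected tables can be modeled as a mapping from prefix to the providers announcing it, and "moving a route" as recording which provider the edge routers should prefer for that prefix:

```python
from ipaddress import ip_network

# Toy model of the collected routing information: prefix -> providers announcing it.
# Providers and prefixes are hypothetical.
rib = {
    ip_network("198.51.100.0/24"): {"provider_red", "provider_blue"},
    ip_network("203.0.113.0/24"):  {"provider_blue"},
}

# Current preference per prefix, as it would be pushed to the edge routers
# (e.g. by raising the chosen provider's route preference).
preferred = {}

def inject_route(prefix, provider):
    """Record that `provider` is now preferred for `prefix` (a stand-in for the
    real injection step, which updates the edge routers via BGP)."""
    if provider not in rib.get(prefix, set()):
        raise ValueError(f"{provider} does not announce {prefix}")
    preferred[prefix] = provider

inject_route(ip_network("198.51.100.0/24"), "provider_red")
print(preferred)
```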

Traffic Estimation – To estimate the volume of traffic, MIRO consumes network flow information from the edge routers (e.g., Cisco NetFlow, IPFIX). The flow information contains source and destination IP addresses, port numbers, octets, and so on. This information is aggregated into subnetworks (prefixes) that should match the ones collected by the Route Collection subsystem, and the total amount of traffic to each destination is then calculated and handed over to the Route Optimization Engine.
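
A minimal sketch of that aggregation step, assuming flow records have already been decoded from NetFlow/IPFIX into (destination IP, octets) pairs and that the collected prefixes are available:

```python
from ipaddress import ip_address, ip_network

# Hypothetical prefixes (in practice, these come from the Route Collection subsystem).
prefixes = [ip_network("198.51.100.0/24"), ip_network("203.0.113.0/25")]

# Hypothetical flow records already decoded from NetFlow/IPFIX: (destination IP, octets).
flows = [("198.51.100.7", 1_500_000), ("198.51.100.9", 800_000), ("203.0.113.10", 250_000)]

def match_prefix(ip, prefixes):
    """Longest-prefix match of an address against the collected prefixes."""
    candidates = [p for p in prefixes if ip in p]
    return max(candidates, key=lambda p: p.prefixlen) if candidates else None

bytes_per_prefix = {}
for dst, octets in flows:
    prefix = match_prefix(ip_address(dst), prefixes)
    if prefix is not None:
        bytes_per_prefix[prefix] = bytes_per_prefix.get(prefix, 0) + octets

print(bytes_per_prefix)  # total traffic per destination prefix, handed to the optimizer
```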

Performance Measurement – Performance metrics can be defined as a combination of one or more measurement variables such as latency, packet loss, and jitter. MIRO selects target IPs on each destination network for which it collects performance metrics, and it does so via different techniques, including pings and traceroutes. This information is then combined and normalized, and handed to the Route Optimization Engine.
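
One simple way to combine such variables, shown here with illustrative weights and normalization bounds rather than MIRO’s actual formula, is to scale each measurement to a 0–1 range and take a weighted sum (lower is better):

```python
# Illustrative composite score: the weights and bounds are assumptions for the example,
# not MIRO's actual model.
WEIGHTS = {"latency_ms": 0.6, "loss_pct": 0.3, "jitter_ms": 0.1}
BOUNDS = {"latency_ms": (0, 300), "loss_pct": (0, 5), "jitter_ms": (0, 50)}

def normalize(name, value):
    """Clamp a raw measurement into a 0-1 range using the assumed bounds."""
    lo, hi = BOUNDS[name]
    return min(max((value - lo) / (hi - lo), 0.0), 1.0)

def composite_score(measurements):
    """Weighted sum of normalized latency, loss, and jitter (lower is better)."""
    return sum(WEIGHTS[m] * normalize(m, v) for m, v in measurements.items())

# Hypothetical measurements for one destination prefix via two different providers.
print(composite_score({"latency_ms": 95, "loss_pct": 0.2, "jitter_ms": 4}))
print(composite_score({"latency_ms": 40, "loss_pct": 0.1, "jitter_ms": 2}))
```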

Route Optimization – The Route Optimization Engine is the brains of the MIRO system. It consumes routes and provider information, traffic estimates, performance metrics, and user rules and parameters, and runs a mathematical model to find the absolute best route for each destination. The Route Optimization Engine then sends the selected routes for the destinations to the Route Injection subsystem, which makes sure the changes are applied.

In order to optimize each component and meet quality and performance requirements, our engineering efforts had to overcome many challenges, including:

  • How to accurately calculate traffic at the prefix level, a problem which is still an open issue in the academic and research community;
  • How to optimize routes in polynomial time considering there are hundreds of millions of possible solutions;
  • How to keep track of thousands of route changes per minute from several providers without negatively impacting our edge routers’ performance; and
  • How to calculate convergence points for target selection to ensure stable and reliable probing for collecting performance measurements.

One of the main differences from its predecessor is the way the new MIRO optimizes routes. Our previous method, a heuristic TCP/IP route management control, worked with BGP in an automated manner. It updated routing tables with the best-performing routes available to provide a superior alternative to the manual route selection approach that many data centers employ to compensate for BGP’s inherent deficiencies. The new method is a deterministic approach based on a mathematical model, expressed as a linear programming formulation that considers performance, cost and efficiency as required.
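
A toy version of that kind of linear program, using SciPy and invented numbers rather than the real MIRO model, splits each destination prefix’s traffic across providers to minimize a performance-weighted cost, subject to per-provider capacity and full coverage of each prefix’s demand:

```python
import numpy as np
from scipy.optimize import linprog

# Toy data: 3 destination prefixes, 2 providers. Traffic in Mbps, scores from the
# performance subsystem (lower is better), capacities in Mbps. All numbers are invented.
traffic  = np.array([400.0, 250.0, 100.0])        # demand per prefix
score    = np.array([[0.3, 0.6],                  # score[prefix, provider]
                     [0.5, 0.2],
                     [0.4, 0.4]])
capacity = np.array([500.0, 600.0])               # per-provider capacity

n_pfx, n_prov = score.shape

# Decision variable x[p, k] = fraction of prefix p's traffic sent via provider k,
# flattened row-major for linprog. Objective: minimize performance-weighted load.
c = (score * traffic[:, None]).ravel()

# Each prefix's fractions must sum to 1 (all of its traffic is routed somewhere).
A_eq = np.zeros((n_pfx, n_pfx * n_prov))
for p in range(n_pfx):
    A_eq[p, p * n_prov:(p + 1) * n_prov] = 1.0
b_eq = np.ones(n_pfx)

# Per-provider capacity: sum over prefixes of traffic[p] * x[p, k] <= capacity[k].
A_ub = np.zeros((n_prov, n_pfx * n_prov))
for k in range(n_prov):
    for p in range(n_pfx):
        A_ub[k, p * n_prov + k] = traffic[p]
b_ub = capacity

res = linprog(c, A_ub=A_ub, b_ub=b_ub, A_eq=A_eq, b_eq=b_eq,
              bounds=[(0, 1)] * (n_pfx * n_prov), method="highs")
print(res.x.reshape(n_pfx, n_prov))   # traffic split per prefix and provider
```

The real formulation has many more inputs and constraints, but even this toy instance shows how performance scores, traffic estimates and provider capacities can combine into a single deterministic optimization.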

With the completed deployment of this new MIRO system in all our markets, we immediately confirmed better performance and much faster response times to network events. In Atlanta (ACS), for example, we selected a random day and compared the best-case average latency for all carriers against MIRO; as expected, MIRO performed better, with latency 2 to 15 milliseconds lower:

[Figure: BGP_acs-miro-latency]

In the previous figure, you can see a network event on provider RED (represented by the red line), where the average latency increased from 120 milliseconds to approximately 180 milliseconds. If we look at the amount of traffic MIRO was putting on provider RED, we can see that it reacted to the event almost instantly, moving about 2.8 Gbps of traffic to other providers (from 4.3 Gbps down to 1.5 Gbps).
[Figure: BGP_acs-209-failure-latency]

Similarly, we selected the same day in New York (NYM) to compare the average packet loss per provider against MIRO. MIRO’s average packet loss was 0.01%, versus 0.04% for the rest of the providers:

[Figure: BGP_nym-miro-loss]

As a result of consistently low latency and packet loss, MIRO brings our customers faster and more stable gaming networks, Content Delivery Networks (CDNs) and social networks, as well as better availability for end users. Even though the Internet wasn’t designed for speed, MIRO addresses the deficiencies in BGP and routes traffic along the best available paths.

To learn more about MIRO, watch the video, Maximize Internet Performance, Reduce Latency.