Thursday, May 30, 2013

BGP in Juniper: Network redundancy & traffic engineering

In network and system architecture in general, it is a fact that things go down. Systems and processes fail, and therefore we must build in redundancy. A production network connected to a single link is a catastrophe waiting to happen. If that link fails, the production network will no longer have access to the Internet and vice versa.

This is why it is always recommended to have another ISP link connected to our network. A practical solution is to connect that link to another router and run a redundancy protocol like VRRP between the two routers, but more on that practice in a future post. Here we will get not only link redundancy but also a network load balancer of sorts, using three ISP links connected to the same router. We will use policies to ensure that traffic for a particular subnet comes in via the link of our choice. This leads us to the question: why is BGP really required?

Need for BGP to do multihoming:


To receive the full Internet routing table, we need higher-end models like the Juniper MX80, which are capable of managing 400,000-plus routes. Multihoming can double or triple this, because each ISP sends its own copy of the table. Such routers can be very expensive, so the decision has to be taken carefully.

One question you may ask is why we need to know the route to every network when we can simply configure a static route pointing to our next hop. You can also have a floating static route to your other ISP, so that if the primary link goes down the other route takes over. Moreover, you can use filter-based forwarding to manipulate outbound traffic.

First of all, static routes bring their own set of problems. The first option, a backup static route, leads to one pipe being over-utilized while the other sends out no traffic at all. Remember that for an enterprise serving content, outbound traffic is typically greater than inbound traffic.

Filter-based forwarding, where one subnet takes one link and another subnet takes the other, has scalability issues. As you add links or subnets you have to carefully select which link each one sends traffic out on. If you need to shift some traffic, you really don't have many choices, because such a strategy can only balance traffic per subnet.
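
To see why a backup static route leaves one link idle, here is a minimal Python sketch of how a floating static route is chosen. The class, the next-hop addresses and the preference values are assumptions made up for illustration; the only rule modelled is "lowest preference among usable routes wins".

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class StaticRoute:
    next_hop: str     # ISP-facing next hop (hypothetical address)
    preference: int   # lower value wins
    link_up: bool     # whether the link towards the next hop is usable


def active_route(routes):
    """Return the usable static route with the lowest preference, or None."""
    usable = [r for r in routes if r.link_up]
    return min(usable, key=lambda r: r.preference, default=None)


primary = StaticRoute("10.10.10.2", preference=5, link_up=True)
backup = StaticRoute("20.20.20.2", preference=10, link_up=True)

# While both links are up, all outbound traffic follows the primary.
print(active_route([primary, backup]).next_hop)

# The backup carries traffic only once the primary has failed.
failed_primary = StaticRoute("10.10.10.2", preference=5, link_up=False)
print(active_route([failed_primary, backup]).next_hop)
```

The backup link is paid for but unused until the day the primary fails, which is exactly the waste described above.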

Having a full BGP table gives you full flexibility to manipulate traffic how you want. As you progress through the series all such aspects will be covered. Before delving into traffic manipulation it is important to learn a key concept known as AS paths.

AS Paths:

A distance vector protocol chooses routes by hop count, whereas a path vector protocol such as BGP records the full list of autonomous systems (the AS path) that a route has passed through. For example, suppose our organisation is the originator of a subnet. It will announce that subnet to its ISP with its own ASN, say 100. The ISP will then prepend its own ASN (say 150) and announce the route to its upstream or peer.

The upstream peer now has a route to the destination via AS 150. Moreover, it has direct reachability to this autonomous system. The route is therefore deemed valid and placed in the routing table; if the same subnet is also announced by other peers, the path selection algorithm decides which announcement wins.
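
As a rough illustration, the way an AS path grows hop by hop can be sketched in a few lines of Python. The function names are my own, and the loop check is standard BGP behaviour that is not discussed further in this post.

```python
def announce(as_path, local_asn):
    """Return the AS path a neighbour sees after local_asn passes the route on."""
    return [local_asn] + list(as_path)


def accepts(as_path, local_asn):
    """Loop prevention: an AS ignores routes whose path already contains its own ASN."""
    return local_asn not in as_path


path = announce([], 100)      # originated by our organisation (AS 100)
path = announce(path, 150)    # the ISP (AS 150) passes it upstream

print(path)                   # [150, 100]
print(len(path))              # 2 AS hops from the upstream's point of view
print(accepts(path, 100))     # False: AS 100 would drop its own route
```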


Shaping inbound traffic:


BGP uses a path selection algorithm to determine which route to a particular destination is the best. This link describes it. When a router receives routes from different ISPs it runs this algorithm. If the routes tie on one rule, the next rule is considered. Usually the tie is broken at the 4th step: the route with the shortest AS path is considered the better route.

In the illustration below it is clear that RTR1 is two AS hops away from RTR5. In an ideal case, when the router runs the BGP path selection algorithm it will find a tie even at rule 4 and proceed to the next rule. Unfortunately, as we go down the list the outcome is more and more out of our hands. The last step is to select the route from the peer with the lowest peer IP. Clearly we cannot control such factors. What we can control are some of the BGP attributes. What if we could announce our routes with our own AS number prepended? Let us look at an example for better understanding.

Suppose you have three subnets and want the majority of traffic for each subnet to arrive via a different one of the three links, with automatic link failover. This is where the attributes come into the picture. By modifying them we can tell the whole world that a particular path is better than the others.

Note: The arrows represent the direction of route advertisements, not the data traffic. Path of data traffic is explained below.

In this diagram RTR5 receives a total of 9 routes, 3 per subnet, all announced by our router RTR1. To install a route for 200.200.200.0/23 it will run the path selection algorithm. Let us go through all the steps.


  1. Verify the next hop can be resolved. True for all routes (the next hop being in AS 300, 350 or 400).
  2. Since all are BGP paths, all have the default route preference, so we have a tie at this step as well.
  3. By default ISPs do not modify the local preference (more about it in the next article) of their links, and therefore there is a tie at this step too.
  4. This is where the game changes. This is the first rule that we, AS 100, have control over. The table below describes how we have announced our subnets. When these announcements reach RTR5, it checks which announcement has the shortest AS path.
[Figure: AS path prepending. Caption: Path prepending in action.]


For 200.200.200.0/23 the path via AS 300 will be installed in the routing table, for 150.150.150.0/23 the path via AS 350, and for 100.100.100.0/23 the path via AS 400.
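
The AS path comparison at step 4 can be sketched in Python. This is only a toy model of that single step, not the full selection algorithm. The AS paths are the ones that follow from the prepends in the configuration later in this post, and the function name is hypothetical.

```python
# AS paths as RTR5 sees them: the upstream ASN first, then AS 100 once,
# plus one extra 100 for every prepend applied on RTR1.
announcements = {
    "200.200.200.0/23": {300: [300, 100], 400: [400, 100, 100], 350: [350, 100, 100, 100]},
    "150.150.150.0/23": {350: [350, 100], 300: [300, 100, 100], 400: [400, 100, 100, 100]},
    "100.100.100.0/23": {400: [400, 100], 350: [350, 100, 100], 300: [300, 100, 100, 100]},
}


def best_upstream(paths_by_upstream):
    """Pick the upstream AS whose AS path is shortest (step 4 only)."""
    return min(paths_by_upstream, key=lambda asn: len(paths_by_upstream[asn]))


for prefix, paths in announcements.items():
    print(prefix, "via AS", best_upstream(paths))
```

Each subnet ends up with a different preferred upstream, while the two longer paths stay available as fallbacks.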

It is obvious that this leads to load sharing, as traffic for different subnets will travel over different links. A question you may ask is why send prepended announcements in the first place? We could have simply announced each subnet via one of the three links. The answer is again network redundancy, more specifically link redundancy. If one of the links or an intermediate router goes down, RTR5 will still have alternate paths to reach RTR1. It will quickly run the best path algorithm and install the new route in the table.

Isn't it a brilliant technique? We have killed two birds with one stone. We have not only achieved automatic failover but also shaped our traffic to come in over different links. So if your subnet 100.100.100.0/23 receives more traffic, you can buy more bandwidth from the ISP with AS 400. It is more scalable and convenient.

The final part is obviously the configuration and policy needed to achieve this on a Juniper router. For more clarity on the basics of applying import and export policies, see the earlier posts in this series.

Configuration:

Scenario discussion: RTR1 will be the customer router which needs to announce the subnets mentioned above. Remember that only active routes can be announced via a dynamic routing protocol. Therefore we need these subnets in our routing table as active routes. To achieve that I have created a discard static route for each of these subnets.

Just run set routing-options static route <subnet> discard for each subnet in configuration mode, for example set routing-options static route 200.200.200.0/23 discard.

[Figure: Discard static routes are also present in the routing table.]

Remember that the default policy for BGP on Juniper is to export all active BGP routes. So if I announce all my subnets from RTR1, then all the other routers will relay that information to RTR5. You only need to set up a basic BGP session between all the connected routers. If you do not know how, see the earlier post on configuring a BGP session.

Configuration on rtr1:
set protocols bgp group test2 type external
set protocols bgp group test2 import import_bgp
set protocols bgp group test2 neighbor 10.10.10.2 export export_bgp_rtr2
set protocols bgp group test2 neighbor 10.10.10.2 peer-as 300
set protocols bgp group test2 neighbor 30.30.30.2 export export_bgp_rtr3
set protocols bgp group test2 neighbor 30.30.30.2 peer-as 400
set protocols bgp group test2 neighbor 20.20.20.2 export export_bgp_rtr4
set protocols bgp group test2 neighbor 20.20.20.2 peer-as 350

set policy-options policy-statement export_bgp_rtr2 term 1 from protocol static
set policy-options policy-statement export_bgp_rtr2 term 1 from route-filter 200.200.200.0/23 exact
set policy-options policy-statement export_bgp_rtr2 term 1 then accept
set policy-options policy-statement export_bgp_rtr2 term 2 from protocol static
set policy-options policy-statement export_bgp_rtr2 term 2 from route-filter 150.150.150.0/23 exact
set policy-options policy-statement export_bgp_rtr2 term 2 then as-path-prepend 100
set policy-options policy-statement export_bgp_rtr2 term 2 then accept
set policy-options policy-statement export_bgp_rtr2 term 3 from protocol static
set policy-options policy-statement export_bgp_rtr2 term 3 from route-filter 100.100.100.0/23 exact
set policy-options policy-statement export_bgp_rtr2 term 3 then as-path-prepend "100 100"
set policy-options policy-statement export_bgp_rtr2 term 3 then accept
set policy-options policy-statement export_bgp_rtr3 term 1 from protocol static
set policy-options policy-statement export_bgp_rtr3 term 1 from route-filter 200.200.200.0/23 exact
set policy-options policy-statement export_bgp_rtr3 term 1 then as-path-prepend 100
set policy-options policy-statement export_bgp_rtr3 term 1 then accept
set policy-options policy-statement export_bgp_rtr3 term 2 from protocol static
set policy-options policy-statement export_bgp_rtr3 term 2 from route-filter 150.150.150.0/23 exact
set policy-options policy-statement export_bgp_rtr3 term 2 then as-path-prepend "100 100"
set policy-options policy-statement export_bgp_rtr3 term 2 then accept
set policy-options policy-statement export_bgp_rtr3 term 3 from protocol static
set policy-options policy-statement export_bgp_rtr3 term 3 from route-filter 100.100.100.0/23 exact
set policy-options policy-statement export_bgp_rtr3 term 3 then accept
set policy-options policy-statement export_bgp_rtr4 term 1 from protocol static
set policy-options policy-statement export_bgp_rtr4 term 1 from route-filter 200.200.200.0/23 exact
set policy-options policy-statement export_bgp_rtr4 term 1 then as-path-prepend "100 100"
set policy-options policy-statement export_bgp_rtr4 term 1 then accept
set policy-options policy-statement export_bgp_rtr4 term 2 from protocol static
set policy-options policy-statement export_bgp_rtr4 term 2 from route-filter 150.150.150.0/23 exact
set policy-options policy-statement export_bgp_rtr4 term 2 then accept
set policy-options policy-statement export_bgp_rtr4 term 3 from protocol static
set policy-options policy-statement export_bgp_rtr4 term 3 from route-filter 100.100.100.0/23 exact
set policy-options policy-statement export_bgp_rtr4 term 3 then as-path-prepend 100
set policy-options policy-statement export_bgp_rtr4 term 3 then accept
set policy-options policy-statement import_bgp term RFC_1918 from route-filter 192.168.0.0/16 exact
set policy-options policy-statement import_bgp term RFC_1918 then reject
set policy-options policy-statement import_bgp term deny_own_pool from route-filter 200.200.200.0/23 orlonger
set policy-options policy-statement import_bgp term deny_own_pool then reject

Routing table on RTR5 
root@rtr5> show route protocol bgp

inet.0: 13 destinations, 18 routes (13 active, 0 holddown, 0 hidden)
+ = Active Route, - = Last Active, * = Both

100.100.100.0/23   *[BGP/170] 00:06:18, localpref 100
                      AS path: 400 100 I
                    > to 60.60.60.1 via em3.0
                    [BGP/170] 00:06:18, localpref 100
                      AS path: 350 100 100 I
                    > to 50.50.50.1 via em1.0
                    [BGP/170] 00:06:18, localpref 100
                      AS path: 300 100 100 100 I
                    > to 40.40.40.1 via em0.0
150.150.150.0/23   *[BGP/170] 00:06:18, localpref 100
                      AS path: 350 100 I
                    > to 50.50.50.1 via em1.0
                    [BGP/170] 00:06:18, localpref 100
                      AS path: 300 100 100 I
                    > to 40.40.40.1 via em0.0
192.168.0.0/16     *[BGP/170] 00:11:04, localpref 100
                      AS path: 400 I
                    > to 60.60.60.1 via em3.0
200.200.200.0/23   *[BGP/170] 00:09:16, localpref 100
                      AS path: 300 100 I
                    > to 40.40.40.1 via em0.0
                    [BGP/170] 00:11:04, localpref 100
                      AS path: 400 100 100 I
                    > to 60.60.60.1 via em3.0
                    [BGP/170] 00:06:22, localpref 100
                      AS path: 350 100 100 100 I
                    > to 50.50.50.1 via em1.0



Clearly RTR5 prefers a different AS for each of the three subnets. Moreover, RTR5 is learning alternate paths for each subnet (the route marked * is the best route). So even if one of the intermediate links goes down, by virtue of BGP being a dynamic routing protocol our subnets will still be reachable.
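
The failover itself can be sketched in Python, using the three paths RTR5 holds for 200.200.200.0/23 in the output above. The helper names are made up for illustration, and only the AS path length is compared.

```python
# Candidate AS paths for 200.200.200.0/23, keyed by next hop,
# copied from the RTR5 routing table shown above.
candidates = {
    "40.40.40.1": [300, 100],
    "60.60.60.1": [400, 100, 100],
    "50.50.50.1": [350, 100, 100, 100],
}


def best_next_hop(paths):
    """Shortest AS path wins; returns None when no path is left."""
    return min(paths, key=lambda nh: len(paths[nh]), default=None)


def withdraw(paths, next_hop):
    """Return a copy of the table without the route learnt via next_hop."""
    return {nh: p for nh, p in paths.items() if nh != next_hop}


print(best_next_hop(candidates))        # 40.40.40.1 (via AS 300)

after_failure = withdraw(candidates, "40.40.40.1")
print(best_next_hop(after_failure))     # 60.60.60.1 (via AS 400)
```

When the path via AS 300 is withdrawn, the once-prepended path via AS 400 takes over without any change on our side.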

This post gave the primary technique used by network administrators to engineer inbound traffic. What about outbound traffic? More on that in the next post!

Tuesday, May 28, 2013

BGP configuration in Juniper: Importing & exporting routes

The previous post (configuring bgp on juniper mx) gave you a brief hint about BGP and how a basic session can be established from a Juniper router. This post will move on further to explain how to override the default policy, which is to receive and export all active BGP routes, and customize the advertisement according to our needs.

Suppose you have procured a subnet like 200.200.200.0/23 (200.200.200.0 - 200.200.201.255). You need to inform your upstream provider and ask them to place appropriate filters to accept your announcement and propagate it further. You, on the other hand, need to create an export policy to announce your route via BGP. If you will be multihoming with BGP, i.e. connecting to two or more ISPs, you may also ask them to send you the full routing table; otherwise you will have to configure a default static route to your upstream. More on multihoming in the next article.

Consider a scenario in which you have asked them to send you the complete routing table (which serves no purpose if you have a single ISP connection). You will then also need to configure import policies to accept only relevant advertisements. Thus:

Export policies:

  •  Announce your subnets only.

Import policies:

  • Reject any private IP range
  • Reject your own subnet
  • Reject any bogon AS
  • Discard subnets with a prefix length greater than /24

The reason is that an ISP may accidentally send you BGP updates saying it has routes to private IP ranges (you should report such an incident to your upstream provider). Also, an organisation may, intentionally or otherwise, start announcing your subnets. You do not want your routers to think that your own subnets are located somewhere else.

Currently not all AS numbers have been allocated. An attacker may start announcing one of these unallocated AS numbers as their own. To protect against this, organisations such as IANA keep lists of such AS numbers and network prefixes. Such routes should be discarded.

As of now there are more than 400,000 routes in the Internet. To prevent the routing table from swelling further, ISPs by convention do not announce prefixes longer than /24. Even if they do, it is your responsibility to discard such routes.

Important note: Remember that when we "announce" a subnet via BGP, we tell the whole world that the subnet belongs to us, and we thereby influence incoming traffic. This will be helpful in the coming articles.
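
The import rules above can be sketched with Python's ipaddress module. This is only a model of the decision logic, not Junos policy. The function name and return strings are my own, the bogon list is a partial sample, and the check is on prefixes only (it does not look at AS numbers).

```python
import ipaddress

OWN_POOL = ipaddress.ip_network("200.200.200.0/23")
PRIVATE = [ipaddress.ip_network(p)
           for p in ("10.0.0.0/8", "172.16.0.0/12", "192.168.0.0/16")]
# A partial bogon sample only; real lists are longer and change over time.
BOGONS = [ipaddress.ip_network(p)
          for p in ("0.0.0.0/8", "127.0.0.0/8", "169.254.0.0/16", "224.0.0.0/4")]


def import_decision(prefix):
    """Return 'accept' or 'reject: <reason>' for a received IPv4 prefix."""
    net = ipaddress.ip_network(prefix)
    if any(net.subnet_of(block) for block in PRIVATE):
        return "reject: private (RFC 1918)"
    if net.subnet_of(OWN_POOL):
        return "reject: own subnet"
    if any(net.subnet_of(block) for block in BOGONS):
        return "reject: bogon"
    if net.prefixlen > 24:
        return "reject: longer than /24"
    return "accept"


for received in ("192.168.0.0/16", "200.200.201.0/24", "8.8.8.0/25", "8.8.8.0/24"):
    print(received, "->", import_decision(received))
```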

Configuration:


This assumes you have already set the autonomous system and configured BGP neighbors. If not, read the previous post. We will be creating two new policies, export_bgp and import_bgp, and applying them as the export and import policies respectively.

Import policy: 

set protocols bgp group test type external
set protocols bgp group test export export_bgp
set protocols bgp group test import import_bgp
set policy-options policy-statement import_bgp term RFC_1918  from route-filter 192.168.0.0/16 exact
set policy-options policy-statement import_bgp term DENY_BOGONS from prefix-list BOGON-LIST
set policy-options policy-statement import_bgp term DENY_BOGONS then reject
set policy-options policy-statement import_bgp term DENY-RFC-1918 from route-filter 10.0.0.0/8 orlonger
set policy-options policy-statement import_bgp term DENY-RFC-1918 from route-filter 172.16.0.0/12 orlonger
set policy-options policy-statement import_bgp term DENY-RFC-1918 from route-filter 192.168.0.0/16 orlonger
set policy-options policy-statement import_bgp term DENY-RFC-1918 then reject
set policy-options policy-statement import_bgp term deny-own-pool from route-filter 200.200.200.0/23 orlonger
set policy-options policy-statement import_bgp term deny-own-pool then reject
set policy-options policy-statement import_bgp term DENY-MORE-THAN-/24 from route-filter 0.0.0.0/0 prefix-length-range /25-/32
set policy-options policy-statement import_bgp term DENY-MORE-THAN-/24 then reject
set policy-options prefix-list BOGON-LIST 0.0.0.0/8
set policy-options prefix-list BOGON-LIST 127.0.0.0/8
set policy-options prefix-list BOGON-LIST 169.254.0.0/16
set policy-options prefix-list BOGON-LIST 192.0.0.0/24
set policy-options prefix-list BOGON-LIST 192.0.2.0/24
set policy-options prefix-list BOGON-LIST 198.18.0.0/15
set policy-options prefix-list BOGON-LIST 198.51.100.0/24
set policy-options prefix-list BOGON-LIST 203.0.113.0/24
set policy-options prefix-list BOGON-LIST 224.0.0.0/4

The above is a sample bogon list. RFC 1918 specifies the private network prefixes.


Export policy



set policy-options policy-statement export_bgp term 1 from protocol static
set policy-options policy-statement export_bgp term 1 from route-filter 200.200.200.0/23 exact
set policy-options policy-statement export_bgp term 1 then accept
set policy-options policy-statement export_bgp term END then reject

Important Note: Remember that for a route policy to take effect, the specified route must be present in the routing table as an active route. So if you have 200.200.200.0/24 or /22 as the active static route to your network, this policy will not match, because the route-filter is exact. You can only export routes which are present in the routing table. It is one of the tenets of routing protocols that a router sends its best available path to its neighbor. Make sure the subnet you want to announce is present in the routing table as an active route.
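
To make the difference between the exact and orlonger match types concrete, here is a small Python sketch. It imitates only these two match types, and the function is hypothetical rather than anything Junos provides.

```python
import ipaddress


def matches(route, filter_prefix, match_type):
    """Mimic two route-filter match types: 'exact' and 'orlonger'."""
    route_net = ipaddress.ip_network(route)
    filter_net = ipaddress.ip_network(filter_prefix)
    if match_type == "exact":
        # Same network address and same prefix length.
        return route_net == filter_net
    if match_type == "orlonger":
        # The filter prefix itself or anything more specific inside it.
        return route_net.subnet_of(filter_net)
    raise ValueError(f"unsupported match type: {match_type}")


for active in ("200.200.200.0/23", "200.200.200.0/24", "200.200.200.0/22"):
    print(active, matches(active, "200.200.200.0/23", "exact"))
```

Only the /23 matches the exact filter, which is why a /24 or /22 static route would not be exported by this policy.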


Troubleshooting:


One of the most useful commands is show route advertising-protocol bgp [neighbor ip]. It shows which routes are being advertised to your BGP peer.

Also, show route receive-protocol bgp [neighbor ip] shows the routes being received via BGP from that particular neighbor.

show route protocol bgp lists the routes installed in the routing table which were learnt from BGP. An asterisk * marks the active route.

In the screenshot below you will see that even though I am announcing 192.168.0.0/16 to Junos1, it does not accept the route because of the policies applied.


[Figure: BGP session with import and export policies. Caption: The ISP is sending 192.168.0.0/16 but the customer does not accept it because of policies.]


In the diagram, even though the ISP is advertising 192.168.0.0/16, the customer is not accepting it because of the policy rejecting it. You can see such routes by running the command show route hidden. 

In the next tutorial of this series I will explain why multihoming (two or more ISP connections) is necessary for a production network and how we can manipulate traffic on our links.

Important Links:


1) Practical intro to BGP session establishment on Juniper
2) RFC 1918
3) Wiki on  Bogon

Monday, May 27, 2013

Configuring bgp session on Juniper MX router


I have been fortunate to work on one of the heavy-duty routers found in the networking jungle. Juniper is an established and trusted brand among many ISPs and big enterprises. The MX series routers are industry leaders and combine the functionalities of an M series router and an EX series switch. But this post will not delve into the product description. It is more about how to configure a basic BGP session and put some default policies in place to ensure that you receive and transmit the correct routes.

You can do very cool things with BGP, the exterior gateway protocol that is used by everyone today to announce their route to everyone on the Internet.

How BGP works in a nutshell


This is just a very brief introduction to BGP. Anyone even remotely interested in knowing BGP should google for more relevant links. We will also be concerned with a few attributes through which we can manipulate traffic according to our whims and fancies.

Basically, BGP is a path vector protocol which informs a router about the direction and the complete path to a particular destination. When two neighbors form a relationship they establish what is called a BGP session. After forming a BGP session and negotiating timers such as the hold time, they start sending all the BGP routes that they know. Whenever there is a change they send triggered updates, although it may take a while for all the routers in the network to learn of it. Moreover, whenever there is an update they have to run algorithms to figure out the new best route. When all routers have the new route they are said to have converged. This process of stabilizing is called convergence.

From a practical viewpoint, each organisation tells its neighboring organisation, i.e. an ISP (also known as a peer), which public routes it wants to announce to the world. The ISP then advertises these routes to its upstream providers or to peer ISPs (there is a hierarchy of ISPs). Soon all the ISPs in the world know each other's routes. It takes some diplomacy for two ISPs to peer with each other. Currently there are more than 400,000 routes.

Every organisation is assigned an AS number (ASN) by its regional Internet registry, which in turn receives blocks of numbers from IANA. All routing devices in a particular AS belong to a particular organisation, which need not be confined to a single geographical location. Suppose company X has routers in the USA, the UK and India. All these routers will then belong to a single AS. Two ASes establish a BGP session with each other.

That is all you need to know before setting a BGP session for your organisation.

Prerequisites for forming a bgp session:

1) Ensure that your router supports BGP.
2) It should have adequate memory. Run the show task memory command to check available memory. All routes take a maximum of about 100 MB of memory.

Before showing the actual commands, remember the sub-goals we are trying to achieve:

1) Establish a BGP session
2) Start receiving routes
3) Start sending routes
4) Load balance and apply firewall policies

Establishing a BGP session


1) First make sure you have acquired an ASN for your organisation and partnered with an ISP who is willing to share the full routing table.
2) Note down the WAN IPs assigned, the next-hop IP and the ISP's AS number.

Remember that the default routing policy of BGP is to accept all BGP routes and export all active BGP routes. A general practice is to establish a BGP session and leave it for a day without announcing any subnets, to ensure that the session remains stable and does not hamper your production network.

The following configuration will set up a BGP session with the neighbor 200.10.10.2 (on a /30 link) in AS 200, using 300 as your own ASN, and will not send or receive any routes.

set policy-options policy-statement test then reject
set protocols bgp group test type external
set protocols bgp group test import test
set protocols bgp group test export test
set protocols bgp group test neighbor 200.10.10.2 peer-as 200
set routing-options autonomous-system 300


The above commands are self-explanatory. I have basically created a policy that rejects all routes and applied it as both the import and the export policy.

To check whether the BGP session is up, run the commands show bgp summary and show bgp neighbor in operational mode. In show bgp summary you should see the neighbor's IP, and for an established session the last column shows route counts (such as 0/0/0/0) instead of a state name; Idle, Connect or Active there means the session is not yet established. show bgp neighbor should report the state as Established. If the session does not come up, ensure the WAN IPs are reachable from one another. Otherwise run show log messages | last 10. It should give the error message explaining why the BGP session failed.

To check if the session is flapping, note the Flaps and Last Up/Dwn columns. If the flap count is rising, BGP is not stable and you should contact your upstream provider. The Last Up/Dwn column tells how long ago the last state change happened.

Leave this link for about a day and start receiving routes when you are sure there won't be any bgp flaps.


[Figure: A successful BGP session between two routers.]
[Figure: Both routers are neither receiving nor sending any routes.]

In the subsequent parts I will discuss how to:

  • Start receiving and advertising routes
  • Load balance incoming traffic
  • Load balance outgoing traffic
  • Apply appropriate filters, and much more

After these tutorials you should be able to run and manage multihomed bgp sessions successfully.

Part 2: Configuring BGP session and implementing import and export policies

Important links


1) Wiki on Path vector protocol 
2) Wiki on Autonomous system
3) Wiki on bgp