MPLS encapsulation for TRILL

by Kris Price on 4 March 2009

Back in September 2008, I watched an interesting Google Tech Talk by Radia Perlman that introduced me to TRILL: Routing without tears; Bridging without danger. If you find this interesting you should also read Radia’s original INFOCOM paper: Rbridges: Transparent Routing.

After watching this I became very interested in why MPLS was not used. MPLS appears to naturally lend itself to this problem, and shares similarities with the approach ultimately adopted by TRILL. Why use MPLS? Mainly for the re-use of existing hardware. Adding a new control-plane is much simpler as it is predominantly a matter of software development. Re-using this hardware would reduce development costs and the time to market.

At the time I attempted to find out why this was, but no clear answers were forthcoming. I became distracted, but the subject never fully left my mind. After pondering it at some length I thought I would summarise the basics here, in case it interests other people too.

Overview

  1. Routers run a link-state protocol, building a topological database from which they identify the shortest paths to other routers. Routers set up label switched paths (LSPs) to the other routers.
  2. Routers use either control-plane or data-plane learning of reachability information. In control-plane learning, routers learn the MAC addresses of active end stations on attached links, as a normal IEEE bridge would, and distribute this over the link-state protocol so that sending routers can identify the appropriate LSPs on which to forward frames. In data-plane learning, each router learns the appropriate LSPs by examining packets during decapsulation.
  3. When a frame from an end station arrives at a router, the router looks up the destination MAC address, identifying the egress router and the appropriate label switched path. The router encapsulates the frame in an MPLS header, and the encapsulated frame is then label switched across the network to the egress node. At the egress node the label is popped and the frame is forwarded out of the appropriate interface.
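
As a rough illustration of step 3, here is a minimal Python sketch of the push, swap and pop operations along a three-router chain. The router names, label values, MAC addresses and table layout are all made up for the example; in a real system the tables would be populated by the link-state protocol and by learning, as described in steps 1 and 2.

```python
from dataclasses import dataclass, field

@dataclass
class Frame:
    dst_mac: str
    src_mac: str
    labels: list = field(default_factory=list)  # label stack, top of stack is the last element

# Hypothetical state for a chain R1 (ingress) -> R2 (transit) -> R3 (egress).
MAC_TO_EGRESS = {"00:00:00:00:00:bb": "R3"}             # R1: destination MAC -> egress router
PUSH_LABEL = {"R3": 100}                                 # R1: egress router -> label to push
SWAP = {"R2": {100: 200}}                                # R2: incoming label -> outgoing label
LOCAL_PORTS = {"R3": {"00:00:00:00:00:bb": "port-2"}}    # R3: MAC -> attached interface

def ingress(frame):
    """Look up the destination MAC and push the label of the LSP to its egress router."""
    egress_router = MAC_TO_EGRESS[frame.dst_mac]
    frame.labels.append(PUSH_LABEL[egress_router])
    return egress_router

def transit(router, frame):
    """Label switch: swap the top label according to the transit router's table."""
    frame.labels[-1] = SWAP[router][frame.labels[-1]]

def egress(router, frame):
    """Pop the label and pick the local interface from the inner destination MAC."""
    frame.labels.pop()
    return LOCAL_PORTS[router][frame.dst_mac]

frame = Frame(dst_mac="00:00:00:00:00:bb", src_mac="00:00:00:00:00:aa")
ingress(frame)
transit("R2", frame)
out_port = egress("R3", frame)
```

The transit router never looks at the inner MAC addresses; only the ingress and egress routers do.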

Forwarding

Single-destination frames would be forwarded on multipoint-to-point (MP2P) LSPs. Point-to-point (P2P) LSPs do not scale as well. For single-destination forwarding, scaling is essentially O(n) using MP2P LSPs compared to O(n^2) using P2P LSPs.
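
To put numbers on that, here is a small sketch that counts LSPs for a network of n routers. It assumes a full mesh of unidirectional LSPs in the P2P case (one per ordered ingress/egress pair) and one merged LSP per egress router in the MP2P case; the function names are mine.

```python
def p2p_lsps(n):
    # One unidirectional LSP per ordered (ingress, egress) pair.
    return n * (n - 1)

def mp2p_lsps(n):
    # One merged LSP, a tree rooted at the egress, per egress router.
    return n

for n in (10, 100, 1000):
    print(n, p2p_lsps(n), mp2p_lsps(n))
# 10 90 10
# 100 9900 100
# 1000 999000 1000
```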

Multi-destination forwarding is trickier, with many variations on the approaches that could be adopted. It depends on clearly identifying how VLAN-tagged multi-destination frames and IP multicast are to be handled, and then on balancing the trade-offs between bending the rules of the MPLS architecture, any requirements for data-plane learning, and the intricacy of the control-plane. Trees would need to be set up: either shared trees using multipoint-to-multipoint (MP2MP) LSPs, or source-based trees using point-to-multipoint (P2MP) LSPs.

Constraining delivery of multi-destination frames

Constraining the delivery of multi-destination frames, to prevent unnecessary resource consumption in the case of VLANs and IP multicast, would fall to setting up additional trees or using hardware filtering (a feature many modern platforms either have, could adopt, or will adopt for other purposes anyway).

Some form of multi-topology routing could be used to discover the topology of each VLAN. When transmitting across each link, the outer MAC header would contain the VLAN tag of the topology in question, and on receipt the router can use this additional information to classify the packet. This can be seen to increase the MPLS label space (“per-subinterface”) and to constrain the forwarding of frames in a VLAN to links which are in that VLAN. Directly coupled router-to-router links would simply be full 802.1Q trunks.

Another option is not to be concerned about constraining frames to only those links which allow those VLANs. Instead, each router advertises its VLAN membership, and constrained distribution trees are established for each VLAN. The outer VLAN tags used on each link would not be relevant. This is the approach taken by TRILL, and it is distinctly different from 802.1Q behaviour (but that is for another post).

IP multicast can also be optimised with constrained trees, with the multicast group membership information distributed by the link-state protocol.
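
One way to picture a constrained tree is as a full distribution tree with the branches that lead to no members cut away. The sketch below does that for a single VLAN (the same idea applies to a multicast group), given the set of routers that have advertised membership. The tree shape, router names and the `prune_tree` helper are all assumptions for illustration, not something taken from TRILL or any draft.

```python
def prune_tree(children, root, members):
    """Return the tree links (parent, child) that lead to at least one member router."""
    kept = set()

    def has_member(node):
        found = node in members
        for child in children.get(node, []):
            # Visit every child; keep the link only if a member sits below it.
            if has_member(child):
                kept.add((node, child))
                found = True
        return found

    has_member(root)
    return kept

# Hypothetical distribution tree rooted at R1.
tree = {"R1": ["R2", "R3"], "R2": ["R4", "R5"], "R3": ["R6"]}

# Only R1 and R4 have advertised end stations in VLAN 10.
vlan10_links = prune_tree(tree, "R1", {"R1", "R4"})
```

Here `vlan10_links` contains only the R1 to R2 and R2 to R4 links, so frames in VLAN 10 never reach R3, R5 or R6.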

Control-plane

I did discover that MPLS encapsulation was proposed for RBridges in draft-bryant-perlman-trill-pwe-encap-00. The proposal violates the MPLS architecture, because label assignments are made upstream. That said, upstream allocation certainly does make things easier for the control-plane in a system like this, and it is an easy violation to live with. A new ethertype is used in this case to distinguish a new global labelspace.

The draft does discuss how the paths would be established and it is possible to conceive ways of establishing additional constrained trees for VLANs and IP multicast optimisation too. However, after cursory examination of methods for this, it would seem to require routers to do SPF computations for all routers in the network. A router would need to know its position in the tree to decide if it needed to install relevant forwarding state.
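
To show what that cost looks like, here is a sketch in which one router runs an SPF computation rooted at every router in the network, then reads off its own parent and children in each resulting tree. It uses plain Dijkstra on a small made-up topology with no equal-cost paths; a real protocol would also need a deterministic tie-breaking rule, which is left out here. The function names and the graph are mine, not the draft's.

```python
import heapq

def spf_tree(graph, root):
    """Dijkstra: return the parent of each router in the shortest path tree rooted at root."""
    dist = {root: 0}
    parent = {root: None}
    heap = [(0, root)]
    while heap:
        d, u = heapq.heappop(heap)
        if d > dist[u]:
            continue  # stale heap entry
        for v, cost in graph[u].items():
            nd = d + cost
            if nd < dist.get(v, float("inf")):
                dist[v] = nd
                parent[v] = u
                heapq.heappush(heap, (nd, v))
    return parent

def my_position(graph, me):
    """For every possible root, return (my parent, my children) in that root's tree."""
    position = {}
    for root in graph:  # one SPF run per router in the network
        parent = spf_tree(graph, root)
        children = sorted(n for n, p in parent.items() if p == me)
        position[root] = (parent[me], children)
    return position

# Hypothetical topology: link costs are symmetric.
graph = {
    "A": {"B": 1, "C": 3},
    "B": {"A": 1, "C": 1},
    "C": {"A": 3, "B": 1, "D": 1},
    "D": {"C": 1},
}

position_of_b = my_position(graph, "B")
```

Router B only needs forwarding state for a tree in which it has children, but it cannot know that without computing the tree first, which is where the per-root SPF runs come from.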

A signalling protocol in the control-plane (perhaps separate, or perhaps carried on the link-state protocol) could remove the need to do these SPF computations, and remove the upstream assignment of labels, making it compliant with the MPLS architecture. But this would be a trade-off, in that the resulting protocol would become more intricate.

Learning

It appears from the experience of TRILL that data-plane learning is a desirable feature. To achieve this, there are a few options that could be explored.

If the use of a global labelspace were undesirable, learning in the case of single-destination forwarding could be solved by placing an extra label in the stack when encapsulating a frame in MPLS. This can be used by the egress router to identify the ingress router. In essence these are P2P LSPs tunnelled over the MP2P LSPs. Learning in the case of multi-destination forwarding requires that P2MP LSPs be used. Because a P2MP LSP has a single source, the egress router knows where the arriving packet was encapsulated.

The simpler option is to use the global labelspace, as discussed in draft-bryant-perlman-trill-pwe-encap-00. This would permit an ingress router to place an extra label in the stack identifying itself. Because the labelspace is global, and a protocol has permitted all routers to select their own unique label, they can safely use this to identify themselves for both single- and multi-destination packets using a label stack.
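
A minimal sketch of learning from such a label stack might look like the following. It assumes each router owns one globally unique label, that the top label names the egress router, and that the label beneath it names the ingress router. The label values and the packet layout are invented for the example and are not the draft's wire format.

```python
# Hypothetical globally unique labels, one per router.
ROUTER_LABEL = {"R1": 1001, "R2": 1002, "R3": 1003}
LABEL_TO_ROUTER = {label: name for name, label in ROUTER_LABEL.items()}

def encapsulate(ingress, egress, frame):
    """Push the egress router's label on top of a label identifying the ingress router."""
    # Stack is listed top first: [forwarding label, ingress identity label].
    return {"labels": [ROUTER_LABEL[egress], ROUTER_LABEL[ingress]], "frame": frame}

def decapsulate_and_learn(packet, learned):
    """Pop both labels and record which ingress router the source MAC sits behind."""
    _forwarding_label, ingress_label = packet["labels"]
    frame = packet["frame"]
    learned[frame["src_mac"]] = LABEL_TO_ROUTER[ingress_label]
    return frame

learned_at_r3 = {}
packet = encapsulate("R1", "R3", {"src_mac": "00:00:00:00:00:aa", "dst_mac": "00:00:00:00:00:bb"})
decapsulate_and_learn(packet, learned_at_r3)
```

After this, R3 knows that 00:00:00:00:00:aa is reached via R1, so return traffic can be encapsulated straight onto the LSP towards R1 instead of being flooded.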

The final option is to look at using some kind of MAC-in-MAC encapsulation before the MPLS encapsulation, with the outer MAC header identifying the source and destination addresses of the routers in question (where only the source address is actually relevant). The egress router, after popping the MPLS label, would learn the ingress router via this outer MAC header. Again this works for both single- and multi-destination frames, but it is not very attractive. It would require more customised hardware. It is also stepping away from the use of MPLS. In this case, it would probably be more suitable simply to use 802.1ah, and that is already being explored by the IEEE’s Shortest Path Bridging effort. Many extensible MPLS platforms today have the ability to learn from labels. Those that do not could acquire this ability over the development time of the protocol.

That is the sum of it. The rest gets into the details of weighing these approaches against each other and designing the protocols.
