[NLNOG] FW: Forwarding issues related to MACs starting with a 4 or a 6 (Was: [c-nsp] Wierd MPLS/VPLS issue)

Robert Heuvel RHeuvel at atom86.net
Mon Dec 5 18:00:24 CET 2016


We have confirmed the same problem on an Arista 7150S-52 running EOS (transit MPLS traffic carried over layer 2).
Arista EOS 4.16.9M resolved it for us.

MACs that were not transported:

There may have been more, but they have not been found yet…

Thanks go to:
Richard van Looijen of Flowmailer for identifying the problem.
Edwin Kalle of 2hip for pointing us to the thread below.
And of course Job Snijders for his mail on Friday evening, which made everything fall into place for us at once and led us to look at the intermediate switch…

Robert Heuvel

On 02/12/2016, 21:45, "NLNOG on behalf of Job Snijders" <nlnog-bounces at nlnog.net on behalf of job at instituut.net> wrote:

    TL;DR: Cisco Nexus 92160 switches cannot forward packets with certain
    specific payloads. There will be no software fix; the ASIC itself is
    defective.

    Suppose you have two routers connected to each other via a VLAN across
    such a Nexus 9000 switch, and you run VPLS between these routers. Packets
    whose payload is an Ethernet frame inside that VPLS instance, and whose
    destination MAC starts with a 4 or a 6, may then be dropped.

    If you know exactly what payloads you send across these switches, you
    can work around it (in an enterprise environment, for example). But if
    you are a service provider selling VLANs from A to B, you do not know
    what the customer will send over the circuit, and then this can bite
    hard.

    Kind regards,
    ----- Forwarded message from Job Snijders <job at instituut.net> -----
    Date: Fri, 2 Dec 2016 15:32:13 +0100
    From: Job Snijders <job at instituut.net>
    To: nanog at nanog.org
    Subject: Forwarding issues related to MACs starting with a 4 or a 6  (Was: [c-nsp] Wierd MPLS/VPLS

    Hi all,

    Ever since the IEEE started allocating OUIs (MAC address ranges) in a
    randomly distributed fashion rather than sequentially, the operator
    community has suffered enormously.

    Time after time issues pop up related to MAC addresses that start with a
    4 or a 6. I believe IEEE changed their strategy to purposefully
    increase the chance of collisions with MAC squatters, to
    encourage people to register and pay the fee.

    The forwarded email at the bottom is yet another example of a widely
    deployed but fundamentally broken ASIC. The switch can't forward VPLS
    frames which contain a payload where the inner packet is destined to a
    MAC starting with a 4 or a 6. This is with the switch operating in pure
    layer-2 mode; it doesn't know what MPLS or VPLS even are. The switch is
    dropping packets on the floor, based on their _payload_. Try selling
    such circuits to customers: "discounted layer-2 service, some flows might
    not be forwarded".
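
A quick way to see which addresses fall into the affected range is to look at the first hex digit of the MAC. The sketch below is only an illustration: the helper names and the sample addresses are hypothetical, and it says nothing about whether a given device actually mishandles such frames.

```python
def first_nibble(mac: str) -> int:
    """Return the first hex digit of a 48-bit MAC address as an integer."""
    digits = mac.replace(":", "").replace("-", "").replace(".", "")
    if len(digits) != 12:
        raise ValueError(f"not a 48-bit MAC address: {mac!r}")
    return int(digits[0], 16)


def at_risk(mac: str) -> bool:
    """True if the MAC starts with a 4 or a 6, the pattern discussed here."""
    return first_nibble(mac) in (4, 6)


# Hypothetical sample addresses, not taken from any real network.
samples = [
    "4c:00:00:00:00:01",  # starts with 4
    "6400.00aa.bbcc",     # starts with 6 (dotted notation)
    "00-1b-00-00-00-02",  # starts with 0
    "f4:00:00:00:00:03",  # the 4 is the second digit, so not affected
]
for mac in samples:
    print(mac, "at risk" if at_risk(mac) else "ok")
```

Only the very first digit matters: an address such as f4:... is not in the affected set.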

    Had IEEE continued the sequential OUI allocations, it probably would've
    taken many years before we ever reached MACs starting with a 4 or a 6,
    but instead, in 2012 the first linecards started rolling out of
    factories with MACs burned in which start with a 4 or a 6, and this took
    some vendors by surprise.

    There have been quite a few issues, both in hardware and software:

    Brocade brought a 24x10GE linecard to market in 2013/2014, with
    limited FIB scale, meant for a BGP-free MPLS core, but the card can't
    keep flows together on LACP bundles if the inner packets in a pseudowire
    are destined for a 4 or 6 MAC. The result: out-of-order delivery,
    hurting performance.

    Cisco ASR 9k's had a bug where, if a payload started with a 6, the
    router assumed it was an IPv6 packet, compared the calculated packet
    length with the packet length in the packet, and obviously failed
    because an Ethernet frame is not an IPv6 packet. The result: packets
    dropped on the floor. (Fixed in 4.3(0.32)I)

    The Nexus 9000 issue described at the top of this mail.

    Brocade IronWare had an issue related to packet reordering for flows
    inside pseudowires, fixed in 2013/2014.

    There are probably many more examples out there in the wild, slowly
    driving operators insane.
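
The router bugs listed above share one shortcut: MPLS has no payload-type field, so some implementations peek at the first nibble after the label stack and treat 4 as IPv4 and 6 as IPv6. The toy model below shows how that guess goes wrong for an Ethernet frame inside a pseudowire. It is a sketch under stated assumptions, not any vendor's actual logic: the function names, the byte-sum hash, and the MAC and IP addresses are all hypothetical.

```python
import struct


def mac(text: str) -> bytes:
    return bytes.fromhex(text.replace(":", ""))


def inner_frame(dst: str, src: str, ip_id: int) -> bytes:
    """Ethernet frame with a minimal IPv4 header; only the IP ID varies."""
    ipv4 = struct.pack("!BBHHHBBH4s4s", 0x45, 0, 20, ip_id, 0, 64, 6, 0,
                       bytes([192, 0, 2, 1]), bytes([192, 0, 2, 2]))
    return mac(dst) + mac(src) + struct.pack("!H", 0x0800) + ipv4


def guess_payload(pw_payload: bytes) -> str:
    """Guess the payload type from the first nibble only (the flawed step)."""
    return {4: "ipv4", 6: "ipv6"}.get(pw_payload[0] >> 4, "other")


def naive_lag_member(pw_payload: bytes, links: int) -> int:
    """Pick a LAG member from what the guesser believes are IP addresses."""
    guess = guess_payload(pw_payload)
    if guess == "ipv4":
        key = pw_payload[12:20]   # where IPv4 src/dst addresses would sit
    elif guess == "ipv6":
        key = pw_payload[8:40]    # where IPv6 src/dst addresses would sit
    else:
        key = b""                 # not IP: nothing per-packet enters the hash
    return sum(key) % links       # toy hash; real hardware uses something stronger


def naive_ipv6_length_ok(pw_payload: bytes) -> bool:
    """Compare the 'IPv6 payload length' field with the real length."""
    (claimed,) = struct.unpack("!H", pw_payload[4:6])
    return claimed == len(pw_payload) - 40


SRC = "00:1b:00:00:00:02"
for dst in ("4c:00:00:00:00:01", "00:1b:00:00:00:03"):
    used = {naive_lag_member(inner_frame(dst, SRC, i), 4) for i in range(8)}
    print(dst, "-> LAG members used by one flow:", sorted(used))

frame = inner_frame("64:00:00:aa:bb:cc", SRC, 0)
print(guess_payload(frame), "length check passes:", naive_ipv6_length_ok(frame))
```

In this model the bytes hashed for the 4c:... frame include the inner IP ID, which changes per packet, so a single flow is sprayed across members. The 64:... frame is taken for IPv6 and fails the length comparison, because bytes 4 and 5 are really part of the destination MAC.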

    At this moment, some issues related to MACs starting with a 4 or a 6 can
    be mitigated if you enable Pseudowire Control-Word (RFC 4385) _AND_
    Flow-Aware Transport (RFC 6391). You need both to mitigate certain issues
    in multi-vendor networks (for instance if you have Cisco edge + Juniper
    core). But what to do when the ASIC won't forward the payload? As an ISP
    you often don't control the payload.
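
To make the two mitigations concrete: the control word puts four bytes whose first nibble is 0 in front of the inner frame, so a first-nibble guess no longer sees a 4 or a 6, and the flow label adds a per-flow label at the bottom of the stack so that transit routers can load-balance on labels alone. The sketch below only illustrates the byte layout. The label values, TTL and TC settings, the CRC-based flow hash, and the function names are placeholders chosen for the example, not requirements taken from the RFCs or from any implementation.

```python
import struct
import zlib


def label_entry(label: int, bottom: bool, tc: int = 0, ttl: int = 64) -> bytes:
    """Encode one 32-bit MPLS label stack entry: label(20) TC(3) S(1) TTL(8)."""
    return struct.pack("!I", (label << 12) | (tc << 9) | (int(bottom) << 8) | ttl)


def control_word(sequence: int = 0) -> bytes:
    """Ethernet pseudowire control word: first nibble 0000, zero bits, 16-bit sequence."""
    return struct.pack("!HH", 0, sequence & 0xFFFF)


def flow_label(inner: bytes) -> int:
    """Derive a per-flow label from the inner dst + src MAC (one possible choice)."""
    digest = zlib.crc32(inner[:12])
    return 16 + digest % (2**20 - 16)   # stay clear of reserved labels 0-15


def encapsulate(inner: bytes, transport_label: int, pw_label: int) -> bytes:
    """Label stack (transport, pseudowire, flow), then control word, then frame."""
    stack = (label_entry(transport_label, bottom=False)
             + label_entry(pw_label, bottom=False)
             + label_entry(flow_label(inner), bottom=True))
    return stack + control_word() + inner


# Hypothetical inner frame whose destination MAC starts with a 6.
inner = bytes.fromhex("640000aabbcc" "001b00000002" "0800") + b"payload"
packet = encapsulate(inner, transport_label=3001, pw_label=3002)

print("first nibble of the bare frame:", inner[0] >> 4)
print("first nibble after the label stack:", packet[12] >> 4)
```

Neither mechanism helps with a switch that drops the frame based on bytes deeper in the packet, which is the case the question in the paragraph above is about.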

    Unfortunately, I don't think we've seen the end of this. The linecards
    bought in 2012 will trickle down to the grey/second-hand market about
    now, often without accompanying support contracts. In a world with
    increased complexity in our interconnectedness, and lack of visibility
    into the underlying infrastructure (think remote peering, cloud
    connectivity, resellers reselling layer-2) it will hurt when some
    flows inexplicably fail to arrive.

    Dear IEEE, please pause assigning MAC addresses that start with a 4 or a
    6 for the next 6 years. Or at least, next time you change the policy,
    consult the operational community. This 4/6 MAC issue was well
    documented in BCP128 back in 2007. The control-word drafts mentioned
    that there would be dragons related to 4 and 6 back in 2004.

    Dear Vendors, take this issue more seriously. Realise that for operators
    these issues are _extremely_ hard to debug; this is an expensive time
    sink. Some of these issues are only visible under very specific, rare
    circumstances, much like chasing phantoms. So take every vague report of
    "mysterious" packet loss or packet reordering at face value and
    immediately dispatch smart people to delve into whether your software or
    hardware makes wrong assumptions based on encountering a 4 or a 6
    somewhere in the frame.

    And you, my fellow operators, please continue to publicly document these
    issues and possible workarounds.

    Kind regards,

    c-nsp thread "Wierd MPLS/VPLS issue": https://puck.nether.net/pipermail/cisco-nsp/2016-December/thread.html
    BCP128: https://tools.ietf.org/html/bcp128
    ----- Forwarded message from Simon Lockhart <simon at slimey.org> -----
    Date: Fri, 2 Dec 2016 11:44:21 +0000
    From: Simon Lockhart <simon at slimey.org>
    To: cisco-nsp at puck.nether.net
    Subject: Re: [c-nsp] Wierd MPLS/VPLS issue

    On Wed Nov 23, 2016 at 12:01:20PM +0000, Simon Lockhart wrote:
    > On Fri Nov 04, 2016 at 03:40:05PM +0000, Simon Lockhart wrote:
    > > To me, everything *looks* right, it's just that some VPLS traffic traversing
    > > the new link gets lost.
    > For those who are interested...
    > Well, I finally got to the bottom of this, and have pushed it to Cisco TAC
    > for a fix...

    Cisco TAC finally accepted the issue. Bug CSCvc33783 has been logged.
    Nexus BU has investigated.

    Response is...

    "[...] unfortunately this is an ASIC limitation on the Nexus 9000
    switches and is therefore not fixable."

    If you want a Layer 2 switch that will forward all valid Ethernet
    frames, I'd suggest avoiding the Nexus 9000 range...
    cisco-nsp mailing list  cisco-nsp at puck.nether.net
    archive at http://puck.nether.net/pipermail/cisco-nsp/
    ----- End forwarded message -----
    ----- End forwarded message -----
