<?xml version="1.0" encoding="iso-8859-1" ?>

<!DOCTYPE rfc SYSTEM "rfc2629.dtd">

<rfc ipr="trust200902"
    docName="draft-ietf-bier-mld-08"
    category="std">

<?rfc toc="yes"?> <?rfc symrefs="yes"?> <?rfc autobreaks="yes"?>
<?rfc tocindent="yes"?> <?rfc compact="yes"?> <?rfc subcompact="no"?>

<front>

<title abbrev="MLD BIER overlay">BIER Ingress Multicast Flow
Overlay using Multicast Listener Discovery Protocols</title>

<author initials="P" surname="Pfister" fullname="Pierre Pfister">
    <organization>Cisco Systems</organization>
    <address>
        <postal>
            <street/>
            <city>Paris</city>
            <country>France</country>
        </postal>
        <email>pierre.pfister@darou.fr</email>
    </address>
</author>

<author initials="IJ" surname="Wijnands" fullname="IJsbrand Wijnands">
    <organization>Cisco Systems</organization>
    <address>
        <postal>
            <street>De Kleetlaan 6a</street>
            <city>Diegem</city>
            <code>1831</code>
            <country>Belgium</country>
        </postal>
        <email>ice@cisco.com</email>
    </address>
</author>

<author initials='S.' surname='Venaas' fullname='Stig Venaas'>
    <organization>Cisco Systems</organization>
    <address><postal>
        <street>Tasman Drive</street>
	<city>San Jose</city> <region>CA</region>
	<code>95134</code>
	<country>USA</country>
      </postal>
      <email>stig@cisco.com</email>
    </address>
</author>

<author initials='C.' surname='Wang' fullname='Cui(Linda) Wang'>
    <address>
      <email>lindawangjoy@gmail.com</email>
    </address>
</author>

<author initials='Z.' surname='Zhang' fullname='Zheng(Sandy) Zhang'>
    <organization>ZTE Corporation</organization>
    <address><postal>
        <street>No.50 Software Avenue, Yuhuatai District</street>
	<city>Nanjing</city> <region>Jiangsu</region>
	<country>China</country>
      </postal>
      <email>zhang.zheng@zte.com.cn</email>
    </address>
</author>

<author initials="M" surname="Stenberg" fullname="Markus Stenberg">
    <address>
        <postal>
            <street/>
            <city>Helsinki</city>
            <code>00930</code>
            <country>Finland</country>
        </postal>
        <email>markus.stenberg@iki.fi</email>
    </address>
</author>

<date/>

<keyword>BIER</keyword>
<keyword>MLD</keyword>
<keyword>Control</keyword>

<abstract>
    <t>This document specifies the ingress part of a multicast flow overlay
    for BIER networks. Using existing multicast listener discovery protocols,
    it enables multicast membership information sharing from egress routers,
    acting as listeners, toward ingress routers, acting as queriers. Ingress
    routers keep per-egress-router state, used to construct the BIER bit mask
    associated with IP multicast packets entering the BIER domain.</t>
</abstract>

</front>
<middle>

<section anchor="intro" title="Introduction">
    <t>The Bit Index Explicit Replication (BIER - <xref target="RFC8279"/>)
    forwarding technique enables IP multicast transport across a BIER domain.
    When receiving or originating a packet, ingress routers have to construct
    a bit mask indicating which BIER egress routers located within the same
    BIER domain will receive the packet. A stateless approach would consist
    of forwarding all incoming packets toward all egress routers, which would
    in turn make a forwarding decision based on local information. But any
    more efficient approach would require ingress routers to keep some state
    about the egress routers' multicast membership information, hence
    requiring state sharing from egress routers toward ingress routers.</t>

    <!--<t>State sharing techniques inspired from existing multicast routing
	protocols such as PIMv2 <xref target='RFC7761' /> would require
	intermediate routers to keep subscription states, which is undesirable
	as BIER does not need such state on intermediate routers. In order to
	prevent that, PIMv2 could also be tunneled over BIER, but that would
	require border routers to exchange additional routing information. On
	the other hand, the Multicast Listener Discovery protocol version 2
	<xref target='RFC3810' /> is widely available and has multiple
	interoperable implementations, but is only able to share multicast
	subscription state between routers and hosts connected on the same
	link.</t>-->

    <t>This document specifies how to use the Multicast Listener Discovery
    protocol version 2 <xref target='RFC3810' /> (resp. the Internet Group
    Management protocol version 3 <xref target="RFC3376"/>) as the ingress
    part of a BIER multicast flow overlay (BIER layering is described in
    <xref target="RFC8279"/>) for IPv6 (resp. IPv4). It enables multicast
    membership information sharing from egress routers, acting as listeners,
    toward ingress routers, acting as queriers. Ingress routers keep
    per-egress-router state, used to construct the BIER bit mask associated
    with IP multicast packets entering the BIER domain.</t>
    
    <t>This document defines an MLDv2 and IGMPv3 extension type, using the
    extension scheme defined in
    <xref target='I-D.ietf-pim-igmp-mld-extension'/>,
    that is used to provide BIER-specific information about the message
    originator.</t>

    <t>This specification is applicable to both IP version 4 and version 6.
    It therefore specifies two separate mechanisms operating independently.
    For the sake of simplicity, the rest of this document uses IPv6
    terminology. It can be applied to IPv4 by replacing 'MLDv2' with 'IGMPv3',
    and following specific requirements when explicitly stated.</t>
</section>

<section anchor="terminology" title="Terminology">
    <t>The key words "MUST", "MUST NOT", "REQUIRED", "SHALL", "SHALL NOT",
    "SHOULD", "SHOULD NOT", "RECOMMENDED", "NOT RECOMMENDED", "MAY", and
    "OPTIONAL" in this document are to be interpreted as described in BCP 14
    <xref target="RFC2119"/> <xref target="RFC8174"/> when,
    and only when, they appear in all capitals, as shown here.</t>

    <t>The terms "Bit-Forwarding Router" (BFR), "Bit-Forwarding Egress Router"
    (BFER), "Bit-Forwarding Ingress Router" (BFIR), "BFR-id" and "BFR-Prefix"
    are to be interpreted as described in <xref target="RFC8279"/>.</t>

    <t>Additionally, the following definitions are used:
        <list style="hanging">
            <t hangText="BIER Multicast Listener Discovery (BMLD):">The
	    modified version of MLD specified in this document.</t>
            <t hangText="BMLD Querier:">A BFR implementing the Querier part
	    of this specification. A BMLD Node MAY be both a Querier and a
	    Listener.</t>
            <t hangText="BMLD Listener:">A BFR implementing the Listener part
	    of this specification. A BMLD Node MAY be both a Querier and a
	    Listener.</t>
        </list>
    </t>
</section>

<section anchor="overview" title="Overview">
    <t>This document proposes to use the mechanisms described in MLDv2 in
    order to enable multicast membership information sharing from BFERs
    toward BFIRs within a given BIER domain. BMLD queries (resp. reports)
    are sent over BIER toward all BMLD Nodes (resp. BMLD Queriers) using
    modified MLDv2 messages whose IP destination address is set to a
    configured 'all BMLD Nodes' (resp. 'all BMLD Queriers') IP multicast
    address.</t>

    <t>By running MLDv2 instances with per-listener explicit tracking,
    BMLD Queriers are able to map BMLD Listeners to MLDv2 membership
    state. This state is then used to construct the set of BFERs associated
    with each incoming IP multicast data packet.</t>
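
    <t>As a non-normative illustration of per-listener explicit tracking,
    the following Python sketch keeps, for each BMLD Listener (identified
    by its BFR-Prefix), the set of groups it has joined. The class and
    method names are hypothetical and are not defined by this document.
    The sketch deliberately ignores MLDv2 source filters and timers, which
    a real MLDv2 instance has to handle.</t>

    <figure>
        <artwork><![CDATA[
from collections import defaultdict


class ListenerTracker:
    """Simplified explicit tracking, keyed by listener BFR-Prefix."""

    def __init__(self):
        # BFR-Prefix (str) -> set of joined multicast groups (str)
        self._groups = defaultdict(set)

    def join(self, bfr_prefix, group):
        """Record that the listener reported interest in the group."""
        self._groups[bfr_prefix].add(group)

    def leave(self, bfr_prefix, group):
        """Record that the listener no longer wants the group."""
        self._groups[bfr_prefix].discard(group)

    def listeners_for(self, group):
        """Return the sorted BFR-Prefixes of listeners for the group."""
        return sorted(prefix
                      for prefix, groups in self._groups.items()
                      if group in groups)
        ]]></artwork>
    </figure>

    <t>In this sketch, the result of listeners_for() for the destination
    group of an incoming packet corresponds to the set of BFERs that the
    BFIR would hand to the BIER Layer.</t>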
</section>

<section anchor="as" title="Applicability Statement">
    <t>BMLD runs on top of a BIER Layer and provides the ingress part of a BIER multicast flow overlay, i.e., it specifies how BFIRs construct the set of BFERs for each ingress IP multicast data packet. The BFER part of the Multicast Flow Overlay is out of scope of this document.</t>

    <t>The BIER Layer MUST be able to transport BMLD messages toward all BMLD Queriers and Listeners. Such packets are IP multicast packets that have a BFR-Prefix as source address and a multicast destination address, and that contain an MLDv2 message.</t>

    <t>BMLD only requires state to be kept by Queriers, and is therefore more scalable than PIMv2 <xref target="RFC7761"/> in terms of overall state, but is also likely to be less scalable than PIMv2 in terms of the amount of control traffic and the size of the state that is kept by individual routers.</t>

    <t>If multiple BFIRs have connectivity to the same source, a mechanism is needed to determine
    which BFIR should be the forwarder; that mechanism is not specified in this document. As a special case,
    if BIER is used end-to-end such that sources are directly connected to the BFIRs, then an
    election mechanism is needed if there are multiple BFIRs on the same link as the source. One
    option is to utilize PIM DR Election where the DR is the BIER forwarder, but other election
    mechanisms could be used. In order to allow quick failover, the BFIRs that are not forwarders
    should still track BFER interest so that they have the correct state in case they become
    forwarders.</t>
</section>

<section anchor="specs" title="Querier and Listener Specifications">
    <t>Routers desiring to receive IP multicast traffic (e.g., for their own use, or for forwarding) MUST behave as BMLD Listeners. Routers receiving IP multicast traffic from outside the BIER domain, or originating multicast traffic, MUST behave as BMLD Queriers.</t>

    <t>BMLD Queriers (resp. BMLD Listeners) MUST act as MLDv2 Queriers (resp. MLDv2 Listeners) as specified in <xref target="RFC3810"/> unless stated otherwise in this section.</t>

    <section title="Configuration Parameters">
        <t>Both Queriers and Listeners MUST operate as BFIRs and BFERs within
	the BIER domain in order to send and receive BMLD messages. They MUST
	therefore be configured accordingly, as specified in
	<xref target="RFC8279"/>.</t>

        <t>All Listeners MUST be configured with an 'all BMLD Queriers'
	multicast address and the BFR-ids of all the BMLD Queriers. This is
	used by Listeners to send BMLD reports over BIER toward all Queriers.
	All Queriers MUST be configured to accept BMLD reports sent to this
	address.</t>

        <t>All Queriers MUST be configured with an 'all BMLD Nodes' multicast
	address and the BFR-ids of all the Queriers and Listeners. This
	information is used by Queriers to send BMLD queries over BIER toward
	all BMLD Nodes. All BMLD Nodes MUST be configured to accept BMLD
	queries sent to this address.</t>

	<t>It may be cumbersome to configure the exact set of BFR-ids for
	Queriers and Listeners. One MAY configure the set of BFR-ids to
	contain any potentially used BFR-id, perhaps having all bit positions
	set. There is no harm in configuring unused BFR-ids. Configuring the
	BFR-ids of additional routers would in most cases cause no harm, as a
	router would drop the BMLD message unless it is configured as a
	Querier or a Listener.</t>

        <t>Note that BMLD (unlike MLDv2) makes use of per-instance configured multicast group addresses rather than well-known addresses so that multiple instances of BMLD (using different group addresses) can be run simultaneously within the same BIER domain. Configured group addresses MAY be obtained from allocated IP prefixes using <xref target="RFC3306"/>. One MAY choose to use the well-known
MLDv2 addresses in one instance, but different instances MUST use different
addresses.</t>

        <t>IP packets coming from outside of the BIER domain and having a destination address set to the configured 'all BMLD Queriers' or 'all BMLD Nodes' group address MUST be dropped. It is RECOMMENDED that these configured addresses have a limited scope, and that this behavior be enforced by scope-based filtering on the BIER domain's egress interfaces.</t>
    </section>

    <section title="MLDv2 Instances">
        <t>BMLD Queriers MUST run an MLDv2 Querier instance with per-host tracking, which means they keep track of the MLDv2 state associated with each BMLD Listener. For that purpose, Listeners are identified by their respective BFR-Prefix, used as the IP source address in all BMLD reports.</t>

        <t>BMLD Listeners MUST run an MLDv2 Listener instance expressing their interest in the multicast traffic they are supposed to receive for local use or forwarding.</t>

        <t>BMLD Listeners and Queriers MUST NOT run the MLDv1 (IGMPv2 and IGMPv1 for IPv4) backward compatibility procedures.</t>

        <section title="Sending Queries">
            <t>BMLD Queries are IP packets sent over BIER by BMLD Queriers:
                <list style="symbols">
                    <t>Toward all BMLD Nodes (i.e., providing to the BIER
		    Layer the BFR-ids of all BMLD Nodes).</t>
                    <t>Without the IPv6 router alert option
		    <xref target="RFC2711"/> in the hop-by-hop extension
		    header <xref target="RFC8200"/> (or the IPv4 router alert
		    option <xref target="RFC2113"/> for IPv4).</t>
                    <t>With the IP destination address set to the 'all BMLD
		    Nodes' group address.</t>
                    <t>With a deterministic IP source address. It is
		    RECOMMENDED that the address is a BFR-Prefix of the sender,
		    but it MAY be another value. This address is only used for
		    querier election.</t>
                    <t>With a TTL value large enough such that the packet can
		    be received by all BMLD Nodes, depending on the underlying
		    BIER layer (whether it decrements the IP TTL or not) and
		    the size of the network. The default value is 64.</t>
		    <t>The extension type defined in <xref target='exttype'/>
		    MUST be included once, specifying the Sub-domain-id,
		    BFR-id and BFR-Prefix of the sender. This information may
		    be useful for logging and debugging.
		    </t>
                </list>
            </t>
        </section>

        <section title="Sending Reports">
            <t>BMLD Reports are IP packets sent over BIER by BMLD Listeners:
                <list style="symbols">
                    <t>Toward all BMLD Queriers (i.e., providing to the BIER
		    layer the BFR-ids of all BMLD Queriers).</t>
                    <t>Without the IPv6 router alert option
		    <xref target="RFC2711"/> in the hop-by-hop extension
		    header <xref target="RFC8200"/> (or the IPv4 router alert
		    option <xref target="RFC2113"/> for IPv4).</t>
                    <t>With the IP destination address set to the 'all BMLD
		    Queriers' group address.</t>
                    <t>With a deterministic IP source address. It is
		    RECOMMENDED that the address is a BFR-Prefix of the sender.
		    </t>
                    <t>With a TTL value large enough such that the packet
		    can be received by all BMLD Queriers, depending on the
		    underlying BIER layer (whether it decrements the IP TTL
		    or not) and the size of the network. The default value
		    is 64.</t>
		    <t>The extension type defined in <xref target='exttype'/>
		    MUST be included once, specifying the Sub-domain-id,
		    BFR-id and BFR-Prefix of the sender. This information is
		    used to create the necessary forwarding state for requested
		    flows, and may be useful for logging and debugging.</t>
                </list>
            </t>
	    <t>Since the reports may contain a large number of records, they
	    may become larger than the maximum BIER payload that can be
	    delivered to all the BMLD Queriers. Hence an implementation will
	    need to either use a small default maximum size, allow
	    configuration of a maximum size, or rely on MTU discovery. MTU
	    discovery may be done for a sub-domain using BIER MTU Discovery
	    <xref target="I-D.ietf-bier-mtud"/> or for the set of BMLD
	    Queriers using Path MTU Discovery
	    <xref target="I-D.ietf-bier-path-mtu-discovery"/>.
	    </t>
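
	    <t>One way to honor a configured maximum size is to spread the
	    records over several reports. The following Python sketch shows
	    a greedy split; it is a non-normative illustration, and the
	    function name, the 'overhead' parameter (octets consumed per
	    report by headers and the BIER extension) and the representation
	    of records as already-encoded byte strings are assumptions made
	    for the example.</t>

	    <figure>
	        <artwork><![CDATA[
def split_into_reports(records, max_payload, overhead):
    """Greedily group encoded records so each report fits max_payload.

    records:     list of bytes, one encoded record each
    max_payload: largest report size that can be delivered, in octets
    overhead:    fixed octets per report (headers, BIER extension)
    """
    reports, current, size = [], [], overhead
    for rec in records:
        if overhead + len(rec) > max_payload:
            raise ValueError("a single record exceeds max_payload")
        # Start a new report when the next record would not fit.
        if current and size + len(rec) > max_payload:
            reports.append(current)
            current, size = [], overhead
        current.append(rec)
        size += len(rec)
    if current:
        reports.append(current)
    return reports
	        ]]></artwork>
	    </figure>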
        </section>

        <section title="Receiving Queries">
          <t>BMLD Queriers and Listeners MUST check the destination address
	  of all the IP packets that are received or forwarded over BIER
	  whenever their own BIER bit is set in the packet. If the destination
	  address is equal to the 'all BMLD Nodes' group address, the packet is
	  processed as specified in this section.
          </t>
          <t>If the IPv6 (resp. IPv4) packet contains an ICMPv6 (resp. IGMP)
	  message of type 'Multicast Listener Query' (resp. of type
	  'Membership Query') and includes the extension defined in
	  <xref target="exttype"/>, it is processed
	  by the MLDv2 (resp. IGMPv3) instance run by the BMLD Querier.
	  It MUST be dropped otherwise.</t>
          <t>During the MLDv2 processing, the packet MUST NOT be checked
	  against the MLDv2 consistency conditions (i.e., the presence of the
	  router alert option, the TTL equaling 1 and, for IPv6 only, the
	  source address being link-local).</t>
        </section>

        <section title="Receiving Reports">
          <t>BMLD Queriers MUST check the destination address of all the IP
	  packets that are received or forwarded over BIER whenever their own
	  BIER bit is set. If the destination address is equal to the 'all
	  BMLD Queriers' group address, the packet is processed as specified
	  in this section.
          </t>
          <t>If the IPv6 (resp. IPv4) packet contains an ICMPv6 (resp. IGMP)
	  message of type 'Multicast Listener Report Message v2' (resp.
	  'Version 3 Membership Report') and includes the extension defined in
	  <xref target="exttype"/>, it is processed by the MLDv2 (resp. IGMPv3)
	  instance run by the BMLD Querier. It MUST be dropped otherwise.</t>
          <t>During the MLDv2 processing, the packet MUST NOT be checked
	  against the MLDv2 consistency conditions (i.e., the presence of
	  the router alert option, the TTL equaling 1 and, for IPv6 only, the
	  source address being link-local).</t>
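
          <t>The checks in this section can be summarized by the following
	  non-normative Python sketch for the IPv6 case. The function name,
	  its parameters and the returned strings are hypothetical; 143 is
	  the ICMPv6 type of the Version 2 Multicast Listener Report.</t>

          <figure>
              <artwork><![CDATA[
MLDV2_REPORT = 143  # ICMPv6 type: Version 2 Multicast Listener Report


def classify_report(own_bit_set, dst_addr, icmpv6_type,
                    has_bier_extension, all_queriers_addr):
    """Classify a packet received over BIER by a BMLD Querier.

    Returns "not-bmld-report" when this section does not apply,
    "drop" when the packet must be dropped, and "process" when it
    is handed to the MLDv2 instance.
    """
    if not own_bit_set or dst_addr != all_queriers_addr:
        return "not-bmld-report"
    if icmpv6_type != MLDV2_REPORT or not has_bier_extension:
        return "drop"
    # The MLDv2 instance then skips the router alert, hop limit
    # and link-local source address checks.
    return "process"
              ]]></artwork>
          </figure>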
        </section>

    </section>

    <section title="Packet Forwarding">
      <t>BMLD Queriers use the information obtained through BMLD, together
      with the extension defined in <xref target="exttype"/>, to track
      membership state, including the Sub-domain-id, BFR-id and BFR-Prefix
      of the members, and configure the BIER Layer accordingly.
      </t>

      <t>More specifically, the membership state associated with each BMLD
      Listener is provided to the BIER Layer such that, whenever a multicast
      packet enters the BIER domain and matches the membership information
      from a BMLD Listener, that Listener's Sub-domain-id and BFR-id are
      added to the set of Sub-domains and BFR-ids to which the BIER Layer
      should forward the packet.</t>
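
      <t>As a non-normative illustration, the following Python sketch
      turns the (Sub-domain-id, BFR-id) pairs of the matching Listeners
      into BIER bit strings, using the BFR-id to Set Identifier and bit
      position mapping of <xref target="RFC8279"/>. The function name, the
      representation of a bit string as an integer whose least significant
      bit is bit position 1, and the BitStringLength of 256 used as the
      default are assumptions made for the example.</t>

      <figure>
          <artwork><![CDATA[
from collections import defaultdict


def build_bitstrings(members, bitstring_length=256):
    """Map interested listeners to BIER bit strings.

    members: iterable of (sub_domain_id, bfr_id) pairs.
    Returns {(sub_domain_id, set_identifier): bitstring}, where each
    bit string is an int and bit position 1 is the lowest bit.
    """
    bitstrings = defaultdict(int)
    for sub_domain_id, bfr_id in members:
        if bfr_id < 1:
            raise ValueError("BFR-ids start at 1")
        # SI = (BFR-id - 1) // BSL, bit position = offset + 1
        set_id, offset = divmod(bfr_id - 1, bitstring_length)
        bitstrings[(sub_domain_id, set_id)] |= 1 << offset
    return dict(bitstrings)
          ]]></artwork>
      </figure>

      <t>With these assumptions, a packet whose matching Listeners span
      several Sub-domains or Set Identifiers yields one bit string per
      (Sub-domain-id, Set Identifier) pair.</t>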
    </section>

</section>

<!--<section anchor="IPv4" title="Legacy IP Support">
    <t>The present specification may be applied to IPv4 by either:
        <list style="symbols">
            <t>Substituting MLDv2 by IGMPv3 <xref target="RFC3376"/> in the present document and replacing IP unicast or multicast addresses by IPv4 addresses.</t>
            <t>Encoding IPv4 addresses into MLDv2 messages as IPv6-mapped IPv4 addresses <xref target="RFC4291"/>.</t>
        </list>
    </t>
</section>-->

<section anchor="exttype" title="BIER MLD/IGMP Extension Type">
  <t>A new MLD/IGMP extension type adds BIER-specific information to IGMP/MLD
  messages, using the extension scheme defined in
  <xref target="I-D.ietf-pim-igmp-mld-extension"/>. The BIER-specific
  information is the same as the PTA tunnel identifier in
  <xref target="RFC8556"/> and is shown in <xref target="exttypefigure"/>.
  Note that, as defined in MLD (resp. IGMP), existing implementations
  are supposed to ignore this additional data.</t>

  <figure anchor="exttypefigure" title="MLD/IGMP Extension Type for BIER">
    <artwork>
      <![CDATA[
     0                   1                   2                   3
     0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1
    +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
    |          Ext Type TBD         |       Extension Length        |
    +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
    | Sub-domain ID |   Reserved    |             BFR-ID            |
    +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
    |                         BFR-Prefix 1                          |
    +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
    ~                                                               ~
    +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
    |                         BFR-Prefix n                          |
    +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
      ]]></artwork>
      <postamble></postamble>
  </figure>    
  <t>
    <list style="symbols">
      <t>Ext Type: Assigned by IANA, identifying this BIER extension.</t>

      <t>Extension Length: The length in octets of the data after this field. If there are
      n IPv4 prefixes, the length is 4 + 4 * n; if there are n IPv6 prefixes, the
      length is 4 + 16 * n.</t>

      <t>Sub-domain-id: A single octet containing a
      BIER sub-domain-id (see <xref target="RFC8279"/>).  This indicates
      the BIER sub-domain of the router originating the message.</t>

      <t>Reserved: A single octet, which MUST be set to 0 when sending and ignored when receiving.</t>

      <t>BFR-id: A two-octet field containing the BFR-id,
      in the specified sub-domain, of the router originating the message.</t>

      <t>BFR-Prefix: The BFR-Prefix (see <xref target="RFC8279"/>) of the
      router that is originating the message. The BFR-Prefix will either be
      a /32 IPv4 address or a /128 IPv6 address.</t>
    </list>
  </t>
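  <t>The layout and length rules above can be illustrated with the
  following non-normative Python sketch. The function names are
  hypothetical, and the Ext Type value 0xFFFF is only a placeholder,
  since the actual value is still to be assigned by IANA. The address
  family is passed in explicitly when parsing, on the assumption that it
  is known from the enclosing message (MLD for IPv6, IGMP for IPv4).</t>

  <figure>
    <artwork><![CDATA[
import ipaddress
import struct

EXT_TYPE_BIER = 0xFFFF  # placeholder only; real value is TBD (IANA)


def pack_bier_extension(sub_domain_id, bfr_id, prefixes):
    """Encode the extension; prefixes are address strings."""
    addrs = [ipaddress.ip_address(p) for p in prefixes]
    if not addrs or len({a.version for a in addrs}) != 1:
        raise ValueError("need prefixes of one address family")
    # Sub-domain ID, Reserved (0), BFR-ID, then the BFR-Prefixes.
    body = struct.pack("!BBH", sub_domain_id, 0, bfr_id)
    body += b"".join(a.packed for a in addrs)
    # Extension Length counts the octets after the length field.
    return struct.pack("!HH", EXT_TYPE_BIER, len(body)) + body


def unpack_bier_extension(data, ipv6):
    """Decode the extension; ipv6 selects 16- or 4-octet prefixes."""
    ext_type, length = struct.unpack_from("!HH", data, 0)
    body = data[4:4 + length]
    addr_len = 16 if ipv6 else 4
    if len(body) != length or length < 4 or (length - 4) % addr_len:
        raise ValueError("malformed BIER extension")
    sub_domain_id, _reserved, bfr_id = struct.unpack_from("!BBH", body)
    prefixes = [str(ipaddress.ip_address(body[i:i + addr_len]))
                for i in range(4, length, addr_len)]
    return ext_type, sub_domain_id, bfr_id, prefixes
    ]]></artwork>
  </figure>

  <t>For a single IPv4 BFR-Prefix, this sketch produces an Extension
  Length of 8 (4 + 4 * 1) and a total encoding of 12 octets.</t>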
  <t>This extension type MUST be present once in all IGMP and MLD messages
  when originated with a BIER header to identify the BIER originator. It is
  expected that any BIER router originating IGMP/MLD messages in BIER supports
  this specification. Any IGMP/MLD messages that do not contain this
  extension MUST be dropped by the decapsulating router with
  no processing other than potentially logging or debugging. It is expected
  that any BIER router processing IGMP/MLD messages with BIER encapsulation
  supports this specification. If they do not, they will likely ignore the
  report since they cannot identify the BIER receiver, but they may be able
  to derive some of the receiver information from the BIER header.
  </t>
</section>

<section anchor="sc" title="Security Considerations">
  <t>BMLD makes use of IGMPv3/MLDv2 messages transported over BIER in order to
  configure the BIER Layer of BFIRs. BMLD messages MUST be secured, either
  by relying on physical or link-layer security, by securing the IP packets
  (e.g., using IPsec <xref target="RFC4301"/>), or by relying on security
  features provided by the BIER Layer.</t>
  <t>By spoofing the IP source address, an attacker could become the IGMP/MLD
  querier. Once one becomes the querier, several attack vectors are possible.
  This is similar to regular IGMP/MLD without BIER encapsulation.</t>
  <t>An attacker could send reports with the BIER IGMP/MLD extension
  (<xref target="exttype"/>) specifying a BFR-id and BFR-Prefix identifying
  another router. This would allow the attacker to:
  <list style="symbols">
    <t>Redirect undesired traffic toward the spoofed router by subscribing to
    undesired multicast traffic.</t>
    <t>Prevent desired multicast traffic from reaching the spoofed router by
    unsubscribing to some desired multicast traffic.</t>
  </list>
  </t>
</section>

<section anchor="ic" title="IANA Considerations">
  <t>This document requests that IANA assign a new type called BIER
  information in the registry defined in
  <xref target='I-D.ietf-pim-igmp-mld-extension'/>.
  </t>
</section>

<section anchor="ack" title="Acknowledgements">
    <t>Comments concerning this document are very welcome.</t>
</section>

</middle>

<back>

<references title="Normative References">
    <?rfc include="reference.RFC.2113.xml"?>
    <?rfc include="reference.RFC.2119.xml"?>
    <?rfc include="reference.RFC.3376.xml"?>
    <?rfc include="reference.RFC.3810.xml"?>
    <?rfc include="reference.RFC.8174.xml"?>
    <?rfc include="reference.RFC.8279.xml"?>
    <?rfc include="reference.I-D.ietf-pim-igmp-mld-extension.xml"?>
</references>

<references title="Informative References">
    <?rfc include="reference.RFC.2711.xml"?>
    <?rfc include="reference.RFC.3306.xml"?>
    <?rfc include="reference.RFC.4301.xml"?>
    <?rfc include="reference.RFC.5015.xml"?>
    <?rfc include="reference.RFC.7348.xml"?>
    <?rfc include="reference.RFC.7365.xml"?>
    <?rfc include="reference.RFC.7761.xml"?>
    <?rfc include="reference.RFC.8200.xml"?>
    <?rfc include="reference.RFC.8556.xml"?>
    <?rfc include="reference.I-D.ietf-bier-mtud.xml"?>
    <?rfc include="reference.I-D.ietf-bier-path-mtu-discovery.xml"?>
</references>

<section title="BIER Use Case in Data Centers">
  <t>In current data center virtualization, Virtual eXtensible Local Area
  Network (VXLAN) <xref target="RFC7348"/> is a network virtualization
  overlay technology that runs between Network Virtualization Edges (NVEs)
  and is intended for multi-tenant data center networks. Its reference
  architecture is illustrated in <xref target="nvo3arch"/>.</t>

  <figure anchor="nvo3arch" title="NVO3 Architecture">
    <artwork align="center">
    +--------+                                             +--------+
    | Tenant +--+                                     +----| Tenant |
    | System |  |                                    (')   | System |
    +--------+  |          ................         (   )  +--------+
                |  +-+--+  .              .  +--+-+  (_)
                |  | NVE|--.              .--| NVE|   |
                +--|    |  .              .  |    |---+
                   +-+--+  .              .  +--+-+
                   /       .              .
                  /        .  L3 Overlay  .  +--+-++--------+
    +--------+   /         .    Network   .  | NVE|| Tenant |
    | Tenant +--+          .              .--|    || System |
    | System |             .              .  +--+-++--------+
    +--------+             ................
    </artwork>
  </figure>

  <t>
    Two methods are commonly used to forward Broadcast, Unknown Unicast,
    and Multicast (BUM) packets in this virtualization overlay network.
    The first uses PIM as the underlay multicast routing protocol, such as
    PIM-SM <xref target="RFC7761"/> or PIM-BIDIR
    <xref target="RFC5015"/>, to build an explicit multicast distribution
    tree.  When BUM packets arrive at an NVE, the NVE needs a mapping
    between the VXLAN Network Identifier (VNI) and an IP multicast group.
    Using this mapping, the NVE encapsulates BUM packets in multicast
    packets whose group address is the mapped IP multicast group address
    and steers them through the explicit multicast distribution tree to
    the destination NVEs.  This method has two serious drawbacks.  First,
    it requires the underlay network to support a complicated multicast
    routing protocol and to maintain multicast-related per-flow state in
    every transit node.  Second, choosing the ratio of the mapping between
    VNIs and IP multicast groups is an issue.  If the ratio is 1:1, the
    underlay network may need up to 16M multicast groups to map the 16M
    VNIs, which is a significant challenge for data center devices.  If
    the ratio is n:1, bandwidth is used inefficiently, which is not
    optimal in data center networks.
  </t>
  <t>
    The other method uses ingress replication, which requires each NVE to
    create a mapping between the VXLAN Network Identifier and the remote
    addresses of the NVEs that belong to the same virtual network.  When
    an NVE receives BUM traffic from an attached tenant, it encapsulates
    the BUM packets in unicast packets, replicates them, and tunnels a
    copy to each remote NVE.  Although this method removes the burden of
    running a multicast protocol in the underlay network, it has a
    significant disadvantage: it wastes a large amount of bandwidth,
    especially in large data centers with many receivers.
  </t>
  <t>
    BIER <xref target="RFC8279"/> is
    an architecture that provides optimal multicast forwarding through a
    "BIER domain" without requiring intermediate routers to maintain any
    multicast-related per-flow state.  BIER also does not require any
    explicit tree-building protocol for its operation.  A multicast data
    packet enters a BIER domain at a "Bit-Forwarding Ingress Router"
    (BFIR), and leaves the BIER domain at one or more "Bit-Forwarding
    Egress Routers" (BFERs).  The BFIR adds a BIER header to the
    packet.  The BIER header contains a bit-string in which each bit
    represents exactly one BFER to forward the packet to.  The set of
    BFERs to which the multicast packet needs to be forwarded is
    expressed by setting the bits that correspond to those routers in the
    BIER header.  Specifically, for BIER-TE, the BIER header may also
    contain a bit-string in which each bit indicates the link the flow
    passes through.
  </t>
  <t>
    The following subsections propose how to use an overlay multicast
    protocol to carry virtual network information and to create a mapping
    between the virtual network information and the bit-string, in order
    to implement BUM services in data centers.
  </t>

  <section title="Convention and Terminology">
    <t>
      NVO3 terminology is defined in <xref target="RFC7365"/>.
      The terms used most often in this appendix are listed below.
    </t>
    <t>
      <list style="hanging">
	<t hangText="NVE:">
	  Network Virtualization Edge, which is the entity that implements
	  the overlay functionality.  An NVE resides at the boundary between a
	  Tenant System and the overlay network.
	</t>
	<t hangText="VXLAN:">
	  Virtual eXtensible Local Area Network
	</t>
	<t hangText="VNI:">
	  VXLAN Network Identifier
	</t>
	<t hangText="Virtual Network Context Identifier:">
	  Field in an overlay encapsulation
	  header that identifies the specific VN the packet belongs to.
	</t>
      </list>
    </t>
  </section>
  <section title="BIER in data centers">
    <t>
      This section describes how to use BIER as an optimal scheme to
      forward broadcast, unknown unicast, and multicast (BUM) packets when
      they arrive at the ingress NVE in data centers.
    </t>
    <t>
      The principle of using BIER to forward BUM traffic is as follows.
      First, each ingress NVE needs a mapping between the Virtual Network
      Context Identifier and the bit-string in which each bit represents
      exactly one egress NVE to forward the packet to.  Then, when
      receiving BUM traffic, the BFIR/ingress NVE maps that traffic to the
      corresponding bit-string, adds the BIER header, and forwards the
      encapsulated BUM traffic into the BIER domain toward the
      BFERs/egress NVEs indicated by the bit-string.
    </t>
    <t>
      The main issue, discussed below, is how each ingress NVE learns
      which egress NVEs belong to the same virtual network and creates the
      mapping.  BIER Multicast Listener Discovery is an overlay solution
      that lets ingress routers keep per-egress-router state in order to
      construct the BIER bit-string associated with IP multicast packets
      entering the BIER domain.  The following section extends BIER MLD to
      carry virtual network information (such as the Virtual Network
      Context Identifier) and to advertise it between NVEs.  When an NVE
      receives this information, it creates the mapping between the
      virtual network information and the bit-string representing the
      other NVEs that belong to the same virtual network.
    </t>
  </section>
  <section title="A BIER MLD solution for Virtual Network information">
  <t>The BIER MLD solution allows multiple MLD instances, each using a
     unique pair of BMLD Nodes and BMLD Querier addresses.
     Assume for now that there is a unique instance per VNI and that all BMLD
     routers use the same mapping between VNIs and BMLD address pairs.
     Also, for each VNI there is a multicast group used for encapsulation of
     BUM traffic over BIER. This group may be shared by some or
     all of the VNIs.
  </t>
  <t>
    Each NVE acquires the Virtual Network information and advertises
    it to other NVEs in MLD messages. For a given VNI, it sends BMLD
    reports to the BMLD Nodes address used for that VNI, for the group
    used to deliver BUM traffic for that VNI. This allows all NVEs to
    know which other NVEs are interested in BUM traffic for a
    particular VNI. If an attached virtual network is migrated, the NVE
    withdraws the Virtual Network information by sending an unsolicited
    BMLD report. Note that NVEs also respond to periodic queries sent to
    the BMLD Nodes addresses corresponding to VNIs in which they have
    interest.
  </t>
  <t>
    When an ingress NVE receives a Virtual Network information
    advertisement, it builds a mapping between the Virtual Network
    Context Identifier received in the message and a bit-string in
    which each bit represents one egress NVE that has advertised the
    same Virtual Network information.  When the ingress NVE
    subsequently receives MLD advertisements carrying the same Virtual
    Network information from other NVEs, it updates the bit-string in
    the mapping by adding the bits for the sending NVEs.  When the
    ingress NVE removes a virtual network, it deletes the mapping for
    that virtual network and sends a withdraw message to the other
    NVEs.
  </t>
  <t>
    Once this exchange of MLD messages is complete, each ingress NVE
    knows which egress NVEs belong to the same virtual network.  When
    it receives BUM traffic from an attached virtual network, the
    ingress NVE knows how to encapsulate the traffic and where to
    forward it.
  </t>
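  <t>
    As an informal illustration only, the per-VNI state described
    above could be kept as in the following Python sketch.  The class
    and method names are hypothetical and are not defined by this
    document.  The sketch assumes that each NVE is identified by its
    BFR-id, that BFR-id n corresponds to bit n of the bit-string with
    bit 1 as the least significant bit, and that a single Sub-domain
    and Set Identifier are in use.
  </t>
  <figure><artwork><![CDATA[
```python
class VniBitStringMap:
    """Hypothetical per-NVE table: VNI -> bit-string (an int)."""

    def __init__(self):
        self._bits = {}

    @staticmethod
    def _bit(bfr_id):
        # Assumption: BFR-id n is bit n, bit 1 least significant.
        if bfr_id < 1:
            raise ValueError("BFR-id must be >= 1")
        return 1 << (bfr_id - 1)

    def on_report(self, vni, bfr_id):
        # A remote NVE advertised interest in this VNI.
        self._bits[vni] = self._bits.get(vni, 0) | self._bit(bfr_id)

    def on_withdraw(self, vni, bfr_id):
        # A remote NVE withdrew its interest in this VNI.
        remaining = self._bits.get(vni, 0) & ~self._bit(bfr_id)
        if remaining:
            self._bits[vni] = remaining
        else:
            self._bits.pop(vni, None)

    def remove_vni(self, vni):
        # The local NVE no longer serves this virtual network.
        self._bits.pop(vni, None)

    def bit_string(self, vni):
        # Bit-string for BUM traffic of this VNI (0 if unknown).
        return self._bits.get(vni, 0)
```
]]></artwork></figure>
  <t>
    For example, after reports for the same VNI from the NVEs with
    BFR-ids 1 and 3, bit_string() for that VNI returns a value with
    bits 1 and 3 set, which the ingress NVE would place in the BIER
    header of BUM packets for that VNI.
  </t>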
  <t>
   This can be used in both IPv4 and IPv6 networks.  In IPv4, IGMP is
   extended in a similar way to carry the Virtual Network information
   TLV in the Version 2 membership report message.
  </t>
  <t>Note that it is possible to have multiple VNIs map to the same
     pair of BMLD addresses. Provided that VNIs mapping to the same BMLD
     address pair use different multicast groups for encapsulation, this
     is not a problem, because each instance tracks interest for
     each multicast group separately. If multiple VNIs map to the same
     pair and the multicast group used is not unique, some NVEs may
     receive BUM traffic in which they are not interested. An NVE
     would drop packets for an unknown VNI, but this wastes some
     bandwidth and processing. This is similar to the non-BIER case
     where there is not a unique multicast group for encapsulation. The
     improvement offered by BMLD comes from using multiple instances,
     which reduces the problems caused by using the same transport
     group for multiple VNIs.
  </t>
  </section>
</section>
</back>
</rfc>
