Multicast On-path Telemetry using IOAM


   This document specifies the solutions to meet the requirements of on-
   path telemetry for multicast traffic using In-situ OAM.  While In-
   situ OAM is advantageous for multicast traffic telemetry, some unique
   challenges are present.  This document provides the solutions based
   on the In-situ OAM trace option and direct export option to support
   the telemetry data correlation and the multicast tree reconstruction
   without incurring data redundancy.

Requirements Language

   The key words "MUST", "MUST NOT", "REQUIRED", "SHALL", "SHALL NOT",
   "SHOULD", "SHOULD NOT", "RECOMMENDED", "NOT RECOMMENDED", "MAY", and
   "OPTIONAL" in this document are to be interpreted as described in BCP
   14 [RFC2119][RFC8174] when, and only when, they appear in all
   capitals, as shown here.

1.  Introduction

   IP Multicast has had many useful applications for several decades.
   [I-D.ietf-pim-multicast-lessons-learned] provides a thorough
   historical perspective about the design and deployment of many of the
   multicast routing protocols in use with the various applications.  We
   will mention a few of these throughout this document and in the
   Application Considerations section.  IP Multicast has been used by
   residential broadband customers across operator networks, private
   MPLS customers and internal customers within corporate intranets.  IP
   Multicast has provided real time interactive online meetings or
   podcasts, IPTV, and financial markets real-time data, which all have
   a reliance on UDP's unreliable transport.  End-to-end QoS, therefore,
   should be a critical component of multicast deployment in order to
   provide a good end user experience within a specific operational
   domain.  In multicast real-time media streaming, if a single packet
   is lost within a keyframe and cannot be recovered using forward error
   correction, this can result in many receivers being unable to decode
   subsequent frames within the Group of Pictures (GoP), resulting in
   video freezes or black pictures until another keyframe is delivered.
   Unexpectedly long delays in delivery of packets can result in
   timeouts with similar results.  Multicast packet loss and delays
   can therefore affect application performance and the user experience
   within a domain.

   It is essential to monitor the performance of multicast traffic.  New
   on-path telemetry techniques, such as In-situ OAM (IOAM) [RFC9197],
   IOAM Direct Export (DEX) [RFC9326], IOAM Marking-based Postcard
   (PBT-M) [], and Hybrid Two-Step (HTS)
   [I-D.ietf-ippm-hybrid-two-step], complement existing active OAM
   performance monitoring methods like ICMP ping [RFC0792].  However,
   multicast traffic's unique characteristics present challenges in
   applying these techniques efficiently.

   The IP multicast packet data for a particular (S, G) state remains
   identical across different branches to multiple receivers.  When IOAM
   trace data is added to multicast packets, each replicated packet
   retains telemetry data for its entire forwarding path.  This results
   in redundant data collection for common path segments, unnecessarily
   consuming extra network bandwidth.  For large multicast trees, this
   redundancy is substantial.  Using solutions like IOAM DEX could be
   more efficient by eliminating data redundancy, but IOAM DEX lacks a
   branch identifier, complicating telemetry data correlation and
   multicast tree reconstruction.

   This draft provides two solutions to the IOAM data redundancy problem
   based on the IOAM standards.  The requirements for multicast traffic
   telemetry are discussed along with the issues of the existing on-path
   telemetry techniques.  We propose modifications and extensions that
   adapt these techniques to multicast so that the original multicast
   tree can be correctly reconstructed while eliminating redundant data.
   This document does not cover the operational
   considerations such as how to enable the telemetry on a subset of the
   traffic to avoid overloading the network or the data collector.

2.  Requirements for Multicast Traffic Telemetry

   Multicast traffic is forwarded through a multicast tree.  With PIM
   [RFC7761] and P2MP, the forwarding tree is established and maintained
   by the multicast routing protocol.

   The requirements for multicast traffic telemetry which are addressed
   by the solutions in this document are:

   *  Reconstruct and visualize the multicast tree through data plane

   *  Gather the multicast packet delay and jitter performance on each

   *  Find the multicast packet drop location and reason.

   In order to meet all of these requirements, we need the ability to
   directly monitor the multicast traffic and derive data from the
   multicast packets.  The conventional OAM mechanisms, such as
   multicast ping [RFC6450], trace [RFC8487], and RTCP [RFC3605], are
   not sufficient to meet all of these requirements.  The telemetry
   methods in this draft meet these requirements by providing granular
   hop-by-hop network monitoring along with the reduction of data
   redundancy.
3.  Issues of Existing Techniques

   On-path telemetry techniques that directly retrieve data from
   multicast traffic's live network experience are ideal for addressing
   the aforementioned requirements.  The representative techniques
   include In-situ OAM (IOAM) Trace option [RFC9197], IOAM Direct Export
   (DEX) option [RFC9326], and PBT-M
   [].  However, unlike unicast,
   multicast poses some unique challenges to applying these techniques.

   Multicast packets are replicated at each branch fork node in the
   corresponding multicast tree.  Therefore, there are multiple copies
   of the original multicast packet in the network.

   When the IOAM trace option is utilized for on-path data collection,
   partial trace data is replicated into the packet copy for each branch
   of the multicast tree.  Consequently, at the leaves of the multicast
   tree, each copy of the multicast packet contains a complete trace.
   This results in data redundancy, as most of the data (all except that
   from the final leaf branch) appears in multiple copies when one copy
   would suffice.  This redundancy introduces unnecessary header overhead,
   wastes network bandwidth, and complicates data processing.  The
   larger the multicast tree or the longer the multicast path, the more
   severe the redundancy problem becomes.
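
   The scale of this redundancy can be illustrated with a minimal
   Python sketch.  It assumes one node-data entry per hop and a small
   hypothetical tree (A to B, B to C and D, C to E, D to F); the
   function names are illustrative and not part of any specification.
   It compares the number of entries carried to the leaves when every
   copy holds a full trace with the number of distinct nodes.

```python
def trace_entries_at_leaves(tree, root):
    """Total node-data entries carried by all leaf copies when each
    copy accumulates a full root-to-leaf trace."""
    total = 0
    stack = [(root, 1)]  # (node, entries in the trace after this node)
    while stack:
        node, entries = stack.pop()
        children = tree.get(node, [])
        if not children:
            total += entries  # a leaf holds the whole path
        for child in children:
            stack.append((child, entries + 1))
    return total


def node_count(tree, root):
    """Number of distinct nodes, i.e. entries needed if each node's
    data were reported once."""
    seen = set()
    stack = [root]
    while stack:
        node = stack.pop()
        seen.add(node)
        stack.extend(tree.get(node, []))
    return len(seen)


# Hypothetical tree used only for illustration.
TREE = {"A": ["B"], "B": ["C", "D"], "C": ["E"], "D": ["F"]}

full_trace = trace_entries_at_leaves(TREE, "A")  # paths A-B-C-E and A-B-D-F
distinct = node_count(TREE, "A")
redundant = full_trace - distinct  # entries for A and B are duplicated
```

   In this small tree only the entries for Nodes A and B are
   duplicated, but the gap widens with every additional branch that
   shares an upstream path.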

   The postcard-based solutions (e.g., IOAM DEX) can eliminate data
   redundancy because each node on the multicast tree sends a postcard
   with only local data.  However, these methods cannot accurately track
   and correlate the tree branches due to the absence of branching
   information.  For instance, in a multicast tree shown in Figure 1,
   Node B has two branches, one to Node C and the other to Node D;
   further, Node C leads to Node E and Node D leads to Node F.  When
   applying postcard-based methods, it is impossible to determine
   whether Node E is the next hop of Node C or Node D from the received
   postcards alone, unless one correlates the exporting nodes with
   knowledge about the tree collected by other means (e.g., mtrace).
   Such correlation is undesirable because it introduces extra work and

   The fundamental reason for this problem is that there is no
   identifier (either implicit or explicit) to correlate the data on
   each branch.

4.  Modifications and Extensions based on Existing Solutions

   We provide two solutions to address the above issues.  One is based
   on IOAM DEX and requires an extension to the instruction header of
   the IOAM DEX Option.  The second solution combines the IOAM trace
   option and postcards for redundancy removal.

4.1.  Per-hop postcard using IOAM DEX

   One way to mitigate the postcard-based telemetry's tree tracking
   weakness is to augment it with a branch identifier field.  This works
   for the IOAM DEX option because the IOAM DEX option has an
   instruction header which can be used to hold the branch identifier.
   To make the branch identifier globally unique, the branch fork node
   ID plus an index is used.  For example, as shown in Figure 1, Node B
   has two branches: one to Node C and the other to Node D.  Node B may
   use [B, 0] as the branch identifier for the branch to C, and [B, 1]
   for the branch to D.  The identifier is carried with the multicast
   packet until the next branch fork node.  Each node MUST export the
   branch identifier in the received IOAM DEX header in the postcards it
   sends.  The branch identifier, along with the other fields such as
   flow ID and sequence number, is sufficient for the data collector to
   reconstruct the topology of the multicast tree.

   Figure 1 shows an example of this solution.  "P" stands for the
   postcard packet.  The square brackets contain the branch identifier.
   The curly braces contain the telemetry data about a specific node.

   P:[A,0]{A}  P:[A,0]{B} P:[B,1]{D}  P:[B,0]{C}   P:[B,0]{E}
        ^            ^          ^        ^           ^
        :            :          :        :           :
        :            :          :        :           :
        :            :          :      +-:-+       +-:-+
        :            :          :      |   |       |   |
        :            :      +---:----->| C |------>| E |-...
      +-:-+        +-:-+    |   :      |   |       |   |
      |   |        |   |----+   :      +---+       +---+
      | A |------->| B |        :
      |   |        |   |--+   +-:-+
      +---+        +---+  |   |   |
                          +-->| D |--...
                              |   |
                              +---+

                         Figure 1: Per-hop Postcard
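
   As a hedged illustration of how a collector might use the postcards
   in Figure 1, the Python sketch below groups postcards by branch
   identifier and links the nodes within each branch.  The postcard
   dictionary layout and the function name are hypothetical.  The
   sketch also assumes that each postcard carries a hop-limit value
   (for example, the Hop_Lim data field of [RFC9197]) so that nodes on
   the same branch can be ordered; this document does not mandate that
   choice.

```python
from collections import defaultdict


def reconstruct_edges(postcards):
    """Return the set of (upstream, downstream) links of the tree."""
    by_branch = defaultdict(list)
    for pc in postcards:
        by_branch[pc["branch"]].append(pc)

    edges = set()
    for (fork_node, _index), group in by_branch.items():
        # Assumption: a higher hop limit means closer to the fork node.
        ordered = sorted(group, key=lambda pc: pc["hop_lim"], reverse=True)
        previous = fork_node
        for pc in ordered:
            if pc["node"] != previous:  # skip the fork node's own postcard
                edges.add((previous, pc["node"]))
            previous = pc["node"]
    return edges


# Postcards from Figure 1; the hop_lim values are illustrative.
POSTCARDS = [
    {"branch": ("A", 0), "node": "A", "hop_lim": 64},
    {"branch": ("A", 0), "node": "B", "hop_lim": 63},
    {"branch": ("B", 1), "node": "D", "hop_lim": 62},
    {"branch": ("B", 0), "node": "C", "hop_lim": 62},
    {"branch": ("B", 0), "node": "E", "hop_lim": 61},
]

EDGES = reconstruct_edges(POSTCARDS)
```

   With the branch identifier present, Node E is attached to Node C
   rather than Node D, which is the ambiguity that plain postcards
   cannot resolve.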

   Each branch fork node needs to generate a unique branch identifier
   (i.e., branch ID) for each branch in its multicast tree instance and
   include it in the IOAM DEX option header.  The branch ID remains
   unchanged until the next branch fork node.  The branch ID contains
   two parts: the branch fork node ID and an interface index.

   Conforming to the node ID specification in IOAM [RFC9197], the node
   ID is a 3-octet unsigned integer.  The interface index is a 2-octet
   unsigned integer.  As shown in Figure 2, the branch ID consumes 8
   octets in total.  The three unused octets MUST be set to 0; otherwise
   the header is considered malformed and the packet MUST be dropped.

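     0                   1                   2                   3
     0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1
    +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
    |                 node_id                       |     unused    |
    +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
    |       Interface Index         |           unused              |
    +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+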

                    Figure 2: Multicast Branch ID format
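
   The layout in Figure 2 can be exercised with the following Python
   sketch, which packs and unpacks the 8-octet branch ID in network
   byte order.  The function names are hypothetical, and raising an
   exception stands in for dropping the packet.

```python
import struct


def pack_branch_id(node_id, if_index):
    """Encode a branch ID: 3-octet node ID, 1 unused octet,
    2-octet interface index, 2 unused octets."""
    if not 0 <= node_id < 1 << 24:
        raise ValueError("node_id must fit in 3 octets")
    if not 0 <= if_index < 1 << 16:
        raise ValueError("interface index must fit in 2 octets")
    first_word = node_id << 8     # low octet is the unused octet (zero)
    second_word = if_index << 16  # low two octets are unused (zero)
    return struct.pack("!II", first_word, second_word)


def unpack_branch_id(data):
    """Decode a branch ID, rejecting non-zero unused octets."""
    if len(data) != 8:
        raise ValueError("branch ID must be 8 octets")
    first_word, second_word = struct.unpack("!II", data)
    if first_word & 0xFF or second_word & 0xFFFF:
        raise ValueError("malformed: unused octets must be 0")
    return first_word >> 8, second_word >> 16


ENCODED = pack_branch_id(0x00000B, 1)
DECODED = unpack_branch_id(ENCODED)
```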

   Figure 3 shows that the branch ID is carried as an optional field
   after the flow ID and sequence number optional fields in the IOAM DEX
   option header.  Two bits "N" and "I" (i.e., the third and fourth bits
   in the Extension-Flags field) are reserved to indicate the presence
   of the optional branch ID field.  "N" stands for the Node ID and "I"
   stands for the interface index.  If "N" and "I" are both set to 1,
   the optional multicast branch ID field is present.  Two Extension-
   Flag bits are used because [RFC9326] specifies that each extension
   flag only indicates the presence of 4 octets of optional data, while
   we need more than 4 octets to encode the branch ID.  The two flag bits
   MUST both be set or both be cleared; otherwise the header is
   considered malformed and the packet MUST be dropped.

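     0                   1                   2                   3
     0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1
    +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
    |        Namespace-ID           |     Flags     |F|S|N|I|E-Flags|
    +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
    |               IOAM-Trace-Type                 |   Reserved    |
    +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
    |                         Flow ID (optional)                    |
    +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
    |                     Sequence Number  (optional)               |
    +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
    |          Multicast Branch ID (as shown in Figure 2)           |
    |                            (optional)                         |
    +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+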

            Figure 3: Carry Branch ID in IOAM DEX option header
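
   The following Python sketch shows one way a receiver might parse the
   header in Figure 3 and enforce the rule that the "N" and "I" bits
   are set or cleared together.  It is an illustration under stated
   assumptions, not a reference implementation: the function and
   dictionary key names are hypothetical, the Extension-Flags bits are
   numbered from the most significant bit as drawn in the figure, and
   raising an exception stands in for dropping the packet.

```python
import struct

# Assumption: bit 0 is the most significant bit of Extension-Flags,
# so F, S, N, and I occupy the four high-order bits.
FLAG_F, FLAG_S, FLAG_N, FLAG_I = 0x80, 0x40, 0x20, 0x10


def parse_dex_option(data):
    """Parse a DEX option header laid out as in Figure 3."""
    if len(data) < 8:
        raise ValueError("truncated DEX option header")
    namespace_id, flags, ext_flags, word = struct.unpack_from("!HBBI", data, 0)
    has_n = bool(ext_flags & FLAG_N)
    has_i = bool(ext_flags & FLAG_I)
    if has_n != has_i:
        raise ValueError("malformed: N and I must be set or cleared together")

    fields = {"namespace_id": namespace_id, "flags": flags,
              "trace_type": word >> 8}  # low octet is Reserved
    offset = 8

    def take(fmt):
        """Read the next optional field and advance the offset."""
        nonlocal offset
        size = struct.calcsize(fmt)
        if offset + size > len(data):
            raise ValueError("truncated optional field")
        values = struct.unpack_from(fmt, data, offset)
        offset += size
        return values

    if ext_flags & FLAG_F:
        (fields["flow_id"],) = take("!I")
    if ext_flags & FLAG_S:
        (fields["sequence"],) = take("!I")
    if has_n:
        first, second = take("!II")
        if first & 0xFF or second & 0xFFFF:
            raise ValueError("malformed: unused branch ID octets must be 0")
        fields["branch_id"] = (first >> 8, second >> 16)
    return fields


# Illustrative header: all four optional-field flags set.
HEADER = (
    struct.pack("!HBBI", 1, 0, FLAG_F | FLAG_S | FLAG_N | FLAG_I,
                0x800000 << 8)
    + struct.pack("!II", 0x1234, 7)        # Flow ID, Sequence Number
    + bytes.fromhex("00000b0000010000")    # branch ID: node 0xB, index 1
)
PARSED = parse_dex_option(HEADER)
```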

   Once a node gets the branch ID information from the upstream, it MUST
   carry this information in its telemetry data export postcards, so the
   original multicast tree can be correctly reconstructed based on the

4.2.  Per-section postcard for IOAM Trace

   The second solution is a combination of the IOAM trace option
   [RFC9197] and the postcard-based telemetry
   [].  To avoid data redundancy, at each
   branch fork node, the trace data accumulated up to this node is
   exported by a postcard before the packet is replicated.  In this
   solution, each branch also needs to maintain some identifier to help
   correlate the postcards for each tree section.  The natural way to
   accomplish this is to simply carry the branch fork node's data
   (including its ID) in the trace of each branch.  This is also
   necessary because each replicated multicast packet can have different
   telemetry data pertaining to this particular copy (e.g., node delay,
   egress timestamp, and egress interface).  As a consequence, the local
   data exported by each branch fork node can only contain the common
   data shared by all the replicated packets (e.g., ingress interface
   and ingress timestamp).

   Figure 4 shows an example in a segment of a multicast tree.  Nodes B
   and D are two branch fork nodes, and each exports a postcard covering
   the trace data for the previous section.  The end node of each path
   also needs to export the data of the last section as a postcard.
                P:{A,B'}            P:{B1,C,D'}
                   ^                     ^
                   :                     :
                   :                     :
                   :                     :    {D1}
                   :                     :    +--...
                   :        +---+      +---+  |
                   :   {B1} |   |{B1,C}|   |--+ {D2}
                   :    +-->| C |----->| D |-----...
       +---+     +---+  |   |   |      |   |--+
       |   | {A} |   |--+   +---+      +---+  |
       | A |---->| B |                        +--...
       |   |     |   |--+   +---+             {D3}
       +---+     +---+  |   |   |{B2,E}
                        +-->| E |--...
                       {B2} |   |
                            +---+

                       Figure 4: Per-section Postcard

   There is no need to modify the IOAM trace option header format as
   specified in [RFC9197].  We just need to configure the branch fork
   nodes, as well as the leaf nodes, to export postcards that contain
   the trace data collected so far and to refresh the IOAM header and
   data in the packet (e.g., clear the node data list to all zeros and
   reset the Remaining Length field to the initial value).
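
   The export-and-refresh behavior can be sketched in Python as
   follows.  This is a simplified model rather than a data plane
   implementation: the trace is a list of labels, "X'" denotes the
   data of fork node X that is common to all copies, and "X1", "X2"
   denote its per-branch data.  The function name is hypothetical, and
   leaf nodes F and G are hypothetical stand-ins for the branches of
   Node D that Figure 4 leaves open.

```python
def forward(tree, node, trace, postcards):
    """Forward a packet from `node`, exporting per-section postcards."""
    children = tree.get(node, [])
    if not children:
        # Leaf node: export the trace of the last section.
        postcards.append(trace + [node])
    elif len(children) == 1:
        # Ordinary transit node: append local data and forward.
        forward(tree, children[0], trace + [node], postcards)
    else:
        # Branch fork node: export the accumulated trace plus the data
        # common to all copies, then start a fresh trace per branch.
        postcards.append(trace + [node + "'"])
        for index, child in enumerate(children, start=1):
            forward(tree, child, [f"{node}{index}"], postcards)


# Segment modeled on Figure 4, with hypothetical leaves F and G.
TREE = {"A": ["B"], "B": ["C", "E"], "C": ["D"], "D": ["F", "G"]}

POSTCARDS = []
forward(TREE, "A", [], POSTCARDS)
```

   Each new section starts with the per-branch data of its fork node,
   which lets the collector attach the section to the postcard that
   the same fork node exported.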

5.  Application Considerations for Multicast Protocols

5.1.  Mtrace version 2

   Mtrace version 2 (Mtrace2) [RFC8487] is a protocol that allows the
   tracing of an IP multicast routing path.  Mtrace2 provides additional
   information such as the packet rates and losses, as well as other
   diagnostic information.  Unlike unicast traceroute, Mtrace2 traces
   the path that the tree building messages follow from receiver to
   source.  An Mtrace2 client sends an Mtrace2 Query to a Last-Hop
   Router (LHR) and the LHR forwards the packet as an Mtrace2 Request
   towards the source or a Rendezvous Point (RP) after appending a
   response block.  Each router along the path performs the same
   operations.  When the First-Hop Router (FHR) receives the Request
   packet, it appends its own response block, turns the Request packet
   into a Reply, and unicasts the Reply back to the Mtrace2 client.

   New on-path telemetry techniques will enhance Mtrace2, and other
   existing OAM solutions, with more granular and real-time network
   status data through direct measurements.  There are various multicast
   protocols that are used to forward the multicast data.  Each will
   require its own unique on-path telemetry solution.  Mtrace2 doesn't
   integrate with IOAM directly, but network management systems may use
   Mtrace2 to learn about routers of interest.

5.2.  Application in PIM

   PIM-SM [RFC7761] is the most widely used multicast routing protocol
   deployed today.  PIM-SSM, however, is the preferred method due to its
   simplicity and removal of network source discovery complexity.  With
   PIM, control plane state is established in the network in order to
   forward multicast UDP data packets.  PIM utilizes network-based
   source discovery.  PIM-SSM, however, utilizes application-based
   source discovery.  IP multicast packets fall within the range of
   224.0.0.0 through 239.255.255.255 for IPv4 and ff00::/8 for IPv6.
   The telemetry solution will need to work within these IP address
   ranges and provide telemetry data for this UDP traffic.
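
   As a small illustration, a telemetry-enabled node or collector could
   test whether a destination address is multicast (224.0.0.0/4 for
   IPv4, ff00::/8 for IPv6) before applying the multicast-specific
   handling.  The sketch below uses the Python standard library; the
   helper name is hypothetical.

```python
import ipaddress


def is_multicast_destination(address):
    """Return True if `address` is an IPv4 or IPv6 multicast address."""
    return ipaddress.ip_address(address).is_multicast


# Example addresses chosen only for illustration.
SSM_V4 = is_multicast_destination("232.1.1.1")
SSM_V6 = is_multicast_destination("ff3e::1")
UNICAST = is_multicast_destination("192.0.2.1")
```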

   A proposed solution for encapsulating the telemetry instruction
   header and metadata in IPv6 packets is described in

5.3.  Application of MVPN X-PMSI Tunnel Encapsulation Attribute

   IOAM, and the recommendations of this document, are equally
   applicable to multicast MPLS forwarded packets.  Multipoint Label
   Distribution Protocol (mLDP), P2MP RSVP-TE, Ingress Replication (IR)
   and PIM MDT SAFI with GRE Transport are all commonly used within a
   Multicast VPN (MVPN) environment utilizing MVPN procedures such as
   Multicast in MPLS/BGP IP VPNs [RFC6513] and BGP Encodings and
   Procedures for Multicast in MPLS/BGP IP VPNs [RFC6514].  mLDP (LDP
   Extensions for P2MP and MP2MP LSPs) [RFC6388] provides extensions to
   LDP to establish point-to-multipoint (P2MP) and multipoint-to-
   multipoint (MP2MP) label switched paths (LSPs) in MPLS networks.  The
   telemetry solution will need to be able to follow these P2MP and
   MP2MP paths.  The telemetry instruction header and data should be
   encapsulated into MPLS packets on P2MP and MP2MP paths.

6.  Security Considerations

   The schemes discussed in this document share the security
   considerations of the IOAM trace option [RFC9197] and the IOAM DEX
   option [RFC9326].  In particular, since multicast inherently
   amplifies packets, the amplification risk for the DEX-based scheme is
   greater than in the unicast case.  Hence, stricter protection
   mechanisms need to be applied.  In addition to selecting packets to
   enable DEX and limiting the exported traffic rate, we can also allow
   only a subset of the nodes in a multicast tree to process the option
   and export the data (e.g., only the branching nodes in the multicast
   tree are configured to process the option).
7.  IANA Considerations

   This document requests two new extension flag registrations in the
   "IOAM DEX Extension-Flags" registry, as described in Section 4.1.

   Bit 2 "Multicast Branching Node ID [RFC XXXX] [RFC Editor: please
   replace with the RFC number of the current document]".

   Bit 3 "Multicast Branching Interface Index [RFC XXXX] [RFC Editor:
   please replace with the RFC number of the current document]".

8.  Acknowledgments

   The authors would like to thank Gunter Van de Velde, Brett Sheffield,
   Eric Vyncke, Frank Brockners, Nils Warnke, Jake Holland, Dino
   Farinacci, Henrik Nydell, Zaheduzzaman Sarker and Toerless Eckert for
   their comments and suggestions.

