<?xml version="1.0" encoding="US-ASCII"?>
<!DOCTYPE rfc SYSTEM "rfc2629.dtd">
<?rfc toc="yes"?>
<?rfc tocompact="yes"?>
<?rfc tocdepth="3"?>
<?rfc tocindent="yes"?>
<?rfc symrefs="yes"?>
<?rfc sortrefs="yes"?>
<?rfc comments="yes"?>
<?rfc inline="yes"?>
<?rfc compact="yes"?>
<?rfc subcompact="no"?>
<rfc category="info" docName="draft-du-computing-resource-representation-01"
     ipr="trust200902">
  <front>
    <title abbrev="Computing Resource Representation in CAN">Computing
    Resource Representation in Computing Aware Networking</title>

    <author fullname="Zongpeng Du" initials="Z." surname="Du">
      <organization>China Mobile</organization>

      <address>
        <postal>
          <street>No.32 XuanWuMen West Street</street>

          <city>Beijing</city>

          <code>100053</code>

          <country>China</country>
        </postal>

        <email>duzongpeng@foxmail.com</email>
      </address>
    </author>

    <author fullname="Yuexia Fu" initials="Y." surname="Fu">
      <organization>China Mobile</organization>

      <address>
        <postal>
          <street>No.32 XuanWuMen West Street</street>

          <city>Beijing</city>

          <code>100053</code>

          <country>China</country>
        </postal>

        <email>fuyuexia@chinamobile.com</email>
      </address>
    </author>

    <date month="" year=""/>

    <area>Routing Area</area>

    <workgroup>Network Working Group</workgroup>

    <keyword>CAN</keyword>

    <keyword>Dyncast</keyword>

    <keyword>service metric</keyword>

    <abstract>
      <t>This document describes how service-specific computing information
      can be encoded and how it can be signaled in the network.</t>
    </abstract>

    <note title="Requirements Language">
      <t>The key words "MUST", "MUST NOT", "REQUIRED", "SHALL", "SHALL NOT",
      "SHOULD", "SHOULD NOT", "RECOMMENDED", "MAY", and "OPTIONAL" in this
      document are to be interpreted as described in <xref
      target="RFC2119">RFC 2119</xref>.</t>
    </note>
  </front>

  <middle>
    <section title="Introduction">
      <t>Traditionally, the network performs traffic engineering based only
      on network status. With the trend towards computing and network
      convergence, several proposals allow the network to be aware of
      service information so that it can make better traffic steering
      decisions. Computing Aware Networking (CAN) steers traffic based on
      both network and computing status, and is considered a mechanism for
      computing and network convergence.</t>

      <t>In the traditional network architecture, the network is only
      responsible for delivering packets between servers and clients, and is
      not aware of the computing information. <xref
      target="I-D.liu-dyncast-ps-usecases"/> and <xref
      target="I-D.liu-dyncast-reqs"/> show that, when service instances are
      deployed at multiple geographical edge sites, CAN would achieve service
      equivalence and load balancing by considering both the service metrics
      and network metrics.</t>

      <t>However, how to notify service metrics in the network, how to
      represent computing resources, and how to signal computing resources
      to the network are still open questions, although this information is
      important for the network domain to learn about the computing
      domain.</t>

      <t>This document further explores how service metrics can be encoded
      and signaled. Some requirements on service metric representation and
      signaling can be found in <xref
      target="I-D.liu-dyncast-gap-reqs"/>.</t>

    </section>

    <section anchor="definition-of-terms" title="Definition of Terms">
      <t>This document makes use of the following terms:</t>

      <t><list hangIndent="2" style="hanging">
          <t hangText="Computing-Aware Networking (CAN):">Aims at computing
          and network resource optimization by steering traffic to
          appropriate computing resources, considering not only routing
          metrics but also computing resource metrics and service
          affiliation.</t>

          <t hangText="Service:">A monolithic functionality that is provided
          by an endpoint according to the specification for said service. A
          composite service can be built by orchestrating monolithic
          services.</t>

          <t hangText="Service instance:">Running environment (e.g., a node)
          that makes the functionality of a service available. One service can
          have several instances running at different network locations.</t>

          <t hangText="Service identifier:">Used to uniquely identify a
          service, at the same time identifying the whole set of service
          instances that each represent the same service behavior, no matter
          where those service instances are running.</t>

          <t hangText="Computing capacity:">The ability of nodes with
          computing resources to produce a specific result through data
          processing, specifically including computing, communication, memory
          and storage capacity.</t>
        </list></t>
    </section>

    <section title="Requirements of Computing Resource Representation and Signaling">
      <section title="Requirements of Computing Resource Representation">
        <t>CAN needs to obtain the computing information of the computing
        resources for a service in order to steer traffic based on both
        network and computing status. As described in <xref
        target="I-D.liu-dyncast-reqs"/>, the representation and encoding of
        the computing metrics conveyed to the CAN system are crucial, because
        the CAN components act upon them. The representation needs to
        express the capabilities of computing resources accurately, and the
        service elements in the participating edges must agree on the
        service-specific metrics and their representation so that the CAN
        components can act upon them.</t>

        <t>Moreover, the computing resource representation needs to take
        computing modeling into account, following the requirements described
        in <xref target="I-D.liu-can-computing-resource-modeling"/>:</t>

        <t><list style="symbols">
            <t>Support the representation of computing resources in multiple
            dimensions, including computing capacity, communication capacity,
            cache capacity and storage capacity.</t>

            <t>Support the representation of the computing capacity by chip
            category, such as CPU, GPU, FPGA and ASIC, and by computing type,
            such as integer calculation, floating-point calculation and hash
            calculation.</t>
          </list></t>
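
        <t>As a non-normative illustration, the two representation
        requirements above could be captured by a record such as the one in
        the following Python sketch. The class names, field names and units
        are assumptions made for this example only; they are not defined by
        this document or by the modeling draft.</t>

        <t><figure>
            <artwork><![CDATA[
```python
from dataclasses import dataclass, field
from enum import Enum
from typing import Dict, Tuple


class Chip(Enum):
    CPU = "cpu"
    GPU = "gpu"
    FPGA = "fpga"
    ASIC = "asic"


class ComputeType(Enum):
    INT = "int"
    FLOAT = "float"
    HASH = "hash"


@dataclass
class ComputingResource:
    """Hypothetical multi-dimensional view of one edge site."""
    # Computing capacity keyed by (chip category, computing type);
    # the unit (operations per second) is an assumption.
    computing: Dict[Tuple[Chip, ComputeType], float] = field(
        default_factory=dict)
    communication_mbps: float = 0.0
    cache_mb: float = 0.0
    storage_gb: float = 0.0

    def capacity(self, chip: Chip, ctype: ComputeType) -> float:
        """Return the capacity of one chip/type pair, 0.0 if absent."""
        return self.computing.get((chip, ctype), 0.0)

    def total(self, ctype: ComputeType) -> float:
        """Sum the capacity of one computing type over all chips."""
        return sum(v for (_, t), v in self.computing.items()
                   if t is ctype)


site = ComputingResource(
    computing={
        (Chip.CPU, ComputeType.INT): 2.0e11,
        (Chip.GPU, ComputeType.FLOAT): 8.0e12,
        (Chip.CPU, ComputeType.FLOAT): 5.0e11,
    },
    communication_mbps=10000.0,
    cache_mb=64.0,
    storage_gb=2048.0,
)
```
]]></artwork>
          </figure></t>

        <t>Keying the computing capacity by both chip category and computing
        type keeps the two axes of the second requirement independent, so a
        consumer can query either a single pair or an aggregate per computing
        type.</t>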
      </section>

      <section title="Requirements of Computing Resource Signaling">
        <t>The representation of computing resources needs to be exposed to
        the network to support the efficient use of computing resources, or
        the joint use of computing and network resources, as described in
        <xref target="I-D.liu-dyncast-reqs"/>. CAN aims at dynamic scenarios
        in which the status of computing resources may vary frequently, e.g.,
        with the number of sessions, CPU/GPU utilization and memory space.
        Distributing and synchronizing the real-time representation of
        computing resources more frequently and more accurately may result in
        more signaling overhead. Thus, the signaling of computing resources
        needs to distribute and synchronize that representation efficiently,
        reducing unnecessary signaling while still meeting the service
        requirements. The requirements cover several aspects, as described
        below.</t>

        <t><list style="symbols">
            <t>Support signaling various messages based on the representation
            of computing resources.</t>

            <t>Support controlling the signaling rate, such as defining at
            what interval or upon which events the information of computing
            resources is signaled.</t>

            <t>Support signaling the updated information of computing
            resources.</t>

            <t>Support mechanisms for loop avoidance when signaling metrics,
            when necessary.</t>
          </list></t>
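
        <t>The rate control requirement can be illustrated with a small
        Python sketch. It combines a minimum interval between updates with a
        minimum change that is worth reporting. The class name, the
        parameter names and the example values are assumptions; this
        document does not mandate any particular policy.</t>

        <t><figure>
            <artwork><![CDATA[
```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class UpdateThrottle:
    """Decide when a changed metric is worth signaling (sketch)."""
    min_interval: float   # seconds between two updates (assumed)
    min_delta: float      # smallest change worth reporting (assumed)
    last_value: Optional[float] = None
    last_sent: Optional[float] = None

    def should_signal(self, value: float, now: float) -> bool:
        # Always send the first sample so receivers have a baseline.
        if self.last_value is None or self.last_sent is None:
            return self._record(value, now)
        # Interval control: stay silent inside the minimum interval.
        if now - self.last_sent < self.min_interval:
            return False
        # Event control: skip changes below the threshold.
        if abs(value - self.last_value) < self.min_delta:
            return False
        return self._record(value, now)

    def _record(self, value: float, now: float) -> bool:
        self.last_value = value
        self.last_sent = now
        return True


throttle = UpdateThrottle(min_interval=5.0, min_delta=0.10)
# (time in seconds, load ratio) samples from one service instance
samples = [(0.0, 0.50), (1.0, 0.90), (6.0, 0.55),
           (7.0, 0.95), (12.0, 0.80)]
sent = [t for t, load in samples if throttle.should_signal(load, t)]
```
]]></artwork>
          </figure></t>

        <t>With these example values, the samples at 0, 7 and 12 seconds are
        signaled. The sample at 1 second falls inside the minimum interval,
        and the sample at 6 seconds differs too little from the last
        signaled value.</t>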
      </section>
    </section>

    <section title="Representation of Computing Information">
      <t>The main job of the network is to forward user packets from the
      source to the destination, while the main job of the computing domain
      is to complete the various tasks of the users.</t>

      <t>The network metrics include bandwidth, latency, jitter, etc. They
      describe the capabilities of the network and are independent of the
      detailed realization of the underlying technologies, such as the mode
      of the optical fiber or the structure of a switch.</t>

      <t>The computing metrics are more complex and are hard to map to
      QoS/QoE. For example, if the task is AI computing, such as image
      processing, the computing resource can be measured in FLOPS
      (Floating-point Operations Per Second) or TFLOPS (Tera FLOPS). However,
      it is more difficult to obtain the processing time, which is influenced
      by the current utilization of the CPU, cache, and so on. Even when a
      real-time OS or protocol is used, a task can sometimes fail because of
      deadlock or other OS mechanisms. This does not mean that the OS is
      faulty; rather, the environment it runs in is complex. Therefore, the
      service metric needs to take more factors into account to judge the
      performance, and needs to be usable in another domain to guarantee
      the E2E service quality.</t>

      <t><xref target="I-D.liu-can-computing-resource-modeling"/> proposes a
      basic architecture for computing resource modeling, which considers
      the computing hardware types, computing task types, communication,
      cache and storage status, and uses a vector to represent the basic
      result of the modeling. The vector could be:</t>

      <t><list style="symbols">
          <t>a group of multiple vectors, representing the evaluated levels
          of computing, communication, cache, and storage capacity, or</t>

          <t>a single vector, representing a single comprehensive level of
          the overall capacity.</t>
        </list></t>

      <t>How the vector is used depends on the specific application domain.
      For the network, weighted or fuzzy processing methods are usually
      applied to preserve the metadata privacy of the computing domain.</t>
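
      <t>One possible weighted method is sketched below in Python: the
      per-dimension levels are collapsed into one rounded level, so that
      only a coarse value is exposed to the network. The weights, the
      0 to 10 scale and the function name are assumptions for illustration
      and are not taken from the modeling draft.</t>

      <t><figure>
          <artwork><![CDATA[
```python
from typing import Dict

# Hypothetical integer weights; a deployment would choose its own.
WEIGHTS: Dict[str, int] = {
    "computing": 5,
    "communication": 2,
    "cache": 1,
    "storage": 2,
}


def overall_level(levels: Dict[str, int], scale: int = 10) -> int:
    """Collapse per-dimension levels (0..scale) into one level.

    Only the rounded weighted mean leaves the computing domain,
    so the raw per-dimension values stay private.
    """
    if not levels:
        raise ValueError("at least one dimension is required")
    for name, level in levels.items():
        if name not in WEIGHTS:
            raise KeyError(f"unknown dimension: {name}")
        if not 0 <= level <= scale:
            raise ValueError(f"level out of range: {name}={level}")
    total_weight = sum(WEIGHTS[name] for name in levels)
    weighted = sum(WEIGHTS[n] * lv for n, lv in levels.items())
    return round(weighted / total_weight)


group = {"computing": 8, "communication": 6,
         "cache": 4, "storage": 5}
single = overall_level(group)
```
]]></artwork>
        </figure></t>

      <t>Here the group of four levels yields a weighted mean of 6.6, which
      is reported as the single level 7.</t>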

      <section title="Representation of Computing Metric">

        <t>Based on <xref target="I-D.liu-can-computing-resource-modeling"/>,
        the computing resource information can be represented for the
        network in two general ways. One is to use a single vector to
        represent an overall level; the other is to use a group of vectors to
        represent more detailed information.</t>

        <section title="Representing in a Single Value">
          <t>On the one hand, general computing load information can be
          offered to the ingress nodes. As an example, three values may be
          sufficient:</t>

          <t><list style="symbols">
              <t>a red value, which stands for busy status,</t>

              <t>a yellow value, which stands for relatively busy status,</t>

              <t>a green value, which stands for free status.</t>
            </list></t>

          <t>The ingress node then only needs to consider the yellow and
          green edge sites when steering traffic, and prefers the green
          ones.</t>
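
          <t>The filtering rule above can be sketched in Python as follows.
          The numeric codes of the three values and the function name are
          assumptions chosen for this example.</t>

          <t><figure>
              <artwork><![CDATA[
```python
from enum import IntEnum
from typing import Dict, List


class Status(IntEnum):
    # Numeric codes are assumed; lower means more preferred.
    GREEN = 0   # free
    YELLOW = 1  # relatively busy
    RED = 2     # busy


def candidate_sites(status_by_site: Dict[str, Status]) -> List[str]:
    """Drop red sites and list green sites before yellow ones."""
    usable = [site for site, status in status_by_site.items()
              if status is not Status.RED]
    # sorted() is stable, so equal-status sites keep input order.
    return sorted(usable, key=lambda site: status_by_site[site])


sites = {"edge-a": Status.YELLOW, "edge-b": Status.RED,
         "edge-c": Status.GREEN, "edge-d": Status.YELLOW}
order = candidate_sites(sites)
```
]]></artwork>
            </figure></t>

          <t>In this example the ingress node would consider edge-c first,
          then edge-a and edge-d, and would not consider edge-b at all.</t>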
        </section>

        <section title="Representing in Multiple Values">
          <t>On the other hand, more detailed computing-related information
          can also be offered. It is still expected to consist of weighted
          values as described in <xref
          target="I-D.liu-can-computing-resource-modeling"/>, such as
          computing capacity information (including chip category and
          computing task category), communication information, cache
          information and storage information.</t>

          <t>Moreover, some additional information could also be represented
          if needed:</t>

          <t><list style="symbols">
              <t>the service information deployed on the edge sites, for
              example, the Service ID,</t>

              <t>the maximum number of sessions that the edge sites can
              provide,</t>

              <t>the current number of sessions at the edge sites,</t>

              <t>the available computing infrastructure of the server,
              etc.</t>
            </list></t>

          <t>This information may be optional and is encoded as TLVs. A
          specific service may have a specific preferred set of TLVs. For
          example, if multiple instances have the same free status, the
          additional TLVs could be used to distinguish their computing
          resources. The detailed decision algorithm is out of scope of this
          document.</t>

          <t>The TLVs should be advertised in a service-specific and
          on-demand manner. Different services may care about, or may have
          subscribed to, different sets of TLVs. In addition, if an Ingress
          node receives a TLV that it does not support, it can simply ignore
          that TLV.</t>
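
          <t>The following Python sketch shows one way an Ingress node could
          parse such optional TLVs and skip the ones it does not support.
          The TLV layout (1-octet type, 1-octet length), the type codes and
          all function names are hypothetical; this document does not
          assign any of them.</t>

          <t><figure>
              <artwork><![CDATA[
```python
import struct
from typing import Dict, Iterator, Tuple

# Hypothetical TLV type codes; none are assigned by this document.
TLV_SERVICE_ID = 1
TLV_MAX_SESSIONS = 2
TLV_CUR_SESSIONS = 3
SUPPORTED = {TLV_SERVICE_ID, TLV_MAX_SESSIONS, TLV_CUR_SESSIONS}


def encode_tlv(tlv_type: int, value: bytes) -> bytes:
    """Encode one TLV: 1-octet type, 1-octet length, then value."""
    if len(value) > 255:
        raise ValueError("value too long for a 1-octet length")
    return struct.pack("!BB", tlv_type, len(value)) + value


def iter_tlvs(data: bytes) -> Iterator[Tuple[int, bytes]]:
    """Yield (type, value) pairs; raise on truncated input."""
    offset = 0
    while offset + 2 <= len(data):
        tlv_type, length = struct.unpack_from("!BB", data, offset)
        offset += 2
        if offset + length > len(data):
            raise ValueError("truncated TLV value")
        yield tlv_type, data[offset:offset + length]
        offset += length
    if offset != len(data):
        raise ValueError("trailing octet after last TLV")


def parse_supported(data: bytes) -> Dict[int, int]:
    """Decode supported TLVs as unsigned integers, skip the rest."""
    result = {}
    for tlv_type, value in iter_tlvs(data):
        if tlv_type in SUPPORTED:
            result[tlv_type] = int.from_bytes(value, "big")
    return result


message = (encode_tlv(TLV_SERVICE_ID, (1).to_bytes(4, "big"))
           + encode_tlv(TLV_MAX_SESSIONS, (500).to_bytes(2, "big"))
           + encode_tlv(99, b"\x01\x02\x03")   # unsupported type
           + encode_tlv(TLV_CUR_SESSIONS, (120).to_bytes(2, "big")))
parsed = parse_supported(message)
```
]]></artwork>
            </figure></t>

          <t>Because every TLV carries its own length, the parser can step
          over the unsupported type 99 and still decode the TLV that follows
          it.</t>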
        </section>
      </section>

      <section title="Example Process of Computing Load Information">
        <t>For a specific service, both general computing load information
        and more specific computing information can be offered. A general
        process is described below.</t>

        <t>Step 1: The service instances are deployed in multiple edge sites.
        The ingress nodes of the network, working as load balancing points,
        need to obtain the computing information. The service should have a
        specific Service ID in the network, for example ServiceID1, so that
        the ingress node can recognize service requests and treat them
        differently according to the Service ID.</t>

        <t>Step 2: After obtaining the computing information of the service
        identified by ServiceID1 from multiple edge sites, the ingress nodes
        should record it. Meanwhile, an ingress node should also be able to
        obtain and record the network status, for example the latency to the
        egress of an edge site.</t>

        <t>Step 3: An ingress node receives a packet destined for ServiceID1.
        According to the service metrics and network metrics it has recorded,
        the ingress node decides which edge site to use and forwards the
        packet to the related egress. The selection method may depend on the
        service. For example, it may select the site with the lowest latency
        among those that can offer the service, or the site with the best
        computing resource among those whose latency fulfills the service
        requirements, or it may use a hybrid method.</t>

        <t>The purpose of the procedure is to find an edge site that is
        relatively near to the client and also has enough computing resources
        for the service. However, the edge sites that provide the service may
        vary and may have different computing capabilities. Therefore, a
        load balancing method that considers the computing resource is useful
        in this scenario.</t>
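
        <t>The three selection methods mentioned in Step 3 can be sketched
        in Python as follows. The record layout, the units and the
        particular hybrid rule are assumptions for illustration; the
        detailed decision algorithm remains out of scope.</t>

        <t><figure>
            <artwork><![CDATA[
```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass
class EdgeSite:
    name: str
    latency_ms: float     # ingress-to-egress latency (assumed unit)
    compute_level: int    # higher means more free capacity (assumed)
    offers_service: bool = True


def lowest_latency(sites: List[EdgeSite]) -> Optional[EdgeSite]:
    """Lowest latency among the sites that offer the service."""
    usable = [s for s in sites if s.offers_service]
    return min(usable, key=lambda s: s.latency_ms, default=None)


def best_compute(sites: List[EdgeSite],
                 max_latency_ms: float) -> Optional[EdgeSite]:
    """Best computing level among sites within the latency bound."""
    usable = [s for s in sites
              if s.offers_service and s.latency_ms <= max_latency_ms]
    return max(usable, key=lambda s: s.compute_level, default=None)


def hybrid(sites: List[EdgeSite], max_latency_ms: float,
           min_level: int) -> Optional[EdgeSite]:
    """One possible mix: lowest latency among sites that meet
    both the latency bound and a minimum computing level."""
    usable = [s for s in sites
              if s.offers_service
              and s.latency_ms <= max_latency_ms
              and s.compute_level >= min_level]
    return min(usable, key=lambda s: s.latency_ms, default=None)


sites = [EdgeSite("edge-a", 5.0, 2), EdgeSite("edge-b", 12.0, 9),
         EdgeSite("edge-c", 8.0, 6),
         EdgeSite("edge-d", 3.0, 8, offers_service=False)]
```
]]></artwork>
          </figure></t>

        <t>With these records, the lowest-latency method selects edge-a,
        whereas the computing-first method with a 10 ms bound and the hybrid
        method with a 15 ms bound and a minimum level of 5 both select
        edge-c. Each function returns no site when no candidate meets its
        constraints.</t>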
      </section>
    </section>

    <section title="Signaling of Computing Information">
      <t>The target of CAN is to steer traffic considering both network and
      computing resource status. To meet the use case demands in <xref
      target="I-D.liu-dyncast-ps-usecases"/>, an "on-path" decision is
      expected. For instance, the Ingress of the network works as the
      decision point that steers the traffic of the users. In this
      situation, the Ingress needs to know the computing information of the
      service instance, which could be behind the Egress. Some of the
      computing information is relatively static, and some is dynamic. The
      two kinds may be delivered by different means and at different
      frequencies.</t>

      <t>Besides computing resource modeling and computing resource
      representation, CAN should also focus on how to deliver the computing
      information from the Egress to the Ingress.</t>

      <section title="General Process of Informing">
        <t>A general process for signaling the computing information is
        described below.</t>

        <t>Step 1: The gateway of the edge site collects the computing status
        information of a specific service instance or of a categorized
        service. In some cases, there is a controller in the edge site, which
        can help to collect the information and notify the gateway.</t>

        <t>Step 2: The Egress of CAN receives the service status information
        from the gateway of the edge site and notifies the CAN ingress
        nodes.</t>

        <t>In the first step, the controller and the gateway could
        communicate by means of PCE or another controller-oriented protocol.
        In the second step, a controller-based method can also be used;
        however, communication between the controller of the edge site and
        the controller of the network may be complicated and inefficient.</t>

        <t>The following sections propose some potential ways to notify
        computing information, including a BGP extension and other potential
        methods. When advertising that an edge site has the service, i.e., a
        binding address for the service and the corresponding route to it,
        additional computing information can be added in the Extended
        Community of that route.</t>
      </section>

      <section title="BGP Method in Informing">
        <t>As the computing information is intended for the edge network
        nodes, BGP can be considered, specifically MP-BGP <xref
        target="RFC4760">RFC 4760</xref>. BGP is a gateway protocol that
        enables the exchange of routing information between Autonomous
        Systems (ASes). MP-BGP allows VPN edge nodes to exchange client
        information over different underlay networks (e.g., MPLS). As
        mentioned above, the computing information can be added in the
        Extended Community.</t>

        <t>When the route for a specific service (named ServiceID1 here),
        whose address is an anycast address, is advertised in a BGP UPDATE
        message, the route can include many Path Attributes. The Extended
        Community is one of these attributes and is defined in <xref
        target="RFC4360">RFC 4360</xref>.</t>

        <t><figure>
            <artwork><![CDATA[      0                   1                   2                   3
      0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1
     +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
     |             Type              |           Length              |
     +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
     |    Flag       |    Status     |          Sub-tlvs             |
     +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
           Figure 1. Format of the Computing Information in BGP
]]></artwork>
          </figure></t>

        <t>Type: TBD, for example, 0x0314.</t>

        <t>Length: This refers to the total length in octets of the element
        excluding the Type and Length fields.</t>

        <t>Flag: all zero.</t>

        <t>Status: the first two bits are used.</t>

        <t>Sub-tlvs: the sub-tlvs related to computing information.</t>
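
        <t>As a non-normative aid, the layout of Figure 1 can be packed and
        parsed with the Python sketch below. It assumes that the two status
        bits in use are the two most significant bits of the Status octet
        and that the remaining bits are sent as zero; the text above does
        not fix the bit positions. The Type value is the example from the
        text and is still TBD, and the function names are hypothetical.</t>

        <t><figure>
            <artwork><![CDATA[
```python
import struct
from typing import Tuple

TYPE_COMPUTING_INFO = 0x0314   # example value from the text; TBD


def pack_computing_info(status: int, sub_tlvs: bytes = b"") -> bytes:
    """Build the element shown in Figure 1.

    Assumption: the 2-bit status occupies the two most significant
    bits of the Status octet; the other six bits are zero.
    """
    if not 0 <= status <= 3:
        raise ValueError("status must fit in two bits")
    flag = 0                          # Flag: all zero
    length = 2 + len(sub_tlvs)        # excludes Type and Length
    header = struct.pack("!HHBB", TYPE_COMPUTING_INFO, length,
                         flag, status << 6)
    return header + sub_tlvs


def unpack_computing_info(data: bytes) -> Tuple[int, bytes]:
    """Return (status, sub_tlvs) after checking Type and Length."""
    if len(data) < 6:
        raise ValueError("element too short")
    el_type, length, _flag, status_octet = struct.unpack_from(
        "!HHBB", data)
    if el_type != TYPE_COMPUTING_INFO:
        raise ValueError("unexpected type")
    if length < 2 or 4 + length > len(data):
        raise ValueError("bad length")
    return status_octet >> 6, data[6:4 + length]


wire = pack_computing_info(status=1, sub_tlvs=b"\x01\x02\xab\xcd")
status, sub_tlvs = unpack_computing_info(wire)
```
]]></artwork>
          </figure></t>

        <t>For a status of 1 and four octets of Sub-tlvs, the Length field
        is 6: one octet each for Flag and Status plus the four Sub-tlv
        octets.</t>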

        <t>One example of a Sub-tlv is the FLOPS value, which is widely used
        in AI analysis scenarios. For services that need a large amount of
        computing resources, a general computing grade of a server, such as
        large, middle, or small, can also be provided.</t>

        <t>Besides the computing information, BGP can also be extended to
        exchange other information for CAN. While advertising the computing
        load, the network can also monitor the whole load balancing (LB)
        system. If a service becomes heavily loaded, i.e., all the service
        instances for the service are busy, the network should be able to
        inform potential inactive service points to join the LB. Similarly,
        if a service becomes lightly loaded, i.e., all the service instances
        for the service are relaxed, the network should also be able to
        inform an active service point to become inactive, so that its
        resources are released to other services.</t>
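
        <t>The join and release behavior can be sketched in Python as shown
        below. The load labels, the function name and the rule of always
        keeping at least one active instance are assumptions added for the
        example; how a service point is actually informed is not covered
        here.</t>

        <t><figure>
            <artwork><![CDATA[
```python
from typing import Dict, List, Optional, Tuple

BUSY, RELAXED, NORMAL = "busy", "relaxed", "normal"  # assumed labels


def scaling_action(active: Dict[str, str],
                   inactive: List[str]) -> Optional[Tuple[str, str]]:
    """Suggest one (action, service point) pair, or None.

    active maps each active service point to its load label;
    inactive lists service points that could join the LB.
    """
    if not active:
        return None
    loads = set(active.values())
    # Heavy load: every instance is busy, so bring one more in.
    if loads == {BUSY} and inactive:
        return ("activate", inactive[0])
    # Light load: every instance is relaxed, so release one,
    # but keep at least one instance serving (assumed rule).
    if loads == {RELAXED} and len(active) > 1:
        return ("deactivate", sorted(active)[-1])
    return None


heavy = scaling_action({"sp-1": BUSY, "sp-2": BUSY}, ["sp-3"])
light = scaling_action({"sp-1": RELAXED, "sp-2": RELAXED}, ["sp-3"])
mixed = scaling_action({"sp-1": BUSY, "sp-2": NORMAL}, ["sp-3"])
```
]]></artwork>
          </figure></t>

        <t>Only the two uniform cases trigger an action. A mixed load leaves
        the set of active service points unchanged, since ordinary load
        balancing can still shift traffic between them.</t>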

        <t>The update frequency needs further consideration. A BGP UPDATE
        message is sent when the network topology, path, or other status
        changes, not periodically. A mechanism is therefore needed to match
        UPDATE messages to the computing status changes of the edge sites.
        It should consider how long a given status remains valid, and it
        should prevent the network from being overloaded by status update
        notifications, for instance, by means of a configured threshold.</t>
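
        <t>One way to realize such a threshold is to announce only changes
        of a coarse status and to add hysteresis, so that a load hovering
        around a threshold does not trigger repeated announcements. The
        Python sketch below illustrates this under assumed threshold and
        margin values; it is not a mechanism specified by this document.</t>

        <t><figure>
            <artwork><![CDATA[
```python
ORDER = ["green", "yellow", "red"]


class StatusTracker:
    """Map a load ratio to a status, with hysteresis (sketch)."""

    def __init__(self, yellow_at=0.60, red_at=0.85, margin=0.05):
        # Thresholds and margin are assumed example values.
        self.yellow_at = yellow_at
        self.red_at = red_at
        self.margin = margin
        self.status = "green"

    def _classify(self, load, slack):
        if load >= self.red_at - slack:
            return "red"
        if load >= self.yellow_at - slack:
            return "yellow"
        return "green"

    def observe(self, load):
        """Return True if the status changed and should be announced."""
        current = ORDER.index(self.status)
        rising = self._classify(load, 0.0)
        falling = self._classify(load, self.margin)
        if ORDER.index(rising) > current:
            new = rising        # crossed a threshold upwards
        elif ORDER.index(falling) < current:
            new = falling       # dropped below threshold - margin
        else:
            new = self.status   # inside the hysteresis band
        changed = new != self.status
        self.status = new
        return changed


tracker = StatusTracker()
loads = [0.50, 0.62, 0.58, 0.61, 0.54, 0.90, 0.82, 0.79]
announced = [x for x in loads if tracker.observe(x)]
```
]]></artwork>
          </figure></t>

        <t>In this run the samples 0.58 and 0.61 stay inside the band around
        the yellow threshold and cause no announcement, so only four of the
        eight samples would lead to an update.</t>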
      </section>

      <section title="Other Methods in Informing">
        <t>The computing information can be treated similarly to the OAM
        (Operations, Administration and Maintenance) information in the
        network. It should therefore be possible to carry it in OAM messages
        with proper extensions to the current OAM mechanisms. In this way,
        the load balancing point can collect both network information and
        computing information via OAM mechanisms.</t>

        <t>Some network programming mechanisms such as SRv6 can also be
        considered here. The computing information can be carried in the
        IPv6 extension headers. For example, some data packets from the
        Egress to the Ingress can carry the computing information. The
        computing information can be inserted at the Egress, either on demand
        or periodically.</t>
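
        <t>As one hypothetical example of an IPv6 extension header carrying
        the information, the Python sketch below builds a Destination
        Options header with a single option, following the general option
        and padding layout of RFC 8200. The choice of the Destination
        Options header, the option type 0x1E (one of the values reserved
        for experiments in RFC 4727) and the payload format are assumptions;
        this document does not select a header or an option.</t>

        <t><figure>
            <artwork><![CDATA[
```python
import struct

# 0x1E is one of the option types reserved for experiments
# (RFC 4727); it is used here only as a placeholder.
OPT_COMPUTING_INFO = 0x1E
NO_NEXT_HEADER = 59


def build_dest_opts(next_header: int, info: bytes) -> bytes:
    """Build a Destination Options header with one option.

    Layout: Next Header, Hdr Ext Len, then options, padded so
    that the whole header is a multiple of 8 octets.
    """
    if len(info) > 255:
        raise ValueError("option data too long")
    option = struct.pack("!BB", OPT_COMPUTING_INFO, len(info)) + info
    pad = -(2 + len(option)) % 8
    if pad == 0:
        padding = b""
    elif pad == 1:
        padding = b"\x00"                                # Pad1
    else:
        padding = bytes([1, pad - 2]) + bytes(pad - 2)   # PadN
    total = 2 + len(option) + pad
    # Hdr Ext Len counts 8-octet units beyond the first 8 octets.
    return bytes([next_header, total // 8 - 1]) + option + padding


header = build_dest_opts(NO_NEXT_HEADER, b"\x00\x40")
```
]]></artwork>
          </figure></t>

        <t>With a two-octet payload the header is exactly eight octets long:
        two octets of fixed fields, a four-octet option and a two-octet PadN
        option.</t>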

        <t>Besides BGP, OAM and network programming mechanisms, a
        CAN-specific method of computing information notification could also
        be specified if needed.</t>
      </section>
    </section>

    <section anchor="Conclusion" title="Conclusion">
      <t>This document analyzes the requirements for computing resource
      representation and signaling, which are key functions of CAN, and
      proposes some potential methods to meet them.</t>
    </section>

    <section anchor="IANA" title="IANA Considerations">
      <t>TBD.</t>
    </section>

    <section anchor="Security" title="Security Considerations">
      <t>TBD.</t>
    </section>

    <section anchor="Acknowledgements" title="Acknowledgements">
      <t>TBD.</t>
    </section>

    <section title="Contributors">
      <t>The following people have substantially contributed to this
      document:</t>

      <t><figure>
          <artwork><![CDATA[Linda Dunbar
]]></artwork>
        </figure></t>
    </section>
  </middle>

  <back>
    <references title="Normative References">
      <?rfc include='reference.RFC.2119'?>

      <?rfc include='reference.RFC.4760'?>

      <?rfc include='reference.RFC.4360'?>
    </references>

    <references title="Informative References">
      <?rfc include='reference.I-D.liu-dyncast-reqs'?>

      <?rfc include='reference.I-D.liu-dyncast-gap-reqs'?>

      <?rfc include='reference.I-D.liu-can-computing-resource-modeling'?>

      <?rfc include='reference.I-D.liu-dyncast-ps-usecases'?>
    </references>
  </back>
</rfc>
