<?xml version="1.0"?>
<!DOCTYPE rfc SYSTEM "rfc2629.dtd"[
  <!ENTITY rfc2119 SYSTEM "http://xml2rfc.tools.ietf.org/public/rfc/bibxml/reference.RFC.2119.xml">
  <!ENTITY rfc5681 SYSTEM "http://xml2rfc.tools.ietf.org/public/rfc/bibxml/reference.RFC.5681.xml">
  <!ENTITY rfc3465 SYSTEM "http://xml2rfc.tools.ietf.org/public/rfc/bibxml/reference.RFC.3465.xml">
  <!ENTITY rfc8312 SYSTEM "http://xml2rfc.tools.ietf.org/public/rfc/bibxml/reference.RFC.8312.xml">
  <!ENTITY rfc9002 SYSTEM "http://xml2rfc.tools.ietf.org/public/rfc/bibxml/reference.RFC.9002.xml">
]>
<?rfc toc='yes' ?>
<?rfc symrefs='yes' ?>
<?rfc sortrefs='yes'?>
<?rfc compact='yes'?>
<?rfc comments="yes"?>
<?rfc inline="yes" ?>
<!-- <?rfc-ext parse-xml-in-artwork='yes' ?> -->
<!-- <?xml-stylesheet type='text/xsl' href='rfc2629.xslt' ?> -->

<rfc docName="draft-ietf-tcpm-hystartplusplus-06" category="std" ipr="trust200902">
  <front>
    <title abbrev='HyStart++'>HyStart++: Modified&nbsp;Slow&nbsp;Start&nbsp;for&nbsp;TCP</title>
    <author initials='P.' surname='Balasubramanian' fullname='Praveen Balasubramanian'>
      <organization>Confluent</organization>
      <address>
        <postal>
          <street>899 West Evelyn Ave</street>
          <city>Mountain View</city>
          <region>CA</region>
          <code>94041</code>
          <country>USA</country>
        </postal>        
        <email>pravb.ietf@gmail.com</email>
      </address>
    </author>
    <author initials='Y.' surname='Huang' fullname='Yi Huang'>
      <organization>Microsoft</organization>
      <address>
        <postal>
          <street>One Microsoft Way</street>
          <city>Redmond</city>
          <region>WA</region>
          <code>98052</code>
          <country>USA</country>
        </postal>         
        <phone>+1 425 703 0447</phone>
        <email>huanyi@microsoft.com</email>
      </address>
    </author>
    <author initials='M.' surname='Olson' fullname='Matt Olson'>
      <organization>Microsoft</organization>
      <address>
        <phone>+1 425 538 8598</phone>
        <email>maolson@microsoft.com</email>
      </address>
    </author>
    <date/>
    <area>Transport</area>
    <keyword>TCP</keyword>
    <keyword>congestion control</keyword>
    <abstract>
      <t> This document describes HyStart++, a simple modification to the slow start phase of congestion control algorithms. Traditional slow start
      can overshoot the ideal send rate in many cases, causing high packet loss and poor performance. HyStart++ uses a delay increase heuristic to find an 
      exit point before possible overshoot. It also adds a mitigation to prevent jitter from causing premature slow start exit. 
      </t>
    </abstract>
  </front>

  <middle>
    <section title='Introduction'>
      <t> <xref target="RFC5681"/> describes the slow start congestion control algorithm for TCP. The slow start algorithm is used when the congestion window (cwnd) is less than the slow start threshold (ssthresh). 
      During slow start, in the absence of packet loss signals, TCP increases cwnd exponentially to probe the network capacity. This fast growth can overshoot the ideal sending rate and cause significant packet loss, which cannot always be recovered efficiently.
      </t>
      <t> HyStart++ uses delay increase as a signal to exit slow start before potential packet loss occurs as a result of overshoot. This delay increase heuristic is one of the two algorithms specified in <xref target="HyStart"/>. 
      After the slow start exit, a novel Conservative Slow Start (CSS) phase is used to determine whether the slow start exit was premature and, if so, to resume slow start. This mitigation improves performance in the presence of jitter.
      HyStart++ reduces packet loss and retransmissions, and improves goodput in lab measurements and real-world deployments.
      </t>
      <t> While this document describes HyStart++ for TCP, it can also be used for other transport protocols that use slow start, such as QUIC <xref target="RFC9002"/>. 
      </t>
   </section>

    <section title="Terminology" anchor="term">
      <t>
      The key words "MUST", "MUST NOT", "REQUIRED", "SHALL", "SHALL NOT", "SHOULD", "SHOULD NOT", "RECOMMENDED", "MAY", and "OPTIONAL" in this document are to be interpreted as described in <xref target="RFC2119"/>.
      </t>
    </section>

    <section title='Definitions'>
      <t> We repeat here some definitions from <xref target="RFC5681"/> to aid the reader. 
      </t>
      <t> SENDER MAXIMUM SEGMENT SIZE (SMSS): The SMSS is the size of the
      largest segment that the sender can transmit.  This value can be
      based on the maximum transmission unit of the network, the path
      MTU discovery [RFC1191, RFC4821] algorithm, RMSS (see next item),
      or other factors.  The size does not include the TCP/IP headers
      and options.
      </t>
      <t> RECEIVER MAXIMUM SEGMENT SIZE (RMSS): The RMSS is the size of the
      largest segment the receiver is willing to accept.  This is the
      value specified in the MSS option sent by the receiver during
      connection startup.  Or, if the MSS option is not used, it is 536
      bytes [RFC1122].  The size does not include the TCP/IP headers and
      options.
      </t>
      <t> RECEIVER WINDOW (rwnd): The most recently advertised receiver window.
      </t>
      <t> CONGESTION WINDOW (cwnd): A TCP state variable that limits the amount of data
      a TCP can send.  At any given time, a TCP MUST NOT send
      data with a sequence number higher than the sum of the highest
      acknowledged sequence number and the minimum of cwnd and rwnd.
      </t>
    </section> 

    <section title='HyStart++ Algorithm'>

      <section title='Summary'>
        <t> <xref target="HyStart"/> specifies two algorithms (a “Delay Increase” algorithm and an “Inter-Packet Arrival” algorithm) to be run in parallel to detect that the sending rate has reached capacity. 
        In practice, the Inter-Packet Arrival algorithm does not perform well and is not able to detect congestion early, primarily due to ACK compression. The idea of the Delay Increase 
        algorithm is to look for spikes in RTT (round-trip time), which suggest that the bottleneck buffer is filling up. 
        </t>
        <t> In HyStart++, a TCP sender uses traditional slow start and then uses the “Delay Increase” algorithm to trigger an exit from slow start. But instead of going straight from slow start to congestion avoidance, the sender spends a number of RTTs in a
        Conservative Slow Start (CSS) phase to determine whether the exit from slow start was premature. During CSS, the congestion window is grown exponentially as in regular slow start, but with a smaller exponential base, resulting in less aggressive growth.
        If the RTT decreases during CSS, it is concluded that the RTT spike was not related to congestion caused by the connection sending at a rate greater than the ideal send rate, and the connection resumes slow start. If the RTT inflation
        persists throughout CSS, the connection enters congestion avoidance.
        </t>
      </section>

      <section title='Algorithm Details'>

        <t> For the pseudocode, we assume that Appropriate Byte Counting (as described in <xref target="RFC3465"/>) is in use and L is the cwnd increase limit as discussed in RFC 3465. </t>

        <t> lastRoundMinRTT and currentRoundMinRTT are initialized to infinity when the connection is initialized. </t>

        <t> HyStart++ measures rounds using sequence numbers, as follows:
            <list>
               <t> Define windowEnd as a sequence number initialized to SND.NXT </t>
               <t> When windowEnd is ACKed, the current round ends and windowEnd is set to SND.NXT </t>
            </list>          
        </t>

        <t> At the start of each round during standard slow start (<xref target="RFC5681"/>) and CSS: 
            <list>
               <t> lastRoundMinRTT = currentRoundMinRTT </t>
               <t> currentRoundMinRTT = infinity </t>
               <t> rttSampleCount = 0 </t>
            </list> 
        </t>        
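        <t> The round bookkeeping above can be illustrated with a short, non-normative Python sketch. The RoundTracker class and its method names are hypothetical and are not part of this specification. The sketch assumes that windowEnd counts as ACKed once the cumulative ACK number reaches it, and it ignores sequence number wraparound for brevity. </t>
        <figure><artwork><![CDATA[
```python
import math
from dataclasses import dataclass


@dataclass
class RoundTracker:
    """Hypothetical helper that tracks HyStart++ rounds by sequence number."""
    window_end: int                          # windowEnd, initialized to SND.NXT
    last_round_min_rtt: float = math.inf     # lastRoundMinRTT
    current_round_min_rtt: float = math.inf  # currentRoundMinRTT
    rtt_sample_count: int = 0                # rttSampleCount

    def start_round(self, snd_nxt: int) -> None:
        """Begin a new round: move windowEnd and reset per-round RTT state."""
        self.window_end = snd_nxt
        self.last_round_min_rtt = self.current_round_min_rtt
        self.current_round_min_rtt = math.inf
        self.rtt_sample_count = 0

    def on_ack(self, ack_seq: int, snd_nxt: int) -> bool:
        """Return True if this cumulative ACK ends the current round.

        Assumption: windowEnd counts as ACKed once ack_seq >= window_end;
        32-bit sequence number wraparound is ignored for brevity.
        """
        if ack_seq >= self.window_end:
            self.start_round(snd_nxt)
            return True
        return False


tracker = RoundTracker(window_end=1000)
tracker.current_round_min_rtt = 0.050   # pretend the round saw a 50 ms minimum
print(tracker.on_ack(ack_seq=500, snd_nxt=3000))    # round still in progress
print(tracker.on_ack(ack_seq=1000, snd_nxt=3000))   # windowEnd is ACKed
```
]]></artwork></figure>
        <t> In this sketch, the second call ends the round, so the previous minimum RTT moves into last_round_min_rtt and the per-round state is reset. A real implementation would use modular sequence number comparison. </t>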

        <t> For each arriving ACK in slow start, where N is the number of previously unacknowledged bytes acknowledged in the arriving ACK: 
          <list>
            <t> Update the cwnd 
              <list style='none'>
                <t> cwnd = cwnd + min (N, L * SMSS) </t>
              </list>
            </t>
            <t> Keep track of minimum observed RTT
                <list style='none'>
                   <t> currentRoundMinRTT = min(currentRoundMinRTT, currRTT) </t> 
                   <t> where currRTT is the RTT sampled from the latest incoming ACK </t>
                   <t> rttSampleCount += 1 </t>
                </list>
            </t>
            <t> For rounds where at least N_RTT_SAMPLE RTT samples have been obtained and currentRoundMinRTT and lastRoundMinRTT are valid, check if delay increase triggers slow start exit
                <list style='none'>
                <t> if (rttSampleCount &gt;= N_RTT_SAMPLE AND currentRoundMinRTT != infinity AND lastRoundMinRTT != infinity)
                    <list style='none'>
                    <t> RttThresh = clamp(MIN_RTT_THRESH, lastRoundMinRTT / 8, MAX_RTT_THRESH) </t>  
                    <t> if (currentRoundMinRTT &gt;= (lastRoundMinRTT + RttThresh))
                    <list style='none'>
                        <t> cssBaselineMinRtt = currentRoundMinRTT </t>
                        <t> exit slow start and enter CSS </t>
                    </list>
                    </t>
                    </list>
                </t>
                </list>
            </t>
          </list> 
        </t>
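        <t> As a non-normative illustration, the per-ACK slow start steps above might be sketched in Python as follows. The SlowStartState class, the slow_start_on_ack function, and the use of microseconds as the RTT unit are assumptions made for this example. The constants are the recommended values given later in this document. </t>
        <figure><artwork><![CDATA[
```python
import math
from dataclasses import dataclass

MIN_RTT_THRESH_US = 4_000    # 4 msec, in microseconds
MAX_RTT_THRESH_US = 16_000   # 16 msec, in microseconds
N_RTT_SAMPLE = 8


@dataclass
class SlowStartState:
    """Hypothetical per-connection state used during slow start."""
    cwnd: int                                 # bytes
    last_round_min_rtt: float = math.inf      # microseconds
    current_round_min_rtt: float = math.inf   # microseconds
    rtt_sample_count: int = 0
    css_baseline_min_rtt: float = math.inf
    in_css: bool = False


def clamp(low, value, high):
    """Limit value to the range [low, high]."""
    return max(low, min(value, high))


def slow_start_on_ack(s, acked_bytes, curr_rtt_us, smss, limit_l):
    """Process one ACK in slow start; return True on exit to CSS."""
    # Update the cwnd with Appropriate Byte Counting, limited by L * SMSS.
    s.cwnd += min(acked_bytes, limit_l * smss)
    # Keep track of the minimum observed RTT in this round.
    s.current_round_min_rtt = min(s.current_round_min_rtt, curr_rtt_us)
    s.rtt_sample_count += 1
    # Check the delay increase only when enough valid samples exist.
    if (s.rtt_sample_count >= N_RTT_SAMPLE
            and s.current_round_min_rtt != math.inf
            and s.last_round_min_rtt != math.inf):
        rtt_thresh = clamp(MIN_RTT_THRESH_US,
                           s.last_round_min_rtt / 8,
                           MAX_RTT_THRESH_US)
        if s.current_round_min_rtt >= s.last_round_min_rtt + rtt_thresh:
            s.css_baseline_min_rtt = s.current_round_min_rtt
            s.in_css = True
            return True
    return False
```
]]></artwork></figure>
        <t> For example, with a previous round minimum of 40 msec the threshold is 5 msec, so a round whose samples are all 46 msec triggers the exit to CSS on the eighth ACK, once N_RTT_SAMPLE samples have been collected. </t>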

        <t> CSS lasts at most CSS_ROUNDS rounds. If the transition into CSS happens in the middle of a round, that partial round counts towards the limit. </t>
        <t> For each arriving ACK in CSS, where N is the number of previously unacknowledged bytes acknowledged in the arriving ACK:
          <list>
            <t> Update the cwnd 
              <list style='none'>
                <t> cwnd = cwnd + (min (N, L * SMSS) / CSS_GROWTH_DIVISOR) </t> 
              </list>
            </t>
            <t> Keep track of minimum observed RTT
                <list style='none'>
                   <t> currentRoundMinRTT = min(currentRoundMinRTT, currRTT) </t> 
                   <t> where currRTT is the sampled RTT from the incoming ACK </t>
                   <t> rttSampleCount += 1 </t>
                </list>
            </t>
            <t> For CSS rounds where at least N_RTT_SAMPLE RTT samples have been obtained, check whether the current round's minimum RTT has dropped below the baseline, which indicates that the HyStart++ exit was spurious.
                <list style='none'>
                    <t> if (rttSampleCount &gt;= N_RTT_SAMPLE AND currentRoundMinRTT &lt; cssBaselineMinRtt)
                    <list style='none'>
                        <t> cssBaselineMinRtt = infinity </t>
                        <t> resume slow start including HyStart++ </t>
                    </list>
                    </t>
                    </list>
            </t>
          </list> 
        </t>
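        <t> The per-ACK processing in CSS can be sketched in the same non-normative way. The CssState class and css_on_ack function are hypothetical names. The sketch applies the N_RTT_SAMPLE condition from the step description before comparing against the baseline, and it assumes integer division for the reduced cwnd increment, which this document does not specify. </t>
        <figure><artwork><![CDATA[
```python
import math
from dataclasses import dataclass

N_RTT_SAMPLE = 8
CSS_GROWTH_DIVISOR = 4


@dataclass
class CssState:
    """Hypothetical per-connection state used during CSS."""
    cwnd: int                                 # bytes
    css_baseline_min_rtt: float               # microseconds
    current_round_min_rtt: float = math.inf   # microseconds
    rtt_sample_count: int = 0
    phase: str = "CSS"


def css_on_ack(s, acked_bytes, curr_rtt_us, smss, limit_l):
    """Process one ACK in CSS; return True if slow start is resumed."""
    # Grow cwnd by a fraction of the regular slow start increment.
    # Assumption: the increment is rounded down to whole bytes.
    s.cwnd += min(acked_bytes, limit_l * smss) // CSS_GROWTH_DIVISOR
    # Keep track of the minimum observed RTT in this round.
    s.current_round_min_rtt = min(s.current_round_min_rtt, curr_rtt_us)
    s.rtt_sample_count += 1
    # An RTT below the baseline suggests the earlier exit was spurious.
    if (s.rtt_sample_count >= N_RTT_SAMPLE
            and s.current_round_min_rtt < s.css_baseline_min_rtt):
        s.css_baseline_min_rtt = math.inf
        s.phase = "SLOW_START"
        return True
    return False
```
]]></artwork></figure>
        <t> With a baseline of 46 msec, a CSS round whose samples are all 41 msec resumes slow start on the eighth ACK, whereas a round whose samples stay at or above 46 msec remains in CSS. </t>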

        <t> If CSS_ROUNDS rounds are complete, enter congestion avoidance. 
        <list style='none'>
        <t> ssthresh = cwnd </t>
        </list> 
        </t>         

        <t> If loss or ECN marking is observed at any time during standard slow start or CSS, enter congestion avoidance.
        <list style='none'>
        <t> ssthresh = cwnd </t>
        </list>      
        </t>
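        <t> The two ways of entering congestion avoidance described above can be summarized in a small, non-normative Python sketch. The Conn class, the phase names, and the function names are hypothetical. The sketch only shows the ssthresh assignment and phase change; the congestion control algorithm's own response to loss or ECN marking is out of scope here. </t>
        <figure><artwork><![CDATA[
```python
from dataclasses import dataclass

CSS_ROUNDS = 5


@dataclass
class Conn:
    """Hypothetical connection state for phase transitions."""
    phase: str                       # "SLOW_START", "CSS", or "CONGESTION_AVOIDANCE"
    cwnd: int                        # bytes
    ssthresh: float = float("inf")   # arbitrarily high initial value
    css_rounds_done: int = 0


def enter_congestion_avoidance(conn):
    """Set ssthresh to the current cwnd and leave slow start or CSS."""
    conn.ssthresh = conn.cwnd
    conn.phase = "CONGESTION_AVOIDANCE"


def on_css_round_end(conn):
    """Count a finished CSS round, including a partial first round."""
    if conn.phase != "CSS":
        return
    conn.css_rounds_done += 1
    if conn.css_rounds_done >= CSS_ROUNDS:
        enter_congestion_avoidance(conn)


def on_congestion_signal(conn):
    """Handle loss or ECN marking seen during slow start or CSS."""
    if conn.phase in ("SLOW_START", "CSS"):
        enter_congestion_avoidance(conn)
```
]]></artwork></figure>
        <t> An implementation that resumes slow start from CSS would also reset its CSS round counter, so that a later CSS phase again lasts up to CSS_ROUNDS rounds; that reset is omitted from this sketch. </t>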

      </section>

      <section title='Tuning constants and other considerations'>
      <t> It is RECOMMENDED that a HyStart++ implementation use the following constants:
      <list style='none'>
          <t> MIN_RTT_THRESH = 4 msec </t>
          <t> MAX_RTT_THRESH = 16 msec </t>
          <t> N_RTT_SAMPLE = 8 </t>
          <t> CSS_GROWTH_DIVISOR = 4 </t>
          <t> CSS_ROUNDS = 5 </t>
      </list>       
      </t>
      <t> These constants have been determined from lab measurements and real-world deployments. An implementation MAY tune them for 
      different network characteristics. 
      </t>
      <t> The delay increase sensitivity is determined by MIN_RTT_THRESH and MAX_RTT_THRESH. Smaller values of MIN_RTT_THRESH may cause spurious exits from slow start. Larger values of MAX_RTT_THRESH may result in
      slow start not exiting until loss is encountered for connections on large RTT paths.
      </t>
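      <t> To make the effect of the two bounds concrete, the following non-normative Python sketch evaluates the threshold for a few round-trip times using the recommended constants. The function name rtt_thresh_us and the use of integer microseconds are assumptions made for this example. </t>
      <figure><artwork><![CDATA[
```python
MIN_RTT_THRESH_US = 4_000    # 4 msec, in microseconds
MAX_RTT_THRESH_US = 16_000   # 16 msec, in microseconds


def rtt_thresh_us(last_round_min_rtt_us: int) -> int:
    """RttThresh: one eighth of the last round's minimum RTT, clamped."""
    return max(MIN_RTT_THRESH_US,
               min(last_round_min_rtt_us // 8, MAX_RTT_THRESH_US))


for rtt_ms in (10, 64, 300):
    thresh_ms = rtt_thresh_us(rtt_ms * 1000) / 1000
    print(f"lastRoundMinRTT = {rtt_ms} ms, RttThresh = {thresh_ms} ms")
```
]]></artwork></figure>
      <t> With these constants, paths with a minimum RTT below 32 msec use the 4 msec floor, and paths above 128 msec use the 16 msec ceiling. Only paths in between use one eighth of the minimum RTT. </t>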
      <t> A TCP implementation is required to take at least one RTT sample each round. Using lower values of N_RTT_SAMPLE will lower the accuracy of the measured RTT for the round; higher values will 
      improve accuracy at the cost of more processing. 
      </t>  
      <t> The value of CSS_GROWTH_DIVISOR MUST be at least 2. A value of 1 results in the same aggressive behavior as regular slow start. Values larger than 4 
      will cause the algorithm to be less aggressive and may be less performant.
      </t>
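      <t> The effect of the divisor on window growth can be seen in a simplified, non-normative Python calculation. The function cwnd_after_round is hypothetical. It assumes that every ACK in a round covers exactly one SMSS (no delayed ACKs), that L is at least 1, and that the per-ACK increment is rounded down to whole bytes. </t>
      <figure><artwork><![CDATA[
```python
def cwnd_after_round(cwnd: int, smss: int, divisor: int) -> int:
    """cwnd after one round in which each ACK covers exactly one SMSS.

    Assumptions: no delayed ACKs, L >= 1, and integer division of the
    per-ACK increment. divisor=1 corresponds to regular slow start.
    """
    acks_in_round = cwnd // smss
    return cwnd + acks_in_round * (smss // divisor)


start = 100 * 1460   # 100 segments of 1460 bytes
for divisor in (1, 2, 4):
    factor = cwnd_after_round(start, 1460, divisor) / start
    print(f"divisor {divisor}: cwnd grows by a factor of {factor} per round")
```
]]></artwork></figure>
      <t> Under these assumptions, regular slow start doubles cwnd each round, a divisor of 2 grows it by a factor of 1.5, and the recommended divisor of 4 grows it by a factor of 1.25. </t>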
      <t> Smaller values of CSS_ROUNDS may miss detecting jitter and larger values may limit performance.
      </t>
      <t> An implementation SHOULD use HyStart++ only for the initial slow start (when ssthresh is at its arbitrarily high initial value per <xref target="RFC5681"/>) and fall back to using traditional slow start for the remainder of the connection lifetime. 
      This is acceptable because subsequent slow starts will use the discovered ssthresh value to exit slow start and avoid the overshoot problem. An implementation MAY use HyStart++ to grow the restart window (<xref target="RFC5681"/>) after a long idle period. 
      </t>
      <t>
      In application-limited scenarios, the amount of data in flight could fall below the bandwidth-delay product (BDP) and result in smaller RTT samples, which can trigger an exit back to slow start. It is expected that a connection might oscillate between CSS and slow start in such scenarios. However, this behavior will neither cause a connection to enter congestion avoidance prematurely nor cause overshooting compared to slow start.
      </t>
      </section>

    </section>

    <section title='Deployments and Performance Evaluations'>

        <t> As of the time of writing, HyStart++ as described in draft versions 01 through 04 has been enabled by default for all TCP connections in the Windows operating system for over three years with an actual L = 8. The original HyStart has been enabled by default for all TCP connections in the Linux operating system using the default congestion control module CUBIC (<xref target="RFC8312"/>) for a decade with an infinite L. 
        </t>
        <t> In lab measurements with Windows TCP, HyStart++ shows both goodput improvements and reductions in packet loss and retransmissions compared to traditional slow start. For example, across a variety of tests on a 100 Mbps link with a bottleneck buffer size equal to the bandwidth-delay
        product, HyStart++ reduces bytes retransmitted by 50% and retransmission timeouts by 36%. 
        </t>
        <t> In an A/B test where we compare HyStart++ draft 01 to traditional slow start across a large Windows device population, out of 52 billion TCP connections, 0.7% of connections move from 1 RTO to 0 RTOs and another 0.7% of connections move from 2 RTOs to 1 RTO with HyStart++. This test did not focus on send-heavy connections, and 
        the impact on send-heavy connections is likely much higher. We plan to conduct more such production experiments to gather more data in the future. 
        </t>

    </section>

    <section title='Security Considerations'>
      <t> HyStart++ enhances slow start and inherits the general security considerations discussed in <xref target="RFC5681"/>.
      </t>
    </section>

    <section title='IANA Considerations'>
      <t> This document has no actions for IANA.
      </t>
    </section>

  </middle>

  <back>
    <references title='Normative References'>
      &rfc2119;
      &rfc5681;
      &rfc3465;
    </references>

    <references title='Informative References'>
      <reference anchor='HyStart' target='https://pdfs.semanticscholar.org/25e9/ef3f03315782c7f1cbcd31b587857adae7d1.pdf'>
        <front>
          <title>Hybrid Slow Start for High-Bandwidth and Long-Distance Networks</title>
          <author initials="S." surname="Ha">
          </author>
          <author initials="I." surname="Rhee">
          </author>
          <date year="2008"/>
        </front>
        <seriesInfo name="DOI" value="10.1145/1851182.1851192"/>
        <seriesInfo name="" value="International Workshop on Protocols for Fast Long-Distance Networks"/>
      </reference>
      &rfc8312;
      &rfc9002;
    </references>
  </back>
</rfc>
