<?xml version="1.0" encoding="utf-8"?>
<!-- name="GENERATOR" content="github.com/mmarkdown/mmark Mmark Markdown Processor - mmark.miek.nl" -->
<rfc version="3" ipr="trust200902" docName="draft-ietf-sml-structured-email-02" submissionType="IETF" category="std" xml:lang="en" xmlns:xi="http://www.w3.org/2001/XInclude" indexInclude="true" consensus="true">

<front>
<title>Structured Email</title><seriesInfo value="draft-ietf-sml-structured-email-02" stream="IETF" status="standard" name="Internet-Draft"></seriesInfo>
<author initials="H.-J." surname="Happel" fullname="Hans-Joerg Happel"><organization>audriga GmbH</organization><address><postal><street></street>
</postal><email>happel@audriga.com</email>
<uri>https://www.audriga.com</uri>
</address></author><date/>
<area>ART</area>
<workgroup>SML</workgroup>

<abstract>
<t>This document specifies how a machine-readable version of the content of email messages can be added to those messages.</t>
</abstract>

</front>

<middle>

<section anchor="introduction"><name>Introduction</name>
<t>Information on websites and in email messages mostly addresses human readers. However, various attempts have been made to make such information - fully or in part - machine-readable, so that tools can assist users in dealing with that information more efficiently.</t>
<t>One widespread approach is the usage of <xref target="SchemaOrg"></xref> vocabulary, which can be embedded in the HTML markup of websites. This markup is then crawled by web search engines and used to improve the quality of search result snippets (e.g., by displaying ratings, opening hours, or contact information).</t>
<t>Similarly, a number of web shops, hotels, or airlines include Schema.org vocabulary in order receipt email messages sent to customers. This information is extracted and used by some ISPs and open source tools (<xref target="SchemaOrgEmail"></xref>). However, these implementations differ in many details.</t>
<t>The goal of this document is to provide a clear and comprehensive specification of this practice and to lay the groundwork for potential future extensions.</t>
</section>

<section anchor="conventions-used-in-this-document"><name>Conventions Used in This Document</name>
<t>The terms &quot;message&quot; and &quot;email message&quot; refer to &quot;electronic mail messages&quot; or &quot;emails&quot; as specified in <xref target="RFC5322"></xref>. The term &quot;Message User Agent&quot; (MUA) denotes an email client application as per <xref target="RFC5598"></xref>.</t>
<t>The terms &quot;machine-readable data&quot; and &quot;structured data&quot; are used in contrast to &quot;human-readable&quot; messages and denote information expressed &quot;in a structured format (...) which can be consumed by another program using consistent processing logic&quot; <xref target="MachineReadable"></xref>.</t>
<t>The key words &quot;MUST&quot;, &quot;MUST NOT&quot;, &quot;REQUIRED&quot;, &quot;SHALL&quot;, &quot;SHALL NOT&quot;, &quot;SHOULD&quot;, &quot;SHOULD NOT&quot;, &quot;RECOMMENDED&quot;, &quot;NOT RECOMMENDED&quot;, &quot;MAY&quot;, and &quot;OPTIONAL&quot; in this document are to be interpreted as described in BCP 14 <xref target="RFC2119"></xref> <xref target="RFC8174"></xref> when, and only when, they appear in all capitals, as shown here.</t>
</section>

<section anchor="representing-structured-data"><name>Representing structured data</name>
<t>In order to exchange structured data, one needs to choose a formal language and a serialization format. Based on this, vocabularies can be helpful to establish a shared understanding of structured data among heterogeneous senders and receivers.</t>

<section anchor="knowledge-representation-language"><name>Knowledge representation language</name>
<t>The Resource Description Framework (<xref target="RDF"></xref>) is a formal language for knowledge representation standardized by the W3C. It is already used for annotating websites and emails, as it underlies <xref target="SchemaOrg"></xref>. Among the various serializations for RDF, JSON-LD (<xref target="JSONLD"></xref>) has become the most commonly used serialization on websites (<xref target="WDCStats"></xref>).</t>
<t>Hence, structured data in email messages <bcp14>SHOULD</bcp14> be expressed in the JSON-LD serialization of RDF.</t>
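<t>As a non-normative illustration, structured data describing an event reservation might be expressed in JSON-LD as sketched below. The choice of the <xref target="SchemaOrg"></xref> type &quot;EventReservation&quot; and all property values are assumptions made for this example only.</t>
<sourcecode type="json">
{
  "@context": "https://schema.org",
  "@type": "EventReservation",
  "reservationId": "E123456789",
  "reservationStatus": "https://schema.org/ReservationConfirmed",
  "underName": {
    "@type": "Person",
    "name": "Jane Doe"
  },
  "reservationFor": {
    "@type": "Event",
    "name": "Example Concert",
    "startDate": "2025-06-14T19:30:00+02:00"
  }
}
</sourcecode>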

<artwork>For discussion, see also:
https://github.com/hhappel/draft-happel-structured-email/issues/1
</artwork>
</section>

<section anchor="vocabularies"><name>Vocabularies</name>
<t>Using RDF/JSON-LD, users are free to express any kind of information in structured data. For reuse and reference, however, it is common to agree upon certain core concepts/entities and properties for a certain domain. Those are typically collected and shared in so-called vocabularies.</t>
<t><xref target="SchemaOrg"></xref> is a widespread vocabulary, which was designed for annotating content on websites. A small subset of its concepts is already used by email senders and processed by email providers.</t>
<t>Users that want to add structured data to email messages <bcp14>SHOULD</bcp14> use concepts from <xref target="SchemaOrg"></xref> if they fit their use case. They <bcp14>MAY</bcp14>, however, use any valid JSON-LD.</t>
<t>There might also be certain vocabularies for email-specific use cases (such as [I-D.happel-sml-structured-vacation-notices-00]), that will be specifically endorsed by the IETF or by respective RFCs.</t>
<t>MUAs may choose freely if and how to use structured data extracted from messages. If they do not explicitly support a certain vocabulary, MUAs may also rely on extensions or pass the data to outside applications, similar to the case of MIME body parts.</t>

<artwork>For discussion, see also:
https://github.com/hhappel/draft-happel-structured-email/issues/2
</artwork>
</section>
</section>

<section anchor="structured-data-in-email-messages"><name>Structured data in email messages</name>
<t>This section discusses the placement of structured data within email messages and identifiers for referencing between structured data and other parts of a message.</t>

<section anchor="placement"><name>Placement</name>
<t>This document targets structured data describing the content of an email message itself. Since users may add other arbitrary structured data (e.g., as MIME body parts of type &quot;application/ld+json&quot;) to an email message, we need to define which kinds of structured data are supposed to be representative of the email message content.</t>
<t>For this, we distinguish the cases of full, partial, and non-representation.</t>

<artwork>For discussion, see also:
https://github.com/hhappel/draft-happel-structured-email/issues/3
</artwork>

<section anchor="full-representation"><name>Full representation</name>
<t>If structured data is intended by the sender to <em>fully</em> describe the human-readable content of an email message, it <bcp14>MUST</bcp14> be added as a body part with the content type <tt>application/ld+json</tt> within a <tt>multipart/alternative</tt> entity.</t>
<t>The email message <bcp14>SHOULD</bcp14> in this case also contain a <tt>text/plain</tt> and a <tt>text/html</tt> version of the content.</t>
<t>MUAs supporting this specification <bcp14>SHOULD</bcp14> prefer the <tt>application/ld+json</tt> representation when receiving such email messages if they are able to process the used vocabulary or are able to process the structured data otherwise.</t>
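<t>A full representation might look like the following non-normative sketch. All addresses, identifiers, the boundary string, and the reservation content are assumptions made for illustration. Placing the <tt>application/ld+json</tt> part last follows the general MIME convention that the alternatives of a <tt>multipart/alternative</tt> entity appear in increasing order of preference; this document does not itself mandate that ordering.</t>
<artwork>
From: tickets@example.com
To: jane@example.org
Subject: Your reservation E123456789
Date: Mon, 2 Jun 2025 10:00:00 +0000
Message-ID: &lt;reservation-1@example.com&gt;
MIME-Version: 1.0
Content-Type: multipart/alternative; boundary="boundary-1"

--boundary-1
Content-Type: text/plain; charset=utf-8

Your reservation E123456789 for Example Concert is confirmed.
--boundary-1
Content-Type: text/html; charset=utf-8

&lt;html&gt;&lt;body&gt;
&lt;p&gt;Your reservation E123456789 for Example Concert is confirmed.&lt;/p&gt;
&lt;/body&gt;&lt;/html&gt;
--boundary-1
Content-Type: application/ld+json

{
  "@context": "https://schema.org",
  "@type": "EventReservation",
  "reservationId": "E123456789",
  "reservationStatus": "https://schema.org/ReservationConfirmed",
  "reservationFor": {
    "@type": "Event",
    "name": "Example Concert"
  }
}
--boundary-1--
</artwork>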
</section>

<section anchor="partial-representation"><name>Partial representation</name>
<t>If structured data is intended to describe only a <em>subset</em> of the human-readable content, it must be enclosed in a <tt>&lt;script&gt;</tt> HTML element within the HTML <tt>&lt;body&gt;</tt> element of the <tt>text/html</tt> body part of the email message (see the example at the end of this document).</t>
<t>MUAs receiving such messages may use the structured data to provide an enhanced user experience.</t>
</section>

<section anchor="non-representation"><name>Non-representation</name>
<t>In the case of non-representation, there is no relation between structured data and the human-readable content.</t>
<t>This may be useful for special scenarios, such as embedding &quot;preemptive&quot; structured vacation notices as described in [I-D.happel-sml-structured-vacation-notices-00] into email messages.</t>
<t>As in the case of partial representation, MUAs receiving such messages may take appropriate action based on the structured data extracted.</t>
</section>
</section>

<section anchor="identifiers"><name>Identifiers</name>
<t>There are existing use cases for cross-referencing between different parts of a MIME message, for which <xref target="RFC2392"></xref> defines the <tt>cid:</tt> and <tt>mid:</tt> URI schemes.</t>
<t>In a similar fashion, cross-referencing might occur between structured data and other message parts.</t>

<section anchor="using-identifiers-in-structured-data"><name>Using identifiers in structured data</name>
<t>Most nodes and properties in JSON-LD are identified using IRIs <xref target="RFC3987"></xref>. Since any <xref target="RFC2392"></xref> (cid/mid) reference forms a valid IRI, those references can be directly used in JSON-LD.</t>
<t>There are two main cases for which <tt>cid:</tt>-identifiers <bcp14>SHOULD</bcp14> be used in structured data.</t>
<t>First, if structured data references binary content such as images or other files, which already exist as MIME body parts within the same message.</t>
<t>Second, if a <tt>cid:</tt> value is used in a JSON-LD <tt>@id</tt> property, the corresponding JSON-LD node can be considered to describe the MIME body part identified by that <tt>cid:</tt>. This <bcp14>MAY</bcp14> be used to denote that certain structured data explicitly describes that MIME body part. This <bcp14>MUST NOT</bcp14> be used for the main <tt>text/plain</tt> or <tt>text/html</tt> body parts, though.</t>
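<t>Both cases are sketched in the following non-normative JSON-LD fragment. It assumes that the same message carries two further MIME body parts with the hypothetical header fields <tt>Content-ID: &lt;widget-photo@example.com&gt;</tt> (an image) and <tt>Content-ID: &lt;invoice@example.com&gt;</tt> (a document); the types and values are chosen for illustration only. The first node references the image part, while the second node uses a <tt>cid:</tt> value as its <tt>@id</tt> and thereby describes the document part itself.</t>
<sourcecode type="json">
[
  {
    "@context": "https://schema.org",
    "@type": "Product",
    "name": "Example Widget",
    "image": "cid:widget-photo@example.com"
  },
  {
    "@context": "https://schema.org",
    "@id": "cid:invoice@example.com",
    "@type": "DigitalDocument",
    "name": "Invoice for Example Widget"
  }
]
</sourcecode>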

<artwork>For discussion, see also:
https://github.com/hhappel/draft-happel-structured-email/issues/4
</artwork>
</section>

<section anchor="using-structured-data-identifiers-in-text-html"><name>Using structured data identifiers in text/html</name>
<t>In the case of &quot;partial representation&quot;, a MUA will still primarily display the human-readable part of a message (e.g., <tt>text/plain</tt> or <tt>text/html</tt>).</t>
<t>It might, however, be helpful if the MUA is able to determine which parts of the human-readable text refer to certain structured data - e.g., to offer actions based on structured data directly in the context of the corresponding human-readable content.</t>
<t>For this purpose, the sender may add an HTML &quot;data-id&quot; attribute (<xref target="HTMLData"></xref>) to any HTML element in the <tt>text/html</tt> body, which references the <tt>@id</tt> property of a JSON-LD node in the structured data.</t>
<t>Besides referencing the corresponding JSON-LD node, a sender might also want to denote whether the underlying data is &quot;extensively&quot; described or just mentioned in the human-readable representation. Depending on that, a MUA might provide different additional visualizations for the user.</t>
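<t>The following non-normative sketch shows how such a reference might look. The identifier <tt>https://example.com/reservations/E123#restaurant</tt>, the <xref target="SchemaOrg"></xref> type, and the surrounding text are assumptions made for illustration; the only point is that the &quot;data-id&quot; value equals the <tt>@id</tt> of the JSON-LD node.</t>
<artwork>
&lt;html&gt;
&lt;body&gt;
&lt;p&gt;Your table at
  &lt;span data-id="https://example.com/reservations/E123#restaurant"&gt;
  Example Bistro&lt;/span&gt; is booked for 19:30.&lt;/p&gt;
&lt;script type="application/ld+json"&gt;
{
  "@context": "https://schema.org",
  "@id": "https://example.com/reservations/E123#restaurant",
  "@type": "Restaurant",
  "name": "Example Bistro"
}
&lt;/script&gt;
&lt;/body&gt;
&lt;/html&gt;
</artwork>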

<artwork>For discussion, see also:
https://github.com/hhappel/draft-happel-structured-email/issues/5
</artwork>
</section>
</section>
</section>

<section anchor="structured-data-across-email-messages"><name>Structured data across email messages</name>

<section anchor="forwarding"><name>Forwarding</name>
<t>Forwarding messages that include structured data needs to be considered from a privacy perspective, particularly in cases of &quot;non-representation&quot;, when the user has no way to infer the structured data from the human-readable part of the message.</t>
<t>A MUA <bcp14>MUST</bcp14> strip non-representative structured data when forwarding messages. Note that this applies only to MUAs directed by users, not to automated forwarding set up by a user.</t>
<t>Beyond that, privacy issues also apply to forwarding regular email messages, such that a more general solution might be specified outside of the specific context of structured email.</t>

<artwork>For discussion, see also:
https://github.com/hhappel/draft-happel-structured-email/issues/6
</artwork>
</section>

<section anchor="replies"><name>Replies</name>
<t>In order to allow responses to structured email messages, the <xref target="SchemaOrg"></xref> vocabulary specifies a property called &quot;potentialAction&quot; (<xref target="PotentialAction"></xref>).</t>
<t>Accordingly, there can be two different ways of replying to a structured email: regular email replies such as supported by many MUAs, and particular structured email replies.</t>
<t>MUAs should ensure that both types of reply can be clearly distinguished by end users.</t>
<t>If the &quot;target&quot; property of an action points to a &quot;mailto:&quot; URI, the email user agent <bcp14>SHOULD</bcp14> reply with a structured email if the user triggers the action.</t>
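<t>As a non-normative sketch, an action whose &quot;target&quot; is a &quot;mailto:" URI might be attached to a reservation as follows. The action type &quot;ConfirmAction&quot;, the mailbox address, and the remaining values are assumptions made for this example.</t>
<sourcecode type="json">
{
  "@context": "https://schema.org",
  "@type": "EventReservation",
  "reservationId": "E123456789",
  "potentialAction": {
    "@type": "ConfirmAction",
    "name": "Confirm reservation",
    "target": "mailto:reservations@example.com"
  }
}
</sourcecode>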

<artwork>For discussion, see also:
https://github.com/hhappel/draft-happel-structured-email/issues/7
</artwork>
</section>

<section anchor="error-replies"><name>Error replies</name>
<t>In general, an original sender cannot assume that a structured email has been processed by a recipient. Hence, there will typically be no response or error message returned if the receiving MUA cannot make sense of a structured email for whatever reason.</t>
<t>This may be slightly different when sending a structured email in response to an initial structured email. In this case, the original sender <bcp14>MAY</bcp14> want to signal an issue with a response received, such as if a contradicting response has already been received, or if a response is formally inconsistent in another way.</t>
<t>In this case, a &quot;full representation&quot;-style error message <bcp14>MAY</bcp14> be returned to the sender of the erroneous response. Example: TBD</t>

<artwork>For discussion, see also:
https://github.com/hhappel/draft-happel-structured-email/issues/8
</artwork>
</section>

<section anchor="updates"><name>Updates</name>
<t>In human-readable messages, human language can be used to update or recall information that was conveyed in prior messages. Accordingly, a machine-readable mechanism is needed to express the update or recall of structured data.</t>
<t>Structured data <bcp14>SHOULD</bcp14> be updated if a later email message with a <tt>Supersedes</tt> header field (<xref target="RFC4021"></xref>; &quot;superseding message&quot;) referencing the Message-ID of the original email message is processed. In this case, the structured data of the original message should be fully revoked and replaced by the structured data of the superseding message (which might be empty).</t>
<t>Structured data in a superseding message <bcp14>MUST</bcp14> be ignored if:</t>

<ul spacing="compact">
<li>Structured data from the original message is not or cannot be revoked</li>
<li>In particular, if the original message has already been replied to by the recipient</li>
</ul>
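<t>The relation between an original and a superseding message can be sketched with the following non-normative header excerpts. The message identifiers and subjects are hypothetical, and the exact field syntax shown for <tt>Supersedes</tt> is an assumption of this example.</t>
<artwork>
Original message:

  Message-ID: &lt;reservation-1@example.com&gt;
  Subject: Your reservation E123456789

Superseding message:

  Message-ID: &lt;reservation-2@example.com&gt;
  Supersedes: &lt;reservation-1@example.com&gt;
  Subject: Updated: Your reservation E123456789
</artwork>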

<artwork>For discussion, see also:
https://github.com/hhappel/draft-happel-structured-email/issues/9
</artwork>
</section>
</section>

<section anchor="header-fields-and-message-flags"><name>Header fields and message flags</name>
<t>This section presents header fields and IMAP flags which are supposed to support MUAs in dealing with structured email.</t>

<section anchor="presence-of-structured-data"><name>Presence of structured data</name>
<t>In some use cases, MUAs might benefit from information about message details without having to evaluate the full message body.</t>
<t>For example, the <tt>$hasAttachment</tt> IMAP flag (<xref target="HasAttachment"></xref>) was proposed to signal the existence of MIME attachments in a message which otherwise would need to be redetermined based on complex MIME parsing.</t>
<t>The following procedures should apply to structured email.</t>
<t>A sending MUA (aMUA) <bcp14>SHOULD</bcp14> add a header field <tt>Structured data</tt> if a message contains structured data. The value of this field <bcp14>MUST</bcp14> include exactly one of the following values (case-insensitive):</t>

<ul spacing="compact">
<li><tt>Full</tt> for full representation</li>
<li><tt>Partial</tt> for partial representation</li>
<li><tt>Other</tt> for non-representation</li>
</ul>
<t>The <tt>Structured data</tt> field <bcp14>SHOULD</bcp14> additionally include (case-insensitive, comma-separated) the value <tt>Action</tt> if a message contains a &quot;potentialAction&quot; that a MUA might want to investigate.</t>
<t>Similarly, the IMAP flags <tt>$hasStructuredData</tt> and <tt>$hasStructuredDataAction</tt> <bcp14>MAY</bcp14> be used, if an inbound message is found to contain structured data, but neither of the aforementioned header fields.</t>

<artwork>For discussion, see also:
https://github.com/hhappel/draft-happel-structured-email/issues/10
</artwork>
</section>

<section anchor="action-processing"><name>Action processing</name>
<t>A structured email can contain &quot;potentialActions&quot;. MUAs need to ensure that such actions are not triggered multiple times - either within the same MUA or across multiple concurrent MUAs.</t>
<t>For this purpose, the <tt>\Answered</tt> flag (<xref target="RFC9051"></xref>) is not appropriate, as it has an established meaning and implementations for regular, manually authored responses.</t>
<t>Therefore, a MUA <bcp14>MUST</bcp14> set a flag <tt>$structuredDataActionSent</tt> if a potentialAction has been responded to - either by the user or by some other mechanism on behalf of the user.</t>
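<t>A MUA might check and set this flag over IMAP (<xref target="RFC9051"></xref>) as in the following non-normative exchange. The command tags and the message sequence number 42 are assumptions made for illustration. The MUA first fetches the current flags to verify that the action has not already been triggered elsewhere, and then stores the keyword after sending its response.</t>
<artwork>
C: A002 FETCH 42 (FLAGS)
S: * 42 FETCH (FLAGS (\Seen))
S: A002 OK FETCH completed
C: A003 STORE 42 +FLAGS ($structuredDataActionSent)
S: * 42 FETCH (FLAGS (\Seen $structuredDataActionSent))
S: A003 OK STORE completed
</artwork>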

<artwork>For discussion, see also:
https://github.com/hhappel/draft-happel-structured-email/issues/11
</artwork>
</section>
</section>

<section anchor="examples"><name>Examples</name>

<section anchor="partial-representation-1"><name>Partial representation</name>
<t>Placement of JSON-LD markup in a <tt>text/html</tt> body part:</t>

<artwork>&lt;html&gt;
&lt;body&gt;
&lt;script type=&quot;application/ld+json&quot;&gt;
...
&lt;/script&gt;
&lt;/body&gt;
&lt;/html&gt;
</artwork>
</section>
</section>

<section anchor="security-and-trust"><name>Security and trust</name>
<t>Email user agents that want to support structured email should follow guidance to ensure trust and security standards. These will be elaborated in a separate specification.</t>
</section>

<section anchor="implementation-status"><name>Implementation status</name>
<t>&lt; RFC Editor: before publication please remove this section and the reference to <xref target="RFC7942"></xref> &gt;</t>
<t>This section records the status of known implementations of the protocol defined by this specification at the time of posting of this Internet-Draft, and is based on a proposal described in <xref target="RFC7942"></xref>. The description of implementations in this section is intended to assist the IETF in its decision processes in progressing drafts to RFCs. Please note that the listing of any individual implementation here does not imply endorsement by the IETF. Furthermore, no effort has been spent to verify the information presented here that was supplied by IETF contributors. This is not intended as, and must not be construed to be, a catalog of available implementations or their features. Readers are advised to note that other implementations may exist.</t>
<t>According to <xref target="RFC7942"></xref>, &quot;this will allow reviewers and working groups to assign due consideration to documents that have the benefit of running code, which may serve as evidence of valuable experimentation and feedback that have made the implemented protocols more mature. It is up to the individual working groups to use this information as they see fit&quot;.</t>

<section anchor="structured-email-plugin-for-roundcube-webmail"><name>Structured Email plugin for Roundcube Webmail</name>
<t>An open source plugin for the Roundcube Webmail software is developed to serve as a reference implementation for this specification (<xref target="RC-SML"></xref>).</t>
<t>Beyond that, some ISPs and open source tools provide implementations that are partly compliant with this specification (<xref target="SchemaOrgEmail"></xref>).</t>
</section>
</section>

<section anchor="security-considerations"><name>Security considerations</name>
<t>See section &quot;security and trust&quot;.</t>
</section>

<section anchor="privacy-considerations"><name>Privacy considerations</name>
<t>See section &quot;security and trust&quot;.</t>
</section>

<section anchor="iana-considerations"><name>IANA Considerations</name>
<t>This document has no IANA actions at this time.</t>
<t>(TBD IMAP flags)</t>
</section>

</middle>

<back>
<references><name>Informative References</name>
<reference anchor="HTMLData" target="https://html.spec.whatwg.org/multipage/dom.html#attr-data-*">
  <front>
    <title>HTML Living Standard: Embedding custom non-visible data with the data-* attributes</title>
    <author>
      <organization>WHATWG</organization>
    </author>
    <date></date>
  </front>
</reference>
<reference anchor="HasAttachment" target="https://mailarchive.ietf.org/arch/msg/imapext/MVE5eNHOaNIVGUvN1RKtBL8b278/">
  <front>
    <title>Registering $hasAttachment &amp; $hasNoAttachment</title>
    <author>
      <organization>IETF imapext WG mailing list</organization>
    </author>
    <date></date>
  </front>
</reference>
<reference anchor="JSONLD" target="https://www.w3.org/TR/json-ld/">
  <front>
    <title>JSON-LD 1.1</title>
    <author>
      <organization>W3C JSON-LD Working Group</organization>
    </author>
    <date></date>
  </front>
</reference>
<reference anchor="MachineReadable" target="https://csrc.nist.gov/glossary/term/Machine_Readable">
  <front>
    <title>NIST IR 7511 Rev. 4</title>
    <author>
      <organization>NIST</organization>
    </author>
    <date></date>
  </front>
</reference>
<reference anchor="PotentialAction" target="https://schema.org/potentialAction">
  <front>
    <title>Schema.org: potentialAction</title>
    <author>
      <organization>W3C Schema.org Community Group</organization>
    </author>
    <date></date>
  </front>
</reference>
<reference anchor="RC-SML" target="https://github.com/audriga/roundcube-structured-email/">
  <front>
    <title>Structured Email plugin for Roundcube Webmail</title>
    <author>
      <organization>audriga GmbH</organization>
    </author>
    <date></date>
  </front>
</reference>
<reference anchor="RDF" target="https://www.w3.org/TR/rdf11-concepts/">
  <front>
    <title>RDF 1.1 Concepts and Abstract Syntax</title>
    <author>
      <organization>W3C RDF Working Group</organization>
    </author>
    <date></date>
  </front>
</reference>
<xi:include href="https://xml2rfc.ietf.org/public/rfc/bibxml/reference.RFC.2119.xml"/>
<xi:include href="https://xml2rfc.ietf.org/public/rfc/bibxml/reference.RFC.2392.xml"/>
<xi:include href="https://xml2rfc.ietf.org/public/rfc/bibxml/reference.RFC.3987.xml"/>
<xi:include href="https://xml2rfc.ietf.org/public/rfc/bibxml/reference.RFC.4021.xml"/>
<xi:include href="https://xml2rfc.ietf.org/public/rfc/bibxml/reference.RFC.5322.xml"/>
<xi:include href="https://xml2rfc.ietf.org/public/rfc/bibxml/reference.RFC.5598.xml"/>
<xi:include href="https://xml2rfc.ietf.org/public/rfc/bibxml/reference.RFC.7942.xml"/>
<xi:include href="https://xml2rfc.ietf.org/public/rfc/bibxml/reference.RFC.8174.xml"/>
<xi:include href="https://xml2rfc.ietf.org/public/rfc/bibxml/reference.RFC.9051.xml"/>
<reference anchor="SchemaOrg" target="https://schema.org/">
  <front>
    <title>Schema.org</title>
    <author>
      <organization>W3C Schema.org Community Group</organization>
    </author>
    <date></date>
  </front>
</reference>
<reference anchor="SchemaOrgEmail" target="https://structured.email/content/related_work/frameworks/schema_org_for_email.html">
  <front>
    <title>Schema.org for email</title>
    <author>
      <organization>Structured Email</organization>
    </author>
    <date></date>
  </front>
</reference>
<reference anchor="WDCStats" target="http://webdatacommons.org/structureddata/#toc3">
  <front>
    <title>Web Data Commons - Microdata, RDFa, JSON-LD, and Microformat Data Sets</title>
    <author>
      <organization>Web Data Commons Project</organization>
    </author>
    <date></date>
  </front>
</reference>
</references>

</back>

</rfc>
