Traffic Advice

A Collection of Interesting Ideas,

Issue Tracking:
GitHub
Editor:
(Google)

Abstract

A proposal to allow site owners to advise prefetch proxies and other agents to disallow traffic.

1. Introduction

This section is non-normative.

Publishers might wish not to accept traffic from private prefetch proxies or from any source other than direct user traffic, for instance to reduce server load due to speculative prefetch activity.

We propose a well-known "traffic advice" resource, analogous to robots.txt (for web crawlers), which allows an HTTP server to request that implementing agents stop sending traffic to it for some time.

2. Implementations

This specification may be implemented by traffic advice respecting agents, such as proxy servers or other applications which direct HTTP traffic on behalf of clients such as a web browser.

While [FETCH] is used to describe the algorithm to request this resource, such agents might not implement [HTML].

3. Definitions

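A traffic advice entry is a struct with the following items:

  disallowed flag: a boolean, initially false

  fraction: a number between 0 and 1 inclusive, initially 1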

A traffic advice result is null, a traffic advice entry, or "unreachable".

An agent identity is a list of strings. It must contain at least one element, and the last must be "*".
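This example is non-normative. One way these definitions might be modeled is sketched below, assuming the entry's items are the disallowed flag and fraction that the parsing steps set. The class name, the field defaults (false and 1) and the UNREACHABLE constant are illustrative assumptions inferred from how the later sections use these values.

```python
from dataclasses import dataclass
from typing import Union

# Hypothetical sentinel for the "unreachable" traffic advice result.
UNREACHABLE = "unreachable"


@dataclass
class TrafficAdviceEntry:
    """Sketch of a traffic advice entry (names and defaults are assumptions)."""

    disallowed: bool = False  # the "disallowed flag"
    fraction: float = 1.0     # share of traffic the server is willing to receive


# A traffic advice result is null (None), an entry, or "unreachable".
TrafficAdviceResult = Union[None, TrafficAdviceEntry, str]
```

With these defaults, an entry that sets neither item places no restriction on traffic.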

4. Identity

Each agent should have a brand name that specifically identifies it (such as PollyPrefetchProxy).

Its agent identity is all of the following that apply, in order:

  1. The brand name

  2. "prefetch-proxy", if the agent is a proxy server which exclusively serves prefetch traffic (for example, a private prefetch proxy)

  3. "*"
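This example is non-normative. The list above could be assembled as in the following sketch. The function name and its parameters are hypothetical, and a missing brand name is represented here as None.

```python
from typing import List, Optional


def build_agent_identity(brand_name: Optional[str], is_prefetch_proxy: bool) -> List[str]:
    """Return the agent identity, most specific selector first."""
    identity: List[str] = []
    if brand_name:
        identity.append(brand_name)
    if is_prefetch_proxy:
        # Only for proxies that exclusively serve prefetch traffic.
        identity.append("prefetch-proxy")
    identity.append("*")  # the wildcard always comes last
    return identity
```

For instance, `build_agent_identity("PollyPrefetchProxy", True)` yields the brand name, then "prefetch-proxy", then "*", so that more specific advice can take precedence over wildcard advice during parsing.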

5. Fetching

To generate a traffic advice URL for origin origin, run the following steps:

  1. If origin is not a tuple origin, return failure.

  2. If origin’s scheme is not an HTTP(S) scheme, return failure.

  3. If origin is not a potentially trustworthy origin, return failure.

  4. Return a new URL as follows:

    scheme: origin’s scheme

    host: origin’s host

    port: origin’s port

    path: « ".well-known", "traffic-advice" »
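This example is non-normative. The steps above might look roughly like the following sketch. The TupleOrigin type and function names are hypothetical, an opaque origin is represented as None, and is_potentially_trustworthy is only a rough approximation of the [SECURE-CONTEXTS] check (HTTPS, localhost names and loopback addresses), not the full algorithm.

```python
import ipaddress
from typing import NamedTuple, Optional


class TupleOrigin(NamedTuple):
    scheme: str
    host: str
    port: Optional[int]  # None means the scheme's default port


def is_potentially_trustworthy(origin: TupleOrigin) -> bool:
    """Approximation only: HTTPS, localhost names, or loopback addresses."""
    if origin.scheme == "https":
        return True
    host = origin.host.lower()
    if host == "localhost" or host.endswith(".localhost"):
        return True
    try:
        return ipaddress.ip_address(host.strip("[]")).is_loopback
    except ValueError:
        return False  # not an IP literal


def traffic_advice_url(origin: Optional[TupleOrigin]) -> Optional[str]:
    """Return the traffic advice URL, or None to signal failure."""
    if origin is None:  # not a tuple origin
        return None
    if origin.scheme not in ("http", "https"):
        return None
    if not is_potentially_trustworthy(origin):
        return None
    authority = origin.host if origin.port is None else f"{origin.host}:{origin.port}"
    return f"{origin.scheme}://{authority}/.well-known/traffic-advice"
```

Under these assumptions, an origin such as https://example.com maps to https://example.com/.well-known/traffic-advice, while a plain http origin on a public host results in failure.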

To fetch traffic advice for origin origin, agent identity identity and algorithm whenComplete accepting a traffic advice result:

  1. Let url be the result of generating a traffic advice URL for origin. If it results in failure, then return failure.

  2. Let request be a request as follows:

    method: `GET`

    URL: url

    client: null

    credentials mode: "omit"

    redirect mode: "manual"

    This means that a redirect status will not lead to another origin being contacted.

  3. Let fetchController be null.

  4. Let processResponse be the following steps, given response response:

    1. If response’s type is "error", then terminate fetchController, run whenComplete with "unreachable", and return.

    2. If response’s type is "opaqueredirect", then terminate fetchController, run whenComplete with null, and return.

    3. Assert: response’s type is "basic".

    4. If response’s status is 429 (Too Many Requests; see [RFC6585]) or 503 (Service Unavailable; see [HTTP-SEMANTICS]), then terminate fetchController, run whenComplete with "unreachable", and return.

      If present, the [HTTP-SEMANTICS] Retry-After response header could be used as a hint about when to next retry.

    5. If response’s status is not an ok status, then terminate fetchController, run whenComplete with null and return.

    6. If response’s status is a null body status, then terminate fetchController, run whenComplete with null and return.

    7. Let mimeType be the result of extracting a MIME type from response’s header list.

    8. If mimeType is failure or its essence is not "application/trafficadvice+json", then terminate fetchController, run whenComplete with null and return.

  5. Let processResponseEndOfBody be the following steps, given response response and null, failure or byte sequence body:

    1. If body is not a byte sequence, then run whenComplete with null and return.

    2. Let string be the result of UTF-8 decoding body.

    3. Let parseResult be the result of parsing traffic advice from string given identity.

    4. Run whenComplete with parseResult.

  6. Fetch request with processResponse set to processResponse and processResponseEndOfBody set to processResponseEndOfBody, and set fetchController to the result.

    Notwithstanding the usual behavior of [HTTP-CACHING], agents (especially ones shared amongst multiple users) should consider applying a minimum freshness lifetime (10 minutes is suggested) and maximum freshness lifetime (48 hours is suggested) in order to balance the security considerations discussed below. If these suggested values are used, a default freshness lifetime (if none is specified) of 30 minutes may be appropriate.
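This example is non-normative. The response handling in processResponse can be summarized as a small decision function, sketched below for an agent that does not implement [FETCH] directly. The Outcome enum, the function names and the simplified MIME type extraction (taking the text before any ";" parameter) are assumptions made for illustration. Redirect statuses need no special case here because, with redirects not followed, they are not ok statuses and so yield null either way.

```python
from enum import Enum
from typing import Optional

ADVICE_MIME_ESSENCE = "application/trafficadvice+json"


class Outcome(Enum):
    UNREACHABLE = "unreachable"  # report "unreachable"
    NO_ADVICE = "null"           # report null
    PARSE_BODY = "parse body"    # continue to UTF-8 decoding and parsing


def mime_essence(content_type: Optional[str]) -> Optional[str]:
    """Simplified extraction: drop parameters, trim and lowercase."""
    if not content_type:
        return None
    essence = content_type.split(";", 1)[0].strip().lower()
    return essence or None


def classify_response(network_error: bool, status: int,
                      content_type: Optional[str]) -> Outcome:
    if network_error:
        return Outcome.UNREACHABLE
    if status in (429, 503):
        return Outcome.UNREACHABLE
    if not 200 <= status <= 299:  # not an ok status (includes redirects)
        return Outcome.NO_ADVICE
    if status in (204, 205):      # null body statuses within the ok range
        return Outcome.NO_ADVICE
    if mime_essence(content_type) != ADVICE_MIME_ESSENCE:
        return Outcome.NO_ADVICE
    return Outcome.PARSE_BODY
```

Only a successful response carrying the traffic advice MIME type reaches the parsing stage; overload signals map to "unreachable" and everything else to null.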

6. Parsing

To parse traffic advice from a string string given agent identity identity:

  1. Let parsed be the result of parsing JSON into Infra values given string. If this throws an exception, then return null.

  2. If parsed is not a list, then return null.

  3. Let bestMatch be null.

  4. For each entry of parsed:

    1. If entry is not a map, then continue.

    2. If entry["user_agent"] does not exist or is not a string, then continue.

    3. Let agentSelector be entry["user_agent"].

    4. If identity does not contain agentSelector, then continue.

    5. If bestMatch is null or agentSelector appears at an earlier index in identity than bestMatch["user_agent"] does, then set bestMatch to entry.

  5. If bestMatch is null, then return null.

  6. Let entry be a traffic advice entry.

  7. If bestMatch["disallow"] exists and is true, then set entry’s disallowed flag to true.

  8. If bestMatch["fraction"] exists and is a number, then:

    1. Let fraction be bestMatch["fraction"].

    2. If fraction is greater than or equal to 0 and less than or equal to 1, then set entry’s fraction to fraction.

  9. Return entry.
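This example is non-normative. The parsing steps above could be implemented along the following lines. The class and function names are hypothetical, the entry defaults (false and 1) are assumptions, and Python's json module stands in for parsing JSON into Infra values.

```python
import json
from dataclasses import dataclass
from typing import List, Optional


@dataclass
class TrafficAdviceEntry:
    disallowed: bool = False
    fraction: float = 1.0


def parse_traffic_advice(text: str, identity: List[str]) -> Optional[TrafficAdviceEntry]:
    try:
        parsed = json.loads(text)
    except ValueError:  # json.JSONDecodeError is a subclass of ValueError
        return None
    if not isinstance(parsed, list):
        return None

    best = None
    for entry in parsed:
        if not isinstance(entry, dict):
            continue
        selector = entry.get("user_agent")
        if not isinstance(selector, str) or selector not in identity:
            continue
        # An earlier index in the identity means a more specific match.
        if best is None or identity.index(selector) < identity.index(best["user_agent"]):
            best = entry
    if best is None:
        return None

    result = TrafficAdviceEntry()
    if best.get("disallow") is True:
        result.disallowed = True
    fraction = best.get("fraction")
    # bool is a subclass of int in Python, so it is excluded explicitly.
    if isinstance(fraction, (int, float)) and not isinstance(fraction, bool):
        if 0 <= fraction <= 1:
            result.fraction = float(fraction)
    return result
```

Given the illustrative document `[{"user_agent": "prefetch-proxy", "disallow": true}, {"user_agent": "*", "fraction": 0.5}]`, an agent whose identity includes "prefetch-proxy" obtains an entry with the disallowed flag set, while an agent matching only "*" obtains an entry with a fraction of 0.5.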

7. Interpretation

When they would be able to respect advice to disallow traffic to an origin (for example, when requested to proxy prefetch traffic to the origin), traffic advice respecting agents should fetch traffic advice (respecting [HTTP-CACHING] semantics).

If the result is null, then no advice was received. Agents should adopt their default behavior.

If the result is "unreachable", then the HTTP server was not able to service the request for traffic advice. Since this could indicate that the server cannot accept additional requests at this time, agents may stop traffic to the server for some interval.

If the result’s disallowed flag is true, then the HTTP server advises that traffic is discouraged at this time. Agents should respect this by not establishing new connections or sending new requests.

Otherwise, if the result’s fraction is less than 1, then the HTTP server advises that it would like to receive only a fraction of the possible traffic. Agents may implement this as they see fit, but the following algorithm is suggested on establishment of an HTTP connection on behalf of a client.

  1. Choose a uniform random number r between 0 and 1.

  2. If r is less than or equal to the result’s fraction, then the traffic is permitted by the fraction.

  3. Otherwise, a connection is not established.

This process should not be repeated as part of automatic retry logic, since this would defeat the server’s ability to shed load in this manner. Broadly, agents should aim for a fraction of 0.1 to result in approximately 10% of the traffic to the HTTP server.

This approach allows servers to scale their traffic proportionally as part of an incremental rollout. Agents should avoid approaches which might bias the permitted connections or requests in ways that might make this scaling non-linear (e.g., by preferring certain kinds of connection or user).
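This example is non-normative. The interpretation rules and the suggested sampling algorithm might be combined as in the sketch below. The function and parameter names are hypothetical. It assumes the agent's default behavior is to allow traffic and that it chooses to stop traffic on "unreachable". It draws r from the half-open interval [0, 1) and compares strictly, so that a fraction of 0 never admits traffic; this differs from the "less than or equal" wording above only at the exact boundary.

```python
import random
from dataclasses import dataclass
from typing import Callable, Union


@dataclass
class TrafficAdviceEntry:
    disallowed: bool = False
    fraction: float = 1.0


def should_connect(result: Union[None, str, TrafficAdviceEntry],
                   rng: Callable[[], float] = random.random,
                   default_allow: bool = True) -> bool:
    """Decide once per connection attempt; do not re-roll on automatic retries."""
    if result is None:
        return default_allow  # no advice received
    if isinstance(result, str):
        return False          # "unreachable": back off for some interval
    if result.disallowed:
        return False
    if result.fraction < 1:
        return rng() < result.fraction  # uniform draw from [0, 1)
    return True
```

Because each decision is an independent uniform draw, a fraction of 0.1 admits roughly 10% of connection attempts without favoring any particular kind of connection or user.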

8. Security considerations

8.1. Type confusion

Like other resources, it is possible that the /.well-known/traffic-advice path could be used for a request with some other destination (e.g., as a script). If interpreted as JavaScript, the JSON data would either be syntactically invalid or an empty block. More generally, this specification requires the use of a MIME type that is not used for any other purpose, and standard countermeasures (e.g., X-Content-Type-Options: nosniff) can be used to prevent type confusion in some cases which are permissive of mismatched MIME types.

8.2. Caching issues

Because the traffic advice resource is expected to be cached by traffic advice respecting agents such as private prefetch proxies, it is possible that a temporary compromise of an origin server or its private key could be extended to a longer outage of some traffic due to an agent caching a policy that prevents or throttles traffic, leading to a denial of service for such traffic. This is similar to attacks against HTTP Public Key Pinning [RFC7469].

This is less of an issue if the traffic is non-essential (e.g., prefetch) traffic.

To mitigate this, well-behaved agents implement a maximum freshness lifetime when they fetch traffic advice.

8.3. Request amplification

Agents which are proxy services accessible to untrusted users (especially the general public) may be susceptible to being used to amplify a denial of service attack conducted, for example, by a botnet. For example, if a small request from a client (e.g., CONNECT target.example:443 with small headers) can cause a larger request (e.g., GET /.well-known/traffic-advice with large headers) to the origin server, this could be used to increase the effective bandwidth available to a distributed denial of service attack against an origin server.

To mitigate this, well-behaved agents implement, in addition to other anti-abuse measures, a minimum freshness lifetime when they fetch traffic advice.
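This example is non-normative. A minimal sketch of how an agent might apply both mitigations is to clamp the freshness lifetime computed from the response, using the values suggested in the Fetching section (10 minutes, 48 hours, and a 30 minute default). The constant and function names are hypothetical, and the server-provided lifetime is assumed to have already been computed per [HTTP-CACHING], with None meaning that none was specified.

```python
from typing import Optional

MIN_FRESHNESS_S = 10 * 60           # suggested minimum: 10 minutes
MAX_FRESHNESS_S = 48 * 60 * 60      # suggested maximum: 48 hours
DEFAULT_FRESHNESS_S = 30 * 60       # suggested default: 30 minutes


def effective_freshness(server_lifetime_s: Optional[float]) -> float:
    """Clamp the freshness lifetime (in seconds) for cached traffic advice."""
    if server_lifetime_s is None:
        return DEFAULT_FRESHNESS_S
    return max(MIN_FRESHNESS_S, min(MAX_FRESHNESS_S, server_lifetime_s))
```

The lower bound limits how often the agent can be induced to re-request the resource, and the upper bound limits how long a maliciously installed policy can persist.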

9. Privacy considerations

This specification provides general mechanisms for agents to limit the traffic they are sending. Most privacy considerations are expected to be particular to the agents in question (for example, proxies inspecting traffic they carry).

If privacy considerations related to the traffic advice mechanism itself are identified, they should be added here.

10. IANA considerations

10.1. Well-known traffic-advice URI

This document defines well-known URI suffix traffic-advice as described by [WELL-KNOWN]. It should be submitted for registration as follows:

URI suffix: traffic-advice

Change controller: The editor(s) of this document, pending a standards venue

Specification(s): This document

Status: provisional

Related information: None

10.2. The application/trafficadvice+json MIME type

This document defines the MIME type application/trafficadvice+json as described by [RFC6838]. It should be submitted for registration as follows:

Type name: application

Subtype name: trafficadvice+json

Required parameters: N/A

Optional parameters: N/A

Encoding considerations: Always UTF-8

Security considerations: See Security considerations.

Interoperability considerations: This MIME type is not known to be in previous use. Applications which can process application/json should be able to process all valid data with this MIME type.

Published specification: This document

Applications that use this media type: traffic advice respecting agents

Fragment identifier considerations: N/A

Additional information:

  Deprecated alias names for this type: N/A

  Magic number(s): N/A

  File extension(s): None. This resource will be named traffic-advice when fetched over HTTP.

  Macintosh file type code: Same as for application/json [RFC8259]

Person & email address to contact for further information: The editor(s) of this document

Intended usage: Common

Restrictions on usage: N/A

Change controller: The editor(s) of this document, pending a standards venue


References

Normative References

[ENCODING]
Anne van Kesteren. Encoding Standard. Living Standard. URL: https://encoding.spec.whatwg.org/
[FETCH]
Anne van Kesteren. Fetch Standard. Living Standard. URL: https://fetch.spec.whatwg.org/
[HTML]
Anne van Kesteren; et al. HTML Standard. Living Standard. URL: https://html.spec.whatwg.org/multipage/
[INFRA]
Anne van Kesteren; Domenic Denicola. Infra Standard. Living Standard. URL: https://infra.spec.whatwg.org/
[MIMESNIFF]
Gordon P. Hemsley. MIME Sniffing Standard. Living Standard. URL: https://mimesniff.spec.whatwg.org/
[SECURE-CONTEXTS]
Mike West. Secure Contexts. URL: https://w3c.github.io/webappsec-secure-contexts/
[URL]
Anne van Kesteren. URL Standard. Living Standard. URL: https://url.spec.whatwg.org/

Informative References

[HTTP-CACHING]
R. Fielding; M. Nottingham; J. Reschke. HTTP Caching. Internet-Draft. URL: https://httpwg.org/http-core/draft-ietf-httpbis-cache-latest.html
[HTTP-SEMANTICS]
R. Fielding, Ed.; J. Reschke, Ed. Hypertext Transfer Protocol (HTTP/1.1): Semantics and Content. June 2014. Proposed Standard. URL: https://httpwg.org/specs/rfc7231.html
[RFC6585]
M. Nottingham; R. Fielding. Additional HTTP Status Codes. April 2012. Proposed Standard. URL: https://httpwg.org/specs/rfc6585.html
[RFC6838]
N. Freed; J. Klensin; T. Hansen. Media Type Specifications and Registration Procedures. January 2013. Best Current Practice. URL: https://www.rfc-editor.org/rfc/rfc6838
[RFC7469]
C. Evans; C. Palmer; R. Sleevi. Public Key Pinning Extension for HTTP. April 2015. Proposed Standard. URL: https://www.rfc-editor.org/rfc/rfc7469
[RFC8259]
T. Bray, Ed. The JavaScript Object Notation (JSON) Data Interchange Format. December 2017. Internet Standard. URL: https://www.rfc-editor.org/rfc/rfc8259
[WELL-KNOWN]
M. Nottingham. Well-Known Uniform Resource Identifiers (URIs). May 2019. Proposed Standard. URL: https://www.rfc-editor.org/rfc/rfc8615