RTP

October 25, 2024

refining the ideas behind HTTP, BitTorrent, Gemini, and 9p to create a simple protocol for reliably transferring large immutable files.

(this document is a first draft, and is not intended to be implemented in its current form)

NOTE: this document has been made obsolete by RITP

Goals

as simple as possible without sacrificing our other goals (less complex than http)
performs well under poor network conditions, even when payloads are large (unlike gemini)
performs well under high latency (unlike 9p)
good single-source performance (unlike BitTorrent)

Non-Goals

content negotiation (http Accept headers)
mutable identifiers
version negotiation

all of these can be trivially handled via an outer protocol (eg. mutable identifiers can be handled with gemini cross-protocol redirects)

Strategies

a stateful request/response protocol similar to 9p, but with hash-identified urls similar to magnet links (these encode length and content hash)

streams are encoded over tcp, tcp/tls, or quic to ensure reliable delivery.

Notation

each request has a number of fields. each field is marked either as a fixed number of bits, or as a “string” field.

“string” fields consist of a 64 bit length value, followed by that many bytes.

“string” fields do not have to be valid utf-8 unless specified.

all integers are little-endian. note that tokens are not integers: they are opaque client-chosen identifiers with a length of 63 bits (followed by a 1 bit flag, which takes the place of the LSB of the final byte). servers must take care to mask the correct bit.
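a minimal python sketch of the encoding rules above. the helper names (`pack_string`, `unpack_string`, `pack_token`, `unpack_token`) are made up for illustration, and treating a token as 8 raw bytes whose final byte gives up its LSB to the flag is one reading of the text, not a confirmed wire format.

```python
import struct


def pack_string(data: bytes) -> bytes:
    """Encode a "string" field: 64-bit little-endian length, then the bytes."""
    return struct.pack("<Q", len(data)) + data


def unpack_string(buf: bytes, pos: int = 0) -> tuple[bytes, int]:
    """Decode a "string" field at pos; return (bytes, position after the field)."""
    if pos + 8 > len(buf):
        raise ValueError("truncated length prefix")
    (length,) = struct.unpack_from("<Q", buf, pos)
    start, end = pos + 8, pos + 8 + length
    if end > len(buf):
        raise ValueError("truncated string field")
    return buf[start:end], end


def pack_token(token: bytes, flag: bool) -> bytes:
    """Combine an 8-byte opaque token with the 1-bit flag.

    Assumption: the flag replaces the least significant bit of the last byte.
    """
    if len(token) != 8:
        raise ValueError("token must be exactly 8 bytes")
    return token[:7] + bytes([(token[7] & 0xFE) | int(flag)])


def unpack_token(raw: bytes) -> tuple[bytes, bool]:
    """Split 8 wire bytes into (token with the flag bit masked off, flag)."""
    if len(raw) != 8:
        raise ValueError("token must be exactly 8 bytes")
    return raw[:7] + bytes([raw[7] & 0xFE]), bool(raw[7] & 1)
```

masking the flag bit in `unpack_token` means two tokens that differ only in that bit compare equal, which is the mistake the note about masking warns against.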

Requests and Responses

there are 2 request types:
* OPEN
* READ

there are 2 response types:
* OK
* ERROR

each request has the following structure:
* (63 bits) new_token: a token identifying the results of the request
* (1 bit) request_type: integer representing the type of request
* (n bits) type-dependent fields

each response has the following structure:
* (63 bits) request_token: the client-chosen token found in the request that generated this response
* (1 bit) is_error: set if the corresponding request generated an error (such as the server being unable to find the requested resource)
* (string) payload: if is_error is set, then a human and machine readable UTF-8 string representing an error. otherwise, its interpretation depends on the type of the request.
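a rough python sketch of decoding one response from a byte buffer, following the layout above. `Response` and `decode_response` are hypothetical names, and the bit placement (is_error as the LSB of the eighth byte) is an assumption drawn from the Notation section.

```python
import struct
from dataclasses import dataclass


@dataclass
class Response:
    request_token: bytes  # 8 bytes, with the flag bit cleared
    is_error: bool
    payload: bytes


def decode_response(buf: bytes) -> tuple[Response, int]:
    """Decode one response; return it and the number of bytes consumed."""
    if len(buf) < 16:
        raise ValueError("truncated response header")
    raw_token = buf[:8]
    # Assumption: is_error is the least significant bit of the final token byte.
    is_error = bool(raw_token[7] & 1)
    token = raw_token[:7] + bytes([raw_token[7] & 0xFE])
    # The payload is a "string" field: 64-bit little-endian length, then bytes.
    (length,) = struct.unpack_from("<Q", buf, 8)
    end = 16 + length
    if end > len(buf):
        raise ValueError("truncated payload")
    payload = buf[16:end]
    if is_error:
        # Error payloads are required to be UTF-8; this raises if they are not.
        payload.decode("utf-8")
    return Response(token, is_error, payload), end
```

returning the consumed byte count lets a caller decode several responses back to back from one stream buffer.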

OPEN request

open requests have one extra “string” field:
* (string) uri: this field represents the uri of the resource to be downloaded. what schemes are supported depends on the server, but it is recommended to support at least urn:sha256:* uris.

the payload of non-error responses to OPEN requests is ignored, and SHOULD be empty.
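as an illustration, an OPEN request could be serialized like this in python. the document does not say which bit value means OPEN, so `OPEN = 0` is purely an assumption, as is the function name `encode_open` and the choice to encode the uri as UTF-8.

```python
import struct

OPEN = 0  # assumption: the document does not assign bit values to request types


def encode_open(new_token: bytes, uri: str) -> bytes:
    """Serialize an OPEN request: token + type bit, then the uri "string" field."""
    if len(new_token) != 8:
        raise ValueError("token must be exactly 8 bytes")
    # The request_type bit takes the place of the LSB of the final token byte.
    header = new_token[:7] + bytes([(new_token[7] & 0xFE) | OPEN])
    uri_bytes = uri.encode("utf-8")  # assumption: uris are sent as UTF-8
    return header + struct.pack("<Q", len(uri_bytes)) + uri_bytes
```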

READ request

read requests have two extra fields:
* (64 bits) offset: at what point to begin reading
* (64 bits) length: the maximum number of payload bytes the server is allowed to respond with.

the payload of non-error responses to READ requests MUST be the empty string if and only if offset is greater than or equal to the total number of bytes in the resource, or if length is 0. if neither of these conditions is met, the payload MUST NOT be the empty string.

when the server generates an error in response to a token, all further READ requests targeted at that token are canceled. this allows a client to begin sending READ requests before it has received a response to the OPEN request.
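a small python sketch of both sides of a READ, under stated assumptions: `READ = 1` is a guess at the request_type bit, `max_chunk` is a hypothetical server-side limit, and the request layout follows this section as written (the 2024-10-28 errata adds an open_token field whose position is not given, so it is left out here).

```python
import struct

READ = 1  # assumption: the document does not assign bit values to request types


def encode_read(new_token: bytes, offset: int, length: int) -> bytes:
    """Serialize a READ request: token + type bit, then offset and length."""
    if len(new_token) != 8:
        raise ValueError("token must be exactly 8 bytes")
    header = new_token[:7] + bytes([(new_token[7] & 0xFE) | READ])
    return header + struct.pack("<QQ", offset, length)


def read_payload(resource: bytes, offset: int, length: int,
                 max_chunk: int = 65536) -> bytes:
    """Server side: pick the payload for a successful READ.

    Empty only when reading at or past the end, or when length is 0;
    otherwise at least one byte and at most `length` bytes.
    """
    if max_chunk < 1:
        raise ValueError("max_chunk must be at least 1")
    if length == 0 or offset >= len(resource):
        return b""
    return resource[offset:offset + min(length, max_chunk)]
```

because length is only a maximum, the server is free to cap each response (here via `max_chunk`) as long as it never returns an empty payload before the end of the resource.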

URL schemes

`rtp`

much like the magnet url scheme, the rtp scheme consists entirely of predefined query parameters:

(one or more) u: the uri/urn of the underlying resource. can be specified multiple times to specify multiple hashes for the resource (the client is expected to verify the hash, so specifying multiple u allows slowly migrating to a new hash algo). if multiple values are specified, they must correspond to the same resource.
(up to one) l: the length of the resource
(one or more) s: the server(s) that the resource can be retrieved from. if specified multiple times, the client may choose one, or perform a swarm download from several at once. these servers take the form of proto!addr!port, for example, tcp!example.com!7777. (this is based off of plan9 dial strings, since it seems to be the only well-specified way of specifying a method of transport)
(zero or more) t: list of mime types that the resource may be interpreted as. clients MAY ignore values other than the first.
(up to one) v: protocol version. if not specified, it defaults to version 1, which is the version specified in this document. clients MUST reject urls with an unrecognized version.
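since the client is expected to verify the hash, here is a hedged python sketch of checking downloaded data against a `urn:sha256:` value. it assumes the digest is written as hex after the prefix, which this document does not actually specify, and `verify_sha256_urn` is a made-up name.

```python
import hashlib
from typing import Iterable


def verify_sha256_urn(urn: str, chunks: Iterable[bytes]) -> bool:
    """Return True if the chunks, concatenated, hash to the digest in the urn.

    Assumption: the urn has the form urn:sha256:<hex digest>.
    """
    prefix = "urn:sha256:"
    if not urn.lower().startswith(prefix):
        raise ValueError("unsupported scheme")
    expected = urn[len(prefix):].lower()
    digest = hashlib.sha256()
    for chunk in chunks:
        # Hash incrementally so a large file never has to sit in memory at once.
        digest.update(chunk)
    return digest.hexdigest() == expected
```

when several u values are present, a client could run one such check per algorithm it understands and skip the rest.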

it is RECOMMENDED that every value of u is recognized by every server s. if a client encounters an error when downloading from one server, it SHOULD try downloading from another server.

clients SHOULD NOT use rtp urls in OPEN requests, instead they should choose a value listed in u.

unrecognized fields SHOULD be ignored.
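the rules above could be sketched in python roughly as follows. the exact url shape (`rtp:?u=...&s=...`, copied from magnet links) is an assumption, and `RtpUrl` and `parse_rtp_url` are hypothetical names.

```python
from dataclasses import dataclass
from typing import Optional
from urllib.parse import parse_qs, urlsplit


@dataclass
class RtpUrl:
    uris: list[str]                       # every u value
    length: Optional[int]                 # l, if present
    servers: list[tuple[str, str, str]]   # (proto, addr, port) from each s
    types: list[str]                      # t values, first one preferred
    version: int


def parse_rtp_url(url: str) -> RtpUrl:
    parts = urlsplit(url)
    if parts.scheme != "rtp":
        raise ValueError("not an rtp url")
    # Note: parse_qs percent-decodes values and turns "+" into a space.
    query = parse_qs(parts.query, keep_blank_values=True)
    versions = query.get("v", ["1"])
    if len(versions) != 1 or versions[0] != "1":
        raise ValueError("unrecognized version")
    uris = query.get("u", [])
    raw_servers = query.get("s", [])
    if not uris or not raw_servers:
        raise ValueError("at least one u and one s are required")
    lengths = query.get("l", [])
    if len(lengths) > 1:
        raise ValueError("l may be given at most once")
    servers = []
    for dial in raw_servers:
        fields = dial.split("!")  # plan9-style proto!addr!port
        if len(fields) != 3:
            raise ValueError("malformed server: " + dial)
        servers.append((fields[0], fields[1], fields[2]))
    # Any other query parameters are simply never looked at, i.e. ignored.
    return RtpUrl(uris, int(lengths[0]) if lengths else None,
                  servers, query.get("t", []), 1)
```

for example, `parse_rtp_url("rtp:?u=urn:sha256:abc&l=42&s=tcp!example.com!7777")` yields one uri, a length of 42, and the server `("tcp", "example.com", "7777")`.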

`gemini+rtp`

this scheme uses the Gemini Protocol to implement mutable identifiers.

gemini is used in “proxy mode”, that is, the sent url has a scheme of gemini+rtp and not gemini. the gemini server then uses a cross-protocol redirect to return an rtp url.

Rationale

READ.length is defined as a maximum so that clients that do not know the length of the resource they are downloading can use a value of 2^64 - 1 to request as much of the rest of the resource as the server is able to provide.
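a client that does not know the resource length could loop like this python sketch. `read` is a hypothetical callable standing in for "send a READ with this offset and length, return the payload of the OK response"; it is not part of the protocol.

```python
from typing import Callable

MAX_LENGTH = 2**64 - 1  # "as much as the server is able to provide"


def read_to_end(read: Callable[[int, int], bytes], offset: int = 0) -> bytes:
    """Fetch everything from offset onward without knowing the total length."""
    chunks = []
    while True:
        payload = read(offset, MAX_LENGTH)
        if not payload:
            # An empty payload with a non-zero length means offset is at or
            # past the end of the resource, so the download is complete.
            break
        chunks.append(payload)
        offset += len(payload)
    return b"".join(chunks)
```

the loop terminates because a non-error READ with a non-zero length may only return an empty payload once offset reaches the end of the resource.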

READ.offset exists both so that downloads can be resumed, and also to allow seeking within complex formats (eg. allowing you to download just one file out of a zip archive). It also allows doing swarm downloads from multiple equally trusted sources.
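one way a swarm download might divide the work, sketched in python. the round-robin policy, the `chunk_size` parameter, and the name `plan_ranges` are all assumptions for illustration; the document does not prescribe a scheduling strategy.

```python
def plan_ranges(total_length: int, servers: list[str],
                chunk_size: int) -> list[tuple[str, int, int]]:
    """Assign (server, offset, length) READ ranges round-robin across servers."""
    if chunk_size <= 0:
        raise ValueError("chunk_size must be positive")
    if not servers:
        raise ValueError("at least one server is required")
    plan = []
    for index, offset in enumerate(range(0, total_length, chunk_size)):
        # The final range is shortened so it does not run past the end.
        length = min(chunk_size, total_length - offset)
        plan.append((servers[index % len(servers)], offset, length))
    return plan
```

since READ.length is only a maximum, a server may return fewer bytes than a range asks for, so a real client would re-request the remainder of any short range.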

Appendix A: error strings

an error string consists of a machine readable string representing the kind of error, optionally followed by a colon, and then a human-readable string further clarifying the error.

the following error strings are predefined:
* error: a generic error kind usable when nothing else is applicable
* not found: no resource with the given uri is known to the server.
* unsupported scheme: the given uri or urn scheme is not supported
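splitting an error string into its two parts might look like this in python. `parse_error` is a hypothetical helper, and trimming the whitespace around the colon is an assumption, since the format above does not say whether a space follows it.

```python
from typing import Optional


def parse_error(payload: bytes) -> tuple[str, Optional[str]]:
    """Split an error payload into (machine-readable kind, human-readable detail)."""
    text = payload.decode("utf-8")
    # Only the first colon separates the kind from the detail; any later
    # colons belong to the human-readable part.
    kind, separator, detail = text.partition(":")
    return kind.strip(), (detail.strip() if separator else None)
```

for instance, `parse_error(b"not found: no such hash")` gives `("not found", "no such hash")`, while a bare `b"error"` gives `("error", None)`.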

Errata 2024-10-28

READ needs an additional field, open_token, which corresponds to the request_token of an OPEN request.
RTP is commonly used as an abbreviation for the Real-time Transport Protocol, and is not descriptive enough.
this is a protocol for reliably downloading large files. it is not designed to be a drop in replacement for http or BitTorrent.

I am currently drafting a successor proposal that addresses these issues.

#networking #programming
