Universal Messaging Format

A message format specification for use with distributed applications.

Version

The current version of this specification is: UMF/1.4.6, which introduces the signature keyword.

License

UMF is licensed under the open source MIT License (MIT) and authored by Carlos Justiniano (cjus34@gmail.com). This specification is hosted on GitHub at: https://github.com/cjus/umf/blob/master/umf.md


1. Introduction

This specification describes an application-level messaging format suitable for use with WebSockets, message queuing and traditional HTTP JSON payloads. The proposed format is intended as a single, consistent replacement for the ad hoc and inconsistent message formats that distributed applications otherwise tend to use.

From this point forward we’ll refer to the Universal Messaging Format as UMF. We’ll also refer to UMFs as documents because they can be stored in memory, transmitted along communication channels and retained in offline storage and message queues.

2. Message Format

The UMF is a valid JSON document that is required to validate with existing JSON validators and thus be fully compliant with the JSON specification.

Example validators:

JSON Specification: http://www.json.org/

JSON (JavaScript Object Notation) is a data interchange format derived from JavaScript object syntax. As such, UMF, which is encoded in JSON, follows well-established JavaScript naming conventions. For example, UMF retains the use of camel case for compound names.

2.1 Envelope format

UMF uses an envelope format where the outer portion is a valid JSON object, and the inner portion consists of directives (headers) and a message body. Variations of this approach are used in the SOAP XML format, where the outer portion is called an envelope and the inner portion contains both header and body sections. In other formats, headers and body may be side by side, as is the case with HTTP.

In UMF the inner portion consists of UMF reserved key/value pairs with an optional body object.

{
  "mid": "ef5a7369-f0b9-4143-a49d-2b9c7ee51117",
  "rmid": "66c61afc-037b-4229-ace4-5ec4d788903e",
  "to": "uid:123",
  "from": "uid:56",
  "type": "dm",
  "version": "UMF/1.4.3",
  "priority": "10",
  "timestamp": "2013-09-29T10:40Z",
  "body": {
    "message": "How is it going?"
  }
}

Only UMF reserved words may be used as top-level keys. However, application-specific (custom) key/value pairs may be used freely inside the body value. This requirement ensures that every message follows the agreed-upon format defined by its version number.

2.1.1 Envelope format considerations for routing

A UMF message is considered to be a system routable message. The UMF message fields aid first and foremost in routing, but can also be used by message processing systems (i.e. message handlers) to obtain additional routing-related information. However, the fields are reserved for routing systems and are not intended to be extended with application-specific fields. Where application-specific fields are needed, they should be placed in the body value instead.

2.2 Reserved Fields

As described earlier UMF consists of reserved key/value pairs with an optional embedded body object. This section describes each of the reserved fields and their intended use. A reserved field consists of a name key followed by a value which is encoded in a strict format. As we’ll see later in this specification, only the body field differs from this requirement. The take-away here is that a reserved field has a value content which follows a strict format.
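
As a rough illustration of the reserved-field rule, the sketch below checks the top-level keys of a parsed message. The names `RESERVED_FIELDS`, `REQUIRED_FIELDS` and `check_envelope` are hypothetical and not part of UMF; the field lists are collected from the field descriptions in this specification.

```python
# Top-level keys described as reserved fields in this specification.
RESERVED_FIELDS = frozenset({
    "mid", "rmid", "to", "forward", "from", "type", "version", "priority",
    "timestamp", "ttl", "body", "authorization", "for", "via", "headers",
    "timeout", "signature",
})
# Fields this specification marks as required.
REQUIRED_FIELDS = ("mid", "to", "from", "version", "timestamp")


def check_envelope(message: dict) -> list:
    """Return a list of problems found; an empty list means none were found."""
    problems = [f"missing required field: {name}"
                for name in REQUIRED_FIELDS if name not in message]
    # Anything that is not a reserved word belongs inside "body" instead.
    problems += [f"unknown top-level field: {name}"
                 for name in sorted(set(message) - RESERVED_FIELDS)]
    return problems
```

A check like this only looks at key names. It does not validate the format of each value.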

2.2.1 Mid field (Message ID)

Each message must contain an mid field with a universally unique identifier (UUID) as a value.

For example:

{
  "mid":"ef5a7369-f0b9-4143-a49d-2b9c7ee51117"
}

The mid field is required to uniquely identify a message across multiple applications, services, and replicated servers. Most programming environments provide a way to generate UUIDs.

Because UMF is an asynchronous message format, an individual message may be generated on either a server or client application. Each is required to create a UUID for use with a newly created message.

The mid field is a required field.
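
A minimal sketch of generating a mid value, assuming Python's standard `uuid` module and a random (version 4) UUID. The specification does not mandate a particular UUID version, and `new_mid` is a hypothetical helper name.

```python
import uuid


def new_mid() -> str:
    """Return a random (version 4) UUID in its hyphenated string form."""
    return str(uuid.uuid4())


message = {"mid": new_mid()}
```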

2.2.2 Rmid field (Refers to Message ID)

The Refers to Message ID (rmid) field specifies that a message refers to another message. The rmid field is helpful when a message requires a reply, or when a reply finalizes or changes an application's state machine. It is also useful in threaded conversations where a message may be sent in reply to another pre-existing message.

An important use-case for the rmid field is when it's desirable to trace the path an originating message took as it moved through a processing pipeline.

The rmid field is NOT a required field.
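
The reply use case could be sketched as follows. The helper `make_reply` is hypothetical, and swapping the to and from values of the original message is an assumption about how an application might address a reply, not something this specification requires.

```python
import uuid
from datetime import datetime, timezone


def make_reply(original: dict, body: dict) -> dict:
    """Build a new message whose rmid refers back to the original's mid."""
    return {
        "mid": str(uuid.uuid4()),      # every message gets its own mid
        "rmid": original["mid"],       # refers to the message being answered
        "to": original["from"],        # assumption: reply goes back to the sender
        "from": original["to"],
        "version": original["version"],
        "timestamp": datetime.now(timezone.utc).strftime("%Y-%m-%dT%H:%MZ"),
        "body": body,
    }
```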

2.2.3 To field (routing)

The to field is used to specify message routing. A message may be routed to other users, internal application handlers or other remote servers. Additionally, a message could be routed directly to an API server with careful use of the value specified in the to field.

The value of a to field is a colon-separated or forward-slash-separated list of sub names.

For example:

{
  :
  "to":"server:service"
}

specifies that the message should be routed to a server called server and a service called service. Another example would be a message which is intended to be routed to a specific user:

{
  :
  "to":"UID:143"
}

Sending a message to a service which exposes an API might look like this:

{
  :
  "to":"emailer:[post]/v1/send/email"
}

In the example above a UMF formatted message is being sent to a service called emailer and routed to the v1/send/email API endpoint. In a use case such as this, parsing the to field value split on : would yield an array with two values:

  • emailer
  • [post]/v1/send/email

This simplifies routing API calls since the first array element is the service name and the second element specifies the API route. Note that you may optionally specify an HTTP verb (such as get, post, put, delete, head, trace) inside opening and closing square brackets. In the example above an HTTP POST operation is intended. If the optional HTTP verb is missing then post is assumed. Care should be taken to encode the URL endpoint characters : and / to avoid mishandling when used in UMF message routing.

Where applicable, the HTTP body becomes the UMF message body.
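
The parsing described above might look like the sketch below. `ApiRoute` and `parse_api_route` are hypothetical names. The split is done on the first colon only, which assumes the service name itself contains no colon.

```python
from typing import NamedTuple


class ApiRoute(NamedTuple):
    service: str
    verb: str
    path: str


def parse_api_route(to: str) -> ApiRoute:
    """Split a value such as 'emailer:[post]/v1/send/email' into its parts."""
    service, sep, route = to.partition(":")
    if not sep or not service or not route:
        raise ValueError(f"not a service:route value: {to!r}")
    verb = "post"  # the specification assumes post when no verb is given
    if route.startswith("["):
        end = route.find("]")
        if end == -1:
            raise ValueError(f"unterminated HTTP verb in: {to!r}")
        verb = route[1:end].lower()
        route = route[end + 1:]
    return ApiRoute(service, verb, route)
```

For example, `parse_api_route("emailer:[post]/v1/send/email")` yields the service `emailer`, the verb `post` and the path `/v1/send/email`.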

The use of the colon (:) or forward slash (/) separator is intended to simplify parsing using the split function available in string parsing libraries. Colon and forward slash may be used in the same key value, and their use may differ across messages. Keep in mind that the value of a to field may not be a single entity but rather a broadcasting service which sends the message to various subscribers.

The to field is a required field and must be present in all messages.

2.2.4 Forward field (routing)

The forward field can be used to designate where a message should be sent to. Potential uses include:

  • processing a message by the service specified in the to field and then having that service send its results to the service specified in the forward field for additional processing.
  • transforming the contents (including body) of a message before forwarding it to another service. In this way, a service might act as a proxy for one or more additional services.

Example:

{
  :
  "to":"customer:registration:service",
  "forward":"customer:account:billing"
}

2.2.5 From field (routing)

The from field is used to specify a source during message routing. Like the to field, the value of a from field is a colon-separated or forward-slash-separated list of sub names.

For example:

{
  :
  "from":"server:service"
}

The above specifies that the message originated from a server called server and a service called service. Another example would be a message which was sent by a specific user:

{
  :
  "from":"UID:64"
}

The use of the colon (:) or forward slash (/) separator is intended to simplify parsing using the split function available in string parsing libraries. Colon and forward slash may not be used in the same key value. However, their use may differ across messages.

The from field is a required field and must be present in all messages.

2.2.6 Type field (message type)

The message type field describes a message as being of a particular classification.

{
  :
  "type":"event"
}

The type field is NOT a required field. If type is missing from a message, a type of msg is assumed:

{
  :
  "type":"msg"
}

An application developer may choose to implement a sub-message type inside of their message’s body object.

{
  :
  "body":{
    "type":"chat",
    "message":"How is it going?"
  }
}

However, the message will still be handled as a generic message (msg) by other servers and infrastructure components which may or may not inspect the contents of the custom body object. For this reason, it’s recommended that standard type fields be used whenever possible.

2.2.7 Version field

UMF messages have a version field that identifies the version format for a given message.

{
  :
  "version": "UMF/1.3"
}

The version value takes the form "UMF/major version . minor version . revision version". The version field should begin with UMF/ (in caps) to identify the JSON as a UMF compatible message.

"...the major number is increased when there are significant jumps in functionality, the minor number is incremented when only minor features or significant fixes have been added, and the revision number is incremented when minor bugs are fixed." - http://en.wikipedia.org/wiki/Software_versioning

Applications implementing UMF may choose to only consider the major and minor version segments, ignoring the revision version.

The version field is a required field, must be present in all messages and is required to include both a major and minor version number.
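
A small sketch of parsing the version value under the rules above. `UmfVersion` and `parse_version` are hypothetical names, and treating a missing revision as 0 is an assumption made for convenience.

```python
from typing import NamedTuple


class UmfVersion(NamedTuple):
    major: int
    minor: int
    revision: int


def parse_version(value: str) -> UmfVersion:
    """Parse 'UMF/major.minor[.revision]'; the revision defaults to 0."""
    prefix = "UMF/"
    if not value.startswith(prefix):
        raise ValueError(f"version must begin with {prefix!r}: {value!r}")
    parts = value[len(prefix):].split(".")
    # Major and minor are required; revision is optional.
    if not 2 <= len(parts) <= 3 or not all(p.isdigit() for p in parts):
        raise ValueError(f"expected major.minor[.revision]: {value!r}")
    numbers = [int(p) for p in parts]
    revision = numbers[2] if len(numbers) == 3 else 0
    return UmfVersion(numbers[0], numbers[1], revision)
```

An application that ignores the revision segment can compare only the first two elements, for example `parse_version(a)[:2] == parse_version(b)[:2]`.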

2.2.8 Priority field

UMF documents may include an optional priority field. If not present, normal priority is assumed. If present, priority field values are in the range of 10 (highest) to 1 (lowest).

Normal priority is valued at 5.

{
  :
  "priority": "10"
}

In addition to numeric values, the strings “low”, “normal” and “high” may be used to indicate message priorities:

{
  :
  "priority": "high"
}

The priority field is NOT a required field and defaults to “normal” priority if unspecified.
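
One way to normalize priority values is sketched below. The specification fixes normal at 5 but does not assign numbers to “low” and “high”; mapping them to 1 and 10 is an assumption of this sketch, as is the helper name `priority_level`.

```python
# Assumption: "low" and "high" map to the ends of the 1..10 range.
NAMED_PRIORITIES = {"low": 1, "normal": 5, "high": 10}


def priority_level(message: dict) -> int:
    """Return the message priority as an integer from 1 (lowest) to 10 (highest)."""
    text = str(message.get("priority", "normal")).strip().lower()
    if text in NAMED_PRIORITIES:
        return NAMED_PRIORITIES[text]
    level = int(text)  # raises ValueError for non-numeric strings
    if not 1 <= level <= 10:
        raise ValueError(f"priority out of range 1..10: {level}")
    return level
```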

2.2.9 Timestamp field

UMF supports a timestamp field which indicates when a message was sent. The format for a timestamp is ISO 8601, a standard date format. http://en.wikipedia.org/wiki/ISO_8601

When using an ISO 8601 formatted timestamp, UMF requires that the time be in the UTC timezone.

{
  :
  "timestamp":"2013-09-29T10:40Z"
}

Most programming environments include support for converting local machine time to UTC.

There are a number of reasons why message timestamps are useful:

  • Communication can be ordered chronologically during display, searching, and storage.
  • Messages past a set time can be handled differently, including being purged from a system.

The timestamp field is a required UMF field.
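
As a sketch, a UTC timestamp in the minute-precision form used by the examples in this document can be produced with Python's `datetime` module. The helper name `utc_timestamp` is hypothetical, and the choice of minute precision simply mirrors the examples. ISO 8601 also permits seconds.

```python
from datetime import datetime, timedelta, timezone
from typing import Optional


def utc_timestamp(moment: Optional[datetime] = None) -> str:
    """Format a timezone-aware datetime as ISO 8601 in UTC, e.g. 2013-09-29T10:40Z."""
    moment = moment or datetime.now(timezone.utc)
    if moment.tzinfo is None:
        raise ValueError("naive datetime: the timezone is unknown")
    return moment.astimezone(timezone.utc).strftime("%Y-%m-%dT%H:%MZ")


# A local time two hours ahead of UTC is converted before formatting.
local = datetime(2013, 9, 29, 12, 40, tzinfo=timezone(timedelta(hours=2)))
print(utc_timestamp(local))  # 2013-09-29T10:40Z
```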

2.2.10 Ttl field (time to live)

The ttl field is used to specify how long a message may remain alive within a system. The value of this field is an amount of time specified in seconds.

{
  :
  "ttl": "300"
}

The ttl field is optional and if not present in a UMF document the default is a ttl which never expires.
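
A possible expiry check is sketched below, assuming the ttl is counted from the message's timestamp and that the timestamp uses the minute-precision form shown in this document. `is_expired` is a hypothetical helper. A real system may measure time to live from a different starting point.

```python
from datetime import datetime, timedelta, timezone


def is_expired(message: dict, now: datetime) -> bool:
    """Return True when timestamp + ttl seconds lies before `now`."""
    ttl = message.get("ttl")
    if ttl is None:
        return False  # no ttl means the message never expires
    sent = datetime.strptime(message["timestamp"], "%Y-%m-%dT%H:%MZ")
    sent = sent.replace(tzinfo=timezone.utc)  # UMF timestamps are in UTC
    return now > sent + timedelta(seconds=int(ttl))
```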

2.2.11 Body field (application level data)

The body field is used to host an application-level custom object. This is where an application may define a message content which is meaningful in the context of its application.

{
  "mid": "ef5a7369-f0b9-4143-a49d-2b9c7ee51117",
  "to": "uid:56",
  "from": "game:store",
  "version": "UMF/1.3",
  "timestamp": "2013-09-29T10:40Z",
  "body": {
    "type": "store:purchase",
    "itemID": "5x:winnings:multiplier",
    "expiration": "2014-02-10T10:40Z"
  }
}

In the example above a user receives confirmation of a purchase (power-up item) from the game store, the items can then be added to the user’s inventory.

2.2.11.1 Overriding UMF restricted key / value pairs

As mentioned earlier, UMF restricted fields may only be used in accordance with this specification. However, an application may override UMF restricted fields by including those fields in its custom body object.

The following are potential use-cases:

  • The application may choose to include its own sub-routing and override the to and from fields.
  • It might be desirable to have an application-level message version.
  • The application may require its own message id (mid) format.

The application level code is free to override the meaning of UMF restricted keys by looking inside its body object for potential overrides.

2.2.11.2 Sending binary data

Binary or encrypted / encoded messages may be sent via UMF by using a JSON compatible data converter such as Base64 http://en.wikipedia.org/wiki/Base64

When using a converter, the base format should be indicated in a user-level field such as “contentType” whose value should be a standard Internet Media Type (formerly known as a MIME type) http://en.wikipedia.org/wiki/Internet_media_type

{
  "mid": "ef5a7369-f0b9-4143-a49d-2b9c7ee51117",
  "to": "uid:134",
  "from": "uid:56",
  "version": "UMF/1.3",
  "timestamp": "2013-09-29T10:40Z",
  "body": {
    "type": "private:message",
    "contentType": "text/plain",
    "base64": "SSBzZWUgeW91IHRvb2sgdGhlIHRyb3VibGUgdG8gZGVjb2RlIHRoaXMgbWVzc2FnZS4="
  }
}

In this way, audio and images may be transmitted via UMF.
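
The encoding step could be sketched like this with Python's standard `base64` module. The helper names are hypothetical. The body keys `type`, `contentType` and `base64` follow the example above, though they are application-level choices rather than reserved UMF fields.

```python
import base64


def binary_body(payload: bytes, content_type: str, body_type: str) -> dict:
    """Wrap raw bytes in a JSON-safe body object using Base64."""
    return {
        "type": body_type,
        "contentType": content_type,
        "base64": base64.b64encode(payload).decode("ascii"),
    }


def decode_body(body: dict) -> bytes:
    """Recover the original bytes from a body built by binary_body."""
    return base64.b64decode(body["base64"])
```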

2.2.11.3 Sending multiple application messages

An application may, for efficiency reasons, decide to bundle multiple sub-messages inside of a single UMF document. The recommended method of doing this is to define a body object which contains one or more sub-messages.

{
  "mid": "ef5a7369-f0b9-4143-a49d-2b9c7ee51117",
  "to": "uid:134",
  "from": "chat:room:14",
  "version": "UMF/1.3",
  "timestamp": "2013-09-29T10:40Z",
  "body": {
    "type": "chat:messages",
    "messages": [
      {
        "from": "moderator",
        "text": "Susan welcome to chat Nation NYC",
        "ts": "2013-09-29T10:34Z"
      },
      {
        "from": "uid:16",
        "text": "Rex, you are one lucky SOB!",
        "ts": "2013-09-29T10:30Z"
      },
      {
        "from": "uid:133",
        "text": "Rex you're going down this next round",
        "ts": "2013-09-29T10:31Z"
      }
    ]
  }
}

In the example above messages consists of an array of objects.

2.2.12 Authorization field

The authorization field is used to pass an HTTP authorization value or authentication token.

{
  "mid": "ef5a7369-f0b9-4143-a49d-2b9c7ee51117",
  "to": "uid:56",
  "from": "game:store",
  "version": "UMF/1.3",
  "authorization": "Bearer eyJhbGciOiJIUzI1NiIsInR5cCI6IkpXVCJ9.eyJzdWIiOiIxMjM0NTY3ODkwIiwibmFtZSI6IkpvaG4gRG9lIiwiYWRtaW4iOnRydWV9.TJVA95OrM7E2cBab30RMHrHDcEfxjoYZgeFONFh7HgQ",
  "timestamp": "2013-09-29T10:40Z",
  "body": {
    "type": "store:purchase",
    "itemID": "5x:winnings:multiplier",
    "expiration": "2014-02-10T10:40Z"
  }
}

In the example above the authorization field contains a JSON Web Token.

2.2.13 For - on behalf of

The for field is used to indicate who a message is being sent on behalf of. This helps support the use-case where a message may not be sent by an actual client but instead created by a service on behalf of a client. In this case, routable information needs to be sent to help associate the entity which may ultimately require notification.

The value of the for field is left up to the host application, but the use of unique hashes is recommended.

2.2.14 Via - sent through

The presence of a via field indicates that the message was sent via an intermediary such as a router. When sending replies, it's often useful to send the reply to the intermediary for routing.

The format of the via field should be the same as to and from fields.

2.2.15 Headers - protocol headers

When necessary, communication protocol headers may be sent inside of a UMF message.

{
  "mid": "ef5a7369-f0b9-4143-a49d-2b9c7ee51117",
  "to": "uid:123",
  "from": "uid:56",
  "version": "UMF/1.4.4",
  "headers": {
    "Content-Type": "text/html"
  },
  "body": {}
}

The headers field should contain an object consisting of key/value pairs.

2.2.16 Timeout - timeout recommendation

Messages can carry a recommended timeout value which can be used at the discretion of the application transport and routing layer. For example, an HTTP request may use this timeout to abort the wait for a response after some time.

{
  "mid": "ef5a7369-f0b9-4143-a49d-2b9c7ee51117",
  "to": "uid:123",
  "from": "uid:56",
  "version": "UMF/1.4.5",
  "timeout": 5,
  "headers": {
    "Content-Type": "text/html"
  },
  "body": {}
}

The timeout value is a number representing a number of seconds. Sub-second timeout isn't supported.

2.2.17 Signature field

Messages may be signed with an HMAC signature to help ensure that a message was created by a known source.

{
  "mid": "ef5a7369-f0b9-4143-a49d-2b9c7ee51117",
  "to": "uid:123",
  "from": "uid:56",
  "version": "UMF/1.4.6",
  "signature": "c0fa1bc00531bd78ef38c628449c5102aeabd49b5dc3a2a516ea6ea959d6658e",
  "body": {}
}

The creation of a signed UMF message is accomplished by first creating a UMF message and obtaining a signature using a cryptographic library and algorithm such as sha256. Once the signature is obtained, it can be added to the UMF message using the signature field. The receiving end of the UMF message would then remove the signature from the UMF message and perform the same HMAC pass using the shared secret. The message is considered valid if the resulting signatures match.
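
The sign and verify steps above can be sketched with Python's standard `hmac` and `hashlib` modules. This specification does not define how a message is serialized before hashing, so the canonical form used here (sorted keys, no insignificant whitespace, UTF-8) is an assumption that both sides would have to agree on. The function names are hypothetical.

```python
import hashlib
import hmac
import json


def _canonical_bytes(message: dict) -> bytes:
    """Serialize the message without its signature in a repeatable form."""
    unsigned = {k: v for k, v in message.items() if k != "signature"}
    # Assumption: sorted keys and compact separators as the agreed serialization.
    return json.dumps(unsigned, sort_keys=True, separators=(",", ":")).encode("utf-8")


def sign(message: dict, secret: bytes) -> dict:
    """Return a copy of the message with an HMAC-SHA256 signature field added."""
    digest = hmac.new(secret, _canonical_bytes(message), hashlib.sha256).hexdigest()
    return {**message, "signature": digest}


def verify(message: dict, secret: bytes) -> bool:
    """Recompute the HMAC over the unsigned message and compare in constant time."""
    received = message.get("signature")
    if not isinstance(received, str):
        return False
    expected = hmac.new(secret, _canonical_bytes(message), hashlib.sha256).hexdigest()
    return hmac.compare_digest(received.encode("utf-8"), expected.encode("utf-8"))
```

Using `hmac.compare_digest` rather than `==` avoids leaking timing information during the comparison.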

3. Use inside of HTTP

UMF documents may be sent via HTTP requests and responses. Proper use requires setting the HTTP content header, Content-Type: application/json

POST http://server.com/api/v1/message HTTP/1.1
Host: server.com
Content-Type: application/json; charset=utf-8
Content-Length: {length}

{
  "mid": "ef5a7369-f0b9-4143-a49d-2b9c7ee51117",
  "rmid": "66c61afc-037b-4229-ace4-5ec4d788903e",
  "to": "uid:123",
  "from": "uid:56",
  "type": "dm",
  "version": "UMF/1.3",
  "priority": "10",
  "timestamp": "2013-09-29T10:40Z",
  "body": {
    "message": "How is it going?"
  }
}

4. Peer-to-Peer Communication

UMF documents may be used to exchange P2P messages between distributed services. One example of this would be a service which sends its application health status to a monitoring and data aggregation service.

For example:

{
  "mid": "ef5a7369-f0b9-4143-a49d-2b9c7ee51117",
  "to": "stats:server",
  "from": "chat:server:23",
  "version": "UMF/1.3",
  "priority": "10",
  "timestamp": "2013-09-29T10:40Z",
  "body": {
    "totalRooms": "200",
    "averageUsersPerRoom": "13",
    "averageRoomStay": "1200"
  }
}

In the example above, a chat server is sending a message to the stats:server indicating its stats at UTC 2013-09-29T10:40Z.

5. Infrastructure considerations

5.1 Message storage

UMF is designed with distributed servers in mind. The use of mid’s (unique message IDs) allows messages to be stored by their message id in servers such as Redis, MongoDB and in other data stores / databases.

5.2 Message routing

The UMF to and from fields support message routing. The implementation of message routers is deferred to UMF implementers. The use of the colon and forward slash separators is intended to allow routers to easily parse routes for target handlers.

5.2.1 Message forwarding

It’s possible to route a message between (through) servers by implementing message forwarding.

For example:

{
  :
  "mid": "ef5a7369-f0b9-4143-a49d-2b9c7ee51117",
  "to": "router:uk-router:chat:room:12"
}

The message above might be sent to a router (service) which then parses the to field and realizes that the message is intended for a UK server. So it forwards the message to the uk-router which in turn sends the message to a server which hosts room 12.
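
How a router decides where to forward is left to implementers. One hypothetical approach, sketched below, is for each router to strip its own name from the front of the to value and hand the remainder to the next segment. The function `next_hop` and this stripping convention are assumptions, not part of UMF.

```python
from typing import Tuple


def next_hop(to: str, own_name: str) -> Tuple[str, str]:
    """Return (next router or service, remaining route) for a colon-separated route."""
    segments = to.split(":")
    if segments[0] != own_name:
        raise ValueError(f"message is not addressed to {own_name!r}: {to!r}")
    if len(segments) < 2:
        raise ValueError(f"no onward route in: {to!r}")
    # Drop our own name; the next segment names where to forward.
    return segments[1], ":".join(segments[1:])
```

With the example above, `next_hop("router:uk-router:chat:room:12", "router")` returns `uk-router` together with the remaining route `uk-router:chat:room:12`.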

6. Short form syntax

UMF is being used in IoT (Internet of Things) applications where message sizes need to remain small. To allow UMF to be used in resource-constrained environments, a short form of UMF is offered.

For each UMF keyword an abbreviated alternative may be used:

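| Keyword | Abbreviation |
| --- | --- |
| authorization | aut |
| body | bdy |
| for | for |
| forward | fwd |
| from | frm |
| headers | hdr |
| mid | mid |
| priority | pri |
| rmid | rmi |
| signature | sig |
| timeout | tmo |
| timestamp | ts |
| to | to |
| ttl | ttl |
| type | typ |
| version | ver |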

This is an example of a message in short form:

{
  "mid": "2b9c7ee51117",
  "to": "uid:123",
  "frm": "uid:56",
  "ver": "UMF/1.4",
  "ts": "2013-09-29T10:40Z",
  "bdy": {
    "msg": "How is it going?"
  }
}

When using UMF in IoT applications mid and rmid fields should use short hashes where possible. Also avoid using UMF fields which your application may not require such as priority and type.

Also, it's assumed that messages are transmitted as strings without carriage returns and line feeds to reduce transmission sizes.

{"mid": "2b9c7ee51117","to": "uid:123","frm": "uid:56","ver": "UMF/1.4","ts": "2013-09-29T10:40Z","bdy": {"msg": "How is it going?"}}

The use of gzip compression also greatly reduces message sizes.
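
Putting the short form, compact serialization and compression together might look like the sketch below. The abbreviation map is copied from the table in this section; the function names are hypothetical. Only top-level keys are abbreviated, and keys without a listed abbreviation are passed through unchanged.

```python
import gzip
import json

# Keyword to abbreviation, as listed in the short form table.
ABBREVIATIONS = {
    "authorization": "aut", "body": "bdy", "for": "for", "forward": "fwd",
    "from": "frm", "headers": "hdr", "mid": "mid", "priority": "pri",
    "rmid": "rmi", "signature": "sig", "timeout": "tmo", "timestamp": "ts",
    "to": "to", "ttl": "ttl", "type": "typ", "version": "ver",
}


def to_short_form(message: dict) -> dict:
    """Rename top-level keys to their abbreviations; the body is left untouched."""
    return {ABBREVIATIONS.get(key, key): value for key, value in message.items()}


def to_wire(message: dict) -> bytes:
    """Serialize the short form on one line with no insignificant whitespace."""
    return json.dumps(to_short_form(message), separators=(",", ":")).encode("utf-8")


def to_compressed_wire(message: dict) -> bytes:
    """Gzip the compact serialization for transports that accept binary payloads."""
    return gzip.compress(to_wire(message))
```

Because gzip adds a fixed header and trailer, it is worth measuring whether compression actually shrinks very small messages before enabling it.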