- Title: hybrid-tx-building
- Authors: @lehnberg
- Start date: Mar 19, 2020
- RFC PR: Edit if merged: mimblewimble/grin-rfcs#0000
- Tracking issue: [Edit if merged with link to tracking github issue]
This RFC establishes a baseline for transaction building in Grin between users and services, through the introduction of a new 'hybrid' method. Transactions are initiated using two standardized formats, QR code and blob, and can then be completed over Tor or clearnet using https.
To support this, the transaction slate data at each step in the process is redesigned and minified.
The introduction of this method improves security as it does not support unsafe communication methods, and it improves usability as friction and potential drop-off points are reduced in the transaction building flow.
To create valid transactions in Grin, a round-trip of communication between transacting parties is required ("transaction building"). Transaction data ("transaction slates") is exchanged. The round-trip can be carried out in either direction, sender initiated ("default flow") or recipient initiated ("invoice flow").
Default flow:

- Sender creates a transaction for a certain amount, locks spending outputs, sends data to Recipient (trip 1).
- Recipient creates destination outputs, produces a response, returns data to Sender (trip 2).
- Sender finalizes and broadcasts the transaction.

Invoice flow:

- Recipient creates a payment request (an invoice) for a certain amount, creates destination outputs, sends data to Sender (trip 1).
- Sender locks spending outputs, produces a response, returns data to Recipient (trip 2).
- Recipient finalizes and broadcasts the transaction.
The transaction slates up until this point (v0-3) have been serialized as JSON objects, making them easy to read for both humans and machines. The structure of the data passed at each step is described in detail in the grin-wallet documentation.[1] The intention so far has been for slates to contain a complete picture of the transaction state at every stage. It was not a priority to limit the information shared or to minimize the payload transmitted in each communication round.
After more than a year of mainnet transactions between users and services like mining pools and exchanges, observations include the following pain points.
Often behind NAT or a firewall, users can only reliably be expected to initiate connections. This complicates receiving payments from services. Users often end up having to tunnel through trusted third party services[2] that relay their communication, which introduces privacy and security concerns.
→ How can users behind NAT and Firewalls be better supported?
It's challenging to run an http listener service on mobile wallets, making it difficult to receive payments. While file-based transacting is possible, files are not always handled well on mobile phone OSes, and it can be difficult to exchange data between makes and models without relying on third party services.
→ How can the experience for users on mobile devices be improved?
Current http(s) based transaction methods expect synchronous communication when no third party relaying service is used. While it can be expected of businesses to be constantly available and listening for inbound connections, it's not realistic to expect the same from end users. This often leads to failures in the transaction building process, resulting in poor user experience when outputs become locked for transactions that do not finalize.
→ How can the requirement for fully synchronous communication be relaxed?
Since http communication is supported as a transaction building method, many services default to using it, as it is straightforward to implement. This unencrypted communication can easily be intercepted or spoofed and introduces considerable security risks. The argument for deprecating the http method for end users has been made for a long time[3], but has so far been unconvincing as no good enough alternative has been presented.
→ What alternative can encourage services to replace http as their default?
Support for transaction building over the Tor overlay network was introduced as part of RFC#0010.[4] In addition to disguising the IP addresses of the transacting parties, Tor solves the problem of accessing users behind NAT or Firewalls, and is end-to-end encrypted by design. It is therefore a good solution for many users, especially those with formidable threat models. It is not a good solution for all however, as it is blocked in some regions and/or may not be culturally acceptable for use by companies and end users. While there are methods to bypass the technological restrictions, they add complexity and are not guaranteed to always work. Furthermore, although Tor access on mobile is possible, it is far from trivial and makes mobile wallet development more complicated. More generally, it creates an external dependency on Tor being operational in order for the transaction building process to function. For these reasons, there's a need to fall back to other methods when Tor is unavailable or not suitable.[5]
→ Tor is a good solution that should be used when it is appropriate. What is a good fallback alternative that works in a wider range of conditions?
The pain points above can be addressed by handling the two communication trips in the transaction building process differently. Standard rules and formats are enforced for the first trip where the slate is passed as a QR code or a blob of text. This slate then includes instructions for completing the second trip over https.
End users are assumed to use mobile or desktop wallets, being behind NAT or Firewalls, and to come online intermittently.
Online services (such as exchanges, mining pools, or online stores) are assumed to be available and able to listen for inbound connections as required.
Paying an online store:

- On checkout, the store generates a QR code and presents it in the browser window on the user's laptop.
- User scans the QR code with their mobile camera. A URI triggers the Grin mobile wallet, which processes the message and displays a confirmation window with the amount and destination URL. User taps to confirm.
- Store receives the response, finalizes the transaction, and broadcasts it.

Depositing to an exchange:

- At the exchange deposit screen, user makes a deposit request for a given amount. The exchange generates a QR code and presents it in the browser window on the user's laptop.
- User scans the QR code with their mobile camera. A URI triggers the Grin mobile wallet, which processes the message and displays a confirmation window with the amount and destination URL. User taps to confirm.
- Exchange receives the response, finalizes the transaction, and broadcasts it.

Withdrawing from a mining pool:

- User logs into the pool, makes a withdrawal request for a certain amount. The mining pool generates a blob that is presented in the browser window of the user.
- User copies the blob and pastes it into their wallet. Wallet processes the message and displays a confirmation window with the amount and the sender URL. User taps to confirm.
- Mining pool receives the response, finalizes the transaction and broadcasts it.
- A universal format for transactions that is flexible. The hybrid approach supports Tor and clearnet communication interchangeably in trip 2, following an identical process that is seamless to the user. It can be extended to support other methods as required without altering the flows or formats.
- Bypasses NAT & Firewall restrictions. The QR code or blob in trip 1 is delivered through an existing channel, and since this includes instructions for opening an outbound connection to complete trip 2, NAT & Firewall issues are not relevant.
- Asynchronous. Communication in the two trips is broken up, and the transacting party that receives the slate as a QR code or blob does not need to respond immediately. The initiator is however still required to listen on the designated trip 2 response address.
- Mobile friendly. A mobile wallet user can scan the QR code with their phone, or quickly copy/paste the blob from a chat message, and open an outbound connection to complete the transaction without having to run a listener at any time.
- An improved user experience.
  - Fewer friction points. The notion of addresses can be abstracted, and the process can become much more streamlined: Users can transact by scanning a code with their phone and confirming with a single tap.
  - Fewer drop-off points. This method is less error-prone, as it makes no assumptions of wallet connectivity in trip 1.
  - Reduced locked output limbo. When a service is the initiating party, spending outputs of end users stay locked for a shorter duration.
  - More descriptive errors. In those events where transacting still fails, wallet developers can use the additional information provided in the slate of trip 1 to present the user with better descriptive error messages.
- Encrypted communication by default. By enforcing https in the response address, encrypted communication becomes the standard. As the hybrid method improves usability over http transacting, it stands a chance to become adopted as a new default method to transact with services.
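
To illustrate how a wallet might treat Tor and clearnet interchangeably in trip 2 while still enforcing https, here is a minimal sketch. It assumes that Tor targets are expressed as `.onion` hostnames; the function name `pick_transport` and the returned labels are hypothetical and not part of any Grin wallet API.

```python
from urllib.parse import urlparse


def pick_transport(reply_to: str) -> str:
    """Choose how to deliver the trip 2 response for a given reply-to address.

    Hypothetical helper: returns "tor" or "clearnet", or raises ValueError
    when the address is not an https URL.
    """
    parsed = urlparse(reply_to)
    host = parsed.hostname or ""
    # Unencrypted http (or any other scheme) is rejected outright.
    if parsed.scheme != "https" or not host:
        raise ValueError("reply-to address must be an https URL")
    # Assumption: onion services are identified by their hostname suffix.
    return "tor" if host.endswith(".onion") else "clearnet"
```

The user-facing flow is the same in both cases; only the way the outbound connection is opened differs.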
The main contribution of this RFC is the introduction of a novel way to pass data in trip 1 of transaction building.
Slate information is packaged into two standardized formats that are interchangeable:
- A QR code image, which can be read by the camera of a mobile device, making it cross-device portable: The slate can be presented on one device and read from a different one.
- A blob, which is a string of characters that can be copy/pasted and shared over any medium. It's similar to the file exchange method, without requiring the handling of an actual file.
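
As a rough illustration of what a copy/paste-friendly blob could look like, the sketch below compresses a slate and encodes it as URL-safe base64. This is only one possible encoding, chosen here for illustration; this RFC does not yet define the blob format, and the function names are hypothetical.

```python
import base64
import json
import zlib


def slate_to_blob(slate: dict) -> str:
    """Serialize a slate dict into a compact text blob (illustrative encoding)."""
    # Compact JSON: no whitespace, stable key order.
    raw = json.dumps(slate, separators=(",", ":"), sort_keys=True).encode("utf-8")
    # Compress, then use URL-safe base64 so the result survives chat apps and URLs.
    return base64.urlsafe_b64encode(zlib.compress(raw)).decode("ascii")


def blob_to_slate(blob: str) -> dict:
    """Reverse of slate_to_blob."""
    raw = zlib.decompress(base64.urlsafe_b64decode(blob.encode("ascii")))
    return json.loads(raw.decode("utf-8"))
```

The same string could also serve as the payload of the QR code, which would keep the two formats interchangeable.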
The slate in trip 1 includes an instruction to the wallet of the other party for how to communicate in trip 2 over an https address in order to complete the round trip, alongside an optional TTL for how long this address will be valid.
Default flow:

- Sender creates a transaction for a certain amount, locks spending outputs, chooses a reply-to https address, creates a QR code or blob that is sent to Recipient (trip 1).
- Recipient processes the QR code or blob, creates destination outputs, produces a response, responds to Sender at the designated address (trip 2).
- Sender finalizes and broadcasts the transaction.

Invoice flow:

- Recipient creates a payment request (an invoice) for a certain amount, creates destination outputs, chooses a reply-to https address, creates a QR code or blob that is sent to Sender (trip 1).
- Sender processes the QR code or blob, locks spending outputs, produces a response, responds to Recipient at the designated address (trip 2).
- Recipient finalizes and broadcasts the transaction.
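
The check a responding wallet performs before starting trip 2 in the flows above can be sketched as follows. All field names (`slate_id`, `amount`, `reply_to`, `expires_at`) are hypothetical placeholders, and representing the optional TTL as a Unix timestamp is an assumption; the actual slate fields are not defined in this text.

```python
import time
from dataclasses import dataclass
from typing import Optional
from urllib.parse import urlparse


@dataclass
class TripOneSlate:
    # Hypothetical field names, not taken from the slate specification.
    slate_id: str
    amount: int                       # amount in the smallest unit (assumed)
    reply_to: str                     # https address to use for trip 2
    expires_at: Optional[int] = None  # optional TTL as a Unix timestamp (assumed)


def check_reply_target(slate: TripOneSlate, now: Optional[int] = None) -> str:
    """Return the reply-to address, or raise ValueError if it must not be used."""
    parsed = urlparse(slate.reply_to)
    if parsed.scheme != "https" or not parsed.hostname:
        raise ValueError("reply-to address must be an https URL")
    now = int(time.time()) if now is None else now
    # A missing TTL means the address has no stated expiry.
    if slate.expires_at is not None and now > slate.expires_at:
        raise ValueError("reply-to address has expired")
    return slate.reply_to
```

Failing early here, before any outputs are locked or created, is what would let a wallet show a descriptive error instead of leaving a transaction half built.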
The technical details of this RFC can be broken down into three distinct efforts:
- Step 1: Redesign the transaction slate to ensure that only required data is passed at every stage in the communication process, and add new fields for the hybrid method.
- Step 2: Change the serialization of transaction slates to make the slate footprint as small as possible.
- Step 3: Define two standards for creating the first trip of communication: a QR code format and a text blob format.
- Only required information included at each phase
- Treat defaults as implicit
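
One way to read the two principles above is that a field is omitted whenever its value equals an agreed default, and the receiver fills the defaults back in. The sketch below shows that idea only; the field names and default values in `DEFAULTS` are hypothetical and do not describe the actual slate.

```python
# Hypothetical defaults; the real slate fields and values are not defined here.
DEFAULTS = {"version": 4, "num_participants": 2, "lock_height": 0, "ttl": None}


def minify(slate: dict) -> dict:
    """Drop every field whose value equals its agreed default."""
    return {
        key: value
        for key, value in slate.items()
        if key not in DEFAULTS or DEFAULTS[key] != value
    }


def expand(minified: dict) -> dict:
    """Restore the implicit defaults on the receiving side."""
    return {**DEFAULTS, **minified}
```

For example, a slate that only deviates from the defaults in its amount would shrink to a single field on the wire and expand back to the full structure on receipt.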
Stage | Standard | Invoice |
---|---|---|
Init | | |
Slate trip 1 | | |
Receive | | |
Slate trip 2 | | |
Finalize | | |
- Adds complexity?
- Does it solve the problem well?
Tbd
- Tor transacting only?
- File transacting only?
Tbd
- Grinbox
- Does this make it easier to transact user <-> user? How do two mobile users transact?
- What happens to file exchange? Should we consider supporting it in trip 2?
- What are the implications of adding TTL? Are there negative consequences?
- Where does the keybase transaction building method fit in all this? Should it be deprecated?
- Open ended transaction payment request
- Payment proofs for invoice flows
- NFC standard
- End to end encrypt blobs / QR codes? Password protect them?
[1]: https://github.com/mimblewimble/grin-wallet/blob/master/doc/transaction/basic-transaction-wf.png
[2]: Examples of such tunnels include ngrok, Grin++ Relay (URL needed), and Hedwig.
[3]: mimblewimble/grin-wallet#66
[4]: https://github.com/mimblewimble/grin-rfcs/blob/master/text/0010-online-transacting-via-tor.md
[5]: For a detailed discussion, see https://github.com/j01tz/network-protocol-obfuscation/blob/master/grin_obfuscation.md