Decouple serialization, frame splitting, and compression in protocol #5150

Open: 4 commits to merge into base branch main

Commits on Aug 1, 2021

  1. Split large header in comms

    Today we try to split up large messages in comms.
    This is useful in a few situations:
    
    1.  Websockets, which often pass frames through middleware that requires
        small messages
    2.  TLS, which fails on some OpenSSL versions with frames longer than a
        C int can represent
    
    We correctly cut up data frames into smaller pieces to address these
    issues.  However, we don't apply the same logic to the header frame,
    which may still contain very large bytestrings.  This commit adds a
    workaround in protocol dumps/loads that detects this case and splits
    the header frame if necessary.
    
    It works, but it's not very smooth.  I would prefer that in the future
    we think about what a proper header should look like and ensure that it
    contains no user data.  In the meantime this should help.
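
A minimal sketch of the header-splitting idea described above, not the code from this commit. The names `split_frame`, `dumps_frames`, `loads_frames`, and `MAX_FRAME`, and the fixed-size preamble that records the number of header pieces, are all assumptions made for illustration.

```python
from __future__ import annotations

import struct

MAX_FRAME = 8  # hypothetical, deliberately tiny limit so the example is easy to follow


def split_frame(frame: bytes, limit: int = MAX_FRAME) -> list[bytes]:
    """Cut one frame into pieces of at most `limit` bytes."""
    return [frame[i:i + limit] for i in range(0, len(frame), limit)] or [b""]


def dumps_frames(header: bytes, data_frames: list[bytes],
                 limit: int = MAX_FRAME) -> list[bytes]:
    """Split an oversized header and record how many pieces it became."""
    header_parts = split_frame(header, limit)
    # Assumed layout: a fixed 8-byte preamble holding the header piece count.
    preamble = struct.pack("<Q", len(header_parts))
    return [preamble, *header_parts, *data_frames]


def loads_frames(frames: list[bytes]) -> tuple[bytes, list[bytes]]:
    """Rejoin the header pieces and return (header, data_frames)."""
    (n_parts,) = struct.unpack("<Q", frames[0])
    header = b"".join(frames[1:1 + n_parts])
    return header, list(frames[1 + n_parts:])
```

With this layout the receiver only needs the small preamble to know how many of the following frames belong to the header, so no single frame on the wire exceeds the limit even when the header carries a large bytestring.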
    mrocklin committed Aug 1, 2021
    b391620
  2. Decouple serialization, frame splitting, and compression in protocol

    Currently compression and frame splitting are tightly interwoven with
    traversing through messages.  This can be efficient, but results in a
    complex system where it's hard to reason about when things get split or
    compressed (indeed, this led to a difficult-to-track-down bug with
    frame splitting).
    
    This commit separates these processes into three separate stages:
    
    1.  Serialize all objects into frames
    2.  Split large frames
    3.  Compress compressible frames
    
    This results in a much more uniform application of splitting and
    compressing.  However, this comes with a couple of undesired effects.
    
    1.  We add a new header if either splitting or compressing has occurred
    2.  We no longer avoid decompression when we don't want to deserialize
    
    There is probably a clean way to achieve most/all of our goals here.
    I wanted to push this up to start this conversation.
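
The three stages listed above can be sketched roughly as follows. This is an illustration under assumptions, not the implementation in this commit: `pickle` and `zlib` stand in for the real serializers and compressors, the function names and the `limit` and `min_size` parameters are hypothetical, and the extra metadata frame is always written here, whereas the commit adds it only when splitting or compression has occurred.

```python
from __future__ import annotations

import pickle
import zlib


def serialize(objs: list) -> list[bytes]:
    """Stage 1: turn every object into exactly one frame."""
    return [pickle.dumps(obj) for obj in objs]


def split(frames: list[bytes], limit: int) -> tuple[list[bytes], list[int]]:
    """Stage 2: cut large frames; remember how many pieces each one became."""
    out, counts = [], []
    for frame in frames:
        parts = [frame[i:i + limit] for i in range(0, len(frame), limit)] or [b""]
        counts.append(len(parts))
        out.extend(parts)
    return out, counts


def compress(frames: list[bytes], min_size: int) -> tuple[list[bytes], list[bool]]:
    """Stage 3: compress a frame only if it is large enough and actually shrinks."""
    out, flags = [], []
    for frame in frames:
        packed = zlib.compress(frame) if len(frame) >= min_size else frame
        used = len(packed) < len(frame)
        out.append(packed if used else frame)
        flags.append(used)
    return out, flags


def dumps(objs: list, limit: int = 64, min_size: int = 16) -> list[bytes]:
    frames = serialize(objs)
    frames, counts = split(frames, limit)
    frames, flags = compress(frames, min_size)
    # Extra metadata frame describing what stages 2 and 3 did.
    meta = pickle.dumps({"counts": counts, "compressed": flags})
    return [meta, *frames]


def loads(frames: list[bytes]) -> list:
    meta = pickle.loads(frames[0])
    # Every compressed frame is decompressed before anything is deserialized.
    body = [zlib.decompress(f) if c else f
            for f, c in zip(frames[1:], meta["compressed"])]
    objs, pos = [], 0
    for n in meta["counts"]:
        objs.append(pickle.loads(b"".join(body[pos:pos + n])))
        pos += n
    return objs
```

Because each stage operates on a flat list of frames, splitting and compression apply uniformly to every frame. The sketch also shows both costs named above: a metadata frame is needed to undo stages 2 and 3, and `loads` must decompress before it can decide anything about deserialization.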
    mrocklin committed Aug 1, 2021
    bd9c22a

Commits on Aug 2, 2021

  1. cleanup dead code

    mrocklin committed Aug 2, 2021
    f4ca371

Commits on Aug 20, 2021

  1. c0ddfb7