Ross Light edited this page Mar 25, 2016 · 2 revisions

Author: @zombiezen

This page describes a large backward-compatible API change that was made on 2016-03-25.

Rationale

In January 2016, @glycerine observed that v2 was underperforming on benchmarks because of ~10X allocations during unmarshaling. While some of the allocations came from Message/Arena initialization, the biggest offender was the conversion of values to the capnp.Pointer interface type. Since that type is used in generated code, there is no way to optimize out these allocations without affecting generated code and therefore the public API. Because the performance improvement is dramatic and requires little application change, I (@zombiezen) deemed the added API complexity worth it.

Effect for Library Users

This change is backward-compatible. The capnp.Pointer type and functions that use it still exist in the capnp package. Most application code shouldn't be directly using this type, since it primarily exists for generated code support. However, if your application provides generic functions over Cap'n Proto objects, they should be rewritten to use the new Ptr type.

Migrating Code

Since upgrading to the new code path requires type changes, it is difficult to automate. In general, use the capnp.Ptr type instead of the capnp.Pointer type, and follow the deprecation notices to see what you should use instead. As long as your application code doesn't directly use capnp.Pointer, which is the case for most applications, you don't need to make any changes: just regenerate the code for your schemas.

Details

In Cap'n Proto, there are some operations that apply to all types of Cap'n Proto pointers. These can be naturally represented as a Go interface. Each of the three represented Cap'n Proto pointer types (Struct, List, and Interface) implements this interface with value receivers. When these concrete types are passed around, they are copied by value on the stack. However, these structs are larger than a pointer, so as described in the linked article, when they are converted to the interface type, they are copied to the heap. This increases GC pressure and has a noticeable effect on CPU time.
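The cost described above can be observed with a small, standalone measurement. The following is only a sketch: `pointer`, `structPtr`, and both helper functions are hypothetical stand-ins, not the actual capnp types, and the fields are chosen only to make the struct larger than one machine word. It uses `testing.AllocsPerRun` from the standard library to compare copying a concrete value with converting it to an interface.

```go
package main

import (
	"fmt"
	"testing"
)

// pointer is a hypothetical stand-in for an interface such as capnp.Pointer.
type pointer interface {
	Size() uint32
}

// structPtr is a hypothetical multi-word value type, standing in for a
// concrete pointer type. It is deliberately larger than one machine word.
type structPtr struct {
	seg  *[]byte
	off  uint32
	size uint32
}

// Size uses a value receiver, so structPtr itself (not *structPtr)
// satisfies the pointer interface.
func (s structPtr) Size() uint32 { return s.size }

// Package-level sinks keep the compiler from optimizing the stores away.
var (
	ifaceSink    pointer
	concreteSink structPtr
)

// allocsPerInterfaceConversion reports the average number of heap
// allocations made each time a structPtr is converted to the interface.
func allocsPerInterfaceConversion(size uint32) float64 {
	s := structPtr{off: 8, size: size}
	return testing.AllocsPerRun(1000, func() {
		ifaceSink = s // value is boxed: copied to the heap
	})
}

// allocsPerConcreteCopy reports the same measurement for a plain
// by-value copy, with no interface involved.
func allocsPerConcreteCopy(size uint32) float64 {
	s := structPtr{off: 8, size: size}
	return testing.AllocsPerRun(1000, func() {
		concreteSink = s // plain copy, no boxing
	})
}

func main() {
	fmt.Println("interface conversion allocs/op:", allocsPerInterfaceConversion(16))
	fmt.Println("concrete copy allocs/op:", allocsPerConcreteCopy(16))
}
```

Exact numbers depend on the Go toolchain and on escape analysis, so treat the output as an illustration of the effect rather than a benchmark of the library.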

The most natural fit for this type would be some sort of C union or algebraic data type. Go supports neither. The Ptr type can be thought of as a hand-crafted union type. It has storage for all of the fields of any of the pointer types and can be converted to and from each of them. This is still an improvement over the v1 API, which solved the same problem by making one struct with all the fields and then embedding it in the specific types.
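To make the "hand-crafted union" idea concrete, here is a minimal sketch of the pattern. Every name in it (`kind`, `structVal`, `listVal`, `ptr`, `toPtr`, `asStruct`, `asList`) is hypothetical and the fields are invented for illustration; the real capnp.Ptr has a different layout and method set. The point is only the shape: one plain struct holding a tag plus the superset of fields, passed by value, with conversions in both directions.

```go
package main

import "fmt"

// kind is the tag that records which variant a ptr currently holds.
type kind uint8

const (
	kindNone kind = iota
	kindStruct
	kindList
)

// structVal and listVal are hypothetical concrete pointer types.
type structVal struct {
	off  uint32
	size uint32
}

type listVal struct {
	off    uint32
	length int32
}

// ptr is the hand-crafted union: a tag plus storage for the fields of
// every variant. It is a plain struct, so passing it around never
// requires an interface conversion.
type ptr struct {
	kind   kind
	off    uint32 // shared by all variants
	size   uint32 // meaningful only when kind == kindStruct
	length int32  // meaningful only when kind == kindList
}

// toPtr converts a concrete value into the union form.
func (s structVal) toPtr() ptr {
	return ptr{kind: kindStruct, off: s.off, size: s.size}
}

func (l listVal) toPtr() ptr {
	return ptr{kind: kindList, off: l.off, length: l.length}
}

// asStruct converts back; ok is false if p holds a different variant.
func (p ptr) asStruct() (s structVal, ok bool) {
	if p.kind != kindStruct {
		return structVal{}, false
	}
	return structVal{off: p.off, size: p.size}, true
}

// asList converts back; ok is false if p holds a different variant.
func (p ptr) asList() (l listVal, ok bool) {
	if p.kind != kindList {
		return listVal{}, false
	}
	return listVal{off: p.off, length: p.length}, true
}

func main() {
	p := structVal{off: 8, size: 16}.toPtr()
	s, ok := p.asStruct()
	fmt.Println(s, ok)
	_, ok = p.asList()
	fmt.Println(ok)
}
```

The trade-off in this pattern is that the union carries fields that are unused for any given variant, in exchange for staying a value type that can live on the stack.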

As a side note, I don't find Go's lack of a union or ADT type at fault here. Any performance-sensitive code runs into these sorts of edge cases. The simplicity of the type system actually helped, because along with the great memory profiling tooling, it was straightforward to root-cause the issue.