Overview
The data modeler supports generating objects with four encodings: Json, Protobuf, Xbuf and Xbuf2. From an API perspective the generated interfaces are functionally equivalent (with a few exceptions), but each encoding has different performance characteristics, which are described in the sections below.
This section assumes that you have already familiarized yourself with the basics of Modeling Message and State.
Encoding Types
Json
With Json encoding, fairly simple classes are generated that use Jackson to serialize to and from JSON. It is suitable for lightweight applications or for applications that work natively with JSON (e.g. web applications).
Pros:
- Memory utilization: Because there isn't much serialization machinery or caching of the backing serialized format, Json generated objects don't use much memory, which can be useful for long lived state objects.
Cons:
- Jackson serialization is slow and produces a lot of garbage, and Json objects can't be pooled.
- Serializing to text is not very compact, which leads to higher disk usage and network bandwidth.
Note also that all encoding types allow a message to be serialized to or from json as a secondary functionality.
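To make the compactness point concrete, the sketch below encodes the same two values once as JSON text and once in the Protobuf wire format, using only the JDK. The field names (`orderId`, `quantity`), the tag numbers 1 and 2, and the `WireSizeSketch` class are assumptions made for illustration. The code does not use Jackson or any ADM generated classes.

```java
import java.io.ByteArrayOutputStream;
import java.nio.charset.StandardCharsets;

class WireSizeSketch {
    // Writes an unsigned varint as defined by the Protobuf wire format.
    static void writeVarint(ByteArrayOutputStream out, long value) {
        while ((value & ~0x7FL) != 0) {
            out.write((int) ((value & 0x7F) | 0x80)); // 7 payload bits + continuation bit
            value >>>= 7;
        }
        out.write((int) value);
    }

    // Encodes two integer fields (hypothetical tags 1 and 2) as varint fields.
    static byte[] encodeBinary(long orderId, int quantity) {
        ByteArrayOutputStream out = new ByteArrayOutputStream();
        writeVarint(out, (1 << 3) | 0); // field 1, wire type 0 (varint)
        writeVarint(out, orderId);
        writeVarint(out, (2 << 3) | 0); // field 2, wire type 0 (varint)
        writeVarint(out, quantity);
        return out.toByteArray();
    }

    // Encodes the same two values as a JSON object with hypothetical field names.
    static byte[] encodeJson(long orderId, int quantity) {
        String json = "{\"orderId\":" + orderId + ",\"quantity\":" + quantity + "}";
        return json.getBytes(StandardCharsets.UTF_8);
    }

    public static void main(String[] args) {
        System.out.println("json bytes:   " + encodeJson(123456789L, 100).length);
        System.out.println("binary bytes: " + encodeBinary(123456789L, 100).length);
    }
}
```

For the values used in `main`, the JSON form is 36 bytes and the binary form is 7 bytes. The exact ratio depends on field names and value ranges, so treat this as an indication rather than a benchmark.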
Protobuf
With Protobuf encoding, objects are created with backing Google protobuf generated objects. Protobuf is suitable for applications whose performance requirements exceed what Json encoding can deliver, and it should be used by applications with moderate to high performance requirements.
Protobuf is recommended for generating ADM objects used for application state.
Pros:
- Memory Utilization: Protobuf generated objects are fairly compact in memory compared to Xbuf objects. Non repeated field values are stored directly in the generated message object, making Protobuf encoded objects a good candidate for use as state entities.
- Faster serialization than Json.
- Protobuf is a well known standard, making it easy to interoperate with applications that do not use ADM generated code.
- Supports serialization to/from json.
Cons:
- Google protobuf generated messages are not zero garbage, and the serialized Protobuf messages are not reusable, so for very low latency applications that aren't using the Platform's Zing distribution this can result in garbage related pauses.
Xbuf
Xbuf generated objects are backed by the X Platform's high performance implementation of Google protobufs, which supports zero garbage operation and cut-through serialization (the ability to read/write fields directly to/from a backing buffer). It should be used for applications with the most stringent performance requirements.
Xbuf is recommended for use with ADM message models particularly for applications that require very low latency.
Pros:
- Faster serialization than Json or Protobuf
- Protobuf is a well known standard, making it easy to interoperate with applications that do not use ADM generated code.
- Compatible with protobuf on the wire.
- When not interoperating with google protobuf generated recipients, code can also be generated to perform even faster.
Cons:
- Memory Utilization: Currently Xbuf objects can take up more memory than protobuf encoded objects (a memory performance tradeoff).
- Pooling Complexity: Working with the more advanced message API around pooling (take / lend) is more complex.
- Not currently optimized for use as entities in State Replication.
Known Limitations:
- Accessors for String and Uuid array fields in Xbuf are not currently zero garbage. Consequently, Xbuf may be more expensive than Protobuf for types that use array fields.
Xbuf2
SINCE 3.12.1
Xbuf2 is the next version of the Xbuf encoding type. It is 100% wire compatible with Protobuf and Xbuf but offers considerable improvements in performance and memory efficiency over both these encoding types.
Xbuf2 is the most recommended encoding type to use with ADM message models.
Future of Xbuf
Xbuf2 is the intended replacement for Xbuf. It is offered as a separate encoding type in the current version only because the API of the generated Xbuf2 classes has some minor incompatibilities with the Xbuf generated classes (see below). The next major Talon version will contain only the Xbuf encoding type, which will be the current Xbuf2 encoding renamed to Xbuf. Talon applications that currently use the Xbuf encoding type, particularly those that use the Xbuf API methods not supported by Xbuf2 (see below), are strongly encouraged to move to Xbuf2 to ease the migration to the next major version of Talon.
Pros:
- All the benefits of Xbuf listed above plus the following
- Fastest serialization of all the encoding types
- Significantly lower memory footprint than Xbuf
- Lower memory footprint than the Protobuf encoding type
- Between being faster, having a lower memory footprint and being zero garbage, Xbuf2 is a better choice as an encoding type than Protobuf across all dimensions. As a result, the Protobuf encoding is likely to be deprecated or removed in the next major Talon release.
- Offers several knobs to manage the tradeoff between performance and memory conservation
- Offers multiple data access patterns:
  - Random access (as is offered by the other encoding types)
  - Serial access
    - Direct Deserialization: The ability to serially traverse a Google Protobuf encoded buffer and dispatch the fields to the application via a callback
    - Direct Serialization: The ability for an application to directly serialize application fields into a buffer in the Google Protobuf wire format
- Optimized for both messages and state
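The two serial access patterns can be pictured with a small sketch written against the public Protobuf wire format. This is not the Xbuf2 API: the `SerialAccessSketch` class, the `FieldHandler` callback and the field numbers are hypothetical, and only varint and length-delimited fields are handled. The intent is just to show writing fields straight into a buffer, then walking that buffer once and dispatching each field to a callback without building an intermediate message object.

```java
import java.nio.ByteBuffer;
import java.nio.charset.StandardCharsets;

class SerialAccessSketch {
    // Hypothetical callback interface; not part of any ADM generated code.
    interface FieldHandler {
        void onVarint(int fieldNumber, long value);
        void onBytes(int fieldNumber, ByteBuffer value);
    }

    static void writeVarint(ByteBuffer buf, long value) {
        while ((value & ~0x7FL) != 0) {
            buf.put((byte) ((value & 0x7F) | 0x80));
            value >>>= 7;
        }
        buf.put((byte) value);
    }

    static long readVarint(ByteBuffer buf) {
        long result = 0;
        for (int shift = 0; shift < 64; shift += 7) {
            byte b = buf.get();
            result |= (long) (b & 0x7F) << shift;
            if ((b & 0x80) == 0) {
                return result;
            }
        }
        throw new IllegalStateException("malformed varint");
    }

    // "Direct serialization": fields are written straight into the buffer.
    static void writeVarintField(ByteBuffer buf, int fieldNumber, long value) {
        writeVarint(buf, ((long) fieldNumber << 3) | 0); // wire type 0
        writeVarint(buf, value);
    }

    static void writeStringField(ByteBuffer buf, int fieldNumber, String value) {
        byte[] utf8 = value.getBytes(StandardCharsets.UTF_8);
        writeVarint(buf, ((long) fieldNumber << 3) | 2); // wire type 2
        writeVarint(buf, utf8.length);
        buf.put(utf8);
    }

    // "Direct deserialization": one serial pass, each field handed to the callback.
    static void traverse(ByteBuffer buf, FieldHandler handler) {
        while (buf.hasRemaining()) {
            long tag = readVarint(buf);
            int fieldNumber = (int) (tag >>> 3);
            int wireType = (int) (tag & 0x7);
            switch (wireType) {
                case 0: // varint
                    handler.onVarint(fieldNumber, readVarint(buf));
                    break;
                case 2: { // length-delimited
                    int length = (int) readVarint(buf);
                    ByteBuffer slice = buf.slice(); // a view over the payload, no copy
                    slice.limit(length);
                    handler.onBytes(fieldNumber, slice);
                    buf.position(buf.position() + length);
                    break;
                }
                default:
                    throw new IllegalArgumentException("wire type not handled in this sketch: " + wireType);
            }
        }
    }
}
```

A caller would allocate a `ByteBuffer`, call the `write...Field` methods, `flip()` the buffer, and pass it to `traverse` with a handler. The slice handed to `onBytes` is only a view onto the backing buffer, so a handler that needs the bytes later has to copy them.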
Cons:
- Xbuf2 stores field data off-heap. This can add complexity in the following areas:
  - Troubleshooting issues
  - Monitoring memory utilization
  - Performing capacity planning, particularly related to memory utilization
- Xbuf2 generated classes are larger.
Known Limitations:
- Does not support the following field types:
  - UUID
  - UUID[]
  - Currency
  - Currency[]
Incompatibilities with Xbuf:
- With Xbuf, setting a date field using a timestamp of -1 clears the field. This is not the case with Xbuf2.
- Xbuf2 does not implement the getXXXField() accessor methods.
- With Xbuf, setting a Date[], String[] or Enum[] containing a null element results in an NPE being thrown. With Xbuf2, the null elements are ignored.
API Differences
For the most part code generated for the different encoding types behaves the same, but there are some key differences that stem from both the underlying serialization mechanisms and features supported.
Unrecognized Field Values
- For Json encoding, unrecognized enum array values are treated as null; for non array fields, an unrecognized enum value is treated as null and hasXXX returns true.
- For Protobuf, Xbuf and Xbuf2, unrecognized fields (those with unrecognized field tags) are preserved when an inbound message is written to a transaction log (although they are inaccessible). If the message is copied by serializing to bytes and deserializing into a new message instance, the unrecognized fields from the original message are sent on the wire. If the message is modified prior to sending, the unrecognized fields may be lost.
- For repeated enum fields in Protobuf, unrecognized enum values are ignored. For Protobuf encoding the underlying protobuf may reorder the unrecognized enum values and put them at the end. Xbuf and Xbuf2 generated code preserves the order of unrecognized enums. When deserializing from Json, unrecognized enum values are treated as null so the effect on a deserialized message or entity is the same as adding an enum array with null values (see below).
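The pass-through of unrecognized fields can be illustrated with a sketch built on the public Protobuf wire format. It is not ADM generated code: the `UnknownFieldSketch` class, its single known field (tag 1, `quantity`) and the restriction to varint fields are assumptions made to keep the example short. Fields with tags the reader does not know are kept as raw bytes and written back out on serialization. The sketch always keeps them, so it does not model the case where modifying a message causes them to be lost.

```java
import java.io.ByteArrayOutputStream;
import java.nio.ByteBuffer;

class UnknownFieldSketch {
    long quantity; // the only field this (older) reader knows: tag 1
    // Raw tag/value bytes of fields the reader does not recognize.
    final ByteArrayOutputStream unknown = new ByteArrayOutputStream();

    static void writeVarint(ByteArrayOutputStream out, long value) {
        while ((value & ~0x7FL) != 0) {
            out.write((int) ((value & 0x7F) | 0x80));
            value >>>= 7;
        }
        out.write((int) value);
    }

    static long readVarint(ByteBuffer buf) {
        long result = 0;
        for (int shift = 0; shift < 64; shift += 7) {
            byte b = buf.get();
            result |= (long) (b & 0x7F) << shift;
            if ((b & 0x80) == 0) {
                return result;
            }
        }
        throw new IllegalStateException("malformed varint");
    }

    static UnknownFieldSketch parse(byte[] wire) {
        UnknownFieldSketch msg = new UnknownFieldSketch();
        ByteBuffer buf = ByteBuffer.wrap(wire);
        while (buf.hasRemaining()) {
            long tag = readVarint(buf);
            if ((tag & 0x7) != 0) {
                throw new IllegalArgumentException("this sketch handles varint fields only");
            }
            long value = readVarint(buf);
            if ((tag >>> 3) == 1) {
                msg.quantity = value;           // recognized field
            } else {
                writeVarint(msg.unknown, tag);  // unrecognized: keep verbatim,
                writeVarint(msg.unknown, value); // but expose no accessor for it
            }
        }
        return msg;
    }

    byte[] serialize() {
        ByteArrayOutputStream out = new ByteArrayOutputStream();
        writeVarint(out, (1 << 3) | 0);
        writeVarint(out, quantity);
        out.write(unknown.toByteArray(), 0, unknown.size()); // re-emit unknown fields
        return out.toByteArray();
    }
}
```

Parsing a buffer that contains field 1 and an unknown field 7, then calling `serialize()`, yields bytes that still carry field 7 even though the reader never exposed it.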
Null Value Handling
- Messages and entities generated with Json encoding support serializing null values and null values in arrays.
- For Xbuf, setting a timestamp of -1 or less results in the field value being cleared.
  - Note: This is not the case with Xbuf2.
- For Xbuf, Xbuf2 and Protobuf, setting a null value for a String, Date, Enum or Embedded Entity Field results in the field being cleared (the Google Protobuf wire format doesn't support null values on the wire).
- For Xbuf and Protobuf, setting a Date[], String[], or Enum[] containing a null element results in a NullPointerException being thrown.
  - Note: With Xbuf2, the behavior is the same as with Entity[], i.e. the null values are ignored.
- For Xbuf, Xbuf2 and Protobuf, setting an Entity[] with a null element results in the null value(s) being ignored during serialization. The same holds true when using the XIterator setters or when calling addXXX to add the set of values.
- For Xbuf and Protobuf, after setting null values in an array field, subsequently calling the getter MAY or MAY NOT return the null values. Applications are encouraged to use the getXXXIterator accessors and should be coded to handle either case, both for maximum portability between encodings and to handle cases where the null values have been filtered out by serialization. At present, Xbuf messages generated with protobuf compatibility do not cache a reference to the array passed in, whereas Protobuf messages do cache the values passed in so that nulls are returned, but this is an implementation detail that could change.
- For Xbuf2, a subsequent call to get or iterate over array elements after setting a null element will NOT return the null element.
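Because the encodings disagree on null elements (exception, silently dropped, or possibly returned), one defensive approach is to strip nulls before handing an array to a setter and to tolerate nulls when reading. The helper below is a plain JDK sketch of that idea. The `NullSafeArrays` class and its method names are hypothetical and are not part of the generated API.

```java
import java.util.Arrays;
import java.util.Objects;

class NullSafeArrays {
    // Writer side: drop null elements so the same input is acceptable to every
    // encoding, including those that throw on a null element.
    static String[] withoutNulls(String[] values) {
        return Arrays.stream(values)
                .filter(Objects::nonNull)
                .toArray(String[]::new);
    }

    // Reader side: do not assume nulls were kept or removed; skip them explicitly.
    static int countNonNull(String[] values) {
        int count = 0;
        for (String value : values) {
            if (value != null) {
                count++;
            }
        }
        return count;
    }
}
```

Code written this way behaves the same whether the underlying encoding filtered the nulls out or returned them.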
Pooling Considerations
A major difference between Xbuf/Xbuf2 and Protobuf or Json encoded entities is that Xbuf/Xbuf2 messages and entities are pooled by the platform by default. From a coding standpoint this means that when working with Xbuf/Xbuf2 encoded messages or entities:
- An application may not hold onto an Xbuf/Xbuf2 encoded message beyond the scope of a message handler.
- An application may not hold onto an XString or embedded entity type from a message beyond the scope of a message handler, because these objects are pooled along with the message and will be reset once the message is returned to its pool. See Zero Garbage Nested Entities for detailed usage, but the general rule of thumb is to copy any entity that needs to be retained in application state, or to use the more advanced 'take' apis. Note that 'take' is not supported for String fields, as string fields are not pooled for Xbuf messages.
- Setting an XString or embedded entity field on a message transfers ownership to the message. If the application wants to retain the entity in application state, then it should copy it into a new entity or use the more advanced 'lend' apis. See Zero Garbage Nested Entities for more details.
- An application may not mutate a returned array type from a message and should not hold onto the array beyond the duration of a message handler. See Zero Garbage Array Accessors for more details.
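The reason for these rules is easier to see with a toy pool. The sketch below is not the platform's pooling implementation: `PoolingSketch`, `Message`, `Pool`, `onMessage` and `dispatch` are all hypothetical names. It shows a handler that copies the data it wants to keep, next to a handler that keeps a reference to the pooled object and finds it reset after the handler returns.

```java
import java.util.ArrayDeque;

class PoolingSketch {
    // Hypothetical pooled message with one mutable, reusable field.
    static final class Message {
        final StringBuilder symbol = new StringBuilder();
        void reset() { symbol.setLength(0); }
    }

    // Minimal pool: released messages are reset and handed out again.
    static final class Pool {
        private final ArrayDeque<Message> free = new ArrayDeque<>();
        Message get() {
            Message m = free.poll();
            return m != null ? m : new Message();
        }
        void release(Message m) {
            m.reset();
            free.push(m);
        }
    }

    static String retainedCopy;     // safe: application state holds its own copy
    static Message leakedReference; // unsafe: points at an object the pool will reuse

    // Stand-in for an application message handler.
    static void onMessage(Message m) {
        retainedCopy = m.symbol.toString(); // copy what must outlive the handler
        leakedReference = m;                // anti-pattern, kept only to show the effect
    }

    // Stand-in for the platform: populate, dispatch, then return to the pool.
    static void dispatch(Pool pool, String symbol) {
        Message m = pool.get();
        m.symbol.append(symbol);
        onMessage(m);
        pool.release(m); // after this, the handler's reference sees reset contents
    }
}
```

After `dispatch(pool, "IBM")` returns, `retainedCopy` still holds the symbol while `leakedReference.symbol` is empty, and the same `Message` instance will be handed out for the next dispatch.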