Overview
The data modeler supports generating objects with three different encodings: Json, Protobuf, and Xbuf. From an API perspective the generated interfaces are functionally equivalent (with a few exceptions), but each encoding has different performance characteristics, which are described in the sections below.
This section assumes that you have already familiarized yourself with the basics of Modeling Message and State.
Encoding Types
Json
With Json encoding, fairly simple classes are generated that use Jackson to serialize to and from Json. It is suitable for lightweight applications or for applications that work natively with Json (e.g. web applications).
Pros:
- Memory Utilization: Because there isn't much serialization machinery or caching of the backing serialized format, Json generated objects don't use much memory, which can be useful for long lived state objects.
Cons:
- Jackson serialization is slow and produces a lot of garbage, and json objects can't be pooled.
- Serializing to text is not very compact which leads to higher disk usage and network bandwidth.
Note also that all encoding types allow a message to be serialized to or from json as a secondary functionality.
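To make the compactness point above concrete, the sketch below compares the size of two values rendered as Json text with the same values packed into a fixed-width binary layout. The field names, the hand-built Json string, and the binary layout are illustrative assumptions; this is neither the output of ADM's Json encoding nor the protobuf wire format.

```java
import java.nio.ByteBuffer;
import java.nio.charset.StandardCharsets;

// Hypothetical illustration only; not an ADM generated type.
final class EncodingSizeSketch {

    // Renders two fields as Json text by hand (field names are assumptions).
    static byte[] toJsonBytes(long orderId, int quantity) {
        String json = "{\"orderId\":" + orderId + ",\"quantity\":" + quantity + "}";
        return json.getBytes(StandardCharsets.UTF_8);
    }

    // Packs the same two values into a fixed-width layout: 8 bytes + 4 bytes.
    static byte[] toBinaryBytes(long orderId, int quantity) {
        ByteBuffer buffer = ByteBuffer.allocate(Long.BYTES + Integer.BYTES);
        buffer.putLong(orderId);
        buffer.putInt(quantity);
        return buffer.array();
    }

    public static void main(String[] args) {
        long orderId = 1234567890123L;
        int quantity = 250;
        System.out.println("json bytes:   " + toJsonBytes(orderId, quantity).length);
        System.out.println("binary bytes: " + toBinaryBytes(orderId, quantity).length);
    }
}
```

The text form repeats the field names in every message and spends one byte per digit, which is where most of the extra size comes from.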
Protobuf
With Protobuf encoding, objects are created with backing Google protobuf generated objects. Protobuf is suitable for applications with higher performance requirements than Json encoding can meet. It should be used by applications with moderate to high performance requirements.
Protobuf is recommended for generating ADM objects used for application state.
Pros:
- Memory Utilization: Protobuf generated objects are fairly compact in memory compared to Xbuf objects. Non-repeated field values are stored directly in the generated message object, making Protobuf encoded objects a good candidate for use as state entities.
- Faster serialization than Json.
- Protobuf is a well known standard, making it easy to interoperate with applications not using ADM generated code.
- Supports serialization to/from Json.
Cons:
- Google protobuf generated messages are not zero garbage, and the serialized Protobuf messages are not technically reusable, so for very low latency applications that aren't using the Platform's zing distribution this can result in garbage related pauses.
XBuf
Xbuf generated objects are backed by the X Platform's high performance implementation of Google protobufs, which supports zero garbage operation and cut-through serialization (the ability to read/write fields directly to/from a backing buffer). It should be used for applications with the most stringent performance requirements.
Xbuf is recommended for use with ADM message models, particularly for applications that require very low latency.
Pros:
- Faster serialization than Json or Protobuf
- Protobuf is a well known standard, making it easy to interoperate with applications not using ADM generated code.
- Compatible with protobuf on the wire.
- When not interoperating with google protobuf generated recipients, code can also be generated to perform even faster.
Cons:
- Memory Utilization: Currently Xbuf objects can take up more memory than protobuf encoded objects (a memory performance tradeoff).
- Pooling Complexity: Working correctly with the more advanced message API around pooling (take / lend) is harder.
- Not currently optimized for use as entities in State Replication.
Known Limitations:
- Accessors for String and Uuid array fields in Xbuf are not currently zero garbage. Consequently, Xbuf may be more expensive than Protobuf for types that use array fields.
API Differences
For the most part code generated for the different encoding types behaves the same, but there are some key differences that stem from both the underlying serialization mechanisms and features supported.
Unrecognized Field Values
- For Json encoding, unrecognized enum array values are treated as null; for non-array fields, an unrecognized enum value is treated as null and hasXXX returns true.
- For both Protobuf and Xbuf unrecognized fields (those with unrecognized field tags) are preserved when an inbound message is written to a transaction log (although they are inaccessible). If the message is copied by serializing to bytes and deserializing into a new message instance, the unrecognized fields from the original message are sent on the wire. If the message is modified prior to sending, the unrecognized fields may be lost.
- For repeated enum fields in protobuf, unrecognized enum values are ignored. For Protobuf encoding, the underlying protobuf may reorder the unrecognized enum values and put them at the end. Xbuf generated code preserves the order of unrecognized enums. When deserializing from Json, unrecognized enum values are treated as null, so the effect on a deserialized message or entity is the same as adding an enum array with null values (see below).
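The Json behavior described above, where an unrecognized enum value surfaces as a null array element, can be pictured with the following sketch. The `Side` enum and the `parseSides` helper are hypothetical stand-ins for generated code; the sketch shows the observable effect only and is not the platform's deserialization logic.

```java
import java.util.Arrays;

// Hypothetical illustration only; not ADM generated code.
final class UnrecognizedEnumSketch {

    // Enum values known to the receiving application (an assumption).
    enum Side { BUY, SELL }

    // Maps each name to a Side, leaving null where the name is not recognized.
    static Side[] parseSides(String[] names) {
        Side[] result = new Side[names.length];
        for (int i = 0; i < names.length; i++) {
            try {
                result[i] = Side.valueOf(names[i]);
            } catch (IllegalArgumentException unrecognized) {
                result[i] = null; // unrecognized value is treated as null
            }
        }
        return result;
    }

    public static void main(String[] args) {
        // "SHORT_SELL" stands for a value added by a newer sender.
        Side[] sides = parseSides(new String[] {"BUY", "SHORT_SELL", "SELL"});
        System.out.println(Arrays.toString(sides)); // prints [BUY, null, SELL]
    }
}
```

A receiver written this way keeps the array length and element positions intact, so application code has to be prepared for null elements in enum arrays.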
Null Value Handling
- Messages and Entities generated with Json encoding support serializing null values and null values in arrays.
- For Xbuf setting a timestamp of -1 or less results in the field value being cleared.
- For Xbuf and Protobuf, setting a null value for a String, Date, Enum or Embedded Entity Field results in the field being cleared (protobuf doesn't support null values on the wire).
- For Xbuf and Protobuf, setting a Date[], String[], or Enum[] containing a null value results in a NullPointerException being thrown. Setting an Embedded Entity [] array or an enum [] with a null value results in the null value being ignored during serialization. The same holds true when using the XIterator setters or when calling addXXX to add the set of values.
- For Xbuf and Protobuf, when setting null values in an array field, subsequently calling the getter MAY or MAY NOT result in the null values being returned. Applications are encouraged to use the getXXXIterator accessors, and should be coded to handle either case for maximum portability both between encodings and for handling cases where the null values have been filtered out due to serialization. At present, Xbuf messages generated with protobuf compatibility do not cache a reference to the array passed in, and Protobuf messages do cache values passed in so that nulls are returned ... but this is an implementation detail that could change.
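Because the rules above differ by encoding and field type, one portable approach is to strip nulls before setting an array and to skip nulls when reading one back. The following is a minimal sketch of that idea using plain Java collections; `withoutNulls` and `collectNonNull` are hypothetical helper names, and the `Iterable<String>` parameter merely stands in for whatever a generated accessor returns.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;
import java.util.Objects;

// Hypothetical helpers; not part of the ADM generated API.
final class NullTolerantArraySketch {

    // Removes nulls before the array is handed to a setter, so the call
    // cannot fail on a null element regardless of the encoding in use.
    static String[] withoutNulls(String[] values) {
        return Arrays.stream(values).filter(Objects::nonNull).toArray(String[]::new);
    }

    // Reads values defensively: gives the same result whether or not the
    // source still contains the nulls that were originally set.
    static List<String> collectNonNull(Iterable<String> values) {
        List<String> result = new ArrayList<>();
        for (String value : values) {
            if (value != null) {
                result.add(value);
            }
        }
        return result;
    }

    public static void main(String[] args) {
        String[] input = {"a", null, "b"};
        System.out.println(Arrays.toString(withoutNulls(input)));
        System.out.println(collectNonNull(Arrays.asList(input)));
    }
}
```

Code written this way behaves the same whether the nulls were cached, filtered during serialization, or never accepted in the first place.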
Pooling Considerations
A major difference between Xbuf and Protobuf or Json encoded entities is that Xbuf messages and entities are pooled by the platform by default. From a coding standpoint this means that when working with Xbuf Encoded Messages or Entities:
- An application may not hold onto an XBuf encoded message beyond the scope of a message handler.
- An application may not hold onto an XString or Embedded Entity type from a message beyond the scope of a message handler, because these objects are pooled along with the message and will be reset once the message is returned to its pool. See Zero Garbage Nested Entities for detailed usage, but the general rule of thumb is to copy any entity that needs to be retained in application state, or to use the more advanced 'take' apis. Note that 'take' is not supported for String fields as string fields are not pooled for Xbuf messages.
- Setting an XString or Embedded Entity field on a message transfers ownership to the message. If the application wants to retain the entity in application state, then it should copy it into a new entity or use the more advanced 'lend' apis. See Zero Garbage Nested Entities for more details.
- An application may not mutate a returned array type from a message and should not hold onto the array beyond the duration of a message handler. See Zero Garbage Array Accessors for more details.
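The reason for the copy rule above can be seen in a toy model of pooling. The sketch below is not the platform's pooling implementation: `PooledMessage`, `MessagePool`, and `onMessage` are hypothetical names, and the pool is a deliberately minimal single-threaded stand-in. It shows that a value copied inside the handler survives after the message instance is reset and reused.

```java
import java.util.ArrayDeque;

// Hypothetical toy model; not the X Platform pooling API.
final class PoolingSketch {

    // Stand-in for a pooled message with one mutable, reusable field.
    static final class PooledMessage {
        private final StringBuilder symbol = new StringBuilder();

        void setSymbol(String value) {
            symbol.setLength(0);
            symbol.append(value);
        }

        // Returns an immutable copy that is safe to keep in application state.
        String copySymbol() {
            return symbol.toString();
        }

        void reset() {
            symbol.setLength(0);
        }
    }

    // Minimal single-threaded pool: instances are reset on release and reused.
    static final class MessagePool {
        private final ArrayDeque<PooledMessage> free = new ArrayDeque<>();

        PooledMessage acquire() {
            PooledMessage message = free.poll();
            return message != null ? message : new PooledMessage();
        }

        void release(PooledMessage message) {
            message.reset();
            free.push(message);
        }
    }

    static String lastSymbol; // application state that outlives the handler

    // Handler copies what it needs instead of retaining the message itself.
    static void onMessage(PooledMessage message) {
        lastSymbol = message.copySymbol();
    }

    public static void main(String[] args) {
        MessagePool pool = new MessagePool();
        PooledMessage first = pool.acquire();
        first.setSymbol("ABC");
        onMessage(first);
        pool.release(first); // handler scope ends, message is reset

        PooledMessage second = pool.acquire(); // the same instance is reused
        second.setSymbol("XYZ");
        System.out.println(first == second); // prints true
        System.out.println(lastSymbol);      // prints ABC
    }
}
```

Had the handler stored the `PooledMessage` reference instead of a copy, application state would silently show "XYZ" once the instance was reused for the next message.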