The Talon Manual

...

Configuration Setting | Default | Description

captureTransactionLatencyStats

false

Property that enables collection of latency stats as messages flow through the AEP engine's transaction processing machinery.

captureEventLatencyStats

false

Property that globally enables collection of message latency stats as messages flow through the system. These statistics include latencies in the flow outside of transaction processing. For received messages these statistics include transmission, deserialization and dispatch costs. For sent messages these include serialization and transmission costs.

When set to true, timings for messages are captured as they flow through the system. Enablement of these stats is required to collect message bus latency stats. Enabling this property can increase latency due to the overhead of tracking timestamps.

captureMessageTypeStats

false

Property that enables tracking of message statistics on a per message type basis.

When set to true, timings for each message type are individually tracked as separate stats.

Note: Due to their overhead, these statistics are not included in heartbeats emitted by an XVM.

messageTypeStatsLatenciesToCapture

all

Property controlling which latency stats are captured on a per message type basis. This property is specified as a comma-separated list of values. Valid values include:

  • all Indicates that all available per message type latency stats should be collected.
  • none Indicates that no message type latency stats should be collected.
  • c2o Indicates create to offer latencies should be captured.
  • o2p Indicates offer to poll (input queueing time) should be captured.
  • mfilt Indicates that time spent in application message filters should be captured.
  • mpproc Indicates that time spent in the engine prior to message dispatch should be captured.
  • mproc Indicates that the time spent in application message handlers should be captured.

The values 'all' or 'none' may not be combined with other values.

This value only applies when captureMessageTypeStats is true. When not specified the value defaults to all.
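The rules above for combining values can be sketched as a small validation routine. This is only an illustration in Python: `parse_latencies_to_capture` is a hypothetical helper and not part of the Talon API, and it assumes values are case-sensitive and that whitespace around commas is ignored, neither of which is stated above.

```python
# Hypothetical helper, not part of the Talon API: checks a
# messageTypeStatsLatenciesToCapture value against the rules listed above.
VALID_LEGS = {"c2o", "o2p", "mfilt", "mpproc", "mproc"}
EXCLUSIVE = {"all", "none"}


def parse_latencies_to_capture(value="all"):
    """Return the set of per message type latency legs selected by `value`.

    The default of "all" mirrors the documented default for the property.
    """
    # Assumption: surrounding whitespace is ignored and matching is case-sensitive.
    tokens = [t.strip() for t in value.split(",") if t.strip()]
    if not tokens:
        raise ValueError("expected at least one value")

    unknown = [t for t in tokens if t not in VALID_LEGS | EXCLUSIVE]
    if unknown:
        raise ValueError(f"unknown value(s): {unknown}")

    # 'all' and 'none' may not be combined with other values.
    if len(tokens) > 1 and EXCLUSIVE & set(tokens):
        raise ValueError("'all' and 'none' may not be combined with other values")

    if tokens == ["all"]:
        return set(VALID_LEGS)
    if tokens == ["none"]:
        return set()
    return set(tokens)
```

For example, `parse_latencies_to_capture("c2o,o2p")` selects only the two queueing legs, while `parse_latencies_to_capture("all,c2o")` is rejected.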

(info) See Also:

capturePerTransactionStats

perTransactionStatsLogging

false

Configuration

See Per Transaction Stats for more details.

The above settings can be configured in config.xml as follows:

Code Block
xml
xml
<apps>
  <templates>
    <app name="app-template">
      <captureTransactionLatencyStats>true</captureTransactionLatencyStats>
      <captureEventLatencyStats>true</captureEventLatencyStats>
      <captureMessageTypeStats>true</captureMessageTypeStats>
      <messageTypeStatsLatenciesToCapture>c2o,o2p,mpproc,mproc,mfilt</messageTypeStatsLatenciesToCapture>
      <capturePerTransactionStats>false</capturePerTransactionStats>
      <perTransactionStatsLogging policy="Off">
        <detachedWrite enabled="true"></detachedWrite>
      </perTransactionStatsLogging>
    </app>
  </templates>
</apps>

...

Phase | Description

mpproc

Records the time (in microseconds) spent by the engine dispatching the message to an application.

mproc

Records latencies for application message process times (in an EventHandler).

mfilt

Records latencies for application message filtering times (by a message filter).

msend

Time spent in AepEngine.sendMessage(), i.e. the time in the AepEngine's send call. This latency is a subset of mproc for solicited sends, and it includes msendc.

msendc

Time spent in the AepEngine's core send logic.

This leg includes enqueuing the message for delivery in the corresponding bus manager.

cstart

Time spent from the point the first message of a transaction is received to the time the transaction is committed.

cprolo

Time spent from the point where transaction commit is started to send or store commit, whichever occurs first.

This latency measures the time taken in any bookkeeping done by the engine prior to committing the transaction to the store (or, for an engine without a store, until outbound messages are released for delivery).

csend

The send commit latency, i.e. the time from when send commit is initiated to receipt of the send completion event.

This latency represents the time from when outbound messages for a transaction are released to the time that all acknowledgements for the messages are received.

Because this latency includes acknowledgement time, a high value for csend does not necessarily indicate that downstream latency will be affected. The Message Latencies listed below allow this value to be decomposed further.

ctrans

Time spent from the point the store commit completes to the beginning of the send commit which releases a transaction's outbound messages for delivery.

If the engine doesn't have a store, then this statistic is not captured as messages are released immediately.

cstore

The store commit latency, i.e. the time from when store commit is initiated to receipt of the store completion event.

This latency includes the time spent serializing transaction contents, persisting to the store's transaction log, inter-cluster replication, and replication to backup members including the replication ack.

Note: High values in cstore will impact downstream message latencies because store commit must complete before outbound messages are released for delivery. The cstore latency is further broken down in the Store Latencies listed below.

cepilo

Time spent from the point the store or the send commit completes, whichever is last, to commit completion.

cfull

Time spent from the time the first message of a transaction is received to commit completion.

tleg1

Records latencies for the first transaction processing leg.

Transaction Leg One includes time spent from the point where the first message of a transaction is received to submission of send/store commit. It includes message processing and any overhead associated with transactional bookkeeping done by the engine.

Note: Each transaction leg is a portion of the overall commit time that is processed on the AepEngine's thread. The sum of the transaction leg stats is important in that it determines the overall throughput that an application can sustain in terms of transactions per second.

tleg2

Records latencies for the second transaction processing leg.

Transaction Leg Two includes time spent from the point where the send/store commit completion is received to the submission of store/send commit.

tleg3

Records latencies for the third transaction processing leg.

Transaction Leg Three includes time spent from the point where the store/send commit completion is received to the completion of the transaction commit.

inout

Records latencies for receipt of a message to transmission of the last outbound message.

inack

Records latencies for receipt of a message to stabilization (and upstream acknowledgement for Guaranteed).
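As noted for the transaction legs, their sum bounds the transaction rate an application can sustain. A rough sketch of that arithmetic follows. It assumes the three leg values are mean latencies in microseconds and that the legs run one after another on a single engine thread; the function name is hypothetical and not part of the Talon API.

```python
def max_sustainable_tps(tleg1_us, tleg2_us, tleg3_us):
    """Rough upper bound on transactions per second for one engine thread.

    Assumption: each argument is a mean leg latency in microseconds, and all
    three legs are processed serially on the engine's thread.
    """
    total_us = tleg1_us + tleg2_us + tleg3_us
    if total_us <= 0:
        raise ValueError("leg latencies must sum to a positive value")
    # One second is 1,000,000 microseconds.
    return 1_000_000 / total_us
```

With hypothetical means of 10, 5 and 5 microseconds, the engine thread spends 20 microseconds per transaction, giving a ceiling of 50,000 transactions per second.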

...

Code Block
xml
xml
<apps>
  <app name="MyApp">
    <captureMessageTypeStats>true</captureMessageTypeStats>
  </app>
</apps>

<xvms>
  <xvm name="MyXVM">
    <heartbeats enabled="true" interval="5">
      <includeMessageTypeStats>true</includeMessageTypeStats>
    </heartbeats>
  </xvm>
</xvms>

Bus Connection Stats

The engine stats thread can also trace summary statistics for each of the message bus connections managed by the engine. Each message bus connection is wrapped by a Bus Manager, which handles bus connection establishment, reconnects and message I/O through the underlying connection, and also provides transactional semantics around the bus by queuing up messages that will be sent as part of an engine transaction. From the perspective of an engine Bus Manager, a bus connection is synonymous with an SMA bus connection. Each Bus Manager maintains its statistics across bus binding reconnects, so the stats remain continuous when a binding reconnects. The following sections break these statistics down in more detail.

...

Field | Description

NumMsgsRcvd

The number of messages received by the bus.

NumMsgsInBatches

The number of messages received by the bus that were part of a batch.

NumMsgBatchesRcvd

The number of batch messages received by the bus.

NumPacketsRcvd

The number of raw packets received by the bus.

NumAcksSent

The total number of acknowledgments sent upstream for received messages by this bus.

NumStabilityRcvd

The number of stability events (acks) received by this bus.

NumStabilityBatchesRcvd

The number of batched stability events received by this bus.

NumMsgsEnqueued

The total number of batch messages enqueued for delivery by this bus.

NumMsgsSent

The total number of enqueued messages that were actually sent by the bus.

NumFlushesSync

The number of times this bus has been synchronously flushed.

NumFlushesAsync

The number of times this bus has been asynchronously flushed.

NumMsgsFlushedSync

The number of messages flushed by synchronous flushes.

NumMsgsFlushedAsync

The number of messages flushed by asynchronous flushes.

NumAsyncFlushCompletions

The number of asynchronous flushes for this bus that have completed.

NumCommits

The number of transactions committed by the bus.

NumRollbacks

The number of transactions rolled back by the bus.

Bus Disruptor Latencies

When a Bus Manager is configured for detached send (aka detached commit), a transaction's outbound messages are sent on the underlying SMA connection by the Bus Manager's I/O thread. The Bus Manager uses a disruptor to manage the handoff of messages to that I/O thread. The "Offer to Poll" (o2p) latency measures the time that outbound messages spend waiting in the disruptor queue. High o2p latencies in a Bus Manager may indicate that messages are being released for send faster than they can actually be sent through the underlying SMA connection.

Disruptor statistics follow the message and transaction statistics in the bus manager statistics trace:
Panel

Disruptor..{[<DisruptorNumUsed> of <DisruptorNumAvailable>] <DisruptorUsagePct>% (<DisruptorClaimStrategy>, <DisruptorWaitStrategy>)}
...[o2p] [sample=0, min=-1 max=-1 mean=-1 median=-1 75%ile=-1 90%ile=-1 99%ile=-1 99.9%ile=-1 99.99%ile=-1]
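For monitoring scripts, the o2p line of this trace can be pulled apart with a couple of regular expressions. The sketch below is based only on the sample line shown above: it assumes every value is a plain number without thousands separators, and `parse_o2p` is a hypothetical helper, not part of the Talon tooling.

```python
import re

# Matches the bracketed stats that follow the "[o2p]" tag in the trace line.
O2P_SECTION = re.compile(r"\[o2p\]\s*\[([^\]]*)\]")
# Matches name=value pairs such as "sample=0" or "99.9%ile=-1".
STAT_PAIR = re.compile(r"([\w.%]+)=(-?\d+(?:\.\d+)?)")


def parse_o2p(trace_line):
    """Return the o2p stats from a disruptor trace line as a dict of floats.

    Returns None when the line has no [o2p] section. Assumption: values are
    plain numbers, as in the sample trace (no thousands separators).
    """
    match = O2P_SECTION.search(trace_line)
    if match is None:
        return None
    return {name: float(value) for name, value in STAT_PAIR.findall(match.group(1))}
```

In the sample line every latency is -1 and the sample count is 0, so a caller may want to check the `sample` entry before reading the percentiles.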

Bus Clients, Channels and Fails

After the disruptor statistics are counters indicating the number of connected clients, active channels and binding failures:

...

Phase | Description

c2o

The create to send latencies in microseconds: the time from message creation to when send was called for it.

Note: This statistic is for outbound messages sent through a bus and is different from the c2o statistic captured for an AepEngine, which tracks the create to offer times for received/injected messages offered to the application's input queue.

o2s

The send to serialize latencies in microseconds: the time from when the message was sent until it was serialized in preparation for transmission on the wire.

This includes the time from the application's send call, the replication hop (if the engine has a store) and the time through the bus manager's disruptor (if detached commit is enabled for the bus manager).

s

The serialize latencies in microseconds: the time spent serializing the MessageView to its transport encoding.

s2w

The serialize to wire latencies in microseconds: the time post serialize to just before the message is written to the wire.

w

The wire latencies in microseconds: the time an inbound message spent on the wire, from when it was written to the wire by the sender to when it was received off the wire by the receiver.

Note that this metric is subject to clock skew when the sending and receiving sides are on different hosts.

w2d

The time from when the serialized form was received from the wire to deserialization.

d

The time (in microseconds) spent deserializing the message and wrapping it in a MessageView.

d2i

The time (in microseconds) from when the message was deserialized to when it is received by the engine.

This measures the time from when the message has been deserialized by the bus to when the app's engine picks it up from its input queue, before the engine dispatches it to an application handler (it includes the o2p time of the engine's disruptor).

Additional time spent by the engine dispatching the message to the application handler is covered by mpproc (see the Transaction Latencies table).

o2i

The origin to receive latencies in microseconds.

The time from when a message was originally created to when it was received by the binding.

w2w

The wire to wire latencies in microseconds, for outbound messages the time from when the corresponding inbound message was received off the wire to when the outbound message was written to the wire.

...

Transactions Processed, Outstanding Commits & Transaction Rate

The number of outstanding commits represents the transactions currently in flight, i.e. (NumCommitsStarted - NumCommitsCompleted).

Panel

Flows{1} Msg{In{25,901(364 0) 25,901(364 0) 0(0 0) 0(0 0) 0X(0 0) (0)} Out{25,901(364 0) 25,901(364 0) 0(0 0) (25,901 25,901 0 0) (0)} Latency{InOut{0 us} InAck{0 us}}} Ev{51,806(728 0) 25,901[25,901,0,25,901](364 0)} Txn{<NumTransactions>[(<NumCommitsStarted>,<NumCommitsCompleted>),(<NumSendCommitsStarted>,<NumSendCommitsCompleted> (<SendCommitCompletionQueueSize>)),(<NumStoreCommitsStarted>,<NumStoreCommitsCompleted> (<StoreCommitCompletionQueueSize>)),<NumRollbacks>](<TransactionRate> <DeltaTransactionRate>) <NumFlowEventsPerTransaction>} Store{-1}
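The outstanding commit count can be derived from the two commit counters in the Txn section of this trace. A minimal sketch follows, assuming the counters have already been extracted as integers; the function name is hypothetical.

```python
def outstanding_commits(num_commits_started, num_commits_completed):
    """Transactions currently in flight: NumCommitsStarted - NumCommitsCompleted."""
    return num_commits_started - num_commits_completed
```

For instance, with hypothetical counters of 25,901 commits started and 25,899 completed, two transactions are still in flight.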

Engine Store Size

...