An operational AEP engine and its underlying components such as its HA Store and Bus Bindings can be configured to continuously collect many raw capture many metrics and statistics during the course of its operation. Such stats can be reported by an Talon Server and reported in a zero garbage fashion in the form of server heartbeats. Additionally for quick debugging and diagnostic purporses the engine can spin up a background thread that periodically performs the following:

Perform higher level statistical computations such as calculating message rates and average latencies and output its results to a named trace logger using the platform's tracing and logging system.
Detect violation/abatement of alert thresholds and dispatch appropriate alert events to the application.

The raw metrics collected by the engine are used by the background statistical thread for its computations and can also be retrieved programmatically by an application for its own use.
In this document, we describe:

the various raw metrics captured by an engine,
how to switch on and configure the statistics thread operation,
the higher level statistics calculations performed by the statistics thread,
and the format of the output of the statistics thread.

Configuration Settings

Stats Enablement Properties

As collection of AEPEngine involves overhead, the engine provides somewhat granular controls on the types of stats that are collected to allow applications to balance the cost of collecting statisticsThese metrics are periodically collected by the XVM in which the engine is running and reported in XVM heartbeats which can then be traced, logged in a binary format or emitted to monitoring applications. This document describes the statistics that are collected, how to enable those statistics that aren't enabled by default and describes the format in which statistics are traced.

Configuration Settings

Most engine metrics are low overhead and are their collection cannot be disabled. Other types of statistics such as Latency Statistics, Per Message Type statistics, and per Transaction Statistics can impact application performance for performance sensitive apps, and their collection is disabled by default. The subsections below describes the additional statistics that can be collected and how they can be enabled.

Global Latency Stats Settings

Latency statistics, however, are more expensive to collect and can impact application performance. Consequently latency stats collection is disable by default.The platform provides several knobs that allow applications to control what latency statistics are collected and balance the performance cost against operational visibility. The table below summarizes the various stats collection that can be enabled.configuration settings that control latency collection at a process-wide level.

aep.<engine>.latency.Indicates whether or not latency stats for the engine should be collected The latency stats include transaction level latencies that can be used in monitoring the overall latencies of the various legs of a transaction.aep<engine>msgtype.

Configuration Setting	Default	Description
nv.	stats	false	.	nv.stats.series.samplesize	10240	Property that can be used to control the default sampling size for series stats. If the number of datapoints collected in a stats interval exceeds this size, the computation for histographical data will be lossy. Increasing the value reduces loss of datapoints, but results in greater overhead in stats collection in terms of both memory usage and pressure on the process caches.
nv.	msg.	latency.	stats	false	Indicates whether or not per-message-type statistics should be collected. When enabled, processing latencies are recorded on a per message basis. Per message type stats introduce a fair amount of overhead, so typically enabling their collection is enabled only during application profiling for application with stringent performance requirements.
nv.aep.<engine>.event.latency.stats	false	Indicates whether or not per-event-type latency statistics should be included in a given engine's stats trace output. Event latency statistics can be used to capture the processing latencies for each type of event being processed in the engine. This is useful in determining if particular event types are consuming considerable engine thread processing time. Per event type stats introduce a fair amount of overhead, so typically enabling their collection is enabled only during application profiling for application with stringent performance requirements.
nv.msg.latency.stats	false	Property that enabled message latency stats tracking. Property that globally enables collection of message latency stats as messages flow through the system. These statistics include latencies in the flow outside of transaction processing. For received messages these statistics include transmission, deserialization and dispatch costs. For sent messages these include serialization and transmission costs. When set to true, timings for messages are captured as they flow through the system. Enablement of these stats is required to collect message bus latency stats. Enabling this property can increase latency due to the overhead of tracking timestamps.
nv.msgtype.latency.stats	false	Property that enabled per enables tracking of message type latency stats trackingon a type by type basis. When set to true, timings for each message type are individually tracked as separate stats. This can be useful in tracking down issues in which a particular message type is problematic (for example, tracking down a high application handler message processing time). However, it results in a higher overhead Due to their overhead, these statistics are not included in heartbeats emitted by an XVM.
nv.ods.latency.stats	false	Indicates whether or not store latency statistics are captured. Store latency stats expose statistics related to serializing replicating and persisting transactions. These stats must be enabled in order to include latency stats along with an application's store statistics.
nv.event.latency.stats	false	Indicates whether or not event latency statistics are captured. Enabling Event latency stats record timestamps for enqueue and dequeue of events across event multiplexer multiplexers, such as the AepEngine's input multiplexer queue. Enabling event latency stats is useful for determining if an engine's event multiplexer queue is backing up by recording the time that events remain on the input queue. These stats must be enabled in order to capture input queuing times.
nv.link.network.stampiots	false	Indicates whether or not timestamps should be stamped on inbound and outbound messages. Disabled by default. Enabling this setting will allow engines to provide more detail in the form of transaction legs in message latency statistics.

Statistics Output Threads

...

These stats must be enabled to

...

Configuration Setting	Default	Description
nv.aep.<engine>.stats.interval	0	The interval (in seconds) at which engine stats will be traced for a given engine. Can be set to a positive integer indicate the period in seconds at which the engine's stats dump thread will dump recorded engine statistics. Setting a value of 0 disables creation of the stats thread. When enabled, engine stats are traced to the logger 'nv.aep.engine.stats' at a level of `Tracer.Level.INFO`; therefore, to see dumped stats, a trace level of 'nv.aep.engine.stats.trace=info' must be enabled. NOTE: disabling the engine stats thread only stops stats from being periodically traced. It does not stop the engine from collecting stats; stats can still be collected by an external thread (such as the Talon Server which reports the stats in server heartbeats). In other words, enabling the stats thread is not a prerequisite for collecting stats, and disabling the stats reporting thread does not stop them from being collected. NOTE: while collection of engine stats is a zero garbage operation, tracing engine stats is not a zero garbage when performed by this stats thread. For latency sensitive apps, it is recommended to run in a Talon server which can collect engine stats and report them in heartbeats in a zero garbage fashion.
nv.aep.<engine>.sysstats.interval	0	The interval (in seconds) at which engine sys stats will be reported. Set to 0 (the default) to completely disable sys stats tracing for a given engine. In most cases, AEP sys stats will not be used and system level stats would be recorded in the Server Statistics from which an AEPEngine is running.
nv.event.mux.<name>.stats.interval	0	The interval (in seconds) at which multiplexer stats will be traced. Multiplexer stats can also be reported as part of the overall engine stats from the engine stats thread, so there is no need to set this to a non-zero value if nv.aep.<engine>.stats.interval is greater than zero.
nv.msg.latency.stats.interval	0	The interval (in seconds) at which message latency stats are traced. This setting has no effect if nv.msg.latency.stats is false. This allows granular tracing of just message latency stats on a per bus basis. Message latency stats can also be reported as part of the overall engine stats from the engine stats thread, so there is no need to set this to a non-zero value if nv.aep.<engine>.stats.interval is greater than zero.
nv.aep.busmanager.<engine>.<bus>.stats.interval	0	The interval (in seconds) at which bus stats will be traced. Bus stats reported as part of the overall engine stats from the engine stats thread, so there is no need to set this to a non-zero value if nv.aep.<engine>.stats.interval is greater than zero. When engine stats output is disabled this can be used to trace only bus stats for a particular message bus.

report 'wire' times for an application's store.

The above settings can be configured in config.xml as follows:

Code Block

	xml
	xml

<env>
  <nv>
    <msg.latency.stats>true</msg.latency.stats>
    <event.latency.stats>true</event.latency.stats>
    <ods.latency.stats>true</ods.latency.stats>
    <link.network.stampiots>true</link.network.stampiots>
    <latencymanager.samplesize>10240</latencymanager.samplesize>
  </nv>
</env>

Per Engine Stats Settings

Latencies related to a particular application's transaction pipeline can be configured

Configuration Setting	Default	Description
captureTransactionLatencyStats	false	Property that enables collection of latency stats as messages flow through the AEP engine's transaction processing machinery.
captureEventLatencyStats	false	Property that globally enables collection of message latency stats as messages flow through the system. These statistics include latencies in the flow outside of transaction processing. For received messages these statistics include transmission, deserialization and dispatch costs. For sent messages these include serialization and transmission costs. When set to true, timings for messages are captured as they flow through the system. Enablement of these stats is required to collect message bus latency stats. Enabling this property can increase latency due to the overhead of tracking timestamps.
captureMessageTypeStats	false	Property that enables tracking of message statistics on a per message type basis. When set to true, timings for each message type are individually tracked as separate stats Due to their overhead, these statistics are not included in heartbeats emitted by an XVM.
messageTypeStatsLatenciesToCapture	all	Property that enables tracking of latency statistics on a per message type basis Property controlling which latency stats on a per message type basis. This property is specified as a comma separated list of values. Valid value include: all Indicates that all available per message type latency stats should be collected. none Indicates that no message type latency stats should be collected. c2o Indicates create to offer latencies should be captured. o2p Indicates offer to poll (input queueing time) should be captured. mfilt Indicates that time spent in application message filters should be captured. mpproc Indicates that time spent in the engine prior to message dispatch should be captured. mproc Indicates that the time spent in application message handlers should be captured. The values 'all' or 'none' may not be combined with other values. This value only applies when captureMessageTypeStats is true. When not specified the value defaults to all. See Also: ISrvMonAppEngineStats for details on general engine stats. ISrvMonMsgTypeStats for details on stats collected on a per message type basis. AepEngineDescriptor.html.setMessageTypeStatsLatenciesToCapture(String) for programmatic configuration DDL Reference - for configuration settings.
capturePerTransactionStats perTransactionStatsLogging	false	Configuration See Per Transaction Stats for more details.

The above settings can be configured in config.xml as follows:

Code Block

	xml
	xml

<apps>
  <templates>
    <app name="app-template">
      <captureTransactionLatencyStats>true</captureTransactionLatencyStats>
      <captureEventLatencyStats>true</captureEventLatencyStats>
      <captureMessageTypeStats>true</captureMessageTypeStats>
      <messageTypeStatLatenciesToCapture>c2o,o2p,mpproc,mproc,mfilt</messageTypeStatLatenciesToCapture>
      <capturePerTransactionStats>false</capturePerTransactionStats>
      <perTransactionStatsLogging policy="Off">
        <detachedWrite enabled="true"></detachedWrite>
      </perTransactionStatsLogging>
    </app>
  </templates>
</apps>

Anchor
_Toc221568552
_Toc221568552
Metrics Collected

...

Metric	Description
NumFlows	Total number of message flows functioning in the engine.
NumMsgsRcvdBestEffort	Total number of messages received by the engine on best-effort channels.
NumMsgsRcvdGuaranteed	Total number of messages received by the engine on guaranteed channels.
NumMsgsSourced	[EventSourcing Only] Total number of messages sourced from from the recovery log or primary agent in a cluster during agent inialization.
NumMsgsFiltered	The number of messages that were filtered.
NumDupMsgsRcvd	Total number of duplicate messages received and discarded by an engine. This metric will always be 0 if duplicate detection has been disabled via the nv.aep.duplicate.checking configuration property.
NumMsgsSentBestEffort	Total number of messages sent by the engine on best-effort channels.
NumMsgsSentGuaranteed	Total number of messages sent by the engine on guaranteed channels.
NumMsgsResent	Total number of messages retransmitted by an engine. When a backup agent assumes the role of a primary on failover, it restransmits retransmits all in-doubt messages as part of its first transaction. In-doubt messages are those messages for which positive acknowledgements have not been received from the downstream messaging components. This metric records the number of such restransmitted messages.
NumEventsRcvd	Total number of events received by an engine. Events received by an engine include message events and the non-message events, e.g. 'channel up event', 'channel down event', etc.
NumFlowEventsRcvd	Total number of flow events received by the engine. Flow events are synonymous with message events. These include message events received on message channels and message events injected for processing into the engine by the application.
NumFlowEventsProcSuccess	Total number of successfully processed flow events. A flow event is considered successfully processed if the application did not throw an exception while processing the message associated with the event.
NumFlowEventsProcFail	Total number of failed flow events. A flow event is considered to have failed processing if the application threw an unchecked exception while procesing the message associated with the event.
NumFlowEventsProcComplete	Total number of flow events whose transactions have completed. Each successfully processed flow event participates in an AEP transaction. This metric counts the number of successful flow events whose transactions have completed.
NumTransactions	Total number of transactions that have been committed or rolled back. Note: This metric is equal to NumCommitsStarted + NumRollbacks
NumCommitsStarted	Total number of transactions whose commits have been started. Note: Transactions are committed in a pipelined manner. Therefore, there can be multiple transactions in the commit pipeline. This metric counts the number of transactions that have been entered into the commit pipeline. See NumCommitsCompeted
NumCommitsCompleted	Total number of transactions whose commits have completed. This metric counts the number of transactions that have completed and exited the transaction commit pipeline. The difference between this and the NumCommitsStarted metric is the number of transactions in flight at any point in time. See NumCommitsStarted
NumSendCommitsStarted	Total number of transactions whose send commits have been started. A transaction commit pipeline is comprised of a send commit pipeline (the commit of outbound messages in the transaction) and a store commit pipeline (commit to store changes). This metric counts the number of transactions in the send commit pipeline.
NumSendCommitsCompleted	Total number of transactions whose send commits have completed. This metric counts the number of transactions whose send commits have completed and exited the send commit pipeline.
SendCommitCompletionQueueSize	Number of transaction in the send commit completion queue. Transactions that complete the send portion of the commit are passed through a send commit completion queue for sequencing before entering into the next phase of a transaction commit. This metric counts the number of transactions held in the post send commit sequencing queue at any point in time.
NumStoreCommitsStarted	Total number of transactions whose store commits have been started. A transaction commit pipeline is comprised of a send commit pipeline (the commit of outbound messages in the transaction) and a store commit pipeline (commit to store changes). This metric counts the number of transactions in the store commit pipeline.
NumStoreCommitsCompleted	Total number of transactions whose store commits have completed. This metric counts the number of transactions whose store commits have completed and exited the store commit pipeline.
StoreCommitCompletionQueueSize	Number of transaction in the store commit completion queue. Transactions that complete the store portion of the commit are passed through a store commit completion queue for sequencing before entering into the next phase of a transaction commit. This metric counts the number of transactions held in the post store commit sequencing queue at any point in time.
NumRollbacks	Number of transactions that have been rolled back.
BackupOutboundQueueSize	[EventSourcing Only] Number of messages in a backup's outbound queue. A backup outbound queue holds outbound messages sent during concurrent processing by a backup agent in an Event Sourced cluster. The messages are held in the queue as in-doubt messages until notifications are received from the primary 'acknowledging' the receipt of the messages from downstream messaging components/agents. The messages are flushed from the queue upon receipt of such notifications. This metric counts the number of messages in a backup's outbound queue. Note: The notifications are piggy-backed on other replication traffic to avoid extra network traffic. Therefore, in the absence of replication traffic, messages can remain in the backup's outbound queue even though acknowledged by the downstream messaging system until some replication traffic occurs to replicate the downstream ack notification.
BackupOutboundLogQueueSize	[EventSourcing Only] Number of messages in a backup's outbound log queue. A backup's outbound log queue holds outbound messages for the course of a single transaction. The queue is flushed to the outbound message log upon completion of the transaction. This metric holds the number of messages in the outbound log queue.
OutboundSno	The current outbound sequence number in use by the engine. Each engine instance maintains a sequence number space for messages outbound by the engine. This metric holds the current outbound sequence number.
OutboundStableSno	The current 'stable' outbound sequence number. An outbound is considered to be stable when a positive acknowledgement is received for the message from the downstream messaging system. This metric holds the sequence number of the last stable outbound message.
Message Type Specific Statistics	Message type specific statistics. The engine also (optionally driven by configuration) maintains statistics for each of the different message types flowing through the engine. The following are the different metrics collected for each message type: NumMsgsRcvdBestEffort NumMsgsRcvdGuaranteed NumMsgsSourced NumDupMsgsRcvd NumMsgsSentBestEffort NumMsgsSentGuaranteed NumMsgsResent The semantic meaning of these metrics for each message type is identical to metrics with the same name described above, except that it is local to the message type.

...

The AEP engine's statistics thread performs the following operations:

Computes higher level statistics such as metric averages.
Outputs raw and computed statistics to a trace logger.
Triggers alerts based on configured alert thresholds.

The statistics thread can be started/stopped either programmatically or via configuration parameters.

...

Each AepEngine object is associated with an AepEngineStats object that can be obtained from the engine as follows:
AepEngine.getStats()
An application can start and stop the statistics thread using the startPeriodicOutput() and stopPeriodicOutput() methods respectively exported Each AepEngineStats object. For example, the following would start the statistics thread with a periodic output and alert dispatch every 1 second.
AepEngineStats.startPeriodicOutput(1)
Correspondingly, the following would stop the statistics thread
AepEngineStats.stopPeriodicOutput()

...

An AEP engine's statistics thread can also be started via the following environment variable or system property.
nv.aep.<engineName>.stats.interval=<interval in seconds>
For example, the following would start the 'forwarder' AEP engine's statistics thread with a periodic output and alert frequency of 5 seconds.
nv.aep.forwarder.stats.interval=5
Once started administratively, the statistics thread remains active until the engine is stopped or the thread is programatically stopped.

Note
Note: When configuring using environment variables, in Unix based systems where the shell does not support "." in environment variables, the "." can be replaced by "_".

...

By default, an AEP engine does not collect message type specifc stats. To enable message type specific stats the following should be set in you DDL config:

...

<apps>
  <app name="MyApp">
    <captureMessageTypeStats>true</captureMessageTypeStats>
  <app>
</apps
 
<xvms>
  <xvm name="MyXVM">
    <hearbeats enabled=true" interval="5">
      <includeMessageTypeStats>true</includeMessageTypeStats>
    </heartbeats>
  </xvm>
<xvms>

Transaction Latencies

Transaction latencies traced by an engine stats thread includes summary statistics for various phases within the transaction processing pipeline. The meaning of these summary statistics is as follows:

...

Phase	Description
mpproc	Records the time (in microseconds) spent by the engine dispatching the message to an application.
mproc	Records latencies for application message process times (in an EventHandler).
mfilt	Records latencies for application message filtering times (by a message filter).
msend	Time spent in AepEngine.sendMessage(). The time in the AepEngine's send call. This latency will be a subset of mproc for solicited sends and it includes msendc.
msendc	Time spent in the AepEngine's core send logic. This leg includes enqueuing the message for delivery in the corresponding bus manager.
cstart	Time spent from the point the first message of a transaction is received to the time the transaction is committed.
cprolo	Time spent from the point where transaction commit is started to send or store commit, whichever occurs first. This latency measures the time taken in any bookkeeping done by the engine prior to commit the transaction to store (or for an engine without a store until outbound messages are released for delivery).
csend	The send commit latency: i.e. time from when send commit is initiated, to receipt of send completion event. This latency represents the time from when outbound messages for a transaction are released to the time that all acknowledgements for the messages are received. Because this latency includes acknowledgement time a high value for csend does not necessarily indicate that downstream latency will be affected. The Message Latencies listed below allow this value to be decomposied further.
ctrans	Time spent from the point the store commit completes to the beginning of the send commit which releases a transaction's outbound messages for delivery. If the engine doesn't have a store, then this statistic is not captured as messages are released immediately.
cstore	The store commit latency i.e. time from when store commit is initiated to receipt of store completion event. This latency includes the time spent serializing transaction contents, persisting to the store's transaction log, inter cluster replication, and replication to backup members including the replication ack. High values in cstore will impact downstream message latencies because store commit must complete before outbound messages are released for delivery. The cstore latency is further broken down in the Store Latencies listed below.
cepilo	Time spent from the point the store or the send commit completes, whichever is last, to commit completion.
cfull	Time spent from the time the first message of a transaction is received to commit completion.
tleg1	Records latencies for the first transaction processing leg. Transaction Leg One includes time spent from the point where the first message of a transaction is received to submission of send/store commit. It includes message processing and and any overhead associated with transactional book keeping done by the engine. Each transaction leg is a portion of the overall commit time that is processed on the Aep Engine's thread. The sum of the transaction leg stats are important in that they determine the overall throughput that an application can sustain in terms of transactions per second.
tleg2	Records latencies for the second transaction processing leg. Transaction Leg Two includes time spent from the point where the send/store commit completion is received to the submission of store/send commit. Each transaction leg is a portion of the overall commit time that is processed on the Aep Engine's thread. The sum of the transaction leg stats are important in that they determine the overall throughput that an application can sustain in terms of transactions per second.
tleg3	Records latencies for the third transaction processing leg. Transaction Leg Three includes time spent from the point where the store/store commit completion is received to the completion of the transaction commit. Each transaction leg is a portion of the overall commit time that is processed on the Aep Engine's thread. The sum of the transaction leg stats are important in that they determine the overall throughput that an application can sustain in terms of transactions per second.
inout	Records latencies for receipt of a message to transmission of the last outbound message.
inack	Records latencies for receipt of a message to stabilization (and upstream acknowledgement for Guaranteed).

Anchor
_Toc221568556
_Toc221568556
Message Type Specific Stats

By default, an AEP engine does not collect message type specifc stats. To enable message type specific stats the following should be set in you DDL config:

Code Block

	xml
	xml

<apps>
  <app name="MyApp">
    <captureMessageTypeStats>true</captureMessageTypeStats>
  <app>
</apps
 
<xvms>
  <xvm name="MyXVM">
    <hearbeats enabled=true" interval="5">
      <includeMessageTypeStats>true</includeMessageTypeStats>
    </heartbeats>
  </xvm>
<xvms>

Bus Connection Stats

The engine stats thread can also trace summary statistics for its message buseseach of the message bus connections managed by the engine. Each message bus connection is wrapped by a Bus Manager which handles bus connect and reconnect, and also provides transactional semantics around the bus by queuing up messages that will be sent as part of an engine transactionhandled by bus manager. The manager handles bus connection establishment, reconnect handling and message IO through the underlying connection. From the perspective on an engine bus manager, a bus connection is synonymous with an SMA bus connection. Each Bus Manager maintains statistics for across bus binding reconnects, allowing continuous stats across bus binding reconnects. The following sections break these statistics down in more detail.

...

Field	Description
NumMsgsRcvd	The number of message received by the bus.
NumMsgsInBatches	The number of message messages received by the bus that were part of a batch.
NumMsgBatchesRcvd	The number of batch message received by the bus.
NumPacketsRcvd	The number of raw packets received by the bus.
NumMsgsEnqueued	The total number of batch messages enqueued for delivery by this bus.
NumAcksSent	The total number of acknowledgement acknowledgment sent upstream for received messages by this bus.
NumStabilityRcvd	The number of stability events (acks) received by this bus.
NumStabilityBatchesRcvd	The number of batched stability events received by this bus.
NumMsgsEnqueued	The total number of batch messages enqueued for delivery by this bus.
NumMsgsSent	The total number of batch messages enqueued message that were actually sent by the bus.
NumFlushesSync	The number of times this bus has been synchronously flushed.
NumFlushesAsync	The number of times this bus has been asynchronously flushed.
NumMsgsFlushedSync	The number of messages flushed by synchronous flushes.
NumMsgsFlushedAsync	The number of messages flushed by asynchronous flushes.
NumAsyncFlushCompletions	The number of asynchronous flushes for this that have completed.
NumCommits	The number of transactions committed by the bus.
NumRollbacks	The number transactions rolled back for the bus.

Bus Disruptor Latencies

When a bus manager is configured for detached send (aka detached commit), a transaction's outbound messages are sent on the underlying SMA connection by the bus manager's I/O thread. The bus manager uses a disruptor to manage the handoff of messages to the manager's IO thread. The "Offer to Poll" latency measures the time the outbound messages are latent in the disruptor queue. High o2p latencies in a Bus Manager may indicate that messages are being released for send faster than we can actually send them through the underlying SMA connection.

Disruptor statistics follow the message and transaction statistics in the bus manager statistics trace:

Panel
Disruptor..{[<DisruptorNumUsed> of <DisruptorNumAvailable>] <DisruptorUsagePct>% (<DisruptorClaimStrategy>, <DisruptorWaitStrategy>)} ...[o2p] [sample=0, min=-1 max=-1 mean=-1 median=-1 75%ile=-1 90%ile=-1 99%ile=-1 99.9%ile=-1 99.99%ile=-1]

When a Bus Manager is configured for detached send (aka detached commit), a transaction's outbound messages are dispatched on the Bus Manager's I/O thread. The "Offer to Poll" latency measure the time from when messages are released for delivery after being stabilized in the store, until the detached bus manager picks them up for send. High o2p latencies in a Bus Manager may indicate that messages are being released for send faster than we can actually send them.

Bus Clients, Channels and Fails

After the disruptor statistics are counters indicating the number of connected clients, active channels and binding failures:

...

Messaging latencies follow the clients, channels and fails output. The following latency statistics relate to the bus manager's message handling pipeline:

Phase	Description
c2o	The create to send latencies in microseconds, the time in microseconds from message creation to when send was called for it. Note, this statistic is for outbound messages sent through a bus and is different from the c2o statistic captured for an AepEngine which tracks the create to offer times for received/injected messages offered to the application's input queue.
o2s	The send to serialize latencies in microseconds, the time from when the message was sent until it was serialized in preparation for transmission on the wire. For an engine with a store this will include the time from the application's send call, the replication hop (if there is a store) and time through the bus manager's disruptor if detached commit is enabled for the bus manager.
s	The serialize latencies in microseconds, the spent serializing the MessageView to its transport encoding.
s2w	The serialize to wire latencies in microseconds, the time post deserialize serialize to just before the message is written to the wire.
w	The wire latencies in microseconds, the time an inbound messages spent on the wire. The time spent on the wire from when the message was written to the wire by the sender to the time it was received off the wire by the receiver. Note: that this metric is subject to clock skew when the sending and receiving sides are on different hosts.
w2d	The time from when the serialized form was received from the wire to deserialization.
d	The time (in microseconds) spent deserializing the message and wrapping it in a MessageView.
d2i	The time (in microseconds) from when the message was deserialized to when it is received by the engine. This measure the time from when the bus has deserialized by the bus to when the app's engine picks it up from it's input queue (before it dispatches it to an application) handler (it includes the o2p time of the engine's disruptor). Additional time spent by the engine dispatching the message to the application handler is covered by mpproc (see the Transaction Latencies table).
o2i	The origin to receive latencies in microseconds. The time from when a message was originally created to when it was received by the binding.
w2w	The wire to wire latencies in microseconds, for outbound messages the time from when the corresponding inbound message was received off the wire to when the outbound message was written to the wire.

Event Multiplexer (Input Queue) Statistics

The Event Multiplexer reports statistics describing the latency between:

the The point at which an event is offered to the multiplexer by a thread and the point at which it is actually enqueued for processing – i.e. the o2p statistics traced per Feeder Queue.

Panel

Feeder Queues (max=16, lastDecongest=0)
...X-Server-ems1-Main (aff=[]) has 0 (decongestCount=64500)
......[o2p] [sample=0, min=-1 max=-1 mean=-1 median=-1 75%ile=-1 90%ile=-1 99%ile=-1 99.9%ile=-1 99.99%ile=-1]
...X-ODS-StoreEventMultiplexer-1 (aff=[3(s0c3t0)]) has 0 (decongestCount=64492)
......[o2p] [sample=0, min=-1 max=-1 mean=-1 median=-1 75%ile=-1 90%ile=-1 99%ile=-1 99.9%ile=-1 99.99%ile=-1]

the The time between when an event being enqueued is offered to the multiplexer for processing and actually being dequeued for processing by the multiplexer the time at which it is returned from a poll by the engine's event multiplexer thread for processing – i.e. the o2p statistics traced for the Event Multiplexer's Disruptor.

Panel
[Event Multiplexer (ems)] Disruptor (MultiThreadedSufficientCores, BusySpin).. [0 of 1,024] 0% ...[o2p] [sample=0, min=-1 max=-1 mean=-1 median=-1 75%ile=-1 90%ile=-1 99%ile=-1 99.9%ile=-1 99.99%ile=-1]

...

Message receipt and send statistics described above per message type
Histograms for application processing and filtering times.

Anchor
_Toc221568563
_Toc221568563
Appendix A - Statistics Output

...

See Appendix A

Threads

The following output threads can be enabled to trace individual types of statistics, which is useful for testing and performance tuning. Enabling these output threads is not required for collecting stats. Statistics trace output is not zero garbage, so in a production scenario it usually makes more sense to collect stats via Xvm Heartbeats, which emits zero garbage heartbeats with the above statistics.

Note

When an AepEngine is running inside of an XVM (the most common case), engine statistics are included in XVM heartbeats and should be traced using the XVM tracing facilities. The trace threads described below should not be enabled when running within an XVM as collection by the trace threads and XVM Stats collector thread can interfere with one another.

See Tracing Heartbeats

Configuration Setting	Default	Description
nv.aep.<engine>.stats.interval	0	The interval (in seconds) at which engine stats will be traced for a given engine. Can be set to a positive integer indicate the period in seconds at which the engine's stats dump thread will dump recorded engine statistics. Setting a value of 0 disables creation of the stats thread. When enabled, engine stats are traced to the logger 'nv.aep.engine.stats' at a level of `Tracer.Level.INFO`; therefore, to see dumped stats, a trace level of 'nv.aep.engine.stats.trace=info' must be enabled. NOTE: disabling the engine stats thread only stops stats from being periodically traced. It does not stop the engine from collecting stats; stats can still be collected by an external thread (such as the Talon Server which reports the stats in server heartbeats). In other words, enabling the stats thread is not a prerequisite for collecting stats, and disabling the stats reporting thread does not stop them from being collected. NOTE: while collection of engine stats is a zero garbage operation, tracing engine stats is not a zero garbage when performed by this stats thread. For latency sensitive apps, it is recommended to run in a Talon server which can collect engine stats and report them in heartbeats in a zero garbage fashion.
nv.aep.<engine>.sysstats.interval	0	The interval (in seconds) at which engine sys stats will be reported. Set to 0 (the default) to completely disable sys stats tracing for a given engine. In most cases, AEP sys stats will not be used and system level stats would be recorded in the Server Statistics from which an AEPEngine is running.
nv.event.mux.<name>.stats.interval	0	The interval (in seconds) at which multiplexer stats will be traced. Multiplexer stats can also be reported as part of the overall engine stats from the engine stats thread, so there is no need to set this to a non-zero value if nv.aep.<engine>.stats.interval is greater than zero.
nv.msg.latency.stats.interval	0	The interval (in seconds) at which message latency stats are traced. This setting has no effect if nv.msg.latency.stats is false. This allows granular tracing of just message latency stats on a per bus basis. Message latency stats can also be reported as part of the overall engine stats from the engine stats thread, so there is no need to set this to a non-zero value if nv.aep.<engine>.stats.interval is greater than zero.
nv.aep.busmanager.<engine>.<bus>.stats.interval	0	The interval (in seconds) at which bus stats will be traced. Bus stats reported as part of the overall engine stats from the engine stats thread, so there is no need to set this to a non-zero value if nv.aep.<engine>.stats.interval is greater than zero. When engine stats output is disabled this can be used to trace only bus stats for a particular message bus.

Anchor
_Toc221568564
_Toc221568564
Output Trace Logger

An AEP engine's statistics thread uses a named trace logger to log the raw and computed statistics output. The following is the name of the logger used by an engine:
nv.aep.<engineName>.stats
For example, the 'forwarder' engine uses the following logger to log statistics output:
nv.aep.forwarder.stats
The configured logger can be either a native logger or an SLF4J logger. Refer to the X Platform Tracing and Logging document for details on X Platform trace loggers.

Anchor
_Toc221568565
_Toc221568565
Appendix

...

B – Statistics Output Format

The following is a sample output of the statistics output by an AEP engine's statistics thread sections break down the output of engine statistics. The format below is used when the stats are traced by an XVM, the Stats Dump Tool or the Statistics threads described in Appendix A.

Panel

<11,33440,wsip-24-120-50-130.lv.lv.cox.net> 20130204-03:09:43:338 (inf)...[ <nv.aep.aep.forwarder.stats> STATS]
Flows{1} Msg{In{25,901(364 0) 25,901(364 0) 0(0 0) 0(0 0) 0X(0 0) (0)} Out{25,901(364 0) 25,901(364 0) 0(0 0) (25,901 25,901 0 0) (0)} Latency{InOut{0 us} InAck{0 us}}} Ev{51,806(728 0) 25,901[25,901,0,25,901](364 0)} Txn{25,902[(25,902,25,902),(25,902,25,902 (0)),(0,0 (0)),0](364 0) 0} Store{-1}
[Message Type Specific]
OrderEvent In{0(0 0) 0(0 0) 0(0 0) 0(0 0) (0)} Out{25,901(364 0) 25,901(364 0) 0(0 0) (0)}
Trade In{25,901(364 0) 25,901(364 0) 0(0 0) 0(0 0) (0)} Out{0(0 0) 0(0 0) 0(0 0) (0)}

...

Anchor
_Toc221568581
_Toc221568581
Transactions Processed, Outstanding Commits & Transaction Rate

The number of outstanding commits represents the transactions currently in flight i.e. (NumCommitsStarted - NumCommitsCompleted)

Panel

Flows{1} Msg{In{25,901(364 0) 25,901(364 0) 0(0 0) 0(0 0) 0X(0 0) (0)} Out{25,901(364 0) 25,901(364 0) 0(0 0) (25,901 25,901 0 0) (0)} Latency{InOut{0 us} InAck{0 us}}} Ev{51,806(728 0) 25,901[25,901,0,25,901](364 0)} Txn{<NumTransactions>[(<NumCommitsStarted>,<NumCommitsCompleted>),(<NumSendCommitsStarted>,<NumSendCommitsCompleted> (<SendCommitCompletionQueueSize>)),(<NumStoreCommitsStarted>,<NumStoreCommitsCompleted> (<StoreCommitCompletionQueueSize>)),<NumRollbacks>](<TransactionRate> <DeltaTransactionRate>) <NumFlowEventsPerTransaction>} Store{-1}

Anchor
_Toc221568582
_Toc221568582
Engine Store Size

...

The Talon Manual

Additional Links

Versions Compared

Old Version 4

New Version Current

Key

Configuration Settings

Stats Enablement Properties

Configuration Settings

Global Latency Stats Settings

Statistics Output Threads

Per Engine Stats Settings

Anchor
_Toc221568552
_Toc221568552
Metrics Collected

Transaction Latencies

Anchor
_Toc221568556
_Toc221568556
Message Type Specific Stats

Bus Connection Stats

Bus Disruptor Latencies

Event Multiplexer (Input Queue) Statistics

Anchor
_Toc221568563
_Toc221568563
Appendix A - Statistics Output

Threads

Anchor
_Toc221568564
_Toc221568564
Output Trace Logger

Anchor
_Toc221568565
_Toc221568565
Appendix

B – Statistics Output Format

Anchor
_Toc221568581
_Toc221568581
Transactions Processed, Outstanding Commits & Transaction Rate

Anchor
_Toc221568582
_Toc221568582
Engine Store Size

The Talon Manual

Additional Links

Page History

Versions Compared

Old Version 4

New Version Current

Key

Configuration Settings

Stats Enablement Properties

Configuration Settings

Global Latency Stats Settings

Statistics Output Threads

Per Engine Stats Settings

Anchor_Toc221568552_Toc221568552Metrics Collected

Transaction Latencies

Anchor_Toc221568556_Toc221568556Message Type Specific Stats

Bus Connection Stats

Bus Disruptor Latencies

Event Multiplexer (Input Queue) Statistics

Anchor_Toc221568563_Toc221568563Appendix A - Statistics Output

Threads

Anchor_Toc221568564_Toc221568564Output Trace Logger

Anchor_Toc221568565_Toc221568565Appendix

B – Statistics Output Format

Anchor_Toc221568581_Toc221568581Transactions Processed, Outstanding Commits & Transaction Rate

Anchor_Toc221568582_Toc221568582Engine Store Size

Anchor
_Toc221568552
_Toc221568552
Metrics Collected

Anchor
_Toc221568556
_Toc221568556
Message Type Specific Stats

Anchor
_Toc221568563
_Toc221568563
Appendix A - Statistics Output

Anchor
_Toc221568564
_Toc221568564
Output Trace Logger

Anchor
_Toc221568565
_Toc221568565
Appendix

Anchor
_Toc221568581
_Toc221568581
Transactions Processed, Outstanding Commits & Transaction Rate

Anchor
_Toc221568582
_Toc221568582
Engine Store Size