The Talon Manual

Comment: Updating config samples to use xvm rather than server elements. Fixing typos.

...

Heartbeats for an XVM can be enabled via DDL XML using the <heartbeats> element: 

Code Block
xml
xml
<xvms>
  <xvm name="my-xvm">
    <heartbeats enabled="true" interval="5">
      <collectNonZGStats>true</collectNonZGStats>
      <collectIndividualThreadStats>true</collectIndividualThreadStats>
      <collectSeriesStats>true</collectSeriesStats>
      <collectSeriesDatapoints>false</collectSeriesDatapoints>
      <maxTrackableSeriesValue>100000000</maxTrackableSeriesValue>
      <includeMessageTypeStats>false</includeMessageTypeStats>
      <collectPoolStats>true</collectPoolStats>
      <poolDepletionThreshold>1.0</poolDepletionThreshold>
      <logging enabled="true"></logging>
      <tracing enabled="true"></tracing>
    </heartbeats>
  </xvm>
</xvms>
Configuration Setting | Default | Description
enabled
false

Enable or disable server stats collection and heartbeat emission.

Note

Collection of stats and emission of heartbeats can impact application performance from both a latency and throughput standpoint. For applications that are particularly sensitive to performance, it is a good idea to compare performance with and without heartbeats enabled to understand the overhead that is incurred by enabling heartbeats.

interval
1000

The interval in seconds at which server stats will be collected and emitted.

collectNonZGStats
true

Some statistics collected by the stats collection thread require creating a small amount of garbage. This can be set to false to suppress collection of these stats.

collectIndividualThreadStats
true

Indicates whether heartbeats will contain stats for each active thread in the JVM. Individual thread stats are useful

collectSeriesStats
true

Indicates whether or not series stats should be included in heartbeats.

collectSeriesDatapoints
false

Indicates whether or not series stats should report the data points captured for a series statistic.

Warning: Enabling this setting includes every datapoint collected in a series in heartbeats, which can make emitted heartbeats very large and slow down their collection. Running with this enabled in production is not recommended.

maxTrackableSeriesValue
10 minutes

The maximum value (in microseconds) that can be tracked for reported series histogram timings. Datapoints above this value will be downsampled to this value, but will be reflected in the max value reported in an interval.

includeMessageTypeStats
false

Sets whether or not message type stats are included in heartbeats (when enabled for the app).

When captureMessageTypeStats is enabled for an app, the AepEngine will record select statistics on a per message type basis. Because inclusion of per message type stats can significantly increase the size of heartbeats, inclusion in heartbeats is disabled by default.

Tip: For message type stats to be included in heartbeats, captureMessageTypeStats must be set to true for the app (capture is disabled by default because recording the stats is costly) and includeMessageTypeStats must be set to true (inclusion is disabled by default because emitting them is costly).

collectPoolStats
true

Indicates whether or not pool stats are collected by the XVM.

poolDepletionThreshold
1.0

Sets the percentage decrement by which a preallocated pool must drop to be included in a server heartbeat. Setting this to a value greater than 100 or less than or equal to 0 disables depletion threshold reporting.

This gives monitoring applications advance warning when a preallocated pool may soon be exhausted. By default the depletion threshold triggers inclusion in heartbeats at every 1% depletion of the preallocated count. This can be changed by setting the configuration property nv.server.stats.pool.depletionThreshold to a float value between 0 and 100.

For example:
If a pool is preallocated with 1000 items and this property is set to 10, pool stats will be emitted for the pool each time a heartbeat occurs and the pool has dropped below a 10% threshold of the preallocated size, e.g. at 900, 800, 700, until its size reaches 0 (at which point subsequent misses would cause it to be included on every heartbeat).

logging
false

Configures binary logging of heartbeats.

Binary heartbeat logging provides a means by which heartbeat data can be captured in a zero garbage fashion. Collection of such heartbeats can be useful in diagnosing performance issues in running apps.

tracing
false

Configures trace logging of heartbeats.

Enabling textual tracing of heartbeats is a useful way of quickly seeing data in heartbeats for applications that aren't monitoring xvm heartbeats remotely. Textual trace of heartbeats is not zero garbage and is therefore not suitable for applications that are latency sensitive.
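
As a rough illustration of how these settings combine, the sketch below shows a heartbeat configuration for a latency-sensitive XVM. It uses only the elements listed in the table above. The specific values (a 10% depletion threshold, garbage-creating stats switched off, tracing explicitly set to false) are illustrative assumptions, not recommended settings.

```xml
<xvms>
  <xvm name="my-xvm">
    <heartbeats enabled="true" interval="5">
      <!-- skip the stats whose collection creates garbage -->
      <collectNonZGStats>false</collectNonZGStats>
      <!-- keep series stats, but do not report every captured datapoint -->
      <collectSeriesStats>true</collectSeriesStats>
      <collectSeriesDatapoints>false</collectSeriesDatapoints>
      <!-- report a preallocated pool at each 10% drop (900, 800, ... for 1000 items) -->
      <collectPoolStats>true</collectPoolStats>
      <poolDepletionThreshold>10.0</poolDepletionThreshold>
      <!-- zero-garbage binary logging on, textual tracing off -->
      <logging enabled="true"></logging>
      <tracing enabled="false"></tracing>
    </heartbeats>
  </xvm>
</xvms>
```

Since heartbeat collection itself has a cost, a configuration like this is best compared against a run with heartbeats disabled before it is relied on.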

Enabling Global Stats

An XVM collects stats that are enabled for the applications that it contains. The following stats can be enabled and reported in heartbeats:

...

By default all server statistics tracers are disabled, as trace logging is not zero garbage and introduces CPU overhead in computing statistics. While tracing heartbeats isn't recommended in production, enabling server statistics trace output can be useful for debugging and performance tuning. To enable it, configure the appropriate tracers at the debug level. See the Output Trace Loggers section for more detail.

Code Block
languagexml
<xvms>
  <xvm name="my-xvm">
    <heartbeats enabled="true" interval="5">
      <logging enabled="true"></logging>
      <tracing enabled="true"></tracing>
    </heartbeats>
  </xvm>
</xvms>
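
The individual tracer flags are each shown on their own in the sections that follow. Assuming they can be combined under a single <tracing> element (this page does not show that explicitly), a debugging configuration that traces every stat category might look like the following sketch:

```xml
<xvms>
  <xvm name="my-xvm">
    <heartbeats enabled="true" interval="5">
      <!-- thread and pool stats must be collected before they can be traced -->
      <collectIndividualThreadStats>true</collectIndividualThreadStats>
      <collectPoolStats>true</collectPoolStats>
      <tracing enabled="true">
        <traceSysStats>true</traceSysStats>
        <traceThreadStats>true</traceThreadStats>
        <tracePoolStats>true</tracePoolStats>
        <traceUserStats>true</traceUserStats>
        <traceAppStats>true</traceAppStats>
      </tracing>
    </heartbeats>
  </xvm>
</xvms>
```

Because textual tracing is not zero garbage, a configuration like this is better suited to debugging and tuning runs than to production.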

Binary Heartbeat Logging

Applications that are latency sensitive might prefer to leave all tracers disabled to avoid unnecessary allocations and the associated GC activity. As an alternative, it's possible to enable logging of zero-garbage heartbeat messages to a binary transaction log:

Code Block
languagexml
<xvms>
  <xvm name="my-xvm">
    <heartbeats enabled="true" interval="5">
      <logging enabled="true">
        <storeRoot>/path/to/heartbeat/log/directory</storeRoot>
      </logging>
    </heartbeats>
  </xvm>
</xvms>

When a storeRoot is not set, an xvm will log heartbeats to {XRuntime.getDataDirectory}/server-heartbeats/<xvm-name>-heartbeats.log, which can then be queried and traced from a separate process using the Stats Dump Tool.

...

Appears in trace output when nv.server.stats.enable=true and nv.server.stats.sys.trace=debug

Code Block
xml
xml
<xvms>
  <xvm name="my-xvm">
    <heartbeats enabled="true" interval="5">
      <tracing enabled="true">
        <traceSysStats>true</traceSysStats>
      </tracing>
    </heartbeats>
  </xvm>
</xvms>
No Format
[System Stats]
Sat May 13 12:14:03 PDT 2017 'market' server (pid=54449) 2 apps (collection time=0 ns)
System: 20 processors, load average: 0.73 (load 0.10 process, 0.10 total system)
Memory (system): 94.4G total, 89.8G free, 5.5G committed (Swap: 96.6G total, 96.6G free)
Memory (proc): HEAP 1.5G init, 522M used, 1.5G commit, 1.5G max NON-HEAP 2M init, 47M used, 48M commit, 0K max
Disk:
  [/ Total: 49.2GB, Usable:  18GB, Free:  18GB]
  [/dev/shm Total: 47.2GB, Usable: 47.2GB, Free: 47.2GB]
  [/boot Total: 484.2MB, Usable: 422.4MB, Free: 422.4MB]
  [/home Total: 405.2GB, Usable: 267GB, Free: 267GB]
  [/distributions Total: 196.9GB, Usable: 8.1GB, Free: 8.1GB]
Threads: 20 total (16 daemon) 21 peak
JIT: HotSpot 64-Bit Tiered Compilers, time: 2959 ms
GC:
...ParNew [0 collections, commulative time: 0 ms]
...MarkSweepCompact [1 collections, commulative time: 54 ms]

...

Individual thread stats can be traced by setting the following in DDL:

Code Block
xml
xml
<xvms>
  <xvm name="my-xvm">
    <heartbeats enabled="true" interval="5">
      <collectIndividualThreadStats>true</collectIndividualThreadStats>
      <tracing enabled="true">
        <traceThreadStats>true</traceThreadStats>
      </tracing>
    </heartbeats>
  </xvm>
</xvms>

When enabled the following stats are traced to the console. 

...

Pool stats can be traced by setting the following in DDL:

Code Block
xml
xml
<xvms>
  <xvm name="my-xvm">
    <heartbeats enabled="true" interval="5">
      <collectPoolStats>true</collectPoolStats>
      <tracing enabled="true">
        <tracePoolStats>true</tracePoolStats>
      </tracing>
    </heartbeats>
  </xvm>
</xvms>

To reduce the size of heartbeats, Pool Stats for a given pool are only included when:

...

Stats that you define in your application are emitted as well; they can be included in trace output with the following configuration:

Code Block
xml
xml
<xvms>
  <xvm name="my-xvm">
    <heartbeats enabled="true" interval="5">
      <tracing enabled="true">
        <traceUserStats>true</traceUserStats>
      </tracing>
    </heartbeats>
  </xvm>
</xvms>

Trace Output

No Format
[App (ems) User Stats]
...Gauges{
......EMS Messages Received: 142604
......EMS Orders Received: 35651
...}
...Series{
......[In Proc Tick To Trade(sno=35651, #points=150, #skipped=0)
.........In Proc Tick To Trade(interval): [sample=150, min=72 max=84 mean=75 median=75 75%ile=77 90%ile=79 99%ile=83 99.9%ile=84 99.99%ile=84]
.........In Proc Tick To Trade (running): [sample=35651, min=72 max=2000 mean=93 median=76 75%ile=82 90%ile=111 99%ile=227 99.9%ile=805 99.99%ile=1197]
......[In Proc Time To First Slice(sno=35651, #points=150, #skipped=0)
.........In Proc Time To First Slice(interval): [sample=150, min=85 max=98 mean=88 median=88 75%ile=90 90%ile=92 99%ile=95 99.9%ile=98 99.99%ile=98]
.........In Proc Time To First Slice (running): [sample=35651, min=84 max=4469 mean=249 median=88 75%ile=95 90%ile=133 99%ile=283 99.9%ile=3628 99.99%ile=4143]
...}

...

The AEP engine stats underlying your application are also included in heartbeats. Tracing of AEP stats can be enabled with the following:

Code Block
xml
xml
<xvms>
  <xvm name="my-xvm">
    <heartbeats enabled="true" interval="5">
      <tracing enabled="true">
        <traceAppStats>true</traceAppStats>
      </tracing>
    </heartbeats>
  </xvm>
</xvms>

Trace Output

The trace output

...