See also Transaction Log Tool for general usage information about the tool, along with command reference.

Overview

An X Platform application has two sources of data backed by transaction log files that are interesting to query: application state and inbound/outbound messaging traffic. This document discusses the tools available to query transaction logs. For brevity, the term log is used interchangeably with transaction log herein.

 

Application State: In-memory application state is modeled as a singly rooted object tree. When using State Replication, the state tree is built incrementally by a series of PUT, UPDATE and REMOVE events that a primary replicates as it updates the state tree, and that may also be persisted to the application's recovery log. For an application using Event Sourcing, application state is built up by replaying all of the application's inbound messaging traffic, with the expectation that replay will produce exactly the same state and output as the original inbound message stream.

Recovery Transaction Log: The recovery transaction log holds all of the data needed to recover an application to its last stable state from disk. In an Event Sourcing application, this transaction log holds all of the application's inbound messages. A State Replication transaction log holds all of the incremental updates to the application's state tree plus the application's unacknowledged outbound messages.

Inbound Message Transaction Log: An inbound transaction log is an audit log of an application's inbound messages.

Outbound Message Transaction Log: An outbound transaction log is an audit log of an application's outbound messages.

An application configured with recovery, inbound and outbound transaction logs is well positioned for after-the-fact historical analysis via query. Because an application often operates in collaboration with several other applications, a set of log files from multiple inter-operating applications is typically of interest to a user. A user might want to, for example, track a message's path through a set of applications; therefore the X Platform query engine allows querying multiple logs.

Querying in the X Platform can be accomplished via a Programmatic API, which provides the ability to query a set of objects using bean-path-style selectors via a QueryEngine. Querying can also be performed on the command line using the Transaction Log Tool via the xtxnlogtool scripts provided with an X distribution. Command line queries are entered using XPQL, the X Platform Query Language, a SQL-like query grammar.

The Transaction Log Query Schema

An X Platform transaction log is simply a series of Entries containing messages and/or state updates. This section discusses the mapping of such a log to a table structure suitable for query.

Tables

To understand how we query logs, one must first look at the structure of a log. A log is composed of a series of serialized Entries. These entries are made accessible for reading via the platform's RogLog.LogReader.Entry class.

For transaction logs, XPQL maps this object model into the following logical table structure to facilitate SQL-like querying. Each transaction log is modeled as a table, and each Entry in a transaction log is modeled as a row in that table. Each row's columns consist of the Entry class itself, Metadata, and a column for each Entity and/or Message type present in the transaction log. For example, in a state replication log, we might have objects such as Customer, Employee, and OrderRequest. We would then have five columns in the table that represents this log (Entry, Metadata, Customer, Employee, OrderRequest), and any given row will contain the Entry fields and Metadata for the log Entry and will have exactly one Entity or Message type column populated, e.g.:

| Entry            | Metadata        | Employee                 | Customer                 | OrderRequest |
+------------------+-----------------+--------------------------+--------------------------+--------------+
| entryType=Put    | transactionId=1 | NULL                     | cid=2, firstName='Bob'   | NULL         |
| entryType=Put    | transactionId=2 | eId=1, firstName='Bob'   | NULL                     | NULL         |
| entryType=Update | transactionId=3 | NULL                     | cid=2, firstName='Bobby' | NULL         |
| entryType=Put    | transactionId=4 | eId=2, firstName='Susan' | NULL                     | NULL         |

Similarly, an event sourcing log's table will consist of the Entry and Metadata columns plus one column for each type of Message contained within the log. This allows us to SELECT from among each of the Entity/Message types that are relevant to the log in question.

Each of these columns is a composite column. The column may be selected as-is, in which case the object itself is returned, but its fields can also be accessed via standard Java bean paths. For example, you can refer to Entry.entryType, Metadata.inboundSequenceno, or Customer.firstName. The following query serves as an example:

query>SELECT Employee.firstName, Customer.firstName, Entry.transactionId, Entry.timestamp, \
             metadata.isMessage \ 
             FROM logs WHERE firstName = 'Bob'
Query executed in 90ms. Use 'next' or 'dump' to display results
query>next 10
|  Employee.firstName|  Customer.firstName|   transactionId|                  timestamp|Metadata.isMessage|
+--------------------+--------------------+----------------+---------------------------+------------------+
|                null|                 Bob|               1|2014-01-27 05:28:41.832 PST|             false|
|                 Bob|                null|               2|2014-01-27 05:28:41.835 PST|             false|
No more results

Indexes

In order to find and retrieve the relevant rows from a log table we will create indexes against relevant fields of the table. For each log file, e.g. /rdat/warehouse.log, we will create an index file, e.g. /rdat/warehouse.idx, that will contain these indexes. Indexes can be created for any field of any Entity or Message in the log table. For example, we could index Customer.lastName or we could index Order.itemId. Each of these indexes must be declared as either unique or non-unique. This decision determines whether the index can refer to multiple rows of the table. For example, in an event sourcing system we might have a unique index on NewOrderSingle.id. We would not, however, have a unique index on the Customer.id of a state replication system because we would expect to see one row for the initial PUT of the Entity and additional rows for its UPDATE and potential REMOVE messages. An index can also be created on fields that span multiple Entities/Messages. For example, if many or even every Message had a timestamp field we could create an INDEX against timestamp and thereby allow any Message, regardless of what type, to be retrieved by timestamp.
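
For example, using the CREATE INDEX syntax covered under Creating Indexes below (the repository and type names here are illustrative):

--Unique: at most one entry per order id in an event sourcing log
CREATE UNIQUE INDEX order_id ON order_log(NewOrderSingle.id)
--Non-unique: a Customer.id can appear in PUT, UPDATE and REMOVE entries
CREATE INDEX customer_id ON warehouse(Customer.id)
--Cross-type: matches the timestamp field on any Message type that has one
CREATE INDEX msg_timestamp ON order_log(timestamp)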

Transaction Log Tool Querying

The Transaction Log Tool provides various utilities for working with X Platform logs including query via XPQL. The log tool can be launched via the xtxnlogtool.bat or xtxnlogtool.sh script provided with an X Platform distribution. See the Transaction Log Tool page for more details on launching the tool.

Opening Logs

In the transaction log tool, any open log is available for query. Logs are opened with the open command. Given a transaction log named OrderProcessingApp.log in the tool's working directory:

open OrderProcessingApp  --Now eligible for SELECT statements
close OrderProcessingApp --No longer considered in select statements

The tool also allows for opening all logs in a given directory:

open .                --Opens all logs in the tool's working directory
open /logs/backupLogs --Opens all logs in the /logs/backupLogs directory.
list logs             --Lists all logs that are currently open
close                 --Closes all open logs

XPQL Query Syntax

The query syntax is a subset of ANSI SQL SELECT. The basic form is:

SELECT <columns>
FROM <logs>
WHERE <conditions>
GROUP BY <columns>
ORDER BY <columns>
LIMIT [skip,]fetch

The WHERE, GROUP BY, ORDER BY and LIMIT clauses are optional.
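
For example, a hypothetical query exercising the optional clauses (note that ORDER BY cannot currently be combined with GROUP BY, per the ORDER BY section below):

SELECT Customer.firstName, Customer.lastName
FROM logs
WHERE Entry.entryType = 'Put'
ORDER BY Customer.lastName DESC
LIMIT 10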

Column and Field Syntax

Usually, you can specify a column and field by using the domain class's simple name and the bean path of the field, for example 'Order.orderId' (or just 'orderId' if you don't care about the type that contains the field). This is typically enough for the query parser to locate the field of interest. In cases where ambiguity arises, the full column and field syntax is specified below:

'*' | 'x_repository_name' | [Column][.][fieldPath]

where:

Column := 'Entry' | 'Metadata' | DomainType | AggregateColumn
DomainType := SimpleName | QualifiedClassName
SimpleName := A simple class name, e.g. 'Customer'
QualifiedClassName := The fully qualified class name with '.' replaced by '/', e.g. com/mycompany/MyMessage
fieldPath := A bean path for the field of interest using standard bean path case.
AggregateColumn := AggregateFunction(Column)
AggregateFunction := COUNT | MIN | MAX | SUM | AVG

When specifying a field's path, the field should be specified in a case consistent with standard bean paths. You can use the `describe` command, covered below, to list an object's fields.

Examples

Customer
Selects all Customer objects from the log. This returns the Customer object itself rather than any of its fields, which is useful in cases where you plan to dump raw results from the transaction log tool (or if you are working with the programmatic API).

Customer.firstName
Selects/matches any Customer object's firstName field. Also matches any subclass of Customer that has a firstName field. If no such field exists this results in an error. This would not, for example, select or match Employee.firstName.

Object.firstName
Selects/matches any object (i.e. any subclass of Object) that has a firstName field. An error is reported if two types have a firstName field that differs in type. For example, if Customer.firstName is a String but CustomerFlags.firstName is a boolean, this will produce an error.

firstName
An unqualified fieldPath: shorthand for Object.firstName, or, alternatively, selects a class with a simple name of 'firstName'.

Entry.object.firstName
Nested select/match via the Entry's object field for any object that has a firstName field.

Customer.address.zipCode
Select of the nested zipCode field in a Customer object.

address.zipCode
Unqualified select/match of the nested address.zipCode field in any object that has one, or select of an address class and its zipCode field. This would be an error if there existed both a class com.mycompany.address and a class Customer with an address.zipCode field.

com/packageA/Customer
com/packageB/Customer
Column type resolution can usually be achieved with just the domain object's simple name. Fully qualifying the type is necessary in cases where the type would otherwise be ambiguous.

com/mycompany/Customer.firstName
Fully qualified select/match of Customer's firstName field.

com/mycompany/Customer/firstName
Fully qualified select/match of the unconventionally named 'firstName' class.

com/mycompany/Customer$InnerClass.firstName
Fully qualified select/match of the firstName field of Customer's inner class InnerClass.

*
Can be used in a SELECT clause to select all objects. This is synonymous with specifying 'Object'.

x_repository_name
Pseudo column that can be used in a SELECT clause to select the name of the log from which an entry came, which can be useful in cases where you are querying across multiple logs.

Metadata
Selects the Metadata column (see examples below).

Entry
Selects the Entry column.

COUNT(Customer)
A count of the number of non-null Customer entries with respect to the GROUP BY clause's aggregation key.

COUNT(*)
A count of all of the entries with respect to the GROUP BY clause's aggregation key.

MIN(Customer.address.zipcode)
The least value found for this zipcode field with respect to the GROUP BY clause's aggregation key.

MAX(Customer.address.state)
The greatest value found for this state field with respect to the GROUP BY clause's aggregation key.

SUM(Employee.salary)
The arithmetic sum of all salary entries with respect to the GROUP BY clause's aggregation key.

AVG(Employee.bonus)
The arithmetic average (mean) of all employee bonuses with respect to the GROUP BY clause's aggregation key.

Entry Fields

The below fields can be selected using Entry.<column>, or just <column> if the name is not ambiguous:

| Column              | Type    | Notes |
| entryType           | enum    | Select the type of entry ('Put', 'Update', 'Send', 'Message', or 'Remove'). |
| id                  | uuid    | The id of the payload object in this entry. |
| timestamp           | long    | The timestamp at which the entry was logged. See Dates and Timestamps below for more details. |
| transactionId       | long    | The AEP transactionId for the entry. |
| stableTransactionId | long    | The stable transaction id for the entry (the id of a previous transaction that the commit of this entry marks as stable). |
| commitStart         | boolean | Whether this entry demarcates the first entry in an AEP transaction. |
| commitEnd           | boolean | Whether this entry demarcates the last entry in an AEP transaction. An AEP transaction isn't considered to be complete until the 'commitEnd' entry is written. |
| className           | String  | The fully qualified name of the payload object (e.g. message or entity). |
| simpleClassName     | String  | The simple class name of the payload object. |
| entrySize           | int     | The serialized size of the entry including the log entry metadata. |
| payloadSize         | int     | The serialized size of the contained payload object. |
| logName             | String  | The name of the log from which this entry was read. |
| filePosition        | long    | The position of the entry in the transaction log. |
| object              |         | Selects the object associated with the entry. This can be null if the entryType is 'Remove'. |
| metadata            |         | The metadata for the object associated with the entry. This is never null, even for removes. |

 

You can see a listing of the available Entry fields by issuing:

describe entry


Metadata Fields

All domain objects in a transaction log implement the IRogMetadata interface which means that all entries have metadata. A listing of metadata fields can be found by consulting the javadoc for com.neeve.rog.IRogMetadata.

The following are all valid ways of representing the isMessage field:

| Column                   | Notes |
| Metadata.isMessage       | Select the Metadata column and its isMessage field. |
| Entry.metadata.isMessage | Select the Entry column and its metadata.isMessage nested bean path. |
| metadata.isMessage       | Unqualified select of the Entry column, selecting its metadata.isMessage nested bean path. |
| Entry.object.isMessage   | Select the Entry column and its object.isMessage nested bean path (all objects implement metadata). |
| object.isMessage         | Unqualified select of the Entry column, selecting its object.isMessage nested bean path (all objects implement metadata). |
| Object.isMessage         | Wildcard select of any object type (all objects have metadata). |
| isMessage                | Unqualified select of isMessage (all objects have metadata). |

You can see a listing of the available metadata fields by issuing:

describe metadata

Note that Customer.transactionId and transactionId are different: the former selects only Customer transactionIds while the latter selects all transactionIds.
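
For example (Customer here is an illustrative domain type):

SELECT Customer.transactionId FROM logs --transactionIds of entries whose payload is a Customer
SELECT transactionId FROM logs          --transactionIds of all entries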

Listing available types

The types in the log can be displayed using the 'list' command, which displays a list of the types available through the object factories registered with the log:

list types

Describing a type's fields

The fields of a type that are available for query can be described with the describe command:

describe Customer                         --unqualified
describe com/mycompany/Customer           --fully qualified

This command will list out each of the Customer class fields. If Customer has an embedded address field of type Address, the fields of Address can be described with:

describe Customer.address                 --unqualified
describe com/mycompany/Customer.address   --fully qualified

Conditions

Conditions are of the general form: <field> <operator> <value(s)>

Some notes on the use of fields are in order with respect to performance. Deserializing an Entry's payload (Entry.object) is comparatively quite costly. Consequently, it is preferable to place predicates that do not require deserialization first in the WHERE clause. Such fields include the Entry fields listed above (e.g. Entry.timestamp, Entry.transactionId and Entry.entryType), which can be evaluated without deserializing the payload.

Hence, the following WHERE clause is an example of this approach, in that it places a predicate against Entry.timestamp before other predicates that require deserialization of the payload.

WHERE Entry.timestamp BETWEEN 1421271440000 AND 1421271449255
AND firstName = 'John'

The query engine's optimizer is being enhanced in a coming release to identify such predicates so that they can be evaluated first, regardless of where they are located in the WHERE clause.

GROUP BY Aggregation

The GROUP BY clause is optional, and when present specifies the granularity to which a query's result set is aggregated. Specifically, this means that the result set of the query is aggregated to contain one row per distinct combination of values for the columns specified in the GROUP BY clause. Consider the following data.

| id | firstName | lastName | city    |
|  1 | John      | Smith    | Boston  |
|  2 | Emily     | Smith    | Boston  |
|  3 | Alice     | Jones    | Chicago |
|  4 | John      | Williams | Chicago |

An aggregate query might yield only the distinct last names and cities:

SELECT lastName, city 
FROM logs 
GROUP BY lastName, city
| lastName | city    |
| Williams | Chicago |
| Jones    | Chicago |
| Smith    | Boston  |

An aggregate function, discussed in more detail below, could summarize the number of instances of each last name/city:

SELECT lastName, city, COUNT(*) 
FROM logs 
GROUP BY lastName, city

| lastName | city    | COUNT(*) |
| Williams | Chicago |        1 |
| Smith    | Boston  |        2 |
| Jones    | Chicago |        1 |

GROUP BY Aggregate Functions

Aggregate functions may only be applied to non-aggregate columns/fields. Further, aggregate functions are applied only to the non-null values of those columns/fields.

COUNT
Returns the cardinality of non-null values, or 0 if all values are null. Hence COUNT(Customer.middleName) will return the number of Customer entries with non-null middleName fields.

MIN
Returns the minimum non-null value obtained, or null if all values are null. MIN can only be applied to Comparable fields.

MAX
Returns the maximum non-null value obtained, or null if all values are null. MAX can only be applied to Comparable fields.

SUM
Returns the arithmetic sum of all non-null values, or null if all values are null. SUM can only be applied to numeric fields (fields that implement Number).

AVG
Returns the arithmetic average or mean of all non-null values, or null if all values are null. AVG can only be applied to numeric fields (fields that implement Number). Note, in particular for AVG, that null values are completely excluded from the computation. Hence, averaging {null, 4, null, null, 5, 0} yields (4+5+0)/3 = 3.

ORDER BY Sorting

The ORDER BY clause is also optional. It allows a sort order to be specified with respect to one or more fields. Each field's ordering may optionally be specified as either ascending, the default, or descending.

For example,

SELECT lastName, city
FROM logs
ORDER BY city

would return rows for 'Boston' before 'Chicago'. Because no additional ORDER BY fields are specified the order of rows within each such group is arbitrary. If we additionally specified,

ORDER BY city, lastName

then each row within a given city grouping would be in alphabetic order by last name. Finally, the modifier ASC, for ascending, or DESC for descending may be appended after one or more ORDER BY fields. The default ordering is ascending, so the following clause is equivalent to the one above.

ORDER BY city ASC, lastName ASC

If, instead, we wished to have the last names in descending order we would specify either of the below, equivalent clauses.

ORDER BY city, lastName DESC
ORDER BY city ASC, lastName DESC

At this time ORDER BY sorting may not be applied to aggregate queries, that is queries that have aggregate fields and/or a GROUP BY clause. This is a temporary restriction and will be removed in an upcoming release.

LIMIT -- Restricting the size of the result set

The LIMIT clause restricts the number of rows to be fetched and optionally skips a given number of rows. For example,

SELECT *
FROM logs
LIMIT 10

will only return the first 10 rows from the log(s). An optional, two argument form specifies the number of rows to be skipped followed by the number of rows to be fetched. The query,

SELECT *
FROM logs
LIMIT 50, 10

will skip 50 rows and then return the next 10. The LIMIT clause is often used with an ORDER BY clause to specify the number of entries to be fetched with respect to a particular ordering of the result set. E.g.

SELECT *
FROM logs
ORDER BY transactionDate DESC
LIMIT 10

would return the most recent 10 entries with respect to the transactionDate.

Dates and Timestamps

Timestamp Display

Where possible, date and timestamp fields displayed within the tool are formatted using the format defined by the timestampFormat and timestampTZ tool configuration properties, which can be set at any time. The current values of these properties can be viewed with the following command:

query>get -a timestamp
|       Property|                    Value|                         Description|
+---------------+-------------------------+------------------------------------+
|timestampFormat|yyyy-MM-dd HH:mm:ss.SSS z|The default timestamp format to use |
|               |                         |when displaying results (use quotes |
|               |                         |when setting).                      |
|timestampTZ    |DEFAULT                  |The timezone in which to display    |
|               |                         |timestamps. A value of 'DEFAULT'    |
|               |                         |uses the current system timezone    |
+---------------+-------------------------+------------------------------------+

See the Transaction Log Tool page for more details on these properties.

Time Parsing

Dates and timestamps can be passed as conditions pertaining to fields that have a type of date or long.

Standard SQL DATE and TIMESTAMP literals are accepted, for example:
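
DATE'2010-04-01'
TIMESTAMP'2010-04-01 15:30:00.000'
TIMESTAMP'2010-04-01 15:30:00.000 PST'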

Note that a date literal is specified by DATE'yyyy-MM-dd', and that the literal begins with DATE. A timestamp literal, likewise, must begin with TIMESTAMP. A timestamp literal includes hours, minutes, seconds and milliseconds. The timezone field is optional.

While DATE and TIMESTAMP literals are a surefire way to enter dates and timestamps into a query, they can be cumbersome, and also inconvenient in places where the timestamp output is not configured to match standard SQL syntax. Consequently, several additional mechanisms are provided for parsing timestamps specified as plain quoted strings.

Examples:

-- Select Orders with a receipt Date after April 1st 2010 at 00:00 AM:
-- using a DATE literal
WHERE Order.receiptDate > DATE'2010-04-01'
 
-- Select Orders with a receipt Date after April 1st 2010 at 00:00 AM:
WHERE Order.receiptDate > '2010-04-01'

 
-- Select yesterday's orders
WHERE Order.receiptDate BETWEEN 'yesterday' AND 'today'
 
-- Orders since yesterday
WHERE Order.receiptDate >= 'yesterday'
 
-- Select any order with a timestamp of yesterday at 00:00 AM ... NOT any order yesterday.
WHERE Order.receiptDate = 'yesterday'
 
-- Orders between yesterday at 3pm and today at 3pm:
WHERE Order.receiptDate BETWEEN 'yesterday at 3pm' AND 'today at 3PM'
 
-- Orders received in the last 30 minutes:
WHERE Order.receiptDate > '30 minutes ago'                

Some timestamps in transaction log entry metadata fields are captured with microsecond resolution. Examples of such fields are preProcessingTs, outTs, and enqueueTs. These fields have additional 'Micros' suffixed accessors that allow the microsecond level timestamp to be retrieved (e.g. preProcessingTsMicros, outTsMicros and enqueueTsMicros). The transaction log tool doesn't currently support any timestamp formatting of these microsecond level timestamps; they are displayed simply as longs and may be used for computing microsecond deltas.
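
For example, a minimal query reading these raw microsecond values (assuming the metadata in your log populates them):

SELECT simpleClassName, Metadata.preProcessingTsMicros, Metadata.outTsMicros
FROM logs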

Class Literals

A class literal consists of a string containing the fully qualified class name, for example:

WHERE salesOrder.timestamp.class IN ('java.lang.Long', 'java.util.Date')

For factory-generated classes the simple name may be used, as long as it is unambiguous, for example:

AND Entry.object.class = 'Customer'

Inner classes are referenced with the standard Java syntax, e.g. 'java.util.Map$Entry'. Anonymous inner classes use the standard naming as well, 'com.foo.Bar$1'.

Escaping

Text/string scalars are written, as in standard SQL, in single quotes. Within a string, double quotes may be used freely, and a single quote is specified as two single quotes in a row.
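
For example (the fields shown are hypothetical):

WHERE Customer.notes = 'John said "hello"' --double quotes may be used freely
WHERE Customer.lastName = 'O''Brien'       --two single quotes produce one quote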

Repositories should be given aliases that are ANSI SQL identifiers. This means that the identifier must begin with a letter and may contain letters, numbers and underscores. In some cases this is not possible; for example, if the contents of an entire directory are opened via "open ." or "open some/path", then the repositories/logs are specified by their corresponding file names, which may well not be valid SQL identifiers. In these cases, the repository names must be escaped by quoting them with double quotes.
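
For example (assuming a log file named order-processing-1.log opened via "open ."):

SELECT * FROM "order-processing-1" WHERE Entry.entryType = 'Put'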

Example Queries

First, we will query a repository, finding Customers with a particular first and last name and an entry type of 'Put'.

SELECT Customer
FROM orderProcessingApp
WHERE Customer.lastName = 'Doe'
AND Customer.firstName = 'John'
AND Entry.entryType = 'Put'

Inner classes may be specified with the standard Java class nomenclature. So, to select all of the History entries on April 1st 2013 you can issue:

SELECT Customer$History
FROM orderProcessingApp
WHERE Customer$History.lastOrderDate = DATE '2013-04-01'

We can also select fields of interest via standard Java bean-path notation. For example, to find the transaction in which Customer 'John Doe' was created, you could issue:

SELECT transactionId, Customer.firstName,
	Customer.address.zipcode
FROM orderProcessingApp
WHERE Customer.lastName = 'Doe'
AND Customer.firstName = 'John'
AND Entry.entryType = 'Put'

The following query demonstrates a query across all 'columns' in the schema. In the table model this means across all object types, so it would select any object that has matching metadata criteria:

SELECT * 
FROM orderProcessingApp
WHERE transactionId = '30'
	OR Entry.entryType = 'Message'

The following query demonstrates aggregation of sales across zip codes.

SELECT Shipment.address.zipcode, SUM(Shipment.total), AVG(Shipment.total)
FROM orderProcessingApp
WHERE Shipment.address.state = 'TX'
GROUP BY Shipment.address.zipcode

 

In cases where object names might be ambiguous, the fully qualified class name can be specified to resolve the ambiguity. The package structure is delimited with "/".

SELECT com/company/ordering/Request
FROM orderProcessingApp
WHERE com/company/ordering/Request.customerId = '12345'
AND com/company/ordering/Request.quantity >= 1
AND Entry.entryType = 'Send'
 
SELECT com/company/shipping/Request
FROM orderProcessingApp
WHERE com/company/shipping/Request.customerId = '12345'
AND com/company/shipping/Request.cost < 100.00
AND Entry.entryType = 'Message'

Finally, if you have added multiple logs to the query engine you can select all of them via the aggregate view "logs". This will return the results of querying each log, presented sequentially, i.e. log1's results, followed by log2's results, and on until the results of the last log are returned.

SELECT x_repository_name, NewOrderSingle
FROM logs
WHERE NewOrderSingle.symbol = 'IBM'
AND NewOrderSingle.id IN (4001,4007,4008)
AND NewOrderSingle.trader BETWEEN 'Johnson' AND 'Jones'

Results

The execution of a query yields a RogLogResultSet. This is a navigable container that implements Iterator and has a count() method. Note that computing count() may result in a delay in some cases. Each row of the RogLogResultSet is composed of the columns specified in the SELECT clause. The result set exposes its columns through accessors similar to those found in a java.sql.ResultSet. Specifically, this means that SELECT Customer will result in a single-column RogLogResultSet whose column holds a Customer object. Similarly, SELECT Metadata.seqNo, Entry.entryType, Address.city, Address.state would yield a RogLogResultSet composed of four columns of type Long, Enum, String and String.
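
As a sketch using the Java API described later in this document (the getLong accessor is an assumption, mirroring the getString/getShort accessors shown there):

RogLogQuery query = queryEngine.createQuery(
        "SELECT Metadata.seqNo, Entry.entryType, Address.city, Address.state FROM logs");
RogLogResultSet rs = queryEngine.execute(query);
while (rs.next()) {
    long seqNo = rs.getLong(1);         // Long column
    Object entryType = rs.getObject(2); // Enum column
    String city = rs.getString(3);      // String column
    String state = rs.getString(4);     // String column
}
rs.close();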

Rows of a ResultSet

A result set will contain one row for each relevant entry/node of the underlying repository/log. It is essential to understand what the word relevant means in this context. Consider the following query:

SELECT Customer
FROM orderProcessingApp
WHERE Customer.lastName = 'Doe'
AND Customer.firstName = 'John'
AND Entry.entryType = 'Put'

Note that the only domain entity or message referred to is Customer. Because of this, only entries that contain a Customer as their payload will be returned. If there are entries in a log that contain an Order as their root payload they will not be returned, regardless of whether there is an Order.customer field of type Customer. If, however, NO domain entities/messages are referenced, then all rows in the log/repository will be returned. Consider:

SELECT Entry.entryType, transactionId
FROM orderProcessingApp
WHERE Entry.entryType = 'Put'

This query returns every entry from a log that is a 'Put' regardless of the type of object it contains. If multiple domain entities/messages are referenced, then each underlying entry/node whose domain entity/message is referenced will be returned. Hence the following query will return entries from a log whose root payload is either a Customer or an Order.

SELECT Customer.id, Order.shipDate
FROM orderProcessingApp
WHERE transactionId= 1234

Finally, if only unqualified fields are used in a query, e.g. firstName vs the qualified Customer.firstName, a separate implicit filtering is applied. In this case only entries that contain at least one of the selected fields are returned. Consider:

SELECT firstName, address.province 
FROM orderProcessingApp
WHERE firstName = 'Bob'

In this case, only entries that contain either a firstName (e.g. Customer or Employee) or an address.province (e.g. Customer or Warehouse) would be returned. It is not necessary that these fields be populated: as long as the fields exist on the entry's type, the entry is returned, with null values for any unpopulated fields (including entries in which all of these fields are null).

Consider address.province more closely. If the class Address does not declare such a field, but its subclass CanadianAddress has a province field, then an entry's address must be a CanadianAddress for the entry to match. Entries containing a UsAddress (with, perhaps, a state field) would not be returned. Neither would entries with a null address, because Address itself does not have a province field.

Working with Results

You can navigate and display results from a select statement in the Transaction Log Tool with the next, rewind, skip, count and dump commands.

See Transaction Log Tool for more information

Count

The count command gives a count of the number of results. In its current form this results in a scan of all of the results to perform the count.

Execute 'help count' in the tool for more information

Rewind

The rewind command resets the query result to the beginning. 

Execute 'help rewind' in the tool for more information

Skip

Skips the next result or next N results, for example skip 10 would skip the next 10 results. 

Execute 'help skip' in the tool for more information

Next

The next command prints the next or next 'n' results to the console. For example:

query>SELECT Customer.firstName, Entry.transactionId, timestamp, entryType from logs WHERE firstName = 'Bob'
Query executed in 90ms. Use 'next' or 'dump' to display results
query>next 10
|  Customer.firstName|       transactionId|                  timestamp|        entryType |
+--------------------+--------------------+---------------------------+------------------+
|                 Bob|                   1|2014-01-27 05:28:41.832 PST|             'Put'|
|                 Bob|                   2|2014-01-27 05:28:41.835 PST|          'Update'|
No more results

By default the tool prints results in a textual tabular format. You can issue next -d to print the details of the corresponding entries, or specify -c to print results in csv format, which can be useful when values would otherwise be truncated.

Execute 'help next' in the tool for more information

Dump

The dump command dumps all the results to either the console or a file. When dumping to a file, the file suffix .csv will cause the output to be formatted as csv and the file suffix .html will cause the output to be formatted as an html table. Specifying the -d flag will dump detailed results for each entry that matched the query.

Execute 'help dump' in the tool for more information

Creating Indexes

The following syntax is used to create indexes (see Column and Field Syntax above for the index field syntax):

CREATE <optional:UNIQUE> INDEX <name> ON <repository>(<indexField>)
--Creates a unique index on the id field of Customer:
CREATE UNIQUE INDEX customer_u1 ON some_repo(com/foo/bar/Customer.id)
--Creates a non unique index on the address.zipcode field of Customer:
CREATE INDEX customer_n1 ON another_repo(com/foo/bar/Customer.address.zipcode) 
--Creates a unique index on Metadata.id:
CREATE UNIQUE INDEX metadata_id ON repo(Metadata.id)

Note that a default package can be specified via the SCHEMA statement. In this context either slashes or dots are accepted as delimiters.

SCHEMA com.foo.bar
SCHEMA com/foo/bar

With such a setting we could create two of the above indexes via:

CREATE INDEX customer_n1 ON repo(Customer.address.zipcode)
CREATE UNIQUE INDEX customer_u1 ON repo(Customer.id)

Java API

The Java API is in beta and still under development. The API below is fairly stable, but subject to change.

Accessing Repositories

First we must instantiate a QueryEngine and a Repository. For TransactionLogs, a RogLog serves as a repository for queries. The following snippet demonstrates creating a transaction log and adding it to a query engine.

// Assuming an orderProcessingApp.log in a folder indicated by:
// System.getProperty("NVROOT")/rdat
// OR
// System.getProperty("nv.data.directory")
RogLog log = RogLogFactory.createLog("orderProcessingApp");
log.open();
RogLogRepository repository = log.asRepository();
repository.open();
RogLogQueryEngine queryEngine = RogLogFactory.createQueryEngine();
queryEngine.addRepository(repository, "orderProcessingApp");

Any number of repositories may be added to an engine. Unless a repository is specified in a query (see query details below), queries are run against all repositories associated with an engine. A repository may be removed from the engine by its alias.

queryEngine.removeRepository("orderProcessingApp");

Fields

The fields that are indexed, selected in queries, and used in query predicates (details below) are all defined similarly. A field is defined by the bean path that leads to its value. Since we are dealing with log entries, the bean path for any message or entity will start with "object", because you would retrieve it from Entry.getObject().

An example of a field might be a customer's first name, which would be accessed in Java via String val = ((Customer)entry.getObject()).getFirstName().

RogLogField<String> customerFirstName = queryEngine.getField(Customer.class, "firstName");
RogLogField<String> customerLastName = queryEngine.getField(Customer.class, "lastName");
RogLogField<Long> customerId = queryEngine.getField(Customer.class, "customerId");

Let's look at how the customerFirstName field was created:

  1. RogLogField<String> – The type signature of the field indicates that its job is to yield a String value from a log entry.
  2. queryEngine.getField(Customer.class, ...) – The first parameter indicates that this field applies only to entries whose root domain element is a Customer. This field does not apply to log entries whose root payload is an Order, regardless of whether there is an Order.customer field.
  3. queryEngine.getField(..., "firstName") – The 2nd parameter indicates the bean path from the log entry's contained entity or message to the field. In Java this corresponds to: ((Customer)entry.getObject()).getFirstName().

A field could also have a nested path, such as ((Customer)entry.getObject()).getAddress().getZipcode().

RogLogField<Short> customerZipcode = queryEngine.getField(Customer.class, "address.zipcode");

Aggregate Fields

Aggregate fields are specified directly from their parent field. Hence, using the above customerZipcode (and an analogous employeeSalary field obtained the same way), we could effect "COUNT(Customer.address.zipcode)" in the Java API with, e.g.

customerZipcode.count()
customerZipcode.min()
customerZipcode.max()
employeeSalary.sum()
employeeSalary.average()

Creating Indexes

Either unique or non-unique indexes may be created. Indexes are defined as bean paths based on an entity or message. They are typed by their entity/message as well as the final field to which they refer.

RogLogField<String> customerZipcode = queryEngine.getField(Customer.class, "address.zipcode");
RogLogField<Integer> customerId = queryEngine.getField(Customer.class, "customerId");
queryEngine.createIndex(customerZipcode, false); // non-unique index
queryEngine.createIndex(customerId, true); // unique index

Queries

The simplest way to execute queries using the API is to pass in an XPQL query:

// Select All Customers from open logs with a customer id of 2:
RogLogQuery query = queryEngine.createQuery("SELECT * FROM logs WHERE Customer.customerId = 2");
RogLogResultSet rs = queryEngine.execute(query);
while (rs.next()) {
    Customer customer = (Customer)rs.getLogEntry().getObject();
    System.out.println(customer.toJsonString());
}
rs.close(); 

The above query will select, from all logs added to the query engine, every log entry that contains a Customer object with a customerId of 2.

Constructing Queries Programmatically

For more complex use cases one can use the query engine's fluent APIs to build up queries programmatically. The following examples demonstrate the fluent API.

//Creating a query:
RogLogQuery query = queryEngine.createQuery();
query.select(customerFirstName)
     .select(customerLastName)
     .select(customerZipcode)
     .from("orderProcessingApp")
     .where(customerId.is((long)2));
RogLogResultSet rs = queryEngine.execute(query);

// do something with results:
while (rs.next()) {
    Customer customer = (Customer) rs.getLogEntry().getObject();
    System.out.println(customer.toJsonString());
}

// Close the result set
rs.close();


//Same query created using xpql
query = queryEngine.createQuery("SELECT Customer.firstName, Customer.address.zipcode " +
        "FROM orderProcessingApp WHERE Customer.customerId = 2");

Note, that because the QueryEngine may need to allocate substantial resources to serve query results, application code must close the returned ResultSet when done working with it. Failing to close a result set may lead to resource leaks.
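
A defensive pattern, sketched with the types shown above, is to close the result set in a finally block:

RogLogResultSet rs = queryEngine.execute(query);
try {
    while (rs.next()) {
        // ... work with the current row ...
    }
} finally {
    rs.close(); // always release the query's resources
}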

 

Compound predicates may be formed with the conjunctions AND and OR, which can be built fluently from another Predicate. These may be nested arbitrarily. For example (a sketch: an or() conjunction is assumed to mirror the and() conjunction used elsewhere on this page):
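
query = queryEngine.createQuery().select(customerFirstName)
        .from("orderProcessingApp")
        .where(customerLastName.is("Doe")
                .and(customerFirstName.is("John")
                        .or(customerFirstName.is("Jane"))));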

And, the SQL equivalent of "SELECT *" is available via RogLogQueryFieldResolver.SELECT_ALL_FIELD:

RogLogField<Entry.Type> entryType = queryEngine.getField("entryType");
query = queryEngine.createQuery().select(RogLogQueryFieldResolver.SELECT_ALL_FIELD)
        .from("orderProcessingApp")
        .where(customerLastName.is("Doe")
                .and(customerFirstName.like("J%"))
                .and(entryType.is(Entry.Type.Put))
        );
rs = queryEngine.execute(query);
rs.close();

This will yield a RogLogResultSet with one column for each entity/message type found within the repository. It must be noted that any given log entry, even the very last, could contain a new type of entity/message and hence add a new column to the ResultSet.

A GROUP BY clause is specified via the groupBy() method on the query:

 

RogLogField<Long> timestamp = queryEngine.getField("timestamp");
query = queryEngine.createQuery().select(customerZipcode)
        .select(customerId.count())
        .select(customerId.min())
        .select(customerId.max())
        .from("logs")
        .where(timestamp.greaterThan(UtlDataTypes.parseAsDate("one day ago").getTime()))
        .groupBy(customerZipcode);
rs = queryEngine.execute(query);
rs.close();

An ORDER BY clause is indicated by calls to the orderBy() method on the query. This method has a two-parameter form that takes a field as well as a SortOrder value (ascending or descending). The single-parameter form takes only a field and is equivalent to calling the two-parameter version with SortOrder.ASCENDING as the second parameter.

query = queryEngine.createQuery().select(customerLastName)
        .select(customerFirstName)
        .from("logs")
        .orderBy(customerLastName, SortOrder.ASCENDING);

A LIMIT clause is specified by a call to either the single or two parameter method limit() on the query.

query = queryEngine.createQuery().select(customerLastName)
        .select(customerFirstName)
        .from("logs")
        .orderBy(customerLastName, SortOrder.ASCENDING)
        .limit(1);

Working with Query Results

The ResultSet that we obtain from a query is a subset of the standard Java ResultSet API. To that end, it allows iteration by repeated calls to next(), and column values may be accessed via the familiar ResultSet getters. Note that column indexes are 1-based. For example, if a query selected Customer as its single column, its results could be read via:

query = queryEngine.createQuery();
query.select(queryEngine.getField("entry.object"))
        .from("orderProcessingApp");
rs = queryEngine.execute(query);
while (rs.next()) {
    Customer customer = rs.getObject(1);
}

Similarly, the first sample query in this section specifies columns for customerFirstName, customerLastName and customerZipcode. This query's ResultSet could be read via:

 while(rs.next()) {
	String firstName = rs.getString(1);
	String lastName = rs.getString(2);
	Short zipcode = rs.getShort(3);
 }

Finally, the underlying log entry containing both the object and metadata may be obtained via a call to getLogEntry(), so regardless of the fields selected the corresponding entry will always be available (unless the query is an aggregate query). 

 while (rs.next()) {
    Entry entry = rs.getLogEntry();
    IRogMetadata metadata = entry.getMetadata();
    Object object = entry.getObject();
}