www.espertech.comDocumentation
A named window is a globally-visible data window. A table is a globally-visible data structure organized by primary key or keys.
Named windows and tables both offer a way to share state between statements and are stateful. Named windows and tables have differing capabilities and semantics.
To query a named window or table, simply use the named window name or table name in the from
clause of your statement,
including statements that contain subqueries, joins and outer-joins.
Certain clauses operate on either a named window or a table, namely the
on-merge
, on-update
, on-delete
and on-select
clauses.
The fire-and-forget queries also operate on both named windows and tables.
Both named windows and tables can have columns that hold events as column values, as further described in Section 6.11, “Events as Property”.
Named Windows are introduced in Section 2.14.1, “Named Windows”.
A named window is a global data window that can take part in many statements and that can be inserted-into and deleted-from by multiple statements. A named window holds events of the same type or supertype, unless used with a variant stream.
The create window
clause declares a new named window. The named window starts up empty unless populated from an existing named window at time of deployment. Events must be inserted into the named window using the insert into
clause. Events can also be deleted from a named window via the on delete
clause.
Events enter the named window by means of insert into
clause of a select
statement. Events leave a named window either because the expiry policy of the declared data window removes events from the named window, or through statements that use the on delete
clause to explicitly delete from a named window.
A named window may also decorate an event to preserve original events as described in Section 5.10.4, “Decorated Events” and Section 6.2.2.1, “Named Windows Holding Decorated Events”.
To tune subquery performance when the subquery selects from a named window, consider the hints discussed in Section 5.11.8, “Hints Related to Subqueries”.
Tables are introduced in Section 2.14.2, “Tables”.
A table is a data structure that is globally visible and that holds state.
The columns of a table can store aggregation state, allowing for co-location of event data with aggregation state. Other statements can directly create and update the shared aggregation state. Statements can also query the aggregation state conveniently. Aggregation state can include comprehensive state such as for example a large matrix of long-type values for use in a Count-min sketch approximation. Common aggregation state can be updated by multiple statements.
Use the create table
clause to declare a new table.
The atomicity guarantees under multi-threaded evaluation are as follows. For a given statement, a table row or rows either exists or do not exist, consistently, for the duration of the evaluation of an event or timer against a context partition of a statement. The same is true for updates in that for a given context partition of a statement, each table row is either completely updated or not updated at all for the duration of an evaluation. Stream-level filter expressions against tables are not part of statement evaluation and the same atomicity applies to stream-level filter expressions.
As a general rule-of-thumb, if you need to share a data window between statements, the named window is the right approach. If however rows are organized by primary key or hold aggregation state, a table may be preferable. Statements allow the combined use of both.
In summary the difference comes from named windows holding immutable events, versus tables which hold update-in-place rows and columns that can hold plain values and complex aggregation state side-by-side.
One important difference between named windows and tables is in the data that a row holds: While named windows hold events, tables can hold additional derived state.
For example, a table column can hold rich derived state such as a distinct values set and rich aggregation state such as the state of a Count-min sketch approximation aggregation (a large matrix of long-type values).
// Declare a table to hold a Count-min sketch approximate count per feed create table AppoximateCountPerWord (feed string, approx countMinSketch())
A second difference between named windows and tables is the organization of rows. For named windows, the organization of rows follows the data window declaration. Tables, on the other hand, can be organized by a primary key or by multiple primary keys that make up a compound key.
For example, if your declaration specifies a sliding time window to hold 10 seconds of stock tick events then the rows are held in a sliding time window, i.e. a list or queue according to arrival order.
// Declare a named window to hold 10 seconds of stock tick events create window TenSecOfTicksWindow#time(10 sec) as StockTickEvent
An iterator for a named window returns rows in the order as provided by the data window(s) declared for the named window. An iterator for a table returns rows in an unpredictable order.
Only named windows provide an insert and remove stream to other statements. Tables do not provide an insert and remove stream.
For example, considering the TenSecOfTicksWindow
named window declared above,
the following statement outputs the current count each time events enter or leave the named window.
select count(*) from TenSecOfTicksWindow
Also for example, considering the AppoximateCountPerWord
table declared above, the following EPL does not output any rows
when table rows gets inserted, updated or deleted and only outputs rows when the statement is iterated:
// does not continously output for table changes select * from AppoximateCountPerWord
As named windows hold events and events are immutable, when an update statement updates events held in a named window, the runtime performs a logical copy operation (copy-on-write, as configured for the type) of each updated event, and only modifies the newly created event, preserving the immutable original event.
Data in tables are updated in-place. There is no copy operation for table rows.
For named windows, the data window declared for the named window instructs the runtime to expire and remove events from the named window.
Events can also be removed via on-merge
, on-delete
and fire-and-forget delete
.
For tables, row can only be removed via on-merge
, on-delete
, on-select-and-delete
and fire-and-forget delete
.
The create window
statement creates a named window by specifying a window name and one or more data windows, as well as the type of event to hold in the named window.
There are two syntaxes for creating a named window: The first syntax allows modeling a named window after an existing event type or an existing named window. The second syntax is similar to the SQL create-table syntax and provides a list of column names and column types.
A new named window starts up empty. It must be explicitly inserted into by one or more statements, as discussed below. A named window can also be populated at time of creation from an existing named window.
Your application can only undeploy the statement that created the named window after all other statements that use the named window are also undeployed.
The create window
statement posts to listeners any events that are inserted into the named window as new data. The statement posts all deleted events or events that expire out of the data window to listeners as the remove stream (old data). The named window contents can also be iterated on via the pull API to obtain the current contents of a named window.
The benefit of modeling a named window after an existing event type is that event properties can be nested, indexed, mapped or other types that your event objects may provide as properties, including the type of the underlying event itself. Also, using the wildcard (*) operator means your EPL does not need to list each individual property explicitly.
The syntax for creating a named window by modeling the named window after an existing event type, is as follows:
[context context_name] create window window_name.window_spec [as] [select list_of_properties from] event_type_or_windowname [insert [where filter_expression]]
The window_name you assign to the named window can be any identifier. The name should not already be in use as an event type or stream name or table name.
The window_spec are one or more data windows that define the expiry policy for removing events from the named window. Named windows must explicitly declare a data window. This is required to ensure that the policy for retaining events in the data window is well defined. To keep all events, use the keep-all window: It indicates that the named window should keep all events and only remove events from the named window that are deleted by on delete
, on merge
or fire-and-forget delete
. Data windows are listed in Chapter 14, EPL Reference: Data Windows.
The select
clause and list_of_properties are optional. If present, they specify the column names and, implicitly by definition of the event type, the column types of events held by the named window. Expressions other than column names are not allowed in the select
list of properties. Wildcards (*) and wildcards with additional properties can also be used.
The event_type_or_windowname is required if using the model-after syntax. It provides the name of the event type of events held in the data window, unless column names and types have been explicitly selected via select
. The name of an (existing) other named window is also allowed here. Please find more details in Section 6.2.1.4, “Populating a Named Window From an Existing Named Window”.
Finally, the insert
clause and optional filter_expression are used if the new named window is modelled after an existing named window, and the data of the new named window is to be populated from the existing named window upon deployment. The optional filter_expression can be used to exclude events.
You may refer to a context by specifying the context
keyword followed by a context name. Contexts are described in more detail at Chapter 4, Context and Context Partitions. The effect of referring to a context is that your named window operates according to the context dimensional information as declared for the context. For usage and limitations please see the respective chapter.
The next statement creates a named window OrdersNamedWindow
for which the expiry policy is simply to keep all events. Assume that the event type 'OrderMapEventType' has been configured. The named window is to hold events of type 'OrderMapEventType':
create window OrdersNamedWindow#keepall as OrderMapEventType
The below sample statement demonstrates the select
syntax. It defines a named window in which each row has the three properties 'symbol', 'volume' and 'price'. This named window actively removes events from the window that are older than 30 seconds.
create window OrdersTimeWindow#time(30 sec) as select symbol, volume, price from OrderEvent
In an alternate form, the as
keyword can be used to rename columns, and constants may occur in the select-clause as well:
create window OrdersTimeWindow#time(30 sec) as select symbol as sym, volume as vol, price, 1 as alertId from OrderEvent
The select
-clause when used with create window
provides types information only. You must still use on-merge
or insert-into
to insert events.
The second syntax for creating a named window is by supplying column names and types:
[context context_name] create window window_name.window_spec [as] (column_name column_type [,column_name column_type [,...])
The column_name is an identifier providing the event property name. The column_type is also required for each column. Valid column types are listed in Section 5.17.1, “Creating Variables: The Create Variable Clause” and are the same as for variable types.
For attributes that are array-type append []
(left and right brackets).
The next statement creates a named window:
create window SecurityEvent#time(30 sec) (ipAddress string, userId String, numAttempts int, properties String[])
Named window columns can hold events by declaring the column type as the event type name. Array-type in combination with event-type is also supported.
The next two statements declare an event type and create a named window with a column of the defined event type:
create schema SecurityData (name String, roles String[])
create window SecurityEvent#time(30 sec) (ipAddress string, userId String, secData SecurityData, historySecData SecurityData[])
Whether the named window uses a Map, Object-array, JSON or Avro event representation for the rows can be specified as follows. If the create-window statement provides the @EventRepresentation(objectarray)
annotation the runtime maintains named window rows as object array, for example.If no annotation is provided, the runtime uses the configured default event representation as discussed in Section 17.4.9.1, “Default Event Representation”.
The following statement instructs the runtime to represent FooWindow rows as object arrays:
@EventRepresentation(objectarray) create window FooWindow#time(5 sec) as (string prop1)
There is no syntax to drop or remove a named window.
Undeploying the deployment that created the named window also removes the named window.
Your statement may specify the name of an existing named window when creating a new named window, and may use the insert
keyword to indicate that the new named window is to be populated from
the events currently held by the existing named window.
For example, and assuming the named window OrdersNamedWindow
already exists, this statement creates a new named window ScratchOrders
and populates all orders in OrdersNamedWindow
into the new named window:
create window ScratchOrders#keepall as OrdersNamedWindow insert
The where
keyword is also available to perform filtering, for example:
create window ScratchBuyOrders#time(10) as OrdersNamedWindow insert where side = 'buy'
The insert into
clause inserts events into named windows. Your application must ensure that the column names and types match the declared column names and types of the named window to be inserted into.
For inserting into a named window and for simultaneously checking if the inserted row already exists in the named window or for atomic update-insert operation on a named window, or for targeting a specific partition, you must use on-merge
as described in Section 6.8, “Triggered Upsert Using the On-Merge Clause”. On-merge
is similar to the SQL merge
clause and provides what is known as an "Upsert" operation: Update existing events or if no existing event(s) are found then insert a new event, all in one atomic operation provided by a single statement.
This example first creates a named window using some of the columns of an OrderEvent event type:
create window OrdersWindow#keepall as select symbol, volume, price from OrderEvent
The insert into the named window selects individual columns to be inserted:
insert into OrdersWindow(symbol, volume, price) select name, count, price from FXOrderEvent
An alternative form is shown next:
insert into OrdersWindow select name as symbol, vol as volume, price from FXOrderEvent
Following above statement, the runtime enters every FXOrderEvent arriving into the runtime into the named window 'OrdersWindow'.
The following statements create a named window for an event type backed by a Java class and insert into the window any 'OrderEvent' where the symbol value is IBM:
create window OrdersWindow#time(30) as com.mycompany.OrderEvent
insert into OrdersWindow select * from com.mycompany.OrderEvent(symbol='IBM')
The last example adds one column named 'derivedPrice' to the 'OrderEvent' type by specifying a wildcard, and uses a user-defined function to populate the column:
create window OrdersWindow#time(30) as select *, price as derivedPrice from OrderEvent
insert into OrdersWindow select *, MyFunc.func(price, percent) as derivedPrice from OrderEvent
Event representations based on Java base classes or interfaces, and subclasses or implementing classes, are compatible as these statements show:
// create a named window for the base class create window OrdersWindow#unique(name) as select * from ProductBaseEvent
// The ServiceProductEvent class subclasses the ProductBaseEvent class insert into OrdersWindow select * from ServiceProductEvent
// The MerchandiseProductEvent class subclasses the ProductBaseEvent class insert into OrdersWindow select * from MerchandiseProductEvent
To avoid duplicate events inserted in a named window and atomically check if a row already exists, use on-merge
as outlined in Section 6.8, “Triggered Upsert Using the On-Merge Clause”. An example:
on ServiceProductEvent as spe merge OrdersWindow as win where win.id = spe.id when not matched then insert select *
Decorated events hold an underlying event and add additional properties to the underlying event, as described further in Section 5.10.4, “Decorated Events”.
This sample statement creates a named window that decorates OrderEvent events by adding an additional property named priceTotal
to each OrderEvent. A matching insert into
statement is also part of the sample:
create window OrdersWindow#time(30) as select *, price as priceTotal from OrderEvent
insert into OrdersWindow select *, price * unit as priceTotal from ServiceOrderEvent
The property type of the additional priceTotal
column is the property type of the existing price
property of OrderEvent.
A named window can be referred to by any statement in the from
clause of the statement. Filter criteria can also be specified. Data windows cannot be specified when selecting from a named window (the special derived-value windows are allowed however).
A statement selecting all events from a named window OrdersNamedWindow
is shown next. The named window must first be created via the create window
clause before use.
select * from OrdersNamedWindow
The statement as above simply receives the unfiltered insert stream of the named window and reports that stream to its listeners. The iterator
method returns all events in the named window, if any.
If your application desires to obtain the events removed from the named window, use the rstream
keyword as this statement shows:
select rstream * from OrdersNamedWindow
The next statement derives an average price per symbol for the events held by the named window:
select symbol, avg(price) from OrdersNamedWindow group by symbol
A statement that consumes from a named window, like the one above,
receives the insert and remove stream of the named window. The insert stream represents the events inserted into the named window. The remove stream represents the events expired from
the named window data window and the events explicitly deleted via on-delete
for fire-and-forget (on-demand) delete
.
Your application may create a consuming statement such as above on an empty named window, or your application may create the above statement on an already filled named window. The runtime provides correct results in either case: At the time of deployment the runtime internally initializes the consuming statement from the current named window, also taking your declared filters into consideration. Thus, your statement deriving data from a named window does not start empty if the named window already holds one or more events. A consuming statement also sees the remove stream of an already populated named window, if any.
If you require a subset of the data in the named window, you can specify one or more filter expressions onto the named window as shown here:
select symbol, avg(price) from OrdersNamedWindow(sector='energy') group by symbol
By adding a filter to the named window, the aggregation and grouping receive a filtered insert and remove stream. The above statement thus outputs, continuously, the average price per symbol for all orders in the named window that belong to a certain sector.
A side note on variables in filters filtering events from named windows: The runtime initializes consuming statements at statement deployment time and changes aggregation state continuously as events arrive. If the filter criteria contain variables and variable values changes, then the runtime does not re-evaluate or re-build aggregation state. In such a case you may want to place variables in the having
clause which evaluates on already-built aggregation state.
The following example further declares a custom derived-value window named myplugindatawindow
.
select * from OrdersNamedWindow(volume>0, price>0)#myplugindatawindow()
Data windows cannot be used onto named windows since named windows post insert and remove streams for the events entering and leaving the named window, thus the expiry policy and batch behavior are well defined by the data window declared for the named window. For example, the following is not allowed and fails at time of statement compilation:
// not a valid statement select * from OrdersNamedWindow#time(30 sec)
The create table
statement creates a table.
A new table starts up empty. It must be explicitly aggregated-into using into table
, or populated by an on-merge
statement, or populated by insert into
.
The syntax for creating a table provides the table name, lists column names and types and designates primary key columns:
[context context_name] create table table_name [as] (column_name column_type [primary key
] [,column_name column_type [primary key
] [,...]])
The table_name you assign to the table can be any identifier. The name should not already be in use as an event type or named window name.
You may refer to a context by specifying the context
keyword followed by a context name. Contexts are described in more detail at Chapter 4, Context and Context Partitions. The effect of referring to a context is that your table operates according to the context dimensional information as declared for the context. For usage and limitations please see the respective chapter.
The column_name is an identifier providing the column name.
The column_type is required for each column. There are two categories of column types:
Non-aggregating column types: Valid column types are listed in Section 5.17.1, “Creating Variables: The Create Variable Clause” and are the same as for variable types.
For attributes that are array-type append []
(left and right brackets).
Table columns can hold events by declaring the column type as the event type name. Array-type in combination with event-type is also supported.
Aggregation column types: These instruct the runtime to retain aggregation state.
After each column type you may add the primary key
keywords. This keyword designates the column as a primary key.
When multiple columns are designated as primary key columns the combination of column values builds a compound primary key.
The order in which the primary key columns are listed is important.
Further detail on key expressions can be found at Section 5.2.13, “Composite Keys and Array Values as Keys”.
The next statement creates a table to hold a numAttempts
count aggregation state and a column named active
of type boolean, per ipAddress
and userId
:
create table SecuritySummaryTable ( ipAddress string primary key, userId String primary key, numAttempts count(*), active boolean)
The example above specifies ipAddress
and userId
as primary keys. This instructs the runtime that the table holds a single row
for each distinct combination of ipAddress
and userId
. The two values make up the compound key and there is a single row per compound key value.
If you do not designate any columns of the table as a primary key column, the table holds only one row (or no rows).
The create table statement does not provide output to its listeners. The table contents can be iterated on via the pull API to obtain the current contents of a table.
All aggregation functions can be used as column types for tables. Please simply list the aggregation function name as the column type and provide type information, when required. See Section 10.2.1, “SQL-Standard Functions” for a list of the functions and required parameter expressions for which you must provide type information.
Consider the next example that declares a table with columns for different aggregation functions (not a comprehensive example of all possible aggregation functions):
create table MyStats ( myKey string primary key, myAvedev avedev(int), // column holds a mean deviation of int-typed values myAvg avg(double), // column holds an average of double-typed values myCount count(*), // column holds a count myMax max(int), // column holds a highest int-typed value myMedian median(float), // column holds the median of float-typed values myStddev stddev(java.math.BigDecimal), // column holds a standard deviation of BigDecimal values mySum sum(long), // column holds a sum of long values myFirstEver firstever(string), // column holds a first-ever value of type string myCountEver countever(*) // column holds the count-ever (regardless of data windows) )
Additional keywords such as distinct
can be used as well. If your aggregation will be associated with a filter expression, you must add boolean
to
the parameters in the column type declaration.
For example, the next EPL declares a table with aggregation-type columns that hold an average of filtered double-typed values and an average of distinct double-typed values:
create table MyStatsMore ( myKey string primary key, myAvgFiltered avg(double, boolean), // column holds an average of double-typed values // and filtered by a boolean expression to be provided myAvgDistinct avg(distinct double) // column holds an average of distinct double-typed values )
The event aggregation functions can be used as column types for tables.
For event aggregation functions you must specify the event type using the @type(
name)
annotation.
The window
event aggregation function requires the *
wildcard.
The first
and last
cannot be used in a declaration, please use window
instead and access
as described in Section 6.3.3.2, “Using Aggregation Methods to Access Aggregation State with the Dot Operator”.
The sorted
, maxbyever
and minbyever
event aggregation functions require the criteria expression as a parameter.
The criteria expression must only use properties of the provided event type.
The maxby
and minby
cannot be used in a declaration, please use sorted
instead and access
as described in Section 6.3.3.2, “Using Aggregation Methods to Access Aggregation State with the Dot Operator”.
In this example the table declares sample event aggregations (not a comprehensive example of all possible aggregations):
create table MyEventAggregationTable ( myKey string primary key, myWindow window(*) @type(MyEvent), // column holds a window of MyEvent events mySorted sorted(mySortValue) @type(MyEvent), // column holds MyEvent events sorted by mySortValue myMaxByEver maxbyever(mySortValue) @type(MyEvent) // column holds the single MyEvent event that // provided the highest value of mySortValue ever )
Any custom single-function and multi-function aggregation can be used as a table column type. If the aggregation has multiple different return values and aggregations share common state, the multi-function aggregation is the preferred API.
For example, the next EPL declares a table with a single column that holds the state of the aggregation function myAggregation
:
create table MyStatsCustom (myCustom myAggregation('some code', 100))
The above example passes the values some code
and 100
to show how to pass constants to your custom aggregation function at declaration time.
Use the into table
keywords to instruct the runtime to aggregate into table columns. A given statement can only aggregate into a single table.
For example, consider a table that holds the count of intrusion events keyed by the combination of from-address and to-address:
create table IntrusionCountTable ( fromAddress string primary key, toAddress string primary key, countIntrusion10Sec count(*), countIntrusion60Sec count(*) )
The next sample statement updates the count considering the last 10 seconds of events:
into table IntrusionCountTable select count(*) as countIntrusion10Sec from IntrusionEvent#time(10) group by fromAddress, toAddress
For statements that have a data window, like the example above, the into table
instructs the runtime to update the aggregations that reside in the table column for events entering the data window and for events expiring from the data window.
For statements that don't have a data window the into table
instructs the runtime to update the aggregation for arriving events.
In the example above, when an IntrusionEvent
enters the 10-second time window the runtime increments the count aggregation by one.
When an IntrusionEvent
gets removed from (expires from) the 10-second time window the runtime decrements the count aggregation by one.
Multiple statements can aggregate into the same table columns or different table columns. The co-aggregating ability allows you to co-locate aggregation state conveniently.
The sample shown below is very similar to the previous statement except that it updates the count considering the last 60 seconds of events:
into table IntrusionCountTable select count(*) as countIntrusion60Sec from IntrusionEvent#time(60) group by fromAddress, toAddress
Considering the example above, when an intrusion event arrives and a row for the group-by key values (from and to-address) does not exists, the runtime creates a new row and updates the aggregation-type columns. If the row for the group-by key values exists, the runtime updates the aggregation-type columns of the existing row.
Tables can have no primary key columns. In this case a table either has a single row or is empty.
The next two statements demonstrate table use without a primary key column:
create table TotalIntrusionCountTable (totalIntrusions count(*))
into table TotalIntrusionCountTable select count(*) as totalIntrusions from IntrusionEvent
In conjunction with into table
the unidirectional
keyword is not supported.
The use of the into table
clause requires that the group by
clause must list group-by expressions
that match the table's primary key declarations in terms of the number, return type and order of group-by expressions.
It is not necessary that table column names match group-by expression texts.
For example consider a table with a single long-type primary key column:
create table MyTable (theKey long primary key, theCount count(*))
The following EPL are all not valid:
// Invalid: No group-by clause however the table declares a primary key into table MyTable select count(*) as theCount from MyEvent
// Invalid: Two expressions in the group-by clause however the table declares a single primary key into table MyTable select count(*) as theCount from MyEvent group by longPropertyOne, longPropertyTwo
// Invalid: The group-by clause expression returns a string-typed value however the table expects a long-type primary key into table MyTable select count(*) as theCount from MyEvent group by stringProperty
You may use the rollup
, cube
and grouping sets
keywords in conjunction with tables.
The use of the into table
clause requires that all aggregation state of the statement resides in table columns.
For example consider a simple table as follows:
create table MyTable (theKey long primary key, theCount count(*))
The following EPL is not valid:
// Invalid: the sum aggregation state is not available in a table column into table MyTable select count(*) as theCount, sum(intProperty) from MyEvent group by longProperty
The use of the into table
clause requires that all aggregation functions that are listed in the statement are compatible with table column types,
and that the statement has at least one aggregation function.
For example consider a simple table as follows:
create table MyTable (theKey long primary key, theCount count(*))
The following EPL is not valid:
// Invalid: the sum aggregation state is not compatible with count(*) that was declared for the table column's type into table MyTable select sum(intProperty) as theCount from MyEvent group by longProperty
If declared, the distinct
keyword and filter expressions must also match. The event type information must match for event aggregation functions.
The use of the into table
clause requires that the aggregation functions are named.
You can name an expression two ways.
First, you can name the aggregation function expression by adding it to the select-clause and by providing the as
-keyword
followed by the table column name. The examples earlier use this technique.
Second, you can name the aggregation function by placing it into a declared expression that carries the same name as the table column.
This example demonstrates the second method of naming an aggregation function:
expression alias totalIntrusions {count(*)} select totalIntrusions from IntrusionEvent
For accessing table columns by primary key, EPL provides a convenient syntax that allows you to read table column values simply by providing the table name, primary key value expressions (if required by the table) and the column name.
The synopsis for table-column access expressions is:
table-name[primary_key_expr [, primary_key_expr] [,...]][.column-name]
The expression starts with the table name.
If the table declares primary keys you must provide the primary_key_expr value expressions for each primary key within square brackets.
To access a specific column, add the (.
) dot character and the column name.
For example, consider a table that holds the count of intrusion events keyed by the combination of from-address and to-address:
create table IntrusionCountTable ( fromAddress string primary key, toAddress string primary key, countIntrusion10Sec count(*) )
Assuming that a FireWallEvent
has string-type properties named from
and to
, the next
statement outputs the current 10-second intrusion count as held by the IntrusionCountTable
row for the matching combination of keys:
select IntrusionCountTable[from, to].countIntrusion10Sec from FirewallEvent
The number of primary key expressions, the return type of the primary key expressions and the order in which they are provided must match the primary key columns that were declared for the table. If the table does not have any primary keys declared, you cannot provide any primary key expressions.
If a row for the primary key (or compound key) cannot be found, the runtime returns a null
value.
An example table without primary key columns is shown next:
create table TotalIntrusionCountTable (totalIntrusions count(*))
A sample statement that outputs the current total count every 60 seconds is:
select TotalIntrusionCountTable.totalIntrusions from pattern[every timer:interval(60 sec)]
Table access expressions can be used anywhere in statements except as parameter expressions for data windows, the update istream
,
context declarations, output limit expressions, pattern observer and guard parameters, pattern every-distinct, pattern match-until bounds, pattern followed-by max
and create window
insert or select expression and as a create variable
assignment expression.
If your keyed-access expression emits the column name, the runtime returns all current column values.
An example EPL:
select IntrusionCountTable[from, to] from FirewallEvent
The runtime returns each column value, or null if no row is found. For aggregation-type columns it returns the current aggregation value.
Certain aggregation functions allow accessing aggregation state using the (.
) dot operator.
The methods that operate on aggregation state are called aggregation methods.
Aggregation methods are explained further in Chapter 13, EPL Reference: Aggregation Methods.
The EPL shown below declares a table that keeps an unsorted set of events and a sorted set of events. This sample table has no primary key columns:
create table MyTable ( theWindow window(*) @type(MyEvent), theSorted sorted(mySortValue) @type(MyEvent) )
This statement uses the first
and the maxBy
aggregation methods to return the first (oldest) event and the greatest event according to the sort value:
select MyTable.theWindow.first(), MyTable.theSorted.maxBy() from SomeOtherEvent
Tables provide the keys
method that returns all table keys. The synopsis is:
tablename.keys()
The return type of the method is object-array. Each element of the array is a table key. For composite table keys, each element of the array is itself of type object-array and each array element is the primary key value.
An example EPL:
select IntrusionCountTable.keys() from FirewallEvent
The runtime returns an array of primary keys.
The insert into
clause inserts rows into a table.
Your application must ensure that the column names and types match the declared column names and types of the table to be inserted into, when provided.
For inserting into a table and for simultaneously checking if the inserted row already exists in the table or for atomic update-insert operation on a table, or for targeting a specific partition, you must use
on-merge
as described in Section 6.8, “Triggered Upsert Using the On-Merge Clause”. On-merge
is similar to the SQL merge
clause and provides what is known as an "Upsert" operation: Update existing rows or if no existing rows(s) are found then insert a new row,
all in one atomic operation provided by a single statement.
The following statement populates the example table declared earlier:
insert into IntrusionCountTable select fromAddress, toAddress from FirewallEvent
Note that when a row with the same primary key values already exists, your statement may encounter a unique index violation at runtime.
If the inserted-into table does not have primary key columns, the table holds a maximum of one row and your statement may also encounter a unique index violation upon
attempting to insert a second row.
Use on-merge
to prevent inserts of duplicate rows.
Table columns that are aggregation functions cannot be inserted-into and must be updated using into table
instead.
You may also explicitly list column names as discussed earlier in Section 6.2.2, “Inserting Into Named Windows”.
For insert-into
, the context name must be the same context name as declared for the create table
statement or the context name must be absent for both.
A table can be referred to by any statement in the from
-clause of the statement.
Tables do not provide an insert and remove stream. When a table appears alone in the from
-clause (other than as part of a subquery),
the statement produces output only when iterated (see pull API) or when executing a fire-and-forget (on-demand) query.
Assuming you have declared a table by name IntrusionCountTable
as shown earlier,
the following statement only returns rows when iterated or when executing the EPL as a fire-and-forget (on-demand) query or when adding an output snapshot
:
select * from IntrusionCountTable
For tables, the contained-event syntax and specifying a data window is not supported.
In a join, a table in the from
-clause cannot be marked as unidirectional
. You may not specify any of the retain-flags.
Tables cannot be used in the from
-clause of match-recognize statements, in context declarations, in pattern filter atoms and update istream
.
The following are examples of invalid statements:
// invalid statement examples select * from IntrusionCountTable#time(30 sec) // data window not allowed select * from IntrusionCountTable unidirectional, MyEvent // tables cannot be marked as unidirectional
Tables can be used in subqueries and joins.
It follows a sample subselect and join against the table:
select (select * from IntrusionCountTable as intr where intr.fromAddress = firewall.fromAddress and intr.toAddress = firewall.toAddress) from IntrusionEvent as firewall
select * from IntrusionCountTable as intr, IntrusionEvent as firewall where intr.fromAddress = firewall.fromAddress and intr.toAddress = firewall.toAddress
If the subselect or join specifies all of a table's primary key columns, please consider using the table-access expression instead. It offers a more concise syntax.
Note that for a subquery against a table that may return multiple rows, the information about subquery multi-row selection applies. For subselects, consider using @eventbean
to preserve table type information in the output event.
Note that for joins against tables the compiler does not allow specifying table filter expressions in parenthesis, in the from
clause.
Filter expressions must instead be placed into the where
-clause.
You may access aggregation state the same way as in table-access expressions, using the dot (.
) operator.
The EPL shown below declares a table that keeps a set of events, and shows a join that selects window aggregation state:
create table MyWindowTable (theWindow window(*) @type(MyEvent))
select theWindow.first(), theWindow.last(), theWindow.window() from MyEvent, MyWindowTable
For resetting plain table columns, you may simple assign a value using set
name =
value within an on-merge update-
action to set new values.
You may also use fire-and-forget queries.
For resetting aggregation-typed table columns, as they represent aggregation state, it is not possible to use a straight assignment. Instead, the runtime provides a reset
method.
The syntax for the reset method is column-name.reset()
and is only available as part of on-merge
and the update
-action.
You may also reset all aggregations of a table row by specifying table-alias.reset()
.
For example, consider a table that holds the count of intrusion events keyed by the combination of from-address and to-address:
create table IntrusionCountTable ( fromAddress string primary key, toAddress string primary key, countIntrusion10Sec count(*), countIntrusion60Sec count(*) )
Assume there is an event that triggers the reset of the countIntrusion10Sec
and countIntrusion60Sec
counts:
create schema IntrusionReset(fromAddress string, toAddress string)
Use on-merge
to reset the count columns like so:
on IntrusionReset as resetEvent merge IntrusionCountTable as tableRow where resetEvent.fromAddress = tableRow.fromAddress and resetEvent.toAddress = tableRow.toAddress when matched then update set countIntrusion10Sec.reset(), countIntrusion60Sec.reset()
Alternatively you may specify the table alias to reset all of a table row's aggregation state:
on IntrusionReset as resetEvent merge IntrusionCountTable as tableRow where resetEvent.fromAddress = tableRow.fromAddress and resetEvent.toAddress = tableRow.toAddress when matched then update set tableRow.reset()
For initializing plain table columns, you may simple assign a value using set
name =
value within an on-merge update-
action to set new values.
You may also use fire-and-forget queries.
For initializing aggregation-typed table columns, as they represent aggregation state, it is not possible to use a straight assignment. Instead, you must use into table
since aggregation state is complex state and requires adding and removing values or events and therefore cannot be assigned to.
If the aggregation-typed column is a count, please declare the column as a sum instead and use sum(initial value)
to initialize the value or sum(1)
to increment the sum.
The on select
clause performs a one-time, non-continuous query on a named window or table every time a triggering event arrives or a triggering pattern matches. The statement can consider all rows, or only rows that match certain criteria, or rows that correlate with an arriving event or a pattern of arriving events.
The syntax for the on select
clause is as follows:
on event_type[(filter_criteria)] [as stream_name] [insert into insert_into_def] select select_list from window_or_table_name [as stream_name] [where criteria_expression] [group by grouping_expression_list] [having grouping_search_conditions] [order by order_by_expression_list]
The event_type is the name of the type of events that trigger the query against the named window or table. It is optionally followed by filter_criteria which are filter expressions to apply to arriving events. The optional as
keyword can be used to assign a stream name. Patterns or named windows can also be specified in the on
clause, see the samples in Section 6.7.1, “Using Patterns in the On-Delete Clause” (for a named window as a trigger only insert stream events trigger actions) (tables cannot be triggers).
The insert into clause works as described in Section 5.10, “Merging Streams and Continuous Insertion: The Insert Into Clause”. The select clause is described in Section 5.3, “Choosing Event Properties and Events: The Select Clause”. For all clauses the semantics are equivalent to a join operation: The properties of the triggering event or events are available in the select
clause and all other clauses.
The window_or_table_name in the from
clause is the name of the named window or table to select rows from. The as
keyword is also available to assign a stream name to the table or named window. The as
keyword is helpful in conjunction with wildcard in the select
clause to select rows via the syntax select streamname.*
.
The optional where
clause contains a criteria_expression that correlates the arriving (triggering) event to the rows to be considered from the table or named window. The criteria_expression may also simply filter for rows to be considered by the statement.
The group by
clause, the having
clause and the order by
clause are all optional and work as described in earlier chapters.
Statements that use tables and named windows work the same. The examples herein use the OrdersNamedWindow
named window and the SecuritySummaryTable
table to provide examples for each.
The sample statement below outputs, when a trigger event arrives, the count of all rows held by the SecuritySummaryTable
table:
on QueryEvent select count(*) from SecuritySummaryTable
This sample statement outputs the total volume per symbol ordered by symbol ascending and only non-zero volumes of all rows held by the OrdersNamedWindow
named window:
on QueryEvent select symbol, sum(volume) from OrdersNamedWindow group by symbol having volume > 0 order by symbol
When using wildcard (*) to select from streams in an on-select clause, each stream, that is the triggering stream and the selected-upon table or named window, are selected, similar to a join. Therefore your wildcard select returns two columns: the triggering event and the selection result row, for each row.
on QueryEvent as queryEvent select * from OrdersNamedWindow as win
The statement above returns a queryEvent
column and a win
column for each event.
If only a single stream's event is desired in the result, use select win.*
instead.
Upon arrival of a QueryEvent event, this statement selects all rows in the OrdersNamedWindow
named window:
on QueryEvent select win.* from OrdersNamedWindow as win
The runtime executes the statement on arrival of a triggering event, in this case a QueryEvent. It posts the statement results to any listeners to the statement, in a single invocation, as the new data array.
The where
clause filters and correlates rows in the table or named window with the triggering event, as shown next:
on QueryEvent(volume>0) as query select query.symbol, query.volume, win.symbol from OrdersNamedWindow as win where win.symbol = query.symbol
Upon arrival of a QueryEvent, if that event has a value for the volume property that is greater than zero, the runtime executes the statement. The statement considers all events currently held by the OrdersNamedWindow
that match the symbol property value of the triggering QueryEvent event.
For correlated statements that correlate triggering events with rows held by a named window, the runtime internally creates efficient indexes to enable high performance querying of rows.
It analyzes the where
clause to build one or more indexes for fast lookup in the named window based on the properties of the triggering event.
To trigger an on-select when an update to the selected named window occurs or when the triggering event is the same event that is being inserted into the named window, specify the named window name as the event type.
The next statement fires the select for every change to the named window OrdersNamedWindow:
on OrdersNamedWindow as trig select onw.symbol, sum(onw.volume) from OrdersNamedWindow as onw where onw.symbol = trig.symbol
For named windows, the iterator
of the EPStatement
object representing the on select
clause returns the last batch of selected events in response to the last triggering event, or null if the last triggering event did not select any rows.
For tables, the iterator
of the EPStatement
object representing the on select
clause returns no events.
For correlated statements that correlate triggering events with rows held by a table, the runtime utilizes either primary key columns or
secondary explicitly-created indexes to enable high performance querying of rows, based on an analysis of the where
clause.
The similarities and differences between an on select
clause and a regular or outer join (and not unidirectional) are as follows:
A join is evaluated when any of the streams participating in the join have new events (insert stream) or events leaving data windows (remove stream). A join is therefore bi-directional or multi-directional. However, the on select
statement has one triggering event or pattern that causes the statement to be evaluated and is thus uni-directional.
The query within the on select
statement is not continuous: It executes only when a triggering event or pattern occurs. Aggregation and groups are computed anew considering the contents of the table or named window at the time the triggering event arrives.
On-select
and the unidirectional join can be compared as follows.
On-select
, on-merge
, on-insert
, on-delete
, on-update
and on-select-and-delete
operate only on named windows or tables. Unidirectional joins however can operate on any stream.
If the unidirectional join is between a single named window or table and a triggering event or pattern and that triggering event or pattern is marked unidirectional, the unidirectional join is equivalent to on-select
.
A unidirectional join does not execute under a named window context partition lock and instead is a consumer relationship to the named window.
The on select delete
clause performs a one-time, non-continuous query on a table or named window every time a triggering event arrives or a triggering pattern matches, similar to on-select
as described in the previous section. In addition, any selected rows are also deleted.
The syntax for the on select delete
clause is as follows:
on trigger
select [and] delete select_list...
... (please see on-select for insert into, from, group by, having, order by
)...
The syntax follows the syntax of on-select
as described earlier. The select
clause follows the optional and
keyword and the delete
keyword. The from
-clause can list either a table or a named window.
The example statement below selects and deletes all rows from OrdersNamedWindow
named window when a QueryEvent arrives:
on QueryEvent select and delete window(win.*) as rows from OrdersNamedWindow as win
The sample EPL above also shows the use of the window
aggregation function. It specifies the window
aggregation function
to instruct the runtime to output a single event, regardless of the number of rows in the named window, and that contains a column rows
that contains a collection of the selected event's underlying objects.
On Select Delete
deletes all rows that match the where-clause. When there is no where-clause it deletes all rows regardless of what the output looks like. The having
-clause is relevant to output only and does not narrow down the rows that are deleted.
An on update
clause updates rows held by a table or named window. The clause can be used to update all rows, or only rows that match certain criteria, or rows that correlate with an arriving event or a pattern of arriving events.
For updating a table or named window and for simultaneously checking if the updated row exists or for atomic update-insert operation on a named window or table, consider using on-merge
as described in Section 6.8, “Triggered Upsert Using the On-Merge Clause”.
On-merge is similar to the SQL merge
clause and provides what is known as an "Upsert" operation:
Update existing events or if no existing event(s) are found then insert a new event, all in one atomic operation provided by a single statement.
The syntax for the on update
clause is as follows:
on event_type[(filter_criteria)] [as stream_name] update window_or_table_name [as stream_name] set mutation_expression [, mutation_expression [,...]] [where criteria_expression]
The event_type is the name of the type of events that trigger an update of rows in a named window. It is optionally followed by filter_criteria which are filter expressions to apply to arriving events. The optional as
keyword can be used to assign a name for use in expressions and the where
clause. Patterns and named windows can also be specified in the on
clause.
The window_or_table_name is the name of the table or named window to update rows.
The as
keyword is also available to assign a name to the named window or table.
After the set
keyword follows a list of comma-separated mutation_expression expressions. A mutation expression is any valid EPL expression.
Subqueries may by part of expressions however aggregation functions and the prev
or prior
function may not be used in expressions.
The below table shows some typical mutation expessions:
Table 6.1. Mutation Expressions in Update and Merge
Description | Syntax and Examples |
---|---|
Assignment |
property_name = value_expression price = 10, side = 'BUY' |
Event Method Invocation (not available for tables) |
alias_or_windowname.methodname(...) orderWindow.clear() |
Property Method Invocation |
property_name.methodname(...) accountMap.clear() |
User-Defined Function Call |
functionname(...)
clearQuantities(orderRow) |
The optional where
clause contains a criteria_expression that correlates the arriving (triggering) event to the rows to be updated in the table or named window. The criteria_expression may also simply filter for rows to be updated.
Statements that use tables and named windows work the same. We use the term property and column interchangeably. The examples herein use the OrdersNamedWindow
named window and the SecuritySummaryTable
table to provide examples for each. Let's look at a couple of examples.
In the simplest form, this statement updates all rows in the named window OrdersNamedWindow
when any UpdateOrderEvent
event arrives, setting the price property to zero for all rows currently held by the named window:
on UpdateOrderEvent update OrdersNamedWindow set price = 0
This example demonstrates the use of a where
clause and updates the SecuritySummaryTable
table.
Upon arrival of a triggering ResetEvent
it updates the active
column value to false for all table rows that have an active
column value of true:
on ResetEvent update SecuritySummaryTable set active = false where active = true
The next example shows a more complete use of the syntax, and correlates the triggering event with rows held by the OrdersNamedWindow
named window:
on NewOrderEvent(volume>0) as myNewOrders update OrdersNamedWindow as myNamedWindow set price = myNewOrders.price where myNamedWindow.symbol = myNewOrders.symbol
In the above sample statement, only if a NewOrderEvent
event with a volume greater then zero arrives does the statement trigger. Upon triggering, all rows in the named window that have the same value for the symbol property as the triggering NewOrderEvent
event are then updated (their price property is set to that of the arriving event). The statement also showcases the as
keyword to assign a name for use in the where
expression.
Your application can subscribe a listener to your on update
statements to determine update events. The statement post any rows that are updated to all listeners attached to the statement as new data, and the events prior to the update as old data.
The following example shows the use of tags and a pattern. It sets the price value of orders to that of either a FlushOrderEvent
or OrderUpdateEvent
depending on which arrived:
on pattern [every ord=OrderUpdateEvent(volume>0) or every flush=FlushOrderEvent] update OrdersNamedWindow as win set price = case when ord.price is null then flush.price else ord.price end where ord.id = win.id or flush.id = win.id
When updating indexed properties use the syntax propertyName[
index] =
value with the index value being an integer number.
When updating mapped properties use the syntax propertyName(
key) =
value with the key being a string value.
The runtime executes assignments in the order they are listed. When performing multiple assignments, the runtime takes the most recent column value according to the last assignment, if any. To instruct the runtime to use the initial value before update, prefix the column name with the literal initial
.
The following statement illustrates:
on UpdateEvent as upd update MyWindow as win set field_a = 1, field_b = win.field_a, // assigns the value 1 field_c = initial.field_a // assigns the field_a original value before update
The next example assumes that your application provides a user-defined function copyFields
that receives 3 parameters:
The update event, the new row and the initial state before-update row.
on UpdateEvent as upd update MyWindow as win set copyFields(win, upd, initial)
You may invoke a method on a value object, for those properties that hold value objects, as follows:
on UpdateEvent update MyWindow as win set someproperty.clear()
For named windows only, you may also invoke a method on the named window event type.
The following example assumes that your event type provides a method by name populateFrom
that receives the update event as a parameter:
on UpdateEvent as upd update MyWindow as win set win.populateFrom(upd)
The following restrictions apply:
Each property to be updated via assignment must be writable. For tables, all columns are always writable.
For underlying event representations that are Java objects, a event object class must implement the java.io.Serializable interface as discussed in Section 5.20.1, “Immutability and Updates” and must provide setter methods for updated properties.
When using an XML underlying event type, event properties in the XML document representation are not available for update.
Nested properties are not supported for update. Variant streams may also not be updated.
Statements that reference the named window receive the new event in the insert stream and the event prior to the update in the remove stream.
For correlated statements (as above) that correlate triggering events with events held by a named window, the runtime internally creates efficient indexes to enable high performance update of events.
The iterator
of the EPStatement
object representing the on update
clause does not return any rows.
On-Update may not update primary key columns.
For correlated statements that correlate triggering events with rows held by a table, the runtime utilizes either primary key columns or
secondary explicitly-created indexes to enable high performance querying of rows, based on an analysis of the where
clause.
The iterator
of the EPStatement
object representing the on update
clause does not return any rows.
An on delete
clause removes rows from a named window or table. The clause can be used to remove all rows, or only rows that match certain criteria, or rows that correlate with an arriving event or a pattern of arriving events.
The syntax for the on delete
clause is as follows:
on event_type[(filter_criteria)] [as stream_name] delete from window_or_table_name [as stream_name] [where criteria_expression]
The event_type is the name of the type of events that trigger removal from the table or named window. It is optionally followed by filter_criteria which are filter expressions to apply to arriving events. The optional as
keyword can be used to assign a name for use in the where
clause. Patterns and named windows can also be specified in the on
clause as described in the next section.
The window_or_table_name is the name of the named window or table to delete rows from.
The as
keyword is also available to assign a name to the table or named window.
The optional where
clause contains a criteria_expression that correlates the arriving (triggering) event to the rows to be removed. The criteria_expression may also simply filter for rows without correlating.
On-delete can be used against tables and named windows. The examples herein use the OrdersNamedWindow
named window and the SecuritySummaryTable
table to provide examples for each.
In the simplest form, this statement deletes all rows from the SecuritySummaryTable
table when any ClearEvent
arrives:
on ClearEvent delete from SecuritySummaryTable
The next example shows a more complete use of the syntax, and correlates the triggering event with events held by the OrdersNamedWindow
named window:
on NewOrderEvent(volume>0) as myNewOrders delete from OrdersNamedWindow as myNamedWindow where myNamedWindow.symbol = myNewOrders.symbol
In the above sample statement, only if a NewOrderEvent
event with a volume greater then zero arrives does the statement trigger. Upon triggering, all rows in the named window that have the same value for the symbol property as the triggering NewOrderEvent
event are removed. The statement also showcases the as
keyword to assign a name for use in the where
expression.
By means of patterns the on delete
clause and on select
clause (described below) can look for more complex conditions to occur, possibly involving multiple events or the passing of time. The syntax for on delete
with a pattern expression is show next:
on pattern [pattern_expression] [as stream_name] delete from window_or_table_name [as stream_name] [where criteria_expression]
The pattern_expression is any pattern that matches zero or more arriving events. Tags can be used to name events in the pattern and can occur in the optional where
clause to correlate to events to be removed from a named window.
In the next example the triggering pattern fires every 10 seconds. The effect is that every 10 seconds the statement removes all rows from the SecuritySummaryTable
table:
on pattern [every timer:interval(10 sec)] delete from SecuritySummaryTable
The following example shows the use of tags in a pattern and executes against the OrdersNamedWindow
named window instead:
on pattern [every ord=OrderEvent(volume>0) or every flush=FlushOrderEvent] delete from OrdersNamedWindow as win where ord.id = win.id or flush.id = win.id
The pattern above looks for OrderEvent events with a volume value greater then zero and tags such events as 'ord'.
The pattern also looks for FlushOrderEvent events and tags such events as 'flush'. The where
clause deletes from the OrdersNamedWindow
named window any rows that match in the value of the 'id' property either of the arriving events.
When using on-delete
with named windows you may specify @hint('silent_delete')
to suppress output of the deleted events to listeners of the statement that declares the named window.
This is useful when listeners need to receive only expired events and not explicitly deleted events.
The @hint('silent_delete')
instruction for use with on-delete
removes the dispatches for all listeners of the statement that declares the named window in respect to the deleted events,
thereby listeners of the statement that declares the named window don't receive the deleted events.
Statements that reference the named window receive the deleted event as part of the remove stream.
For correlated statements (as above) that correlate triggering events with rows held by a named window, the runtime internally creates efficient indexes to enable high performance deletion of rows.
The iterator
of the EPStatement
object representing the on delete
clause does not return any rows.
For correlated statements that correlate triggering events with rows held by a table, the runtime utilizes either primary key columns or
secondary explicitly-created indexes to enable high performance querying of rows, based on an analysis of the where
clause.
The iterator
of the EPStatement
object representing the on delete
clause does not return any rows.
The on merge
clause is similar to the SQL merge
clause. It provides what is known as an "Upsert" operation: Update existing rows or if no existing row(s) are found then insert a new row, all in an atomic operation provided by a single statement.
The syntax for the on merge
clause has two forms.
Use on merge
with insert
to inserts one row. This syntax is:
on event_type[(filter_criteria)] [as stream_name] merge [into] window_or_table_name [as stream_name] insert [ (property_name [, property_name] [,...]) ] select select_expression [, select_expression[,...]]
The event_type is the name of the type of events that trigger the merge. It is optionally followed by filter_criteria which are filter expressions to apply to arriving events. The optional as
keyword can be used to assign a name for use in the where
clause. Patterns and named windows can also be specified in the on
clause as described in prior sections.
The window_or_table_name is the name of the named window to insert a row into.
The as
keyword is also available to assign a name to the named window or table.
It follows the insert
keyword and optionally the into
keyword. Optionally you can provide a list of property names in parenthesis.
It follows the required select
keyword and one or more select-clause expressions. The wildcard (*
) is available in the select-clause as well.
On-merge can be used with tables and named windows. This examples uses the SecuritySummaryTable
table that was defined earlier.
This example statement inserts a row into the SecuritySummaryTable
table when a SecuritySummary
arrives:
on SecuritySummary merge SecuritySummaryTable insert select ipAddress, userId
The following EPL is equivalent to the statement above and uses the upsert syntax instead, described next.
on SecuritySummary merge SecuritySummaryTable where 1=2 when not matched then insert select ipAddress, userId
The syntax for on merge
for update-insert (upsert) is as follows:
on event_type[(filter_criteria)] [as stream_name] merge [into] window_or_table_name [as stream_name] [where criteria_expression] when [not] matched [and search_condition] then [ insert [into streamname] [ (property_name [, property_name] [,...]) ] select select_expression [, select_expression[,...]] [where filter_expression] | update set mutation_expression [, mutation_expression [,...]] [where filter_expression] | delete [where filter_expression] ] [then [insert|update|delete]] [,then ...] [when ... then ... [...]]
The event_type is the name of the type of events that trigger the merge. It is optionally followed by filter_criteria which are filter expressions to apply to arriving events. The optional as
keyword can be used to assign a name for use in the where
clause. Patterns and named windows can also be specified in the on
clause as described in prior sections.
The window_or_table_name is the name of the named window or table to insert, update or delete rows.
The as
keyword is also available to assign a name to the named window or table.
The optional where
clause contains a criteria_expression that correlates the arriving (triggering) event to the rows to be considered of the table or named window.
We recommend specifying a criteria expression that is as specific as possible.
Following the where
clause is one or more when matched
or when not matched
clauses in any order. Each may have an additional search condition associated.
After each when [not] matched
follow one or more then
clauses that each contains the action to take: Either an insert
, update
or delete
keyword.
After when not matched
only insert
action(s) are available. After when matched
any insert
, update
and delete
action(s) are available.
After insert
follows, optionally, the into
keyword followed by the stream name or named window to insert-into. If no into
and stream name is specified, the insert applies to the current table or named window. It follows an optional list of columns inserted. It follows the required select
keyword and one or more select-clause expressions. The wildcard (*
) is available in the select-clause as well. It follows an optional where-clause that may return Boolean false to indicate that the action should not be applied.
After update
follows the set
keyword and one or more mutation expressions. For mutation expressions please see Section 6.6, “Updating Data: The On Update Clause”.
It follows an optional where-clause that may return Boolean false to indicate that the action should not be applied.
After delete
follows an optional where-clause that may return Boolean false to indicate that the action should not be applied.
When according to the where-clause criteria_expression the runtime finds no rows in the named window or table that match the condition, the runtime evaluates each when not matched clause. If the optional search condition returns true or no search condition was provided then the runtime performs all of the actions listed after each then
.
When according to the where-clause criteria_expression the runtime finds one or more rows in the named window or table that match the condition, the runtime evaluates each when matched clause. If the optional search condition returns true or no search condition was provided the runtime performs all of the actions listed after each then
.
The runtime executes when matched
and when not matched
in the order specified. If the optional search condition returns true or no search condition was specified then the runtime takes the associated action (or multiple actions for multiple then
keywords). When the block of actions completed the runtime proceeds to the next matching row, if any. After completing all matching rows the runtime continues to the next triggering event if any.
On-merge can be used with tables and named windows. The examples herein declare a ProductWindow
named window and also use the SecuritySummaryTable
table to provide examples for each.
This example statement updates the SecuritySummaryTable
table when a ResetEvent
arrives setting the active
column's value to false:
on ResetEvent merge SecuritySummaryTable when matched and active = true then update set active = false
A longer example utilizing a named window follows. You start by declaring a schema that provides a product id and that holds a total price:
create schema ProductTotalRec as (productId string, totalPrice double)
We create a named window that holds a row for each unique product:
create window ProductWindow#unique(productId) as ProductTotalRec
The events for this example are order events that hold an order id, product id, price, quantity and deleted-flag declared by the next schema:
create schema OrderEvent as (orderId string, productId string, price double, quantity int, deletedFlag boolean)
The following statement utilizes on-merge
to total up the price for each product based on arriving order events:
on OrderEvent oe merge ProductWindow pw where pw.productId = oe.productId when matched then update set totalPrice = totalPrice + oe.price when not matched then insert select productId, price as totalPrice
In the above example, when an order event arrives, the runtime looks up in the product named window the matching row or rows for the same product id as the arriving event. In this example the runtime always finds no row or one row as the product named window is declared with a unique data window based on product id.
If the runtime finds a row in the named window, it performs the update action adding up the price as defined under when matched
. If the runtime does not find a row in the named window it performs the insert action as defined under when not matched
, inserting a new row.
The insert
keyword may be followed by a list of columns as shown in this EPL snippet:
// equivalent to the insert shown in the last 2 lines in above EPL ...when not matched then insert(productId, totalPrice) select productId, price
The second example demonstrates the use of a select-clause with wildcard, a search condition and the delete
keyword. It creates a named window that holds order events and employs on-merge to insert order events for which no corresponding order id was found, update quantity to the quantity provided by the last arriving event and delete order events that are marked as deleted:
create window OrderWindow#keepall as OrderEvent
on OrderEvent oe merge OrderWindow pw where pw.orderId = oe.orderId when not matched then insert select * when matched and oe.deletedFlag=true then delete when matched then update set pw.quantity = oe.quantity, pw.price = oe.price
In the above example the oe.deletedFlag=true
search condition instructs the runtime to take the delete action only if the deleted-flag is set.
You may specify multiple actions by providing multiple then
keywords each followed by an action. Each of the insert
, update
and delete
actions can itself have a where-clause as well.
If a where-clause exists for an action, the runtime evaluates the where-clause and applies the action only if the where-clause returns Boolean true.
This example specifies two update actions and uses the where-clause to trigger different update behavior depending on whether the order event price is less than zero.
This example assumes that the host application defined a clearorder
user-defined function, to demonstrate calling a user-defined function as part of the update mutation expressions:
on OrderEvent oe merge OrderWindow pw where pw.orderId = oe.orderId when matched then update set clearorder(pw) where oe.price < 0 then update set pw.quantity = oe.quantity, pw.price = oe.price where oe.price >= 0
To insert events into another stream and not the named window, use insert
into
streamname.
In the next example each matched-clause contains two actions, one action to insert a log event and a second action to insert, delete or update:
on OrderEvent oe merge OrderWindow pw where pw.orderId = oe.orderId when not matched then insert into LogEvent select 'this is an insert' as name then insert select * when matched and oe.deletedFlag=true then insert into LogEvent select 'this is a delete' as name then delete when matched then insert into LogEvent select 'this is a update' as name then update set pw.quantity = oe.quantity, pw.price = oe.price
While the runtime evaluates and executes all actions listed under the same matched-clause in order, you may not rely on updated field values of an earlier action to trigger the where-clause of a later action. Similarly you should avoid simultaneous update and delete actions for the same match: the runtime does not guarantee whether the update or the delete take final affect.
Your application can subscribe a listener to on merge
statements to determine inserted, updated and removed events. Statements post any events that are inserted to, updated or deleted from a named window to all listeners attached to the statement as new data and removed data.
The following limitations apply to on-merge statements:
Aggregation functions and the prev
and prior
operators are not available in conditions and the select
-clause.
Statements that reference the named window receive an insert and remove stream represening the insertions, changes and deletions to named window rows.
For correlated statements (as above) that correlate triggering events with rows held by a named window, the runtime internally creates efficient indexes to enable high performance update and removal of events especially from named windows that hold large numbers of events.
The iterator
of the EPStatement
object representing the on merge
clause does not return any rows.
On-Merge may not update primary key columns.
For correlated statements that correlate triggering events with rows held by a table, the runtime utilizes either primary key columns or
secondary explicitly-created indexes to enable high performance querying of rows, based on an analysis of the where
clause.
The iterator
of the EPStatement
object representing the on merge
clause does not return any rows.
You may explicitly create an index on a table or a named window. The runtime considers explicitly-created as well as implicitly-allocated indexes (named windows only) in query planning and execution of the following types of usages of tables and named windows:
Fire-and-forget (on-demand, non-continuous) queries as described in Section 16.7, “Execute Fire-and-Forget Queries Using EPFireAndForgetService”.
On-select
, on-merge
, on-update
, on-delete
and on-insert
.
Subqueries against tables and named windows.
For joins (including outer joins) with named windows the runtime considers the filter criteria listed in parenthesis using the syntax
name_window_name(filter_criteria)
for index access.
For joins with tables the runtime considers the primary key columns (if any) as well as any table indexes.
The syntax to create an explicit index on a named window or table is:
create [unique] index index_name on window_or_table_name ( column_expression [hash|btree|index_type_expression] [, column_expression] [hash|btree|index_type_expression] [,...] )
The optional unique keyboard indicates that the column expressions uniquely identify rows. If unique is not specified the index allows duplicate rows.
The index_name is the name assigned to the index. The name uniquely identifies the index and is used in compiler query plan logging.
The window_or_table_name is the name of an existing table or named window. If the named window or table has rows already, the runtime builds an index for the rows.
After the table name or named window name follows a list of pairs of column_expression column expression and index type.
A column expression is the expression that is subject to index building. Typically a column expression is an event property or column name. For special application-provided or spatial indexes other column expressions are allowed and such indexes may allow multiple columns to be combined. Further detail on key expressions can be found at Section 5.2.13, “Composite Keys and Array Values as Keys”.
Following each column expression you may specify the index type by providing the optional hash
or btree
keywords or an index_type_expression.
For special application-provided or spatial indexes please use the index_type_expression.
If you specify no keyword or the hash
keyword for a property, the index will be a hash-based (unsorted) index in respect to that property. If you specify the btree
keyword, the index will be a binary-tree-based sorted index in respect to that property.
You may combine hash
and btree
properties for the same index.
Specify btree
for a property if you expect to perform numerical or string comparison using relational operators (<, >, >=, <=), the between
or the in
keyword for ranges and inverted ranges. Use hash
(the default) instead of btree
if you expect to perform exact comparison using =
.
For hash
and btree
index types the column expression must be an event property or column name. Expressions such as col+1
are not currently supported for
hash
and btree
index types but are supported for other index types.
The create table
syntax is the same for tables and named windows. The examples herein create a new UserProfileWindow
named window and also use the SecuritySummaryTable
table.
This sample EPL creates an non-unique index on the active
column of table SecuritySummaryTable
:
create index MyIndex on SecuritySummaryTable(active)
We list a few example statements next that create a named window and create a single index:
// create a named window create window UserProfileWindow#time(1 hour) select * from UserProfile
// create a non-unique index (duplicates allowed) for the user id property only create index UserProfileIndex on UserProfileWindow(userId)
Next, execute a fire-and-forget query as shown below; this sample uses the prepared version to demonstrate:
String query = "select * from UserProfileWindow where userId='Joe'"; EPFireAndForgetPreparedQuery prepared = epRuntime.getFireAndForgetService().prepareQuery(query); // query performance excellent in the face of large number of rows EPFireAndForgetQueryResult result = prepared.execute(); // ...later ... prepared.execute(); // execute a second time
A unique index is generally preferable over non-unique indexes. For named windows, if your data window declares a unique data window (#unique
, #firstunique
, including intersections and grouped unique data windows) it is not necessary to create a unique index unless index sharing is enabled, since the compiler and runtime considers the unique data window declaration in query planning.
The runtime enforces uniqueness (e.g. unique constraint) for unique indexes. If your application inserts a duplicate row the runtime raises a runtime exception when processing the statement and discards the row. The default error handler logs such an exception and continues.
For example, if the user id together with the profile id uniquely identifies an entry into the named window, your application can create a unique index as shown below:
// create a unique index on user id and profile id create unique index UserProfileIndex on UserProfileWindow(userId, profileId)
By default, the runtime builds a hash code -based index useful for direct comparison via equals (=). Filter expressions that look for ranges or use in, between
do not benefit from the hash-based index and should use the btree
keyword. For direct comparison via equals (=) then compiler does not use btree
indexes.
The next example creates a composite index over two fields symbol
and buyPrice
:
// create a named window create window TickEventWindow#time(1 hour) as (symbol string, buyPrice double)
// create a non-unique index create index idx1 on TickEventWindow(symbol hash, buyPrice btree)
A sample fire-and-forget query is shown below (this time the API calls are not shown):
// query performance excellent in the face of large number of rows select * from TickEventWindow where symbol='GE' and buyPrice between 10 and 20
A table that does not declare one or more primary key columns cannot have a secondary index, as the table holds a maximum of one row.
Fire-and-Forget queries can be run against both tables and named windows. We use the term property and column interchangeably.
For selecting from named windows and tables, please see the examples in Section 16.7, “Execute Fire-and-Forget Queries Using EPFireAndForgetService”.
For data manipulation (insert, update, delete) queries, the fire-and-forget query API returns the inserted, updated or deleted rows when the query executes against a named window.
Your application can insert rows into a table or named window using fire-and-forget (on-demand, non-continuous) queries as described in Section 16.7, “Execute Fire-and-Forget Queries Using EPFireAndForgetService”.
The compiler allows the standard SQL syntax and values
keyword and also supports using select
to provide values.
The syntax using the values
keyword is:
insert into window_or_table_name [(property_names)] values (value_expressions)
The syntax using select
is as follows:
insert into window_or_table_name [(property_names)] select value_expressions
The window_or_table_name is the name of the table or named window to insert rows into.
After the named window or table name you can optionally provide a comma-separated list of property names.
When providing property names, the order of value expressions in the values list or select clause must match the order of property names specified. Column names provided in the select-clause, if specified, are ignored.
When not providing property names and when specifying the values
keyword, the order of values must match the order of properties
declared for the named window or table.
When not providing property names and when specifying the select-clause, expressions must name the properties to be inserted into
by assigning a column name using the as
keyword.
The example code snippet inserts a new order row into the OrdersWindow
named window:
String query = "insert into OrdersWindow(orderId, symbol, price) values ('001', 'GE', 100)"; runtime.getFireAndForgetService().executeQuery(query);
Instead of the values
keyword you may specify a select-clause as this example shows:
String query = "insert into OrdersWindow(orderId, symbol, price) select '001', 'GE', 100"; runtime.getFireAndForgetService().executeQuery(query);
The following EPL inserts the same values as above but specifies property names as part of the select-clause expressions:
insert into OrdersWindow select '001' as orderId, 'GE' as symbol, 100 as price
The next EPL inserts the same values as above and does not specify property names thereby populating the first 3 properties of the type of the named window:
insert into OrdersWindow values ('001', 'GE', 100)
Your application can update table and named window rows using fire-and-forget (on-demand, non-continuous) queries as described in Section 16.7, “Execute Fire-and-Forget Queries Using EPFireAndForgetService”.
The syntax for the update
clause is as follows:
update window_or_table_name [as stream_name] set mutation_expression [, mutation_expression [,...]] [where criteria_expression]
The window_or_table_name is the name of the table or named window to remove rows from.
The as
keyword is also available to assign a name to the table or named window.
After the set
keyword follows a comma-separated list of mutation expressions. For fire-and-forget queries the following restriction applies: Aggregation functions and the prev
or prior
function may not be used in expressions. Mutation expressions are detailed in Section 6.6, “Updating Data: The On Update Clause”.
The optional where
clause contains a criteria_expression that identifies rows to be updated.
The example code snippet updates those rows of the named window that have a negative value for volume:
String query = "update OrdersNamedWindow set volume = 0 where volumne = 0"; runtime.getFireAndForgetService().executeQuery(query);
To instruct the runtime to use the initial property value before update, prefix the property name with the literal initial
.
Your application can delete rows from a named window or table using fire-and-forget (on-demand, non-continuous) queries as described in Section 16.7, “Execute Fire-and-Forget Queries Using EPFireAndForgetService”.
The syntax for the delete
clause is as follows:
delete from window_or_table_name [as stream_name] [where criteria_expression]
The window_or_table_name is the name of the named window or table to delete rows from.
The as
keyword is also available to assign a name to the named window or table.
The optional where
clause contains a criteria_expression that identifies rows to be removed from the named window or table.
The example code snippet deletes from a named window all rows that have a negative value for volume:
String query = "delete from OrdersNamedWindow where volume <= 0"; runtime.getFireAndForgetService().executeQuery(query);
Columns in a named window and table may also hold an event or multiple events. More information on the insert into
clause providing event columns is in Section 5.10.5, “Event as a Property”.
A sample declaration for a named window and a table is:
create schema InnerData (value string)
create table ContainerTable (innerdata InnerData)
create window ContainerWindow#time(30) as (innerdataArray InnerData[]) // array of events
The second sample creates a named window that specifies two columns: A column that holds an OrderEvent, and a column by name priceTotal
. A matching insert into
statement is also part of the sample:
create window OrdersWindow#time(30) as select this, price as priceTotal from OrderEvent
insert into OrdersWindow select order, price * unit as priceTotal from ServiceOrderEvent as order
Note that the this
proprerty must exist on the event and must return the event class itself (JavaBean events only). The property type of the additional priceTotal
column is the property type of the existing price
property.