esper.codehaus.org and espertech.comDocumentation
Single-row functions return a single value for every single result row generated by your statement. These functions can appear anywhere where expressions are allowed.
Esper allows static Java library methods as single-row functions, and also features built-in single-row functions. In addition, Esper allows instance method invocations on named streams.
You may also register your own single-row function name with the engine so that your EPL statements become less cluttered. This is described in detail in Section 18.3, “Single-Row Function”. Single-row functions that return an object can be chained.
Esper auto-imports the following Java library packages:
java.lang.*
java.math.*
java.text.*
java.util.*
Thus Java static library methods can be used in all expressions as shown in below example:
select symbol, Math.round(volume/1000) from StockTickEvent.win:time(30 sec)
In general, arbitrary Java class names have to be fully qualified (e.g. java.lang.Math) but Esper provides a mechanism for user-controlled imports of classes and packages as outlined in Section 16.4.6, “Class and package imports”.
The below table outlines the built-in single-row functions available.
Table 10.1. Syntax and results of single-row functions
Single-row Function | Result |
---|---|
case value when compare_value then result [when compare_value then result ...] [else result] end |
Returns |
case when condition then result [when condition then result ...] [else result] end |
Returns the |
cast(expression, type_name) |
Casts the result of an expression to the given type. |
coalesce(expression, expression [, expression ...]) |
Returns the first non- |
current_evaluation_context() |
Returns an object containing the engine URI, EPL statement name and context partition id (when applicable). |
current_timestamp[()] |
Returns the current engine time as a |
exists(dynamic_property_name) |
Returns true if the dynamic property exists for the event, or false if the property does not exist. |
instanceof(expression, type_name [, type_name ...]) |
Returns true if the expression returns an object whose type is one of the types listed. |
istream() |
Returns true if the event is part of the insert stream and false if the event is part of the remove stream. |
max(expression, expression [, expression ...]) |
Returns the highest numeric value among the 2 or more comma-separated expressions. |
min(expression, expression [, expression ...]) |
Returns the lowest numeric value among the 2 or more comma-separated expressions. |
prev(expression, event_property) |
Returns a property value or all properties of a previous event, relative to the event order within a data window, or according to an optional index parameter (N) the positional Nth-from-last value. |
prevtail(expression, event_property) |
Returns a property value or all properties of the first event in a data window relative to the event order within a data window, or according to an optional index parameter (N) the positional Nth-from-first value. |
prevwindow(event_property) |
Returns a single property value of all events or all properties of all events in a data window in the order that reflects the sort order of the data window. |
prevcount(event_property) |
Returns the count of events (number of data points) in a data window. |
prior(integer, event_property) |
Returns a property value of a prior event, relative to the natural order of arrival of events |
typeof(expression) |
If expression is a stream name, returns the event type name of the evaluated event, often used with variant streams. If expression is a property name or expression, returns the name of the expression result type. |
The case
control flow function has two versions. The first version takes a value and a list of compare values to compare against, and returns the result where the first value equals the compare value. The second version takes a list of conditions and returns the result for the first condition that is true.
The return type of a case
expression is the compatible aggregated type of all return values.
The case
expression is sometimes used with the new
operator to return multiple results, see Section 9.13, “The 'new' Keyword”.
The example below shows the first version of a case
statement. It has a String
return type and returns the value 'one'.
select case myexpression when 1 then 'one' when 2 then 'two' else 'more' end from ...
The second version of the case
function takes a list of conditions. The next example has a Boolean
return type and returns the boolean value true.
select case when 1>0 then true else false end from ...
The cast
function casts the return type of an expression to a designated type. The function accepts two parameters:
The first parameter is the property name or expression that returns the value to be casted. The second parameter is the type to cast to.
You can use the as
keyword instead of comma (,
) to separate parameters.
Valid parameters for the second (type) parameter are:
Any of the Java built-in types: int, long, byte, short, char, double, float, string, BigInteger, BigDecimal
, where string
is a short notation for java.lang.String
and BigInteger
as well as BigDecimal
are the classes in java.math
. The type name is not case-sensitive. For example:
cast(price, double)
The fully-qualified class name of the class to cast to, for example:
cast(product as org.myproducer.Product)
For parsing date-time values, any of the date-time types: date, calendar, long
. For these types the dateformat
parameter is required as discussed below.
The cast
function is often used to provide a type for dynamic (unchecked) properties. Dynamic properties are properties whose type is not known at compile type.
These properties are always of type java.lang.Object
.
The cast
function as shown in the next statement casts the dynamic "price" property of an "item" in the OrderEvent to a double value.
select cast(item.price?, double) from OrderEvent
The cast
function returns a null
value if the expression result cannot be casted to the desired type, or if the expression result itself is null
.
The cast
function adheres to the following type conversion rules:
For all numeric types, the cast
function utilitzes java.lang.Number
to convert numeric types, if required.
For casts to string
or java.lang.String
, the function calls toString
on the expression result.
For casts to other objects including application objects, the cast
function considers a Java class's superclasses as well as all directly or indirectly-implemented interfaces by superclasses.
The cast
function can parse string-type date-time values to long-type milliseconds, Date
or Calendar
objects.
You must provide the dateformat
named parameter as the last parameter to the cast
function. The date format can be any platform date format
(SimpleDateFormat
) as a string-type value.
The next EPL outputs the date May 2, 2010 as a Date
-type value:
select cast('20100502', date, dateformat: 'yyyyMMdd') from OrderEvent
You may specify long
for a millisecond value or Calendar
for a calendar value instead of date
.
You may use date-time methods when cast
is returning a date-time value. Expressions can be any expression and do not need to be string constants.
You may parse dates that are ISO 8601-formatted dates by specifying iso
as the date format. The ISO 8601 date format is described in Section 7.6.5.1.1, “Specifying Dates”.
For example, assuming the orderDate
property is a ISO 8601 formatted date, the engine can convert it to a long millisecond value like this:
select cast(orderDate, long, dateformat: 'iso') from OrderEvent
The result of the coalesce
function is the first expression in a list of expressions that returns a non-null value. The return type is the compatible aggregated type of all return values.
This example returns a String-typed result of value 'foo':
select coalesce(null, 'foo') from ...
The current_evaluation_context
function takes no parameters and returns expression evaluation contextual information as an object of type
com.espertech.esper.client.hook.EPLExpressionEvaluationContext
. The object provides the engine URI, the statement name and the context partition id of
the currently-evaluated expression.
For example:
select current_evaluation_context().getEngineURI() as engineURI from MyEvent
The context partition id will be -1
when the statement is not associated to a context.
The current_timestamp
function is a reserved keyword and requires no parameters. The result of the current_timestamp
function is the long
-type millisecond value of the current engine system time.
The function returns the current engine timestamp at the time of expression evaluation. When using external-timer events, the function provides the last value of the externally-supplied time at the time of expression evaluation.
This example selects the current engine time:
select current_timestamp from MyEvent // equivalent to select current_timestamp() from MyEvent
The exists
function returns a boolean value indicating whether the dynamic property, provided as a parameter to the function, exists on the event. The exists
function accepts a single dynamic property name as its only parameter.
The exists
function is for use with dynamic (unchecked) properties. Dynamic properties are properties whose type is not known at compile type. Dynamic properties return a null value
if the dynamic property does not exists on an event, or if the dynamic property exists but the value of the dynamic property is null.
The exists
function as shown next returns true if the "item" property contains an object that has a "serviceName" property. It returns false if the "item" property is null, or if the "item" property does not
contain an object that has a property named "serviceName" :
select exists(item.serviceName?) from OrderEvent
The grouping
function is a SQL-standard function useful in statements that have a group by
-clause and that utilize one of the rollup
, cube
or grouping sets
keywords. The function can be used only in the select
-clause, having
-clause and order by
-clauses.
The function takes a single expression as a parameter and returns an integer value of zero or one indicating whether a specified expression in a group-by
-clause is aggregated or not.
The function returns 1 for aggregated or 0 for not aggregated.
The grouping
function can help you distinguish null values returned because of the output row's aggregation level from null values returned by event properties or other expressions.
The parameter expression must match exactly one of the expressions in the group-by
-clause.
Please see an example in the next section.
The grouping_id
function is a SQL-standard function useful in statements that have a group by
-clause and that utilize one of the rollup
, cube
or grouping sets
keywords. The function can be used only in the select
-clause, having
-clause and order by
-clauses.
The function takes one or more expressions as a parameter and returns an integer value indicating grouping level. The engine computes the grouping level by taking the results of multiple grouping
functions and concatenating them into a bit vector (a string of ones and zeros).
Assume a car event that has a property for name, place and number of cars:
create schema CarEvent(name string, place string, numcars int)
The next EPL computes the total number of cars for each of the following groupings: per name and place, per name, per place and overall.
select name, place, sum(numcars), grouping(name), grouping(place), grouping_id(name, place) from CarEvent group by grouping sets((name, place),name, place,())
Assume your application processes a car event with properties like so: CarEvent={name='skoda', place='france', numcars=100}
.
The engine outputs 4 rows as shown in the next table:
Table 10.2. Example output for grouping
and grouping_id
functions (CarEvent 1)
name | place | sum(numcars) | grouping(name) | grouping(place) | grouping_id(name, place) |
---|---|---|---|---|---|
skoda | france | 100 | 0 | 0 | 0 |
skoda | null | 100 | 0 | 1 | 1 |
null | france | 100 | 1 | 0 | 2 |
null | null | 100 | 1 | 1 | 3 |
Assume your application processes a second car event: CarEvent={name='skoda', place='germany', numcars=75}
.
The engine outputs 4 rows as shown in the next table:
Table 10.3. Example output for grouping
and grouping_id
functions (CarEvent 2)
name | place | sum(numcars) | grouping(name) | grouping(place) | grouping_id(name, place) |
---|---|---|---|---|---|
skoda | germany | 75 | 0 | 0 | 0 |
skoda | null | 175 | 0 | 1 | 1 |
null | germany | 75 | 1 | 0 | 2 |
null | null | 175 | 1 | 1 | 3 |
The parameter expressions must match exactly to expressions in the group-by
-clause.
The instanceof
function returns a boolean value indicating whether the type of value returned by the expression is one of the given types. The first parameter to the instanceof
function is an expression to evaluate. The second and subsequent parameters are Java type names.
The function determines the return type of the expression at runtime by evaluating the expression, and compares the type of object returned by the expression to the defined types.
If the type of object returned by the expression matches any of the given types, the function returns true
. If the expression returned null
or a type that does not
match any of the given types, the function returns false
.
The instanceof
function is often used in conjunction with dynamic (unchecked) properties. Dynamic properties are properties whose type is not known at compile type.
This example uses the instanceof
function to select different properties based on the type:
select case when instanceof(item, com.mycompany.Service) then serviceName? when instanceof(item, com.mycompany.Product) then productName? end from OrderEvent
The instanceof
function returns false
if the expression tested by instanceof returned null.
Valid parameters for the type parameter list are:
Any of the Java built-in types: int, long, byte, short, char, double, float, string
, where string
is a short notation for java.lang.String
. The type name is not case-sensitive. For example, the next function tests if the dynamic "price" property is either of type float or type double:
instanceof(price?, double, float)
The fully-qualified class name of the class to cast to, for example:
instanceof(product, org.myproducer.Product)
The function considers an event class's superclasses as well as all the directly or indirectly-implemented interfaces by superclasses.
The istream
function returns a boolean value indicating whether within the context of expression evaluation the current event or set of events (joins) are part of the insert stream (true) or part of the remove stream (false). The function takes no parameters.
Use the istream
function with data windows and select irstream
and insert irstream into
.
In the following example the istream
function always returns boolean true since no data window is declared:
select irstream *, istream() from OrderEvent
The next example declares a data window. For newly arriving events the function returns boolean true, for events that expire after 10 seconds the function returns boolean false:
select irstream *, istream() from OrderEvent.win:time(10 sec)
The istream
function returns true for all cases where insert or remove stream does not apply, such as when used in parameter expressions to data windows
or in stream filter expressions.
The min
and max
function take two or more parameters that itself can be expressions. The min
function returns the lowest numeric value among the 2 or more comma-separated expressions, while the max
function returns the highest numeric value.
The return type is the compatible aggregated type of all return values.
The next example shows the max
function that has a Double
return type and returns the value 1.1.
select max(1, 1.1, 2 * 0.5) from ...
The min
function returns the lowest value. The statement below uses the function to determine the smaller of two timestamp values.
select symbol, min(ticks.timestamp, news.timestamp) as minT from StockTickEvent.win:time(30 sec) as ticks, NewsEvent.win:time(30 sec) as news where ticks.symbol = news.symbol
The prev
function returns the property value or all event properties of a previous event. For data windows that introduce a sort order other than the order of arrival, such as the sorted data window and the time order data window, the function returns the event at the specified position.
The prev
function is not an aggregation function and therefore does not return results per group when used with group by
. Please consider the last
, lastever
or first
aggregation functions instead as described in Section 10.2.2, “Event Aggregation Functions”. You must use an aggregation function instead of prev
when querying a named window or table.
The first parameter to the prev
function is an index parameter and denotes the i-th previous event, in the order established by the data window. If no index is provided, the default index is 1 and the function returns the previous event.
The second parameter is a property name or stream name. If specifying a property name, the function returns the value for the previous event property value. If specifying a stream name, the function returns the previous event underlying object.
This example selects the value of the price
property of the 2nd-previous event from the current Trade event:
select prev(2, price) from Trade.win:length(10)
By using the stream alias in the previous
function, the next example selects the trade event itself that is immediately previous to the current Trade event
select prev(1, trade) from Trade.win:length(10) as trade
Since the prev
function takes the order established by the data window into account, the function works well with sorted windows.
In the following example the statement selects the symbol of the 3 Trade events that had the largest, second-largest and third-largest volume.
select prev(0, symbol), prev(1, symbol), prev(2, symbol) from Trade.ext:sort(3, volume desc)
The i-th previous event parameter can also be an expression returning an Integer-type value. The next statement joins the Trade data window with an RankSelectionEvent
event that provides a rank
property used to look up a certain position in the sorted Trade data window:
select prev(rank, symbol) from Trade.ext:sort(10, volume desc), RankSelectionEvent unidirectional
Use the prev
function in combination with a grouped data window to access a previous event per grouping criteria.
The example below returns the price of the previous Trade event for the same symbol, or null
if for that symbol there is no previous Trade event:
select prev(1, price) from Trade.std:groupwin(symbol).win:length(2)
The prev
function returns a null
value if the data window does not currently hold the i-th previous event. The example below illustrates this using a time batch window. Here the prev
function returns a null value for any events in which the previous event is not in the same batch of events. Note that the prior
function as discussed below can be used if a null value is not the desired result.
select prev(1, symbol) from Trade.win:time_batch(1 min)
An alternative form of the prev
function allows the index to not appear or appear after the property name if the index value is a constant and not an expression:
select prev(1, symbol) from Trade // ... equivalent to ... select prev(symbol) from Trade // ... and ... select prev(symbol, 1) from Trade
The combination of the prev
function and std:groupwin
view returns the property value for a previous event in the given data window group.
The following example returns for each event the current smallest price per symbol:
select symbol, prev(0, price) as topPricePerSymbol from Trade.std:groupwin(symbol).ext:sort(1, price asc)
The following restrictions apply to the prev
functions and its results:
The function always returns a null
value for remove stream (old data) events.
The function requires a data window view, or a std:groupwin
and data window view, without any additional sub-views. See Section 13.2, “Data Window Views” for built-in data window views.
The prev
function is similar to the prior
function. The key differences between the two functions are as follows:
The prev
function returns previous events in the order provided by the data window, while the prior
function returns prior events in the order of arrival as posted by a stream's declared views.
The prev
function requires a data window view while the prior
function does not have any view requirements.
The prev
function returns the previous event grouped by a criteria by combining the std:groupwin
view and a data window. The prior
function returns prior events posted by the last view regardless of data window grouping.
The prev
function returns a null
value for remove stream events, i.e. for events leaving a data window.
The prior
function does not have this restriction.
The prevtail
function returns the property value or all event properties of the positional-first event in a data window. For data windows that introduce a sort order other than the order of arrival, such as the sorted data window and the time order data window, the function returns the first event at the specified position.
The prevtail
function is not an aggregation function and therefore does not return results per group when used with group by
. Please consider the first
, firstever
or window
aggregation functions instead as described in Section 10.2.2, “Event Aggregation Functions”. You must use an aggregation function instead of prevtail
when querying a named window or table.
The first parameter is an index parameter and denotes the i-th from-first event in the order established by the data window. If no index is provided the default is zero and the function returns the first event in the data window. The second parameter is a property name or stream name. If specifying a property name, the function returns the value for the previous event property value. If specifying a stream name, the function returns the previous event underlying object.
This example selects the value of the price
property of the first (oldest) event held in the length window:
select prevtail(price) from Trade.win:length(10)
By using the stream alias in the prevtail
function, the next example selects the trade event itself that is the second event held in the length window:
select prevtail(1, trade) from Trade.win:length(10) as trade
Since the prevtail
function takes the order established by the data window into account, the function works well with sorted windows.
In the following example the statement selects the symbol of the 3 Trade events that had the smallest, second-smallest and third-smallest volume.
select prevtail(0, symbol), prevtail(1, symbol), prevtail(2, symbol) from Trade.ext:sort(3, volume asc)
The i-th previous event parameter can also be an expression returning an Integer-type value. The next statement joins the Trade data window with an RankSelectionEvent
event that provides a rank
property used to look up a certain position in the sorted Trade data window:
select prevtail(rank, symbol) from Trade.ext:sort(10, volume asc), RankSelectionEvent unidirectional
The prev
function returns a null
value if the data window does not currently holds positional-first or the Nth-from-first event. For batch data windows the value returned is relative to the current batch.
The following example returns the first and second symbol value in the batch:
select prevtail(0, symbol), prevtail(1, symbol) from Trade.win:time_batch(1 min)
An alternative form of the prevtail
function allows the index to not appear or appear after the property name if the index value is a constant and not an expression:
select prevtail(1, symbol) from Trade // ... equivalent to ... select prevtail(symbol) from Trade // ... and ... select prevtail(symbol, 1) from Trade
The combination of the prevtail
function and std:groupwin
view returns the property value for a positional first event in the given data window group.
Let's look at an example. This statement outputs the oldest price per symbol retaining the last 10 prices per symbol:
select symbol, prevtail(0, price) as oldestPrice from Trade.std:groupwin(symbol).win:length(10)
The following restrictions apply to the prev
functions and its results:
The function always returns a null
value for remove stream (old data) events.
The function requires a data window view, or a std:groupwin
and data window view, without any additional sub-views. See Section 13.2, “Data Window Views” for built-in data window views.
The prevwindow
function returns property values or all event properties for all events in a data window. For data windows that introduce a sort order other than the order of arrival, such as the sorted data window and the time order data window, the function returns the event data sorted in that order, otherwise it returns the events sorted by order of arrival with the newest arriving event first.
The prevwindow
function is not an aggregation function and therefore does not return results per group when used with group by
. Please consider the window
aggregation function instead as described in Section 10.2.2, “Event Aggregation Functions”. You must use an aggregation function instead of prevwindow
when querying a named window or table.
The single parameter is a property name or stream name. If specifying a property name, the function returns the value of the event property for all events held by the data window. If specifying a stream name, the function returns the event underlying object for all events held by the data window.
This example selects the value of the price
property of all events held in the length window:
select prevwindow(price) from Trade.win:length(10)
By using the stream alias in the prevwindow
function, the next example selects all trade events held in the length window:
select prevwindow(trade) from Trade.win:length(10) as trade
When used with a data window that introduces a certain sort order, the prevwindow
function returns events sorted according to that sort order.
The next statement outputs for every arriving event the current 10 underying trade event objects that have the largest volume:
select prevwindow(trade) from Trade.ext:sort(10, volume desc) as trade
The prevwindow
function returns a null
value if the data window does not currently hold any events.
The combination of the prevwindow
function and std:groupwin
view returns the property value(s) for all events in the given data window group.
This example statement outputs all prices per symbol retaining the last 10 prices per symbol:
select symbol, prevwindow(price) from Trade.std:groupwin(symbol).win:length(10)
The following restrictions apply to the prev
functions and its results:
The function always returns a null
value for remove stream (old data) events.
The function requires a data window view, or a std:groupwin
and data window view, without any additional sub-views. See Section 13.2, “Data Window Views” for built-in data window views.
The prevcount
function returns the number of events held in a data window.
The prevcount
function is not an aggregation function and therefore does not return results per group when used with group by
. Please consider the count(*)
or countever
aggregation functions instead as described in Section 10.2, “Aggregation Functions”. You must use an aggregation function instead of prevcount
when querying a named window or table.
The single parameter is a property name or stream name of the data window to return the count for.
This example selects the number of data points for the price
property held in the length window:
select prevcount(price) from Trade.win:length(10)
By using the stream alias in the prevcount
function the next example selects the count of trade events held in the length window:
select prevcount(trade) from Trade.win:length(10) as trade
The combination of the prevcount
function and std:groupwin
view returns the count of events in the given data window group.
This example statement outputs the number of events retaining the last 10 events per symbol:
select symbol, prevcount(price) from Trade.std:groupwin(symbol).win:length(10)
The following restrictions apply to the prev
functions and its results:
The function always returns a null
value for remove stream (old data) events.
The function requires a data window view, or a std:groupwin
and data window view, without any additional sub-views. See Section 13.2, “Data Window Views” for built-in data window views.
The prior
function returns the property value of a prior event. The first parameter is an integer value that denotes the i-th prior event in the natural order of arrival.
The second parameter is a property name for which the function returns the value for the prior event.
The second parameter is a property name or stream name. If specifying a property name, the function returns the property value for the prior event. If specifying a stream name, the function returns the prior event underlying object.
This example selects the value of the price
property of the 2nd-prior event to the current Trade event.
select prior(2, price) from Trade
By using the stream alias in the prior
function, the next example selects the trade event itself that is immediately prior to the current Trade event
select prior(1, trade) from Trade as trade
The prior
function can be used on any event stream or view and does not have any specific view requirements. The function operates on the order of arrival of events by the event stream or view that provides the events.
The next statement uses a time batch window to compute an average volume for 1 minute of Trade events, posting results every minute. The select-clause
employs the prior
function to select the current average and the average before the current average:
select average, prior(1, average) from TradeAverages.win:time_batch(1 min).stat:uni(volume)
The typeof
function, when parameterized by a stream name, returns the event type name of the evaluated event which can be useful with variant streams. When parameterized by an expression or property name, the function returns the type name of the expression result or null
if the expression result is null.
In summary, the function determines the return type of the expression at runtime by evaluating the expression and returns the type name of the expression result.
The typeof
function is often used in conjunction with variant streams. A variant stream is a predefined stream into which events of multiple disparate event types can be inserted. The typeof
function, when passed a stream name alias, returns the name of the event type of each event in the stream.
The following example elaborates on the use of variant streams with typeof
. The first statement declares a variant stream SequencePatternStream
:
create variant schema SequencePatternStream as *
The next statement inserts all order events and is followed by a statement to insert all product events:
insert into SequencePatternStream select * from OrderEvent;
insert into SequencePatternStream select * from PriceEvent;
This example statement returns the event type name for each event in the variant stream:
select typeof(sps) from SequencePatternStream as sps
The next example statement detects a pattern by utilizing the typeof
function to find pairs of order event immediately followed by product event:
select * from SequencePatternStream match_recognize( measures A as a, B as b pattern (A B) define A as typeof(A) = "OrderEvent", B as typeof(B) = "ProductEvent" )
When passing a property name to the typeof
function, the function evaluates whether the property type is event type (a fragment event type). If the property type is event type, the function returns the type name of the event in the property value or null
if not provided.
If the property type is not event type, the function returns the simple class name of the property value.
When passing an expression to the typeof
function, the function evaluates the expression and returns the simple class name of the expression result value or null
if the expression result value is null.
This example statement returns the simple class name of the value of the dynamic property prop
of events in stream MyStream
, or a null
value if the property is not found for an event or the property value itself is null
:
select typeof(prop?) from MyStream
When using subclasses or interface implementations as event classes or when using Map-event type inheritance, the function returns the event type name provided when the class or Map-type event was registered, or if the event type was not registered, the function returns the fully-qualified class name.
Aggregation functions are stateful and consider sets of events or value points. The group by
clause is often used in conjunction with aggregation functions to group the result-set by one or more columns.
Aggregation functions can be a column type for table declarations. This allows easy sharing of aggregated state, co-location of aggregations and other data as well as co-aggregation by multiple statements into the same aggregation state. Please see Section 6.1.2, “Table Overview” for details.
The EPL language extends the standard SQL aggregation functions by allowing filters and by further useful aggregation functions that can track a data window or compute event rates, for example. Your application may also add its own aggregation function as Section 18.5, “Aggregation Function” describes.
The EPL language allows each aggregation function to specify its own grouping criteria. Please find further information in Section 5.6.4, “Specifying grouping for each aggregation function”.
Aggregation values are always computed incrementally: Insert and remove streams result in aggregation value changes. The exceptions are on-demand queries and joins when using the unidirectional
keyword. Aggregation functions are optimized to retain the minimal
information necessary to compute the aggregated result, and to share aggregation state between eligible other aggregation functions in the same statement so that same-kind aggregation state is
never held multiple times unless required.
Most aggregation functions can also be used with unbound streams when no data window is specified. A few aggregation functions require a data window or named window as documented below.
The SQL-standard aggregation functions are shown in below table.
Table 10.4. Syntax and results of SQL-standard aggregation functions
Aggregate Function | Result |
---|---|
avedev([all|distinct] expression [, filter_expr]) |
Mean deviation of the (distinct) values in the expression, returning a value of The optional filter expression limits the values considered for computing the mean deviation. |
avg([all|distinct] expression [, filter_expr]) |
Average of the (distinct) values in the expression, returning a value of The optional filter expression limits the values considered for computing the average. |
count([all|distinct] expression [, filter_expr]) |
Number of the (distinct) non-null values in the expression, returning a value of The optional filter expression limits the values considered for the count. |
count(* [, filter_expr]) |
Number of events, returning a value of The optional filter expression limits the values considered for the count. |
max([all|distinct] expression)
fmax([all|distinct] expression, filter_expr)
|
Highest (distinct) value in the expression, returning a value of the same type as the expression itself returns.
Use
Consider using |
maxever([all|distinct] expression)
fmaxever([all|distinct] expression, filter_expr)
|
Highest (distinct) value - ever - in the expression, returning a value of the same type as the expression itself returns.
Use
Consider using |
median([all|distinct] expression [, filter_expr]) |
Median (distinct) value in the expression, returning a value of The optional filter expression limits the values considered for computing the median. |
min([all|distinct] expression)
fmin([all|distinct] expression, filter_expr)
|
Lowest (distinct) value in the expression, returning a value of the same type as the expression itself returns.
Use
Consider using |
minever([all|distinct] expression)
fminever([all|distinct] expression, filter_expr)
|
Lowest (distinct) value - ever - in the expression, returning a value of the same type as the expression itself returns.
Use
Consider using |
stddev([all|distinct] expression [, filter_expr]) |
Standard deviation of the (distinct) values in the expression, returning a value of The optional filter expression limits the values considered for computing the standard deviation. |
sum([all|distinct] expression [, filter_expr]) |
Totals the (distinct) values in the expression, returning a value of The optional filter expression limits the values considered for computing the total. |
If your application provides double-type values to an aggregation function, avoid using Not-a-Number (NaN) and infinity. Also when using double-type values, round-off errors (or rounding errors) may occur due to double-type precision. Consider rounding your result value to the desired precision.
Each of the aggregation functions above takes an optional filter expression as a parameter. The filter expression must return a boolean-type value and applies to the events considered for the aggregation. If a filter expression is provided, then only if the filter expression returns a value of true does the engine update the aggregation for that event or combination of events.
Consider the following example, which computes the quantity fraction of buy orders among all orders:
select sum(quantity, side='buy') / sum(quantity) as buy_fraction from Orders
Use the fmin
and fmax
aggregation functions instead of the min
and max
aggregation functions when providing a filter expression (the min
and max
functions are also single-row functions).
The next example computes the minimum quantity for buy orders and a separate minimum quantity for sell orders:
select fmin(quantity, side='buy'), fmin(quantity, side = 'sell') from Orders
This sample statement demonstrates specifying grouping criteria for an aggregation function using the group_by
named parameter.
It computes, for the last one minute of orders, the ratio of orders per account compared to all orders:
select count(*)/count(*, group_by:()) as ratio from Orders.win:time(1 min) group by account
The event aggregation functions return one or more events or event properties. When used with group by
the event aggregation functions
return one or more events or event properties per group.
The sorted
and the window
event aggregation functions require that a data window or named window is declared for the applicable stream.
They cannot be used on unbound streams.
The below table summarizes the event aggregation functions available:
Table 10.5. Event Aggregation Functions
Function | Result |
---|---|
first(...) |
Returns the first event or an event property value of the first event. |
last(...) |
Returns the last event or an event property value of the last event. |
maxby(criteria) |
Returns the event with the highest sorted value according to criteria expressions. |
maxbyever(criteria) |
Returns the event with the highest sorted value, ever, according to criteria expressions. |
minby(criteria) |
Returns the event with the lowest sorted value according to criteria expressions. |
minbyever(criteria) |
Returns the event with the lowest sorted value, ever, according to criteria expressions. |
sorted(criteria) |
Returns events sorted according to criteria expressions. |
window(...) |
Returns all events or all event's property values. |
In connection with named windows and tables, event aggregation functions can also be used in on-select
,
selects with named window or table in the from
clause, subqueries against named windows or tables and on-demand fire-and-forget queries.
The event aggregation functions are often useful in connection with enumeration methods and they can provide input events for enumeration. Please see Chapter 11, EPL Reference: Enumeration Methods for more information.
When comparing the last
aggregation function to the prev
function, the differences are as follows. The prev
function is not an aggregation function and thereby not sensitive to the presence of group by
. The prev
function accesses data window contents directly and respects the sort order of the data window. The last
aggregation function returns results based on arrival order and tracks data window contents in a separate shared data structure.
When comparing the first
aggregation function to the prevtail
function, the differences are as follows. The prevtail
function is not an aggregation function and thereby not sensitive to the presence of group by
. The prevtail
function accesses data window contents directly and respects the sort order of the data window. The first
aggregation function returns results based on arrival order and tracks data window contents in a separate shared data structure.
When comparing the window
aggregation function to the prevwindow
function, the differences are as follows. The prevwindow
function is not an aggregation function and thereby not sensitive to the presence of group by
. The prevwindow
function accesses data window contents directly and respects the sort order of the data window. The window
aggregation function returns results based on arrival order and tracks data window contents in a separate shared data structure.
When comparing the count
aggregation function to the prevcount
function, the differences are as follows. The prevcount
function is not an aggregation function and thereby not sensitive to the presence of group by
.
When comparing the last
aggregation function to the nth
aggregation function, the differences are as follows. The nth
aggregation function does not consider out-of-order deletes (for example with on-delete and sorted windows) and does not revert to the prior expression value when the last event or nth-event was deleted from a data window. The last
aggregation function tracks the data window and reflects out-of-order deletes.
From an implementation perspective, the first
, last
and window
aggregation functions share a common data structure for each stream.
The sorted
, minby
and maxby
aggregation functions share a common data structure for each stream.
The synopsis for the first
aggregation function is:
first(
*|stream.*|value_expression [, index_expression])
The first
aggregation function returns properties of the very first event. When used with group by
, it returns properties of the first event for each group. When specifying an index expression, the function returns properties of the Nth-subsequent event to the first event, all according to order of arrival.
The first parameter to the function is required and defines the event properties or expression result to return. The second parameter is an optional index_expression that must return an integer value used as an index to evaluate the Nth-subsequent event to the first event.
You may specify the wildcard (*
) character in which case the function returns the underlying event of the single selected stream. When selecting a single stream you may specify no parameter instead of wildcard. For joins and subqueries you must use the stream wildcard syntax below.
You may specify the stream name and wildcard (*
) character in the stream.*
syntax. This returns the underlying event for the specified stream.
You may specify a value_expression to evaluate for the first event. The value expression may not select properties from multiple streams.
The index_expression is optional. If no index expression is provided, the function returns the first event. If present, the function evaluates the index expression to determine the value for N, and evaluates the Nth-subsequent event to the first event. A value of zero returns the first event and a value of 1 returns the event subsequent to the first event. You may not specify event properties in the index expression.
The function returns null
if there are no events or when the index is larger than the number of events held. When used with group by
, it returns null
if there are no events for that group or when the index is larger than the number of events held for that group.
To explain, consider the statement below which selects the underlying event of the first sensor event held by the length window of 2 events.
select first(*) from SensorEvent.win:length(2)
Assume event E1, event E2 and event E3 are of type SensorEvent. When event E1 arrives the statement outputs the underlying event E1. When event E2 arrives the statement again outputs the underlying event E1. When event E3 arrives the statement outputs the underlying event E2, since event E1 has left the data window.
The stream wildcard syntax is useful for joins and subqueries. This example demonstrates a subquery that returns the first SensorEvent when a DoorEvent arrives:
select (select first(se.*) from SensorEvent.win:length(2) as se) from DoorEvent
The following example shows the use of an index expression. The output value for f1
is the temperature property value of the first event, the value for f2
is the temperature property value of the second event:
select first(temperature, 0) as f1, first(temperature, 1) as f2 from SensorEvent.win:time(10 sec)
You may use dot-syntax to invoke a method on the first event. You may also append a property name using dot-syntax.
The synopsis for the last
aggregation function is:
last(
*|stream.*|value_expression [, index_expression])
The last
aggregation function returns properties of the very last event. When used with group by
, it returns properties of the last event for each group. When specifying an index expression, the function returns properties of the Nth-prior event to the last event, all according to order of arrival.
Similar to the first
aggregation function described above, you may specify the wildcard (*
) character, no parameter or stream name and wildcard (*
) character or a value_expression to evaluate for the last event.
The index_expression is optional. If no index expression is provided, the function returns the last event. If present, the function evaluates the index expression to determine the value for N, and evaluates the Nth-prior event to the last event. A value of zero returns the last event and a value of 1 returns the event prior to the last event. You may not specify event properties in the index expression.
The function returns null
if there are no events or when the index is larger than the number of events held. When used with group by
, it returns null
if there are no events for that group or when the index is larger than the number of events held for that group.
The next statement selects the underlying event of the first and last sensor event held by the time window of 10 seconds:
select first(*), last(*) from SensorEvent.win:time(10 sec)
The statement shown next selects the last temperature (f1
) and the prior-to-last temperature (f1
) of sensor events in the last 10 seconds:
select last(temperature, 0) as f1, select last(temperature, 1) as f2 from SensorEvent.win:time(10 sec)
The synopsis for the maxby
aggregation function is:
maxby
(sort_criteria_expression [asc/desc][, sort_criteria_expression [asc/desc]...])
The maxby
aggregation function returns the greatest of all events, compared by using criteria expressions.
When used with group by
, it returns the greatest of all events per group.
This example statement returns the sensor id and the temperature of the sensor event that had the highest temperature among all sensor events:
select maxby(temperature).sensorId, maxby(temperature).temperature from SensorEvent
The next EPL returns the sensor event that had the highest temperature and the sensor event that had the lowest temperature, per zone, among the last 10 seconds of sensor events:
select maxby(temperature), minby(temperature) from SensorEvent.win:time(10 sec) group by zone
Your EPL may specify multiple criteria expressions. If the sort criteria expression is descending please append the desc
keyword.
The following EPL returns the sensor event with the highest temperature and if there are multiple sensor events with the highest temperature the query returns the sensor event that has the newest timestamp value:
select maxby(temperature asc, timestamp desc) from SensorEvent
Event properties that are listed in criteria expressions must refer to the same event stream and cannot originate from different event streams.
If your query does not define a data window and does not refer to a named window, the semantics of maxby
are the same as maxbyever
.
The synopsis for the maxbyever
aggregation function is:
maxbyever
(sort_criteria_expression [asc/desc][, sort_criteria_expression [asc/desc]...])
The maxbyever
aggregation function returns the greatest of all events that ever occurred, compared by using criteria expressions.
When used with group by
, it returns the greatest of all events that ever occurred per group.
Compared to the maxby
aggregation function the maxbyever
does not consider the data window or named window contents
and instead considers all arriving events.
The next EPL computes the difference, per zone, between the maximum temperature considering all events and the maximum temperature considering only the events in the last 10 seconds:
select maxby(temperature).temperature - maxbyever(temperature).temperature from SensorEvent.win:time(10) group by zone
The synopsis for the minby
aggregation function is:
minby
(sort_criteria_expression [asc/desc][, sort_criteria_expression [asc/desc]...])
Similar to the maxby
aggregation function, the minby
aggregation function returns the lowest of all events, compared by using criteria expressions.
When used with group by
, it returns the lowest of all events per group.
Please review the section on maxby
for more information.
Similar to the maxbyever
aggregation function, the minbyever
aggregation function returns the lowest of all events that ever occurred, compared by using criteria expressions.
When used with group by
, it returns the lowest of all events per group that ever occured.
Please review the section on maxbyever
for more information.
The synopsis for the sorted
aggregation function is:
sorted
(sort_criteria_expression [asc/desc][, sort_criteria_expression [asc/desc]...])
The sorted
aggregation function maintains a list of events sorted according to criteria expressions.
When used with group by
, it maintains a list of events sorted according to criteria expressions per group.
The sample EPL listed next returns events sorted according to temperature ascending for the same zone:
select sorted(temperature) from SensorEvent group by zone
Your EPL may specify multiple criteria expressions. If the sort criteria expression is descending please append the desc
keyword.
Enumeration methods can be useful in connection with sorted
as the function provides the sorted events as input.
This EPL statement finds the sensor event that when sorted according to temperature is the first sensor event for a Friday timestamp among sensor events for the same zone:
select sorted(temperature).first(v => timestamp.getDayOfWeek()=6) from SensorEvent
Event properties that are listed in criteria expressions must refer to the same event stream and cannot originate from different event streams.
If used in a regular select statement, the use of sorted
requires that your EPL defines a data window for the stream or utilizes a named window.
The synopsis for the window
aggregation function is:
window(
*|stream.*|value_expression)
The window
aggregation function returns all rows. When used with group by
, it returns the rows for each group.
Similar to the first
aggregation function described above, you may specify the wildcard (*
) character or stream name and wildcard (*
) character or a value_expression to evaluate for all events.
The function returns null
if there are no rows. When used with group by
, it returns null
if there are no rows for that group.
The next statement selects the underlying event of all events held by the time window of 10 seconds:
select window(*) from SensorEvent.win:time(10 sec)
If used in a regular select statement, the window
aggregation function requires that your stream is bound by a data window or a named window. You may not use the window
aggregation function on unbound streams with the exception of on-demand queries or subqueries.
This example statement assumes that the OrderWindow
named window exists. For each event entering or leaving the OrderWindow
named window
it outputs the total amount removing negative amounts:
select window(*).where(v => v.amount > 0).aggregate(0d, (r, v) => r + v.amount) from OrderWindow
Under approximation aggregation function we understand aggregations that perform approximate analysis. Compared to the previously-introduced aggregation functions, the functions discussed here have a degree of accuracy and probabilistic behavior.
Count-min sketch (or CM sketch) is a probabilistic sub-linear space streaming algorithm (source: Wikipedia). Count-min sketch computes an approximate frequency, without retaining distinct values in memory, making the algorithm suitable for summarizing very large spaces of distinct values. The estimated count can be used for estimated top-K and estimated heavy-hitters, for example.
The original and detail of the algorithm is described in the paper by Graham Cormode and S. Muthukrishnan. An improved data stream summary: The Count-min sketch and its applications (2004. 10.1016/j.jalgor.2003.12.001).
Count-min sketch can only be used with tables and is not available as an aggregation function other than in a table declaration.
Count-min sketch does not consider events leaving a data window and does not process a remove stream.
The table column type for Count-min sketch is countMinSketch
.
For example, the next EPL declares a table that holds a Count-min sketch (does not provision a top-K):
create table WordCountTable(wordcms countMinSketch())
You can parameterize the algorithm by providing a JSON-format structure to the declaration. The available parameters are all optional:
Table 10.6. Count-min Sketch Parameters
Name | Description |
---|---|
epsOfTotalCount |
Specifies the accuracy (number of values counted * accuracy >= number of errors) of type double. This value defaults to 0.0001. |
confidence |
Provides the certainty with which we reach the accuracy of type double. The default is 0.99. |
seed |
A seed value for computing hash codes of type integer. This default is 123456. |
topk |
The number of top-K values as an integer. If null, the algorithm maintains no top-K list. This value defaults to null (no top-K available). |
agent |
The agent is an extension API class that can interact with Count-min sketch state and also receives the value objects. The agent defines the type of the values that can be counted. The default agent only allows string-type values and utilizes UTF-16 charset encoding.
The default agent is |
The next example EPL declares all available parameters:
create table WordCountTable (wordcms countMinSketch({ epsOfTotalCount: 0.000002, confidence: 0.999, seed: 38576, topk: 20, agent: 'com.mycompany.CountMinSketchCustomAgent' }))
The default for the topk
parameter is null
. Thereby the engine by default does not compute top-K.
By specifying a positive integer value for topk
the algorithm maintains a list of values representing the top estimated counts.
By default, the Count-min sketch group of aggregation functions operates on string-type values only.
The aggregation function allows registering an agent that can handle any other type of value objects and that allows overriding behavior.
The agent class must implement the interface com.espertech.esper.client.util.CountMinSketchAgent
.
Please see the JavaDoc for implementing an agent. The agent API is an extension API and is subject to change between versions.
The countMinSketchAdd
function counts value(s). It expects a single parameter expression returning the value(s) to be counted.
The function can only be used with statements that utilize into table
.
This example EPL counts words:
into table WordCountTable select countMinSketchAdd(word) as wordcms from WordEvent
The countMinSketchFrequency
function returns an estimated count for a given value.
It expects a single parameter expression returning the value(s) for which to estimate and return the long-type count.
The function can only be used as a table-access function against a table column that declares the aggregation countMinSketch
.
The next example EPL returns, when a EstimateWordCountEvent
event arrives, the estimated frequency of a given word:
select WordCountTable.wordcms.countMinSketchFrequency(word) from EstimateWordCountEvent
The countMinSketchTopK
function returns top-K. The function expects no parameters.
The function can only be used as a table-access function against a table column that declares the aggregation countMinSketch
and only if the Count-min sketch was parameterized with a non-null topk
parameter (the default is null
, see declaration above).
The function returns an array of com.espertech.esper.client.util.CountMinSketchTopK
.
The following EPL outputs top-K every 10 seconds:
select WordCountTable.wordcms.countMinSketchTopk() from pattern[every timer:interval(10 sec)]
We provide a sample agent code that handles String-type values below. The complete code is available for class CountMinSketchAgentStringUTF16
as part of sources.
public class CountMinSketchAgentStringUTF16 implements CountMinSketchAgent { public Class[] getAcceptableValueTypes() { return new Class[] {String.class}; } public void add(CountMinSketchAgentContextAdd ctx) { String text = (String) ctx.getValue(); if (text == null) { return; } byte[] bytes = toBytesUTF16(text); ctx.getState().add(bytes, 1); // increase count by 1 } public Long estimate(CountMinSketchAgentContextEstimate ctx) { String text = (String) ctx.getValue(); if (text == null) { return null; } byte[] bytes = toBytesUTF16(text); return ctx.getState().frequency(bytes); } }
Esper provides the following additional aggregation functions beyond those in the SQL standard:
Table 10.7. Syntax and results of EPL aggregation functions
Aggregate Function | Result |
---|---|
countever(* [, filter_expr]) countever(expression [, filter_expr]) |
The
When used with a data window, the result of the function does not change as data points leave a data window. Use the
The optional filter expression limits the values considered for counting rows. The The next example statement outputs the count-ever for sell orders: select countever(*, side='sell') from Order |
firstever(expression [, filter_expr]) |
The
When used with a data window, the result of the function does not change as data points leave a data window. Use the The optional filter expression limits the values considered for retaining the first-ever value. The next example statement outputs the first price ever for sell orders: select firstever(price, side='sell') from Order |
lastever(expression [, filter_expr]) |
Returns the last value or last value per group, when used with This sample statement outputs the total price, the first price and the last price per symbol for the last 30 seconds of events and every 5 seconds: select symbol, sum(price), lastever(price), firstever(price) from StockTickEvent.win:time(30 sec) group by symbol output every 5 sec
When used with a data window, the result of the function does not change as data points leave a data window (for example when all data points leave the data window). Use the The optional filter expression limits the values considered for retaining the last-ever value. The next example statement outputs the last price (ever) for sell orders: select lastever(price, side='sell') from Order |
leaving() |
Returns true when any remove stream data has passed, for use in the
The
This sample statement uses select symbol, sum(price) from StockTickEvent.win:time(30 sec) having leaving() |
nth(expression, N_index) |
Returns the Nth oldest element; If N=0 returns the most recent value. If N=1 returns the value before the most recent value. If N is larger than the events held in the data window for this group, returns null. A maximum N historical values are stored, so it can be safely used to compare recent values in large views without incurring excessive overhead.
As compared to the This statement outputs every 2 seconds the groups that have new data and their last price and the previous-to-last price: select symbol, nth(price, 1), last(price) from StockTickEvent group by symbol output last every 2 sec |
rate(number_of_seconds) |
Returns an event arrival rate per second over the provided number of seconds, computed based on engine time.
Returns null until events fill the number of seconds. Useful with A sample statement to output, every 2 seconds, the arrival rate per second considering the last 10 seconds of events is shown here: select rate(10) from StockTickEvent output snapshot every 2 sec The aggregation function retains an engine timestamp value for each arriving event. |
rate(timestamp_property[, accumulator]) |
Returns an event arrival rate over the data window including the last remove stream event. The timestamp_property is the name of a long-type property of the event that provides a timestamp value. The first parameter is a property name or expression providing millisecond timestamp values. The optional second parameter is a property or expression for computing an accumulation rate: If a value is provided as a second parameter then the accumulation rate for that quantity is returned (e.g. turnover in dollars per second). This footprint is designed for use with a data window and requires a data window declared onto the stream. Returns null until events start leaving the window.
This sample statement outputs event rate for each group (symbol) with fixed sample size of four events
(and considering the last event that left). The select colour, rate(timestamp) as rate from StockTickEvent.std:groupwin(symbol).win:length(4) group by symbol |
Built-in aggregation functions can be disabled via configuration (see Section 16.4.22.4, “Extended Built-in Aggregation Functions”). A custom aggregation function of the same name as a built-on function may be registered to override the built-in function.
A user-defined function (UDF) is a single-row function that can be invoked anywhere as an expression itself or within an expresson. The function must simply be a public static method that the classloader can resolve at statement creation time. The engine resolves the function reference at statement creation time and verifies parameter types.
For information on calling external services via instance method invocation, please see Section 5.17.5, “Class and Event-Type Variables”. For invoking methods on events, please see Section 5.4.5, “Using the Stream Name”
You may register your own function name for the user-defined function. Please see the instructions in Section 18.3, “Single-Row Function” for registering a function name for a user-defined single-row function.
A single-row function that has been registered with a function name can simply be referenced as function_name(
parameters)
thus EPL statements can be less cluttered as no class name is required.
The engine also optimizes evaluation of such registered single-row functions when used in filter predicate expressions as described in Section 18.3.4, “Single-Row Functions in Filter Predicate Expressions”.
An example EPL statement that utilizes the discount
function is shown next (assuming that function has been registered).
select discount(quantity, price) from OrderEvent
When selecting from a single stream, use the wildcard (*)
character to pass the underlying event:
select discount(*) from OrderEvent
Alternatively use the stream alias or EPL pattern tag to pass an event:
select discount(oe) from OrderEvent as oe
User-defined functions can be also be invoked on instances of an event: Please see Section 5.4.5, “Using the Stream Name” to invoke event instance methods on a named stream.
Note that user-defined functions (not single-row functions) are candidate for caching their return result if the parameters passed are constants and they are not used chained. Please see below for details and configuration.
The example below assumes a class MyClass
that exposes a public static method myFunction
accepting 2 parameters, and
returing a numeric type such as double
.
select 3 * com.mycompany.MyClass.myFunction(price, volume) as myValue from StockTick.win:time(30 sec)
User-defined functions also take array parameters as this example shows. The section on Section 9.5, “Array Definition Operator” outlines in more detail the types of arrays produced.
select * from RFIDEvent where com.mycompany.rfid.MyChecker.isInZone(zone, {10, 20, 30})
Java class names have to be fully qualified (e.g. java.lang.Math) but Esper provides a mechanism for user-controlled imports of classes and packages as outlined in Section 16.4.6, “Class and package imports”.
User-defined functions can return any value including null
, Java objects or arrays. Therefore user-defined functions can serve to transform, convert or map events, or to extract information and assemble further events.
The following statement is a simple pattern that looks for events of type E1 that are followed by events of type E2. It assigns the tags "e1" and "e2" that the function can use to assemble a final event for output:
select MyLib.mapEvents(e1, e2) from pattern [every e1=E1 -> e2=E2]
User-defined functions may also be chained: If a user-defined function returns an object then the object can itself be the target of the next function call and so on.
Assume that there is a calculator
function in the MyLib
class that returns a class which provides the search
method taking two parameters. The EPL
that takes the result of the calculator
function and that calls the search
method on the result and returns its return value is shown below:
select MyLib.calculator().search(zonevariable, zone) from RFIDEvent]
A user-defined function should be implemented thread-safe.
A function that converts from one event type to another event type is shown in the next example. The first statement declares a stream that consists of MyEvent events. The second statement employs a conversion function to convert MyOtherEvent events to events of type MyEvent:
insert into MyStream select * from MyEvent insert into MyStream select MyLib.convert(other) from MyOtherEvent as other
In the example above, assuming the event classes MyEvent and MyOtherEvent are Java classes, the static method should have the following footprint:
public static MyEvent convert(MyOtherEvent otherEvent)
For user-defined functions that take no parameters or only constants as parameters the engine automatically caches the return result of the function, and invokes the function only once. This is beneficial to performance if your function indeed returns the same result for the same input parameters.
You may disable caching of return values of user-defined functions via configuration as described in Section 16.4.22.3, “User-Defined Function or Static Method Cache”.
EPL follows Java standards in terms of widening, performing widening automatically in cases where widening type conversion is allowed without loss of precision, for both boxed and primitive types.
When user-defined functions are overloaded, the function with the best match is selected based on how well the arguments to a function can match up with the parameters, giving preference to the function that requires the least number of widening conversions.
Boxing and unboxing of arrays is not supported in UDF as it is not supported in Java. For example, an array of Integer
and an array of int
are not compatible types.
When passing the event or underlying event to your method, either declare the parameter to take EventBean
(i.e. myfunc(EventBean event)
)
or as the underlying event type (i.e. myfunc(OrderEvent event)
).
When using {}
array syntax in EPL, the resulting type is always a boxed type: "{1, 2}"
is an array of Integer
(and not int
since it may contain null values), "{1.0, 2d}"
is an array of Double
and "{'A', "B"}"
is an array of String
, while "{1, "B", 2.0}"
is an array of Object
(Object[]
).
Esper can pass an object containing contextual information such as statement name, function name, engine URI and context partition id to your
method. The container for this information is EPLMethodInvocationContext
in package com.espertech.esper.client.hook
.
Please declare your method to take EPLMethodInvocationContext
as the last parameter. The engine then passes the information along.
A sample method footprint and EPL are shown below:
public static double computeSomething(double number, EPLMethodInvocationContext context) {...}
select MyLib.computeSomething(10) from MyEvent
The transpose
function is only valid in the select-clause and indicates that the result of the parameter expression should become the underlying event object of the output event.
The transpose
function takes a single expression as a parameter. The result object of the parameter expression is subject to transposing as described below.
The function can be useful with insert into
to allow an object returned by an expression to become the event itself in the output stream.
Any expression returning a Java object can be used with the transpose
function. Typical examples for expressions are a static method invocation, the result of an enumeration method, a plug-in single row function or a subquery.
The examples herein assume that a single-row function by name makeEvent
returns an OrderEvent
instance (a POJO object, not shown).
The following EPL takes the result object of the invocation of the makeEvent
method (assumed to be an OrderEvent instance) and returns the OrderEvent instance as the underlying event of the output event:
select transpose(makeEvent(oi)) from OrderIndication oi
Your select-clause can select additional properties or expressions. In this case the output event underlying object is a pair of the expression result object and the additional properties.
The next EPL also selects the origin
property of the order indication event. The output event is a pair of the OrderEvent instance and a map containing the property name and value of origin:
select origin, transpose(makeEvent(oi)) from OrderIndication oi
If the transpose
function is not a top-level function, i.e. if it occurs within another expression or within any other clause then the select-clause, the function simply returns the expression result of the parameter expression.
You may insert transposed output events into another stream.
If the stream name in the insert-into clause is already associated to an event type, the engine checks whether the event type associated to the stream name provided in the insert-into clause matches the event type associated to the object returned by the expression. If the stream name in the insert-into clause is not already associated to an existing event type the engine associates a new event type using the stream name provided in the insert-into clause.
The type returned by the expression must match the event representation that is defined for the stream, i.e. must be a subtype or implementation of the respective class (POJO, object-array or Map).
For example, the next statement associates the stream name OrderEvent
with a class. Alternatively this association can be achieved via static or runtime configuration API:
create schema OrderEvent as com.mycompany.OrderEvent
An EPL statement can insert into the OrderEvent
stream the OrderEvent
instance returned by the makeEvent
method, as follows:
insert into OrderEvent select transpose(makeEvent(oi)) from OrderIndication oi
It is not valid to select additional properties or expressions in this case, as they would not be part of the output event. The following is not valid:
// not valid insert into OrderEvent select origin, transpose(makeEvent(oi)) from OrderIndication oi