Esper may generate and compile code upon statement creation using the Janino compiler. Code generation is a technique that combines state-of-the-art approaches from modern compilers and MPP databases.
Code generation can significantly speed up processing as it eliminates virtual calls and especially megamorphic calls (a callsite with 3 or more possible implementations is megamorphic). Code generation allows the runtime to optimize the generated code and allows the hardware to execute faster.
The engine implements the best architecture for performance engineering in data processing by performing code generation. Not all workloads can benefit from code generation to the same degree.
Code generation is enabled by default and can be disabled entirely. Please refer to Section 15.4.12, “Engine Settings related to Code Generation and Compilation” for configuration options.
For example, consider the expression a + b (field a plus field b).
Upon creating a statement the engine performs these steps:
Analyzes the expression and determines where fields a and b come from (for example event type or variable) and the field type (for example string or integer).
Verifies that the addition operation can be applied to the two fields, that is, that both fields are numeric and can thus be added.
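The two steps above can be pictured with a minimal Java sketch. It is an illustration only and not the engine's implementation: the class PlusValidationSketch, its methods, and the modeling of an event type as a map from property name to Java class are all hypothetical.

```java
import java.util.HashMap;
import java.util.Map;

// Hypothetical sketch of statement-creation-time validation; not Esper's internal API.
final class PlusValidationSketch {

    // Step 1: determine where a field comes from and what type it has.
    // Assumption: the only source is an event type modeled as property name -> Java class.
    static Class<?> resolveFieldType(Map<String, Class<?>> eventType, String field) {
        Class<?> type = eventType.get(field);
        if (type == null) {
            throw new IllegalArgumentException("Unknown field: " + field);
        }
        return type;
    }

    // Step 2: verify that '+' can be applied, i.e. that both fields are numeric.
    static void validatePlus(Map<String, Class<?>> eventType, String left, String right) {
        for (String field : new String[] {left, right}) {
            Class<?> type = resolveFieldType(eventType, field);
            if (!Number.class.isAssignableFrom(type)) {
                throw new IllegalArgumentException(
                    "Field '" + field + "' of type " + type.getSimpleName() + " is not numeric");
            }
        }
    }

    public static void main(String[] args) {
        Map<String, Class<?>> eventType = new HashMap<>();
        eventType.put("a", Integer.class);
        eventType.put("b", Double.class);
        eventType.put("name", String.class);

        validatePlus(eventType, "a", "b"); // both numeric, no exception
        try {
            validatePlus(eventType, "a", "name"); // String is not numeric
        } catch (IllegalArgumentException ex) {
            System.out.println("Rejected: " + ex.getMessage());
        }
    }
}
```

Because both checks run once when the statement is created, evaluation at runtime does not need to repeat them.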
Without code generation, in order to evaluate the expression a + b the engine needs to make at least 3 virtual calls: one to obtain the value of field a, one to obtain the value of field b and one to perform the + (plus) operation.
With code generation the engine can reduce the number of virtual calls. In the best case the number of virtual calls to evaluate the a + b expression is one (for the invocation of the generated code itself).
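The difference in call counts can be sketched in plain Java. This is a simplified illustration, not the code the engine produces: the interface IntEvaluator, the class SampleEvent and all node classes are hypothetical names, and the fields are assumed to be integers on a simple event object.

```java
// Hypothetical event with two integer fields.
final class SampleEvent {
    final int a;
    final int b;
    SampleEvent(int a, int b) { this.a = a; this.b = b; }
}

// Hypothetical evaluator interface; every call through it is a virtual call.
interface IntEvaluator {
    int evaluate(SampleEvent event);
}

// Interpreted form: one node object per sub-expression.
final class FieldANode implements IntEvaluator {
    public int evaluate(SampleEvent event) { return event.a; }
}

final class FieldBNode implements IntEvaluator {
    public int evaluate(SampleEvent event) { return event.b; }
}

final class PlusNode implements IntEvaluator {
    private final IntEvaluator left;
    private final IntEvaluator right;
    PlusNode(IntEvaluator left, IntEvaluator right) { this.left = left; this.right = right; }
    public int evaluate(SampleEvent event) {
        // Two further interface calls; with three implementations of IntEvaluator
        // in the program, these call sites can become megamorphic.
        return left.evaluate(event) + right.evaluate(event);
    }
}

// Roughly the shape generated code could take: the whole expression in one method,
// so the only virtual call left is the invocation of evaluate itself.
final class GeneratedAPlusB implements IntEvaluator {
    public int evaluate(SampleEvent event) { return event.a + event.b; }
}

final class VirtualCallSketch {
    static IntEvaluator interpreted() {
        return new PlusNode(new FieldANode(), new FieldBNode());
    }

    static IntEvaluator generated() {
        return new GeneratedAPlusB();
    }

    public static void main(String[] args) {
        SampleEvent event = new SampleEvent(2, 3);
        System.out.println(interpreted().evaluate(event)); // three virtual calls
        System.out.println(generated().evaluate(event));   // one virtual call
    }
}
```

Both evaluators return the same result; only the number of calls through the interface differs.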
All code generation takes place at time of EPL statement creation. There is no code generation at runtime.
In the default configuration, the engine generates code for interdependent expressions (expressions that depend on other expressions) and their complete evaluation path including code for obtaining event property values.
The engine does not generate code for (not a comprehensive list):
Constants and other expressions that can typically be evaluated with zero or very few virtual calls.
Expressions that only perform a state lookup such as the prev or prior function.
By default, if code generation fails, the engine logs a WARN-level message and falls back to regular evaluation, all at time of EPL statement creation. Please report any stack traces as a GitHub issue and include the code that was produced by code generation as well as the EPL statement. The fallback can be disabled by configuration.
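The fallback behavior described above follows a common pattern, sketched here in Java. The class CodegenFallbackSketch, the method compileOrFallback and the fallbackEnabled flag are hypothetical and do not correspond to engine classes or configuration names; the sketch uses java.util.logging, whose WARNING level stands in for WARN.

```java
import java.util.function.Supplier;
import java.util.logging.Level;
import java.util.logging.Logger;

// Hypothetical illustration of "try code generation, else use regular evaluation".
final class CodegenFallbackSketch {
    private static final Logger LOG = Logger.getLogger(CodegenFallbackSketch.class.getName());

    // codegen: attempts to produce the generated evaluator and may throw.
    // regular: the evaluator used when code generation fails.
    // fallbackEnabled: assumed flag modeling "the fallback can be disabled".
    static <T> T compileOrFallback(Supplier<T> codegen, T regular, boolean fallbackEnabled) {
        try {
            return codegen.get();
        } catch (RuntimeException ex) {
            if (!fallbackEnabled) {
                throw ex; // fail statement creation instead of falling back
            }
            LOG.log(Level.WARNING, "Code generation failed, using regular evaluation", ex);
            return regular;
        }
    }

    public static void main(String[] args) {
        String chosen = compileOrFallback(
            () -> { throw new IllegalStateException("simulated compile error"); },
            "regular",
            true);
        System.out.println(chosen);
    }
}
```

The decision happens once, at statement creation, so evaluation at runtime never switches between the two paths.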
You can log generated classes at INFO log level by setting the configuration flag for code logging as described in Section 15.4.16.5, “Code Generation Logging”.
As an alternative you can configure your log provider configuration file by setting DEBUG level for class com.espertech.esper.codegen.compile.CodegenCompilerJanino (provider class may change between versions).
The information herein is for developers and is specific to the Janino compiler at the version provided with the distribution.
To have Janino generate classes into a given directory, set the system property org.codehaus.janino.source_debugging.dir to a file system directory.
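As a minimal sketch, the property can be set programmatically before any statements are created. The class JaninoSourceDirSketch and the temporary directory are hypothetical; only the property name comes from the text above. Passing the property on the JVM command line with -D should be equivalent, and setting it early is assumed to be necessary because the compiler may read it only once.

```java
import java.io.IOException;
import java.nio.file.Files;
import java.nio.file.Path;

// Hypothetical helper; only the property name is taken from the documentation.
final class JaninoSourceDirSketch {
    static final String PROPERTY = "org.codehaus.janino.source_debugging.dir";

    // Creates the directory if needed and points the system property at it.
    static Path configure(Path dir) throws IOException {
        Files.createDirectories(dir);
        System.setProperty(PROPERTY, dir.toAbsolutePath().toString());
        return dir;
    }

    public static void main(String[] args) throws IOException {
        // Assumption: a temporary directory is acceptable for a debugging session.
        Path dir = Files.createTempDirectory("esper-generated");
        configure(dir);
        System.out.println(PROPERTY + "=" + System.getProperty(PROPERTY));
        // Create the engine and EPL statements after this point.
    }
}
```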
The IDE can debug into generated classes and show the source code provided that the IDE can access the source code.
To include debug symbol information in the class binaries, or to include additional comments regarding the generating code itself in the generated source code, you must change the configuration as outlined in Section 15.4.12, “Engine Settings related to Code Generation and Compilation”.