MSH-SWA-0001 - Expressions: Virtual streams in Scream - Application note

Chapter 2. Description

2.1 Introduction

Scream (version 4.2 and onwards) can interpret mathematical expressions that take one or more data streams as input and generate new, virtual data streams from the results. An expression is written as a formula and is evaluated according to the normal mathematical rules (e.g. operator precedence, parentheses, etc.). Common mathematical functions are supported (e.g. cos(), sin(), tan(), abs()). An expression must reference at least one other stream - real or virtual - because this determines the time base for the resulting virtual stream.

Expressions are evaluated on a sample-by-sample basis and then fed to a GCF encoder. The output streams appear in the main Scream window whenever the encoder produces a GCF block as output.

2.2 Configuration

Virtual streams are defined by adding entries to the [Expressions] section of the file scream.ini (or whichever file has been specified using the /i: command line option - see the Scream documentation for more details). The [Expressions] section must be created before it can be used: it is not present in the default scream.ini.

Each expression defines a single virtual output stream and must be entered on a single line. There is no limit to the length of the line. Any number of expressions can be entered but, because they may require a significant amount of processing power, adding too many may cause Scream to appear unresponsive. It is advisable to add expressions one at a time, checking the impact upon system performance at each stage.

By default, replayed streams cannot be used as inputs for virtual streams. If you wish to perform calculations on replayed streams, add the line

EvalExpressionsForFiles=1

to the [Custom] section of your scream.ini file. The [Custom] section must be created before it can be used: it is not present in the default scream.ini.
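The edit described above can also be scripted. The following is a minimal sketch, not part of Scream: the helper name ensure_ini_line and the [Main] section in the usage example are hypothetical, and treating section names as case-insensitive is an assumption. It works on the text of the file, so reading and writing scream.ini is left to the caller.

```python
def ensure_ini_line(ini_text, section, line):
    """Return ini_text with `line` present under [section], creating the section if needed."""
    header = f"[{section}]"
    lines = ini_text.splitlines()
    # Locate the section header (case-insensitive match is an assumption).
    start = next((i for i, text in enumerate(lines)
                  if text.strip().lower() == header.lower()), None)
    if start is None:
        # Section absent, as in a default scream.ini: append it at the end.
        if lines and lines[-1].strip():
            lines.append("")
        lines += [header, line]
    else:
        # The section ends at the next header or at the end of the file.
        end = next((i for i in range(start + 1, len(lines))
                    if lines[i].lstrip().startswith("[")), len(lines))
        if line not in (text.strip() for text in lines[start + 1:end]):
            lines.insert(start + 1, line)
    return "\n".join(lines) + "\n"


updated = ensure_ini_line("[Main]\nFoo=1\n", "Custom", "EvalExpressionsForFiles=1")
print(updated)
```

Calling the helper a second time on its own output leaves the text unchanged, so it is safe to run repeatedly.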

2.3 Expressions

Each expression is of the form

LHS=RHS // Instrument

where LHS is the left-hand side of the expression, RHS is the right-hand side of the expression and Instrument is any arbitrary text preceded by two solidus ('/') characters, used for grouping entries in Scream's source tree.

2.3.1 The left-hand side

The left-hand side of an expression defines the name of the virtual output stream which will be generated. It must be of the form

SYSID_STRID

where:

SYSID is the virtual System ID;
_ is a literal underscore character ('_'); and
STRID is the virtual Stream ID.

Both System ID and Stream ID can be arbitrary and need have no relation to the input stream IDs. They should, however, conform to the GCF specification - i.e. any name which would include an invalid GCF System ID or Stream ID would also be invalid in this context.

2.3.2 The right-hand side

The right-hand side of an expression contains a mathematical formula which can contain stream references, operators, constants, special variables and functions.

An optional, trailing instrument name is introduced by two solidus characters ('//'), as in comments in the C99 or C++ programming languages. Where an instrument name is supplied, the text is used as the instrument name in Scream's main window, which makes it easy to group related virtual streams.

For example, using // High-pass filtered as the instrument name for all of your filtered streams and also using // Integrated as the instrument name for all of your derived acceleration streams will cause these streams to appear in two clearly-labelled groups in the source tree in Scream's main window.
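The layout of an expression line can be illustrated with a small parser. This is a sketch for checking entries before adding them to scream.ini, not Scream's own parser: the names ExpressionLine and parse_expression_line and the IDs in the usage example are hypothetical. It assumes that the first underscore separates the System ID from the Stream ID, and it does not check the IDs against the GCF specification.

```python
from typing import NamedTuple, Optional


class ExpressionLine(NamedTuple):
    sysid: str
    strid: str
    rhs: str
    instrument: Optional[str]


def parse_expression_line(line):
    """Split 'SYSID_STRID=RHS // Instrument' into its four parts."""
    # Everything after the first "//" is the optional instrument name.
    body, sep, instrument = line.partition("//")
    # The first "=" separates the output stream name from the formula.
    lhs, eq, rhs = body.partition("=")
    if not eq:
        raise ValueError("expected LHS=RHS")
    # Assumption: the first underscore separates System ID from Stream ID.
    sysid, underscore, strid = lhs.strip().partition("_")
    if not (underscore and sysid and strid):
        raise ValueError("LHS must have the form SYSID_STRID")
    return ExpressionLine(sysid, strid, rhs.strip(),
                          instrument.strip() if sep else None)


parsed = parse_expression_line("DEMO_HPZ=STREAM_DEMO_RAWZ*2 // High-pass filtered")
print(parsed)
```

Grouping the parsed entries by their instrument field gives the same grouping that would appear in the source tree.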

The possible components of the right-hand side are described in the following sub-sections.

2.3.2.1 Stream references

Stream references must be of the form

STREAM_SYSID_STRID

STREAM_SYSID_STRID[offset]

where

STREAM_ is a token which must be entered exactly as shown - i.e. a literal, seven-character string;
SYSID specifies the System ID of the stream to be used as an input;
STRID specifies the Stream ID of the stream to be used as an input; and
offset is an integer (enclosed in square brackets) which specifies a time offset in units of samples. A negative offset refers to samples in the past, whilst a positive offset refers to samples in the future.

Note: The input streams can themselves be the results of expressions and such streams can be subscripted to refer to past and/or future samples. Circular referencing is not allowed, so (i) no two expressions can refer to each other and (ii) no expression can refer to its own present or future outputs. It is, however, permissible for an expression to refer to its own past samples. This allows, for example, IIR filters to be defined.

See the high-pass filter recipe in section 3.2.5 for an example of an expression which refers to its own past outputs.
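To show why self-reference to past samples is enough to build a recursive filter, the sketch below simulates a textbook first-order high-pass filter, y[n] = alpha × (y[n-1] + x[n] - x[n-1]), in Python. It is not necessarily the filter used in the recipe. The function name, the coefficient alpha and the zero start-up values are assumptions; how Scream treats past samples that do not yet exist is not covered here.

```python
def high_pass(samples, alpha):
    """First-order high-pass: y[n] = alpha * (y[n-1] + x[n] - x[n-1])."""
    out = []
    prev_x = prev_y = 0.0  # assumed start-up values for the "past" samples
    for x in samples:
        # prev_x plays the role of the input stream at offset [-1];
        # prev_y plays the role of the output stream's own sample at offset [-1].
        y = alpha * (prev_y + x - prev_x)
        out.append(y)
        prev_x, prev_y = x, y
    return out


# A constant (DC) input decays towards zero at the output.
print(high_pass([1.0, 1.0, 1.0, 1.0], 0.5))
```

In expression terms, x[n-1] corresponds to an input stream reference with offset [-1] and y[n-1] to the output stream's own reference with offset [-1]. With hypothetical IDs, and untested in Scream, such a line might look like DEMO_HPZ=0.5*(STREAM_DEMO_HPZ[-1]+STREAM_DEMO_RAWZ-STREAM_DEMO_RAWZ[-1]).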

2.3.2.2 Operators

All of the conventional arithmetic operators are available for use in expressions. A complete list of operators is given in section 4.1, along with their priorities. The lower the priority number in the table, the higher the precedence. This means that parentheses, which have the lowest priority number (0), bind most tightly and can always be used to override the built-in precedence rules.

2.3.2.3 Constants

Constants can be used in expressions. They can be entered in several different formats. For example:

12345 specifies an integer;
1.2345 specifies a floating-point number;
1.2345e6 specifies a floating-point number in exponent notation, equivalent to 1.2345×10⁶, i.e. 1,234,500;
$1A2B specifies an integer using hexadecimal notation; and
b00101011 specifies an integer using binary notation.
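As a cross-check on the hexadecimal and binary notations listed above, the following Python sketch converts a constant written in any of these formats into a number. The function name parse_constant is hypothetical, and the sketch assumes that the $ and b prefixes are written exactly as shown; it is not Scream's own parser.

```python
def parse_constant(text):
    """Convert one constant, in any of the notations listed above, to a number."""
    if text.startswith("$"):
        return int(text[1:], 16)  # hexadecimal, e.g. $1A2B
    if text.startswith("b"):
        return int(text[1:], 2)   # binary, e.g. b00101011
    # A decimal point or an exponent marks a floating-point constant.
    if any(c in text for c in ".eE"):
        return float(text)
    return int(text)


for constant in ["12345", "1.2345", "2.5e3", "$1A2B", "b00101011"]:
    print(constant, "->", parse_constant(constant))
```

Here $1A2B evaluates to 6699 and b00101011 evaluates to 43.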

2.3.2.4 Tokens

A number of special tokens can be used in the right-hand side of expressions. They are detailed in the following table.

Name	Meaning
STREAM_xxxxx	a sample value, as described in section 2.3.2.1
STREAMSPS	the sample rate of the output† stream, in samples per second
STREAMTIME	the time-stamp of the current sample of the output† stream

† Where multiple streams are used as inputs, the output sample rate need not be equal to the sample rate of every input stream - see section 2.3.3 for details.

2.3.2.5 Functions

Many functions are available for use in the right-hand side of expressions. The most useful, in this context, are typically the trigonometric functions, Sin(), Cos() and Tan(), and some of the numerical functions, such as SquareRoot() and Absolute().

In addition, two functions are provided for the special case of "rotating" data. This is useful when, for example, a borehole instrument has been installed without the N/S axis being physically aligned to true North. The two horizontal streams from the instrument can be combined mathematically to produce two virtual output streams which produce the same data that would be seen if the instrument had been aligned correctly.

These are ROTATEX(), which generates a virtual X component, and ROTATEY(), which generates a virtual Y component.

Usage:

ROTATEX( inputX, inputY, angle )

ROTATEY( inputX, inputY, angle )

where

inputX is the stream reference (see section 2.3.2.1) of the X channel of the input data;
inputY is the stream reference of the Y channel of the input data; and
angle specifies the angle, in degrees, through which the data should be rotated.

The underlying implementation for ROTATEX is:

Cos( ArcTan2(y, x) + angle ) × √(x² + y²)

and for ROTATEY is:

Sin( ArcTan2(y, x) + angle ) × √(x² + y²)

(The ArcTan2() function is an extension of the inverse Tan() function. By using the signs of its arguments, it is defined across all four quadrants.)

See section 3.2.6 for examples of the use of these two functions and section 4.2 for a complete list of available functions.
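The polar-form formulas given above are equivalent to the familiar rotation matrix, since cos(θ + α)·r = x·cos α - y·sin α and sin(θ + α)·r = x·sin α + y·cos α when x = r·cos θ and y = r·sin θ. The Python sketch below checks this numerically. The function names are hypothetical, and converting the angle from degrees to radians before adding it to the ArcTan2 result is an assumption about how the degree-valued argument is handled.

```python
import math


def rotate_x(x, y, angle_deg):
    """Polar form, as given for ROTATEX (angle converted to radians: an assumption)."""
    return math.cos(math.atan2(y, x) + math.radians(angle_deg)) * math.hypot(x, y)


def rotate_y(x, y, angle_deg):
    """Polar form, as given for ROTATEY (angle converted to radians: an assumption)."""
    return math.sin(math.atan2(y, x) + math.radians(angle_deg)) * math.hypot(x, y)


def rotate_matrix(x, y, angle_deg):
    """The same rotation written as a 2x2 rotation matrix applied to (x, y)."""
    a = math.radians(angle_deg)
    return (x * math.cos(a) - y * math.sin(a),
            x * math.sin(a) + y * math.cos(a))


x, y, angle = 3.0, 4.0, 30.0
print(rotate_x(x, y, angle), rotate_y(x, y, angle))
print(rotate_matrix(x, y, angle))
```

Both forms preserve the length of the horizontal vector, so the rotation changes only the apparent orientation of the data and not its amplitude.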

2.3.3 Sample rates

The sample rate of the virtual output stream on the left-hand side of the expression is determined by the sample rate(s) of the input stream(s) on the right-hand side of the expression.

If the right-hand side involves multiple streams with different sample rates, the output sample rate will be equal to the highest of the input sample rates. In this case, where values from a lower-rate input stream are referenced in the expression, the sample with the time-stamp nearest to the current output time-stamp will be used.
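The nearest-time-stamp rule can be sketched as follows. This is an illustration only: the function name nearest_sample is hypothetical, time-stamps are represented as plain numbers of seconds, and resolving an exact tie in favour of the earlier sample is an assumption.

```python
from bisect import bisect_left


def nearest_sample(times, values, t):
    """Return the value whose time-stamp is nearest to t (times sorted ascending)."""
    i = bisect_left(times, t)
    if i == 0:
        return values[0]
    if i == len(times):
        return values[-1]
    # Pick the closer neighbour; ties go to the earlier sample (an assumption).
    before, after = times[i - 1], times[i]
    return values[i - 1] if t - before <= after - t else values[i]


# A 1 sample-per-second input seen from a 4 samples-per-second output time base.
low_times = [0.0, 1.0, 2.0]
low_values = [10, 20, 30]
output_times = [0.0, 0.25, 0.5, 0.75, 1.0]
print([nearest_sample(low_times, low_values, t) for t in output_times])
```

Each low-rate value is therefore repeated across several output samples, centred on its own time-stamp.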

2.4 Chaining expressions

As explained in section 2.3.2.1, expressions can use the output of other expressions as their input streams. This allows complex expressions to be built up from simpler sub-expressions with the intermediate results available as additional streams. See the integration recipe in section 3.2.7 for an example of this technique.

2.5 Start-up considerations

When a new stream is created, there is a twenty-second delay before the calculation of expressions begins. This allows for the possibility of back-fill starting shortly after a digitiser becomes available over the network: it helps ensure that processing starts with the back-fill data rather than just the live data.

This can cause problems when generating virtual streams from replayed files. One mitigation technique is to start the replay and stop it as soon as all streams have appeared, wait thirty seconds and then start the replay again.