
boon-docs

Boon Logic Software Documentation


Introduction to the Boon Nano

The Boon Nano is a high-speed, high-efficiency clustering and segmentation algorithm based on unsupervised machine learning. The Nano builds clusters of n-space vectors (or patterns) in real time based on their similarity. Each pattern has a sequence of features that the Nano uses in its measurement of similarity.

Figure 1: Semi-structured data is segmented by the Boon Nano based on L1-distance similarity.

Examples of Patterns

Using the Boon Nano

The Boon Nano clusters its input data by assigning to each pattern an integer called its cluster ID. Patterns assigned the same cluster ID are similar in the sense of having a small L1-distance from each other. The similarity required for patterns assigned to the same cluster is determined by the percent variation setting and the configured feature ranges (described below). Sometimes the Nano processes a pattern that is not similar to any of the existing clusters. In this case, one of two actions is taken. If learning mode is on and the maximum allowed number of clusters has not been reached, the pattern becomes the first member of a new cluster and the number of clusters in the model increases by one. If learning mode is off or the maximum number of clusters has been reached, the pattern is assigned the special cluster ID 0. No assumption can be made about the similarity of patterns assigned to cluster 0 to one another, but they are all known to be significantly different from the non-zero clusters in the existing model. This dynamic learning process of assigning patterns to existing clusters or, if needed, creating new clusters is called training a model.
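The assignment rules above can be sketched in a few lines of Python. This is a toy illustration, not the Boon Nano implementation: the class name `ToyClusterer`, the single `threshold` parameter (a hypothetical stand-in for the combined effect of percent variation and feature ranges), and the choice to keep each cluster's first member as its fixed template are all assumptions made for the example.

```python
from typing import List, Sequence


def l1_distance(a: Sequence[float], b: Sequence[float]) -> float:
    """Sum of absolute per-feature differences between two patterns."""
    return sum(abs(x - y) for x, y in zip(a, b))


class ToyClusterer:
    """Illustrative only; NOT the Boon Nano algorithm."""

    def __init__(self, threshold: float, max_clusters: int, learning: bool = True):
        # Hypothetical stand-in for percent variation and feature ranges.
        self.threshold = threshold
        self.max_clusters = max_clusters
        self.learning = learning
        # templates[i] is the template of cluster ID i + 1 (assumed: first member).
        self.templates: List[List[float]] = []

    def assign(self, pattern: Sequence[float]) -> int:
        # Find the nearest existing template by L1-distance.
        best_id, best_dist = 0, float("inf")
        for idx, template in enumerate(self.templates):
            dist = l1_distance(pattern, template)
            if dist < best_dist:
                best_id, best_dist = idx + 1, dist
        if best_dist <= self.threshold:
            return best_id                      # similar enough: reuse the cluster
        if self.learning and len(self.templates) < self.max_clusters:
            self.templates.append(list(pattern))
            return len(self.templates)          # first member of a new cluster
        return 0                                # learning off or model full


nano = ToyClusterer(threshold=1.0, max_clusters=2)
ids = [nano.assign(p) for p in [[0, 0], [0.2, 0.3], [5, 5], [9, 9]]]
print(ids)  # [1, 1, 2, 0]
```

The last pattern is far from both templates, and the toy model is already at its maximum of two clusters, so it receives cluster ID 0.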

Figure 2: The number of clusters grows quickly as the first patterns from the input data are processed. The slope of the growth curve levels off as the model matures and as nearly all incoming patterns already have a cluster to which they can be assigned.

The Boon Nano is deployed both in a general-use platform called Expert Console and as a streaming sensor analytics application called Amber.

Configuring the Boon Nano

Clustering Configuration

The Boon Nano uses the clustering configuration to determine the properties of the model that will be built for the input data.

Figure 3: One feature (all samples from the same sensor) and streaming window size of 25. Each input vector is 25 successive samples where we form successive patterns by dropping the oldest sample from the current pattern and appending the next sample from the input stream.
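The windowing described in Figure 3 can be sketched as follows. The function name `sliding_patterns` is hypothetical, and the sketch assumes the samples are already available as an in-memory sequence rather than arriving from a live stream.

```python
from typing import Iterator, List, Sequence


def sliding_patterns(samples: Sequence[float], window_size: int) -> Iterator[List[float]]:
    """Yield successive patterns of `window_size` samples.

    Each pattern drops the oldest sample of the previous pattern
    and appends the next sample from the input.
    """
    for start in range(len(samples) - window_size + 1):
        yield list(samples[start:start + window_size])


# 30 samples from one sensor with a streaming window size of 25.
patterns = list(sliding_patterns(list(range(30)), 25))
print(len(patterns))                      # 6
print(patterns[1][0], patterns[1][-1])    # 1 25
```

With 30 samples and a window of 25, six overlapping patterns are produced, each shifted by one sample from the previous one.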

Autotuning Configuration

Two clustering parameters, the percent variation and the range for each feature, can be autotuned, that is, chosen automatically by the Boon Nano after prescanning representative data. The ranges can be autotuned either individually for each feature or as a single range that applies to all features.

One of the most difficult parameters to configure in unsupervised machine learning is the desired number of clusters needed to produce the best results (as with K-means) or (in the case of the Boon Nano) the desired percent variation to use. This is because one would not generally know a priori the underlying proximity structure of the input vectors to be segmented.

To address this, the Boon Nano can automatically tune its percent variation to create a balanced combination of coherence within clusters and separation between clusters. In nearly all cases, autotuning produces the best value for the percent variation setting. However, if more granularity is desired, you can lower the percent variation manually. Similarly, if the autotuned percent variation is creating too much granularity (and too many clusters), you can manually increase the percent variation above the autotuned value.

Clustering Results

When a single pattern is assigned a cluster ID, this is called an inference. Besides its cluster ID, a number of other useful analytic outputs are generated. Here is a description of each of the ML-based outputs generated by the Boon Nano in response to each pattern.

Cluster ID (ID)

The Boon Nano assigns a Cluster ID to each input vector as it is processed. Following configuration of the Nano, the first pattern is always assigned to a new cluster with ID 1. The next pattern, if it is within the defined percent variation of cluster 1, is also assigned to cluster 1. Otherwise it is assigned to a new cluster 2. Continuing this way, all patterns are assigned cluster IDs in such a way that each pattern in each cluster is within the desired percent variation of that cluster’s template. In some circumstances the cluster ID 0 may be assigned to a pattern. This happens, for example, if learning has been turned off or if the maximum cluster count has been reached. Note that cluster IDs are assigned serially, so having numerically close cluster IDs (for instance, 17 and 18) says nothing about the similarity of those clusters. However, a number of other measurements described below can be used to understand the relation of the pattern to the model.

In some conditions, a negative cluster ID may be assigned to a pattern. A negative ID indicates that the pattern is not part of the learned model, but the Boon Nano can still give information about the pattern as it relates to the learned model. This can be especially useful for anomaly detection where a pattern falls outside the model. In this case Root Cause Analysis using the negative cluster ID will provide additional information about the reason the pattern is outside the model. (See Root Cause Analysis below.)

Operational Mode (OM)

Although cluster IDs do not indicate anything about the relative proximity of clusters in n-dimensional space, in many situations the sequence of cluster IDs being assigned to successive patterns will indicate a standing operating state of the data source generating the patterns (for example, sensor telemetry from a motor running for hours at a set speed). We derive the Operational Mode value by computing the average of the most recent cluster IDs. The OM value can be used to identify significant changes in the operational state of the data source. A significant change in OM value indicates that something has changed in the recent sequence of cluster IDs. For example, a motor running at 10% and then increasing to 50% will show a discrete shift in OM value. Conversely, OM values can be used across time to link states of the data source that are likely to be the same. For example, a motor running at 10%, then 50%, and then back to 10% will show a discrete shift in OM from the 10% mode to the 50% mode and then back to the 10% mode. This converse approach must be used with caution, as it is possible, though unlikely, that two different operational states of a data source could have the same OM value.
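A minimal sketch of this averaging is shown below. The text does not say how many recent cluster IDs the Nano averages, so the `window` parameter and the function name `operational_mode` are assumptions made for illustration.

```python
from collections import deque
from typing import Deque, Iterable, List


def operational_mode(cluster_ids: Iterable[int], window: int) -> List[float]:
    """Average of the most recent `window` cluster IDs (window length is assumed)."""
    recent: Deque[int] = deque(maxlen=window)
    om = []
    for cid in cluster_ids:
        recent.append(cid)                 # the oldest ID falls out automatically
        om.append(sum(recent) / len(recent))
    return om


# A source that alternates between clusters 1 and 2, then shifts to 7 and 8.
om_values = operational_mode([1, 2, 1, 2, 7, 8, 7, 8], window=4)
print(om_values[3], om_values[-1])  # 1.5 7.5
```

The same sketch shows why the converse approach needs caution: the sequences `[2, 2, 2, 2]` and `[1, 3, 1, 3]` come from different cluster assignments but both average to 2.0.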

Two Anomaly Axes

The Boon Nano can measure anomalies along two different axes: frequency and distance. Together these two axes capture the full scope of what may be considered as an “anomaly” within a set of input patterns.

Raw Anomaly Index (RI) (Frequency Anomaly)

The Boon Nano assigns to each pattern a Raw Anomaly Index that indicates how many patterns are in its cluster relative to other clusters. These integer values range from 0 to 1000, where values close to zero signify patterns that are the most common and happen very frequently. Values close to 1000 signify patterns that are very infrequent, and a pattern is considered more anomalous the closer its value gets to 1000. Patterns with a cluster ID of 0 have a raw anomaly index of 1000.

Smoothed Anomaly Index (SI) (Frequency Anomaly)

Building on the raw anomaly index, we create a Smoothed Anomaly Index which is an edge-preserving, exponential, smoothing filter applied to the raw anomaly indexes of successive input patterns. These values are also integer values ranging from 0 to 1000 with similar meanings as the raw anomaly index. In cases where successive input patterns do not indicate any temporal or local proximity, this smoothing may not be meaningful.
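As a rough illustration of the exponential part of this filter, the sketch below applies plain exponential smoothing to a sequence of raw anomaly indexes. It is an assumption-laden simplification: the smoothing factor `alpha` is hypothetical, and the edge-preserving behavior of the Nano's filter is not specified here and is not modeled.

```python
from typing import Iterable, List


def exponential_smooth(raw_indexes: Iterable[int], alpha: float) -> List[int]:
    """Plain exponential smoothing; `alpha` is a hypothetical parameter.

    The edge-preserving behavior mentioned in the text is NOT modeled.
    """
    smoothed = []
    state = None
    for value in raw_indexes:
        # The first value seeds the filter; later values are blended in.
        state = value if state is None else alpha * value + (1 - alpha) * state
        smoothed.append(round(state))      # indexes are reported as integers
    return smoothed


si = exponential_smooth([0, 0, 1000, 0, 0], alpha=0.5)
print(si)  # [0, 0, 500, 250, 125]
```

A single spike in the raw index is damped and then decays over the following patterns, which is why smoothing is only meaningful when successive patterns are related in time or space.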

Figure 4: Raw sensor signal (Blue) and SI, the Smoothed Anomaly Index (Amber), showing a rarely occurring pattern in the sensor stream model.

Frequency Index (FI) (Frequency Anomaly)

Similar to the anomaly indexes, the Frequency Index measures the relative number of patterns placed in each cluster. The frequency index measures all cluster sizes relative to the average cluster size. Values equal to 1000 indicate clusters of about average size, neither abnormally frequent nor abnormally infrequent. Values close to 0 are abnormally infrequent, and values significantly above 1000 are abnormally frequent.
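One simple scaling that matches this description is a linear ratio of each cluster's size to the average cluster size, multiplied by 1000. The sketch below uses that ratio purely as an assumption; the exact formula used by the Nano is not given in this document, and the function name `frequency_indexes` is hypothetical.

```python
from typing import Dict


def frequency_indexes(cluster_counts: Dict[int, int]) -> Dict[int, int]:
    """Cluster size relative to the average size, scaled so average == 1000.

    Assumed linear scaling; not confirmed as the Nano's formula.
    """
    average = sum(cluster_counts.values()) / len(cluster_counts)
    return {cid: round(1000 * count / average) for cid, count in cluster_counts.items()}


# Hypothetical cluster sizes: the average cluster holds 20 patterns.
fi = frequency_indexes({1: 10, 2: 20, 3: 30})
print(fi)  # {1: 500, 2: 1000, 3: 1500}
```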

Distance Index (DI) (Distance Anomaly)

The Distance Index measures the distance of each cluster template to the centroid of all of the cluster templates. This overall centroid is used as the reference point for this measurement. The values range from 0 to 1000, where values close to 1000 indicate patterns furthest from the center and values close to 0 indicate patterns very close to it. Patterns in a space that are similar distances apart have values that are close to the average distance from all clusters to the centroid.
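The geometric idea can be sketched as follows. The distance from each template to the centroid follows the description above, using L1-distance as elsewhere in this document; the mapping onto the 0 to 1000 range (dividing by the largest distance) is an assumption chosen only to make the example concrete, and the function name `distance_indexes` is hypothetical.

```python
from typing import List, Sequence


def distance_indexes(templates: Sequence[Sequence[float]]) -> List[int]:
    """L1-distance of each cluster template to the centroid of all templates.

    Scaling by the largest distance is an ASSUMPTION, not the Nano's formula.
    """
    n_features = len(templates[0])
    centroid = [sum(t[i] for t in templates) / len(templates) for i in range(n_features)]
    distances = [sum(abs(x - c) for x, c in zip(t, centroid)) for t in templates]
    largest = max(distances)
    if largest == 0:
        return [0 for _ in distances]      # all templates coincide with the centroid
    return [round(1000 * d / largest) for d in distances]


# Three hypothetical 2-feature templates; their centroid is (2, 2).
di = distance_indexes([[0, 0], [2, 0], [4, 6]])
print(di)  # [667, 333, 1000]
```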

Novelty Index (NI) (Combined Distance and Frequency Anomaly)

The Novelty Index measures along both anomaly axes in one value. The values range from 0 to 1000 with the following interpretations.

Practically speaking, NI values close to 0 indicate a non-anomalous pattern. As the NI grows toward 1000, it indicates increasing non-compliance or non-alignment of patterns relative to the trained model.

Figure 5: Novelty Index of an industrial robot measuring current from 6 joints as a single 6-feature pattern. Over time the Novelty Index increases from 0 (fully compliant relative to training) and shows increasing non-compliance as the asset wears toward a maintenance event.

Smoothed Novelty Index (NS) (Combined Distance and Frequency Anomaly)

The Smoothed Novelty Index is the exponentially smoothed sum of recent novelty indexes (NI). As such, it is suited for time series where a single spike in the NI value may not be significant but where a trend of increasing NI values indicates increasing non-compliance of patterns relative to the model.

Root Cause Analysis (RC)

Each processed pattern is assigned a cluster ID. The Boon Nano associates each ID with a Root Cause vector. This vector has the same number of values as the number of features in the patterns of the model. Each value in the vector represents the corresponding feature’s significance in the cluster to which the pattern was assigned. Values range from 0 to 1, where relatively high values indicate features that are more influential in the creation of the cluster. Values close to 0 lack statistical significance and no conclusion can be drawn from them. This is especially important for anomaly detection, where the RCA vector can be used to determine which features are implicated in a detected anomaly (Figure 6).

Figure 6: The RCA vector shown in the figure was from a 6-joint cobot running a 3-position motion profile. The Novelty Index for one pattern was near 600 as shown in Figure 5. This NI value was returned along with a negative cluster ID. The RCA vector for this negative ID indicates that joints 2 and 3 are implicated in this anomaly.

Nano Status: Information about the Nano Model

While Nano Results (previous section) give specific analytic results for the patterns in the most recently processed sample buffer, Nano Status provides core analytics about the Nano model itself as it has been constructed since the Nano was configured. The status information is indexed by cluster ID beginning with cluster 0.

anomalyIndexes

The values in this list give the raw anomaly index (RI) for each cluster in the Nano’s current model. The values range from 0, for the cluster that has been assigned the most patterns, up to a maximum of 1000, for a cluster that has been assigned only one pattern. Cluster 0 always has an anomaly index of 1000.

Figure 7: Pulmonary CT image using PCA coloring to show distinct tissue textures and the gradients between them.

Example

We now present a very simple example to illustrate some of these ideas. A set of 48 patterns is shown in the figure below. A quick look across these indicates that there are at least two different clusters here. Each pattern has 16 features so we configure the Nano for

Figure 8: A collection of 48 16-dimensional vectors to be clustered

We could select the minimum and maximum by visual inspection, but it is not possible to determine the correct Percent Variation this way. So we instead load the patterns into the Nano and tell the Nano to autotune those parameters. The results come back with:

We configure the Nano with these parameters and then run the patterns through the Nano, requesting as a result the “ID” assigned to each input pattern. We receive back the following list: {1, 1, 2, 1, 1, 2, 1, 2, 1, 1, 1, 1, 2, 2, 2, 1, 1, 1, 1, 1, 1, 1, 1, 3, 1, 1, 2, 2, 2, 1, 2, 1, 1, 2, 1, 1, 1, 1, 1, 1, 3, 3, 2, 2, 2, 2, 1, 1}
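The returned list can be tallied with a few lines of Python to see how many patterns landed in each cluster. The variable names are arbitrary; the IDs are copied from the list above.

```python
from collections import Counter

# Cluster IDs returned by the Nano for the 48 input patterns (copied from above).
cluster_ids = [1, 1, 2, 1, 1, 2, 1, 2, 1, 1, 1, 1, 2, 2, 2, 1,
               1, 1, 1, 1, 1, 1, 1, 3, 1, 1, 2, 2, 2, 1, 2, 1,
               1, 2, 1, 1, 1, 1, 1, 1, 3, 3, 2, 2, 2, 2, 1, 1]

counts = Counter(cluster_ids)
for cluster_id, n_patterns in sorted(counts.items()):
    print(f"cluster {cluster_id}: {n_patterns} patterns")
# cluster 1: 30 patterns
# cluster 2: 15 patterns
# cluster 3: 3 patterns
```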

Comparing this to the sequence in the figure, we see that this is a reasonable clustering assignment. Further, we see that there is a third cluster that may have been missed by our intuitive clustering. This cluster had just three patterns assigned to it. The figure below shows the waveforms plotted on the same axes and colored according to their assigned cluster IDs.

Figure 9: 48 patterns colored according to their assigned clusters

The Raw Anomaly Index values for the three clusters are as follows:

This indicates that Cluster 1 had the most patterns assigned to it. Cluster 2 was also common, and Cluster 3 was significantly less common. It is worth noting that a Raw Anomaly Index of 563 would not be sufficient in practice to indicate an anomaly in the machine learning model. Typically, useful anomaly indexes must be in the range of 700 to 1000 to indicate a pattern that is far outside the norm of what has been learned.

Simplification Disclaimer: This is an artificially small and simple example to illustrate the meaning of some of the basic principles of using the Boon Nano. In particular: