Commit e7c62c5

author
sandeepmittal
committed
updated sphix docs: AD and API
1 parent 94ec04f commit e7c62c5

3 files changed

Lines changed: 14 additions & 5 deletions

sphinx/source/api/api_code.rst

Lines changed: 7 additions & 0 deletions
@@ -156,6 +156,13 @@ ParamInterface
    :project: api
    :path: ../../../include/chimbuko/param.hpp

+CopodParam
+----------
+
+.. doxygenfile:: copod_param.hpp
+   :project: api
+   :path: ../../../include/chimbuko/param/copod_param.hpp
+
 HbosParam
 ---------

sphinx/source/appendix/appendix_usage.rst

Lines changed: 2 additions & 2 deletions
@@ -25,7 +25,7 @@ Options for the provenance database:
 Options for the parameter server:

 - **ad_win_size** : Number of events around an anomaly to store; provDB entry size is proportional to this
-- **ad_alg** : AD algorithm to use. "sstd" or "hbos"
+- **ad_alg** : AD algorithm to use. "sstd" or "hbos" or "copod"
 - **ad_outlier_sstd_sigma** : number of standard deviations that defines an outlier.
 - **ad_outlier_hbos_threshold** : The percentile of events outside of which are considered anomalies by the HBOS algorithm.

@@ -172,7 +172,7 @@ Additional AD Variables
 - **-program_idx** : For workflows with multiple component programs, a "program index" must be supplied to the AD instances attached to those processes.
 - **-rank** : By default the data rank assigned to an AD instance is taken from its MPI rank in MPI_COMM_WORLD. This rank is used to verify the incoming trace data. This option allows the user to manually set the rank index.
 - **-override_rank** : This option disables the data rank verification and instead overwrites the data rank of the incoming trace data with the data rank stored in the AD instance. The value supplied must be the original data rank (this is used to generate the correct trace filename).
-- **-ad_algorithm** : This sets the AD algorithm to use for online analysis: "sstd" or "hbos". Default value is "hbos".
+- **-ad_algorithm** : This sets the AD algorithm to use for online analysis: "sstd" or "hbos" or "copod". Default value is "hbos".
 - **-hbos_threshold** : This sets the threshold to control density of detected anomalies used by HBOS algorithm. Its value ranges between 0 and 1. Default value is 0.99


sphinx/source/introduction/ad.rst

Lines changed: 5 additions & 3 deletions
@@ -39,14 +39,16 @@ of a function :math:`i`, respectively, and :math:`\alpha` is a control parameter

 Advanced anomaly analysis
 ~~~~~~~~~~~~~~~~~~~~~~~~~
-A determistic and non-parametric statistical anomaly detection algorithm called Histogram Based Outilier Scoring (HBOS) is implemented as part of Chimbuko's anomaly analysis module. HBOS is an unsupervised anomaly detection algorithm which scores data in linear time. It supports dynamic bin widths which ensures long-tail distributions of function executions are captured and global anomalies are detected better. HBOS normalizes the histogram and calculates the anomaly scores by taking inverse of estimated densities of function executions. The score is a multiplication of the inverse of the estimated densities given by the following Equation
+1. Histogram Based Outlier Score (HBOS) is a deterministic and non-parametric statistical anomaly detection algorithm. It is implemented as part of Chimbuko's anomaly analysis module. HBOS is an unsupervised anomaly detection algorithm which scores data in linear time. It supports dynamic bin widths which ensures long-tail distributions of function executions are captured and global anomalies are detected better. HBOS normalizes the histogram and calculates the anomaly scores by taking inverse of estimated densities of function executions. The score is a multiplication of the inverse of the estimated densities given by the following Equation

 .. math::
    HBOS_{i} = \log_{2} (1 / density_{i})

-where :math:`i` is a function execution and :math:`density_{i}` is function execution probability. HBOS works in :math:`O(nlogn)` using dynamic bin-width or in linear time :math:`O(n)` using fixed bin width. After scoring, the top 1% of scores are filtered as anomalous function executions. This filter value can be set at runtime to adjust the density of detected anomalies.
+where :math:`i` is a function execution and :math:`density_{i}` is function execution probability. HBOS works in :math:`O(nlogn)` using dynamic bin-width or in linear time :math:`O(n)` using fixed bin width. After scoring, the top 1% of scores are filtered as anomalous function executions. This filter value can be set at runtime to adjust the density of detected anomalies.

-(See `ADOutlier <../api/api_code.html#adoutlier>`__ and `HbosParam <../api/api_code.html#hbosparam>`__).
+2. Another algorithm is added into Chimbuko's advanced anomaly analysis called the COPula based Outlier Detection (COPOD), which is a deterministic, parameter-free anomaly detection algorithm. It computes empirical copulas for each sample in the dataset. A copula defines the dependence structure between random variables. For each sample in the dataset, COPOD algorithm computes left-tail empirical copula from left-tail empirical cumulative distribution function, right-tail copula from right-tail empirical cumulative distribution function, and a skewness-corrected empirical copula using a skewness coefficient calculated from left-tail and right-tail empirical cumulative distribution functions. These three computed values are interpreted as left-tail, right-tail, and skewness-corrected probabilities, respectively. Lowest probability value results in largest negative-log value, which is the score assigned to the sample in the dataset. Samples with the highest scores in the dataset are tagged as anomalous.
+
+(See `ADOutlier <../api/api_code.html#adoutlier>`__, `HbosParam <../api/api_code.html#hbosparam>`__ and `CopodParam <../api/api_code.html#copodparam>`__).

 Provenance data collection
 --------------------------
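
Illustrative sketch (not part of the commit): the HBOS and COPOD descriptions in the ad.rst change can be made concrete with a small single-variable example. This is not Chimbuko's C++ implementation. It assumes fixed-width bins for HBOS, takes density_i to be the fraction of samples that fall in the sample's bin, and applies COPOD to one variable only (function runtime), so the copula reduces to the tail probabilities themselves. The names `hbos_scores`, `copod_scores`, and `flag_top`, the `n_bins` parameter, and the top-fraction cutoff rule are hypothetical and chosen only for illustration.

```python
import math
from bisect import bisect_left, bisect_right


def hbos_scores(runtimes, n_bins=10):
    """Score each runtime as log2(1 / density) using fixed-width bins."""
    lo, hi = min(runtimes), max(runtimes)
    width = (hi - lo) / n_bins or 1.0  # fall back to 1.0 if all values match
    # Map each value to a bin index; the maximum lands in the last bin.
    bins = [min(int((x - lo) / width), n_bins - 1) for x in runtimes]
    counts = [0] * n_bins
    for b in bins:
        counts[b] += 1
    n = len(runtimes)
    # Assumption: density_i is the fraction of samples in the sample's bin,
    # so 1 / density_i equals n / counts[bin].
    return [math.log2(n / counts[b]) for b in bins]


def copod_scores(runtimes):
    """Score each runtime by the largest negative log tail probability."""
    n = len(runtimes)
    ordered = sorted(runtimes)
    mean = sum(runtimes) / n
    m2 = sum((x - mean) ** 2 for x in runtimes) / n
    m3 = sum((x - mean) ** 3 for x in runtimes) / n
    skew = m3 / m2 ** 1.5 if m2 > 0 else 0.0
    scores = []
    for x in runtimes:
        left = bisect_right(ordered, x) / n        # empirical P(X <= x)
        right = (n - bisect_left(ordered, x)) / n  # empirical P(X >= x)
        # Skewness correction: use the left tail for negative skew,
        # otherwise the right tail.
        corrected = left if skew < 0 else right
        scores.append(max(-math.log(p) for p in (left, right, corrected)))
    return scores


def flag_top(scores, threshold=0.99):
    """Flag roughly the top (1 - threshold) fraction of scores (assumed rule)."""
    k = max(1, round((1.0 - threshold) * len(scores)))
    cutoff = sorted(scores, reverse=True)[k - 1]
    return [s >= cutoff for s in scores]


if __name__ == "__main__":
    data = [10.0] * 99 + [1000.0]  # one slow call among 99 fast ones
    for name, fn in (("hbos", hbos_scores), ("copod", copod_scores)):
        flags = flag_top(fn(data), threshold=0.99)
        print(name, [i for i, f in enumerate(flags) if f])
```

With 99 runtimes of 10 and a single runtime of 1000, both scoring functions give the slow call the highest score, and `flag_top` with a threshold of 0.99 marks only that call. How Chimbuko itself turns `hbos_threshold` into a cutoff may differ from this top-fraction rule.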
