sphinx/source/appendix/appendix_usage.rst
148 additions & 7 deletions
Original file line number
Diff line number
Diff line change
@@ -2,9 +2,128 @@
2
2
Usage
3
3
*********
4
4
5
+
Chimbuko Config
6
+
~~~~~~~~~~~~~~~
7
+
8
+
Options for visualization module:
9
+
10
+
- **viz_root** : Path to the visualization module.
11
+
- **viz_worker_port** : The port on which to run the Redis server for the visualization backend.
12
+
- **viz_port** : The port on which to run the webserver.
13
+
14
+
General options for Chimbuko backend:
15
+
16
+
- **backend_root** : The root install directory of the PerformanceAnalysis libraries. If set to "infer", it is inferred from the path of the executables.
17
+
18
+
Options for the provenance database:
19
+
20
+
- **provdb_nshards** : Number of database shards
21
+
- **provdb_engine** : The OFI libfabric provider used for the Mochi stack.
22
+
- **provdb_port** : The port of the provenance database
23
+
- **provdb_nthreads** : Number of worker threads; should be >= the number of shards
24
+
25
+
Options for the parameter server:
26
+
27
+
- **ad_win_size** : Number of events around an anomaly to store; provDB entry size is proportional to this
28
+
- **ad_alg** : AD algorithm to use. "sstd" or "hbos"
29
+
- **ad_outlier_sstd_sigma** : Number of standard deviations that defines an outlier.
30
+
- **ad_outlier_hbos_threshold** : The percentile outside of which events are considered anomalies by the HBOS algorithm.
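The options above could be collected in a config script along the following lines. This is only a sketch: the paths, port numbers, shard and thread counts, and window size are illustrative placeholders rather than documented defaults, and the `config_ok` flag is a hypothetical helper added here to check the two constraints stated in the option descriptions.

```bash
# Illustrative Chimbuko config fragment. Values are placeholder assumptions
# unless a comment says they come from this document.

# Visualization module
viz_root="/opt/chimbuko/viz"      # hypothetical install path
viz_worker_port=6379              # placeholder port for the Redis server
viz_port=5002                     # placeholder port for the webserver

# Chimbuko backend
backend_root="infer"              # infer the install directory from the executables

# Provenance database
provdb_nshards=4                  # placeholder
provdb_engine="ofi+tcp;ofi_rxm"   # example value given under "Additional ProvDB Variables"
provdb_port=5000                  # placeholder
provdb_nthreads=4                 # should be >= provdb_nshards

# Anomaly detection
ad_win_size=5                     # placeholder
ad_alg="hbos"                     # "sstd" or "hbos"
ad_outlier_sstd_sigma=6           # placeholder
ad_outlier_hbos_threshold=0.99    # placeholder

# Hypothetical sanity checks for the constraints described above
config_ok=1
if [ "${provdb_nthreads}" -lt "${provdb_nshards}" ]; then
    echo "provdb_nthreads should be >= provdb_nshards" >&2
    config_ok=0
fi
case "${ad_alg}" in
    sstd|hbos) ;;
    *) echo "ad_alg must be \"sstd\" or \"hbos\"" >&2; config_ok=0 ;;
esac
echo "config_ok=${config_ok}"
```

Checking the thread count against the shard count in the config itself surfaces a misconfiguration before any service is launched.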
31
+
32
+
Options for TAU:
33
+
34
+
- **export TAU_ADIOS2_ENGINE=${value}** : Online communication engine. SST is recommended; BP4 is an alternative, although it goes through the disk system and may be slower unless the BP files are stored on a burst buffer.
35
+
- **export TAU_ADIOS2_ONE_FILE=FALSE** : Use a different connection file for each rank.
- **export TAU_ADIOS2_PERIOD=1000000** : Period in microseconds between ADIOS2 I/O steps.
38
+
- **export TAU_THREAD_PER_GPU_STREAM=1** : Force GPU streams to appear as different TAU virtual threads.
39
+
- **export TAU_THROTTLE=1** : Enable (1) or disable (0) throttling of short-running functions.
40
+
- **TAU_ADIOS2_PATH** : Path where the ADIOS2 files are to be stored. The Chimbuko services create the directory chimbuko/adios2 in the working directory, and this should be used by default.
41
+
- **TAU_ADIOS2_FILE_PREFIX** : The prefix of the TAU ADIOS2 files; the full filename is ${TAU_ADIOS2_FILE_PREFIX}-${EXE_NAME}-${RANK}.bp
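As a minimal sketch, the TAU options listed above might be set as follows. The prefix value `tau-metrics` and the helper `tau_bp_filename` are hypothetical names introduced for illustration, and the sketch assumes that the prefix in the filename pattern is the **TAU_ADIOS2_FILE_PREFIX** variable and that the files are placed under **TAU_ADIOS2_PATH**.

```bash
# TAU environment options as listed above
export TAU_ADIOS2_ENGINE=SST            # recommended online engine
export TAU_ADIOS2_ONE_FILE=FALSE        # a different connection file for each rank
export TAU_ADIOS2_PERIOD=1000000        # microseconds between ADIOS2 I/O steps
export TAU_THREAD_PER_GPU_STREAM=1
export TAU_THROTTLE=1

# Chimbuko-side settings; the directory is the default described above
TAU_ADIOS2_PATH="$(pwd)/chimbuko/adios2"
TAU_ADIOS2_FILE_PREFIX="tau-metrics"    # placeholder prefix (assumption)

# Hypothetical helper: compose the expected file path for one executable and rank
# from the pattern <prefix>-<exe name>-<rank>.bp
tau_bp_filename() {
    printf '%s/%s-%s-%s.bp\n' "${TAU_ADIOS2_PATH}" "${TAU_ADIOS2_FILE_PREFIX}" "$1" "$2"
}

# Convert the I/O period from microseconds to whole seconds for display
period_s=$(( TAU_ADIOS2_PERIOD / 1000000 ))
echo "ADIOS2 I/O step period: ${period_s} s"
tau_bp_filename main 0
```

Here `main` and `0` stand in for the executable name and the rank; with these inputs the helper prints the path of the file that rank 0 of `main` would be expected to write.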
42
+
43
+
Launch Services
44
+
~~~~~~~~~~~~~~~
45
+
46
+
Description of running the Chimbuko head node Services:
47
+
First, this script sources the Chimbuko config script, whose variables are defined in the `previous section <./appendix_usage.html#chimbuko-config>`_.
48
+
49
+
Next, it instantiates the provenance database using the following command:
where **${provdb_addr}** is the address of the provenance database and the other variables are defined `here <../appendix/appendix_usage.html#additional-provdb-variables>`_.
56
+
57
+
Next, the following commands instantiate the visualization module:
After the visualization module (its variables are described `here <./appendix_usage.html#visualization-variables>`_) is successfully instantiated, the parameter server is launched as part of the Chimbuko services.
The command line variables used as input for the **pserver** command are described `here <../appendix/appendix_usage.html#parameter-server-variables>`_.
121
+
5
122
Additional ProvDB Variables
6
123
~~~~~~~~~~~~~~~~~~~~~~~~~~~
7
124
125
+
- **-nthreads** : Number of threads used by the provenance database.
126
+
- **-nshards** : Number of shards used by the provenance database.
8
127
- **-db_write_dir** : The path on disk to which the provenance database is written.
9
128
- **-engine** : This is the OFI libfabric provider used for the Mochi stack. Its value can be set to "ofi+tcp;ofi_rxm".
10
129
@@ -20,9 +139,11 @@ Visualization Variables
20
139
Parameter Server Variables
21
140
~~~~~~~~~~~~~~~~~~~~~~~~~~
22
141
23
-
- **PSERVER_NT** : The number of threads used to handle incoming communications from the AD modules
24
-
- **PSERVER_LOGDIR** : A directory for logging output
25
-
- **PSERVER_ALG** : Set AD algorithm to use for online analysis: "sstd" or "hbos". Default value is "hbos".
142
+
- **-port ${pserver_port}** : The port used by the parameter server.
143
+
- **-nt ${pserver_nt}** : The number of threads used to handle incoming communications from the AD modules
144
+
- **-logdir ${log_dir}** : A directory for logging output
145
+
- **-ad ${pserver_alg}** : Set AD algorithm to use for online analysis: "sstd" or "hbos". Default value is "hbos".
146
+
- **${ps_extra_args}** : Extra arguments passed to the parameter server.
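To show how these arguments fit together, the sketch below assembles a **pserver** command line from the variables named above and prints it without launching anything. The port, thread count, and log directory are placeholder assumptions, and the real services script may pass further arguments that are not shown here.

```bash
# Placeholder values (assumptions for illustration only)
pserver_port=5559
pserver_nt=2
log_dir="chimbuko/logs"
pserver_alg="hbos"          # "sstd" or "hbos"
ps_extra_args=""            # any additional arguments, space separated

# Build the argument list from the documented flags
set -- pserver -port "${pserver_port}" -nt "${pserver_nt}" \
       -logdir "${log_dir}" -ad "${pserver_alg}"

# Append the extra arguments only when some are given; the unquoted
# expansion deliberately splits them on whitespace
if [ -n "${ps_extra_args}" ]; then
    set -- "$@" ${ps_extra_args}
fi

# Print the command instead of running it
pserver_cmd="$*"
echo "${pserver_cmd}"
```

Printing the command first makes it easy to confirm the flags before the parameter server is actually started.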
26
147
27
148
Note that all the above are optional arguments, although if the **VIZ_ADDRESS** is not provided, no information will be sent to the webserver.
28
149
@@ -54,8 +175,28 @@ Additional AD Variables
54
175
- **-ad_algorithm** : This sets the AD algorithm to use for online analysis: "sstd" or "hbos". Default value is "hbos".
55
176
- **-hbos_threshold** : This sets the threshold to control density of detected anomalies used by HBOS algorithm. Its value ranges between 0 and 1. Default value is 0.99
56
177
57
-
Application Variables
58
-
~~~~~~~~~~~~~~~~~~~~~
59
178
60
-
- **${APPLICATION}** : Application executable.
61
-
- **${APPLICATION_ARGS}** : List of arguments specific to the application.
179
+
Offline Analysis
180
+
~~~~~~~~~~~~~~~~
181
+
182
+
For an offline analysis, the user runs the application on its own, with TAU's ADIOS2 plugin configured to use the **BPFile** engine (**TAU_ADIOS2_ENGINE=BPFile** environment option; `see previous section <./appendix_usage.html#chimbuko-config>`_). Once the run completes, TAU generates a file with a **.bp** extension whose name is chosen according to the user-specified **TAU_ADIOS2_FILENAME** environment option. The user can then copy this file to a location accessible to the Chimbuko application, for example on a local machine.
Once the run is complete, the user should locate the **.bp** file and copy it to a location accessible to Chimbuko.
191
+
192
+
- **${RANKS}** : Number of MPI ranks.
193
+
- **${APPLICATION}** : Path to the application executable.
194
+
- **${APPLICATION_ARGS}** : Input arguments required by the application.
195
+
196
+
On the analysis machine, the provenance database and parameter server should be instantiated as in the previous section. The AD modules must still be spawned under MPI with one AD instance per rank of the original job:
Note that the first argument of **driver**, which specifies the ADIOS2 engine, has been set to **BPFile**, and the process is not run in the background.