Commit d0c41d8

sandeepmittal committed
documentation edit: Chimbuko Installation + Appendix
1 parent 77ce3dd commit d0c41d8

2 files changed

Lines changed: 185 additions & 160 deletions

sphinx/source/appendix/appendix_usage.rst

Lines changed: 148 additions & 7 deletions
@@ -2,9 +2,128 @@
22
Usage
33
*********
44

5+
Chimbuko Config
6+
~~~~~~~~~~~~~~~
7+
8+
Options for visualization module:
9+
10+
- **viz_root** : Path to the visualization module.
11+
- **viz_worker_port** : The port on which to run the redis server for the visualization backend.
12+
- **viz_port** : The port on which to run the webserver.
13+
14+
General options for Chimbuko backend:
15+
16+
- **backend_root** : The root install directory of the PerformanceAnalysis libraries. If set to "infer", it will be inferred from the path of the executables.
17+
18+
Options for the provenance database:
19+
20+
- **provdb_nshards** : Number of database shards
21+
- **provdb_engine** : The OFI libfabric provider used for the Mochi stack.
22+
- **provdb_port** : The port of the provenance database
23+
- **provdb_nthreads** : Number of worker threads; should be >= the number of shards
24+
25+
Options for the parameter server:
26+
27+
- **ad_win_size** : Number of events around an anomaly to store; provDB entry size is proportional to this
28+
- **ad_alg** : Anomaly detection (AD) algorithm to use: "sstd" or "hbos".
29+
- **ad_outlier_sstd_sigma** : Number of standard deviations that defines an outlier for the "sstd" algorithm.
30+
- **ad_outlier_hbos_threshold** : The percentile threshold beyond which events are considered anomalies by the HBOS algorithm.
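
The options listed so far are plain shell variables set in the Chimbuko config script. A minimal sketch of such a fragment is shown below. The engine string "ofi+tcp;ofi_rxm" and the "hbos" algorithm with a 0.99 threshold appear elsewhere in this appendix; every path, port and count is a placeholder chosen for illustration and is not a recommended value.

.. code:: bash

    # Visualization module (path and ports are placeholders)
    viz_root=/path/to/chimbuko/viz
    viz_worker_port=6379
    viz_port=5002

    # Backend: "infer" derives the install root from the executable path
    backend_root="infer"

    # Provenance database (shard, port and thread counts are placeholders)
    provdb_nshards=4
    provdb_engine="ofi+tcp;ofi_rxm"
    provdb_port=5000
    provdb_nthreads=4      # should be >= provdb_nshards

    # Anomaly detection settings (window size and sigma are placeholders)
    ad_win_size=5
    ad_alg="hbos"
    ad_outlier_sstd_sigma=6
    ad_outlier_hbos_threshold=0.99

Only the threshold matching the selected **ad_alg** is expected to take effect; this is an assumption based on the option descriptions above.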
31+
32+
Options for TAU:
33+
34+
- **export TAU_ADIOS2_ENGINE=${value}** : Online communication engine. SST is recommended; BP4 is an alternative, although it goes through the disk system and may be slower unless the BP files are stored on a burst disk.
35+
- **export TAU_ADIOS2_ONE_FILE=FALSE** : Use a different connection file for each rank.
36+
- **export TAU_ADIOS2_PERIODIC=1** : Enable/disable ADIOS2 periodic output.
37+
- **export TAU_ADIOS2_PERIOD=1000000** : Period in microseconds between ADIOS2 I/O steps.
38+
- **export TAU_THREAD_PER_GPU_STREAM=1** : Force GPU streams to appear as different TAU virtual threads.
39+
- **export TAU_THROTTLE=1** : Enable/disable throttling of short-running functions.
40+
- **TAU_ADIOS2_PATH** : Path where the ADIOS2 files are to be stored. Chimbuko services create the directory chimbuko/adios2 in the working directory, and this should be used by default.
41+
- **TAU_ADIOS2_FILE_PREFIX** : The prefix of the TAU ADIOS2 files; the full filename is ${TAU_ADIOS2_FILE_PREFIX}-${EXE_NAME}-${RANK}.bp
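
As an illustration, the TAU settings above could be collected in one environment fragment as sketched below. The prefix "tau-metrics", the executable name "my_app" and the rank value are hypothetical. The last lines only show how the resulting file name is assembled, assuming the prefix in the file name pattern is the value of **TAU_ADIOS2_FILE_PREFIX**.

.. code:: bash

    export TAU_ADIOS2_ENGINE=SST
    export TAU_ADIOS2_ONE_FILE=FALSE
    export TAU_ADIOS2_PERIODIC=1
    export TAU_ADIOS2_PERIOD=1000000          # microseconds between I/O steps
    export TAU_THREAD_PER_GPU_STREAM=1
    export TAU_THROTTLE=1
    export TAU_ADIOS2_PATH="$(pwd)/chimbuko/adios2"
    export TAU_ADIOS2_FILE_PREFIX=tau-metrics # hypothetical prefix

    # Illustration only: expected file name for one rank of a hypothetical executable
    EXE_NAME=my_app
    RANK=0
    tau_file="${TAU_ADIOS2_PATH}/${TAU_ADIOS2_FILE_PREFIX}-${EXE_NAME}-${RANK}.bp"
    echo "${tau_file}"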
42+
43+
Launch Services
44+
~~~~~~~~~~~~~~~
45+
46+
The script that runs the Chimbuko head node services proceeds as follows.
47+
First, the script sources the Chimbuko config script, whose variables are defined in the `previous section <./appendix_usage.html#chimbuko-config>`_.
48+
49+
Next, it instantiates the provenance database using the following command:
50+
51+
.. code:: bash
52+
53+
provdb_admin "${provdb_addr}" -engine ${provdb_engine} -nshards ${provdb_nshards} -nthreads ${provdb_nthreads} -db_write_dir ${provdb_writedir}
54+
55+
where **${provdb_addr}** is the address of the provenance database and the other variables are defined `here <../appendix/appendix_usage.html#additional-provdb-variables>`_.
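
The config options state that **provdb_nthreads** should be at least **provdb_nshards**. A launch script could guard the **provdb_admin** call with a check along the following lines. The helper name ``check_provdb_threads`` is hypothetical and not part of Chimbuko; this is only a sketch of one way to enforce the constraint.

.. code:: bash

    # Return non-zero if the thread count is smaller than the shard count.
    # Usage: check_provdb_threads <nshards> <nthreads>
    check_provdb_threads() {
        local nshards="$1" nthreads="$2"
        if [ "${nthreads}" -lt "${nshards}" ]; then
            echo "provdb_nthreads (${nthreads}) should be >= provdb_nshards (${nshards})" >&2
            return 1
        fi
        return 0
    }

It would be called as ``check_provdb_threads "${provdb_nshards}" "${provdb_nthreads}" || exit 1`` before starting the database.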
56+
57+
Next, the following commands instantiate the visualization module:
58+
59+
.. code:: bash
60+
61+
export SHARDED_NUM=${provdb_nshards}
62+
export PROVDB_ADDR=${prov_add}
63+
64+
export SERVER_CONFIG="production"
65+
export DATABASE_URL="sqlite:///${viz_dir}/main.sqlite"
66+
export ANOMALY_STATS_URL="sqlite:///${viz_dir}/anomaly_stats.sqlite"
67+
export ANOMALY_DATA_URL="sqlite:///${viz_dir}/anomaly_data.sqlite"
68+
export FUNC_STATS_URL="sqlite:///${viz_dir}/func_stats.sqlite"
69+
export PROVENANCE_DB=${provdb_writedir}
70+
export CELERY_BROKER_URL="redis://${HOST}:${viz_worker_port}"
71+
72+
#Setup redis
73+
cp -r $viz_root/redis-stable/redis.conf .
74+
sed -i "s|^dir ./|dir ${viz_dir}/|" redis.conf
75+
sed -i "s|^bind 127.0.0.1|bind 0.0.0.0|" redis.conf
76+
sed -i "s|^daemonize no|daemonize yes|" redis.conf
77+
sed -i "s|^pidfile /var/run/redis_6379.pid|pidfile ${viz_dir}/redis.pid|" redis.conf
78+
sed -i "s|^logfile "\"\""|logfile ${log_dir}/redis.log|" redis.conf
79+
sed -i "s|.*syslog-enabled no|syslog-enabled yes|" redis.conf
80+
81+
echo "==========================================="
82+
echo "Chimbuko Services: Launch Chimbuko visualization server"
83+
echo "==========================================="
84+
cd ${viz_root}
85+
86+
echo "Chimbuko Services: create db ..."
87+
python3 manager.py createdb
88+
89+
echo "Chimbuko Services: run redis ..."
90+
redis-server ${viz_dir}/redis.conf
91+
sleep 5
92+
93+
echo "Chimbuko Services: run celery ..."
94+
CELERY_ARGS="--loglevel=info --concurrency=1"
95+
python3 manager.py celery ${CELERY_ARGS} 2>&1 | tee "${log_dir}/celery.log" &
96+
sleep 10
97+
98+
echo "Chimbuko Services: run webserver ..."
99+
python3 run_server.py $HOST $viz_port 2>&1 | tee "${log_dir}/webserver.log" &
100+
sleep 2
101+
102+
echo "Chimbuko Services: redis ping-pong ..."
103+
redis-cli -h $HOST -p ${viz_worker_port} ping
104+
105+
cd ${base}
106+
107+
ws_addr="http://${HOST}:${viz_port}/api/anomalydata"
108+
ps_extra_args+=" -ws_addr ${ws_addr}"
109+
110+
echo $HOST > ${var_dir}/chimbuko_webserver.host
111+
echo $viz_port > ${var_dir}/chimbuko_webserver.port
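
The last two lines of the script record the webserver host and port in **${var_dir}**. Other scripts can rebuild the same address from those files, as sketched below. The function name ``read_ws_addr`` is hypothetical; the file names and the ``/api/anomalydata`` path are taken from the script above.

.. code:: bash

    # Rebuild the webserver address from the files written by the services script.
    # Usage: read_ws_addr <var_dir>
    read_ws_addr() {
        local var_dir="$1" host port
        host=$(cat "${var_dir}/chimbuko_webserver.host")
        port=$(cat "${var_dir}/chimbuko_webserver.port")
        echo "http://${host}:${port}/api/anomalydata"
    }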
112+
113+
114+
After the visualization module (its variables are described `here <./appendix_usage.html#visualization-variables>`_) is successfully instantiated, the parameter server is launched as part of the Chimbuko services:
115+
116+
.. code:: bash
117+
118+
pserver -ad ${pserver_alg} -nt ${pserver_nt} -logdir ${log_dir} -port ${pserver_port} ${ps_extra_args}
119+
120+
The command line variables passed to the **pserver** command are described `here <../appendix/appendix_usage.html#parameter-server-variables>`_.
121+
5122
Additional ProvDB Variables
6123
~~~~~~~~~~~~~~~~~~~~~~~~~~~
7124

125+
- **-nthreads** : Number of threads used by the provenance database.
126+
- **-nshards** : Number of shards used by the provenance database.
8127
- **-db_write_dir** : The path to which the provenance database is written on disk.
9128
- **-engine** : This is the OFI libfabric provider used for the Mochi stack. Its value can be set to "ofi+tcp;ofi_rxm".
10129

@@ -20,9 +139,11 @@ Visualization Variables
20139
Parameter Server Variables
21140
~~~~~~~~~~~~~~~~~~~~~~~~~~
22141

23-
- **PSERVER_NT** : The number of threads used to handle incoming communications from the AD modules
24-
- **PSERVER_LOGDIR** : A directory for logging output
25-
- **PSERVER_ALG** : Set AD algorithm to use for online analysis: "sstd" or "hbos". Default value is "hbos".
142+
- **-port ${pserver_port}** : The port used by the parameter server.
143+
- **-nt ${pserver_nt}** : The number of threads used to handle incoming communications from the AD modules
144+
- **-logdir ${log_dir}** : A directory for logging output
145+
- **-ad ${pserver_alg}** : Set AD algorithm to use for online analysis: "sstd" or "hbos". Default value is "hbos".
146+
- **${ps_extra_args}** : Extra arguments passed to the parameter server.
26147

27148
Note that all the above are optional arguments, although if the **VIZ_ADDRESS** is not provided, no information will be sent to the webserver.
28149

@@ -54,8 +175,28 @@ Additional AD Variables
54175
- **-ad_algorithm** : This sets the AD algorithm to use for online analysis: "sstd" or "hbos". Default value is "hbos".
55176
- **-hbos_threshold** : This sets the threshold to control density of detected anomalies used by HBOS algorithm. Its value ranges between 0 and 1. Default value is 0.99
56177

57-
Application Variables
58-
~~~~~~~~~~~~~~~~~~~~~
59178

60-
- **${APPLICATION}** : Application executable.
61-
- **${APPLICATION_ARGS}** : List of arguments specific to the application.
179+
Offline Analysis
180+
~~~~~~~~~~~~~~~~
181+
182+
For an offline analysis, the user runs the application on its own, with TAU's ADIOS2 plugin configured to use the **BPFile** engine (**TAU_ADIOS2_ENGINE=BPFile** environment option; `see previous section <./appendix_usage.html#chimbuko-config>`_). Once complete, TAU will generate a file with a **.bp** extension and a filename chosen according to the user-specified **TAU_ADIOS2_FILENAME** environment option. The user can then copy this file to a location accessible to the Chimbuko application, for example on a local machine.
183+
184+
The first step is to run the application:
185+
186+
.. code:: bash
187+
188+
mpirun -n ${RANKS} ${APPLICATION} ${APPLICATION_ARGS}
189+
190+
Once complete, the user should locate the **.bp** file and copy it to a location accessible to Chimbuko.
191+
192+
- **${RANKS}** : Number of MPI ranks.
193+
- **${APPLICATION}** : Path to the application executable.
194+
- **${APPLICATION_ARGS}** : Input arguments required by the application.
195+
196+
On the analysis machine, the provenance database and parameter server should be instantiated as in the previous section. The AD modules must still be spawned under MPI with one AD instance per rank of the original job:
197+
198+
.. code:: bash
199+
200+
mpirun -n ${RANKS} driver BPFile ${ADIOS2_FILE_DIR} ${ADIOS2_FILE_PREFIX} ${OUTPUT_LOC} -pserver_addr ${PSERVER_ADDR} -provdb_addr ${PROVDB_ADDR} ...
201+
202+
Note that the first argument of **driver**, which specifies the ADIOS2 engine, has been set to **BPFile**, and the process is not run in the background.
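
Because one AD instance is needed per rank of the original job, it can help to confirm that the number of **.bp** files matches **${RANKS}** before launching the driver. The sketch below assumes one file per rank, named as described in the Chimbuko Config section. The helper ``count_bp_files`` is hypothetical and not part of Chimbuko.

.. code:: bash

    # Count entries named <prefix>-*.bp in a directory.
    # Usage: count_bp_files <dir> <prefix>
    count_bp_files() {
        local dir="$1" prefix="$2" n=0 f
        for f in "${dir}/${prefix}"-*.bp; do
            # An unmatched glob stays literal, so test that the entry exists
            if [ -e "${f}" ]; then
                n=$((n + 1))
            fi
        done
        echo "${n}"
    }

For example, the result of ``count_bp_files "${ADIOS2_FILE_DIR}" "${ADIOS2_FILE_PREFIX}"`` can be compared against **${RANKS}**, and the run aborted or a warning printed if they differ.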
