sphinx/source/appendix/appendix_usage.rst
Lines changed: 28 additions & 11 deletions
@@ -2,9 +2,17 @@
2
2
Usage
3
3
*********
4
4
5
+
Additional ProvDB Variables
6
+
~~~~~~~~~~~~~~~~~~~~~~~~~~~
7
+
8
+
- **-db_write_dir** : Specifies the path on disk to which the provenance database is written.
9
+
- **-engine** : This is the OFI libfabric provider used for the Mochi stack. Its value can be set to "ofi+tcp;ofi_rxm".
10
+
5
11
Visualization Variables
6
12
~~~~~~~~~~~~~~~~~~~~~~~
7
13
14
+
- **${provdb_writedir}** : The directory in which the provenance database is stored.
15
+
- **${provdb_nshards}** : Number of shards used between provenance database and visualization module.
8
16
- **${VIZ_PORT}** : The port to assign to the visualization module
9
17
- **${VIZ_DATA_DIR}**: A directory for storing logs and temporary data (assumed to exist)
10
18
- **${VIZ_INSTALL_DIR}**: The directory where the visualization module is installed
@@ -14,22 +22,25 @@ Parameter Server Variables
14
22
15
23
- **PSERVER_NT** : The number of threads used to handle incoming communications from the AD modules
16
24
- **PSERVER_LOGDIR** : A directory for logging output
17
-
- **VIZ_ADDRESS** : Address of the visualization module (see above).
18
-
- **PROVDB_ADDR**: The address of the provenance database (see above). This option enables the storing of the final globally-aggregated function profile information into the provenance database.
19
25
- **PSERVER_ALG** : Set AD algorithm to use for online analysis: "sstd" or "hbos". Default value is "hbos".
20
26
21
27
Note that all the above are optional arguments, although if the **VIZ_ADDRESS** is not provided, no information will be sent to the webserver.
22
28
29
+
Additional pserver Variables
30
+
~~~~~~~~~~~~~~~~~~~~~~~~~~~~
31
+
32
+
- **-ws_addr** : Address of the visualization module.
33
+
- **-provdb_addr** : The address of the provenance database (see above). This option enables the storing of the final globally-aggregated function profile information into the provenance database.
34
+
- **-prov_outputpath** : This is the path to the provenance database on disk.
35
+
23
36
AD Variables
24
37
~~~~~~~~~~~~
25
38
26
-
- **RANKS** : The number of MPI ranks that the application will be run on
27
-
- **ADIOS2_ENGINE** : The ADIOS2 communications engine. For online analysis this should be **SST** by default (an alternative, **BP4** is discussed below)
28
-
- **ADIOS2_FILE_DIR** : The directory in which the ADIOS2 file is written (see below)
29
-
- **ADIOS2_FILE_PREFIX** : The ADIOS2 file prefix (see below)
30
-
- **PSERVER_ADDR**: The address of the parameter server from above.
31
-
- **PROVDB_ADDR**: The address of the provenance database from above.
32
-
- **NSHARDS**: The number of provenance database shards
39
+
- **${ADIOS2_ENGINE}** : The ADIOS2 communications engine. For online analysis this should be **SST** by default (an alternative, **BP4** is discussed below)
40
+
- **${ADIOS2_PATH}** : The directory in which the ADIOS2 file is written (see below)
41
+
- **${ADIOS2_FILE_PREFIX}** : The ADIOS2 file prefix.
42
+
- **${EXE_NAME}** : Name of the application executable (see examples).
43
+
- **${ad_opts}** : The collection of all other `arguments <./appendix_usage.html#additional-ad-variables>`_ required by the AD module for its instantiation.
33
44
34
45
Additional AD Variables
35
46
~~~~~~~~~~~~~~~~~~~~~~~
@@ -40,5 +51,11 @@ Additional AD Variables
40
51
- **-program_idx** : For workflows with multiple component programs, a "program index" must be supplied to the AD instances attached to those processes.
41
52
- **-rank** : By default the data rank assigned to an AD instance is taken from its MPI rank in MPI_COMM_WORLD. This rank is used to verify the incoming trace data. This option allows the user to manually set the rank index.
42
53
- **-override_rank** : This option disables the data rank verification and instead overwrites the data rank of the incoming trace data with the data rank stored in the AD instance. The value supplied must be the original data rank (this is used to generate the correct trace filename).
43
-
- **-ad_algorithm** : This is an option which sets AD algorithm to use for online analysis: "sstd" or "hbos". Default value is "hbos".
44
-
- **-hbos_threshold** : This is the threshold to control density of detected anomalies used by HBOS algorithm. Its value ranges between 0 and 1. Default value is 0.99
54
+
- **-ad_algorithm** : This sets the AD algorithm to use for online analysis: "sstd" or "hbos". Default value is "hbos".
55
+
- **-hbos_threshold** : This sets the threshold that controls the density of detected anomalies in the HBOS algorithm. Its value ranges between 0 and 1. Default value is 0.99.
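As an illustration of how the options above might be assembled into the **${ad_opts}** string, here is a minimal Python sketch. The helper name ``build_ad_opts`` is hypothetical and not part of Chimbuko; only the flag names, permitted values and defaults are taken from the list above. Treating 0 and 1 as valid endpoints, and omitting **-hbos_threshold** when "sstd" is selected, are assumptions.

```python
from typing import List, Optional


def build_ad_opts(ad_algorithm: str = "hbos",
                  hbos_threshold: float = 0.99,
                  program_idx: Optional[int] = None) -> List[str]:
    """Hypothetical helper: assemble a few AD options as an argument list."""
    # The documentation lists exactly two algorithms.
    if ad_algorithm not in ("sstd", "hbos"):
        raise ValueError("ad_algorithm must be 'sstd' or 'hbos'")
    # Assumption: the documented range 0..1 includes both endpoints.
    if not 0.0 <= hbos_threshold <= 1.0:
        raise ValueError("hbos_threshold must lie between 0 and 1")

    opts = ["-ad_algorithm", ad_algorithm]
    # Assumption: the threshold is only passed when HBOS is selected.
    if ad_algorithm == "hbos":
        opts += ["-hbos_threshold", str(hbos_threshold)]
    # Only needed for workflows with multiple component programs.
    if program_idx is not None:
        opts += ["-program_idx", str(program_idx)]
    return opts


if __name__ == "__main__":
    # Join into a single string suitable for use as ${ad_opts}.
    print(" ".join(build_ad_opts(program_idx=0)))
```

Validating the values before launch is a design choice of this sketch, intended to catch a typo in the algorithm name before the job is submitted.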
56
+
57
+
Application Variables
58
+
~~~~~~~~~~~~~~~~~~~~~
59
+
60
+
- **${APPLICATION}** : Application executable.
61
+
- **${APPLICATION_ARGS}** : List of arguments specific to the application.
For performance reasons the provenance database is **sharded** and the writing is threaded, allowing parallel writes to different shards. By default there is only a single shard and thread, but for larger jobs the user should specify more shards and threads using the **-nshards ${NSHARDS}** and **-nthreads ${NTHREADS}** options respectively. The number of shards must also be communicated to the AD module below. The optimal number of shards and threads will depend on the system characteristics; however, the number of threads should be at least as large as the number of shards. We recommend that the user run our provided benchmark application and increase the number of each to minimize the round-trip latency.
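The relationship between shards and threads can be sketched in Python as follows. The helper ``provdb_shard_args`` is hypothetical; only the **-nshards** and **-nthreads** flag names and the single-shard default come from the text. Defaulting the thread count to the shard count, and treating the "at least as large" recommendation as a hard check, are assumptions of this sketch.

```python
from typing import List, Optional


def provdb_shard_args(nshards: int = 1,
                      nthreads: Optional[int] = None) -> List[str]:
    """Hypothetical helper: build the shard/thread arguments for provdb_admin."""
    if nshards < 1:
        raise ValueError("nshards must be at least 1")
    # Assumption: when unspecified, use one thread per shard.
    if nthreads is None:
        nthreads = nshards
    # The text recommends at least as many threads as shards;
    # this sketch enforces that recommendation strictly.
    if nthreads < nshards:
        raise ValueError("nthreads should be at least as large as nshards")
    return ["-nshards", str(nshards), "-nthreads", str(nthreads)]


if __name__ == "__main__":
    print(" ".join(provdb_shard_args(nshards=4, nthreads=8)))
```

The same ``nshards`` value would then be reused when configuring the AD module, since the text requires both sides to agree on the number of shards.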
44
44
45
+
**${provdb_extra_args}** is a variable that is used to provide any additional arguments to provdb_admin. A list of such additional variables can be found `here <../appendix/appendix_usage.html#additional-provdb-variables>`_.
46
+
45
47
The database will be written to disk into the directory from which the **provdb_admin** application is called, under the filename **provdb.${SHARD}.unqlite** where **${SHARD}** is the index of the database shard.
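For example, the expected shard files for a run could be enumerated with a short Python sketch such as the one below. The function name ``shard_files`` is hypothetical, and numbering the shards from 0 is an assumption; the text only states that **${SHARD}** is the shard index.

```python
from pathlib import Path
from typing import List


def shard_files(run_dir: str, nshards: int) -> List[Path]:
    """Hypothetical helper: list the expected provdb.${SHARD}.unqlite files."""
    # Assumption: shard indices run from 0 to nshards - 1.
    return [Path(run_dir) / f"provdb.{shard}.unqlite" for shard in range(nshards)]


if __name__ == "__main__":
    # Report which of the expected shard files exist in the current directory.
    for path in shard_files(".", 4):
        print(path, "found" if path.is_file() else "missing")
```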
46
48
47
49
For use below we define the variable **PROVDB_ADDR=tcp://${HEAD_NODE_IP}:${PROVDB_PORT}**. For convenience, the **provdb_admin** application will write out a file **provider.address**, the contents of which can be used in place of manually defining this variable.
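A minimal Python sketch of the two ways to obtain the address is given below. The function ``provdb_addr`` is hypothetical; it assumes that **provider.address** contains the address as plain text, and otherwise falls back to the **tcp://${HEAD_NODE_IP}:${PROVDB_PORT}** form given above.

```python
from pathlib import Path


def provdb_addr(head_node_ip: str, provdb_port: int,
                address_file: str = "provider.address") -> str:
    """Hypothetical helper: resolve the value of PROVDB_ADDR."""
    path = Path(address_file)
    # Prefer the file written by provdb_admin, if it is present.
    # Assumption: the file holds the address as plain text.
    if path.is_file():
        return path.read_text().strip()
    # Otherwise construct the address manually, as described in the text.
    return f"tcp://{head_node_ip}:{provdb_port}"


if __name__ == "__main__":
    # The IP address and port here are placeholders.
    print(provdb_addr("10.0.0.1", 5000))
```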
48
50
49
51
----------------------------------
50
52
51
-
The second step is to instantiate the visualization module:
53
+
The second step is to instantiate the visualization module.
Description of the variables can be found `here <../appendix/appendix_usage.html#parameter-server-variables>`_.
101
+
Description of the variables can be found `here <../appendix/appendix_usage.html#parameter-server-variables>`_. **${ps_extra_args}** can be used to provide additional arguments to the pserver command, as described `here <../appendix/appendix_usage.html#additional-pserver-variables>`_.
100
102
101
103
The parameter server opens communications on TCP port 5559. For use below we define the variable **PSERVER_ADDR=${HEAD_NODE_IP}:5559**.
102
104
103
105
----------------------------------
104
106
105
-
The fourth step is to instantiate the AD modules:
107
+
The provenance database, visualization module and parameter server are launched using the following **jsrun** command:
**${SERVICES}** is the path to a script that contains the commands from the first, second and third steps described above. This command should successfully launch the provenance database, the visualization module, and the parameter server.
115
+
116
+
----------------------------------
117
+
118
+
Next, the AD module can be instantiated using the **jsrun** command as follows:
Description of the variables can be found `here <../appendix/appendix_usage.html#ad-variables>`_.
113
126
@@ -117,7 +130,7 @@ The **ADIOS2_ENGINE** can be chosen as either **SST** or **BP4**. The former use
117
130
118
131
In the above we have assumed that the provenance database is being used. However, if this component is not in use, the AD will automatically output the provenance data as JSON documents "${STEP}.anomalies.json", "${STEP}.normalexecs.json" and "${STEP}.metadata.json" placed in the directory "${PROV_DIR}/${PROGRAM_IDX}/${RANK}", where STEP is the I/O step; PROGRAM_IDX is the program index; RANK is the rank of the AD instance; and PROV_DIR is set by default to the working directory but can be specified manually using the optional argument -prov_outputpath (cf. below).
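The directory layout described above can be sketched in Python as follows. The helper ``provenance_json_paths`` is hypothetical and only reproduces the path pattern stated in the text; it does not read or write any data.

```python
from pathlib import Path
from typing import Dict


def provenance_json_paths(step: int, program_idx: int, rank: int,
                          prov_dir: str = ".") -> Dict[str, Path]:
    """Hypothetical helper: locate the JSON documents for one I/O step."""
    # Layout from the text: ${PROV_DIR}/${PROGRAM_IDX}/${RANK}/${STEP}.<kind>.json
    base = Path(prov_dir) / str(program_idx) / str(rank)
    kinds = ("anomalies", "normalexecs", "metadata")
    return {kind: base / f"{step}.{kind}.json" for kind in kinds}


if __name__ == "__main__":
    # Example: step 3 of program 0, rank 7, with a placeholder PROV_DIR.
    for kind, path in provenance_json_paths(3, 0, 7, "/tmp/prov").items():
        print(kind, path)
```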
119
132
120
-
The AD module has a number of additional options that can be used to tune its behavior. The full list can be obtained by running **driver** without any arguments. However a few useful options are described `here <../appendix/appendix_usage.html#additional-ad-variables>`_.
133
+
The AD module has a number of additional options that can be used to tune its behavior. The full list can be obtained by running **driver** without any arguments. However a few useful options are described `here <../appendix/appendix_usage.html#additional-ad-variables>`_. These are part of the **${ad_opts}** in the above command.
121
134
122
135
For debug purposes, the AD module can be made more verbose by setting the environment variable **CHIMBUKO_VERBOSE=1**.
123
136
@@ -131,9 +144,10 @@ The final step is to instantiate the application
Aside from interacting with the visualization module, once complete the user can also interact directly with the provenance database using the **provdb_query** tool as described below: :ref:`install_usage/run_chimbuko:Interacting with the Provenance Database`.
150
+
Description of variables is provided `here <../appendix/appendix_usage.html#application-variables>`_.