Commit 1768ee1

Update READMEs: PageRank params and Vermeer configs
Clarify algorithm parameters and configuration guidance across computer/README.md and vermeer/README.md.

In computer/README.md, PageRank options were renamed and documented (page_rank.alpha, bsp.max_superstep, pagerank.l1DiffThreshold), and a pointer to the full PageRank implementation was added to avoid confusion from the simplified example.

In vermeer/README.md, the example Docker volume mounts now recommend a dedicated config directory (~/vermeer-config) and include a security note about avoiding mounting the whole home directory. The master.ini/worker.ini sample blocks were reworked to use revised keys (http_peer, grpc_peer, master_peer, run_mode, task_parallel_num, etc.), and a note clarifies that HugeGraph connection details are supplied via the graph load API. Additional notes direct readers to the real WorkerComputer/MasterComputer interfaces and existing algorithm examples; minor performance-tuning guidance was also adjusted to reflect the new task_parallel_num setting.
1 parent b0fe1ab commit 1768ee1

2 files changed

Lines changed: 49 additions & 45 deletions

computer/README.md

Lines changed: 12 additions & 3 deletions
@@ -139,9 +139,14 @@ k8s.worker.memory=8Gi
 bsp.etcd.url=http://etcd-cluster:2379

 # Algorithm parameters (PageRank example)
-pagerank.damping_factor=0.85
-pagerank.max_iterations=20
-pagerank.convergence_tolerance=0.0001
+# Alpha parameter (1 - damping factor), default: 0.15
+page_rank.alpha=0.85
+
+# Maximum supersteps (iterations), controlled by BSP framework
+bsp.max_superstep=20
+
+# L1 norm difference threshold for convergence, default: 0.00001
+pagerank.l1DiffThreshold=0.0001
 ```

 #### 2. Submit Job
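
The renamed PageRank options in the hunk above map onto a standard power iteration. The Go sketch below is illustrative only and is not HugeGraph-Computer's implementation. It assumes that `alpha` is the teleport probability (1 - damping factor, as the README comment states), that `bsp.max_superstep` caps the number of rounds, and that `l1DiffThreshold` stops the run once the L1 difference between successive rank vectors falls below it. The function name `pageRank` and the adjacency-list input are hypothetical.

```go
package main

import (
	"fmt"
	"math"
)

// pageRank runs synchronous PageRank rounds over an adjacency list
// (out[v] lists the targets of v's outgoing edges).
// alpha is assumed to be the teleport probability, i.e. 1 - damping factor.
// It returns the rank vector and the number of rounds executed.
func pageRank(out [][]int, alpha float64, maxSuperstep int, l1DiffThreshold float64) ([]float64, int) {
	n := len(out)
	rank := make([]float64, n)
	for i := range rank {
		rank[i] = 1.0 / float64(n)
	}
	for step := 1; step <= maxSuperstep; step++ {
		next := make([]float64, n)
		dangling := 0.0 // rank held by vertices without outgoing edges
		for v, targets := range out {
			if len(targets) == 0 {
				dangling += rank[v]
				continue
			}
			share := rank[v] / float64(len(targets))
			for _, t := range targets {
				next[t] += share
			}
		}
		l1 := 0.0
		for v := range next {
			// Dangling rank is spread uniformly so the vector keeps summing to 1.
			next[v] = alpha/float64(n) + (1-alpha)*(next[v]+dangling/float64(n))
			l1 += math.Abs(next[v] - rank[v])
		}
		rank = next
		if l1 < l1DiffThreshold {
			return rank, step // converged before hitting the superstep cap
		}
	}
	return rank, maxSuperstep
}

func main() {
	graph := [][]int{{1, 2}, {2}, {0}, {}}
	ranks, steps := pageRank(graph, 0.15, 20, 0.0001)
	fmt.Printf("stopped after %d supersteps: %.4f\n", steps, ranks)
}
```

Under this reading, `alpha=0.85` corresponds to a damping factor of 0.15, while the stated default of 0.15 corresponds to the conventional damping factor of 0.85.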
@@ -230,6 +235,10 @@ public interface Computation<M extends Value> {

 ### Example: Simple PageRank

+> **NOTE**: This is a simplified example showing the key concepts.
+> For the complete implementation including all required methods (`name()`, `category()`, `init()`, etc.),
+> see: `computer/computer-algorithm/src/main/java/org/apache/hugegraph/computer/algorithm/centrality/pagerank/PageRank.java`
+
 ```java
 package org.apache.hugegraph.computer.algorithm.centrality.pagerank;

vermeer/README.md

Lines changed: 37 additions & 42 deletions
@@ -101,18 +101,20 @@ Pull the image:
 docker pull hugegraph/vermeer:latest
 ```

-Create local configuration files `~/master.ini` and `~/worker.ini` (see [Configuration](#configuration) section).
+Create a dedicated config directory (e.g., `~/vermeer-config/`) with `master.ini` and `worker.ini` files (see [Configuration](#configuration) section).

 Run with Docker:

 ```bash
 # Master node
-docker run -v ~/:/go/bin/config hugegraph/vermeer --env=master
+docker run -v ~/vermeer-config:/go/bin/config hugegraph/vermeer --env=master

 # Worker node
-docker run -v ~/:/go/bin/config hugegraph/vermeer --env=worker
+docker run -v ~/vermeer-config:/go/bin/config hugegraph/vermeer --env=worker
 ```

+> **Security Note**: Only mount directories containing Vermeer configuration files. Avoid mounting your entire home directory to minimize security risks.
+
 #### Docker Compose

 Update `master_peer` in `~/worker.ini` to `172.20.0.10:6689`, then:
@@ -200,52 +202,46 @@ make clean-all # Also remove downloaded tools (supervisord, protoc)
 ### Master Configuration (`master.ini`)

 ```ini
-[master]
-# Master server listen address
-listen_addr = :6688
-
-# Master gRPC server address
-grpc_addr = :6689
+[default]
+# Master HTTP listen address
+http_peer = 0.0.0.0:6688

-# Worker heartbeat timeout (seconds)
-worker_timeout = 30
+# Master gRPC listen address
+grpc_peer = 0.0.0.0:6689

-# Task execution timeout (seconds)
-task_timeout = 3600
+# Master peer address (self-reference for workers)
+master_peer = 127.0.0.1:6689

-[hugegraph]
-# HugeGraph PD address for metadata
-pd_peers = 127.0.0.1:8686
+# Run mode
+run_mode = master

-# HugeGraph HTTP endpoint for result writing
-server = http://127.0.0.1:8080
+# Task scheduling strategy
+task_strategy = 1

-# Graph space name
-graph = hugegraph
+# Number of parallel tasks
+task_parallel_num = 1
 ```

+**Note**: HugeGraph connection details (`pd_peers`, `server`, `graph`) are provided in the graph load API request, not in the configuration file. See [HugeGraph Integration](#hugegraph-integration) section for details.
+
 ### Worker Configuration (`worker.ini`)

 ```ini
-[worker]
-# Worker listen address
-listen_addr = :6789
+[default]
+# Worker HTTP listen address
+http_peer = 0.0.0.0:6788
+
+# Worker gRPC listen address
+grpc_peer = 0.0.0.0:6789

 # Master gRPC address to connect
 master_peer = 127.0.0.1:6689

-# Worker ID (unique)
-worker_id = worker01
-
-# Number of compute threads
-compute_threads = 4
+# Run mode
+run_mode = worker

-# Memory limit (GB)
-memory_limit = 8
-
-[storage]
-# Local disk path for spilling
-data_path = ./data
+# Worker group identifier
+worker_group = default
 ```

 ## Available Algorithms
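
Because both sample files now share a `[default]` section and the same `*_peer` keys, a small pre-flight check can catch a mismatched `run_mode` or a malformed address before a container starts. The Go sketch below is an assumption-laden illustration, not Vermeer's own config loader: `parseINI` and `checkConfig` are hypothetical helpers, the parser handles only `[section]` headers, `key = value` pairs, and `#` comments, and treating `http_peer`, `grpc_peer`, and `master_peer` as required is inferred from the samples above.

```go
package main

import (
	"bufio"
	"fmt"
	"net"
	"strings"
)

// parseINI reads a minimal INI document into section -> key -> value.
func parseINI(text string) (map[string]map[string]string, error) {
	sections := map[string]map[string]string{}
	current := ""
	scanner := bufio.NewScanner(strings.NewReader(text))
	for lineNo := 1; scanner.Scan(); lineNo++ {
		line := strings.TrimSpace(scanner.Text())
		if line == "" || strings.HasPrefix(line, "#") {
			continue // skip blanks and comments
		}
		if strings.HasPrefix(line, "[") && strings.HasSuffix(line, "]") {
			current = strings.TrimSpace(line[1 : len(line)-1])
			if sections[current] == nil {
				sections[current] = map[string]string{}
			}
			continue
		}
		key, value, ok := strings.Cut(line, "=")
		if !ok {
			return nil, fmt.Errorf("line %d: expected key = value", lineNo)
		}
		if sections[current] == nil {
			sections[current] = map[string]string{}
		}
		sections[current][strings.TrimSpace(key)] = strings.TrimSpace(value)
	}
	return sections, scanner.Err()
}

// checkConfig returns a list of problems found in the [default] section.
// The set of required keys is an assumption based on the README samples.
func checkConfig(sections map[string]map[string]string, wantMode string) []string {
	var problems []string
	def, ok := sections["default"]
	if !ok {
		return []string{"missing [default] section"}
	}
	if def["run_mode"] != wantMode {
		problems = append(problems, fmt.Sprintf("run_mode is %q, expected %q", def["run_mode"], wantMode))
	}
	for _, key := range []string{"http_peer", "grpc_peer", "master_peer"} {
		_, port, err := net.SplitHostPort(def[key])
		if err != nil || port == "" {
			problems = append(problems, fmt.Sprintf("%s is not a host:port address: %q", key, def[key]))
		}
	}
	return problems
}

func main() {
	worker := "[default]\nhttp_peer = 0.0.0.0:6788\ngrpc_peer = 0.0.0.0:6789\n" +
		"master_peer = 127.0.0.1:6689\nrun_mode = worker\nworker_group = default\n"
	sections, err := parseINI(worker)
	if err != nil {
		fmt.Println("parse error:", err)
		return
	}
	fmt.Println("problems:", checkConfig(sections, "worker"))
}
```

Returning a list of problems rather than failing on the first one lets a single run report every issue in a file.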
@@ -386,6 +382,9 @@ Load from Hadoop Distributed File System:

 Custom algorithms implement the `Algorithm` interface in `algorithms/algorithms.go`:

+> **NOTE**: The following is a simplified conceptual interface for illustration purposes.
+> For actual algorithm implementation, see the `WorkerComputer` and `MasterComputer` interfaces defined in `apps/compute/api.go`.
+
 ```go
 type Algorithm interface {
     // Initialize the algorithm
@@ -404,6 +403,9 @@ type Algorithm interface {

 ### Example: Simple Degree Count

+> **NOTE**: This is a simplified conceptual example. Actual algorithms must implement the `WorkerComputer` interface.
+> See `vermeer/algorithms/degree.go` for a working example.
+
 ```go
 package algorithms

@@ -482,16 +484,9 @@ tools/protoc/osxm1/protoc *.proto --go-grpc_out=. --go_out=.

 ## Performance Tuning

-### Worker Configuration
-
-- **compute_threads**: Set to number of CPU cores for CPU-bound algorithms
-- **memory_limit**: Set to 70-80% of available RAM
-- **partition_count**: Increase for better parallelism (default: auto-calculated)
-
 ### Master Configuration

-- **worker_timeout**: Increase for slow networks or heavily loaded workers
-- **task_timeout**: Increase for long-running algorithms (e.g., Louvain on large graphs)
+- **task_parallel_num**: Number of parallel tasks (default: 1). Increase for better task scheduling throughput.

 ### Algorithm-Specific

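
The `task_parallel_num` guidance in the hunk above describes a cap on how many tasks run at once. How Vermeer's master enforces that cap is not shown in this commit, so the Go sketch below only illustrates the general technique of bounded parallelism with a buffered channel used as a semaphore. The function `runBounded` is hypothetical and is not part of Vermeer.

```go
package main

import (
	"fmt"
	"sync"
)

// runBounded executes every task, allowing at most `parallel` of them to run
// at the same time. It returns the peak number of tasks observed running
// concurrently, which never exceeds `parallel`.
func runBounded(tasks []func(), parallel int) int {
	if parallel < 1 {
		parallel = 1 // mirror the documented default of one task at a time
	}
	slots := make(chan struct{}, parallel) // one token per allowed task
	var wg sync.WaitGroup
	var mu sync.Mutex
	running, peak := 0, 0

	for _, task := range tasks {
		slots <- struct{}{} // blocks while `parallel` tasks are in flight
		wg.Add(1)
		go func(run func()) {
			defer wg.Done()
			defer func() { <-slots }() // free the slot once the task is done

			mu.Lock()
			running++
			if running > peak {
				peak = running
			}
			mu.Unlock()

			run()

			mu.Lock()
			running--
			mu.Unlock()
		}(task)
	}
	wg.Wait()
	return peak
}

func main() {
	results := make(chan int, 4)
	var tasks []func()
	for i := 0; i < 4; i++ {
		id := i
		tasks = append(tasks, func() { results <- id * id })
	}
	peak := runBounded(tasks, 2)
	fmt.Printf("finished %d tasks, peak concurrency %d\n", len(results), peak)
}
```

With a cap of 1 the tasks run strictly one after another. Raising the cap lets independent tasks overlap, at the cost of more simultaneous resource use.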