
feat: add estimate-size subcommand for network size estimation #88

Closed
jacderida wants to merge 1 commit into WithAutonomi:main from jacderida:feat-network_size_estimate

Collaborator

@jacderida jacderida commented May 6, 2026

Summary

Adds a non-breaking ant-node estimate-size subcommand that estimates the size of the live network.

  • Bootstraps a saorsa-core P2PNode in NodeMode::Client (no listen socket, no DHT routing participation), runs N random-key iterative FIND_NODE lookups, and infers the network size from the XOR distance to the k-th closest peer in each lookup. Standard Kademlia density estimator: N̂ = k · 2^256 / d_k. Per-sample values are aggregated into a mean, median, and 95% confidence interval.
  • Reuses the existing bootstrap-resolution cascade (CLI/env → config file → auto-discovered bootstrap_peers.toml), so flags users already know carry over.
  • clap's Option<Subcommand> pattern means invocations without a subcommand continue to launch a node exactly as before. No behavior change for existing callers.
  • Progress is reported to stderr (connecting, bootstrap time, routing-table size, per-sample status, sampling completion) so an operator can see whether the command is working or stuck. Final result goes to stdout.
  • Default per-lookup timeout is 90s; saorsa-core's iterative lookup can take this long when a dead peer's dial cascade drags out an early iteration.
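
The density estimator from the first bullet can be sketched as follows. This is a minimal illustration, not the code in src/estimator.rs: it assumes peer IDs are raw 32-byte arrays already sorted closest-first, and the names `xor_distance`, `distance_to_f64`, and `density_estimate` are hypothetical. The 256-bit distance is folded into an `f64`, which loses low-order bits but is well within range for a rough estimate.

```rust
/// Bytewise XOR distance between two 256-bit IDs (big-endian).
fn xor_distance(a: &[u8; 32], b: &[u8; 32]) -> [u8; 32] {
    std::array::from_fn(|i| a[i] ^ b[i])
}

/// Interpret a big-endian 256-bit distance as an f64.
/// Low-order bits are lost, which is acceptable for an estimate.
fn distance_to_f64(d: &[u8; 32]) -> f64 {
    d.iter().fold(0.0_f64, |acc, &byte| acc * 256.0 + f64::from(byte))
}

/// Per-sample estimate: N_hat = k * 2^256 / d_k, where d_k is the XOR
/// distance from the target to the k-th closest peer.
/// Assumes `sorted_peer_ids` is ordered closest-first (hypothetical input shape).
fn density_estimate(target: &[u8; 32], sorted_peer_ids: &[[u8; 32]], k: usize) -> Option<f64> {
    // Guard k == 0 so that `k - 1` below cannot underflow.
    if k == 0 || sorted_peer_ids.len() < k {
        return None;
    }
    let kth = &sorted_peer_ids[k - 1];
    let d_k = distance_to_f64(&xor_distance(target, kth));
    if d_k == 0.0 {
        // The k-th peer coincides with the target; no usable estimate.
        return None;
    }
    Some(k as f64 * 2.0_f64.powi(256) / d_k)
}
```

For example, with k = 20 and a k-th peer at distance 2^248 from the target, the estimate is 20 · 2^256 / 2^248 = 5120 nodes.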

Why

Operators currently have no way to gauge how many nodes are participating in the live network — ant-node doesn't emit node-count metrics and there's no separate crawler. A rough estimate is useful for capacity planning, anomaly detection, and release sanity checks.

The subcommand approach (rather than a separate binary or new crate) was chosen because it lets the estimator share ant-node's bootstrap configuration, bootstrap_peers.toml discovery, and dependency on saorsa-core, with no user-visible cost to the node binary when the subcommand is not invoked.

Test plan

  • cargo fmt --all clean
  • cargo clippy --all-features -- -D clippy::panic -D clippy::unwrap_used -D clippy::expect_used clean
  • cargo test --lib estimator — 8/8 unit tests passing (sample density formula, aggregator edge cases, XOR distance, leading-u64 extraction, CI clamping)
  • cargo build clean
  • ant-node --help shows existing top-level flags plus the new subcommand (non-breaking)
  • ant-node estimate-size --help shows estimator-specific flags
  • Live testnet smoke run produced consistent estimates across successful samples (mean ~3119, median ~3035, 95% CI ±10%)
  • Reviewer to verify a fresh smoke run of ant-node estimate-size against the live network completes without timeouts at the new 90s default

🤖 Generated with Claude Code

Copilot AI review requested due to automatic review settings May 6, 2026 20:08
@jacderida jacderida force-pushed the feat-network_size_estimate branch from c3f5d28 to ab5a3a3 Compare May 6, 2026 20:12

Copilot AI left a comment

Pull request overview

Note

Copilot was unable to run its full agentic suite in this review.

Adds a new ant-node estimate-size subcommand to estimate live network size by bootstrapping a client-mode P2PNode, running multiple random-key FIND_NODE lookups, and aggregating per-sample Kademlia density estimates.

Changes:

  • Introduces a new estimator module with the sampling/aggregation logic and unit tests.
  • Extends the ant-node CLI with an estimate-size subcommand and routes execution in main.rs.
  • Exposes the new module from src/lib.rs.

Reviewed changes

Copilot reviewed 4 out of 4 changed files in this pull request and generated 4 comments.

| File | Description |
| --- | --- |
| src/lib.rs | Exports the new estimator module from the crate. |
| src/estimator.rs | Implements client-mode bootstrap, lookup sampling, and estimate aggregation (+unit tests). |
| src/bin/ant-node/main.rs | Dispatches between “run node” vs “estimate-size” subcommand execution paths. |
| src/bin/ant-node/cli.rs | Adds Command::EstimateSize and associated CLI flags/help text. |

Comment thread src/estimator.rs
Comment thread src/estimator.rs
Comment thread src/estimator.rs Outdated
Comment thread src/bin/ant-node/cli.rs Outdated
@jacderida jacderida force-pushed the feat-network_size_estimate branch from ab5a3a3 to 91b17ed Compare May 6, 2026 20:19
Operators have no easy way to gauge how many nodes are participating in the
live Autonomi network. ant-node does not emit node-count metrics, and there
is no purpose-built crawler — yet a rough size estimate is useful for
capacity planning, anomaly detection, and release sanity checks.

Add a non-breaking `ant-node estimate-size` subcommand that bootstraps a
saorsa-core P2PNode in NodeMode::Client (no listen socket, no DHT routing
participation), runs many random-key iterative FIND_NODE lookups, and
infers the network size from the XOR distance to the k-th closest peer in
each lookup (standard Kademlia density estimator: N̂ = k · 2^256 / d_k).
Per-sample estimates are aggregated into a mean, median, and 95% confidence
interval.
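
One way the aggregation step might look is sketched below. The PR does not show how its 95% interval is computed, so this assumes a normal approximation (mean ± 1.96 standard errors) with the lower bound clamped at zero; the `Summary` type and `summarize` function are hypothetical names.

```rust
/// Aggregated view of the per-sample estimates (hypothetical type).
#[derive(Debug, PartialEq)]
struct Summary {
    mean: f64,
    median: f64,
    ci_low: f64,
    ci_high: f64,
}

/// Aggregate per-sample estimates into mean, median, and an approximate
/// 95% confidence interval for the mean (normal approximation, assumed here).
fn summarize(samples: &[f64]) -> Option<Summary> {
    if samples.is_empty() {
        return None;
    }
    let n = samples.len() as f64;
    let mean = samples.iter().sum::<f64>() / n;

    // Median of a sorted copy; average the two middle values for even counts.
    let mut sorted = samples.to_vec();
    sorted.sort_by(|a, b| a.total_cmp(b));
    let mid = sorted.len() / 2;
    let median = if sorted.len() % 2 == 0 {
        (sorted[mid - 1] + sorted[mid]) / 2.0
    } else {
        sorted[mid]
    };

    // Half-width = 1.96 * standard error; undefined for one sample, so use 0.
    let half_width = if samples.len() < 2 {
        0.0
    } else {
        let variance = samples.iter().map(|x| (x - mean).powi(2)).sum::<f64>() / (n - 1.0);
        1.96 * (variance / n).sqrt()
    };

    Some(Summary {
        mean,
        median,
        // A network size cannot be negative, so clamp the lower bound.
        ci_low: (mean - half_width).max(0.0),
        ci_high: mean + half_width,
    })
}
```

The median is reported alongside the mean because individual density samples can be heavily skewed by one unusually close or distant k-th peer.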

The subcommand reuses the existing bootstrap-resolution cascade (CLI/env →
config file → auto-discovered bootstrap_peers.toml), so users get the same
flags they're already used to. Invocations without a subcommand continue
to launch a node exactly as before — clap's `Option<Subcommand>` pattern
is non-breaking.
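
The reason an optional subcommand is non-breaking can be shown with a std-only sketch. In the PR, clap generates the parsing; the hand-rolled `parse` below is a toy stand-in that only inspects the first argument, and the `Cli`, `Action`, and `dispatch` names are hypothetical. Only `Command::EstimateSize` is named in the review summary.

```rust
/// Subcommands known to the binary (only the variant named in the review).
#[derive(Debug, PartialEq)]
enum Command {
    EstimateSize,
}

/// Parsed command line: `None` means no subcommand was given.
#[derive(Debug, PartialEq)]
struct Cli {
    command: Option<Command>,
}

/// What the binary ends up doing (hypothetical, for illustration).
#[derive(Debug, PartialEq)]
enum Action {
    RunNode,
    EstimateSize,
}

/// Toy parser standing in for clap: looks only at the first argument.
fn parse(args: &[&str]) -> Cli {
    let command = match args.first().copied() {
        Some("estimate-size") => Some(Command::EstimateSize),
        _ => None,
    };
    Cli { command }
}

/// With no subcommand, fall through to the pre-existing behaviour.
fn dispatch(cli: &Cli) -> Action {
    match cli.command {
        Some(Command::EstimateSize) => Action::EstimateSize,
        None => Action::RunNode,
    }
}
```

Because the `None` arm maps to the original node-launch path, existing invocations take exactly the same branch as before the subcommand existed.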

Progress is reported to stderr (connecting, bootstrap time, routing-table
size, per-sample status, sampling completion) so an operator running the
command can tell whether work is happening or it is stuck. The default
per-lookup timeout is 90s — saorsa-core's iterative lookup can take this
long when a dead peer's dial cascade drags out an early iteration.

Verification:
- cargo fmt clean
- cargo clippy --all-features ... -D clippy::panic -D clippy::unwrap_used -D clippy::expect_used clean
- cargo test --lib estimator: 8/8 passing
- ant-node --help shows top-level options and the new subcommand
- ant-node estimate-size --help shows estimator-specific flags
- live testnet smoke run produced consistent estimates across successful
  samples (mean ~3119, median ~3035, 95% CI ±10%)

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
Copilot AI review requested due to automatic review settings May 6, 2026 20:26
@jacderida jacderida force-pushed the feat-network_size_estimate branch from 91b17ed to 5b2cde7 Compare May 6, 2026 20:26

Copilot AI left a comment

Pull request overview

Copilot reviewed 4 out of 4 changed files in this pull request and generated 3 comments.

Comment thread src/estimator.rs
Comment on lines +339 to +347
fn sample_estimate(target: &[u8; 32], peers: &[saorsa_core::DHTNode], k: usize) -> Option<f64> {
    if peers.len() < k {
        return None;
    }

    // The lookup returns peers sorted by distance to the target (closest first).
    // We want the XOR distance to the k-th closest, i.e. the (k-1)th element.
    let kth = peers.get(k - 1)?;
    let kth_bytes = kth.peer_id.to_bytes();
Comment thread src/estimator.rs
Comment on lines +174 to +181
if per_sample.is_empty() {
    return Err(Error::Startup(
        "no samples produced a usable density estimate (all lookups failed or returned too few peers)"
            .to_string(),
    ));
}

Ok(aggregate(per_sample, params.samples, k_used))
Comment thread src/estimator.rs
Comment on lines +105 to +112
eprintln!(
    "Connecting to bootstrap peers ({} configured)... [this can take 30\u{2013}60s]",
    config.bootstrap.len()
);
let p2p_node = P2PNode::new(core_config)
    .await
    .map_err(|e| Error::Startup(format!("Failed to create client P2P node: {e}")))?;
let p2p = Arc::new(p2p_node);
@jacderida jacderida closed this May 6, 2026
2 participants