Conversation
* Rebuild index with new config * fix 1 * index name in get stattu api correction * docs changes * Rebuild Status Persistence
VectorDB Benchmark — Dense — PassedTriggered by @omnish-endee · Commit
|
…me as addVectors)
|
/correctness_benchmarking dense |
3 similar comments
|
/correctness_benchmarking dense |
|
/correctness_benchmarking dense |
|
/correctness_benchmarking dense |
| } | ||
|
|
||
| // Get current metadata for defaults | ||
| auto meta = index_manager.getMetadata(index_id); |
There was a problem hiding this comment.
Here there should be a check if there are any vectors in the index
There was a problem hiding this comment.
Fixed. Uses live count from getElementCount (not meta->total_elements which can be stale before first save). Check is placed after getElementCount so the same count is reused in the response.
| // Get actual vector count for response | ||
| size_t actual_element_count = 0; | ||
| try { | ||
| actual_element_count = index_manager.getElementCount(index_id); | ||
| } catch (...) {} |
There was a problem hiding this comment.
this is ignoring any exceptions. This shouldnt be the case. use docs/logs.md format to throw an error.
If getIndexEntry throws (index not loaded), the response will report total_vectors: 0 which is misleading in the next code block.
There was a problem hiding this comment.
Fixed. Removed the try/catch — index existence is already validated by getMetadata above, so any load failure here is a genuine error that should propagate to the outer handler and return a 500.
| index_id, new_M, new_ef_con); | ||
|
|
||
| if (!success) { | ||
| int code = (message.find("already in progress") != std::string::npos) ? 409 : 400; |
There was a problem hiding this comment.
Error messages should not be used to determine the return code. some error code should be used here instead.
There was a problem hiding this comment.
Fixed. Replaced std::pair<bool, std::string> with a RebuildResult struct carrying an explicit http_code field. rebuildIndexAsync now returns typed codes (404, 409, 400, 202) directly — no more string matching in the route.
| return metadata_manager_->getMetadata(index_id); | ||
| } | ||
|
|
||
| // Index stats (safe to call from routes) |
There was a problem hiding this comment.
what does this comment mean ?
There was a problem hiding this comment.
Reads live count from the in-memory HNSW graph; meta->total_elements can be stale between saves.
| nlohmann::json getRebuildProgress(const std::string& username, | ||
| const std::string& index_id) const { | ||
| auto state = rebuild_.getActiveRebuild(username); | ||
| if (state && state->index_id == index_id) { | ||
| size_t processed = state->vectors_processed.load(); | ||
| size_t total = state->total_vectors.load(); | ||
| double percent = total > 0 ? (100.0 * processed / total) : 0.0; | ||
| nlohmann::json result = { | ||
| {"status", state->status}, | ||
| {"vectors_processed", processed}, | ||
| {"total_vectors", total}, | ||
| {"percent_complete", percent}, | ||
| {"started_at", Rebuild::formatTime(state->started_at)} | ||
| }; | ||
| if (state->status == "completed" || state->status == "failed") { | ||
| result["completed_at"] = Rebuild::formatTime(state->completed_at); | ||
| } | ||
| if (state->status == "failed" && !state->error_message.empty()) { | ||
| result["error"] = state->error_message; | ||
| } | ||
| return result; | ||
| } | ||
| return {{"status", "idle"}}; | ||
| } |
There was a problem hiding this comment.
This function should be in rebuild class to keep every subsystem separate
There was a problem hiding this comment.
Fixed. Moved the full method body to Rebuild::getProgress(username, index_id). IndexManager::getRebuildProgress is now a one-liner delegate: return rebuild_.getProgress(username, index_id).
| executeRebuildJob(index_id, username, new_M, new_ef_con, st); | ||
| }); | ||
| rebuild_.setActiveRebuild(username, index_id, current_count, std::move(t)); | ||
|
|
There was a problem hiding this comment.
setActiveRebuild should be called before executeRebuildJob. This way two rebuild jobs will not happen at once.
There was a problem hiding this comment.
Fixed. Now calls setActiveRebuild with std::jthread{} (empty) first — state is registered immediately, blocking any concurrent rebuild request before the thread is even spawned. The real thread is then spawned and moved in via the new attachRebuildThread method.
| } | ||
| } | ||
| } catch (const std::exception&) { | ||
| // Silently ignore cleanup errors on startup |
There was a problem hiding this comment.
dont ignore errors here
There was a problem hiding this comment.
Fixed. Changed to LOG_WARN(2053, "rebuild", "Failed to cleanup temp files on startup: " << e.what()).
|
|
||
| // State tracking — per user | ||
|
|
||
| void setActiveRebuild(const std::string& username, const std::string& index_id, |
There was a problem hiding this comment.
should use rebuild_state_mutex_
There was a problem hiding this comment.
Already fixed — setActiveRebuild holds rebuild_state_mutex_ via lock_guard on entry.
| new_alg->setVectorFetcher([vs = entry->vector_storage](ndd::idInt label, uint8_t* buffer) { | ||
| return vs->get_vector(label, buffer); | ||
| }); | ||
|
|
||
| new_alg->setVectorFetcherBatch([vs = entry->vector_storage](const ndd::idInt* labels, | ||
| uint8_t* buffers, | ||
| bool* success, | ||
| size_t count) -> size_t { | ||
| return vs->get_vectors_batch_into(labels, buffers, success, count); | ||
| }); |
There was a problem hiding this comment.
This code piece is duplicate from the same at the end of the function. deduplicate it
There was a problem hiding this comment.
Fixed. Extracted wireVectorFetchers(alg, vector_storage) as a private static helper on IndexManager. Both duplicate blocks replaced with a single call each. Note: the first call (on new_alg) must happen before any addPoint — searchBaseLayer during graph construction needs the fetcher for base-layer-only nodes that don't store data inline. Comment preserved at the call site.
| auto fresh_alg = std::make_unique<hnswlib::HierarchicalNSW<float>>(index_path, 0); | ||
|
|
||
| fresh_alg->setVectorFetcher([vs = entry->vector_storage](ndd::idInt label, uint8_t* buffer) { | ||
| return vs->get_vector(label, buffer); | ||
| }); | ||
|
|
||
| fresh_alg->setVectorFetcherBatch([vs = entry->vector_storage](const ndd::idInt* labels, | ||
| uint8_t* buffers, | ||
| bool* success, | ||
| size_t count) -> size_t { | ||
| return vs->get_vectors_batch_into(labels, buffers, success, count); |
There was a problem hiding this comment.
this is duplicate from reloadIndex. deduplicate
There was a problem hiding this comment.
Cannot call reloadIndex here — executeRebuildJob holds operation_mutex while reloadIndex acquires indices_mutex_, but deleteIndex holds indices_mutex_ and then acquires operation_mutex. Calling reloadIndex from inside executeRebuildJob would deadlock with a concurrent delete on the same index. Keeping the inline block and added a comment explaining the constraint.
VectorDB Benchmark - Ready To Run
Post one of the command below. Only members with write access can trigger runs. Available Modes
Infrastructure
Both servers start on demand and are always terminated after the run — pass or fail. How Correctness Benchmarking Works
|
|
/correctness_benchmarking dense |
VectorDB Benchmark — Dense — FailedTriggered by @hemant-endee · Commit
|
No description provided.