Commit a233399

update readme
1 parent f6a56b5 commit a233399

1 file changed

Lines changed: 2 additions & 2 deletions

README.md

@@ -1,6 +1,6 @@
 # CodeEvolve Experiments Repository

-This repository contains the complete experimental setup, benchmark implementations, and reproducibility code for the CodeEvolve research paper.
+This repository contains the complete experimental setup, benchmark implementations, and reproducibility code for the CodeEvolve [paper](https://arxiv.org/abs/2510.14150).

 > **CodeEvolve: An open source evolutionary coding agent for algorithm discovery and optimization**
 > Henrique Assumpção, Diego Ferreira, Leandro Campos, Fabricio Murai
@@ -143,7 +143,7 @@ codeevolve \

 ### Exact Reproducibility

-Our experimental results were obtained using Qwen and Gemini models as the backbone for our LLM ensembles. Both models were accessed via an internal API system at Inter that routed requests to the respective LLM providers. Many commercial LLM providers do not guarantee deterministic outputs even when random seeds are provided. As a result, **exact numerical reproduction of our paper results is not guaranteed**, even when using the same configuration files and seeds. Despite these limitations, our ablation studies demonstrate that CodeEvolve consistently achieves **state-of-the-art results across multiple seeds and experimental runs** on all considered benchmarks. The core algorithmic contributions remain robust to LLM stochasticity.
+Our experimental results were obtained using Qwen and Gemini models as the backbone for our LLM ensembles. Both models were accessed via an internal API system at Inter that routed requests to the respective LLM provider. Many commercial LLM providers do not guarantee deterministic outputs even when random seeds are provided. As a result, **exact numerical reproduction of our paper results is not guaranteed**, even when using the same configuration files and seeds. Despite these limitations, our ablation studies demonstrate that CodeEvolve consistently achieves **state-of-the-art results across multiple seeds and experimental runs** on all considered benchmarks. The core algorithmic contributions remain robust to LLM stochasticity.

 ## Analyzing Results
