Commit bb8fd34

update README
1 parent dbc00a2 commit bb8fd34

1 file changed

Lines changed: 153 additions & 14 deletions

README.md

@@ -1,45 +1,177 @@
1-
# Experiments with CodeEvolve
2-
This repository contains benchmark implementations, experimental configurations, and reproducibility code for the CodeEvolve paper:
1+
# CodeEvolve Experiments Repository
32

4-
> **CodeEvolve: an open source evolutionary coding agent for algorithm discovery and optimization**
3+
This repository contains the complete experimental setup, benchmark implementations, and reproducibility code for the CodeEvolve research paper:
4+
5+
> **CodeEvolve: An open source evolutionary coding agent for algorithm discovery and optimization**
56
> Henrique Assumpção, Diego Ferreira, Leandro Campos, Fabricio Murai
67
> [arXiv:2510.14150](https://arxiv.org/abs/2510.14150)
78
89
## Overview
910

10-
TODO
11+
This companion repository to [science-codeevolve](https://github.com/inter-co/science-codeevolve) provides:
12+
13+
- **Complete benchmark problems** used in the paper's evaluation
14+
- **Experimental configurations** for reproducing all results
15+
- **Raw experimental data** from paper runs (`.pkl`, `.py`, `.txt` files)
16+
- **Analysis notebooks** with visualizations and statistical tests
17+
18+
The experiments evaluate CodeEvolve on algorithm discovery tasks from mathematics, where it achieves results competitive with or superior to those of closed-source systems such as Google DeepMind's AlphaEvolve.
1119

1220
## Repository Structure
1321

14-
TODO
22+
```
23+
science-codeevolve-experiments/
24+
├── experiments/ # Raw experimental results
25+
│ └── alphaevolve_math_problems/
26+
│ ├── autocorrelation_problems/ # Autocorrelation inequalities
27+
│ ├── minimizing_max_min_dist/ # Max-min distance problems
28+
│ └── packing_problems/ # Circle and hexagon packing
29+
├── notebooks/ # Analysis and visualization
30+
│ ├── experiment_analysis.ipynb # Main analysis notebook
31+
│ └── figs/ # Generated figures from paper
32+
├── problems/ # Benchmark problem definitions
33+
│ ├── alphaevolve_math_problems/
34+
│ │ ├── autocorrelation_problems/
35+
│ │ ├── minimizing_max_min_dist/
36+
│ │ └── packing_problems/
37+
└── README.md
38+
```
39+
40+
### Directory Details
41+
42+
- **`experiments/`**: Contains results from paper experiments including:
43+
- Solution histories (`.py` files)
44+
- Checkpoints (`.pkl` files)
45+
- Logs and metadata (`.txt` files)
46+
- Multiple runs with different seeds/configurations
47+
48+
- **`notebooks/`**: Jupyter notebooks for analysis
49+
- `experiment_analysis.ipynb`: Statistical analysis and comparisons
50+
51+
- **`problems/`**: Problem definitions with:
52+
- Initial solution templates (`input/`)
53+
- Configuration files for different LLMs (`configs/`)
54+
- Evaluation scripts
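
As a quick sanity check after cloning, the file types listed above can be tallied per directory. The following is a minimal sketch, not part of the repository: `count_result_files` is a hypothetical helper, and it assumes only that results are stored as regular files under a root directory such as `experiments/`.

```python
from collections import Counter
from pathlib import Path


def count_result_files(root):
    """Count regular files under `root`, grouped by file extension.

    Hypothetical helper for illustration; it only walks the directory tree
    and does not open or unpickle any file.
    """
    counts = Counter()
    for path in Path(root).rglob("*"):
        if path.is_file():
            counts[path.suffix] += 1
    return counts
```

Calling `count_result_files("experiments")` from the repository root would return a mapping such as extension to file count, which makes it easy to confirm that `.py`, `.pkl`, and `.txt` files are all present before opening the notebook.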
1555

1656
## Prerequisites
17-
Install CodeEvolve and dependencies:
57+
58+
### Install CodeEvolve Framework
59+
60+
First, install the main CodeEvolve framework:
1861

1962
```bash
2063
# Clone and install CodeEvolve framework
2164
git clone https://github.com/inter-co/science-codeevolve.git
2265
cd science-codeevolve
2366
conda env create -f environment.yml
2467
conda activate codeevolve
68+
cd ..
69+
```
2570

71+
### Clone Experiments Repository
72+
73+
```bash
2674
# Clone this experiments repository
27-
cd ..
2875
git clone https://github.com/inter-co/science-codeevolve-experiments.git
2976
cd science-codeevolve-experiments
77+
```
78+
79+
### Configure LLM API Access
80+
81+
Set your LLM API credentials as environment variables:
3082

31-
# Set your LLM API credentials
32-
export API_KEY=your_api_key
83+
```bash
84+
export API_KEY=your_api_key_here
3385
export API_BASE=your_api_base_url
3486
```
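
Before launching a long run, it can help to confirm that both variables are actually visible to the process. This is a small sketch under the assumption that `API_KEY` and `API_BASE` (the two names exported above) are the only variables required; `missing_credentials` is a hypothetical helper, not part of CodeEvolve.

```python
import os

# Variable names taken from the export commands above.
REQUIRED_VARS = ("API_KEY", "API_BASE")


def missing_credentials(env=None):
    """Return the required variable names that are unset or empty.

    `env` defaults to the current process environment; passing a dict
    makes the check easy to exercise in isolation.
    """
    env = os.environ if env is None else env
    return [name for name in REQUIRED_VARS if not env.get(name)]
```

An empty list from `missing_credentials()` means both variables are set in the current shell session.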
3587

36-
## Reproducing results
88+
## Reproducing Paper Results
89+
90+
### Running a Benchmark Problem
91+
92+
Each problem has configuration files for different LLM providers (Gemini, Qwen, etc.). Here's how to run an experiment:
93+
94+
```bash
95+
# Example: Circle packing in a square (26 circles) with Qwen
96+
codeevolve \
97+
--inpt_dir=problems/alphaevolve_math_problems/packing_problems/circle_packing_square/26 \
98+
--cfg_path=problems/alphaevolve_math_problems/packing_problems/circle_packing_square/26/configs/qwen_config.yaml \
99+
--out_dir=results/circle_packing_26_qwen \
100+
--terminal_logging
101+
102+
# Example: First autocorrelation inequality with Gemini
103+
codeevolve \
104+
--inpt_dir=problems/alphaevolve_math_problems/autocorrelation_problems/first_autocorr_ineq/input \
105+
--cfg_path=problems/alphaevolve_math_problems/autocorrelation_problems/first_autocorr_ineq/configs/gemini_config.yaml \
106+
--out_dir=results/autocorr_first_gemini \
107+
--terminal_logging
108+
```
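
When running the same problem with several configurations, the command lines can be generated rather than typed by hand. The sketch below only assembles and prints commands using the four flags shown above; it does not execute anything. `build_command` is a hypothetical helper, the `results/...` output names are illustrative, and whether both `qwen_config.yaml` and `gemini_config.yaml` exist for a given problem is an assumption to check against its `configs/` directory.

```python
import shlex


def build_command(inpt_dir, cfg_path, out_dir, terminal_logging=True):
    """Assemble a `codeevolve` invocation as an argument list.

    Only the flags shown in the examples above are used.
    """
    cmd = [
        "codeevolve",
        f"--inpt_dir={inpt_dir}",
        f"--cfg_path={cfg_path}",
        f"--out_dir={out_dir}",
    ]
    if terminal_logging:
        cmd.append("--terminal_logging")
    return cmd


# Problem directory taken from the second example above.
problem = "problems/alphaevolve_math_problems/autocorrelation_problems/first_autocorr_ineq"

# Assumption: one config file per provider, named <provider>_config.yaml.
for provider in ("qwen", "gemini"):
    cmd = build_command(
        inpt_dir=f"{problem}/input",
        cfg_path=f"{problem}/configs/{provider}_config.yaml",
        out_dir=f"results/autocorr_first_{provider}",  # illustrative name
    )
    print(shlex.join(cmd))
```

Printing the commands first makes it possible to review paths before passing an argument list to something like `subprocess.run`.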
109+
110+
### Available Benchmark Problems
111+
112+
| Problem Category | Problem | Instances | Description |
113+
|-----------------|---------|------------|-------------|
114+
| **Autocorrelation** | First Autocorr Ineq | - | First autocorrelation inequality |
115+
| | Second Autocorr Ineq | - | Second autocorrelation inequality |
116+
| **Heilbronn** | Triangle | - | Heilbronn triangle problem |
117+
| | Convex | 13, 14 | Heilbronn convex hull problem |
118+
| **Max-Min Distance** | Dimension 2 | 2D | Maximize minimum distance |
119+
| | Dimension 3 | 3D | Maximize minimum distance |
120+
| **Packing** | Circle in Rectangle | - | Pack circles in rectangle |
121+
| | Circle in Square | 26, 32 | Pack N circles in unit square |
122+
| | Hexagon Packing | 11, 12 | Pack N hexagons in larger hexagon |
123+
124+
### Resuming from Checkpoints
125+
126+
To resume an interrupted run:
127+
128+
```bash
129+
codeevolve \
130+
--inpt_dir=problems/alphaevolve_math_problems/packing_problems/circle_packing_square/26 \
131+
--out_dir=results/circle_packing_26_qwen \
132+
--load_ckpt=-1 # Load latest checkpoint
133+
```
134+
135+
Or load a specific checkpoint epoch:
136+
137+
```bash
138+
codeevolve \
139+
--inpt_dir=problems/alphaevolve_math_problems/packing_problems/circle_packing_square/26 \
140+
--out_dir=results/circle_packing_26_qwen \
141+
--load_ckpt=100 # Load checkpoint from epoch 100
142+
```
143+
144+
### Exact Reproducibility
145+
146+
Our experimental results were obtained with Qwen and Gemini models as the backbone of our LLM ensembles. Both models were accessed through an internal API system at Inter that routed requests to the respective LLM providers. Many commercial LLM providers do not guarantee deterministic outputs, even when a random seed is provided. As a result, **exact numerical reproduction of our paper results is not guaranteed**, even with the same configuration files and seeds. Despite this limitation, our ablation studies show that CodeEvolve consistently achieves **state-of-the-art results across multiple seeds and experimental runs** on all considered benchmarks, which indicates that the core algorithmic contributions are robust to LLM stochasticity.
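
Because individual runs are not bit-for-bit reproducible, a reproduction is best compared with the paper through summary statistics over several seeds rather than a single number. The following is a minimal sketch: `summarize_runs` is a hypothetical helper, the input is assumed to be one final score per run, and higher scores are assumed to be better (swap `max` for `min` on minimization problems).

```python
import statistics


def summarize_runs(scores):
    """Summarize final scores from repeated runs of one problem.

    Assumes one score per run and that higher is better.
    """
    scores = list(scores)
    return {
        "runs": len(scores),
        "best": max(scores),
        "mean": statistics.mean(scores),
        # Sample standard deviation is undefined for a single run.
        "stdev": statistics.stdev(scores) if len(scores) > 1 else 0.0,
    }
```

Reporting the best value together with the mean and spread makes it clearer whether a difference from the published figure falls within ordinary run-to-run variation.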
37147

38-
TODO
148+
## Analyzing Results
149+
150+
### Using the Analysis Notebook
151+
152+
```bash
153+
# Make sure Jupyter is installed
154+
conda activate codeevolve
155+
pip install jupyter matplotlib pandas
156+
157+
# Launch notebook
158+
jupyter notebook notebooks/experiment_analysis.ipynb
159+
```
160+
161+
The notebook provides:
162+
- Plots of solution quality over time
163+
- Comparison with AlphaEvolve baselines
164+
- Ablation study analysis
165+
166+
## Getting Help
167+
168+
- **Issues**: [GitHub Issues](https://github.com/inter-co/science-codeevolve/issues)
169+
- **Discussions**: [GitHub Discussions](https://github.com/inter-co/science-codeevolve/discussions)
170+
- **Paper**: [arXiv:2510.14150](https://arxiv.org/abs/2510.14150)
39171

40172
## Citation
41173

42-
If you use CodeEvolve in your research, please cite our paper:
174+
If you use CodeEvolve or these benchmarks in your research, please cite:
43175

44176
```bibtex
45177
@article{assumpção2025codeevolveopensourceevolutionary,
@@ -53,12 +185,19 @@ If you use CodeEvolve in your research, please cite our paper:
53185
}
54186
```
55187

188+
## Releases
189+
190+
Experiments are versioned to match the main repository:
191+
192+
- **v0.1.0**: Initial release, corresponds to v1 of the technical report
193+
- **v0.2.0**: Current release, corresponds to v3 of the technical report
194+
56195
## Acknowledgements
57196

58197
The authors thank Bruno Grossi for his continuous support during the development of this project. We thank Fernando Augusto and Tiago Machado for useful conversations about possible applications of CodeEvolve. We also thank the [OpenEvolve](https://github.com/codelion/openevolve) community for their inspiration and discussion about evolutionary coding agents.
59198

60-
## License and Disclaimer
199+
## License
61200

62201
All software is licensed under the Apache License, Version 2.0 (Apache 2.0); you may not use this file except in compliance with the Apache 2.0 license. You may obtain a copy of the Apache 2.0 license at: https://www.apache.org/licenses/LICENSE-2.0.
63202

64-
**This is not an official Inter product.**
203+
**This is not an official Inter product.**
