Benchmark Runner

class sudoku_smt_solvers.benchmarks.benchmark_runner.BenchmarkRunner(puzzles_dir: str = 'benchmarks/puzzles', results_dir: str = 'benchmarks/results', timeout: int = 120)

A benchmark runner for comparing different Sudoku solver implementations.

This class manages running performance benchmarks across multiple Sudoku solvers, collecting metrics like solve time and propagation counts, and saving results to CSV files.

Attributes:

puzzles_dir (str): Directory containing puzzle JSON files
results_dir (str): Directory where benchmark results are saved
timeout (int): Maximum time in seconds allowed for each solver attempt
solvers (dict): Dictionary mapping solver names to solver classes

run_benchmarks() → None

Run all solvers on all puzzles and save results.

Executes benchmarks for each solver on each puzzle, collecting performance metrics and saving results to a timestamped CSV file.

The CSV output includes:

- Solver name
- Puzzle unique ID
- Solution status
- Solve time
- Propagation count

Also calculates and stores aggregate statistics per solver:

- Total puzzles attempted
- Number of puzzles solved
- Total and average solving times
- Total and average propagation counts
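The per-solver aggregation described above can be illustrated with a minimal, self-contained sketch. It does not reproduce the runner's actual code: the column names, the sample rows, and the helper names `write_csv` and `aggregate` are hypothetical. It also assumes that a puzzle counts as solved when its status is 'sat' and that averages are taken over all attempted puzzles; the real implementation may define either differently.

```python
import csv
import io
from collections import defaultdict

# Hypothetical column names; the runner's real CSV header may differ.
FIELDS = ["solver", "puzzle_id", "status", "solve_time", "propagations"]

# Made-up sample rows, one per (solver, puzzle) pair.
rows = [
    {"solver": "A", "puzzle_id": "p1", "status": "sat", "solve_time": 0.5, "propagations": 100},
    {"solver": "A", "puzzle_id": "p2", "status": "timeout", "solve_time": 120.0, "propagations": 0},
    {"solver": "B", "puzzle_id": "p1", "status": "sat", "solve_time": 1.5, "propagations": 300},
    {"solver": "B", "puzzle_id": "p2", "status": "sat", "solve_time": 2.5, "propagations": 500},
]


def write_csv(result_rows, stream):
    """Write a header line followed by one CSV line per result row."""
    writer = csv.DictWriter(stream, fieldnames=FIELDS)
    writer.writeheader()
    writer.writerows(result_rows)


def aggregate(result_rows):
    """Compute per-solver totals and averages from the result rows."""
    stats = defaultdict(
        lambda: {"attempted": 0, "solved": 0, "total_time": 0.0, "total_propagations": 0}
    )
    for row in result_rows:
        entry = stats[row["solver"]]
        entry["attempted"] += 1
        # Assumption: only a 'sat' status counts as a solved puzzle.
        if row["status"] == "sat":
            entry["solved"] += 1
        entry["total_time"] += row["solve_time"]
        entry["total_propagations"] += row["propagations"]
    for entry in stats.values():
        # Assumption: averages are taken over attempted puzzles, not solved ones.
        entry["avg_time"] = entry["total_time"] / entry["attempted"]
        entry["avg_propagations"] = entry["total_propagations"] / entry["attempted"]
    return dict(stats)


buffer = io.StringIO()  # stands in for the timestamped CSV file
write_csv(rows, buffer)
summary = aggregate(rows)
```

Here `summary["A"]["solved"]` is 1 of 2 attempted puzzles, because the timed-out attempt still counts as attempted and contributes its time to the totals.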

run_solver(solver_name: str, puzzle: List[List[int]]) → Dict

Run a single solver on a puzzle, subject to a timeout, and collect the results.

Args:

solver_name: Name of the solver to use
puzzle: 2D list representing the Sudoku puzzle

Returns:

Dict containing:

status: 'sat', 'unsat', 'timeout', or 'error'
solve_time: Time taken in seconds
propagations: Number of clause propagations (if available)
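To make the shape of the returned dictionary concrete, the sketch below shows one possible way to produce it. It is not the runner's implementation: `run_with_timeout` and the stand-in solver callables are hypothetical, and the timeout is enforced here with a daemon thread, which is only one of several options (a subprocess would also allow the work to be terminated). The stand-in solvers report no propagation count, so the sketch always returns 0 for that field.

```python
import threading
import time
from typing import Callable, Dict, List

Grid = List[List[int]]


def run_with_timeout(solve: Callable[[Grid], bool], puzzle: Grid, timeout: float) -> Dict:
    """Run solve(puzzle) in a worker thread and map the outcome to a status dict."""
    outcome: Dict = {}

    def target() -> None:
        try:
            outcome["solved"] = solve(puzzle)
        except Exception as exc:  # any solver failure is reported as 'error'
            outcome["error"] = exc

    start = time.perf_counter()
    worker = threading.Thread(target=target, daemon=True)
    worker.start()
    worker.join(timeout)  # wait at most `timeout` seconds
    elapsed = time.perf_counter() - start

    if worker.is_alive():
        # The thread is not killed; it is abandoned and keeps running in the background.
        return {"status": "timeout", "solve_time": timeout, "propagations": 0}
    if "error" in outcome:
        return {"status": "error", "solve_time": elapsed, "propagations": 0}
    status = "sat" if outcome["solved"] else "unsat"
    return {"status": status, "solve_time": elapsed, "propagations": 0}


def broken_solver(grid: Grid) -> bool:
    """Stand-in solver that always fails."""
    raise ValueError("bad grid")


empty_grid = [[0] * 9 for _ in range(9)]
fast = run_with_timeout(lambda grid: True, empty_grid, timeout=1.0)
unsolvable = run_with_timeout(lambda grid: False, empty_grid, timeout=1.0)
slow = run_with_timeout(lambda grid: time.sleep(0.5) or True, empty_grid, timeout=0.05)
failed = run_with_timeout(broken_solver, empty_grid, timeout=1.0)
```

A caller would typically branch on `status` before trusting `solve_time`, since a 'timeout' row reflects the time limit rather than an actual solving time.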