---
title: OpenEnv Code Debugger
emoji: π
colorFrom: blue
colorTo: green
sdk: docker
pinned: false
app_port: 7860
---
# Code Debug OpenEnv
An OpenEnv-compatible environment for the Meta x PyTorch Hackathon where an AI agent debugs broken Python code.
## Overview
The agent receives buggy Python code and test descriptions, submits fixes, and is rewarded by the fraction of tests passing (0.0–1.0). The episode ends when all tests pass or the step limit is reached.
## Tasks
| Task | Difficulty | Bug Type |
|---|---|---|
| task_001_off_by_one | Easy | Fibonacci returns wrong variable |
| task_002_wrong_operator | Easy | `<` instead of `>` in `find_max` |
| task_003_mutable_default | Medium | Mutable default argument in list builder |
| task_004_scope_bug | Medium | Closure captures loop variable by reference |
| task_005_binary_search | Hard | Binary search boundary bugs |
| task_006_graph_cycle | Hard | DFS cycle detection missing recursion stack |
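The medium-difficulty tasks cover classic Python pitfalls. For instance, a mutable default argument (as in task_003) is created once at function definition and shared across calls. A minimal illustration of the bug and its fix — not the actual task file:

```python
def append_buggy(item, acc=[]):
    # Bug: the default list is created once and shared by every call.
    acc.append(item)
    return acc

def append_fixed(item, acc=None):
    # Fix: use None as a sentinel and build a fresh list per call.
    if acc is None:
        acc = []
    acc.append(item)
    return acc

buggy_first = append_buggy(1)   # [1]
buggy_second = append_buggy(2)  # [1, 2] -- state leaked from the first call
fixed_first = append_fixed(1)   # [1]
fixed_second = append_fixed(2)  # [2]
```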
## API Endpoints
| Method | Path | Description |
|---|---|---|
| GET | `/health` | Health check |
| GET | `/tasks` | List all available tasks |
| POST | `/reset` | Start a new episode |
| POST | `/step/{episode_id}` | Submit fixed code |
| GET | `/state/{episode_id}` | Get episode metadata |
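A minimal client sketch against these endpoints using `httpx` (installed in the setup step below). The request body field (`code`) and the response field (`episode_id`) are assumptions for illustration; the actual schemas live in `models.py`:

```python
BASE = "http://localhost:7860"

def step_payload(code: str) -> dict:
    # Hypothetical request body for POST /step/{episode_id};
    # the real field names are defined by DebugAction in models.py.
    return {"code": code}

if __name__ == "__main__":
    import httpx  # third-party; part of the pip install command below

    with httpx.Client(base_url=BASE) as client:
        episode = client.post("/reset").json()       # start a new episode
        episode_id = episode["episode_id"]           # assumed response field
        obs = client.post(
            f"/step/{episode_id}",
            json=step_payload("def fixed(): ..."),
        ).json()
        print(obs)
```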
## Reward
```
reward = tests_passed / total_tests   # range: 0.0 – 1.0
done   = reward == 1.0 OR step_count >= max_steps
```
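The reward pseudocode above translates directly into Python; a small sketch with hypothetical function names:

```python
def compute_reward(tests_passed: int, total_tests: int) -> float:
    # Fraction of tests passing, in [0.0, 1.0].
    return tests_passed / total_tests

def is_done(reward: float, step_count: int, max_steps: int) -> bool:
    # Episode ends on a perfect score or when the step budget runs out.
    return reward == 1.0 or step_count >= max_steps
```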
## Setup & Run
### Local (development)

```bash
pip install fastapi uvicorn pydantic httpx openai
cd Desktop/Meta
uvicorn code_debug_env.server.app:app --host 0.0.0.0 --port 7860 --reload
```
### Docker

```bash
cd Desktop/Meta/code_debug_env
docker build -t code-debug-env -f server/Dockerfile ..
docker run -p 7860:7860 code-debug-env
```
## Inference Script

```bash
export API_BASE_URL="https://router.huggingface.co/v1"
export MODEL_NAME="Qwen/Qwen2.5-72B-Instruct"
export HF_TOKEN="your_token"
export ENV_URL="http://localhost:7860"  # or HF Space URL

python inference.py
```
### Expected output format

```
[START] task=task_001_off_by_one env=http://localhost:7860 model=Qwen/Qwen2.5-72B-Instruct
[STEP] step=1 action='def fib...' reward=1.00 done=true error=null
[END] success=true steps=1 score=1.000 rewards=1.00
```
## Environment Variables
| Variable | Required | Description |
|---|---|---|
| `API_BASE_URL` | Yes | LLM API endpoint |
| `MODEL_NAME` | Yes | Model identifier |
| `HF_TOKEN` | Yes | Hugging Face / API key |
| `ENV_URL` | No | OpenEnv server URL (default: `http://localhost:7860`) |
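These variables can be collected at startup with a small helper; the function below takes any `os.environ`-style mapping, raises `KeyError` for missing required variables, and applies the documented default for `ENV_URL` (the helper name itself is hypothetical):

```python
def get_config(env) -> dict:
    # Required variables raise KeyError if unset; ENV_URL has a default.
    return {
        "api_base_url": env["API_BASE_URL"],
        "model_name": env["MODEL_NAME"],
        "hf_token": env["HF_TOKEN"],
        "env_url": env.get("ENV_URL", "http://localhost:7860"),
    }

# Typical usage:
#   import os
#   cfg = get_config(os.environ)
```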
## Project Structure
```
code_debug_env/
├── models.py          # Pydantic models (DebugAction, DebugObservation, DebugState)
├── client.py          # HTTP client wrapper
├── openenv.yaml       # Environment manifest
├── pyproject.toml     # Package metadata
├── tasks/             # Task definitions (JSON)
│   ├── easy/
│   ├── medium/
│   └── hard/
└── server/
    ├── environment.py # Core logic (reset/step/state)
    ├── executor.py    # Safe subprocess code runner
    ├── app.py         # FastAPI server
    └── Dockerfile
inference.py           # Root-level inference script
```