---
title: OpenEnv Code Debugger
emoji: π
colorFrom: blue
colorTo: green
sdk: docker
pinned: false
app_port: 7860
---
# Code Debug OpenEnv

An **OpenEnv-compatible environment** for the Meta x PyTorch Hackathon where an AI agent debugs broken Python code.

## Overview

The agent receives buggy Python code and test descriptions, submits fixes, and is rewarded by the fraction of tests passing (0.0–1.0). The episode ends when all tests pass or the step limit is reached.
## Tasks

| Task | Difficulty | Bug Type |
|------|------------|----------|
| task_001_off_by_one | Easy | Fibonacci returns wrong variable |
| task_002_wrong_operator | Easy | `<` instead of `>` in find_max |
| task_003_mutable_default | Medium | Mutable default argument in list builder |
| task_004_scope_bug | Medium | Closure captures loop variable by reference |
| task_005_binary_search | Hard | Binary search boundary bugs |
| task_006_graph_cycle | Hard | DFS cycle detection missing recursion stack |
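For a feel of these bug classes, here is a generic illustration of the medium-difficulty mutable-default bug (an example of the bug pattern, not the actual task file):

```python
# Buggy: the default list is created once at function definition
# and shared across every call that omits the argument.
def append_item_buggy(item, items=[]):
    items.append(item)
    return items

# Fixed: use None as a sentinel and create a fresh list per call.
def append_item_fixed(item, items=None):
    if items is None:
        items = []
    items.append(item)
    return items

print(append_item_buggy(1))  # [1]
print(append_item_buggy(2))  # [1, 2]  <- state leaks across calls
print(append_item_fixed(1))  # [1]
print(append_item_fixed(2))  # [1]
```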
## API Endpoints

| Method | Path | Description |
|--------|------|-------------|
| GET | `/health` | Health check |
| GET | `/tasks` | List all available tasks |
| POST | `/reset` | Start a new episode |
| POST | `/step/{episode_id}` | Submit fixed code |
| GET | `/state/{episode_id}` | Get episode metadata |
## Reward

```
reward = tests_passed / total_tests   # range: 0.0 – 1.0
done   = reward == 1.0 OR step_count >= max_steps
```
## Setup & Run

### Local (development)

```bash
pip install fastapi uvicorn pydantic httpx openai
cd Desktop/Meta
uvicorn code_debug_env.server.app:app --host 0.0.0.0 --port 7860 --reload
```
### Docker

```bash
cd Desktop/Meta/code_debug_env
docker build -t code-debug-env -f server/Dockerfile ..
docker run -p 7860:7860 code-debug-env
```
## Inference Script

```bash
export API_BASE_URL="https://router.huggingface.co/v1"
export MODEL_NAME="Qwen/Qwen2.5-72B-Instruct"
export HF_TOKEN="your_token"
export ENV_URL="http://localhost:7860"  # or HF Space URL
python inference.py
```
### Expected output format

```
[START] task=task_001_off_by_one env=http://localhost:7860 model=Qwen/Qwen2.5-72B-Instruct
[STEP] step=1 action='def fib...' reward=1.00 done=true error=null
[END] success=true steps=1 score=1.000 rewards=1.00
```
## Environment Variables

| Variable | Required | Description |
|----------|----------|-------------|
| `API_BASE_URL` | Yes | LLM API endpoint |
| `MODEL_NAME` | Yes | Model identifier |
| `HF_TOKEN` | Yes | Hugging Face / API key |
| `ENV_URL` | No | OpenEnv server URL (default: http://localhost:7860) |
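The usual pattern for reading these is to fail fast on required keys and fall back to the local default for `ENV_URL` (a sketch of the expected pattern, not necessarily what `inference.py` does verbatim):

```python
import os

def load_config(env=os.environ):
    """Required keys raise KeyError if missing; ENV_URL has a default."""
    return {
        "api_base_url": env["API_BASE_URL"],
        "model_name": env["MODEL_NAME"],
        "hf_token": env["HF_TOKEN"],
        "env_url": env.get("ENV_URL", "http://localhost:7860"),
    }

cfg = load_config({
    "API_BASE_URL": "https://router.huggingface.co/v1",
    "MODEL_NAME": "Qwen/Qwen2.5-72B-Instruct",
    "HF_TOKEN": "your_token",
})
print(cfg["env_url"])  # http://localhost:7860
```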
## Project Structure

```
code_debug_env/
├── models.py          # Pydantic models (DebugAction, DebugObservation, DebugState)
├── client.py          # HTTP client wrapper
├── openenv.yaml       # Environment manifest
├── pyproject.toml     # Package metadata
├── tasks/             # Task definitions (JSON)
│   ├── easy/
│   ├── medium/
│   └── hard/
└── server/
    ├── environment.py # Core logic (reset/step/state)
    ├── executor.py    # Safe subprocess code runner
    ├── app.py         # FastAPI server
    └── Dockerfile
inference.py           # Root-level inference script
```