---
title: OpenEnv Code Debugger
emoji: π
colorFrom: blue
colorTo: green
sdk: docker
pinned: false
app_port: 7860
---
# Code Debug OpenEnv
An OpenEnv-compatible environment for the Meta x PyTorch Hackathon where an AI agent debugs broken Python code.
## Overview
The agent receives buggy Python code and test descriptions, submits fixes, and is rewarded by the fraction of tests passing (0.0–1.0). The episode ends when all tests pass or the step limit is reached.
## Tasks
| Task | Difficulty | Bug Type |
|---|---|---|
| task_001_off_by_one | Easy | Fibonacci returns wrong variable |
| task_002_wrong_operator | Easy | `<` instead of `>` in `find_max` |
| task_003_mutable_default | Medium | Mutable default argument in list builder |
| task_004_scope_bug | Medium | Closure captures loop variable by reference |
| task_005_binary_search | Hard | Binary search boundary bugs |
| task_006_graph_cycle | Hard | DFS cycle detection missing recursion stack |
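The medium-difficulty tasks cover classic Python pitfalls. For instance, a mutable default argument (as in task_003) is created once at function definition and shared across calls. A minimal illustration of the bug and its fix — not the actual task file:

```python
def append_buggy(item, acc=[]):
    # Bug: the default list is created once and shared by every call.
    acc.append(item)
    return acc

def append_fixed(item, acc=None):
    # Fix: use None as a sentinel and build a fresh list per call.
    if acc is None:
        acc = []
    acc.append(item)
    return acc

buggy_first = append_buggy(1)   # [1]
buggy_second = append_buggy(2)  # [1, 2] -- state leaked from the first call
fixed_first = append_fixed(1)   # [1]
fixed_second = append_fixed(2)  # [2]
```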
## API Endpoints
| Method | Path | Description |
|---|---|---|
| GET | `/health` | Health check |
| GET | `/tasks` | List all available tasks |
| POST | `/reset` | Start a new episode |
| POST | `/step/{episode_id}` | Submit fixed code |
| GET | `/state/{episode_id}` | Get episode metadata |
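A minimal client sketch against these endpoints using `httpx` (installed in the setup step below). The request body field (`code`) and the response field (`episode_id`) are assumptions for illustration; the actual schemas live in `models.py`:

```python
BASE = "http://localhost:7860"

def step_payload(code: str) -> dict:
    # Hypothetical request body for POST /step/{episode_id};
    # the real field names are defined by DebugAction in models.py.
    return {"code": code}

if __name__ == "__main__":
    import httpx  # third-party; part of the pip install command below

    with httpx.Client(base_url=BASE) as client:
        episode = client.post("/reset").json()       # start a new episode
        episode_id = episode["episode_id"]           # assumed response field
        obs = client.post(
            f"/step/{episode_id}",
            json=step_payload("def fixed(): ..."),
        ).json()
        print(obs)
```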
## Reward
```
reward = tests_passed / total_tests   # range: 0.0 – 1.0
done   = reward == 1.0 OR step_count >= max_steps
```
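The reward pseudocode above translates directly into Python; a small sketch with hypothetical function names:

```python
def compute_reward(tests_passed: int, total_tests: int) -> float:
    # Fraction of tests passing, in [0.0, 1.0].
    return tests_passed / total_tests

def is_done(reward: float, step_count: int, max_steps: int) -> bool:
    # Episode ends on a perfect score or when the step budget runs out.
    return reward == 1.0 or step_count >= max_steps
```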
## Setup & Run
### Local (development)

```bash
pip install fastapi uvicorn pydantic httpx openai
cd Desktop/Meta
uvicorn code_debug_env.server.app:app --host 0.0.0.0 --port 7860 --reload
```
### Docker

```bash
cd Desktop/Meta/code_debug_env
docker build -t code-debug-env -f server/Dockerfile ..
docker run -p 7860:7860 code-debug-env
```
## Inference Script

```bash
export API_BASE_URL="https://router.huggingface.co/v1"
export MODEL_NAME="Qwen/Qwen2.5-72B-Instruct"
export HF_TOKEN="your_token"
export ENV_URL="http://localhost:7860"  # or HF Space URL

python inference.py
```
### Expected output format

```
[START] task=task_001_off_by_one env=http://localhost:7860 model=Qwen/Qwen2.5-72B-Instruct
[STEP] step=1 action='def fib...' reward=1.00 done=true error=null
[END] success=true steps=1 score=1.000 rewards=1.00
```
## Environment Variables
| Variable | Required | Description |
|---|---|---|
| `API_BASE_URL` | Yes | LLM API endpoint |
| `MODEL_NAME` | Yes | Model identifier |
| `HF_TOKEN` | Yes | Hugging Face / API key |
| `ENV_URL` | No | OpenEnv server URL (default: `http://localhost:7860`) |
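These variables can be collected at startup with a small helper; the function below takes any `os.environ`-style mapping, raises `KeyError` for missing required variables, and applies the documented default for `ENV_URL` (the helper name itself is hypothetical):

```python
def get_config(env) -> dict:
    # Required variables raise KeyError if unset; ENV_URL has a default.
    return {
        "api_base_url": env["API_BASE_URL"],
        "model_name": env["MODEL_NAME"],
        "hf_token": env["HF_TOKEN"],
        "env_url": env.get("ENV_URL", "http://localhost:7860"),
    }

# Typical usage:
#   import os
#   cfg = get_config(os.environ)
```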
## Project Structure
```
code_debug_env/
├── models.py          # Pydantic models (DebugAction, DebugObservation, DebugState)
├── client.py          # HTTP client wrapper
├── openenv.yaml       # Environment manifest
├── pyproject.toml     # Package metadata
├── tasks/             # Task definitions (JSON)
│   ├── easy/
│   ├── medium/
│   └── hard/
└── server/
    ├── environment.py # Core logic (reset/step/state)
    ├── executor.py    # Safe subprocess code runner
    ├── app.py         # FastAPI server
    └── Dockerfile
inference.py           # Root-level inference script
```