---
title: OpenEnv Code Debugger
emoji: π
colorFrom: blue
colorTo: green
sdk: docker
pinned: false
app_port: 7860
---
# Code Debug OpenEnv

An **OpenEnv-compatible environment** for the Meta x PyTorch Hackathon where an AI agent debugs broken Python code.

## Overview

The agent receives buggy Python code and test descriptions, submits fixes, and is rewarded by the fraction of tests passing (0.0–1.0). The episode ends when all tests pass or the step limit is reached.
## Tasks

| Task | Difficulty | Bug Type |
|------|------------|----------|
| task_001_off_by_one | Easy | Fibonacci returns wrong variable |
| task_002_wrong_operator | Easy | `<` instead of `>` in find_max |
| task_003_mutable_default | Medium | Mutable default argument in list builder |
| task_004_scope_bug | Medium | Closure captures loop variable by reference |
| task_005_binary_search | Hard | Binary search boundary bugs |
| task_006_graph_cycle | Hard | DFS cycle detection missing recursion stack |
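For a feel of these bug classes, here is a generic illustration of the medium-difficulty mutable-default bug (an example of the bug pattern, not the actual task file):

```python
# Buggy: the default list is created once at function definition
# and shared across every call that omits the argument.
def append_item_buggy(item, items=[]):
    items.append(item)
    return items

# Fixed: use None as a sentinel and create a fresh list per call.
def append_item_fixed(item, items=None):
    if items is None:
        items = []
    items.append(item)
    return items

print(append_item_buggy(1))  # [1]
print(append_item_buggy(2))  # [1, 2]  <- state leaks across calls
print(append_item_fixed(1))  # [1]
print(append_item_fixed(2))  # [1]
```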
## API Endpoints

| Method | Path | Description |
|--------|------|-------------|
| GET | `/health` | Health check |
| GET | `/tasks` | List all available tasks |
| POST | `/reset` | Start a new episode |
| POST | `/step/{episode_id}` | Submit fixed code |
| GET | `/state/{episode_id}` | Get episode metadata |
## Reward

```
reward = tests_passed / total_tests   # range: 0.0 – 1.0
done   = reward == 1.0 OR step_count >= max_steps
```
## Setup & Run

### Local (development)

```bash
pip install fastapi uvicorn pydantic httpx openai
cd Desktop/Meta
uvicorn code_debug_env.server.app:app --host 0.0.0.0 --port 7860 --reload
```
### Docker

```bash
cd Desktop/Meta/code_debug_env
docker build -t code-debug-env -f server/Dockerfile ..
docker run -p 7860:7860 code-debug-env
```
## Inference Script

```bash
export API_BASE_URL="https://router.huggingface.co/v1"
export MODEL_NAME="Qwen/Qwen2.5-72B-Instruct"
export HF_TOKEN="your_token"
export ENV_URL="http://localhost:7860"  # or HF Space URL
python inference.py
```
### Expected output format

```
[START] task=task_001_off_by_one env=http://localhost:7860 model=Qwen/Qwen2.5-72B-Instruct
[STEP] step=1 action='def fib...' reward=1.00 done=true error=null
[END] success=true steps=1 score=1.000 rewards=1.00
```
## Environment Variables

| Variable | Required | Description |
|----------|----------|-------------|
| `API_BASE_URL` | Yes | LLM API endpoint |
| `MODEL_NAME` | Yes | Model identifier |
| `HF_TOKEN` | Yes | Hugging Face / API key |
| `ENV_URL` | No | OpenEnv server URL (default: http://localhost:7860) |
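The usual pattern for reading these is to fail fast on required keys and fall back to the local default for `ENV_URL` (a sketch of the expected pattern, not necessarily what `inference.py` does verbatim):

```python
import os

def load_config(env=os.environ):
    """Required keys raise KeyError if missing; ENV_URL has a default."""
    return {
        "api_base_url": env["API_BASE_URL"],
        "model_name": env["MODEL_NAME"],
        "hf_token": env["HF_TOKEN"],
        "env_url": env.get("ENV_URL", "http://localhost:7860"),
    }

cfg = load_config({
    "API_BASE_URL": "https://router.huggingface.co/v1",
    "MODEL_NAME": "Qwen/Qwen2.5-72B-Instruct",
    "HF_TOKEN": "your_token",
})
print(cfg["env_url"])  # http://localhost:7860
```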
## Project Structure

```
code_debug_env/
├── models.py          # Pydantic models (DebugAction, DebugObservation, DebugState)
├── client.py          # HTTP client wrapper
├── openenv.yaml       # Environment manifest
├── pyproject.toml     # Package metadata
├── tasks/             # Task definitions (JSON)
│   ├── easy/
│   ├── medium/
│   └── hard/
└── server/
    ├── environment.py # Core logic (reset/step/state)
    ├── executor.py    # Safe subprocess code runner
    ├── app.py         # FastAPI server
    └── Dockerfile
inference.py           # Root-level inference script
```