HippoCamp: Benchmarking Contextual Agents on Personal Computers Paper • 2604.01221 • Published about 24 hours ago • 15
Paper Reconstruction Evaluation: Evaluating Presentation and Hallucination in AI-written Papers Paper • 2604.01128 • Published 1 day ago • 6
Proactive Agent Research Environment: Simulating Active Users to Evaluate Proactive Assistants Paper • 2604.00842 • Published 1 day ago • 5 • 1
Embarrassingly Simple Self-Distillation Improves Code Generation Paper • 2604.01193 • Published about 24 hours ago • 6
GaussianGPT: Towards Autoregressive 3D Gaussian Scene Generation Paper • 2603.26661 • Published 6 days ago • 10
Meta-Harness: End-to-End Optimization of Model Harnesses Paper • 2603.28052 • Published 4 days ago • 4
FIPO: Eliciting Deep Reasoning with Future-KL Influenced Policy Optimization Paper • 2603.19835 • Published 13 days ago • 302
Unify-Agent: A Unified Multimodal Agent for World-Grounded Image Synthesis Paper • 2603.29620 • Published 2 days ago • 37
VectorGym: A Multitask Benchmark for SVG Code Generation, Sketching, and Editing Paper • 2603.29852 • Published Feb 22 • 4
Learn2Fold: Structured Origami Generation with World Model Planning Paper • 2603.29585 • Published Feb 2 • 10