Real-world projects demonstrating how we approach hard engineering problems.
Fine-tuning gpt-oss-120b on 100 real-world bug-fixing tasks with GRPO, yielding a 13% improvement in best@10 and fewer steps per task.