Using Claude to fix PyPy3.11 test failures securely
Source: PyPy
I got access to Claude Max for 6 months, as part of a promotion Anthropic offered to open source software contributors. My main OSS impact is as a maintainer for NumPy, but I decided to see what claude-code could do for PyPy's failing 3.11 tests. Most of these failures are edge cases: error messages that differ from CPython, or debugging tools that fail in certain cases.

I was worried about letting an AI agent loose on my development machine. I noticed a post by Patrick McCanna (thanks Patrick!) that pointed to using bubblewrap to sandbox the agent. So I set it all up and (hopefully securely) pointed claude-code at some tests.

Setting up

There were a few steps to make sure I didn't open myself up to obvious gotchas. There are stories about agents wiping out databases or deleting mailboxes.

Bubblewrap

First I needed to see what bubblewrap does. I followed the instructions in the blog post to set things up with some minor variations:

    sudo apt install bubblewrap

I couldn't run bwrap. After