Show HN: A toy MCP for AI agents to code, run, and see output of GPU code safely


3 points by mehmetoguzderin a day ago

Ever wondered what happens when you let an AI agent loose on GPU pipeline programming, including shaders? `shaderc-vkrunner-mcp` enables you to find out, without risking any actual GPU or dealing with driver or OS stuff.

The core idea is to give AI agents the ability to write, compile, and run GPU code, iterating safely and reliably, then let you see the output if desired – all completely locally.

It runs entirely inside Docker using Mesa's software Vulkan driver. No need to try to paravirtualize any actual physical GPU, meaning agents can experiment with fancy Vulkan features – atomics, subgroups, cooperative matrices, and more – safely on the CPU. It's surprisingly capable of testing logic, even if it's slow.

I went with Vulkan since it's cross-platform (drivers support it well these days, Mesa runs almost everywhere, and MoltenVK exists for Mac users who want to experiment similarly), exposes enough primitives that hardware can accelerate the same code to competitive levels – plenty to explore – and avoids vendor lock-in.

But why put all of this in Docker? Setting up Vulkan dev tools and similar directly on your host machine can override environment variables and cause other side effects. With this, Docker is the only dependency. Pull the image (multi-arch: amd64, arm64, experimental riscv64/ppc64le via QEMU build) and connect your MCP client (like Copilot or the Inspector tool). It mounts only your current directory so the agent can save files (images, etc.) without exposing anything further; since the LLM will be generating the code, possibly steered by other MCPs you attach, a wider mount could open up a plethora of issues.
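To make the "Docker is the only dependency" wiring concrete, here is a hypothetical sketch of a VS Code-style `mcp.json` entry; the image tag, mount point, and exact config keys are my assumptions, so check the repo README for the actual invocation:

```json
{
  "servers": {
    "shaderc-vkrunner-mcp": {
      "type": "stdio",
      "command": "docker",
      "args": [
        "run", "-i", "--rm",
        "-v", "${workspaceFolder}:/workspace",
        "ghcr.io/mehmetoguzderin/shaderc-vkrunner-mcp:latest"
      ]
    }
  }
}
```

The `-v` mount is what lets the agent save images and other files next to your own; everything else stays inside the container.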

It is far from perfect or complete. It's currently monolithic and wraps CLI tools (`shaderc`, `VkRunner`), with plans to use their APIs directly later. It's primarily tested with Copilot's agent. It's definitely a "toy" right now, but a fun one for exploring agent capabilities. I have been able to make it output interesting visuals, from simple to moderately advanced, and actually tweak some compute shaders. Can AI really optimize a shader or create a cool SDF raymarcher from scratch? This lets you find out, at least.
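To give a flavor of what the agent is asked to produce, here is a minimal VkRunner-style `shader_test` with a compute shader writing into an SSBO; treat it as a sketch, since the exact allocation and probe syntax may differ slightly from what VkRunner accepts:

```
[compute shader]
#version 450

layout(local_size_x = 1) in;

layout(binding = 0) buffer Result {
    float value;
};

void main() {
    value = 42.0;
}

[test]
# Allocate an SSBO at binding 0, dispatch once, then verify the result
ssbo 0 4
compute 1 1 1
probe ssbo float 0 0 == 42.0
```

When the agent's GLSL fails to compile, shaderc's error output is exactly the kind of thing worth feeding back into the loop.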

Check out the code, try feeding it some prompts, and see what breaks! Contributions are welcome, especially around usability for agents, since I suspect there is quite some headroom to improve the tool descriptions or schema generation. Some sample screenshots are in the top-level README.

A usability note: agents often don't pick up the interface on the first try, and they don't yet iterate on the resulting errors on their own (probably just for the time being; I expect Copilot and others to let AI iterate over schema issues in the near future). If you feed those errors back and prompt in the right direction, things smooth out and get productive. Since the task here is lower level than usual (using Vulkan and GLSL/SPIR-V directly), it takes agents slightly longer to adjust, at least in my experience with Claude 3.7 Sonnet and GPT-4o.

GitHub: https://github.com/mehmetoguzderin/shaderc-vkrunner-mcp

There is a GitHub Action running right now to package Docker images to GHCR; however, it's also fairly easy and quick to build locally, both from the top-level Dockerfile and from the Devcontainer, to increase the fun.