AndrewPGameDev 18 hours ago

I've spent a little time in this space, and I'm not sure it's a good idea to write shaders in Rust, although it's probably better than GLSL or WGSL.

Let me start with the pros:

1. Don't have to learn 2 different languages

2. Modules, crates, and easier code sharing

3. Easier sharing between Rust structs and shader code.

Now the cons, in comparison to Slang [1]:

1. No autodiff mode

2. Strictly outputs SPIR-V, while Slang can target CPU, CUDA, PyTorch, OptiX, and all the major graphics APIs

3. Less support - Slang is supported by the Khronos Group, and is used at Nvidia, EA, and Valve.

4. Safety isn't very valuable; most GPU code doesn't use pointers (it's so rare that Slang considers pointer support a feature!)

5. slangc probably runs a lot faster than rustc (although I would like to see a benchmark.)

6. Worse debugging experience: Slang has better interop with tools like NSight Graphics and its Shader Debugger. Slang recently got support in NSight Graphics for shader profiling, for example.

7. Slang has support for reflection, and has a C++ API that can directly output a JSON file containing all the reflected aspects. This makes handling the movement between Rust <-> GPU much easier. Also, the example shown on the website uses `bytemuck`, but `bytemuck` won't take the WebGPU struct alignment rules [2] into consideration. Instead, you have to use a crate like `encase` [3] to handle that (see the sketch after this list). I'm not sure, given the example on the website, how it would work with WebGPU.

8. If you have pre-existing shaders in GLSL or HLSL, you can use slangc directly on them. No need to rewrite.

9. In reality, you may not have to learn 2 languages but you have to learn 2 different compute models (CPU vs GPU). This is actually a much harder issue, and AFAICT it is impossible to overcome with a different language. The problem is the programmer needs to understand how the platforms are different.
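
To make the alignment point concrete, here's a minimal sketch of the `encase` route (hypothetical types, not from the article; assumes encase's `ShaderType` derive with its glam feature enabled):

    use encase::{ShaderType, UniformBuffer};
    use glam::{Vec2, Vec3};

    // WGSL gives vec3<f32> a 16-byte alignment, so `direction` must land at
    // offset 16; a #[repr(C)] + bytemuck layout would place it at offset 8.
    #[derive(ShaderType)]
    struct SpotLight {
        position: Vec2,
        direction: Vec3,
        intensity: f32,
    }

    fn main() {
        let light = SpotLight {
            position: Vec2::ZERO,
            direction: Vec3::Y,
            intensity: 10.0,
        };
        // encase inserts the required padding while writing the bytes,
        // so the buffer matches the WGSL uniform layout.
        let mut buffer = UniformBuffer::new(Vec::<u8>::new());
        buffer.write(&light).unwrap();
        let _bytes: Vec<u8> = buffer.into_inner();
    }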

[1] https://shader-slang.org/

[2] https://webgpufundamentals.org/webgpu/lessons/resources/wgsl... (WGSL struct alignment widget)

[3] https://github.com/teoxoy/encase

  • wyager 18 hours ago

    Had not heard of Slang, thanks for sharing.

    It's interesting that Slang looks more like Rust than WGSL does, despite WGSL kind of being de facto "owned by" the Rust community.

    • pcwalton 16 hours ago

      Not sure why you think WGSL is owned by the Rust community--it's clearly owned by the W3C. (That's part of why it moves so slowly.) WESL [1] is the community-owned set of extensions to WGSL that, while incomplete, promises to move it much closer to Rust.

      [1]: https://github.com/wgsl-tooling-wg/wesl-spec

nefarious_ends 21 hours ago

Anyone have recommendations for resources for learning to write shaders?

  • arjonagelhout 17 hours ago

    My experience with writing shaders (such as for physically based rendering) is that the shading languages (MSL, GLSL, HLSL) are easy to switch between. The hard part is understanding the physics and understanding how GPUs work internally.

    My main approach to writing shaders is to look at existing programs (e.g. Blender) and see what techniques are in use. The Google Filament renderer documentation [0] is also really good when it comes to BSDF functions.

    Some papers from Unreal Engine might also help, such as "Real Shading in Unreal Engine 4" [1]

    [0] https://google.github.io/filament/Filament.md.html

    [1] https://cdn2.unrealengine.com/Resources/files/2013SiggraphPr...

  • jms55 19 hours ago

    If you want to make nice looking materials and effects, you need a combination of good lighting (comes from the rendering engine, not the material), and artistic capabilities/talent. Art is a lot harder to teach than programming I feel, or at least I don't know how to teach it.

    Programming the shaders themselves is pretty simple imo; they're just pure functions that return color data or triangle positions. The syntax might be a little different from what you're used to depending on the shader language, but it should be easy enough to pick up in a day.
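
    For a concrete sense of scale, here's roughly what that looks like with rust-gpu (a sketch modeled on the project's examples; treat the attribute details as approximate):

        // The whole "shader" is one pure function: pixel position in, color out.
        use spirv_std::spirv;
        use spirv_std::glam::{Vec2, Vec4};

        #[spirv(fragment)]
        pub fn main_fs(#[spirv(frag_coord)] frag_coord: Vec4, output: &mut Vec4) {
            // Simple gradient based on screen position (1080p assumed).
            let uv = Vec2::new(frag_coord.x, frag_coord.y) / 1080.0;
            *output = Vec4::new(uv.x, uv.y, 0.5, 1.0);
        }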

    If you want to write compute shaders for computation, then it gets a lot more tricky and you need to spend some time learning about memory accesses, the underlying hardware, and profiling.

impure 21 hours ago

This is big news. Shaders have been a pain to develop for in Unity. If I can program them in Rust, leveraging Rust's tooling and ecosystem, that would be huge.

tripplyons 21 hours ago

Very exciting project! Does this work on M-series macs? I've had trouble running some shaders on my laptop before.

tubs 21 hours ago

I understand the sentiment but to be very pedantic most GPUs do not understand SPIRV, it’s the drivers that do.

hackyhacky 21 hours ago

> While over-all better alternatives to both languages exist, none of them are in a place to replace *HLSL* or *GLSL*. Either because they are vendor locked, or because they don't support the traditional graphics pipeline. Examples of this include *CUDA* and *OpenCL*.

Are CUDA and OpenCL really "better alternatives" to HLSL and GLSL?

CUDA and OpenCL are compute languages; HLSL and GLSL are shader languages. And while one can theoretically do compute in a shader (and we used to!) or shaders in a compute language, I think it's dishonest to claim that CUDA is intended as an updated alternative to GLSL. It's simply apples and oranges.

  • pjmlp 13 hours ago

    OpenCL no one cares about.

    CUDA, yes; it is already being used in commercial visualisation products like OctaneRender, one of the most important tools in the VFX industry.

    NVidia also has plenty of customers on OptiX.

    GLSL is a dead end; Khronos is on the record that they aren't going to develop it further, even for Vulkan. HLSL, and now Slang, are the way forward.

    HLSL due to its use in the games industry, Slang due to being developed by NVIDIA and given to Khronos as a GLSL replacement.

shadowgovt 20 hours ago

So why Rust?

Rust is great for tracking the lifetimes of long-lived resources, which is exactly what a shader doesn't have.

Apart from that, what makes Rust a good fit for this problem domain?

  • virtualritz 19 hours ago

    I think that if you have a game/DCC/whatever app you write in Rust, being able to also write any shaders it uses in Rust is simply nice.

    And as an added benefit it means not adding another language to a project and all that comes with it for build system/CI. I.e. cargo takes care of everything. That alone is worth a lot IMHO.

    Apart from that, it's not related to Rust. Just replace Rust with your fav. language and imagine you could also write shaders/GPU-targeted code in it. Isn't that desirable?

    • ghfhghg 19 hours ago

      Kind of, and I've certainly used systems like that before, but in practice it's not really a massive improvement. Sometimes it's even a tad annoying having another layer of indirection.

  • pornel 18 hours ago

    Lifetime tracking isn't just for safety or resource management. It also helps write correct code, especially parallel code, where shared vs mutable matters.

    Unit testing of shaders is usually a pain. Traditionally they're a black box without any assert() equivalent, and you can at best try to propagate NaN and generate magenta pixels on failure. Sharing Rust code lets you unit-test parts of it on the CPU.
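
    As a sketch of what that sharing can look like (hypothetical crate layout; assumes the shared code sticks to no_std-friendly dependencies like glam):

        // Shared crate: builds for the SPIR-V target and for the host, so the
        // same shader logic runs under plain `cargo test` on the CPU.
        #![cfg_attr(target_arch = "spirv", no_std)]

        use glam::Vec3;

        pub fn tonemap_reinhard(color: Vec3) -> Vec3 {
            // Simple Reinhard operator: maps [0, inf) into [0, 1).
            color / (color + Vec3::ONE)
        }

        #[cfg(test)]
        mod tests {
            use super::*;

            #[test]
            fn output_stays_below_one() {
                let out = tonemap_reinhard(Vec3::splat(4.0));
                assert!(out.max_element() < 1.0);
            }
        }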

    Sharing of data structures between CPU and GPU is nice too. WGSL is superficially similar to Rust, but using plain WGSL requires maintaining bindings and struct layouts by hand, which is a chore.

    For CUDA, the alternative is C++. On the upside that's the official first-class API for CUDA, but the downside is that it's C++. With Rust you don't have the legacy language cruft, nor busywork like header files or makefiles. You get working dependency management, and don't get caught in the unix vs Windows schism.

    • exDM69 12 hours ago

      > Traditionally they're a black box without any assert() equivalent

      Thankfully these days we have printf in shaders that you can use for "asserts". You can detect if the shader printed anything and consider it a failure.
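
      In rust-gpu terms that pattern might look something like this (assuming spirv_std's debug_printf! macro, which maps to Vulkan's debugPrintfEXT; the exact macro path and safety requirements are an assumption here):

          use spirv_std::spirv;
          use spirv_std::glam::Vec4;
          use spirv_std::macros::debug_printf; // path may differ between versions

          fn luminance(c: Vec4) -> f32 {
              0.2126 * c.x + 0.7152 * c.y + 0.0722 * c.z
          }

          #[spirv(fragment)]
          pub fn main_fs(#[spirv(frag_coord)] frag_coord: Vec4, output: &mut Vec4) {
              let color = Vec4::new(frag_coord.x * 0.001, frag_coord.y * 0.001, 0.5, 1.0);
              // Poor man's assert: the harness scans the debugPrintfEXT output and
              // treats any printed line as a failure.
              if !(luminance(color) >= 0.0) {
                  unsafe { debug_printf!("negative luminance at %v4f", frag_coord) };
              }
              *output = color;
          }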

      You can even add a conditional print in your pixel shader, run your app in renderdoc and find the pixel(s) that printed something. Once you find one, you can step through it in the shader debugger.

      This seemingly simple feature is a huge time saver.

    • Const-me 16 hours ago

      > Sharing of data structures between CPU and GPU is nice too

      How did they do it? That's hard to do, because GPU hardware can convert data types on the fly, e.g. you can store bytes in VRAM and convert them to 32-bit floats in [0 .. +1] in the shader. GPUs can do that for both inputs (loaded texture texels, loaded buffer elements, vertex attributes) and outputs (rendered pixels, stored UAV elements).

      • exDM69 12 hours ago

        If you are using plain buffers the GPU and the CPU access data pretty much exactly the same way. With scalar block layout all the alignments are pretty much the same too.

        To get the format conversion stuff you talk about, you need to use images, vertex input or texel buffers and configure the format conversion explicitly.

        It's a good question how much of these conversions are actually done by GPU hardware and how much of it is just software (which you could write yourself in a shader and get same perf). I have not seen an apples to apples benchmark about these format conversions.

        • Const-me 11 hours ago

          > If you are using plain buffers the GPU and the CPU access data pretty much exactly the same way

          Yeah, that will work fine for byte address buffers, and to a lesser extent constant buffers (they don't convert data types, but the access semantics and alignment are a bit tricky), but not much else. Vertex buffers, textures, and texel buffers / typed buffers in D3D are all widely used in real-time graphics.

          > which you could write yourself in a shader and get same perf

          Pretty sure it's hardware. Emulating an anisotropic texture sampler in HLSL code would need hundreds of instructions, which is prohibitively expensive. Even simpler trilinear sampling is surprisingly tricky to emulate due to the screen-space partial derivatives on input.

          > I have not seen an apples to apples benchmark about these format conversions.

          Here's a benchmark for vertex buffers: https://wickedengine.net/2017/06/should-we-get-rid-of-vertex... As you can see, on AMD GCN4 he indeed measured pretty much the same perf; however, on nVidia Maxwell, vertex buffers were 2-4 times faster.

          • exDM69 8 hours ago

            > Yeah, that will work fine for byte address buffers, and to a lesser extent constant buffers (they don't convert data types, but the access semantics and alignment are a bit tricky), but not much else.

            This is where sharing the CPU and GPU side struct declaration is helpful. With scalar block layout (VK_EXT_scalar_block_layout in Vulkan, not sure how about d3d land) you don't even need to worry about alignment rules because they're the same for GPU and CPU (just make sure your binding base address/offset is aligned).
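
            A sketch of what that looks like, with the GLSL side shown in a comment (assumes GL_EXT_scalar_block_layout is enabled in the shader):

                // Under scalar layout the GLSL block
                //
                //     layout(scalar, set = 0, binding = 0) buffer PointLights {
                //         PointLight lights[];
                //     };
                //
                // uses the same member offsets as plain #[repr(C)] on the CPU side:
                #[repr(C)]
                #[derive(Clone, Copy)]
                pub struct PointLight {
                    pub position: [f32; 3], // offset 0 (no vec3 -> 16-byte padding)
                    pub radius: f32,        // offset 12
                    pub color: [f32; 3],    // offset 16
                    pub intensity: f32,     // offset 28
                }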

            > Vertex buffers, textures, and texel buffers / typed buffers in D3D are all widely used in real-time graphics.

            Of course. You don't get to share "structs" between CPU and GPU transparently here, because you need to program the GPU hardware (vertex input, texture samplers) to match.

            There is some reflection-based trickery that can help here, but rust-gpu AFAIK doesn't do that. I've seen some projects use proc macros to generate vertex input layout config for GL/Vulkan from Rust structs with some custom #[attribute] annotations.
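
            For illustration, the kind of boilerplate such a macro would emit, written by hand against wgpu here purely as a convenient Rust-side example (not what the parent comment used):

                #[repr(C)]
                #[derive(Clone, Copy)]
                struct Vertex {
                    position: [f32; 3],   // -> Float32x3
                    normal_oct: [u16; 2], // -> Unorm16x2 (octahedron-encoded normal)
                }

                // Offsets are derived from the formats in declaration order,
                // matching the #[repr(C)] field layout above.
                const VERTEX_ATTRS: [wgpu::VertexAttribute; 2] =
                    wgpu::vertex_attr_array![0 => Float32x3, 1 => Unorm16x2];

                fn vertex_layout() -> wgpu::VertexBufferLayout<'static> {
                    wgpu::VertexBufferLayout {
                        array_stride: std::mem::size_of::<Vertex>() as wgpu::BufferAddress,
                        step_mode: wgpu::VertexStepMode::Vertex,
                        attributes: &VERTEX_ATTRS,
                    }
                }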

            > Pretty sure it’s hardware.

            Now this is just guessing.

            > Emulating an anisotropic texture sampler in HLSL code would need hundreds of instructions...

            Texture sampling / interpolation is certainly hardware.

            But the conversion from rgba8_unorm to rgba32f, for example? Or r10g10b10a2?

            I've not seen any conclusive benchmark results that suggest whether it's faster to just grab these from a storage buffer in a shader and do the few arithmetic instructions or whether it's faster to use an texel buffer. Images are a different beast entirely due to tiling formats (you can't really memory map them so the point of sharing struct declarations is irrelevant).
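
            For the rgba8_unorm case, the shader-side decode being weighed against a typed texel-buffer fetch is just a handful of ALU ops, roughly:

                use glam::Vec4;

                // Manual rgba8_unorm -> float4 decode from a raw u32 read out of a
                // storage buffer; a typed texel buffer would do this conversion for you.
                fn unpack_unorm_4x8(packed: u32) -> Vec4 {
                    Vec4::new(
                        (packed & 0xff) as f32,
                        ((packed >> 8) & 0xff) as f32,
                        ((packed >> 16) & 0xff) as f32,
                        (packed >> 24) as f32,
                    ) / 255.0
                }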

            > Here’s a benchmark for vertex buffers

            I am familiar with this benchmark from 8 years ago, which is highly specific to vertex buffers (and post transform cache etc). It's a nicely done benchmark but it has two small flaws in it: the hw tested is quite old by now and it doesn't take into account the benefit of improved batching / reduced draw calls that can only be done with custom vertex fetch (so you don't need BindVertex/IndexBuffer calls). It would be great if this benchmark could be re-run with some newer hw.

            But this benchmark doesn't answer the question whether the typed buffer format conversions are faster than doing it in a shader (outside of vertex input).

            > however on nVidia Maxwell vertex buffers were 2-4 times faster.

            The relevant hardware got revamped in Turing series to facilitate mesh shaders, so can't extrapolate the results to present day hardware.

            Fwiw. I've been using custom vertex fetch with buffer device address in my projects for a few years now and I haven't noticed adverse performance implications on any hw I've used (Intel, NV and AMD). But I haven't done rigorous benchmarking that would compare to using vertex input stage.

            I'm not using rust-gpu for shaders at the moment, but if I were, it would be helpful to just use the same struct declarations. All my vertex data, instance data, constant buffers and compute buffers are a 1:1 translation from Rust to GLSL struct declarations, which is just redundant work.

            • Const-me 7 hours ago

              > This is where sharing the CPU and GPU side struct declaration is helpful

              Indeed, but sharing code between CPU and shaders is not the only way to solve the problem. I wrote a simple design-time tool which loads compiled shaders through the shader reflection API and generates a source file with C++ headers (or, for other projects, C# structures) for these constant buffers. At least with D3D11, compiled shaders have sufficient type and memory layout info to generate these structures, matching memory layout by generating padding fields when necessary.

              > not sure how about d3d land

              Pretty sure D3D11 doesn’t have an equivalent of that Vulkan extension. Not sure about D3D12 though, only used the 12 briefly.

              > I've been using custom vertex fetch with buffer device address in my projects

              In my projects, I sometimes use a lot of non-trivial input layout features. Sometimes I need multiple vertex buffers, e.g. to generate normals on GPU with a compute shader. Sometimes I need instancing. Often I need FP16 or SNORM/UNORM vertex attributes, like RG16_UNORM for octahedron-encoded normals.

    • pjmlp 13 hours ago

      Just as an example, the way Vulkan lifetimes work, depending on the resource group and associated semaphores, fits neither Rust's affine types nor RAII, hence so many don't make use of the C++ RAII handles in the Vulkan SDK.

  • xphos 20 hours ago

    I think being able to develop this is a testament that Rust makes embedding stuff like this much easier than other languages. It's also integrated into a Rust project, so you might not want to hop to a new language to do some shader stuff quickly.

  • shadowgovt 3 hours ago

    Update: thanks to everyone for the thoughtful comments on this.

    I'm so in the weeds with dealing with GLSL these days that "What if you could use the same language to implement the CPU logic as the GPU logic" wasn't even a goal I could see. That's actually quite huge; my concerns around Rust were mostly that it's not a deeply simple language and so much of shader frameworks is code-that-writes-code, but it's simple enough that I think it could still be used in that space, while granting the advantages of not having to context-switch in and out of one's CPU-language while developing a shader.

    This has promise!

  • wyager 18 hours ago

    Rust's ecological niche, besides the affine types/lifetime stuff, is "borrow as much stuff as we can from Haskell/ML without dragging in a language runtime".

    "Without dragging in a language runtime" happens to be a primary requirement for writing bare-metal code or writing GPU shaders.

    So you have a language that does have a bunch of nice new features (ADTs, acceptably modern type system, etc.) and doesn't have any of the stuff that prevents you from compiling it to random bare-metal targets like GPUs or MMUless microcontrollers.

    • pjmlp 13 hours ago

      Basically Slang, the GLSL replacement designed by NVidia and adopted by Khronos as an industry standard.

  • thrance 20 hours ago

    Rust also has a great type system and zero-cost abstractions. Plus, there's already CUDA if you want to run C on the GPU.