Container Water 06: Leaving ShaderGraph behind

If this is your first exposure to this series of posts about the “Container Water”, you’ll not have heard me moaning about ShaderGraph. Well, over my travels, I’ve been getting progressively more frustrated at the speed at which ShaderGraph allows me to work.

This post is a slight departure from the rest of the series, as I want to record some of my techniques for moving back to HLSL from ShaderGraph, in the hope that others will find it useful until Unity catch up.

2 years on… (December 2022)

We went back to ShaderGraph… in fact it wasn’t many months after writing this post. The main reason was Unity upgrades. We were unfortunate enough to need to make a massive jump in HDRP version, which in turn meant a rather large re-write to the structure of these shaders. I decided I didn’t want to risk this happening again, so back to the graph I went.

It’s still quite possible to work without ShaderGraph, but you have to be prepared to jump through a web of include files to debug issues if anything changes or moves between upgrades.

There’s also word that Unity has a version 2 of ShaderGraph on the way, as well as new Block Shaders (similar to the old Surface Shader compiler, yipee).

To be clear, this is not me ragging on Unity specifically for this. Yes, I do think ShaderGraph is fairly unfinished at this point, especially considering there’s no good alternative equivalent to the old Surface Shader system. However, gamedev is software development too, and as such, I understand you have to start somewhere with new features, and ShaderGraph is a big and arguably much-needed new feature.

I’m confident the team at Unity are aware of its shortcomings, in a small part due to us having a moan at them about it directly. They are a good bunch and are working on it.

The above notwithstanding, I still had a job to do, and I wasn’t happy using ShaderGraph to do it.

TL;DR

This is a long one, and covers my logic and approach for converting back to HLSL with some examples, but here’s a quick takeaway.

  • Use ShaderGraph to generate as close a template to the shader you want as possible.
  • Plug dummy data into all of the shader’s outputs that you need.
  • Get a diff tool – kdiff3 is free and works well – this will help you compare different shader versions if you need to – you will.
  • Get an IDE where you can open folders – VSCode is also free. This will allow you to search around HDRP and SRP core’s HLSL include libraries.
  • Use the ShaderGraph node documentation. Each node’s documentation shows the code it generates. This will help get you acclimated to the SRP way of doing things.
  • “Find & Replace” is your friend; generated HLSL code from ShaderGraph contains a lot of repetition, and you’ll often need to make the same edit in multiple places.
  • Spend time making generated code easier to read, and add comments if you need to. You can’t change the fact these are big and complex shaders; you can, however, invest time in making things easier.
  • As with conventional shader writing, break out shared code into include files where possible. This allows you to edit and maintain code in as few places as possible.

The Point of ShaderGraph

Let’s start by addressing what ShaderGraph is doing, and why it’s doing it.

ShaderGraph’s goal is to make shader writing easier and more accessible across Unity’s new-ish Scriptable Render Pipeline. For most people’s purposes, this is the Universal RP, and the High Definition RP.

In theory, you can write a shader for either RP and simply include different master (or output) nodes for each applicable pipeline. You’d then change output when you changed render pipeline. Apparently this feature has been removed from 8.x.x onwards…

The core logic of the shader – the part you would create – is abstracted away from the boilerplate code. ShaderGraph will automatically generate all of the appropriate shader passes for your target RP, generate the appropriate lighting model, and include the required shader features you’ve chosen.

This is reminiscent of the surface shader model of the Built-in RP, except much more automated. It’s been years since folks using Unity regularly wrote custom lighting functions, and this is a natural next step.

Generated Code and its Anatomy

Going forward I’m looking at HDRP versions of ShaderGraph output, and HDRP is a vast and complex beast.

ShaderGraph files are actually just JSON data, not typical HLSL code. This is one of the problems I have with this system: you can’t easily diff shader changes here.

For example, take the following SampleTexture2DLODNode JSON object from my shader. I’ve cleaned it up a little to make it somewhat more human-readable.

{
    "typeInfo": {
        "fullName": "UnityEditor.ShaderGraph.SampleTexture2DLODNode"
    },
    "JSONnodeData": "{
        \"m_GuidSerialized\": \"097ac773-77cf-45be-9431-3fa6660582d5\",
        \"m_GroupGuidSerialized\": \"3d11f03d-ba06-4ce7-ac12-c6c5089ec073\",
        \"m_Name\": \"Sample Texture 2D LOD\",
        \"m_NodeVersion\": 0,
        \"m_DrawState\": {
            \"m_Expanded\": true,
            \"m_Position\": {
                \"serializedVersion\": \"2\",
                \"x\": 112.00005340576172,
                \"y\": 16.00001335144043,
                \"width\": 206.0,
                \"height\": 177.0
            }
        },
        \"m_SerializableSlots\": [
            {
                \"typeInfo\": {
                    \"fullName\": \"UnityEditor.ShaderGraph.Vector4MaterialSlot\"
                },
                \"JSONnodeData\": \"{
                    \\\"m_Id\\\": 0,
                    \\\"m_DisplayName\\\": \\\"RGBA\\\",
                    \\\"m_SlotType\\\": 1,
                    \\\"m_Priority\\\": 2147483647,
                    \\\"m_Hidden\\\": false,
                    \\\"m_ShaderOutputName\\\": \\\"RGBA\\\",
                    \\\"m_StageCapability\\\": 3,
                    \\\"m_Value\\\": { \\\"x\\\": 0.0, \\\"y\\\": 0.0, \\\"z\\\": 0.0, \\\"w\\\": 0.0 },
                    \\\"m_DefaultValue\\\": { \\\"x\\\": 0.0, \\\"y\\\": 0.0, \\\"z\\\": 0.0, \\\"w\\\": 0.0 }
                }\"
            },
            {
                \"typeInfo\": {
                    \"fullName\": \"UnityEditor.ShaderGraph.Vector1MaterialSlot\"
                },
                \"JSONnodeData\": \"{
                    \\\"m_Id\\\": 5,
                    \\\"m_DisplayName\\\": \\\"R\\\",
                    \\\"m_SlotType\\\": 1,
                    \\\"m_Priority\\\": 2147483647,
                    \\\"m_Hidden\\\": false,
                    \\\"m_ShaderOutputName\\\": \\\"R\\\",
                    \\\"m_StageCapability\\\": 3,
                    \\\"m_Value\\\": 0.0,
                    \\\"m_DefaultValue\\\": 0.0,
                    \\\"m_Labels\\\": [ \\\"X\\\" ]
                }\"
            },
            {
                \"typeInfo\": {
                    \"fullName\": \"UnityEditor.ShaderGraph.Vector1MaterialSlot\"
                },
                \"JSONnodeData\": \"{
                    \\\"m_Id\\\": 6,
                    \\\"m_DisplayName\\\": \\\"G\\\",
                    \\\"m_SlotType\\\": 1,
                    \\\"m_Priority\\\": 2147483647,
                    \\\"m_Hidden\\\": false,
                    \\\"m_ShaderOutputName\\\": \\\"G\\\",
                    \\\"m_StageCapability\\\": 3,
                    \\\"m_Value\\\": 0.0,
                    \\\"m_DefaultValue\\\": 0.0,
                    \\\"m_Labels\\\": [ \\\"X\\\" ]
                }\"
            },
            {
                \"typeInfo\": {
                    \"fullName\": \"UnityEditor.ShaderGraph.Vector1MaterialSlot\"
                },
                \"JSONnodeData\": \"{
                    \\\"m_Id\\\": 7,
                    \\\"m_DisplayName\\\": \\\"B\\\",
                    \\\"m_SlotType\\\": 1,
                    \\\"m_Priority\\\": 2147483647,
                    \\\"m_Hidden\\\": false,
                    \\\"m_ShaderOutputName\\\": \\\"B\\\",
                    \\\"m_StageCapability\\\": 3,
                    \\\"m_Value\\\": 0.0,
                    \\\"m_DefaultValue\\\": 0.0,
                    \\\"m_Labels\\\": [ \\\"X\\\" ]
                }\"
            },
            {
                \"typeInfo\": {
                    \"fullName\": \"UnityEditor.ShaderGraph.Vector1MaterialSlot\"
                },
[...]
},

And this is the node the above snippet represents.

The JSON object describes how the node is constructed in ShaderGraph, but discerning its relationship to other nodes is a very tricky process. It’s safe to say Unity doesn’t intend end-users to be using git to diff these files.

How to show generated code

If we want to see the HLSL output, ShaderGraph allows you to right-click on most nodes and select ‘Show Generated Code’. This means you can see what each individual node is doing, or by doing this on the Master node, you can see the shader in its entirety. So let’s do that now…

Potential Future Compatibility Issues (lol)

Taking the ShaderGraph code and modifying it comes with a risk, and that’s potential compatibility with future versions of HDRP and ShaderGraph. Abstraction of the end user’s logic from the final HLSL means that if Unity change large parts of the pipeline, they can ensure existing shaders still (mostly) compile. This is gone if you take the reins.

Before generating your code, make sure you’ve got at least some dummy data plugged into your master node for every feature you wish to use. Also make sure to set up the other surface features – refraction, coloured specular, transparency, etc. It’s easier to do this now than to add them later, although it is still possible.

If you don’t do this before you generate your code, you might have large sections of logic missing which you intend to use later. Vertex displacement/animation being an important one, as it’s only included if it’s used.

Another version of “Doom scrolling” in my book…

… 12k lines of code?!

I won’t lie, I tried this once before, saw this generated code, and shied away. I was instantly put off by the sheer complexity of the task; maybe ShaderGraph was the only answer, and this code wasn’t meant to be read in this fashion.

But I ended up back here, so in true Tech Artist fashion, I needed a way to make it work, as much for my own sanity as anything. The first thing I noticed with this shader was the number of passes HDRP requires. So I collapsed them in Rider, and commented the name of each one next to its pass declaration. This helped me understand what was going on.
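For example, something like this (a trimmed sketch – the LightMode tags are HDRP’s, everything else is elided):

SubShader
{
    Pass // ShadowCaster
    {
        Name "ShadowCaster"
        Tags { "LightMode" = "ShadowCaster" }
        // ...
    }

    Pass // GBuffer
    {
        Name "GBuffer"
        Tags { "LightMode" = "GBuffer" }
        // ...
    }

    // ... and so on for the remaining passes
}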

Simplified overview of the generated shader

Immediate observation: 2 subshaders, one “normal”, and one for DXR (raytracing) support. As I’m not looking at supporting DXR at present, I can delete the second subshader, literally halving the length of the shader.

This leaves me with the following passes:

  • ShadowCaster
  • META
  • SceneSelectionPass
  • DepthOnly
  • GBuffer
  • MotionVectors
  • Forward

HDRP selectively renders each pass depending on the render context in the pipeline. Some of the passes are quite self-explanatory – ShadowCaster, for example, is called when rendering from a light’s perspective. SceneSelectionPass, I think, is what helps Unity draw the orange selection outline, although based on current experience I’ve not gotten this working…

SceneSelectionPass, is that you?

META is to do with lightmap baking, GBuffer is HDRP’s deferred renderer, and the rest are similarly pretty self-explanatory.

From scrolling through these passes, I soon noticed there appears to be a lot of repetition, so I copied each of these passes into its own text file, and used a file comparison tool to see what was different.

Between the Forward and GBuffer passes there are just 11 differences, and the number of differences between certain passes never seems to exceed 30. So whilst this shader looks big and complicated, it’s not as bad as it first seems.

In fact, all of the passes are based on the template for the associated Master node – in this case the Lit Master – the template for which is “HDLitPass.template”, located in the “render-pipelines.high-definition” package.

$$$?

This template file is not valid shader code and will not compile as-is. It’s pretty close, but the main “issues” are the lines prefixed with a ‘$’ character, and the $splice() function.

The ‘$’ character is removed if that line of code is to be used in the shader, or replaced with “//” if the line is to be ignored. ShaderGraph understands what lines are needed based on functionality used in the graph.

$splice() is similar, except it’s where chunks of code are inserted, similarly dependent on the functionality requested by the shader.
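As a rough illustration of the pattern (a made-up snippet, not copied from the real template file):

// A '$'-prefixed line survives (minus its '$' token) if the graph
// needs that field; otherwise ShaderGraph comments it out:
$SurfaceDescriptionInputs.WorldSpaceNormal:  float3 WorldSpaceNormal;
$SurfaceDescriptionInputs.WorldSpaceTangent: float3 WorldSpaceTangent;

// $splice() marks where whole chunks of generated code get inserted:
$splice(GraphFunctions)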

This ~400 line template forms the basis of how ShaderGraph generates passes for “Lit Master”, and being able to access it makes up for having more or less no documentation for what we’re doing.

It contains the following “blocks” and vague sections of code:

  • Render Modes
  • Opening HLSLPROGRAM tag and universal pragma declarations
  • Graph Defines
  • Variant Definitions
  • Shader Stages
  • Active Field Defines
  • Some #ifdef keyword spaghetti
  • Interpolator Packing And Struct Declarations
  • Graph generated code
  • A fairly large chunk of universal code, once again full of compile-time conditionals
  • Finally, the Pass Includes.

I’m not going to totally reverse engineer this entire template, as I don’t have a complete-enough understanding of all of it yet. I do, however, have enough of a working understanding to get what I want out of it.

The most interesting element of this template is the “Graph generated code” section. This is where the node graph is translated into HLSL, and it is, in theory, where most of our edits will want to be made.

This section of code is not identical across all passes, it is however very similar. The differences are based on what the required final outputs from the VertexDescription and SurfaceDescription are. For example, the SceneSelectionPass only seems to care about the Alpha property of the SurfaceDescription being populated.

Things are close enough that, for my purposes, I decided to move all the code in this block, across all passes, into its own HLSL file. This might mean we get some logic executing in certain passes which we don’t need – assuming it’s not compiled out as unused – but this is easy enough to optimise if it becomes an issue.

Now, in each pass in the original shader file, I replace the code between the “Graph generated code” comments with a reference to my include file. There are at least 6 places in the shader needing this treatment, so “Find & Replace” becomes your friend here. I’ve had a number of hard-to-diagnose issues in this process before, caused by not copying exactly the same code to every pass.

//----------------------------------------------------------------
// Graph generated code // GBuffer
//----------------------------------------------------------------
#include "myInclude.hlsl"
//----------------------------------------------------------------
// End graph generated code // GBuffer
//----------------------------------------------------------------

I also like to add the name of the pass in each “block” of logic in this shader. It takes a little more time, but it makes navigation around this still rather large file a lot faster.

I’m now down from a ~12k-line shader file to a ~5k-line shader and a ~200-line include. Bonus points: I now mostly only need to edit the logic I care about in one place, rather than in each pass. I’ll cover some edge cases to this later.

Making it Human-Readable

With most of my code in a much friendlier context, it’s clear how template-y it is. Which makes sense, although I felt that Shader Forge and Amplify generated more human-readable code than this…

Possibly the best example of this is the multiply function.

void Unity_Multiply_float(float A, float B, out float Out)
{
    Out = A * B;
}

// And it's use
float _Multiply_6D0246C3_Out_2;
                     Unity_Multiply_float(_SampleTexture2DLOD_E9C752E2_R_5, 
                     1, 
                     _Multiply_6D0246C3_Out_2);

Feels pretty overkill, right? Well, the downside to a fairly generic template system is that everything fits the same pattern. It makes sense, but doesn’t make it any more readable.

Also worth noting, none of ShaderGraph’s functions have a return statement; they just write to the Out argument passed to the function. You have to follow this same format when you write your own custom code as well. It makes sense in the context, but it’s not my preferred style of writing HLSL.
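For example, a custom helper of my own, written to match that convention (the function itself is made up):

// Hypothetical depth-fade helper, following the out-parameter style
void MyDepthFade_float(float Depth, float FadeDistance, out float Out)
{
    Out = saturate(Depth / FadeDistance);
}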

Functions like “Multiply” and “Negate” I remove completely. Others which are a little more verbose, like “Remap”, I’ll keep, reformat to a more familiar pattern, and move to another “utils” include file, as they’re always handy to have across the project.

void Unity_Remap_float(float In, 
                       float2 InMinMax, 
                       float2 OutMinMax, 
                       out float Out)
 {
    Out = OutMinMax.x + (In - InMinMax.x) * 
          (OutMinMax.y - OutMinMax.x) / (InMinMax.y - InMinMax.x);
}

// Becomes ...

float Remap(float In, 
            float2 InMinMax, 
            float2 OutMinMax)
 {
    return OutMinMax.x + (In - InMinMax.x) * 
          (OutMinMax.y - OutMinMax.x) / (InMinMax.y - InMinMax.x);
}
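Calling the reformatted version then reads like any other HLSL (a trivial, made-up example):

// Remap a [-1, 1] noise value into the [0, 1] range
float height = Remap(noise, float2(-1.0, 1.0), float2(0.0, 1.0));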

Another big part of making this code readable is variable naming and assignment. Everything seems to get its own generic name which describes what node socket it was used in, rather than a human-understandable variable name. Once again, this makes sense given the context, but it’s not good for me going forward.

It’s also common to see patterns like a texture read into a single variable, that variable split into 4 new variables (one for each colour channel), and then only one of those channels used anywhere else.

I don’t want to create long and unreadable single-line statements in my shaders, but at the same time, a lot of 5-line sections of code can easily become 1 or 2 lines with little degradation in readability – often it’s even an improvement.
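A hypothetical before-and-after, using the kind of names the generator produces (_FoamMap and _FoamStrength are placeholders of mine):

// Generated:
float4 _SampleTexture2DLOD_E9C752E2_RGBA_0 = SAMPLE_TEXTURE2D_LOD(_FoamMap, sampler_FoamMap, uv, 0);
float _SampleTexture2DLOD_E9C752E2_R_5 = _SampleTexture2DLOD_E9C752E2_RGBA_0.r;
float _Multiply_6D0246C3_Out_2;
Unity_Multiply_float(_SampleTexture2DLOD_E9C752E2_R_5, _FoamStrength, _Multiply_6D0246C3_Out_2);

// Collapsed by hand:
float foam = SAMPLE_TEXTURE2D_LOD(_FoamMap, sampler_FoamMap, uv, 0).r * _FoamStrength;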

Digging into Specific Functionality

Now my shader is almost ready for continued development, I’m going to cover examples of adding functionality after generating the “template” shader.

In the previous post in the series, I briefly mentioned that ShaderGraph offers a lot of functionality, much of which relies on the SRP Core package and its rather large library of include files.

Take the Transform Node. If we wanted to add this functionality later, we can refer to the documentation on ShaderGraph’s nodes, where Unity has included a snippet of the code each node generates.

float3x3 tangentTransform_World = float3x3(
                                       IN.WorldSpaceTangent,
                                       IN.WorldSpaceBiTangent,
                                       IN.WorldSpaceNormal);

float3 _Transform_Out = TransformWorldToTangent(
                              TransformObjectToWorld(In),
                              tangentTransform_World);

Simple, right? Create a rotation matrix where the Tangent, BiTangent and Normal are each of the 3 axes, then send it to the function TransformWorldToTangent(), along with values from what appears to be a struct of some kind.

There are two elements from this you’re not necessarily going to have: the IN struct with the world-space versions of the surface axes, and the aforementioned function. How do we go about getting their equivalents outside of ShaderGraph?

Interpolator Structs

In my include file of code copied from the original shader, there’s a struct named SurfaceDescriptionInputs which is passed to the SurfaceDescriptionFunction as the IN argument. It turns out this is what we need.

// Pixel Graph Inputs
struct SurfaceDescriptionInputs
{
    float3 ObjectSpaceNormal; // optional
    float3 WorldSpaceNormal; // optional
    float3 TangentSpaceNormal; // optional
    float3 ObjectSpaceViewDirection; // optional
    float3 WorldSpaceViewDirection; // optional
    float3 ObjectSpacePosition; // optional
};

I can see WorldSpaceNormal is already there, but not the Tangent or BiTangent, so I add two new float3 properties. This will compile, however these values won’t have been set.
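So the struct in my include gains two lines (the member names mirror the sockets ShaderGraph uses elsewhere):

struct SurfaceDescriptionInputs
{
    float3 WorldSpaceNormal;    // optional - already present
    float3 WorldSpaceTangent;   // added by hand
    float3 WorldSpaceBiTangent; // added by hand
    // ... remaining optional members
};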

To get them set, we need to go back to our base shader, leaving behind the confines of our smaller, sleeker include. The population of SurfaceDescriptionInputs in the main shader happens in the “Interpolator Packing And Struct Declarations” block.

The place where the SurfaceDescriptionInputs struct is populated – this becomes our IN struct from the prior example.

I can again see output.WorldSpaceNormal being assigned; however, the Tangent and BiTangent lines are commented out.

output.WorldSpaceTangent =           input.tangentToWorld[0].xyz;
output.WorldSpaceBiTangent =         input.tangentToWorld[1].xyz;

After uncommenting those, I then check what’s being assigned to them. In this case, it’s rows from the tangentToWorld matrix coming in from the input struct. Scrolling up a little, I can see this input is of the type FragInputs. In this case, the tangentToWorld data was already available, as output.WorldSpaceNormal required it, so ShaderGraph had it ready.

This is quite a fiddly process, which requires a lot of back and forth, chasing around variable declarations and following where their usage leads.

Digging Around the “Include Jungle”

SRP Core and HDRP define a lot of helper functions and macros, similar to Unity’s Built-in RP. It’s just that there are a lot more of them, and they seem to be spread out further. For starters, some are declared in SRP Core, and some in HDRP itself.

The trick I found was to open both of these packages in VSCode (any text editor which can open folders will do), which allowed me to use the “Find in Files” function to search these directories for various definitions. Sadly, I’ve yet to discover an IDE which allows navigation of shaders and their includes in the same way as the likes of VS or Rider allow for navigation of C# solutions.

Continuing with the previous example, I now want to find where the function TransformWorldToTangent() lives. A few minutes of searching reveals that it’s contained within the ‘render-pipelines.core’ package, in SpaceTransforms.hlsl.

real3 TransformWorldToTangent(real3 dirWS, real3x3 tangentToWorld)
{
    // Note matrix is in row major convention with left 
    // multiplication as it is build on the fly
    float3 row0 = tangentToWorld[0];
    float3 row1 = tangentToWorld[1];
    float3 row2 = tangentToWorld[2];
    
    // these are the columns of the inverse matrix but scaled by 
    // the determinant
    float3 col0 = cross(row1, row2);
    float3 col1 = cross(row2, row0);
    float3 col2 = cross(row0, row1);
    
    float determinant = dot(row0, col0);
    float sgn = determinant<0.0 ? (-1.0) : 1.0;
    
    // inverse transposed but scaled by determinant
    // Will remove transpose part by using matrix as the first arg 
    // in the mul() below
    // this makes it the exact inverse of what 
    // TransformTangentToWorld() does.
    real3x3 matTBN_I_T = real3x3(col0, col1, col2);
    
    return SafeNormalize( sgn * mul(matTBN_I_T, dirWS) );
}

What is “real”?

The real type is an alias SRP Core replaces with a platform-appropriate precision: it becomes half on platforms where half precision is worthwhile (mostly mobile), and float everywhere else. The defines for it live in SRP Core’s Common.hlsl.
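Paraphrasing the relevant defines from SRP Core’s Common.hlsl:

// REAL_IS_HALF is set on platforms (mostly mobile) where half is worthwhile
#if REAL_IS_HALF
    #define real    half
    #define real3   half3
    #define real3x3 half3x3
#else
    #define real    float
    #define real3   float3
    #define real3x3 float3x3
#endif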

I don’t need to do anything with this function; it can be called just as it would have been from generated code (it’ll already be referenced in the spaghetti of includes somewhere). The point of tracking it down, however, is seeing what it does. Knowing how to do this is really important, as it helps when debugging.
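For completeness, calling it with the tangent frame we surfaced earlier looks like this (a sketch, reusing names from the Transform node snippet):

// TBN rows built from the SurfaceDescriptionInputs members we exposed;
// dirWS is whatever world-space direction you want in tangent space
float3x3 tangentTransform_World = float3x3(IN.WorldSpaceTangent,
                                           IN.WorldSpaceBiTangent,
                                           IN.WorldSpaceNormal);
float3 dirTS = TransformWorldToTangent(dirWS, tangentTransform_World);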

Adding Keyword Features

The last area I’m going to touch on here (for now at least) is adding features which are behind keyword definitions. This is another interesting part of this shader’s architecture, as it means it’s relatively straightforward to enable whole trees of logic – as you’d want to from ShaderGraph – with the compiler easily able to strip them out if they’re not needed.

These “features” can be seen in the “Variant Definitions” block of the template file, and mostly correspond to the options exposed to you when editing the Master node in ShaderGraph.

Say I want to enable refraction; there are 4 keywords to consider.

Refraction:                         #define _HAS_REFRACTION 1
RefractionBox:                      #define _REFRACTION_PLANE 1
RefractionSphere:                   #define _REFRACTION_SPHERE 1
RefractionThin:                     #define _REFRACTION_THIN 1

Enabling them all isn’t a valid use of these features. Looking at the documentation on Screen Space Refraction, I can see the bottom 3 keywords correspond to the refraction model, so each of these is likely mutually exclusive. The first keyword, _HAS_REFRACTION, is the switch for the whole system.

So if I want spherical refraction, I go back to my shader and, in each pass (remembering “Find & Replace”), uncomment the lines defining _HAS_REFRACTION and _REFRACTION_SPHERE. Now these keywords are being defined, I need to check where they are used. Thankfully there’s only one place in the template (but once per pass).

surfaceData.ior = surfaceDescription.RefractionIndex;
surfaceData.transmittanceColor = surfaceDescription.RefractionColor;
surfaceData.atDistance = surfaceDescription.RefractionDistance;

The BuildSurfaceData function has the code that copies the data I need from the SurfaceDescription to the SurfaceData struct, but it’s commented out, so again, I uncomment these lines.

The final thing left to do is assign some values to these new properties in my SurfaceDescriptionFunction in the new include file, and my shader will start rendering into the refraction pass.
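As a sketch – assuming the generated function builds a SurfaceDescription named surface, and with placeholder values of mine:

// Inside SurfaceDescriptionFunction, before the struct is returned:
surface.RefractionIndex = 1.33;                  // rough IOR of water (placeholder)
surface.RefractionColor = float3(1.0, 1.0, 1.0); // placeholder tint
surface.RefractionDistance = 1.0;                // placeholder absorption distance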

Had the shader been set to “Opaque” rather than “Transparent”, things would have been a little fiddlier, with a few more things to change. The best bet here is to generate a new shader out of ShaderGraph with the desired settings, compare that to your original shader, and use the differences as your guide.

Differences between opaque and transparent shaders out of ShaderGraph.

Aside from some different property values (_RenderQueueType, _SurfaceType, _ZWrite) and a different render queue (“Transparent+0” as opposed to “Geometry+0”), there are only 3 differences per pass, with one exception.

_BLENDMODE_PRESERVE_SPECULAR_LIGHTING 1 is defined for the transparent version, but is never accessed in the generated shader itself. It is, however, used in “Material.hlsl” in the HDRP package, with the accompanying comment:

_BLENDMODE_PRESERVE_SPECULAR_LIGHTING for correct lighting when blend mode are use with a Lit material

The AlphaFog and BlendMode.PreserveSpecular comments appear in the transparent shader’s “Graph Defines” block, and as far as I can tell serve no purpose in the final shader.

The pass-specific changes I’ve got here are both in the Forward pass. The first: RAYTRACING_SHADER_GRAPH_HIGH is defined for the transparent version – I’ve not yet found a reference to it in the SRP Core or HDRP packages, but as I’d already mentioned, I’m not using raytracing for now.

// Transparent
#define USE_CLUSTERED_LIGHTLIST
// Opaque
#pragma multi_compile USE_FPTL_LIGHTLIST USE_CLUSTERED_LIGHTLIST

The other change is a difference in some of the lighting. The transparent version of the shader only uses the clustered light list, whereas the opaque version’s multi_compile seems to give the shader the choice between the clustered and FPTL lists.

These two keywords are related to HDRP’s lighting pipeline, referred to as the “LightLoop”, and their usage can be found in HDRP’s “LightLoopDef.hlsl”. I think the transparent version only has access to the clustered variation because transparent renderers are rendered using clustered lighting, whereas opaque renderers typically use the tile renderer where possible (more on this).

Now, this was quite a specific example of “adding” features to converted HLSL shaders, but the general approach holds. There’s a lot of manual work, trial, error, and digging through HDRP source. It would be great if this wasn’t necessary; however, it’s a great way to learn how HDRP works under the hood.

Next Time

Next time I’m back to regular programming, and I’m going to be looking at how I add external forces to the fluid simulation.

Look at that little guy go