A high level shader construction syntax – Part II

This post has also been published on the blog of Coherent Labs – the company I co-founded and work for.

Enhanced shader syntax

As explained in A high level shader construction syntax – Part I, the proposed shader syntax is an extension of HLSL SM4 and SM5. It is simple enough to be parsed with custom regex-based code.

One of the requirements I had when designing the syntax was that vanilla HLSL shader code going through the translator should remain unchanged.

Usually a shader is a mini-pipeline with predefined steps that vary only slightly. Take for instance the pixel shader that populates a GBuffer with per-pixel depth and normal. It has three distinct steps – take the depth of the pixel, take the normal of the pixel and output both. Now comes the per-material branching: some materials might have normal maps while others use the interpolated normals from the vertices. However, the shader still just has to complete those three steps – it makes no difference how you obtain the normal.

Nearly all shader code can be decomposed into such simple steps. This led me to the idea of what I call ‘polymorphics‘. They are placeholders for functions that perform a specific operation (e.g. fetch the normal) and can be varied per material.

The code for a simple GBuffer pixel shader could look like this:

polymorphic MakeDepth
{
	CalculateProjectionDepth
}

polymorphic GetWorldNormal
{
	NormalFromInput,
	NormalFromMap
}

pixel_shader float4 PS(PS_INPUT input) : SV_Target needs VERTEX_COLOR
{
	context.depth_p = MakeDepth();

	context.normal_w = GetWorldNormal();

	return float4(context.normal_w.x
		    , context.normal_w.y
		    , context.normal_w.z
		    , context.depth_p);
}

The keyword ‘pixel_shader‘ is required so that the translator knows the type of function it is working on. We declare two polymorphics – MakeDepth and GetWorldNormal – along with the functions (called ‘atoms‘) that can substitute for them.

If the material has a normal map, after the translation process this shader looks like this:

//texture inputs
Texture2D map_normal : register(t0);
//sampler inputs
SamplerState sampler_point : register(s0);
//input
struct PS_INPUT {
	float4 Position : SV_POSITION;
	float3 binormal : TEXCOORD0;
	float3 normal_o : TEXCOORD1;
	float3 normal_t : TEXCOORD2;
	float4 projposition : TEXCOORD3;
	float3 tangent : TEXCOORD4;
	float2 uv : TEXCOORD5;
	float3 vertex_color : TEXCOORD6;
};

float4 PS(PS_INPUT input) : SV_Target
{

	struct {
		float3 binormal;
		float depth_p;
		float3 normal_o;
		float3 normal_t;
		float3 normal_w;
		float4 projposition;
		float3 tangent;
		float3x3 tbn;
		float2 uv;
		float3 vertex_color;
	} context;

	//context population
	context.binormal = input.binormal;
	context.normal_o = input.normal_o;
	context.normal_t = input.normal_t;
	context.projposition = input.projposition;
	context.tangent = input.tangent;
	context.uv = input.uv;
	context.vertex_color = input.vertex_color;

{ // CalculateProjectionDepth
	context.depth_p = context.projposition.z / context.projposition.w;
}

{ // ComputeTBN
	context.tbn = float3x3(context.tangent, context.binormal, context.normal_t);
}

{ // NormalFromMap
	float3 normal = map_normal.Sample(sampler_point, context.uv);
	normal = mul(normal, context.tbn);
	normal = normalize(context.normal_o + normal);

	context.normal_w = normal;
}

	return float4(context.normal_w.x
		    , context.normal_w.y
		    , context.normal_w.z
		    , context.depth_p);
}

There is much more code generated by the translator – the polymorphics have been substituted by ‘atoms’, i.e. functions that perform the required task. “NormalFromMap” is an atom that fetches the normal vector from a map; “NormalFromInput” fetches it as a value interpolated from the vertices. If the material whose shader we want to create has no normal map, we simply tell the translator to use “NormalFromInput” for the polymorphic “GetWorldNormal“.
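
For comparison, if we pick “NormalFromInput”, the corresponding expanded block would look roughly like this. This is only a sketch – it assumes the World matrix is available as a global constant, as in the atom definition shown below; the normal map, sampler and TBN-related members would then never appear in the context or in PS_INPUT:

{ // NormalFromInput
	context.normal_w = mul(context.normal_o, World);
}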

All these atoms are defined elsewhere and could form an entire library. They look like this:

atom NORMAL_W NormalFromInput(interface context) needs NORMAL_O
{
	return mul(context.normal_o, World);
}

atom NORMAL_W NormalFromMap(interface context) needs NORMAL_O, UV, TBN, MAP_NORMAL, SAMPLER_POINT
{
	float3 normal = map_normal.Sample(sampler_point, context.uv);
	normal = mul(normal, context.tbn);
	normal = normalize(context.normal_o + normal);

	return normal;
}

There are many new keywords here. The all-caps words are called ‘semantics‘. They are declared in a dedicated file and specify the type behind the placeholder name as well as an HLSL semantic name to use in case they have to be interpolated between shading stages or arrive as vertex shader input. Semantics are essentially variables that the shader translation system knows about.

A sample semantic file looks like this:

void VOID : void;

float3x3 TBN : TEXCOORD;

float3 NORMAL_T : NORMAL;
float3 NORMAL_O : NORMAL;
float3 NORMAL_W : NORMAL;
float3 TANGENT : TANGENT;
float3 BINORMAL : BINORMAL;

float2 UV : TEXCOORD;

float DEPTH_P : DEPTH;

float3 VERTEX_COLOR : VERTEXCOLOR;

float4 PROJPOSITION : PROJPOSITION;

float4 COLOR : COLOR;
float  ALPHA : TEXCOORD;
float3 ALBEDO : TEXCOORD;
float3 SPECULAR_COLOR : TEXCOORD;

Of course, if we just substituted parts of the code with snippets we’d be in trouble, as different atoms require different data to work with. If we use the “NormalFromMap” atom we need a normal map, a sampler and uv coordinates. If we use “NormalFromInput” we just need a normal vector as shader input. All functions with an input – that is, atoms and the vertex/pixel shader main functions – have a ‘needs‘ clause where all semantics required for the computation are enumerated.

The declaration/definition (they are the same) of a sample atom is as follows:

atom NORMAL_W NormalFromMap(interface context) needs NORMAL_O, UV, TBN, MAP_NORMAL, SAMPLER_POINT

‘atom’ is required to flag the function, followed by the return semantic and the name of the atom. ‘interface context’ is also required. Atoms are not substituted by function calls but are inlined in the shader code – to avoid name clashes with vanilla code that does not depend on the translation system, all computed semantics (variables) are put in a special structure called ‘context‘. In the atom declaration the keyword interface is reserved for possible future use. Strictly speaking, ‘interface context’ is currently not needed, but it makes the atom resemble a real function and reminds you that all input comes from the context. After the closing parenthesis there is an optional ‘needs’ clause after which all required semantics are enumerated.

Sometimes the needed semantics are straightforward to procure – for instance, if a normal map is required, the system simply declares a texture variable before the shader main code. However some computations are more involved – like computing the TBN matrix. This is where the third type of resource used in the translation process comes in – ‘combinators‘.

When the translator encounters a needed semantic, it first checks whether it has already been computed and stored in the context (remember, all data lives in the context). If it is a new semantic, the translator checks all combinators for one that can calculate it. Combinators, like atoms, are functions – their declarations are almost identical to those of atoms:

combinator TBN ComputeTBN(interface context) needs TANGENT, BINORMAL, NORMAL_T
{
	return float3x3(context.tangent, context.binormal, context.normal_t);
}

The only difference is the keyword ‘combinator’ instead of ‘atom’. Combinators encapsulate code that computes a complicated semantic from simpler ones.
If no combinator is found for a needed semantic, it is assumed to come as an interpolant or as vertex shader input. The search for needed semantics is performed recursively, so combinators can depend on other combinators, as the sketch below shows.
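
For illustration, a combinator that needs the output of another combinator could look like this. It is only a sketch – the TBN_W semantic is hypothetical (it would have to be declared in the semantic file) and the World matrix is assumed to be available as a global constant. Requesting TBN_W would automatically pull in ComputeTBN as well:

combinator TBN_W ComputeWorldTBN(interface context) needs TBN
{
	// transform the tangent-space basis into world space
	return mul(context.tbn, (float3x3)World);
}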

To recap, the building blocks of the shader translation process are:

  • semantics
  • atoms
  • combinators

While it might seem complicated at first, the system simplifies shader authoring a great deal. The shaders themselves become much more readable, with no per-material-type branches in their logic – so no #ifdefs. An atom and combinator library is trivial to build after writing a few shaders – later on the operations simply get reused. The translation process guarantees that only the needed data is computed, interpolated or required as vertex input. The ‘context’ structure used to hold the data incurs no performance penalty, as it is easily handled by the HLSL compiler. For convenience, expanded atoms and combinators are flagged with comments in the output HLSL code and enclosed in scopes to avoid name clashes between local variables.

In the next post I’ll explain some compile-time conditions supported by the translator as well as how the translation process works.

A high level shader construction syntax – Part I

This post has also been published on the blog of Coherent Labs – the company I co-founded and work for.

Shader construction

A challenge in modern graphics programming is the management of complicated shaders. The huge number of materials, lights and assorted conditions leads to a combinatorial explosion of shader code paths.

Many techniques have been developed to cope with this problem.

Some engines, like Unreal, have taken the path led by 3D modelling applications and allow designers to ‘compose’ shaders from pre-created nodes that they link into shade trees. An extensive description of the technique can be found in the paper “Abstract Shade Trees” by McGuire et al. With this approach, however, the “Material editor” of the application usually has to be some sort of tree editor. Shaders generated this way might have performance issues if the designer isn’t careful, but of course this is the approach that gives the artist the most freedom.

Another technique is building shaders on the fly from C++ code, as shown in “Shader Metaprogramming” by McCool et al. I’ve never tried such a shader-definition approach, although I find it very compelling, mostly because of its technical implementation. You’d have to rebuild and relink C++ code on the fly to allow for interactive iteration when developing or debugging, which is not very difficult to achieve but seems a bit awkward to me. The gains in code portability, however, should not be underestimated.

Über-shaders and SuperShaders usually build upon the preprocessor and enable/disable parts of the code via defines. The major drawback is that the ‘main’ shader in the end always becomes a giant unreadable mess of #ifdefs that is particularly unpleasant to debug.

A small variant of the SuperShader approach is to use ‘static const’ variables injected by the native code and plain ‘if’s on them in the shader. All compilers I’ve seen are smart enough to compile out the branching, so the static const variables essentially work as preprocessor macros, with the added bonus that an if looks better than an #ifdef and the code is a bit easier to read. For complex code, however, all the SuperShader problems remain.
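
A minimal sketch of that variant (the names and the normal-map decode here are illustrative, not taken from any particular engine):

// injected per material by the native code before compilation
static const bool HAS_NORMAL_MAP = true;

Texture2D map_normal : register(t0);
SamplerState sampler_point : register(s0);

struct PS_INPUT
{
	float4 Position : SV_POSITION;
	float3 normal_w : TEXCOORD0;
	float3x3 tbn    : TEXCOORD1; // occupies TEXCOORD1..3
	float2 uv       : TEXCOORD4;
};

float4 PS(PS_INPUT input) : SV_Target
{
	float3 normal = normalize(input.normal_w);
	if (HAS_NORMAL_MAP) // resolved at compile time, the branch is compiled out
	{
		float3 n = map_normal.Sample(sampler_point, input.uv).xyz * 2 - 1;
		normal = normalize(mul(n, input.tbn));
	}
	return float4(normal, 1);
}
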
Dynamic shader linking, introduced in Shader Model 5, lets you have interfaces and a form of virtual method calls in your shaders and allows for very elegant code.
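
A rough sketch of what that looks like in SM5 follows (the interface and class names are illustrative; the concrete class instance is selected by the application when binding the shader):

interface INormalSource
{
	float3 GetNormal(float2 uv, float3 normal_w, float3x3 tbn);
};

Texture2D map_normal : register(t0);
SamplerState sampler_point : register(s0);

class NormalFromInput : INormalSource
{
	float3 GetNormal(float2 uv, float3 normal_w, float3x3 tbn)
	{
		return normalize(normal_w);
	}
};

class NormalFromMap : INormalSource
{
	float3 GetNormal(float2 uv, float3 normal_w, float3x3 tbn)
	{
		float3 n = map_normal.Sample(sampler_point, uv).xyz * 2 - 1;
		return normalize(mul(n, tbn));
	}
};

// bound to one of the classes above from the native code
INormalSource g_normalSource;
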
I’d like to share an idea and a sample implementation of an enhanced syntax over HLSL SM4 and SM5. It is heavily influenced by the idea of dynamic linking, ASTs and “Automated Combination of Real-Time Shader Programs”, with some additional features, and was originally developed in order to support DirectX 10+-level hardware. Although the sample application works only on SM4 and SM5, it could relatively easily be ported to any modern shading language. On SM5 you could just use the built-in dynamic linkage feature.

In essence the program translates the ‘enhanced’ shader to plain HLSL. The translator works like a preprocessor, so no AST is built from the code.

In the following posts I’ll explain the syntax and what I tried to achieve with it as well as the implementation of the translator.