
Subsurface Scattering spherical harmonics – pt 2

March 22, 2017

 

This is my 2nd blog post on using spherical harmonics for depth-based lighting effects in Unreal 4.

The first blog post focused on generating the spherical harmonics data in Houdini, this post focuses on the Unreal 4 side of things.

I’m going to avoid posting much code here, but I will try to provide enough information to be useful if you choose to do similar things.

SH data to base pass

The goal was to look up the depth of the object from each light in my scene, and see if I could do something neat with it.

In UE4 deferred rendering, that means that I need to pass my 16 coefficients from the material editor -> base pass pixel shader -> the lighting pass.

First up, I read the first two SH coefficients out of the red and green vertex colour channels, and the rest out of my UV sets (remembering that I kept the default UV set 0 for actual UVs):

SHBaseMatUVs

Vertex colour complications

You’ll notice a nice little hardcoded multiplier up there… This was one of the annoyances with using vertex colours: I needed to scale the value of the coefficients in Houdini to 0-1, because vertex colours are 0-1.

This is different to the normalization part I mentioned in the last blog post, which was scaling the depth values before encoding them in SH. Here, I’m scaling the actual computed coefficients. I only need to do this with the vertex colours, not the UV data, since UVs aren’t restricted to 0-1.

The 4.6 was just a value that worked, using my amazing scientific approach of “calculate SH values for half a dozen models of 1,000 – 10,000 vertices, find out how high and low the final SH values go, divide through by that number + 0.1”. You’d be smarter to use actual math to find the maximum range of coefficients for normalized data sets, though… It’s probably something awesome like 0 -> 1.5π.

Material input pins

Anyway, those values just plug into the SH Depth Coeff pins, and we’re done!!

Unreal 4 SH depth material

Ok.
That was a lie.
Those pins don’t normally exist… And neither does this shading model:

SHDepthShadingModel

So, that brings me to…

C++ / shader side note

To work out how to add a shading model, I searched the source code for a different shading model (hair, I think), copied and pasted just about everything, and then went through a process of elimination until things worked.
I took very much the same approach to the shader side of things.

This is why I’m a Tech Artist, and not a programmer… Well, one of many reasons 😉
Seriously though, being able to do this is one of the really nice things about having access to engine source code!

The programming side of this project was a bunch of very simple changes across a wide range of engine source files, so I’m not going to post much of it:

P4Lose

There is an awful lot of this code that really should be data instead. But Epic gave me an awesome engine and lets me mess around with source code, so I’m not going to complain too much 😛

Material pins (continued…)

So I added material inputs for the coefficients, plus some absorption parameters.

Sh coeffs

The SH Coeffs material pins are new ones, so I had to make a bunch of changes to material engine source files to make that happen.
Be careful when doing this: Consistent ordering of variables matters in many of these files. I found that out the easy way: Epic put comments in the code about it 🙂

Each of the SH coeffs material inputs is a vector with 4 components, so I need 4 of these to send my 16 coefficients through to the base pass.

Custom data (absorption)

The absorption pins you might have noticed from my material screenshot are passed as “custom data”.
Some of the existing lighting models (subsurface, etc) pass additional data to the base pass (and also through to lighting, but more on that later).

These “custom data” pins can be renamed for different shading models. So you can use these if you’d rather not go crazy adding new pins, and you’re happy with passing through just two extra float values.
Have a look at MaterialGraph.cpp, and GetCustomDataPinName if that sounds like a fun time 🙂

Base pass to lighting

At this point, I’d modified enough code that I could start reading and using my SH values in the base pass.

A good way to test whether the data was valid was to use the camera vector to look up the SH depth values. I knew things were working when I got similar results to what I was seeing in Houdini when using the same approach:

BasePassDebug

That’s looking at “Base Color” in the buffer visualizations.
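In case it’s useful, the lookup itself is just evaluating the 16 basis functions in a given direction and dotting the result against the coefficients. My actual function was adapted from Matt Ebb’s VOP code (see part 1), so treat the following as a rough sketch rather than the real thing: it assumes the standard real SH basis and that the coefficients arrive as a float4x4 in the same order they were encoded (EvalSHDepth4Band and SHDepthCoeffs are illustrative names, not engine ones):

float EvalSHDepth4Band(float4x4 Coeffs, float3 Dir)
{
	float x = Dir.x, y = Dir.y, z = Dir.z;

	// band 0 and band 1
	float4 b0b1 = float4(0.282095f,
	                     0.488603f * y,
	                     0.488603f * z,
	                     0.488603f * x);
	// band 2 (first four terms)
	float4 b2   = float4(1.092548f * x * y,
	                     1.092548f * y * z,
	                     0.315392f * (3.0f * z * z - 1.0f),
	                     1.092548f * x * z);
	// band 2 (last term) and band 3 (first three terms)
	float4 b2b3 = float4(0.546274f * (x * x - y * y),
	                     0.590044f * y * (3.0f * x * x - y * y),
	                     2.890611f * x * y * z,
	                     0.457046f * y * (5.0f * z * z - 1.0f));
	// band 3 (remaining four terms)
	float4 b3   = float4(0.373176f * z * (5.0f * z * z - 3.0f),
	                     0.457046f * x * (5.0f * z * z - 1.0f),
	                     1.445306f * z * (x * x - y * y),
	                     0.590044f * x * (x * x - 3.0f * y * y));

	return dot(Coeffs[0], b0b1) + dot(Coeffs[1], b2) + dot(Coeffs[2], b2b3) + dot(Coeffs[3], b3);
}

// debug: evaluating along the camera vector should give a rough z-buffer style view of the object
// (whether it needs negating depends on how the directions were encoded in Houdini)
// float DebugDepth = EvalSHDepth4Band(SHDepthCoeffs, CameraVector);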

I don’t actually want to do anything with the SH data in the base pass, though, so the next step is to pass the SH data through to the lighting pass.

Crowded Gbuffer

You can have a giant parameter party, and read all sorts of fun data in the base pass.
However, if you want to do per-light stuff, at some point you need to write all that data into a handful of full-screen buffers that the lighting pass uses. By the time you get to lighting, you don’t have per-object data, just those full-screen buffers and your lights.

These gbuffers are lovingly named GBufferA, GBufferB, GBuffer… You get the picture.

You can visualize them in the editor by using the various buffer visualizers, or explicitly using the “vis” command, e.g. “vis gbuffera”:

visGbuffers

There are some other buffers being used (velocity, etc), but these are the ones I care about for now.

I need to pass an extra 16 float values through to lighting, so surely I could just add 4 new gbuffers?

Apparently not: the limit for simultaneous render targets is 8 🙂

I started out by creating 2 new render targets, so that covers half of my SH values, but what to do with the other 8 values?

Attempt 1 – Packing it up

To get this working, there were things that I could sacrifice from the above existing buffers to store my own data.

For example, I rarely use Specular these days, aside from occasionally setting it to a constant, so I could use that for one of my SH values, and just hard code Specular to 1 in my lighting pass.

With this in mind, I overwrote all the things I didn’t think I cared about for stylized translucent meshes:

  • Static lighting
  • Metallic
  • Specular
  • Distance field anything (I think)

Attempt 2 – Go wide!

This wasn’t really ideal. I wasn’t very happy about losing static lighting.

That was about when I realized that although I couldn’t add any more simultaneous render targets, I could change the format of them!

The standard g-buffers are 8 bits per channel by default. By going to 16 bits per channel, I could pack two SH values into each channel, and store all my SH data in my two new g-buffers without needing to overwrite other buffers!

Well, I actually went with PF_A32B32G32R32F, so 32 bits per channel because I’m greedy.
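The packing itself can be as simple as cramming two half-precision floats into each 32-bit channel. This is just a sketch of the idea rather than my exact code (the function names are made up), assuming a float target like PF_A32B32G32R32F where the raw bit pattern survives the trip through the render target:

float PackTwoSHCoeffs(float A, float B)
{
	// store A in the low 16 bits and B in the high 16 bits, as half floats
	uint Packed = f32tof16(A) | (f32tof16(B) << 16);
	return asfloat(Packed);
}

void UnpackTwoSHCoeffs(float Channel, out float A, out float B)
{
	// reverse of the above, on the lighting side
	uint Packed = asuint(Channel);
	A = f16tof32(Packed & 0xFFFF);
	B = f16tof32(Packed >> 16);
}

Two render targets, four channels each, two coefficients per channel: that covers all 16.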

It’s probably worth passing out in horror at the cost of all this at this point: two 128-bit buffers is something like 250MB of data. I’m going to talk about this a little later 🙂

Debugging, again

I created a few different low-complexity procedural test assets in Houdini as test cases, including one where I deleted all but one polygon as a final step, so that I could very accurately debug the SH values 🙂

On top of that, I had a hard coded matrix in the shaders that I could use to check, component by component, that I was getting what I expected when passing data from the base pass to lighting, with packing/unpacking, etc:

const static float4x4 shDebugValues = 
{
	0.1, 0.2, 0.3, 0.4,
	0.5, 0.6, 0.7, 0.8,
	0.9, 1.0, 1.1, 1.2,
	1.3, 1.4, 1.5, 1.6
};

It seems like an obvious and silly thing to point out, but it saved me some time 🙂
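For example, once the matrix above is written out from the base pass in place of the real coefficients, something along these lines in the lighting pass (with made-up names, not the actual engine variables) makes it really obvious when a component goes missing:

float3 DebugSHRoundTrip(float4x4 UnpackedCoeffs)
{
	// black means the base pass -> gbuffer -> lighting trip is lossless;
	// anything visible means a coefficient was dropped, swapped or truncated somewhere
	float Error = 0.0f;
	for (int Row = 0; Row < 4; ++Row)
	{
		float4 Diff = abs(UnpackedCoeffs[Row] - shDebugValues[Row]);
		Error += Diff.x + Diff.y + Diff.z + Diff.w;
	}
	return Error.xxx;
}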

Here are some of my beautiful procedural test assets (one you might recognize from the video at the start of the post):

Houdini procedural test asset (rock thing), testobject3, testobject2, testobject1

“PB-nah”, the lazy guide to not getting the most out of my data

Ok, SH data is going through to the lighting pass now!

This is where a really clever graphics programmer could use it for some physically accurate lighting work, proper translucency, etc.

To be honest, I was pleasantly surprised that anything was working at this stage, so I threw in a very un-pbr scattering, and called it a day! 🙂

float3 SubsurfaceSHDepth( FGBufferData GBuffer, float3 L, float3 V, half3 N )
{
	// absorption parameters arrive as the two "custom data" floats from the material
	float AbsorptionDistance 	= GBuffer.CustomData.x;
	float AbsorptionPower 		= lerp(4.0f, 16.0f, GBuffer.CustomData.y);

	// SH depth through the object from this pixel towards the light
	float DepthFromPixelToLight 	= Get4BandSH(GBuffer.SHCoeffs, L);
	float absorptionClampedDepth 	= saturate(1.0f / AbsorptionDistance * DepthFromPixelToLight);
	float SSSWrap 			= 0.3f;
	// wrapped N.L, so faces pointing slightly away from the light still pick up some scatter
	float frontFaceFalloff 		= pow(saturate(dot(-N, L) + SSSWrap), 2);

	// thicker object (relative to the absorption distance) = less light makes it through
	float Transmittance 		= pow(1 - absorptionClampedDepth, AbsorptionPower);

	Transmittance *= frontFaceFalloff;

	return Transmittance * GBuffer.BaseColor;
}
It’s non-view-dependent scattering, using the SH depth through the model towards the light, then dampened by the absorption distance.
The effect falls off by face angle away from the light, but I put a wrap factor on that because I like the way it looks.
For all the work I’ve put into this project, probably the least of it went into the actual lighting model, so I’m pretty likely to change that code quite a lot 🙂
What I like about this is that the scattering stays fairly consistent around the model from different angles:
GlowyBitFront, GlowyBitSide
So as horrible and inaccurate and not PBR as this is, it matches what I see in SSS renders in Modo a little better than what I get from standard UE4 SSS.

The End?

Broken things

  • I can’t rotate my translucent models at the moment 😛
  • Shadows don’t really interact with my model properly

I can hopefully solve both of these things fairly easily (store data in tangent space, look at shadowing in other SSS models in UE4), I just need to find the time.
I could actually rotate the SH data, but apparently that’s hundreds of instructions 🙂

Cost and performance

  • 8 uv channels
  • 2 * 128 bit buffers

Not really ideal from a memory point of view.

The obvious optimization here is to drop down to 3-band spherical harmonics.
The quality probably wouldn’t suffer, and that’s 9 coefficients rather than 16, so I could pack them into one of my 128-bit gbuffers instead of two (with one spare coefficient left over that I’d have to figure out).

That would help kill some UV channels, too.

Also, using 32 bits per channel (so 16 bits per SH coefficient) is probably overkill. I could swap over to a uint buffer with 16 bits per channel, pack two coefficients per channel at 8 bits each, and that would halve the memory usage again.

As for performance, presumably evaluating 3-band spherical harmonics would be cheaper than 4-band. Well, especially because then I could swap to using the optimized UE4 functions that already exist for 3-band SH 🙂

Render… Differently?

To get away from needing extra buffers and having a constant overhead, I probably should have tried out the new Forward+ renderer:

https://docs.unrealengine.com/latest/INT/Engine/Performance/ForwardRenderer/

Since you have access to per-object data, presumably passing around SH coefficients would also be less painful.
Rendering is not really my strong point, but my buddy Ben Millwood has been nagging me about Forward+ rendering for years (he’s writing his own renderer http://www.lived3d.com/).

There are other alternatives to deferred, or hybrid deferred approaches (like Doom 2016’s clustered forward, or Wolfgang Engel’s culled visibility buffers) that might have made this easier too.
I very much look forward to the impending not-entirely-deferred future 🙂

Conclusion

I learnt some things about Houdini and UE4, job done!

Not sure if I’ll keep working on this at all, but it might be fun to at least fix the bugs.

 


Subsurface Scattering spherical harmonics – pt 1

March 17, 2017

In this post, I’ll be presenting “SSSSH”, which will be the sound made by any real programmer who happens to accidentally read this…

This has been a side project of mine for the last month or so with a few goals:

  • Play around more with Houdini (I keep paying for it, I should use it more because it’s great)
  • Add more gbuffers to UE4, because that sounds like a useful thing to be able to do and understand.
  • Play around with spherical harmonics (as a black box) to understand the range and limitations of the technique a bit better.
  • Maybe accidentally make something that looks cool.

Spherical harmonics

I won’t go too much into the details on spherical harmonics because:
a) There’s lots of good sites out there explaining them and
b) I haven’t taken the time to understand the math, so I really don’t know how it works, and I’m sort of ok with that for now 😛

But at my basic understanding level, spherical harmonics is a way of representing data using a set of functions that take spherical coordinates as an input, and return a value. Instead of directly storing the data (lighting, depth, whatever), you work out a best fit of these functions to your data, and store the coefficients of the functions.

Here is a very accurate diagram:

DataSphere

You’re welcome!
Feel free to reuse that amazing diagram.

SH is good for data that varies rather smoothly, so it tends to be used for ambient/bounced lighting in a lot of engines.

The function series is infinite, so you can decide how many terms you want to use, which determines how many coefficients you store.

For this blog post, I decided to go with 4-band spherical harmonics, because I’m greedy and irresponsible.
That’s 16 float values (the number of coefficients is the band count squared, so 4 bands = 16).

Houdini SH

Thanks to the great work of Matt Ebb, a great deal of work was already done for me:

http://mattebb.com/weblog/spherical-harmonics-in-vops/

I had to do a bit of fiddling to get things working in Houdini 15, but that was a good thing to do anyway, every bit of learning helps!

What I used from Matt were two nodes for reading and writing SH data given the Theta and Phi (polar and azimuthal) angles:

SHFunctions

Not only that, but I was able to take the evaluate code and adapt it to shader code in UE4, which saved me a bunch of time there too.

It’s not designed to be used that way, so I’m sure that it isn’t amazingly efficient. If I decide to actually keep any of this work, I’ll drop down to 3 band SH and use the provided UE4 functions 🙂

Depth tracing in Houdini

I’m not going to go through every part of the Houdini networks, just the meat of it, but here’s what the main network looks like:

NetworkOverview

So all the stuff on the left is for rendering SH coefficients out to textures (more on that later), the middle section is where the work is done, and the right-hand side is a handful of debug visualizers, including some from the previously mentioned Matt Ebb post.

Hits and misses

I’m doing this in SOPs (geometry operations), because it’s what I know best in Houdini at the moment, as a Houdini noob 🙂
I should try moving it to SHOPs (materials/per-pixel) at some point, if that is at all possible.

To cheat, if I need more per-pixel like data, I usually just subdivide my meshes like crazy, and then just do geometry processing anyway 😛

The basic functionality is:

  • For each vertex in the source object:
    • Fire a ray in every direction
    • Collect every hit
    • Store the distance to the furthest away primitive that is facing away from the vertex normal (so back face, essentially; the sketch just after this list shows that selection step)
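The Houdini side of this is all VOP nodes rather than code, but just to illustrate that last selection step, here’s roughly what it boils down to (written in the same HLSL style as the UE4 post above, purely as a sketch; the array size and names are made up):

float FurthestBackfaceHit(float3 VertexNormal, int NumHits, float HitDist[64], float3 HitNormal[64])
{
	// keep the furthest hit whose primitive normal faces away from the vertex normal,
	// i.e. the ray left the object through a back face; 0 gets treated as a miss later on
	float Furthest = 0.0f;
	for (int i = 0; i < NumHits; ++i)
	{
		if (dot(HitNormal[i], VertexNormal) < 0.0f && HitDist[i] > Furthest)
		{
			Furthest = HitDist[i];
		}
	}
	return Furthest;
}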

All the hits are stored in an array, along with the Phi and Theta angles I mentioned before. Here’s what that intersection network looks like currently:

IntersectAll

I’m also keeping track of the maximum hit length, which I will use later to normalize the depth data. The max length is tracked one level up from the getMaxIntersect network from the previous screenshot:

GenerateHits

This method currently doesn’t work very well with objects with lots of gaps in them, because the gaps in the middle of an object will essentially absorb light when they shouldn’t.
It wouldn’t be hard to fix, I just haven’t taken the time yet.

Normalizing

Before encoding the depth values into SH, I wanted to move them all into the 0-1 range, since there are various other places where having 0-1 values makes my life easier later.

One interesting thing that came up here: when tracing rays out from a point, there are always more rays that miss than hit.

That’s because surfaces are more likely to be convex than concave, so at least half of the rays are pointing out into space:

FurryPlane

Realistically, I don’t really care about spherical data; I probably want to store hemispherical data around the inverse normal.
That might cause data problems in severely concave areas of the mesh, but I don’t think it would be too big a problem.
There are hemispherical basis functions that could be used for that, if I were a bit more math savvy:

A Novel Hemispherical Basis for Accurate and Efficient Rendering

Anyway, having lots of values shooting out to infinity (max hit length) was skewing all of the SH values, and I was losing a lot of accuracy, so I encoded misses as zero-length data instead.

Debug fun times!

So now, in theory, I have a representation of object thickness for every vertex in my mesh!

One fun way to debug it (in Houdini) was to read the SH values using the camera forward vector, which basically should give me depth from the camera (like a z buffer):

SHDepth

And, in a different debug mode that Matt Ebb had in his work, each vertex gets a sphere copied onto it, and the sphere is displaced in every direction by the SH value on the corresponding vertex:

vortigauntBalloons

vortigauntBalloons2

This gives a good visual indicator on how deep the object is in every direction, and was super useful once I got used to what I was looking at 🙂

And, just for fun, here is a shot from a point where I was doing something really wrong:

vortigauntClicker

Exporting the data

My plans for this were always to bake out the SH data into textures, partially just because I was curious what sort of variation I’d get out of it (I had planned to use displacement maps on the mesh in Houdini to vary the height).

SHImages
And yes, that’s 4 images worth of SH data, best imported as HDR.
But hey, I like being a bit over the top with my home projects…

One of my very clever workmates, James Sharpe, had the good suggestion of packing the coeffs into UV data as I was whining to him over lunch about the lack of multiple vertex color set support in UE4.
So I decided to run with UVs, and then move back to image-based data once I was sure everything was working 🙂

PixelVSVertex

Which worked great, and as you can probably see from the shot above, per-vertex (UVs or otherwise) is perfectly adequate 🙂

Actually, I ended up putting coefficients 1-14 into UVs (two per UV set, which is where the 8 UV channels mentioned above come from), and the last two into the red and green vertex color channels, so that I could keep a proper UV set in the first channel to use for textures.

And then, all the work…

Next blog post coming soon!

In it, I will discuss all the UE4 work, the things I should have done (or done better) and might do in the future, and show a few more test shots and scenes from UE4!

To be continued!!