This section introduces the technicalities of rendering optimization. It shows how to bake lighting results for better performance, and how the developers of Shadowgun levered high-contrast textures, with lighting baked-in, to make their game look great. If you are looking for general information on what a mobile-optimized game looks like, check out the Graphics Methods page.
Sometimes optimizing the rendering in your game requires some dirty work. All of the structure that Unity provides makes it easy to get something working fast, but if you require top notch fidelity on limited hardware, doing things yourself and sidestepping that structure is the way to go, provided that you can introduce a key structural change that makes things a lot faster. Your tools of choice are editor scripts, simple shaders, and good old-fashioned art production.
Note for Unity Indie users: The editor scripts referenced here use RenderTextures to make production smooth, so they wont work for you right away, but the principles behind them work with screenshotting as well, so nothing is stopping you from using these techniques for a few texture bakes of your own.
First of all, check out this introduction to how shaders are written.
#pragma debug
)
#pragma debug
to the top of your surface shader, when you open the compiled shader via the inspector, you can see the intermediate CG code. This is useful for inspecting how a specific part of a shader is actually calculated, and it can also be useful for grabbing certain aspects you want from a surface shader and applying them to a CG shader.Shadowgun is a great graphical achievement considering the hardware it runs on. While the art quality seems to be the key to the puzzle, there are a couple tricks to achieving such quality that programmers can pull off to maximize their artists’ potential.
In the Graphics Methods page we used the golden statue in Shadowgun as an example of a great optimization; instead of using a normal map to give their statue some solid definition, they just baked lighting detail into the texture. Here, we will show you how and why you should use a similar technique in your own game.
// This is the pixel shader code for drawing normal-mapped
// specular highlights on static lightmapped geometry
// 5 texture reads, lots of instructions
SurfaceOutput o;
fixed4 tex = tex2D(_MainTex, IN.uv_MainTex);
fixed4 c = tex * _Color;
o.Albedo = c.rgb;
o.Gloss = tex.a;
o.Specular = _Shininess;
o.Normal = UnpackNormal(tex2D(_BumpMap, IN.uv_BumpMap));
float3 worldRefl = WorldReflectionVector (IN, o.Normal);
fixed4 reflcol = texCUBE (_Cube, worldRefl);
reflcol *= tex.a;
o.Emission = reflcol.rgb * _ReflectColor.rgb;
o.Alpha = reflcol.a * _ReflectColor.a;
fixed atten = LIGHT_ATTENUATION(IN);
fixed4 c = 0;
half3 specColor;
fixed4 lmtex = tex2D(unity_Lightmap, IN.lmap.xy);
fixed4 lmIndTex = tex2D(unity_LightmapInd, IN.lmap.xy);
const float3x3 unity_DirBasis = float3x3(
float3( 0.81649658, 0.0, 0.57735028),
float3(-0.40824830, 0.70710679, 0.57735027),
float3(-0.40824829, -0.70710678, 0.57735026) );
half3 lm = DecodeLightmap (lmtex);
half3 scalePerBasisVector = DecodeLightmap (lmIndTex);
half3 normalInRnmBasis = saturate (mul (unity_DirBasis, o.Normal));
lm *= dot (normalInRnmBasis, scalePerBasisVector);
return half4(lm, 1);
// This is the pixel shader code for lighting which is
// baked into the texture
// 2 texture reads, very few instructions
fixed4 c = tex2D (_MainTex, i.uv.xy);
c.xyz += texCUBE(_EnvTex,i.refl) * _ReflectionColor * c.a;
return c;
The real-time light is definitely higher quality, but the performance gain from the baked version is massive. So how was this done? An editor tool called Render to Texel was created for this purpose. Note that you will need to be running Unity PRO in order to use this tool. It bakes the light into the texture through the following process:
This is how the best graphics optimizations work. They sidestep tons of calculations by preforming them in the editor or before the game runs. In general, this is what you want to do:
Just like the Bass and Treble of an audio track, images also have high-frequency and low-frequency components, and when you’re rendering, it’s best to handle them in different ways, similar to how stereos use subwoofers and tweeters to produce a full body of sound. One way to visualize the different frequencies of an image is to use the “High Pass” filter in Photoshop. Filters->Other->High Pass. If you have done audio work before, you will recognize the name High Pass. Essentially what it does is cut off all frequencies lower than X, the parameter you pass to the filter. For images, Gaussian Blur is the equivalent of a Low Pass.
This applies to realtime graphics because frequency is a good way to separate things out and determine how to handle what. For example, in a basic lightmapped environment, the final image is obtained by composite of the lightmap, which is low frequency, and the textures, which are high-frequency. In Shadowgun, low frequency light is applied to characters quickly with light probes, high frequency light is faked through the use of a simple bumpmapped shader with an arbitrary light direction.
In general, by using different methods to render different frequencies of light, for example, baked vs dynamic, per-object vs per-level, per pixel vs per-vertex, etc, you can create full bodied images on limited hardware. Stylistic choices aside, it’s generally a good idea to try to have strong variation colors or values at both high and low frequencies.
Note: Usually these decompositions refer to steps in a deferred renderer, but that’s not the case here. All of this is done in just one pass. These are the two relevant shaders which this composition was based on:
Shader "MADFINGER/Environment/Virtual Gloss Per-Vertex Additive (Supports Lightmap)" {
Properties {
_MainTex ("Base (RGB) Gloss (A)", 2D) = "white" {}
//_MainTexMipBias ("Base Sharpness", Range (-10, 10)) = 0.0
_SpecOffset ("Specular Offset from Camera", Vector) = (1, 10, 2, 0)
_SpecRange ("Specular Range", Float) = 20
_SpecColor ("Specular Color", Color) = (0.5, 0.5, 0.5, 1)
_Shininess ("Shininess", Range (0.01, 1)) = 0.078125
_ScrollingSpeed("Scrolling speed", Vector) = (0,0,0,0)
}
SubShader {
Tags { "RenderType"="Opaque" "LightMode"="ForwardBase"}
LOD 100
CGINCLUDE
#include "UnityCG.cginc"
sampler2D _MainTex;
float4 _MainTex_ST;
samplerCUBE _ReflTex;
#ifndef LIGHTMAP_OFF
float4 unity_LightmapST;
sampler2D unity_Lightmap;
#endif
//float _MainTexMipBias;
float3 _SpecOffset;
float _SpecRange;
float3 _SpecColor;
float _Shininess;
float4 _ScrollingSpeed;
struct v2f {
float4 pos : SV_POSITION;
float2 uv : TEXCOORD0;
#ifndef LIGHTMAP_OFF
float2 lmap : TEXCOORD1;
#endif
fixed3 spec : TEXCOORD2;
};
v2f vert (appdata_full v)
{
v2f o;
o.pos = mul(UNITY_MATRIX_MVP, v.vertex);
o.uv = v.texcoord + frac(_ScrollingSpeed * _Time.y);
float3 viewNormal = mul((float3x3)UNITY_MATRIX_MV, v.normal);
float4 viewPos = mul(UNITY_MATRIX_MV, v.vertex);
float3 viewDir = float3(0,0,1);
float3 viewLightPos = _SpecOffset * float3(1,1,-1);
float3 dirToLight = viewPos.xyz - viewLightPos;
float3 h = (viewDir + normalize(-dirToLight)) * 0.5;
float atten = 1.0 - saturate(length(dirToLight) / _SpecRange);
o.spec = _SpecColor * pow(saturate(dot(viewNormal, normalize(h))), _Shininess * 128) * 2 * atten;
#ifndef LIGHTMAP_OFF
o.lmap = v.texcoord1.xy * unity_LightmapST.xy + unity_LightmapST.zw;
#endif
return o;
}
ENDCG
Pass {
CGPROGRAM
#pragma vertex vert
#pragma fragment frag
#pragma fragmentoption ARB_precision_hint_fastest
fixed4 frag (v2f i) : COLOR
{
fixed4 c = tex2D (_MainTex, i.uv);
fixed3 spec = i.spec.rgb * c.a;
#if 1
c.rgb += spec;
#else
c.rgb = c.rgb + spec - c.rgb * spec;
#endif
#ifndef LIGHTMAP_OFF
fixed3 lm = DecodeLightmap (tex2D(unity_Lightmap, i.lmap));
c.rgb *= lm;
#endif
return c;
}
ENDCG
}
}
}
Shader "MADFINGER/Environment/Lightprobes with VirtualGloss Per-Vertex Additive" {
Properties {
_MainTex ("Base (RGB) Gloss (A)", 2D) = "white" {}
_SpecOffset ("Specular Offset from Camera", Vector) = (1, 10, 2, 0)
_SpecRange ("Specular Range", Float) = 20
_SpecColor ("Specular Color", Color) = (1, 1, 1, 1)
_Shininess ("Shininess", Range (0.01, 1)) = 0.078125
_SHLightingScale("LightProbe influence scale",float) = 1
}
SubShader {
Tags { "RenderType"="Opaque" "LightMode"="ForwardBase"}
LOD 100
CGINCLUDE
#pragma multi_compile LIGHTMAP_OFF LIGHTMAP_ON
#include "UnityCG.cginc"
sampler2D _MainTex;
float4 _MainTex_ST;
float3 _SpecOffset;
float _SpecRange;
float3 _SpecColor;
float _Shininess;
float _SHLightingScale;
struct v2f {
float4 pos : SV_POSITION;
float2 uv : TEXCOORD0;
float3 refl : TEXCOORD1;
fixed3 spec : TEXCOORD3;
fixed3 SHLighting: TEXCOORD4;
};
v2f vert (appdata_full v)
{
v2f o;
o.pos = mul(UNITY_MATRIX_MVP, v.vertex);
o.uv = v.texcoord;
float3 worldNormal = mul((float3x3)_Object2World, v.normal);
float3 viewNormal = mul((float3x3)UNITY_MATRIX_MV, v.normal);
float4 viewPos = mul(UNITY_MATRIX_MV, v.vertex);
float3 viewDir = float3(0,0,1);
float3 viewLightPos = _SpecOffset * float3(1,1,-1);
float3 dirToLight = viewPos.xyz - viewLightPos;
float3 h = (viewDir + normalize(-dirToLight)) * 0.5;
float atten = 1.0 - saturate(length(dirToLight) / _SpecRange);
o.spec = _SpecColor * pow(saturate(dot(viewNormal, normalize(h))), _Shininess * 128) * 2 * atten;
o.SHLighting = ShadeSH9(float4(worldNormal,1)) * _SHLightingScale;
return o;
}
ENDCG
Pass {
CGPROGRAM
#pragma vertex vert
#pragma fragment frag
#pragma fragmentoption ARB_precision_hint_fastest
fixed4 frag (v2f i) : COLOR
{
fixed4 c = tex2D (_MainTex, i.uv);
c.rgb *= i.SHLighting;
c.rgb += i.spec.rgb * c.a;
return c;
}
ENDCG
}
}
}
Some GPUs, particularly ones found in mobile devices, incur a high performance overhead for alpha-testing (or use of the discard and clip operations in pixel shaders). You should replace alpha-test shaders with alpha-blended ones if possible. Where alpha-testing cannot be avoided, you should keep the overall number of visible alpha-tested pixels to a minimum.
Some images, especially if using iOS/Android PVR texture compression, are prone to visual artifacts in the alpha channel. In such cases, you might need to tweak the PVRT compression parameters directly in your imaging software. You can do that by installing the PVR export plugin or using PVRTexTool from Imagination Tech, the creators of the PVRTC format. The resulting compressed image file with a .pvr extension will be imported by the Unity editor directly and the specified compression parameters will be preserved. If PVRT-compressed textures do not give good enough visual quality or you need especially crisp imaging (as you might for GUI textures) then you should consider using 16-bit textures instead of 32-bit. By doing so, you will reduce the memory bandwidth and storage requirements by half.
All Android devices with support for OpenGL ES 2.0 also support the ETC1 compression format; it’s therefore encouraged to whenever possible use ETC1 as the prefered texture format.
If targeting a specific graphics architecture, such as the Nvidia Tegra or Qualcomm Snapdragon, it may be worth considering using the proprietary compression formats available on those architectures. The Android Market also allows filtering based on supported texture compression format, meaning a distribution archive (.apk) with for example DXT compressed textures can be prevented for download on a device which doesn’t support it.
Download Render to Texel. Bake lighting on your model. Run the High Pass filter on the result in Photoshop. Edit the “Mobile/Cubemapped” shader, included in the Render to Texel package, so that the missing low-frequency light details are replaced by vertex light.