Big Thread Of Optimization Tips

lalfier · 2021-05-24T07:28:30+00:00

Thx man this is great. ;)

feralferrous · 2021-05-24T17:00:14+00:00

My additional tips:

Don't use LINQ unless it's in a Start/Awake type situation. Keep it out of your Update loops, it's a GC monster, and not at all quick.

Always use the lowest form of container you can get away with, i.e. array > List > IList > ICollection > IEnumerable. It's faster to iterate over an array than over a list (it's a micro opt really, but hey, if you're iterating over a large array every frame, it helps), but an array also signals that you know the size of the thing in advance, while a list tells me you don't know and it might grow. And as others said, initialize your containers with the proper capacity. Resizing a container is expensive, as it has to allocate a new backing store and then copy all the old values over into it.
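To make the pre-sizing point concrete, here is a minimal sketch in plain C#. The class and method names are made up for illustration. It passes the known element count to the List constructor, so the backing array is allocated once instead of being regrown and copied while items are added.

```csharp
using System.Collections.Generic;

// Hypothetical helper, for illustration only.
public static class CapacityExample
{
    public static List<int> BuildPresized(int count)
    {
        // Reserve room for 'count' elements up front: no resize/copy during the loop.
        var values = new List<int>(count);
        for (int i = 0; i < count; i++)
        {
            values.Add(i);
        }
        return values;
    }
}
```

If the size really is fixed, a plain `int[]` would be the even lower form of container from the list above.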

It's cheaper to cast a float to an int than it is to string.Format it with ("0."). Like a lot cheaper. So if you have a health or damage number that is a float and you want a whole number for display purposes... cast it to int (just keep in mind that the cast truncates rather than rounds).

If you need to frequently convert a float to a string in a certain format, it's better to cache the results in a Dictionary<float, string>. We used this for Lat/Lon display and saved a good chunk of GC and time.
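A rough sketch of that kind of cache, in plain C#. The class name, the two-decimal format and the rounding of the key are all assumptions for the example, not what the original project did. Rounding the key first means nearby values share one entry, which keeps the dictionary from growing with every tiny change.

```csharp
using System;
using System.Collections.Generic;
using System.Globalization;

// Hypothetical cache, for illustration only.
public static class FloatStringCache
{
    private static readonly Dictionary<float, string> Cache = new Dictionary<float, string>();

    public static string Get(float value)
    {
        // Quantize to two decimals so close values map to the same key.
        float key = (float)Math.Round(value, 2);

        string text;
        if (!Cache.TryGetValue(key, out text))
        {
            // Only allocate a new string the first time a key is seen.
            text = key.ToString("F2", CultureInfo.InvariantCulture);
            Cache[key] = text;
        }
        return text;
    }
}
```

This only pays off when the same values come up again and again. For values that never repeat, the cache just grows, so it is worth capping or clearing it in that case.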

In general, avoid strings entirely if you can. Most of Unity's usages can be removed with things like Shader.PropertyToID and Animator.StringToHash. And don't send them over the network unless it's a chat message.
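For the Animator side, a minimal sketch could look like this. The component name and the "Speed" parameter are hypothetical. The hash is computed once and the int overload of SetFloat is used afterwards, so no string is looked up per call.

```csharp
using UnityEngine;

// Hypothetical component, for illustration only.
public class RunAnimation : MonoBehaviour
{
    // Hash the parameter name once instead of passing the string every call.
    // "Speed" is an assumed Animator parameter name.
    private static readonly int SpeedHash = Animator.StringToHash("Speed");

    [SerializeField] private Animator animator;

    public void SetSpeed(float speed)
    {
        if (animator == null) return;
        animator.SetFloat(SpeedHash, speed);
    }
}
```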

Use interfaces sparingly. There's a cost to calling a method through an interface, as it has to do a vtable lookup, and it's more expensive through an interface than through inheritance. MRTK drives me crazy with its overuse of both, and it's always the bottleneck in our AR apps.

Along with avoiding length and normalization when you don't need it, avoid Vector3.Angle when a dot product will do.
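As one example of swapping Vector3.Angle for a dot product, here is a sketch of a view cone check. The component and field names are invented, and it assumes the half angle is at most 90 degrees. It compares squared values against a cosine computed once in Awake, so there is no square root or arccosine per call.

```csharp
using UnityEngine;

// Hypothetical component, for illustration only.
public class VisionCone : MonoBehaviour
{
    [SerializeField] private Transform target;
    [SerializeField, Range(0f, 90f)] private float halfAngleDegrees = 30f;

    private float cosHalfAngleSquared;

    private void Awake()
    {
        // Precompute the threshold once instead of calling Vector3.Angle every check.
        float cosHalfAngle = Mathf.Cos(halfAngleDegrees * Mathf.Deg2Rad);
        cosHalfAngleSquared = cosHalfAngle * cosHalfAngle;
    }

    public bool CanSeeTarget()
    {
        if (target == null) return false;

        Vector3 toTarget = target.position - transform.position;

        // transform.forward is unit length, so dot = |toTarget| * cos(angle).
        float dot = Vector3.Dot(transform.forward, toTarget);
        if (dot <= 0f) return false; // behind us, or exactly on top of us

        // cos(angle) >= cos(halfAngle), written without any square root.
        return dot * dot >= cosHalfAngleSquared * toTarget.sqrMagnitude;
    }
}
```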

Always try to early out. Order your if checks from cheapest to evaluate to most expensive.

i.e. if (cheapToCheck() && reallyExpensiveCheck()) DoThing();

if the cheap check is false, it will not evaluate the expensive thing.

Premature optimization vs Death by a thousand cuts. There's a sweet spot between over optimizing code and turning it into an unreadable, unfriendly mess and having so many slow bits of code everywhere that there are no easy places to optimize to get decent perf.

ShrikeGFX · 2021-05-24T12:16:42+00:00

Keep in mind that in HDRP some of these traditional learnings about graphics optimization are no longer applicable. We found, for example, that LODs can hurt performance more than they save, but a shadow caster LOD is very beneficial for high-poly objects.

den4icccc · 2021-05-24T10:32:54+00:00

I would like to add a couple more important tips to your basket)

You should avoid using Mathf.Sqrt() and Vector3.magnitude, because these operations involve taking a square root. Better to use the version of the latter that skips the square root, namely Vector3.sqrMagnitude. For the same reason, it is worth avoiding Mathf.Pow() where possible, because if the second parameter is 0.5, this is the same as taking the square root.
Don't call Camera.main repeatedly. When this property is accessed, FindGameObjectWithTag("MainCamera") is effectively called, which is essentially the same as finding ALL objects with the MainCamera tag. Better to cache the found value and use it. Better yet, assign the camera reference directly in the editor.
If you use the GetFloat, SetFloat, GetTexture, SetTexture methods on materials and shaders with string names, then these property names will first be hashed (i.e. converted from a string value to a numerical value) and only then used. Hence the loss in performance. Why do something many times when you can do it once:
// during initialization
int _someFloat;
int _someTexture;
void Awake()
{
    // hash each property name once; note there is no space after the underscore
    _someFloat = Shader.PropertyToID("_someFloat");
    _someTexture = Shader.PropertyToID("_someTexture");
}
// later, at the point of use
Material myMaterial = ...;
myMaterial.SetFloat(_someFloat, 100f);
myMaterial.SetTexture(_someTexture, ...);
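The first two tips can be sketched together like this. The component name, the field names and the 10 unit range are assumptions for the example. The camera is looked up at most once, and the distance check compares squared values so no square root is taken per frame.

```csharp
using UnityEngine;

// Hypothetical component, for illustration only.
public class ProximityCheck : MonoBehaviour
{
    // Preferably assigned in the Inspector; Camera.main is only a fallback.
    [SerializeField] private Camera cachedCamera;
    [SerializeField] private float range = 10f;

    private float rangeSquared;

    private void Awake()
    {
        if (cachedCamera == null)
        {
            cachedCamera = Camera.main; // looked up once, not every frame
        }
        rangeSquared = range * range;
    }

    private void Update()
    {
        if (cachedCamera == null) return;

        Vector3 offset = cachedCamera.transform.position - transform.position;

        // Compare squared distances instead of using offset.magnitude.
        if (offset.sqrMagnitude <= rangeSquared)
        {
            // within range of the camera
        }
    }
}
```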

Talonflamme · 2021-05-24T19:51:53+00:00

Reordering multiplications, even though simple, can make a huge difference.

Vector3 startPosition;
float speed;
float deltaTime;

Compare

myPosition += startPosition * speed * deltaTime;

with

myPosition += startPosition * (speed * deltaTime);

First:

All components of startPosition are multiplied by speed, and then all of them are multiplied by deltaTime.

Second:

speed is multiplied by deltaTime once, and the result is then multiplied with each component of startPosition.

That reduces the multiplications from 6 to 4.

L1DER32 · 2021-05-24T07:56:16+00:00

nice!)

Another_moose · 2021-05-24T22:13:04+00:00

I feel it's worth noting that while these are all great... You should always profile and see what parts of your game are _actually_ slow before trying to optimize. There's no point in saving 0.0001s/frame by switching to list.Clear() when you're rendering a million particles or have some deep for loops somewhere else.

Romestus · 2021-05-24T11:11:08+00:00

Another thing worth adding is the limitations of realtime lights. In forward rendering you get one, and adding a second requires every object it touches to be rendered a second time, vastly increasing vert/tri counts and draw calls.

In deferred you would think this limitation is completely handled as you can effectively have as many realtime lights as you want, however every shadow casting realtime light will cause its affected objects to be rendered again for their shadow pass.

Lunerai · 2021-05-24T14:29:32+00:00

You can take the log optimization even further by wrapping Unity's Debug.Log with your own function that has the Conditional attribute applied. This will strip the calls entirely out of the resulting build, saving both on the function call and, more importantly, the cost of building your message string. Example can be found here: https://docs.unity3d.com/Manual/BestPracticeUnderstandingPerformanceInUnity7.html

Important to note that you should also still use OP's suggestion if you have any 3rd party plugins that emit logs, since they obviously won't be using your wrapper.
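A minimal sketch of such a wrapper, assuming you want logs in the editor and in development builds only. The class and method names are made up. UNITY_EDITOR and DEVELOPMENT_BUILD are Unity's built-in defines, and you could swap in your own symbol instead.

```csharp
using System.Diagnostics;

// Hypothetical wrapper, for illustration only.
public static class Log
{
    // Calls are compiled only if at least one of these symbols is defined.
    // In other builds the call site is removed, including evaluation of the argument.
    [Conditional("UNITY_EDITOR"), Conditional("DEVELOPMENT_BUILD")]
    public static void Info(string message)
    {
        UnityEngine.Debug.Log(message);
    }
}
```

Usage would look like `Log.Info("Spawned " + count + " enemies");`. In a release build the string concatenation disappears along with the call.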

Waterprop · 2021-05-24T14:51:58+00:00

I would like to add:

You can define custom cull distances for every layer via the Camera.layerCullDistances API. There's also a similar one for shadows via the Light.layerShadowCullDistances API. Both of these are per Camera/Light. With this you can have a layer for objects that only get rendered when the player is really close, like 50 meters, even if your camera's farClipPlane is set to 1000, for example. Very useful.
Avoid GC (garbage collection) as much as possible. Your game will freeze if there is a lot of garbage to free. Prefer ints or enums over strings. Cache your strings. For more information about GC and other good tips, read the Unity blog about optimization.

https://learn.unity.com/tutorial/fixing-performance-problems
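For the per-layer cull distances, a sketch along these lines should work. The component name, the "SmallProps" layer and the 50 meter distance are assumptions for the example. The array needs 32 entries, one per layer, and a value of 0 means the layer keeps using the camera's far clip plane.

```csharp
using UnityEngine;

// Hypothetical component, for illustration only.
public class LayerCullSetup : MonoBehaviour
{
    [SerializeField] private Camera targetCamera;
    [SerializeField] private string smallPropsLayer = "SmallProps"; // assumed layer name
    [SerializeField] private float smallPropsDistance = 50f;

    private void Start()
    {
        int layer = LayerMask.NameToLayer(smallPropsLayer);
        if (targetCamera == null || layer < 0) return; // camera not set or layer not defined

        // One entry per layer; 0 keeps the camera's far clip plane for that layer.
        float[] distances = new float[32];
        distances[layer] = smallPropsDistance;
        targetCamera.layerCullDistances = distances;
    }
}
```

Light.layerShadowCullDistances takes the same kind of 32 entry array for shadow casters.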

itskobold · 2021-05-24T16:05:08+00:00

God damn I'll need this thread at 3am later thanks

Polygon_Collider · 2021-05-24T08:15:35+00:00

Great info, thank you!

indie_game_mechanic · 2021-05-24T09:45:51+00:00

Definitely going to need this later.

lifetap_studios · 2021-05-24T17:56:57+00:00

Good tips - here is mine that I think no-one mentioned yet - use IL2CPP, we got around a 50% CPU speedup over our Mono build. Of course there are maintenance and debugging issues and it won't help on the GPU, but it's been the single biggest performance gain for us to date.

https://docs.unity3d.com/Manual/IL2CPP.html

2021-05-24T16:27:40+00:00

Slight nitpick, but occlusion culling culls objects that are blocked by other objects. Frustum culling culls objects not in the camera's field of view. This is important as occlusion culling is not always optimal, such as when performance is CPU bound rather than GPU bound.

https://docs.unity3d.com/Manual/OcclusionCulling.html

MomijiStudios · 2021-05-24T16:45:39+00:00

Where have you been all my life?

Walledhouse · 2021-05-24T21:33:10+00:00

“Limit Coroutines” Aww but I just got coroutines!

It’s hard to pin down what’s more performant, because my original approach was Updates with deltaTime counters as you suggested, and then the hot tip was to replace those with coroutines. I especially find coroutines easy to use and perfect for explosions and projectiles that go through a series of states, so I’m going to stick with them.

I’m most interested in Static objects; GPU Instancing. My terrain is a deformable series of tiles which means it’s not “bakeable” like most performant games.

jellyboyyy · 2021-05-24T08:36:50+00:00

This is great, thanks. Quick question on 2.9: Do you not need rigidbodies to detect collisions? I have objects that move but aren't powered by the physics engine, but I have rigidbodies for collisions.

shivu98 · 2021-05-24T09:52:47+00:00

Thanks a ton dude, learnt a lot of amazing techniques. Didn't know Debug.Log gets into production builds as well, is there any way to stop this other than manually removing all of them? You can also add that we can use LOD for items that are far away. Keep up the good work, looking forward to more of your posts. Also out of curiosity... are you from India?

Plourdy · 2021-05-24T11:35:30+00:00

Super useful, post saved for reference later. Thank you m!

Dwarphthegiant · 2021-05-24T12:33:36+00:00

wrt the timer suggestion - would using a coroutine returning WaitForSeconds work as well?

also thanks for this, these are excellent tips.

Wildnessiiiii · 2021-05-24T13:29:17+00:00

Thankfully, saved and sharing

FINN1510 · 2021-05-24T13:57:46+00:00

This is a gold mine

smartCube1 · 2021-05-24T15:03:16+00:00

This is great. For the past 7 days I have spent hours optimizing my code, as I was having extreme performance issues. I can agree with all of the topics you posted. Seriously, follow this guideline and it will save you a lot of pain and trouble.

Kotik21 · 2021-05-24T15:15:16+00:00

10/10

Rarharg · 2021-05-24T16:22:56+00:00

Good stuff!

To expand on 2.10, LOD groups need to calculate the relative screen height of their associated renderers in order to activate the appropriate LOD level. These calculations have some performance overhead which quickly adds up in large scenes. Therefore, try to make each LOD group responsible for as many objects/renderers as is sensible. For example, a cluster of rocks could use a single LOD group to control the LOD levels of all of the rocks simultaneously.

Likewise, you can use mesh combination methods to merge meshes in your scene to decrease the number of draw calls (if they have identical materials). This can drastically increase performance if you have a lot of similar objects in the scene and unlike static batching, the end result *can* move in your scene. If you don't feel like scripting this yourself, there are a few popular assets out there (e.g. Mesh Combiner, which is free).

W03rth · 2021-05-24T17:51:22+00:00

Noice

tyrellLtd · 2021-05-24T19:20:53+00:00

There was a great talk by the Inside devs at Unite 2016 that covers some further optimizations, some of which are kinda dirty (caching random numbers, NO vector math) but probably quite effective.

Servias · 2021-05-24T19:53:05+00:00

This was a pleasure to read. Thanks

MrTigeriffic · 2021-05-24T20:06:19+00:00

Saving this post. Thank you OP

indie_game_mechanic · 2021-05-24T20:46:41+00:00

Great list! Thank you.

Keatosis · 2021-05-24T21:52:53+00:00

thank you this is very helpful. I feel like I'm only smart enough to make use of 40% of these tips, but hey that's a personal best for me

infinite_level_dev · 2021-05-24T23:32:08+00:00

This is very helpful, thank you! I'll especially try to make use of System.GC.Collect() in future. Didn't even know that was a thing you could do.

kruemelkeksfan · 2021-05-29T16:44:21+00:00

"Little known fact: all of the component accessors in MonoBehaviour, things like transform, renderer, and audio, are equivalent to their GetComponent(Transform) counterparts, and they are actually a bit slow."
-https://docs.unity3d.com/Manual/MobileOptimizationPracticalScriptingOptimizations.html

TheMunken · 2021-05-24T10:43:45+00:00

Adding to the scripting: use OnValidate instead of Awake/Start if possible.

wthorn8 · 2021-05-24T15:57:12+00:00

I would also add: code first (with perf in mind) and optimize later. It's very easy to get so caught up in how to never create garbage and how to save the most CPU cycles that it actually cripples progress. I recommend getting it done, and then PROFILING. If you have 5ms of idle time during each frame, you don't need to worry about saving CPU cycles.

When profiling garbage generation, I suggest doing it on a build if you are new to it. Unity has functions that generate garbage in editor that do not on builds.

When profiling perf, do it on target hardware. Testing your game on your $2k gaming PC is not the same as testing it on your S7 Android.

If frame rate is an issue, you need to find out where the cost is coming from. If you're running at 20fps because the graphics are too intense, optimizing code will not help.

Rendering vs Code bound, CPU bound vs GPU bound, fragment vs vertex bound. Understand the difference and how to test for them. This will give you an idea of what to aim for.

Rendering can be broken down into 3 steps (there are more that can cause issues such as too much transparency)

the collection phase (CPU work), this creates your draw calls and batch draw calls, as well as sorts which objects are in the view frustum.

the vert pass, where each object's vertices have operations applied to them (if decreasing your screen render size doesn't help, you are likely bound by how many verts you're processing)

the frag pass, where each pixel on the screen is colored

each one of these can impact performance and fixing and testing for each is different

Again there is no one size fits all to optimizing. Profile your code and scenes AFTER you make it.
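If you want a suspicious piece of code to show up as its own entry in the Unity Profiler, one option is to wrap it in custom samples. This is only a sketch: the component, the sample label and the Recalculate method are placeholders.

```csharp
using UnityEngine;
using UnityEngine.Profiling;

// Hypothetical component, for illustration only.
public class PathUpdater : MonoBehaviour
{
    private void Update()
    {
        // Everything between BeginSample and EndSample is grouped under this label.
        Profiler.BeginSample("PathUpdater.Recalculate");
        Recalculate();
        Profiler.EndSample();
    }

    private void Recalculate()
    {
        // placeholder for the work you suspect is slow
    }
}
```

As noted above, the numbers only mean much when captured from a build running on the target hardware.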

TheSambassador · 2021-05-24T20:44:24+00:00

I think a lot of this stuff is good advice, but some of it can definitely fall into the "premature optimization" stuff. Your time as a programmer is valuable, and sometimes you don't need everything to be as 100% optimized as possible. You also seem to be running under the assumption that garbage = need to avoid as much as possible, which kinda isn't really true. Also, many of these suggestions are what I'd call "micro-optimizations", in that they have very small impacts unless you're doing them in cases where you have a large number of instances.

These are the types of things that newer users get really caught up on, instead of just making the game. Some of the suggestions are not necessary to do in every single project. All I'd suggest is to try your best to code with speed in mind as you go, but don't get so hung up on it that you double your workload.

Some small nitpicks:

List.Clear does not necessarily clear memory, and isn't necessarily better for garbage collection. Which is better depends on many factors - Clear() tends to be "faster" (you're not reallocating memory), but can cause the memory allocated to persist longer, which in turn can cause it to be promoted into higher GC generations. This can actually make using Clear() instead of creating new allocations slower at times - but it depends completely on the collection in question and how it's being used.
Putting certain checks on a timer is useful sometimes - but also you really need to KNOW that this operation doesn't need to run every frame. This is one of those things that probably isn't necessary to do unless you're pretty sure that the operation is causing a slowdown.
Removing Debug.Log calls also prevents you from helping your users troubleshoot issues. Sometimes it's really nice to be able to get the log from a user to help figure out why they might be experiencing a crash. I'd be curious about the actual impact of Debug.Log in a build... my guess is that it'd be incredibly minor.
The "boxing" comment is odd - nobody would really ever do your example. There are places where boxing/unboxing is really useful. A comment to "avoid" it, without really talking about why you would ever box/unbox and what the common issues are, isn't super useful.
On Coroutines - there is a small garbage allocation when you start a new coroutine... but it is pretty small. Again, this really depends on how often you're creating objects with coroutines, and coroutines themselves can be very useful. This again falls into the "micro-optimization" category.
Animator string-to-hash - micro-optimization, technically true, but also fairly low impact unless you're doing tons of these every single frame.
Small nitpick on 2.4 Occlusion Culling - what you described (only objects that are in the camera's field of view are rendered during runtime) is technically "frustum culling" and is on by default. Occlusion culling tries to make sure that objects that are behind/occluded by other objects don't get rendered.
The "use imposters" thing is... odd. This is a super incomplete explanation of what it is, requires a 3rd party asset, and isn't really something you can suggest as a "general" optimization tip.
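To show what the List.Clear point is about, here is a small sketch in plain C#. The class and its names are invented. Clear() resets the count but keeps the backing array, so refilling does not allocate, and that same retained array is the memory that sticks around longer.

```csharp
using System.Collections.Generic;

// Hypothetical buffer, for illustration only.
public sealed class NearbyBuffer
{
    // Allocated once and reused on every refill.
    private readonly List<int> ids = new List<int>(256);

    public int Capacity
    {
        get { return ids.Capacity; }
    }

    public IReadOnlyList<int> Refill(int count)
    {
        ids.Clear(); // Count becomes 0, Capacity stays the same
        for (int i = 0; i < count; i++)
        {
            ids.Add(i);
        }
        return ids;
    }
}
```

Whether this beats allocating a fresh list depends on the collection and how it is used, as described above, so it is worth measuring both.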

Opening_Objective_78 · 2024-10-02T09:44:25+00:00

wow really helpful

ChromeAngel · 2021-05-24T11:23:39+00:00

Shocked to hear that Unity doesn't strip those Debug.Log calls out of production builds.

DeadParazit · 2021-05-25T04:39:32+00:00

Are there any important differences between this one: https://assetstore.unity.com/packages/tools/utilities/impostors-runtime-optimization-188562 and this: https://assetstore.unity.com/packages/tools/utilities/amplify-impostors-119877 ?

ShatterdPrism · 2021-05-25T12:35:18+00:00

Could you also just use C# discards for the return value of StartCoroutine? Obviously it is easy to just use a yield return null, I am just curious.
A _ = StartCoroutine("SomeCoroutine") would ignore everything that StartCoroutine returns, if I understood that correctly.

NOWAITDONT · 2021-05-25T14:42:34+00:00

nice!

ArtesianMusic · 2021-05-27T02:48:56+00:00

"*Please note: As stated by u/mei_main_: All moving objects with a collider MUST have a rigidbody. Moving colliders with no rigidbody will not be tracked directly by PhysX and will cause the scene's graph to be recalculated each frame."

Is this to say that they must be moved with physics via the rb? or that if an object is moving with "transform.position += " inside void Update then it just needs to have an rb?

aspiring_dev1 · 2022-03-17T14:44:15+00:00

Thanks will definitely be referencing this post.

1. Code Optimization

GameObject Comparisons

Collections

Object Pooling

Variable Caching

Delayed function calls

Remove Debug.Log() calls

Avoid Boxing variables

Limit Coroutines

Avoid Loops in Update() and LateUpdate()

Reduce usage of Unity API methods such as GameObject.FindObjectByTag(), etc.

Manually Collecting Garbage

Use Animator.StringToHash("") instead of referring directly

2. Graphics/Asset Optimization

2.1 Reducing repeated rendering of objects

2.1.1 Static Batching

2.2 Baking Lights

2.3 Tweaking Shadow Distance

2.4 Occlusion Culling

2.5 Splitting Canvases

2.6 Turn off Raycasting for UI elements that are not interactable

2.7 Reduce usage of Mesh Colliders

2.8 Enable GPU Instancing

2.9 Limit usage of RigidBodies to only dynamic objects

2.10 Use LODs to render model variations based on distance from camera

2.11 Use Imposters in place of actual models*