Mobile Unity development: PowerVR performance recommendations for the Unity engine

If you do Unity development for mobile, you should not miss this post. We recently shared the PowerVR performance recommendations for the Unreal Engine 4 game engine. If you work with Unity more often, here we share some of the ways to improve Unity performance described in the PowerVR performance recommendations document.

Most of these optimizations apply to mobile platforms in general, although some specifically target PowerVR. In other words, whatever your target platform is, almost all of them can improve performance, so adopting them is usually good practice.

Build settings

Texture compression

First, be sure to use texture compression. It reduces not only storage size but also runtime bandwidth, which makes it one of the best ways to improve performance and save battery life. The advantage of compressed textures is that they stay compressed until they are used in a fragment processing operation.

Unity supports a variety of texture compression formats. The default is ETC, but you can choose other formats, as shown below:

The compression format options are described below:

ETC is supported on all devices, but ETC2 supersedes it in both compression quality and size. ETC is simple and widely supported, but it does not support an alpha channel and its compression ratio is not ideal.

PVRTC is supported only on PowerVR hardware. It supports an alpha channel, offers the best compression ratio and quality, and is highly configurable to meet quality/size requirements.

ASTC is an open, royalty-free compression format supported on most platforms. It supports an alpha channel and offers compression and configurability similar to PVRTC.

DXT is widely supported on desktop. In the mobile space, because of licensing issues, only Nvidia Tegra supports it.

ATC is supported only on the Qualcomm Adreno platform.
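To put the storage and bandwidth savings in perspective, the following minimal sketch estimates the size of a single texture level under a few of the formats above. The bits-per-pixel figures are the commonly documented block rates for each format (ASTC always stores 128 bits per block, so its rate depends on the block size); the function name and the 1024x1024 example are only illustrative, and the result ignores mipmaps and any container overhead.

```python
# Approximate bits per pixel for a few formats (block rates, no mipmaps).
BITS_PER_PIXEL = {
    "RGBA8 (uncompressed)": 32.0,
    "ETC1 RGB": 4.0,
    "PVRTC 4bpp": 4.0,
    "PVRTC 2bpp": 2.0,
    "ASTC 4x4": 128 / (4 * 4),  # 128-bit block covering 16 texels
    "ASTC 8x8": 128 / (8 * 8),  # 128-bit block covering 64 texels
}


def texture_bytes(width, height, bits_per_pixel):
    """Estimate the size in bytes of one texture level."""
    return int(width * height * bits_per_pixel / 8)


if __name__ == "__main__":
    for name, bpp in BITS_PER_PIXEL.items():
        size_kib = texture_bytes(1024, 1024, bpp) / 1024
        print(f"{name:22s} {size_kib:8.0f} KiB")
```

Because the texture stays compressed until it is sampled, the same ratio applies to the bandwidth spent fetching it, not only to the storage it occupies.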

Although the texture format setting is global, the compression quality can be adjusted for each texture, as shown in the following figure:

Quality settings

One of the most common optimizations is to reduce the number of per-pixel lights. The number of lights allowed directly affects the load on the GPU, because each object in the scene has to be rendered once for each light that affects it. In other words, each mesh needs one pass per light, and the Pixel Light Count option sets the maximum number of lights that can affect a mesh. If more lights affect a mesh than this limit allows, only the most important lights are used to render it. If the lights are evenly distributed and do not overlap, you can afford more lights in the scene at any given time.
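The effect of the per-pixel light limit can be illustrated with a deliberately simplified model, in which every light kept for a mesh costs one extra pass. This is not Unity's actual renderer; the mesh names and the importance values are hypothetical and only show how the limit caps the work per mesh.

```python
def lights_used(light_importances, pixel_light_count):
    """Keep only the most important lights, up to the per-pixel limit."""
    ranked = sorted(light_importances, reverse=True)
    return ranked[:pixel_light_count]


def lighting_passes(meshes, pixel_light_count):
    """Total light passes for the scene in this simplified model."""
    return sum(
        len(lights_used(importances, pixel_light_count))
        for importances in meshes.values()
    )


# Hypothetical scene: importance of each light touching each mesh.
SCENE = {
    "hero": [0.9, 0.5, 0.2, 0.1],
    "crate": [0.4],
    "wall": [0.8, 0.3],
}

if __name__ == "__main__":
    for limit in (4, 2, 1):
        print(f"limit {limit}: {lighting_passes(SCENE, limit)} passes")
```

Note that lowering the limit only helps meshes that several lights overlap, which is why evenly spread, non-overlapping lights are cheap.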

The figure below shows how to change the number of per-pixel lights allowed:

Make sure that the shadow map resolution is "just right": use the lowest setting that still looks good. If the shadow map resolution is too high, it not only wastes memory and bandwidth but also hurts cache efficiency.
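As a rough guide to what each resolution step costs, the sketch below computes the memory footprint of a square shadow map. The 2 bytes per texel (a 16-bit depth format) is an assumption; the actual format depends on the platform and settings.

```python
def shadow_map_bytes(resolution, bytes_per_texel=2):
    """Memory footprint of a square shadow map.

    bytes_per_texel=2 assumes a 16-bit depth format (an assumption,
    not a value taken from Unity).
    """
    return resolution * resolution * bytes_per_texel


if __name__ == "__main__":
    for resolution in (512, 1024, 2048, 4096):
        mib = shadow_map_bytes(resolution) / (1024 * 1024)
        print(f"{resolution}x{resolution}: {mib:.1f} MiB")
```

Each doubling of the resolution quadruples the memory, and with it the bandwidth needed to write and sample the map.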

As shown in the figure below, the shadow resolution can also be set for each light. The per-light shadow resolution always takes precedence over the global quality setting unless the "Use Quality Settings" option is selected.

If the camera angle allows it and the cascades do not have to cover a large area, it is also a good idea to reduce the number of shadow cascades, from 4 to 2 or to no cascades at all.

It is also possible to increase the effective shadow map resolution by changing the shadow projection setting from "Stable fit" to "Close fit", although this comes at the expense of occasional shadow flicker.

Close fit means that the shadow rendering algorithm uses the configured shadow map resolution as efficiently as possible. The result is higher-quality shadows, but also some flicker, especially when the camera or the light moves. The "soft shadows" option can be used to hide this flicker.

Stable fit means that shadow rendering keeps the shadow edges as stable as possible, so the shadows do not flicker when the camera or the light moves. The price is lower shadow quality.

The balance between "Close fit" and "Stable fit" requires careful consideration. "Close fit" needs a lower shadow resolution to look high quality, but more filtering (soft shadows) to hide the flicker. "Stable fit", on the other hand, needs a higher shadow resolution to look high quality, but little filtering, because it does not flicker.

Mesh settings

LOD (level of detail) is a very useful tool for managing geometry complexity. It lets you specify how detailed an object's geometry is depending on how far the object is from the camera. This keeps the amount of geometry on screen below a certain limit while the quality remains sufficient.

Note that LOD is a very good way to optimize geometry for mobile. The LOD bias can be set so that higher LOD levels, that is, lower-resolution meshes, are used.
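The idea behind LOD selection and the bias can be sketched as follows. This is a simplified, distance-based model, not Unity's implementation (Unity selects levels from the object's size on screen). The switch distances are hypothetical, and the sketch assumes that a bias below 1 favours less detail while a bias above 1 favours more detail.

```python
from bisect import bisect_right

# Hypothetical switch distances in metres: LOD0 is used below 10 m,
# LOD1 below 30 m, LOD2 below 80 m, and the coarsest level beyond that.
LOD_DISTANCES = [10.0, 30.0, 80.0]


def select_lod(distance, lod_bias=1.0, thresholds=LOD_DISTANCES):
    """Return the LOD index to render (0 is the most detailed mesh)."""
    if lod_bias <= 0:
        raise ValueError("lod_bias must be positive")
    # A bias below 1 makes the object look "further away" to the selector.
    effective_distance = distance / lod_bias
    return bisect_right(thresholds, effective_distance)


if __name__ == "__main__":
    for bias in (2.0, 1.0, 0.5):
        print(f"bias {bias}: LOD {select_lod(20.0, bias)} at 20 m")
```

On mobile, a bias that switches to the coarser meshes earlier trades a little detail for a lower vertex load.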

The following screenshot shows how to set the LOD bias in the quality settings:

Note that reducing the geometry load also reduces the amount of computation, so the device heats up less and the battery lasts longer.

Graphics settings

If you need HDR rendering, it is better to use FP16 than R11G11B10, because FP16 provides better precision and quality.

Deferred rendering vs forward rendering

For lighting, Unity provides both deferred and forward rendering options. Deferred rendering usually performs better in scenes with many overlapping light sources. On mobile, however, forward rendering performs better, because Unity does not support some of the advanced mobile features that accelerate deferred rendering, such as PLS (Pixel Local Storage). For scenes with few lights, forward rendering also performs better because it has less overhead. The following screenshot shows how to choose between deferred rendering and forward rendering:

Player settings

The latest Android devices (including those with PowerVR GPUs) support the Vulkan API. Vulkan lets you reduce CPU load and gives you better control over synchronization. It is also better suited to multi-core CPUs, because it supports multi-threaded command submission to the GPU.

When choosing which API to use, Unity tries the most advanced one first. If the hardware platform does not support that API, Unity falls back to the other APIs to get the widest possible support.

If you choose to use OpenGL, it is a good idea to enable multi-threaded rendering. This option runs rendering and other work on separate threads, so the work is better distributed across CPU cores. Although this is not as good as Vulkan's multi-threaded command submission, it still improves CPU-side performance considerably.

Shaders

Use alpha blending instead of alpha test/clip() for transparent surfaces in shaders. Note that alpha test defeats the architecture's early depth testing. On the PowerVR platform, opaque primitives perform their depth write before the fragment processing stage of the pipeline, which lets PowerVR devices avoid overdraw of opaque primitives and so saves a lot of processing time and bandwidth.

Alpha-tested/discard primitives on PowerVR, however, can only write depth after the fragment shader has executed. This deferred depth write can hurt performance, because subsequent primitives cannot be processed until the alpha-tested primitive has finished updating the depth buffer.

On all mobile architectures it is best not to use a partial color mask. Set the mask to either RGBA or 0. With a partial color mask, the previous frame buffer data has to be read back in. This is done by submitting a full-screen primitive that reads the previous data as a texture, and the partial clear then requires another full-screen primitive to be submitted.

All mobile architectures benefit from half-precision calculations, and PowerVR is particularly well suited to them, so use them wherever possible. Compared with high precision (FP32), using half precision (FP16) in shaders can bring a significant performance improvement thanks to a dedicated FP16 sum-of-products (SOP) arithmetic pipeline, which can perform two SOP operations in parallel per cycle, theoretically doubling floating-point throughput. The FP16 SOP pipeline is available on most PowerVR Rogue graphics cores, depending on the model.

Clear flags

Make sure that the screen is cleared every frame so that the GPU does not need to load the contents of the previous frame buffer. Use the "Solid Color" or "Skybox" clear mode to ensure that the GPU does not reload previously used frame data from memory.

On the PowerVR platform, a depth pre-pass is very inefficient, because the GPU has to perform the depth test twice and store the depth buffer to memory, which is redundant work. This can happen when the camera's clear flags are set to "Depth only" or "Don't Clear", so we strongly recommend that you avoid both of these modes.

Mipmaps

Similar to the LOD solution for meshes (LOD groups), the LOD solution for textures is called mipmapping. As the distance from the camera increases, mipmapping automatically reduces the resolution of the texture being displayed. This significantly improves cache efficiency, improves performance, and reduces bandwidth. Mipmaps are set as follows:

One thing to keep in mind is that mipmaps are only needed for 3D elements in the scene. 2D elements such as the UI are mapped 1:1 to the screen and do not need them, although if such elements are scaled you may still need mipmaps.

To ensure seamless transitions between mipmap levels, be sure to use trilinear filtering on mipmapped textures, as shown in the following figure:
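The memory cost of the mipmapping described above is modest. The following sketch builds the chain of level sizes for a texture by halving each dimension down to 1x1, and compares the total texel count with the base level alone. The function name is only illustrative.

```python
def mip_chain(width, height):
    """Return the (width, height) of every mip level, largest first."""
    levels = [(width, height)]
    while width > 1 or height > 1:
        width = max(1, width // 2)
        height = max(1, height // 2)
        levels.append((width, height))
    return levels


def total_texels(levels):
    """Number of texels stored across all levels of the chain."""
    return sum(w * h for w, h in levels)


if __name__ == "__main__":
    levels = mip_chain(1024, 1024)
    overhead = total_texels(levels) / (1024 * 1024)
    print(f"{len(levels)} levels, {overhead:.3f}x the base level")
```

A full chain costs about one third more memory than the base level, while distant objects sample from small levels that fit the cache far better.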

Mesh optimization

First, make sure you enable the mesh optimization option. This setting reorders the vertices and indices to make better use of the GPU caches, as shown in the following screenshot:

Mobile content optimization

Mobile devices have some special constraints: the battery has to last at least a day, and the device must not become uncomfortably hot in the user's hands. This means that when a desktop game is ported to mobile, its content must be optimized.

First and most important, geometric complexity needs to be optimized. On a desktop computer, 2 to 3 million polygons are typically visible on screen; on mobile devices the figure is generally 200,000 to 300,000 polygons. After reducing the polygon count, the developer also needs to profile and check whether the vertex shaders are too complex.

Next come texture resolution and bandwidth usage. Post-processing effects, for example, need to be adjusted for mobile devices. GPU memory bandwidth on a desktop computer is 200GB/s to 300GB/s, whereas on a mobile device the available memory bandwidth, shared between the CPU and the GPU, is about 20GB/s to 30GB/s. This suggests that texture resolution needs to be reduced by half, but you should also consider that the mobile screen is much smaller (a computer screen is roughly 20+ inches, a mobile screen about 5 inches), so smaller textures are usually sufficient. Profile after each adjustment to verify the results.
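To see why post-processing needs adjusting, the sketch below estimates the memory traffic of one full-screen pass and compares it with the bandwidth figures quoted above. It assumes that the pass reads and writes every pixel exactly once, a 4-byte pixel format, 60 frames per second, and a 1920x1080 target; all of these are illustrative assumptions, and real passes often sample more than once per pixel.

```python
def fullscreen_pass_bytes_per_second(width, height, bytes_per_pixel=4, fps=60):
    """Traffic of one full-screen pass: one read and one write per pixel."""
    bytes_per_frame = width * height * bytes_per_pixel * 2
    return bytes_per_frame * fps


def share_of_budget(pass_bytes_per_second, budget_gb_per_second):
    """Fraction of the total memory bandwidth consumed by the pass."""
    return pass_bytes_per_second / (budget_gb_per_second * 1e9)


if __name__ == "__main__":
    traffic = fullscreen_pass_bytes_per_second(1920, 1080)
    for label, budget in (("mobile, 20GB/s", 20), ("desktop, 200GB/s", 200)):
        print(f"{label}: {share_of_budget(traffic, budget):.2%} per pass")
```

Under these assumptions a single pass takes about 5% of a 20GB/s budget that the CPU also shares, so a chain of several effects adds up quickly.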

Summary

As we have seen above, Unity provides a range of options for optimizing mobile content and games, and they all apply well to PowerVR. Even where they do not bring an immediate performance improvement, they at least reduce GPU load and save battery power.
