What is a Graphics Processing Unit (GPU)?
Rose   utmel.com   2021-10-21 17:37:04

Topics covered in this article:
Ⅰ. What is GPU?
Ⅱ. The GPU Graphics Processing Flow 
Ⅲ. Applications for the GPU
Ⅳ. The GPU's Evolution
Ⅴ. Frequently Asked Questions about GPU

Ⅰ. What is GPU?

The graphics processing unit (GPU), also known as the visual processor, display core, or display chip, is a microprocessor that specializes in image processing on personal computers, workstations, gaming consoles, and mobile devices (such as tablets and smartphones). The graphics card, an integral component of the computer, is in charge of outputting and displaying graphics, and it is crucial for people who work in professional graphic design. Its primary function is to convert the system's display information and drive the display, sending it line-scan signals so that the monitor shows the image accurately; it is the link between the monitor and the computer. GPUs are no longer restricted to graphics processing tasks. General-purpose computing on the GPU is also gaining traction in the industry. For floating-point and highly parallel workloads, GPUs can outperform CPUs by tens or even hundreds of times.

Graphics processing units (GPUs) fall into two kinds.

One kind is the discrete graphics card, a dedicated component built specifically for image processing. Both its algorithms and its structure are optimized, so it handles demanding imaging jobs more efficiently and delivers better performance.

The other kind is the integrated graphics processor (or built-in display core), which is located on the motherboard or in the CPU and borrows a portion of the computer's system memory while in use. In 2007, PCs with integrated graphics accounted for around 90% of all shipments. This approach is usually less expensive than a discrete graphics solution, but its performance is usually lower. Integrated graphics processors were once thought to be unsuited to playing 3D games or doing complex graphics-based calculations.

Ⅱ. The GPU Graphics Processing Flow 

The GPU graphics processing flow can be roughly divided into the following steps:

1. Geometry processing

2. Rasterization

3. Pixel processing

4. Render output

1. Geometry processing

The CPU generates vertex information, which is what starts the GPU's work. This application stage is developed in high-level programming languages (C, C++, Java) and deals with the CPU and memory. Its basic goal is to find the potentially visible mesh instances and hand them, together with their materials, to the graphics hardware for rendering. At the end of this stage, geometry data such as vertex coordinates, normal vectors, texture coordinates, and textures has been generated and is transferred to the graphics hardware over the data bus (a time bottleneck) before the geometry stage begins.

Vertex coordinate transformation, lighting, clipping, projection, and screen mapping are all handled by the geometry stage, which runs on the graphics processing unit (GPU). At the end of this stage, the transformed and projected vertex coordinates, colors, and texture coordinates are available.
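
The projection and screen-mapping steps of the geometry stage can be sketched in a few lines of Python. This is only an illustration under assumed conventions (an OpenGL-style perspective matrix, a camera looking down the negative z axis, and a 640 x 480 screen); the function names `perspective`, `transform`, and `to_screen` are made up for this example and do not come from any particular API.

```python
import math

def perspective(fov_y_deg, aspect, near, far):
    """Build a 4x4 OpenGL-style perspective matrix (row-major, assumed convention)."""
    f = 1.0 / math.tan(math.radians(fov_y_deg) / 2.0)
    return [
        [f / aspect, 0.0, 0.0, 0.0],
        [0.0, f, 0.0, 0.0],
        [0.0, 0.0, (far + near) / (near - far), 2.0 * far * near / (near - far)],
        [0.0, 0.0, -1.0, 0.0],
    ]

def transform(matrix, vertex):
    """Multiply a 4x4 matrix by a homogeneous vertex (x, y, z, w)."""
    return [sum(matrix[r][c] * vertex[c] for c in range(4)) for r in range(4)]

def to_screen(clip, width, height):
    """Perspective divide, then map normalized coordinates [-1, 1] to pixels."""
    x, y, z, w = clip
    ndc_x, ndc_y, ndc_z = x / w, y / w, z / w
    screen_x = (ndc_x + 1.0) * 0.5 * width
    screen_y = (1.0 - ndc_y) * 0.5 * height  # flip y: up in 3D, down on screen
    return screen_x, screen_y, ndc_z

# A vertex one unit in front of the camera, on the view axis.
proj = perspective(90.0, 1.0, 0.1, 100.0)
clip = transform(proj, [0.0, 0.0, -1.0, 1.0])
screen = to_screen(clip, 640, 480)
print(screen)  # lands at the center of the 640 x 480 screen
```

A real GPU runs this kind of transformation for many vertices at once; the loop-free, per-vertex form above is what makes that parallelism possible.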

As depicted in the following diagram, geometry processing starts with the vertex information delivered from the CPU to the GPU:


For the image we wish to process, thousands of triangles will suffice:


Triangles are also known as basic primitives because any image can be built from them. Why divide the figure into countless triangles rather than squares or other shapes?

The key reasons are:

1. The triangle is the simplest polygon, and any other polygon can be decomposed into triangles.

2. Three non-collinear points always determine a plane, whereas four or more points are not guaranteed to lie on the same plane.

Once the scene has been decomposed into many triangles, the GPU only needs to process these triangles in parallel. This is the job of the vertex shader.
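
Both reasons can be checked with a small sketch. The code below assumes a convex polygon and uses simple "fan" triangulation, which is only one of several possible decomposition methods; the helper names are hypothetical.

```python
def fan_triangulate(polygon):
    """Split a convex polygon (list of vertices) into triangles sharing vertex 0."""
    return [(polygon[0], polygon[i], polygon[i + 1])
            for i in range(1, len(polygon) - 1)]

def sub(a, b):
    return tuple(x - y for x, y in zip(a, b))

def cross(a, b):
    return (a[1] * b[2] - a[2] * b[1],
            a[2] * b[0] - a[0] * b[2],
            a[0] * b[1] - a[1] * b[0])

def dot(a, b):
    return sum(x * y for x, y in zip(a, b))

def is_coplanar(p0, p1, p2, p3, eps=1e-9):
    """Four points are coplanar when their scalar triple product is zero."""
    normal = cross(sub(p1, p0), sub(p2, p0))
    return abs(dot(normal, sub(p3, p0))) < eps

flat_quad = [(0, 0, 0), (1, 0, 0), (1, 1, 0), (0, 1, 0)]
bent_quad = [(0, 0, 0), (1, 0, 0), (1, 1, 0), (0, 1, 1)]  # last corner lifted

print(len(fan_triangulate(flat_quad)))  # a quad becomes 2 triangles
print(is_coplanar(*flat_quad))          # True
print(is_coplanar(*bent_quad))          # False: four points need not share a plane
```

Each triangle produced from the bent quad is still perfectly flat, which is why triangles are a safe common denominator for the hardware.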


A Vertex Processor

Modern GPUs have a geometry unit that can perform a wide range of tasks, including setup, vertex correction, variability adjustment, basic lighting adjustment, and related material adjustment. The geometry unit turns the abstract mathematics written by the programmer into concrete geometry in visual space and then moves it to the position the program requires, fine-tuning the shape of the model to achieve different effects.

GPUs did not originally have genuine geometry processing capabilities. The graphics card that preceded the GPU was little more than a paintbrush: it drew wherever the CPU pointed and whatever the CPU allowed. Although the first-generation GPU took transform and lighting away from the CPU, it still lacked the capacity to manipulate geometry fully. The CPU still controlled the positions of the vertices and performed all geometric shape computations, and once a shape had been decided it could not be changed.

Later, as model accuracy improved, the number of vertices climbed in lockstep with the number of polygons, eventually reaching a point where the CPU could no longer handle the load.

As a result, with the second-generation GPU, the Vertex Shader took the job of manipulating geometry away from the CPU entirely. The CPU is now left with only the most basic function of producing vertices.

The Vertex Shader gave the GPU the ability to alter the vertices of a model independently for the first time, and it was the first new capability the GPU gained beyond pixel processing and Triangle Setup.


The geometry processing procedure is as follows:

The first step is "tracing points": the geometry unit creates a visual space based on what we can see, taking the position of the screen as the starting point of the line of sight. Because the information the CPU provides about these points is merely their coordinates in space, the first action of geometry processing is to position the points according to those coordinates.


Process of generating vertices

Once the vertices have been placed, the setup unit must connect them. Connecting the vertices according to the correct rules determines the shape of the object. When you look up at the night sky, you see only individual stars; the constellations appear only once the stars are linked. This is why the setup unit outputs triangles.


Process of generating polygons

After the polygons have been formed, we move on to the most important part of the process: vertex operations. Changes in the image's features, such as facial expressions, are small alterations that leave most of the model unaffected. Defining and generating a new model for every expression, or even for every different shape in each frame, is clearly not cost-effective. Instead, the required change can be made by "pulling" the polygonal shape, that is, by moving vertices at the appropriate points.

To do this, the programmer defines the expressions ahead of time and uses equations to describe the positions of the individual vertices involved. The geometry unit's job is to move those vertices to the proper positions according to the equations while the other vertices and the model's overall position remain unchanged. This not only reduces the number of vertices the CPU must generate but also lets the programmer achieve the desired shape in a very flexible and free manner. The vertex operation phase can therefore be considered the most fundamental component of the complete geometry processing pipeline.
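
One way to picture this "pulling" of vertices is a morph target (also called a blend shape): a table of offsets for the few vertices that move, scaled by a weight. The sketch below assumes that technique purely for illustration; the article does not say which method a given GPU or engine uses, and `apply_morph` is a hypothetical helper.

```python
def apply_morph(base, deltas, weight):
    """Move only the vertices listed in `deltas`; all other vertices stay untouched.

    base   -- list of (x, y, z) vertex positions
    deltas -- {vertex_index: (dx, dy, dz)} offsets at full expression strength
    weight -- 0.0 (neutral) to 1.0 (full expression)
    """
    result = list(base)
    for index, (dx, dy, dz) in deltas.items():
        x, y, z = base[index]
        result[index] = (x + weight * dx, y + weight * dy, z + weight * dz)
    return result

# Hypothetical three-vertex "mouth": only the middle vertex moves to form a smile.
mouth = [(0.0, 0.0, 0.0), (1.0, 0.0, 0.0), (2.0, 0.0, 0.0)]
smile = {1: (0.0, -0.5, 0.0)}

half_smile = apply_morph(mouth, smile, 0.5)
print(half_smile)
```

The CPU only has to send the base mesh once; each frame needs just a new weight, which matches the point above about reducing the number of vertices the CPU must generate.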


Skinning and lighting

After the steps above, the image has a shape. Skinning and lighting come next. Vertex Texture and Vertex Lighting access the material library in advance and attach the most basic texture and lighting information to the whole model, as the program requires. This base layer of texture turns the wireframe model into something like a plaster statue. At this point the traditional geometry stage is essentially finished, and the graphics have been made concrete from the abstract realm of equations and symbols.

2. Rasterization

Rasterization is the process of converting geometric primitives into a two-dimensional image. It has two parts: the first determines which integer grid cells in window coordinates are occupied by each primitive; the second assigns a color value and a depth value to each of those cells.

Rasterization produces fragments. Each point on the two-dimensional image carries color, depth, and texture data, and such a point together with its associated information is called a fragment. Each fragment corresponds to a pixel in the frame buffer.

Why rasterize? The points and triangles of the geometry are three-dimensional data, but the graphics can ultimately only be presented on a two-dimensional screen, so a conversion from three dimensions to two is required.

The goal of rasterization is to find the pixels that a geometric primitive (such as a triangle) covers.

Based on the positions of the triangle's vertices, rasterization determines which pixels are needed to fill the triangle and what information each pixel should receive, such as UV coordinates. This is done by interpolating the vertex data.

Rasterization is the process of transforming geometric data into pixels by a sequence of transformations and then displaying them on a display device, as demonstrated in the diagram below:


Coordinate transformation and geometric discretization are at the heart of rasterization.

In Unity3D, for example, each art model is defined by its vertices and the triangles formed from them. When the 3D model is drawn on the screen, filling every pixel (grid cell) covered by a triangle, based on that triangle's three vertices, is what is called rasterization.
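
The two ideas in this section, finding the covered pixels and interpolating vertex data such as UV coordinates, can be sketched as follows. This is a deliberately naive version that tests every pixel center with edge functions and barycentric weights; real hardware is far more optimized, and the function names are hypothetical.

```python
def edge(a, b, p):
    """Twice the signed area of triangle a-b-p; the sign tells which side p is on."""
    return (b[0] - a[0]) * (p[1] - a[1]) - (b[1] - a[1]) * (p[0] - a[0])

def rasterize(v0, v1, v2, uv0, uv1, uv2, width, height):
    """Return {(x, y): (u, v)} for every pixel whose center lies in the triangle."""
    fragments = {}
    area = edge(v0, v1, v2)
    if area == 0:
        return fragments  # degenerate triangle covers no pixels
    for y in range(height):
        for x in range(width):
            p = (x + 0.5, y + 0.5)  # sample at the pixel center
            # Barycentric weights: how close p is to each vertex.
            w0 = edge(v1, v2, p) / area
            w1 = edge(v2, v0, p) / area
            w2 = edge(v0, v1, p) / area
            if w0 >= 0 and w1 >= 0 and w2 >= 0:
                # Interpolate the per-vertex UV coordinates for this fragment.
                u = w0 * uv0[0] + w1 * uv1[0] + w2 * uv2[0]
                v = w0 * uv0[1] + w1 * uv1[1] + w2 * uv2[1]
                fragments[(x, y)] = (u, v)
    return fragments

# A right triangle covering the lower-left half of a 4 x 4 pixel grid.
frags = rasterize((0, 0), (4, 0), (0, 4),
                  (0.0, 0.0), (1.0, 0.0), (0.0, 1.0), 4, 4)
print(len(frags))     # 10 pixel centers fall inside or on the edge
print(frags[(0, 0)])  # (0.125, 0.125)
```

Because each pixel is tested independently of the others, the inner loop is exactly the kind of work a GPU can spread across many parallel units.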

3. Pixel processing

Pixel processing operates on each pixel of the two-dimensional image that rasterization produces. Once textures have been mapped onto the pixels, surface lighting is computed; visual effects such as diffuse reflection, refraction, and semi-reflection between surfaces are all part of pixel processing.


Before the advent of programmable shaders, what people could do to pixels amounted to little more than coloring. We were not as fortunate then as we are now. Special effects could only be produced directly by fixed-function units. The specific effects available in each API generation had to be hard-wired into fixed instructions ahead of time, and those hard-wired instructions limited what the hardware could do to pixels. Once pixels entered the pipeline, the programmer lost control of them. The introduction of programmable shaders, particularly the Pixel Shader, was therefore a huge deal at the time: for the first time, programmers could control precisely and arbitrarily what special effects they wanted at the pixel level.

The desire for precise pixel-level control is easy to understand. The appearance of a color is determined by the mix of the three primary colors and by its transparency, and in a computer the three primary colors and the transparency can all be measured and stated precisely as numbers.
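
Since a pixel is just four numbers (red, green, blue, and transparency), a pixel shader can be thought of as a small function that is run once per pixel. The sketch below uses simple Lambert diffuse lighting as the example effect; that choice, and the name `diffuse_shader`, are assumptions made for illustration rather than anything specific to a real shading language.

```python
import math

def normalize(v):
    """Scale a 3D vector to unit length."""
    length = math.sqrt(sum(c * c for c in v))
    return tuple(c / length for c in v)

def diffuse_shader(base_rgba, normal, light_dir):
    """Scale the RGB channels by the Lambert term max(0, N.L); keep alpha as is."""
    n = normalize(normal)
    l = normalize(light_dir)
    intensity = max(0.0, sum(a * b for a, b in zip(n, l)))
    r, g, b, a = base_rgba
    return (r * intensity, g * intensity, b * intensity, a)

orange = (1.0, 0.5, 0.0, 1.0)  # red, green, blue, alpha in the range 0..1

lit = diffuse_shader(orange, (0.0, 0.0, 1.0), (0.0, 0.0, 1.0))    # light faces the surface
dark = diffuse_shader(orange, (0.0, 0.0, 1.0), (0.0, 0.0, -1.0))  # light is behind it
print(lit, dark)
```

With fixed-function hardware the formula inside such a function was frozen in silicon; a programmable pixel shader lets the programmer replace it with any formula.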


If pixel processing is all about processing pixels, why do we first need to cover the object's surface with a layer of pre-baked material as a base? After all, most of the colors on these materials are not the final ones, and the ALU must still calculate and alter them. Why not let the pixel processing unit generate the correct pixels in the correct positions directly? That would avoid a seemingly redundant step, and the transistors spent on the material (texture) units could be used to strengthen the ALU section instead, giving it more powerful functions.

The reason is simple: the existing ALU cannot handle the amount of computation required to generate pixels directly.

The pixel processing procedure itself is straightforward. Its difficulty lies in the sheer amount of computation needed to evaluate mathematical relationships across a vast number of pixels. It is precisely the demand for execution units that handle these relationships more efficiently that has driven the repeated redesigns of the shader over the last ten years.

4. Render output

The rendering output unit is the ROP (Render Output Processor).

The ROP unit's job is to process special effects like fog, perform sampling and anti-aliasing operations, and blend all image elements into the final picture before output.

Rendering is the first step of the render output process. Many people regard the term rendering with awe, but it does not deserve it. Rendering is nothing more than combining the image elements that were produced earlier. The seemingly intricate work done by the geometry processing unit, rasterization, the TMU, and the shader units simply lays the groundwork for it. Like other graphics processing steps, the actual blending in rendering is quite basic.


Aside from the Z-value tests that ensure the output pixels are correct, the blended output process rarely involves any other difficult operations. Because its workload was comparatively light, the ROP unit had room to take on more. As graphical demands rose, the ROP unit soon gained a new function: assisting with full-screen anti-aliasing (FSAA).

Anti-aliasing's work of mixing, averaging, and assigning new colors to pixels is not one of the ROP's traditional tasks. Because of the way anti-aliasing works, however, this work had to be done in the ROP unit.

Anti-aliasing starts by enlarging the image. First the whole image (supersampling, SSAA) or only the object edges (multisampling, MSAA) is sampled at a higher resolution. Then, at object edges with strong color contrast, each pixel and its neighbors are taken and blended, giving a color transition that is more natural, though blurrier, than the original. Finally, the image is shrunk back to its original size, which removes the visible blurring of the color transition.
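
The supersampling idea (sample more finely, then average back down) can be sketched with a tiny example. The scene here is a hypothetical one in which everything on one side of a diagonal edge is covered, and the averaging is a plain box filter; both are assumptions chosen to keep the sketch short.

```python
def covered(x, y):
    """Hypothetical scene: the region where x > y is covered by an object."""
    return x > y

def render(width, height, scale):
    """Take scale*scale samples per pixel and average them (box filter).

    scale=1 means no anti-aliasing: one sample at the pixel center.
    """
    image = []
    for py in range(height):
        row = []
        for px in range(width):
            hits = 0
            for sy in range(scale):
                for sx in range(scale):
                    # Sub-sample positions spread evenly inside the pixel.
                    x = px + (sx + 0.5) / scale
                    y = py + (sy + 0.5) / scale
                    if covered(x, y):
                        hits += 1
            row.append(hits / (scale * scale))
        image.append(row)
    return image

print(render(2, 2, 1))  # only hard 0.0 / 1.0 values: a jagged edge
print(render(2, 2, 2))  # pixels on the edge get intermediate values
```

The intermediate values on the edge pixels are what soften the staircase effect. MSAA follows the same idea but, roughly speaking, spends the extra samples only where coverage changes, which is why it is cheaper than supersampling the whole image.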


The following is a typical ROP structure:


The textures fetched by the TMU and the pixels processed by the shader are first sent to the corresponding z/stencil buffer, where the ROP unit performs z/stencil tests on them. Although the model has already been rasterized and the actual Z axis no longer exists, the depth information it contained is retained. By examining the depth and stencil information, the ROP decides whether a given pixel should be displayed. This prevents fully occluded pixels from being wrongly displayed in front, and it lightens the load on the color output stage that follows. Because of this depth testing and culling, and because of the misleading name Raster Operations Unit, many people believe that rasterization is carried out in the ROP unit. In fact, rasterization and the ROP unit have no direct relationship: rasterization is the 3D-to-2D coordinate projection of the model, while the ROP handles pixel blending and output.

Once all pixels have passed the depth test and related steps, those within a defined range of depth values are sent to the alpha unit for a transparency test. Transparency and transparency blending are critical for effects like fog and volumetric light, so the alpha test is almost as significant as the depth test. The ROP then performs alpha blending on particular pixels in the blend units, as the program requires.
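
The depth test and alpha blending described above can be sketched together. The version below assumes that smaller depth values are closer to the viewer, uses the common "source over destination" blend formula, and lets only fully opaque fragments update the depth buffer; these are widespread conventions but are assumptions here, and `write_fragment` is a hypothetical helper.

```python
def write_fragment(color_buffer, depth_buffer, x, y, rgba, depth):
    """Depth-test one fragment, then alpha-blend it over the stored color."""
    if depth >= depth_buffer[y][x]:
        return False  # occluded: something closer was already drawn here
    r, g, b, a = rgba
    dr, dg, db = color_buffer[y][x]
    # "Source over" blending: new color weighted by alpha, old color by 1 - alpha.
    color_buffer[y][x] = (a * r + (1 - a) * dr,
                          a * g + (1 - a) * dg,
                          a * b + (1 - a) * db)
    if a == 1.0:
        depth_buffer[y][x] = depth  # assumption: only opaque fragments write depth
    return True

# A 1 x 1 "screen": black background, nothing drawn yet.
colors = [[(0.0, 0.0, 0.0)]]
depths = [[float("inf")]]

drew_red = write_fragment(colors, depths, 0, 0, (1.0, 0.0, 0.0, 1.0), 0.5)   # opaque red
drew_blue = write_fragment(colors, depths, 0, 0, (0.0, 0.0, 1.0, 1.0), 0.8)  # behind red
drew_glass = write_fragment(colors, depths, 0, 0, (0.0, 1.0, 0.0, 0.5), 0.2) # green glass
print(drew_red, drew_blue, drew_glass, colors[0][0])
```

The blue fragment is rejected before any color work is done, which is the saving the depth test provides to the later output stages.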


Following the preceding processes, the remaining pixels will be filled into the 2D model's needed range, which is our standard Pixel Fillrate process. Pixel Fillrate is similar to a large pot. The pixels that appear as raw materials, such as shredded pork, green peppers, winter bamboo shoots, and onion, will be correctly blended here, along with ginger, salt, sugar, and bean paste. Following the mixing, the previously isolated information contained in the graphic elements will be released, much like the aroma produced by the interaction of the ingredients, and eventually form an image that we can accept.

Because the shader has already computed the pixel effects mathematically, if no AA operation is requested the rendering task is complete at this point: blending and filling all the effects yields the correct picture, which is sent to the output buffer to await output. If the program calls for AA, such as MSAA, the ROP's AA unit must also sample the filled screen multiple times and then blend the colors of the sampled pixels. The finished frame is then sent to the frame buffer, where it waits to be output to the screen.


Ⅲ. Applications for the GPU

a. Airborne and other military displays, geographic information display, and virtual reality

b. Bioengineering visualization, medical imaging, and exploration imaging

c. Artificial intelligence and supercomputing

Ⅳ. The GPU's Evolution

A. The Evolution of Performance

Higher thread-level parallelism

More stream processor units

More general-purpose registers on the chip

A larger pool of shared memory

More cache levels on the chip

Higher memory bandwidth

B. The integration of GPU and CPU is becoming more common.

Advances in semiconductor manufacturing processes provide the technical basis for CPU and GPU integration, and application demand pushes the integration forward as well. After AMD acquired ATI in 2006, the GPU and CPU began to converge.

Advantages of having a single chip with both a CPU and a GPU:

Lower latency and lower overall power consumption

Smaller size and lower power, which matter even more in the mobile industry

In June 2010, AMD introduced the APU for embedded systems, which combines x86 processing cores with a graphics engine on a single silicon chip.

Intel's Ivy Bridge core combines Intel's integrated graphics processor with x86 cores on a single silicon die.

NVIDIA also introduced the Tegra line of chips, which comprise ARM multi-core processors, integrated GPU cores, MPEG codecs, MP3 decoders, image signal processors, and other components.

ARM introduced a unified processor that combines the CPU and GPU, based on its ARM processor and MALI series GPU.

At their core, these designs pair two heterogeneous processors, a CPU and a GPU, bridged together in some way. This drastically cuts development costs and speeds up development, but the resulting architecture is not fundamentally different from a traditional computer structure.

Ⅴ. Frequently Asked Questions about GPU

1. What is a GPU in a computer?

The GPU is the graphics processor. It is the heart of the graphics card, playing a role equivalent to that of the CPU in the computer. It determines the grade and most of the performance of the graphics card.

2. Does every computer have a GPU?

Yes. The GPU is the display chip. Whether it uses integrated graphics or a discrete graphics card, a computer must have a GPU; otherwise it will not be able to display anything.

3. What does GPU stand for?

GPU stands for Graphics Processing Unit.