Hello community!
This time I would like to share an interesting article on a topic that I have not seen much information about in the Forum, and that I consider very important in the era of Artificial Intelligence and Big Data: Heterogeneous Computing. Without further ado, I hope you find it interesting.
:::::::::::--------:::::::::::::::--------
In the Internet industry, with the spread of computerization, the rapid growth in the volume of data has created new requirements for storage space. At the same time, the rise of machine learning, Artificial Intelligence, autonomous driving, industrial simulation, and other fields means that general-purpose CPUs increasingly run into performance bottlenecks when handling massive computations and massive amounts of data and images, such as low parallelism, insufficient bandwidth, and high latency.
To meet the needs of diversified computing, more and more scenarios have begun to introduce hardware such as GPUs and FPGAs for acceleration, and Heterogeneous Computing has emerged as a result. It mainly refers to the way of computing used by a system made up of computing units with different types of instruction sets and architectures.

Heterogeneous Computing Architecture
Heterogeneous computing technology was born in the 1980s. "Heterogeneous" means that various computing units such as CPUs, DSPs, GPUs, ASICs, coprocessors, and FPGAs, which use different types of instruction sets and different architectures, are combined into a hybrid system that performs specialized calculations. This approach is called "heterogeneous computing".
Heterogeneous computing shows particular promise in the field of Artificial Intelligence. As we all know, AI places extremely high demands on computing power. Today, heterogeneous computing represented by GPUs has evolved into a new generation of computing architecture that accelerates AI innovation.
Why do we need heterogeneous computing?
When it comes to computing, we generally think of CPUs, but CPUs are general-purpose processors and are constrained by Moore's Law. With the development of diversified computing, and especially the growing variety of application types, CPUs are "powerless" to process certain kinds of calculations efficiently. Introducing specialized units turns the computer system into a hybrid structure, in which each type of computing unit can perform the tasks it is best at.
Although the CPU is not strong at heavy computation, it is good at management and scheduling, such as reading data, managing files, and human-computer interaction, and it has many routines and auxiliary tools available. The GPU is weaker at management and stronger at computation; because it runs many operations concurrently, it is better suited to algorithms that stream-process whole blocks of data. The FPGA can both manage and compute, but its development cycle is long and complex algorithms are difficult to develop for it. It is suitable for stream-processing algorithms, whether the data arrives as a complete block or item by item. In terms of real-time performance, the FPGA is the best of the three.
So when demand for massive computing power arrives, as it has with Artificial Intelligence, it is natural for GPUs and FPGAs to carry out the calculations alongside the CPU.
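The division of labor described above can be sketched as a simple dispatch rule. This is only an illustration in Python: the `Task` fields, the `choose_device` function, and the rules themselves are hypothetical simplifications of the trade-offs in the text, not a real scheduler or any vendor API.

```python
from dataclasses import dataclass


@dataclass
class Task:
    name: str
    control_heavy: bool    # file management, I/O handling, user interaction
    data_parallel: bool    # same operation applied to a whole block of data
    hard_real_time: bool   # strict latency requirement on streaming data


def choose_device(task: Task) -> str:
    """Pick a computing unit following the rough trade-offs in the article."""
    if task.control_heavy:
        return "CPU"       # best at management and scheduling
    if task.hard_real_time:
        return "FPGA"      # best real-time behavior for stream processing
    if task.data_parallel:
        return "GPU"       # strongest at block-wise parallel computation
    return "CPU"           # general-purpose fallback


# Hypothetical example tasks, chosen only to exercise each rule.
tasks = [
    Task("read configuration files", True, False, False),
    Task("train a neural network", False, True, False),
    Task("filter a live sensor stream", False, True, True),
]
assignments = {t.name: choose_device(t) for t in tasks}
for name, device in assignments.items():
    print(f"{name} -> {device}")
```

In a real system the decision would also depend on data transfer costs, available memory, and development effort, which this sketch ignores.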
The two main factions of heterogeneous computing: GPU and FPGA
The heterogeneous computing platforms that we are most familiar with are the "CPU + GPU" and "CPU + FPGA" architectures. The biggest advantage of these typical heterogeneous architectures is that they offer higher efficiency and lower latency than traditional CPU-only parallel computing. As industry demand for computing performance keeps growing, heterogeneous computing becomes increasingly important.

CPU + GPU heterogeneous computing architecture
As we all know, compared with the CPU, both the GPU and the FPGA have many advantages. The GPU offers greater parallelism, higher peak computing power per machine, and greater computing efficiency. The advantages of the FPGA lie mainly in its higher performance per watt, better performance on unstructured data, stronger hardware acceleration, and lower device interconnection latency.
Currently, the most widely used form of heterogeneous computing is GPU acceleration. Mainstream GPUs all use a unified architecture of compute units. With a large array of programmable stream processors, GPUs leave the CPU far behind in single-precision floating-point operations. GPU makers, led by Nvidia and AMD, have promoted the GPU as a way to greatly accelerate general-purpose computing, and have introduced GPUs designed for it, known as GPGPU (General-Purpose GPU). For a time, the entire industry talked about GPU computing.
In addition to GPUs, FPGAs have become a hot spot in the semiconductor industry in recent years. As a high-performance, low-power programmable chip, an FPGA can be tailored to implement specific algorithms according to customer needs. So when it comes to massive amounts of data, the FPGA has advantages over the CPU and GPU: it has higher computational efficiency and it sits closer to the I/O.
An FPGA does not execute instructions the way a processor runs software; it is a device in which software and hardware are integrated. An FPGA is programmed with a hardware description language, and the logic described in that language is synthesized directly into a configuration of the chip's logic circuits. The FPGA therefore implements the user's algorithm directly in hardware, without translating it through an instruction set.
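To make the idea of logic without instructions more concrete, here is a small Python sketch of a lookup table (LUT), a basic building block that FPGAs commonly use to implement logic. Everything here (the `make_lut` helper, the gate names, and the half adder) is illustrative only; a real design would be written in Verilog or VHDL and processed by the vendor's tools.

```python
def make_lut(truth_table):
    """Return a 2-input logic function defined purely by 4 configuration bits.

    truth_table[i] is the output for inputs (a, b), where i = a * 2 + b.
    Changing the bits changes the logic function, which mirrors how an
    FPGA is reconfigured.
    """
    if len(truth_table) != 4 or any(bit not in (0, 1) for bit in truth_table):
        raise ValueError("a 2-input LUT needs exactly 4 bits, each 0 or 1")
    bits = tuple(truth_table)

    def lut(a, b):
        return bits[a * 2 + b]

    return lut


# Two "configured" LUTs: the same structure, different configuration bits.
xor_gate = make_lut([0, 1, 1, 0])
and_gate = make_lut([0, 0, 0, 1])


def half_adder(a, b):
    """Wire two LUTs together: sum = a XOR b, carry = a AND b."""
    return xor_gate(a, b), and_gate(a, b)


for a in (0, 1):
    for b in (0, 1):
        print(a, b, half_adder(a, b))
```

On an actual FPGA many such blocks operate at the same time, which is where the parallelism comes from; the Python loop above only simulates their behavior one case at a time.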

CPU + FPGA heterogeneous computing architecture
Of course, in addition to the GPU and FPGA, the ASIC is also an option. An ASIC is a special-purpose chip: unlike a traditional general-purpose chip, it is custom-built for a specific need. The computing power and computing efficiency of an ASIC can be tailored to the needs of the algorithm. Therefore, compared with general-purpose chips, the ASIC has the following advantages: small size, low power consumption, high computing performance, high computing efficiency, and a cost that falls as the volume of chips shipped rises. But the shortcomings are also obvious: the algorithm is fixed, and the chip may become unusable once the algorithm changes.
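The point that ASIC cost falls as volume rises can be shown with simple arithmetic: a one-time development cost is spread over every chip shipped. The figures below are made-up placeholders, not real prices, and `unit_cost` is a hypothetical helper written for this example.

```python
def unit_cost(one_time_cost, per_chip_cost, volume):
    """Average cost per chip: amortized one-time cost plus per-chip cost."""
    if volume <= 0:
        raise ValueError("volume must be positive")
    return one_time_cost / volume + per_chip_cost


# Placeholder numbers chosen only for illustration.
ONE_TIME_COST = 10_000_000  # design, verification, and other up-front work
PER_CHIP_COST = 5           # manufacturing cost of each chip

for volume in (10_000, 100_000, 1_000_000):
    cost = unit_cost(ONE_TIME_COST, PER_CHIP_COST, volume)
    print(f"{volume:>9} chips -> {cost:.2f} per chip")
```

With these placeholder values the average cost per chip drops from 1005.00 to 15.00 as volume grows a hundredfold, which is why an ASIC only pays off when the market is large enough.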
At present, Artificial Intelligence is in a period of explosive growth, and a large number of new algorithms are constantly emerging. The field is far from reaching an algorithmic plateau. How to adapt special-purpose ASIC chips to such a variety of algorithms is the biggest problem.
Different processor chips each have their own distinctive features in building heterogeneous computing systems. There is a great deal of open source and application software in the CPU and GPU fields, and any new technology will first use the CPU to implement its algorithms. CPU programming resources are therefore abundant and easy to obtain, the development cost is low, and the development cycle is short. FPGAs are programmed using Verilog, VHDL, and other low-level hardware description languages, which requires developers to have a deeper understanding of the chip's characteristics, but the high parallelism of FPGAs can often bring orders-of-magnitude improvements in business performance. At the same time, the FPGA is dynamically reconfigurable: after deployment in the data center, different logic can be configured to provide different hardware acceleration functions according to business needs.
ASIC chips can achieve the best performance, that is, high area utilization, high speed, and low power consumption. But ASIC development is extremely risky, a large enough market is needed to recover the cost, and the time from R&D to market is very long. ASICs are therefore not well suited to areas where algorithms are iterating rapidly, such as deep learning and CNNs (Convolutional Neural Networks).
Conclusion
The current trend toward diversified computing is unstoppable, and CPUs alone can no longer meet the demand for computing power. As a result, heterogeneous computing represented by the GPU and FPGA has become popular. In particular, with the spread of new technologies such as Artificial Intelligence, Big Data, and IoT, application types are diversifying and their computing requirements are diverging. Whether it is "CPU + GPU" or "CPU + FPGA", the goal is to better meet customized computing needs.
It is foreseeable that, as the computing industry evolves, heterogeneous computing will have broad room for development, and we will see more and more heterogeneous computing architectures playing an increasingly important role in putting applications into practice.
Source (in Chinese): https://forum.huawei.com/enterprise/zh/thread-448777-1-1.html
That is the end of the article. I would also like to add that Huawei now has its own line of equipment that takes advantage of this technology: the FusionServer Pro G series. The Huawei FusionServer Pro G series are heterogeneous servers for data centers. They apply to scenarios such as artificial intelligence (AI), high performance computing (HPC), databases, and video analytics.
I hope you liked it. Feel free to leave your impressions in the comments section.
#EnterpriseCommunity
#OneHuawei
#MVE





