[ACL] Performance behavior of using aclrtMallocHost() vs malloc()

Created: Jan 25, 2021 09:53:04 | Latest reply: Jan 26, 2021 12:10:28
  HiCoins as reward: 0 (problem unresolved)

I developed a single-device micro-benchmark, based on ACL and targeting the Atlas 800-3000, that implements a matrix-matrix multiplication. The workflow is as follows:


  1. Allocate data on the host

  2. Allocate data on the device

  3. Initialize data on the host

  4. Transfer data from the host to the device

  5. Launch aclblasGemmEx()

  6. Synchronize to wait for the completion of the task in 5)

  7. Copy back data from the device to the host


I measured the elapsed time of step 5) and I see very low performance when I use the normal libc malloc() instead of aclrtMallocHost() to allocate host memory in step 1). This surprises me, since the measurement covers only step 5) and does not include the data transfers. I ran the application through the profiler and found that, when I use malloc(), the runtime makes unexpected calls to HostMalloc during kernel launch (see the attached screenshot of the MindStudio profile).
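
For readers who want to reproduce this kind of measurement, here is a minimal sketch of timing one region in isolation with a monotonic clock. It is not the poster's code: `time_region`, `timed_fn`, `gemm_ctx` and `host_gemm` are hypothetical names, and `host_gemm` is a plain host-side stand-in so that the sketch compiles without ACL. In the real benchmark the callback would wrap the aclblasGemmEx() launch; if the launch returns before the work has finished, the synchronization of step 6) has to sit inside the timed region as well.

```c
#define _POSIX_C_SOURCE 200809L
#include <time.h>

/* Hypothetical callback type: wraps the work to be timed. Returns 0 on success. */
typedef int (*timed_fn)(void *ctx);

/* Runs fn(ctx) once and returns the elapsed wall-clock time in seconds,
 * or a negative value if the clock or the callback failed. */
double time_region(timed_fn fn, void *ctx)
{
    struct timespec t0, t1;

    if (clock_gettime(CLOCK_MONOTONIC, &t0) != 0)
        return -1.0;
    int rc = fn(ctx);
    if (clock_gettime(CLOCK_MONOTONIC, &t1) != 0 || rc != 0)
        return -1.0;

    return (double)(t1.tv_sec - t0.tv_sec)
         + (double)(t1.tv_nsec - t0.tv_nsec) * 1e-9;
}

/* Stand-in workload (an assumption, not ACL): naive row-major n x n matmul
 * on the host, C = A * B. */
struct gemm_ctx {
    int n;
    const float *a;
    const float *b;
    float *c;
};

int host_gemm(void *p)
{
    struct gemm_ctx *g = p;

    for (int i = 0; i < g->n; ++i) {
        for (int j = 0; j < g->n; ++j) {
            float acc = 0.0f;
            for (int k = 0; k < g->n; ++k)
                acc += g->a[i * g->n + k] * g->b[k * g->n + j];
            g->c[i * g->n + j] = acc;
        }
    }
    return 0;
}
```

Calling `time_region(host_gemm, &ctx)` once with malloc() buffers and once with aclrtMallocHost() buffers (with the callback swapped for the real launch) gives the two numbers being compared in this thread.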


[Attached screenshot: Profiling Matrix-Matrix Multiplication]


Using aclrtMallocHost() gives a 35% performance improvement over malloc().


Do you have any feedback on this behavior? In particular, what does aclrtMallocHost() do differently from malloc()? And why are there HostMalloc calls during kernel launch (note that these calls happen only when malloc() is used)?


Featured Answers

Recommended answer

little_fish
Admin Created Jan 26, 2021 07:07:38

Dear,
aclrtMallocHost() is optimized: the pin operation is performed in advance to facilitate subsequent DMA operations. With malloc(), by contrast, only an address is allocated in advance; when the memory is actually used, space is allocated in the heap and only then is the DMA performed.
Thank you.
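
The difference between memory that is prepared in advance and memory that is only backed when first used can be observed on a Linux host with plain POSIX calls. The sketch below is an analogy under stated assumptions, not a description of what aclrtMallocHost() does internally: it uses mlock() as a stand-in for pinning and counts minor page faults with getrusage(). `minor_faults` and `first_touch_faults` are hypothetical helper names.

```c
#define _POSIX_C_SOURCE 200809L
#include <stdlib.h>
#include <sys/mman.h>
#include <sys/resource.h>
#include <unistd.h>

/* Minor page faults taken by this process so far, or -1 on error. */
long minor_faults(void)
{
    struct rusage ru;

    if (getrusage(RUSAGE_SELF, &ru) != 0)
        return -1;
    return ru.ru_minflt;
}

/* Allocates `bytes` with malloc(), optionally mlock()s the buffer up front
 * (stand-in for pinning, an assumption), then writes one byte per page and
 * returns how many minor page faults that first touch cost.
 * Returns -1 if the allocation fails. *pinned is set to 1 only if mlock()
 * succeeded; it can fail when RLIMIT_MEMLOCK is small. */
long first_touch_faults(size_t bytes, int pin_first, int *pinned)
{
    *pinned = 0;

    char *buf = malloc(bytes);
    if (buf == NULL)
        return -1;

    if (pin_first && mlock(buf, bytes) == 0)
        *pinned = 1;

    long page = sysconf(_SC_PAGESIZE);
    if (page <= 0)
        page = 4096;                    /* fallback guess */

    long before = minor_faults();
    volatile char *p = buf;             /* volatile: keep the writes */
    for (size_t i = 0; i < bytes; i += (size_t)page)
        p[i] = 1;
    long after = minor_faults();

    if (*pinned)
        munlock(buf, bytes);
    free(buf);
    return after - before;
}
```

On a typical Linux system a large unpinned buffer takes roughly one fault per page on first touch, while a buffer that was successfully locked beforehand takes few or none, because the pages were already made resident by mlock(). The exact counts depend on the allocator and kernel settings.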

All Answers
Hello, dear!
It's nice to meet you in the community.
We're working on your problem. Please be patient.

Hi, aclrtMallocHost() is optimized: it performs a pin operation, whereas malloc() just allocates an address.

Posted by little_fish at 2021-01-26 07:07 Dear, AclrtMallocHost is optimized. The pin operation is performed in advance to facilitate subsequen ...
Hi,

Thank you for this comment. I understand that aclrtMallocHost() pins the memory and therefore accelerates memory transfers between host and device. What I still do not understand is why the runtime makes HostMalloc calls during kernel launch, as shown in the profiler screenshot. I do not expect that, since the memory is already allocated and initialized on the host before the kernel launch.

Thanks in advance.

Posted by moostaf at 2021-01-26 08:50 Hi, Thank you for this comment. I understand that aclrtMallocHost() does memory pinning and theref ...
Dear, please see this document:
https://support.huaweicloud.com/intl/en-us/adevg-A500pro_3000/atlasdevelopment_01_0153.html