I have a problem with an Atlas 300I (Model 3000) card. I completed the initial installation steps, including the OS dependencies, NPU driver, and NPU firmware, and every step passed without any problem. At the end of the installation, however, the “npu-smi info” command shows only 1 Ascend 310 chip, whereas according to the white paper the card has 4 Ascend 310 chips. This is pretty strange.
Here is my setup, along with some queries and their outputs:
Server Model = TaiShan 2280
OS Info = Ubuntu-18.04.6 LTS
Atlas 300I Card Info = Atlas300I - Model 3000
NPU Driver Version = 21.0.2
NPU Firmware Version = 1.78.23.33.230
NPU Driver&Firmware Download Link = https://support.huawei.com/enterprise/zh/software/253276357-ESW2000387090
miniD 1 to MCU iic channel status: Fault
miniD 3 to MCU iic channel status: OK
MCU to miniD 0 iic channel status: OK
MCU to miniD 1 iic channel status: OK
MCU to miniD 2 iic channel status: OK
MCU to miniD 3 iic channel status: OK
slot6_card6 health state is Fault
Product Type : Atlas 300I Model 3000
2021-12-16 09:39:20 slot6-chip0 query slot6-chip0 pci_switch failed 8
npu-smi info query output =
As you can see, only 1 Ascend 310 chip is visible. I searched for this error on the "ascend/modelzoo" Gitee page, "bbs.huaweicloud", and the "Ascend Developer Zone" group, but I did not find any solution.
So, what can I do at this stage? I look forward to your help with this issue. If you have any idea how to solve this problem, I would be happy to hear it.
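For anyone comparing a similar log, here is a minimal sketch of how the iic channel status entries quoted above could be scanned for faults. The entry format is assumed from that excerpt only, and `find_faulty_channels` is a hypothetical helper, not part of npu-smi or any Huawei tool.

```python
import re

# Matches entries such as "MCU to miniD 0 iic channel status: OK".
# The format is inferred from the log excerpt in this post, not from
# official documentation.
CHANNEL_RE = re.compile(
    r"(miniD \d+ to MCU|MCU to miniD \d+) iic channel status: (\w+)"
)


def find_faulty_channels(log_text):
    """Return the names of channels whose reported status is not OK."""
    return [
        name
        for name, status in CHANNEL_RE.findall(log_text)
        if status != "OK"
    ]


# A few entries copied from the log in this post.
sample = (
    "miniD 1 to MCU iic channel status: Fault "
    "miniD 3 to MCU iic channel status: OK "
    "MCU to miniD 0 iic channel status: OK"
)
print(find_faulty_channels(sample))
```

Because the pattern is applied with `findall`, it works whether the entries arrive one per line or run together on a single line, as they do in the pasted output.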
Hello Kubilay, you have raised an issue (https://gitee.com/ascend/modelzoo/issues/I4MUTQ#note_7909615) and have been added to the internal WeLink group, where the problem is being handled.
Hello, dear. It's nice to meet you in the community. We're working on getting the right answer for you. Please rest assured that we'll be back with an answer shortly.
The Atlas 300I (Model 3000) inference card is not compatible with the TaiShan 100 server. That was the cause of the problem, and the issue has now been resolved. Thanks to everyone who contributed.