
FusionMind Common Problems in Traffic Policing and Non-Scenario Deployment

Latest reply: Jun 5, 2019 03:45:52

Version 6.5.RC2


Question 1: In the New Deployment Environment, Only One P4 Card Can Be Allocated to a Task Although Multiple P4 Cards Are Available

1.1 Problem Description:

The GPU server shows multiple P4 cards, but only one P4 card can be allocated to a task. If more than one P4 card is allocated, a message is displayed indicating that the task fails.

1.2 Possible Causes:

1. GPU resources are occupied by other processes. As a result, the allocated GPU resources are insufficient.

2. GPU configuration parameters of cluster nodes are incorrect.

1.3 Fault Locating

1. Log in to the GPU server in the background and check whether other processes are running on the GPUs.

2. Check GPU configuration parameters of nodes in the cluster.
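
The first check can be scripted. The following is a minimal sketch, assuming the NVIDIA driver and the nvidia-smi tool are installed on the GPU server; the function names are illustrative and are not part of FusionMind. It lists the compute processes that currently hold GPU memory.

```python
import subprocess

# nvidia-smi query that lists compute processes currently using the GPUs.
QUERY = [
    "nvidia-smi",
    "--query-compute-apps=pid,process_name,used_memory",
    "--format=csv,noheader,nounits",
]


def parse_compute_apps(csv_text):
    """Parse CSV rows into (pid, process_name, used_memory_mib) tuples.

    used_memory_mib is None when nvidia-smi does not report a number.
    """
    apps = []
    for line in csv_text.splitlines():
        line = line.strip()
        if not line:
            continue
        # The first field is the PID, the last is the memory; the rest is the name.
        pid, rest = line.split(",", 1)
        name, mem = rest.rsplit(",", 1)
        mem = mem.strip()
        apps.append((int(pid), name.strip(), int(mem) if mem.isdigit() else None))
    return apps


def list_gpu_processes():
    """Run nvidia-smi on the local server and return the parsed process list."""
    result = subprocess.run(QUERY, capture_output=True, text=True, check=True)
    return parse_compute_apps(result.stdout)
```

If list_gpu_processes() returns a non-empty list on the GPU server, other processes are occupying GPU resources, which matches possible cause 1.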

1.4 Solution

1. Log in to FusionInsight Manager and choose Service Management > Yarn > Service Configuration > All Configuration.

2. Enter gpu in the search box and modify the yarn.scheduler.maximum-allocation-gpus parameter. The default value is 1. Change it to a value greater than the number of P4 cards configured on the platform.
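
To pick a suitable value, it helps to know how many P4 cards a node actually has. The following is a minimal sketch, assuming nvidia-smi is available on the GPU server and reports the cards with "P4" as a separate word in the name (for example "Tesla P4"). The helper names are hypothetical, and the comparison follows the rule stated in step 2.

```python
import subprocess


def count_p4_cards(csv_text):
    """Count GPU names that contain the word 'P4' (so 'Tesla P40' is not counted)."""
    names = [line.strip() for line in csv_text.splitlines() if line.strip()]
    return sum(1 for name in names if "P4" in name.split())


def limit_allows_all_cards(max_allocation_gpus, p4_cards):
    """Apply the rule from step 2: the parameter must be greater than the card count."""
    return max_allocation_gpus > p4_cards


def query_gpu_names():
    """Return one GPU name per line, as reported by nvidia-smi on the local server."""
    result = subprocess.run(
        ["nvidia-smi", "--query-gpu=name", "--format=csv,noheader"],
        capture_output=True, text=True, check=True,
    )
    return result.stdout
```

For example, with the default value of 1 and two P4 cards on the server, limit_allows_all_cards(1, count_p4_cards(query_gpu_names())) returns False, so the parameter needs to be raised.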


Question 2: The Task Start Time Is Too Long

2.1 Problem Description:

The algorithm is deployed on the FusionMind platform, but the task takes more than 10 minutes to start. In addition, logs cannot be viewed during startup, and the system displays a message indicating that the operation fails.

2.2 Possible Causes:

1. The FusionMind platform service is abnormal.

2. When a task is created, the allocated resources exceed the remaining resources in the resource pool.

2.3 Fault Locating

1. Log in to the Manager platform and check the status of each service. Ensure that all services are running properly and that the bash test algorithm can be executed successfully.

2. Stop the task, check the remaining resources in the resource pool, and compare them with the resources allocated to the task. In this case, the memory allocated to the task exceeded the remaining memory in the resource pool.

2.4 Solution:

Create the task again, and make sure that the CPU, memory, and GPU resources allocated to it do not exceed the remaining resources in the resource pool.
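
The comparison between the requested and the remaining resources can be expressed as a simple check. The following is only a sketch: the Resources class and its field names are hypothetical, and the values have to be read manually from the resource pool page, because no FusionMind API is assumed here.

```python
from dataclasses import dataclass


@dataclass
class Resources:
    cpu_cores: int
    memory_mb: int
    gpus: int


def find_shortfalls(requested, remaining):
    """Return {resource_name: amount over the pool} for each oversubscribed resource."""
    shortfalls = {}
    for field in ("cpu_cores", "memory_mb", "gpus"):
        over = getattr(requested, field) - getattr(remaining, field)
        if over > 0:
            shortfalls[field] = over
    return shortfalls


# Example values: the task asks for more memory than the pool has left.
requested = Resources(cpu_cores=4, memory_mb=16384, gpus=1)
remaining = Resources(cpu_cores=8, memory_mb=8192, gpus=2)
print(find_shortfalls(requested, remaining))
```

An empty result means the task fits into the remaining pool. A non-empty result names the resource to reduce before the task is created again.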




Do you know how to solve the problem that the task fails to be started after the algorithm upgrade?

tanjingmeng Created Jun 5, 2019 04:49:45
https://forum.huawei.com/enterprise/en/FusionMind-Common-Problems-in-Traffic-Policing-and-Non-Scenario-Deployment/thread/534915-899
You can refer to this thread to figure it out.
BigFI Reply tanjingmeng Created Jun 5, 2019 04:53:50
Do you know how to upload the algorithm package?  
