base

package
v7.3.0+incompatible
Warning

This package is not in the latest version of its module.

Published: Jan 8, 2026 | License: Apache-2.0 | Imports: 12 | Imported by: 0

Documentation

Overview

Package base is used for Huawei Ascend pin affinity scheduling.

Constants

This section is empty.

Variables

This section is empty.

Functions

This section is empty.

Types

type AscendHandler

type AscendHandler interface {
	plugin.SchedulerPlugin
	SetSchedulerAttr(util.SchedulerJobAttr)
	SetSchedulerEnv(plugin.ScheduleEnv)
	SetMaxNodeNPUNum(int)
	SetMaxCardNPUNum(int)
	SetNpuNumInvalidMap(map[int]struct{})
	SetIsNetworkFaultAttention(bool)
}

AscendHandler is the Ascend NPU event handler interface.

func New

func New(name string, opts ...Option) AscendHandler

New returns an NPU plugin with the given name, configured by opts.

type NPUHandler

type NPUHandler struct {
	plugin.SchedulerBaseAttr
	util.SchedulerJobAttr
	plugin.ScheduleEnv
	IsNetworkFaultAttention bool
	NpuNumInvalidMap        map[int]struct{}
	MaxNodeNPUNum           int
	MaxCardNPUNum           int
}

NPUHandler is the base NPU handler.

func (*NPUHandler) CheckNodeNPUByTask

func (tp *NPUHandler) CheckNodeNPUByTask(task *api.TaskInfo, node plugin.NPUNode) error

CheckNodeNPUByTask checks whether the node's NPUs meet the task's requirements.

func (*NPUHandler) GetCardNumGroupsFromTop

func (tp *NPUHandler) GetCardNumGroupsFromTop(nodeNPUTopology []int) [][]int

GetCardNumGroupsFromTop returns the chips on each card, grouped from the node topology.

func (*NPUHandler) GetTaskReqNPUNum

func (tp *NPUHandler) GetTaskReqNPUNum(task *api.TaskInfo) (int, error)

GetTaskReqNPUNum returns the number of NPUs the task requires.

func (*NPUHandler) GetUsableTopFromNode

func (tp *NPUHandler) GetUsableTopFromNode(node plugin.NPUNode, disFlag bool) ([]int, error)

GetUsableTopFromNode returns the usable NPU topology of an Ascend node.

func (*NPUHandler) InitMyJobPlugin

func (tp *NPUHandler) InitMyJobPlugin(attr util.SchedulerJobAttr, env plugin.ScheduleEnv) error

InitMyJobPlugin sets attr and env for the plugin.

func (*NPUHandler) IsInstanceOfJobGroup

func (tp *NPUHandler) IsInstanceOfJobGroup() bool

IsInstanceOfJobGroup reports whether the job is an instance of a job group.

func (*NPUHandler) IsVaildNpuNum

func (tp *NPUHandler) IsVaildNpuNum(value int) bool

IsVaildNpuNum reports whether the NPU count required by a single job is valid, e.g. 16P: 1, 2, 4, 8, 16; 8P: 1, 2, 4, 8.

func (*NPUHandler) JudgeNodeAndTaskNPU

func (tp *NPUHandler) JudgeNodeAndTaskNPU(taskNPU int, nodeNPUTopology []int) error

JudgeNodeAndTaskNPU checks the task's NPU count against the node's NPU topology.
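As a rough illustration of the signature, the sketch below compares only counts: it assumes a task fits when it asks for no more NPUs than the node topology lists. The function `judgeNodeAndTaskNPU` is a hypothetical stand-in; the package's real check may apply further rules that this page does not describe.

```go
package main

import (
	"errors"
	"fmt"
)

// judgeNodeAndTaskNPU is a hypothetical stand-in, not the package's method.
// Assumption: the only condition is that the node lists at least taskNPU chips.
func judgeNodeAndTaskNPU(taskNPU int, nodeNPUTopology []int) error {
	if taskNPU <= 0 {
		return errors.New("task requests no NPUs")
	}
	if taskNPU > len(nodeNPUTopology) {
		return fmt.Errorf("task requests %d NPUs, node has %d usable",
			taskNPU, len(nodeNPUTopology))
	}
	return nil
}

func main() {
	top := []int{0, 1, 3}
	fmt.Println(judgeNodeAndTaskNPU(2, top)) // fits: prints <nil>
	fmt.Println(judgeNodeAndTaskNPU(4, top)) // does not fit: prints the error
}
```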

func (*NPUHandler) PreStartAction

func (tp *NPUHandler) PreStartAction(ssn *framework.Session) error

PreStartAction performs pre-processing actions for rescheduling.

func (*NPUHandler) ReleaseAnnotation

func (tp *NPUHandler) ReleaseAnnotation(_ *api.TaskInfo, node plugin.NPUNode) *plugin.NPUNode

ReleaseAnnotation releases the annotation.

func (*NPUHandler) ScoreBestNPUNodes

func (tp *NPUHandler) ScoreBestNPUNodes(task *api.TaskInfo, nodes []*api.NodeInfo, scoreMap map[string]float64) error

ScoreBestNPUNodes scores nodes by comparing the task's required NPU count with each node's NPU topology.
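One way to picture scoring into a `map[string]float64` is a best-fit rule, sketched below. The rule is an assumption chosen for illustration, not the package's algorithm: nodes that would have fewer NPUs left over after placing the task get a higher score. The helper `scoreNodes` and the plain `map[string][]int` of usable chips are hypothetical simplifications of the real `api.NodeInfo` input.

```go
package main

import "fmt"

// scoreNodes is a hypothetical best-fit scorer, not the package's algorithm.
// usable maps a node name to its usable chip IDs; scoreMap is updated in place.
func scoreNodes(taskNPU int, usable map[string][]int, scoreMap map[string]float64) {
	for name, top := range usable {
		left := len(top) - taskNPU
		if left < 0 {
			continue // node cannot hold the task; leave its score untouched
		}
		// Fewer leftover chips means a tighter fit and a higher score.
		scoreMap[name] += 1.0 / float64(left+1)
	}
}

func main() {
	usable := map[string][]int{
		"node-a": {0, 1, 2, 3},
		"node-b": {0, 1},
		"node-c": {0},
	}
	scores := map[string]float64{}
	scoreNodes(2, usable, scores)
	fmt.Println(scores["node-b"] > scores["node-a"]) // prints true
}
```

Under this rule a 2-NPU task prefers the node with exactly two usable chips, which keeps larger nodes free for larger tasks.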

func (*NPUHandler) SelectNPUFromNode

func (tp *NPUHandler) SelectNPUFromNode(task *api.TaskInfo, node plugin.NPUNode) ([]int, error)

SelectNPUFromNode selects NPUs from the node for the task.

func (*NPUHandler) SetIsNetworkFaultAttention

func (tp *NPUHandler) SetIsNetworkFaultAttention(value bool)

SetIsNetworkFaultAttention sets the network fault attention flag.

func (*NPUHandler) SetMaxCardNPUNum

func (tp *NPUHandler) SetMaxCardNPUNum(num int)

SetMaxCardNPUNum sets the maximum number of NPUs per card.

func (*NPUHandler) SetMaxNodeNPUNum

func (tp *NPUHandler) SetMaxNodeNPUNum(num int)

SetMaxNodeNPUNum sets the maximum number of NPUs per node.

func (*NPUHandler) SetNPUTopologyToPodFn

func (tp *NPUHandler) SetNPUTopologyToPodFn(task *api.TaskInfo, top []int, node plugin.NPUNode)

SetNPUTopologyToPodFn writes the NPUs selected for the task to the pod annotation.

func (*NPUHandler) SetNpuNumInvalidMap

func (tp *NPUHandler) SetNpuNumInvalidMap(value map[int]struct{})

SetNpuNumInvalidMap sets the NPU counts that a single job is not allowed to request, e.g. 16P: 9, 10, 11, 12, 13, 14, 15.
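A minimal sketch of how an invalid-count set of this shape might be consulted, using the 16P example above. The `validator` type and its `isValidNPUNum` method are hypothetical; the sketch assumes a count is acceptable when it lies within the node maximum and is absent from the map. The package's own IsVaildNpuNum may be stricter than this.

```go
package main

import "fmt"

// validator is a hypothetical stand-in holding only the two fields the
// check needs; it is not the package's NPUHandler.
type validator struct {
	maxNodeNPUNum    int
	npuNumInvalidMap map[int]struct{}
}

// isValidNPUNum assumes a count is valid when it is within [1, max]
// and not listed in the invalid map.
func (v validator) isValidNPUNum(value int) bool {
	if value < 1 || value > v.maxNodeNPUNum {
		return false
	}
	_, invalid := v.npuNumInvalidMap[value]
	return !invalid
}

func main() {
	// Build the 16P invalid set from the doc comment: 9 through 15.
	invalid := make(map[int]struct{})
	for n := 9; n <= 15; n++ {
		invalid[n] = struct{}{}
	}
	v := validator{maxNodeNPUNum: 16, npuNumInvalidMap: invalid}
	fmt.Println(v.isValidNPUNum(8), v.isValidNPUNum(12), v.isValidNPUNum(16)) // prints true false true
}
```

Using `map[int]struct{}` as a set costs no memory per value and makes the membership test a single lookup.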

func (*NPUHandler) SetSchedulerAttr

func (tp *NPUHandler) SetSchedulerAttr(attr util.SchedulerJobAttr)

SetSchedulerAttr sets the scheduler attribute for the plugin.

func (*NPUHandler) SetSchedulerEnv

func (tp *NPUHandler) SetSchedulerEnv(env plugin.ScheduleEnv)

SetSchedulerEnv sets the scheduler env for the plugin.

func (*NPUHandler) UpdateNodeInfo

func (tp *NPUHandler) UpdateNodeInfo(node plugin.NPUNode, usedTop []int) *plugin.NPUNode

UpdateNodeInfo updates the node info.

func (*NPUHandler) UseAnnotation

func (tp *NPUHandler) UseAnnotation(task *api.TaskInfo, node plugin.NPUNode) *plugin.NPUNode

UseAnnotation selects NPUs for the task from the node.

func (*NPUHandler) ValidNPUJob

func (tp *NPUHandler) ValidNPUJob() *api.ValidateResult

ValidNPUJob checks the job's required NPU count.

type Option

type Option func(AscendHandler)

Option is a function that sets an attribute on an AscendHandler.

func WithAnnoPreVal

func WithAnnoPreVal(annoPre string) Option

WithAnnoPreVal returns an Option that builds the AscendHandler with the given annoPre value.

func WithMaxNodeNum

func WithMaxNodeNum(num int) Option

WithMaxNodeNum returns an Option that builds the AscendHandler with the given max node num.

func WithNetworkFault

func WithNetworkFault(enable bool) Option

WithNetworkFault returns an Option that builds the AscendHandler with network fault attention enabled or disabled.

func WithNpuInvalidMap

func WithNpuInvalidMap(m map[int]struct{}) Option

WithNpuInvalidMap returns an Option that builds the AscendHandler with the given NpuInvalidMap.
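The signatures of New and Option have the shape of the functional-options pattern. The sketch below shows that pattern with local stand-ins so that it compiles on its own. Every lowercase name in it (`handler`, `npuHandler`, `option`, `withMaxNodeNum`, `withNetworkFault`, `newHandler`) is hypothetical. That each option simply calls the matching setter, and that the constructor applies options in order, are assumptions about how the package works rather than facts taken from its source.

```go
package main

import "fmt"

// handler is a hypothetical, trimmed-down stand-in for AscendHandler.
type handler interface {
	SetMaxNodeNPUNum(int)
	SetIsNetworkFaultAttention(bool)
}

// npuHandler is a hypothetical stand-in for NPUHandler.
type npuHandler struct {
	name                    string
	maxNodeNPUNum           int
	isNetworkFaultAttention bool
}

func (h *npuHandler) SetMaxNodeNPUNum(num int)            { h.maxNodeNPUNum = num }
func (h *npuHandler) SetIsNetworkFaultAttention(val bool) { h.isNetworkFaultAttention = val }

// option mirrors the shape of Option: a function that mutates a handler.
type option func(handler)

// withMaxNodeNum assumes the option forwards to SetMaxNodeNPUNum.
func withMaxNodeNum(num int) option {
	return func(h handler) { h.SetMaxNodeNPUNum(num) }
}

// withNetworkFault assumes the option forwards to SetIsNetworkFaultAttention.
func withNetworkFault(enable bool) option {
	return func(h handler) { h.SetIsNetworkFaultAttention(enable) }
}

// newHandler applies each option in order to a fresh handler.
func newHandler(name string, opts ...option) *npuHandler {
	h := &npuHandler{name: name}
	for _, opt := range opts {
		opt(h)
	}
	return h
}

func main() {
	h := newHandler("demo-npu", withMaxNodeNum(8), withNetworkFault(true))
	fmt.Println(h.name, h.maxNodeNPUNum, h.isNetworkFaultAttention) // prints demo-npu 8 true
}
```

With the real package, the equivalent call would presumably look like `base.New("name", base.WithMaxNodeNum(8), base.WithNetworkFault(true))`, where later options override earlier ones if they set the same attribute.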

Source Files

  • frame.go
  • node.go
  • task.go
  • type.go
