QoSAgent

Last updated: 2024-02-05 16:28:54
QoS Agent is an extension component that Tencent Cloud has enhanced with a range of quality of service (QoS) capabilities. It keeps workloads stable while increasing the utilization rate of cluster resources.
Note:
QoS capabilities are supported only on native nodes. If your nodes are not native nodes, or your workload does not run on native nodes, these capabilities do not take effect.

Kubernetes objects deployed in a cluster

| Kubernetes Object Name | Type | Default Resource Occupation | Associated Namespaces |
|---|---|---|---|
| avoidanceactions.ensurance.crane.io | CustomResourceDefinition | - | - |
| nodeqoss.ensurance.crane.io | CustomResourceDefinition | - | - |
| podqoss.ensurance.crane.io | CustomResourceDefinition | - | - |
| timeseriespredictions.prediction.crane.io | CustomResourceDefinition | - | - |
| kube-system | Namespace | - | - |
| all-be-pods | PodQOS | - | kube-system |
| qos-agent | ClusterRole | - | - |
| qos-agent | ClusterRoleBinding | - | - |
| crane-agent | Service | - | kube-system |
| qos-agent | ServiceAccount | - | kube-system |
| qos-agent | DaemonSet | - | kube-system |

Feature Overview

| Feature | Description |
|---|---|
| Priority of CPU Usage | Sets CPU usage priority so that high-priority tasks receive sufficient resources during resource competition, while low-priority tasks are suppressed. |
| CPU Burst | Temporarily allows latency-sensitive applications to use CPU beyond their limit, which helps keep them stable. |
| CPU Hyperthreading Isolation | Prevents the L2 cache of high-priority container threads from being affected by low-priority threads running on the same physical CPU core. |
| Memory QoS Enhancement | Comprehensively enhances memory performance and allows flexible limits on the memory usage of a container. |
| Network QoS Enhancement | Comprehensively enhances network performance and allows flexible limits on the network usage of a container. |
| Disk IO QoS Enhancement | Comprehensively enhances disk performance and allows flexible limits on the disk usage of a container. |

QoS Agent Permission

Note:
The Permission Scenarios section lists only the permissions related to the core features of the component. For the complete permission list, see Permission Definition.

Permission Description

The permissions granted to this component are the minimum required for its current features to operate.

Permission Scenarios

| Feature | Involved Object | Involved Operation Permission |
|---|---|---|
| Reading podqos, nodeqos, time series, and other configurations | podqoss / nodeqoss / avoidanceactions | get/list/watch/update |
| Viewing the pod information of the current node | pod | get/list/watch |
| Enabling isolation capability based on PodQOS / Modifying node resources to increase offline resources | pod status | update/patch |
| Adding a taint to the node | node | get/list/watch/update |
| Sending events based on the status of isolation and resource interference | event | All Permissions |

Permission Definition

rules:
- apiGroups:
  - ""
  resources:
  - pods
  verbs:
  - get
  - list
  - watch
- apiGroups:
  - ""
  resources:
  - pods/status
  verbs:
  - update
  - patch
- apiGroups:
  - ""
  resources:
  - nodes
  verbs:
  - get
  - list
  - watch
  - update
- apiGroups:
  - ""
  resources:
  - nodes/status
  - nodes/finalizers
  verbs:
  - update
  - patch
- apiGroups:
  - ""
  resources:
  - pods/eviction
  verbs:
  - create
- apiGroups:
  - ""
  resources:
  - configmaps
  verbs:
  - get
  - list
  - watch
- apiGroups:
  - ""
  resources:
  - events
  verbs:
  - "*"
- apiGroups:
  - "ensurance.crane.io"
  resources:
  - podqoss
  - nodeqoss
  - avoidanceactions
  verbs:
  - get
  - list
  - watch
  - update
- apiGroups:
  - "prediction.crane.io"
  resources:
  - timeseriespredictions
  - timeseriespredictions/finalizers
  verbs:
  - get
  - list
  - watch
  - create
  - update
  - patch
- apiGroups:
  - "topology.crane.io"
  resources:
  - "noderesourcetopologies"
  verbs:
  - get
  - list
  - watch
  - create
  - update
  - patch

Deployment Methods

1. Log in to the Tencent Kubernetes Engine Console, and choose Cluster from the left navigation bar.
2. In the Cluster list, click the desired Cluster ID to open its details page.
3. Select Add-on management from the left-side menu, and click Create on the Component Management page.
4. On the Create Add-on management page, select the checkbox for QoS Agent.
5. Click Complete to install the add-on.
Note:
After the deployment completes, you must manually select the cgroup driver, because it can differ from cluster to cluster. The steps are as follows:
1. In the Add-on list of your cluster, locate the successfully deployed QoS Agent, and click Update configuration on the right.
2. On the add-on configuration page of QoS Agent, open the dropdown to the right of the cgroupDrive option, and choose the cgroupDrive value that matches your cluster.
3. Click Complete.

FAQs

How to confirm the cgroupDrive of a cluster?

The cgroupDrive of a cluster is either cgroupfs or systemd. To confirm which one applies:
First, check the container runtime shown on the "basic information" page of the cluster (listed under "operating add-on") to determine whether the cluster runs docker or containerd.
If the runtime is docker, run docker info on any node in the cluster and check the value of the Cgroup Driver field.
If the runtime is containerd, open /etc/containerd/config.toml on any node in the cluster. If the file contains SystemdCgroup = true, the driver is systemd; otherwise, it is cgroupfs.
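The two checks above can be sketched in Python. This is a minimal illustration, not part of QoS Agent: the function names are hypothetical, and the code assumes you have already captured the text output of docker info or the contents of /etc/containerd/config.toml from a node. It follows the rule stated above literally, so a commented-out SystemdCgroup line is treated as absent.

```python
import re
from typing import Optional


def driver_from_docker_info(output: str) -> Optional[str]:
    """Return the value of the 'Cgroup Driver' field from `docker info` text, if present."""
    for line in output.splitlines():
        key, sep, value = line.partition(":")
        if sep and key.strip().lower() == "cgroup driver":
            return value.strip()
    return None  # field not found in the supplied text


def driver_from_containerd_config(toml_text: str) -> str:
    """Return 'systemd' if an uncommented `SystemdCgroup = true` line exists, else 'cgroupfs'."""
    pattern = re.compile(r"^[ \t]*SystemdCgroup[ \t]*=[ \t]*true\b", re.MULTILINE)
    return "systemd" if pattern.search(toml_text) else "cgroupfs"


if __name__ == "__main__":
    # Sample inputs made up for illustration; real output contains many more fields.
    sample_docker_info = " Storage Driver: overlay2\n Cgroup Driver: cgroupfs\n"
    sample_config = "[plugins]\n  SystemdCgroup = true\n"
    print(driver_from_docker_info(sample_docker_info))
    print(driver_from_containerd_config(sample_config))
```

Whichever value is detected is the one to select for the cgroupDrive option on the add-on configuration page.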

How to select the target workloads or nodes?

You can select specific resource objects by label (labelSelector) or by scope (scopeSelector).
Note:
When both selectors are specified, they are combined with a logical AND, that is, all conditions must be met.

labelSelector

The labelSelector filters resources by matching the labels on the object. Typically, the business team attaches a specific label to the designated workloads and passes that label to the operations team. When creating a PodQOS, the operations team references the label in the labelSelector field, which lets different businesses receive different QoS capabilities.

scopeSelector

The scopeSelector consists of one or more MatchExpressions, which are combined with a logical AND. Each MatchExpression has three fields: ScopeName, Operator, and the Values that correspond to the ScopeName.
ScopeName has three types: QOSClass, Priority, and Namespace.
QOSClass selects workloads with a specific QOSClass. The Values can be one or more of Guaranteed, Burstable, and BestEffort.
Priority selects workloads with a specific priority. The Values can be individual priority values or ranges, such as ["1000", "2000-3000"].
Namespace selects workloads in a specific namespace. The Values can contain one or more namespaces.
Operator has two types: In and NotIn. If left blank, it defaults to In.
The following example selects BestEffort pods labeled app-type=offline and assigns them a CPU priority of 7:
apiVersion: ensurance.crane.io/v1alpha1
kind: PodQOS
metadata:
  name: offline-task
spec:
  allowedActions:
    - eviction
  resourceQOS:
    cpuQOS:
      cpuPriority: 7
  scopeSelector:
    matchExpressions:
      - operator: In
        scopeName: QOSClass
        values:
          - BestEffort
  labelSelector:
    matchLabels:
      app-type: offline
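
To make the selection rules concrete, the following Python sketch models how a PodQOS spec like the one above could be evaluated against a pod. It is an illustration of the semantics described in this section, not the QoS Agent implementation. The Pod class and all function names are hypothetical, and the code assumes that priority ranges are inclusive and that priorities are non-negative integers.

```python
from dataclasses import dataclass, field
from typing import Dict, List


@dataclass
class Pod:
    """Hypothetical, simplified view of a pod used only for this sketch."""
    namespace: str
    qos_class: str  # Guaranteed, Burstable, or BestEffort
    priority: int
    labels: Dict[str, str] = field(default_factory=dict)


def priority_in(values: List[str], priority: int) -> bool:
    """Match single values ("1000") or ranges ("2000-3000"); ranges assumed inclusive."""
    for value in values:
        low, sep, high = value.partition("-")
        if sep:
            if int(low) <= priority <= int(high):
                return True
        elif int(low) == priority:
            return True
    return False


def expression_matches(expr: dict, pod: Pod) -> bool:
    """Evaluate one matchExpression (scopeName, operator, values) against a pod."""
    scope = expr["scopeName"]
    values = expr.get("values", [])
    if scope == "QOSClass":
        hit = pod.qos_class in values
    elif scope == "Priority":
        hit = priority_in(values, pod.priority)
    elif scope == "Namespace":
        hit = pod.namespace in values
    else:
        raise ValueError(f"unknown scopeName: {scope}")
    operator = expr.get("operator") or "In"  # a blank operator defaults to In
    if operator == "In":
        return hit
    if operator == "NotIn":
        return not hit
    raise ValueError(f"unknown operator: {operator}")


def pod_selected(spec: dict, pod: Pod) -> bool:
    """labelSelector and every scopeSelector expression must all match (logical AND)."""
    wanted = spec.get("labelSelector", {}).get("matchLabels", {})
    labels_ok = all(pod.labels.get(k) == v for k, v in wanted.items())
    exprs = spec.get("scopeSelector", {}).get("matchExpressions", [])
    return labels_ok and all(expression_matches(e, pod) for e in exprs)


# The selectors of the offline-task example, written as a Python dict.
OFFLINE_TASK_SPEC = {
    "scopeSelector": {
        "matchExpressions": [
            {"operator": "In", "scopeName": "QOSClass", "values": ["BestEffort"]}
        ]
    },
    "labelSelector": {"matchLabels": {"app-type": "offline"}},
}
```

Under this model, a BestEffort pod labeled app-type=offline is selected by OFFLINE_TASK_SPEC, while a Burstable pod with the same label, or a BestEffort pod without the label, is not.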


