Monitoring Add-Ons Release Notes

Last updated: 2025-02-24 18:16:10

monitor-agent Release Notes

Change Time
Version Number
Change Content
Restrictions and Impacts
2024-11-28
v1.3.17
Added timeout settings to fix the hang that occurs when standalone-metrics obtains metrics.
Made the metric port and protocol adapt to cluster upgrades.
Fixed the issue where obtaining mounted disk metrics hangs due to an NFS failure.
This upgrade will not affect the existing business. During the upgrade, some add-ons may be unavailable, so it is recommended to perform the upgrade during off-peak periods.
2024-07-30
v1.3.16
Added the systemd mode for cadvisor.
Fixed the problem of duplicate counting in disk-related metric calculation for native nodes.
Modified the Job podNormal logic for the Pod in the informer list-watch failed status.
This upgrade will not affect the existing business. During the upgrade, some add-ons may be unavailable, so it is recommended to perform the upgrade during off-peak periods.
2024-05-10
v1.3.14
Fixed the problem of excluding Pods in the Succeeded and Failed status during list-watch.
This upgrade will not affect the existing business. During the upgrade, some add-ons may be unavailable, so it is recommended to perform the upgrade during off-peak periods.
2024-03-18
v1.3.12
Exposed chart parameters to support onDelete policy upgrade.
This upgrade will not affect the existing business. During the upgrade, some add-ons may be unavailable, so it is recommended to perform the upgrade during off-peak periods.
2024-02-29
v1.3.11
Adapted to the GPU metrics, and changed the calculation method of GPU metrics at the Pod and node levels from aggregating container-level metrics to directly using the values exposed by the exporter.
Exposed add-on tags for GPU metrics. The tag gpu-exporter: "true" is used preferentially; if it is not available, name: gpu-manager-ds is used.
Fixed the problem where a program panic is triggered when the GPU driver of a node is abnormal. In this case, GPU metrics are not collected, but collection of other basic metrics is not affected.
Fixed the problem where the program hangs in special cases when HTTP requests are sent to crane to pull data. The HTTP request is now canceled upon timeout.
Fixed the problem where, on nodes with many CPU cores, the monitor-agent add-on runs on different CPU cores at different times, causing the Pod's working_set metric to grow over time and eventually leading to OOM errors.
Fixed the problem where the monitoring add-ons fail to collect data because the /metrics port and protocol changed in later versions of controller-manager and scheduler. The add-ons now adapt to these changes.
Modified the node CPU utilization calculation to exclude iowait time. Previously, in high node I/O scenarios, the calculated utilization was too high because iowait time was included.
This upgrade will not affect the existing business. During the upgrade, some add-ons may be unavailable, so it is recommended to perform the upgrade during off-peak periods.
2024-02-04
v1.3.10
Extracted the monitor-agent privileged mode into a chart parameter. The privileged mode is disabled by default.
This upgrade will not affect the existing business. During the upgrade, some add-ons may be unavailable, so it is recommended to perform the upgrade during off-peak periods.
2023-08-17
v1.3.9
Fixed the problem where the workload is reported as normal when the container is in the creating status.
Used the client-go mechanism to automatically refresh the token, preventing token expiration when kubeletJob sends requests to kubelet.
This upgrade will not affect the existing business. During the upgrade, some add-ons may be unavailable, so it is recommended to perform the upgrade during off-peak periods.
2023-04-25
v1.3.7
Fixed the problem where Pod-level GPU utilization (node) and GPU memory utilization (node) metrics fail to be collected properly, and the problem where Pods in the terminating status fail to be deleted because the container mounts a host directory.
This upgrade will not affect the existing business. During the upgrade, some add-ons may be unavailable, so it is recommended to perform the upgrade during off-peak periods.
2023-03-21
v1.3.6
Added metrics of native nodes, including 1-minute load, total disk capacity, disk utilization, and write bandwidth of nodes.
This upgrade will not affect the existing business. During the upgrade, some add-ons may be unavailable, so it is recommended to perform the upgrade during off-peak periods.
2023-01-18
v1.3.5
Optimized the scenario where related monitoring metrics are not reported when cadvisor does not expose the container_fs_usage_bytes and container_fs_limit_bytes metrics.
This upgrade will not affect the existing business. During the upgrade, some add-ons may be unavailable, so it is recommended to perform the upgrade during off-peak periods.
2023-01-12
v1.3.4
Fixed the problem where the file system usage is 0 when the runtime is containerd.
This upgrade will not affect the existing business. During the upgrade, some add-ons may be unavailable, so it is recommended to perform the upgrade during off-peak periods.
2022-12-13
v1.3.3
Optimized the method of pulling basic monitoring metrics.
This upgrade will not affect the existing business. During the upgrade, some add-ons may be unavailable, so it is recommended to perform the upgrade during off-peak periods.
2022-11-08
v1.3.2
Fixed the problem where basic monitoring fails to report monitoring metrics properly.
This upgrade will not affect the existing business. During the upgrade, some add-ons may be unavailable, so it is recommended to perform the upgrade during off-peak periods.
2022-10-20
v1.3.1
Fixed the metric drop problem.
This upgrade will not affect the existing business. During the upgrade, some add-ons may be unavailable, so it is recommended to perform the upgrade during off-peak periods.
2022-08-25
v1.3.0
Tencent Kubernetes Engine (TKE) basic monitoring supports the following PVC monitoring metrics: PVC cloud disk size, PVC cloud disk utilization, and PVC cloud disk usage.
This upgrade will not affect the existing business. During the upgrade, some add-ons may be unavailable, so it is recommended to perform the upgrade during off-peak periods.
2022-08-09
v1.2.2
Updated the GPU metrics calculation method.
This upgrade will not affect the existing business. During the upgrade, some add-ons may be unavailable, so it is recommended to perform the upgrade during off-peak periods.
2022-07-28
v1.2.1
Updated the methods of calculating the node CPU packing rate and node memory packing rate.
This upgrade will not affect the existing business. During the upgrade, some add-ons may be unavailable, so it is recommended to perform the upgrade during off-peak periods.
2022-07-25
v1.2.0
Added the following metrics: Pod CPU optimizable amount, Pod memory optimizable amount, node CPU packing rate, and node memory packing rate.
This upgrade will not affect the existing business. During the upgrade, some add-ons may be unavailable, so it is recommended to perform the upgrade during off-peak periods.
2022-07-21
v1.1.1
Fixed the problem where the basic monitoring add-ons do not complete the collection, calculation, and reporting tasks within the corresponding cycles.
This upgrade will not affect the existing business. During the upgrade, some add-ons may be unavailable, so it is recommended to perform the upgrade during off-peak periods.
2022-07-05
v1.1.0
tke-monitor-agent mounts the host paths /proc/meminfo and /proc/cpuinfo to collect node CPU utilization and memory utilization.
This upgrade will not affect the existing business. During the upgrade, some add-ons may be unavailable, so it is recommended to perform the upgrade during off-peak periods.
2022-06-23
v1.0.0
Managed the basic monitoring add-ons by using chart.
This upgrade will not affect the existing business. During the upgrade, some add-ons may be unavailable, so it is recommended to perform the upgrade during off-peak periods.
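
The v1.3.11 entry above notes that iowait time is excluded from the node CPU utilization calculation. The following Python sketch illustrates the idea using two samples of the aggregate cpu line from /proc/stat. It is not the monitor-agent implementation, which these notes do not describe. The function names parse_cpu_line and cpu_utilization and the exclude_iowait parameter are hypothetical, and treating iowait as non-busy time is an assumption about what "exclude" means here.

```python
def parse_cpu_line(line):
    """Parse the aggregate 'cpu' line of /proc/stat into a list of 8 counters.

    Field order: user, nice, system, idle, iowait, irq, softirq, steal.
    The trailing guest fields are skipped because they are already
    included in user and nice.
    """
    fields = line.split()
    if not fields or fields[0] != "cpu":
        raise ValueError("expected the aggregate 'cpu' line from /proc/stat")
    values = [int(v) for v in fields[1:9]]
    # Older kernels expose fewer fields; pad the missing ones with zero.
    values += [0] * (8 - len(values))
    return values


def cpu_utilization(before, after, exclude_iowait=True):
    """Return CPU utilization in [0, 1] between two /proc/stat samples.

    With exclude_iowait=True (assumed to match the release note), time
    spent waiting for I/O is counted as not busy, like idle time.
    """
    deltas = [b - a for a, b in zip(parse_cpu_line(before), parse_cpu_line(after))]
    total = sum(deltas)
    if total <= 0:
        return 0.0
    idle, iowait = deltas[3], deltas[4]
    not_busy = idle + iowait if exclude_iowait else idle
    return (total - not_busy) / total
```

With samples taken during heavy I/O, counting iowait as busy time can inflate the result considerably, which matches the symptom described in the release note.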

clustermonitor Release Notes

Change Time
Version Number
Change Content
Restrictions and Impacts
2025-01-08
v1.3.2
Added the control plane add-on monitoring capability.
This upgrade will not affect the existing business. During the upgrade, add-ons may be unavailable, so it is recommended to perform the upgrade during off-peak periods.
2024-11-20
v1.2.0
Added the metric monitoring capability for native node sub-machines, submitting dimension service data.
Fixed the monitoring add-on exceptions in the CDC cluster.
This upgrade will not affect the existing business. During the upgrade, add-ons may be unavailable, so it is recommended to perform the upgrade during off-peak periods.
2024-10-30
v1.1.0
Added support for enabling measurement data reporting through the measure-enabled switch.
This upgrade will not affect the existing business. During the upgrade, add-ons may be unavailable, so it is recommended to perform the upgrade during off-peak periods.
2024-09-24
v1.0.13
Fixed the panic issue caused by clustermonitor failing to initialize the proxy in CDC cluster scenarios.
Fixed the issue where the total GPU is reported as 0 after the user modifies the node alias in the user cluster.
Fixed the issue with nodes whose instanceid changes: standaloneMetrics now uses the new instanceid to report node-related metrics.
This upgrade will not affect the existing business. During the upgrade, add-ons may be unavailable, so it is recommended to perform the upgrade during off-peak periods.
2024-07-30
v1.0.12
Removed the dependency of master monitoring capability on token, and removed --token from startup parameters.
Fixed SSRF vulnerabilities.
Fixed the concurrency problem when the CPU and memory of cluster nodes are retrieved.
Fixed the problem where workload GPU utilization has no value because the total cluster GPU is calculated as zero.
This upgrade will not affect the existing business. During the upgrade, some add-ons may be unavailable, so it is recommended to perform the upgrade during off-peak periods.
2024-03-27
v1.0.11
Added support for the managed cluster to report cluster storage object quantity metrics (pods, configmaps, and others).
When calculating the total GPU cores and GPU memory for the cluster, data is now collected primarily from gpu-exporter on each node. If no data can be collected, it is obtained from the Status field of each node.
This upgrade will not affect the existing business. During the upgrade, some add-ons may be unavailable, so it is recommended to perform the upgrade during off-peak periods.
2023-11-05
v1.0.10
Added support for collecting metrics of the three major add-ons in the managed cluster by using cluster-monitor.
This upgrade will not affect the existing business. During the upgrade, some add-ons may be unavailable, so it is recommended to perform the upgrade during off-peak periods.
2023-08-21
v1.0.9
Fixed the problem where, when a user-created Horizontal Pod Autoscaler (HPA) is based on the CPU usage in core resource metrics, the associated workload is scaled out to the maximum number of replicas due to excessively high CPU usage.
Added support for deployment to nodes in CDC scenarios.
This upgrade will not affect the existing business. During the upgrade, some add-ons may be unavailable, so it is recommended to perform the upgrade during off-peak periods.
2023-08-15
v1.0.8
Added support for deployment to user clusters in CDC scenarios.
This upgrade will not affect the existing business. During the upgrade, some add-ons may be unavailable, so it is recommended to perform the upgrade during off-peak periods.
2023-06-20
v1.0.7
Optimized cost metrics reporting logic.
This upgrade will not affect the existing business. During the upgrade, some add-ons may be unavailable, so it is recommended to perform the upgrade during off-peak periods.
2023-06-08
v1.0.6
Fixed the problem where the k8s_pod_ping_succeed metric is not reported when the Pod is not in the running status.
Fixed the problem where the data cache is not cleaned up when the number of data entries reported to barad exceeds 1,000.
This upgrade will not affect the existing business. During the upgrade, some add-ons may be unavailable, so it is recommended to perform the upgrade during off-peak periods.
2023-04-03
v1.0.5
Added annotation.service.kubernetes.io/qcloud-loadbalancer-multiplex : "true" for clustermonitor service to reuse ENILB in an independent cluster scenario with the inspection add-on.
This upgrade will not affect the existing business. During the upgrade, some add-ons may be unavailable, so it is recommended to perform the upgrade during off-peak periods.
2023-03-29
v1.0.4
Added collection and reporting of Node status, Pod Ready status, and cost metrics.
Optimized metric retrieval for the HPA data source hpa-metrics-server.
Upgraded the metrics-server version.
This upgrade will not affect the existing business. During the upgrade, some add-ons may be unavailable, so it is recommended to perform the upgrade during off-peak periods.
2023-03-24
v1.0.3
Fixed the problem of clustermonitor version upgrade failure.
This upgrade will not affect the existing business. During the upgrade, some add-ons may be unavailable, so it is recommended to perform the upgrade during off-peak periods.
2023-03-16
v1.0.2
Fixed the problem of apiserver CPU/mem utilization drop.
This upgrade will not affect the existing business. During the upgrade, some add-ons may be unavailable, so it is recommended to perform the upgrade during off-peak periods.
2023-03-14
v1.0.1
Managed the basic monitoring add-ons by using chart.
This upgrade will not affect the existing business. During the upgrade, some add-ons may be unavailable, so it is recommended to perform the upgrade during off-peak periods.
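
The v1.0.11 entry above describes a fallback when totaling cluster GPU resources: data from gpu-exporter on each node is preferred, and the node's Status field is used when no data can be collected. A minimal Python sketch of that selection logic follows, assuming each node is represented as a dictionary. The keys exporter_gpu and status_gpu and both function names are hypothetical and do not come from clustermonitor.

```python
def node_gpu_total(exporter_value, status_value):
    """Pick the GPU total for one node.

    exporter_value: value collected from gpu-exporter, or None if nothing
        could be collected (hypothetical representation).
    status_value: value read from the node's Status field, or None.
    """
    if exporter_value is not None:
        # Exporter data is preferred whenever it was collected.
        return exporter_value
    # Fall back to the Status field; treat a missing value as zero.
    return status_value if status_value is not None else 0


def cluster_gpu_total(nodes):
    """Sum per-node GPU totals across the cluster."""
    return sum(
        node_gpu_total(node.get("exporter_gpu"), node.get("status_gpu"))
        for node in nodes
    )
```

The same function could be applied separately to GPU cores and GPU memory, since the release note mentions both totals.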


