소식 및 공지 사항

릴리스 노트

제품 릴리스 기록

제품 소개

제품 장점

제품 아키텍처

시나리오

제품 기능

리전 및 가용존

빠른 시작

신규 사용자 가이드

표준 클러스터를 빠르게 생성

Demo

클라우드에서 컨테이너화된 애플리케이션 배포 Check List

TKE 표준 클러스터 가이드

Tencent Kubernetes Engine(TKE)

클러스터 관리

네트워크 관리

스토리지 관리

Worker 노드 소개

Kubernetes Object Management

워크로드

클라우드 네이티브 서비스 가이드

Tencent Managed Service for Prometheus

TKE Serverless 클러스터 가이드

TKE 클러스터 등록 가이드

실습 튜토리얼

Serverless 클러스터

네트워크

로그

모니터링

유지보수

DevOps

탄력적 스케일링

자주 묻는 질문

클러스터

TKE Serverless 클러스터

유지보수

서비스

이미지 레지스트리

원격 터미널

Using CLS to Report Abnormal Resources

포커스 모드

폰트 크기

마지막 업데이트 시간: 2025-09-28 20:21:22

Applicable Scenarios
Kubernetes reports the status of cluster resource objects by using events. They typically indicate some status changes in the system. For example, when installing or modifying workloads, you can check for errors in the current resource objects and view the reasons for these errors according to event information. Each event can be retained for only 1 hour in a Tencent Kubernetes Engine (TKE) cluster.
If the event information contains errors, the cluster administrator needs to pay immediate attention. TKE allows you to configure event persistence for all clusters. Once this feature is enabled, TKE will export your cluster events to the configured storage end in real time. For details, refer to Event Storage.
Service/Ingress, as an ingress layer resource object in Kubernetes, significantly impacts business service stability. Therefore, monitoring and reporting Service/Ingress errors have become common requirements. In response, TKE also defines error codes, reasons, and resolutions of common Service/Ingress error events. For details, refer to Common Service & Ingress Errors and Solutions. This document provides alarm practices for Service/Ingress error events in clusters.
Operation Steps
Step 1: Enabling Event Collection for a Cluster
1. Log in to the TKE console.
2. In the left sidebar, select O&M Feature Management.
3. At the top of the Feature Management page, select a region and a cluster type, and then click Set on the right side of the cluster for which event storage needs to be enabled.
4. On the Feature Settings page, click Edit on the right side of Event Storage. Check Enable Event Storage and configure the logset and logtopic. For details, see Event Storage.
Note:
If you have multiple Kubernetes clusters in the same region, it is recommended to enable event storage for multiple clusters and select the same logtopic and logset.
Step 2: Checking for Event Collection
1. Log in to the CLS console and go to the Search and Analysis page.
2. On the Search and Analysis page, select the region, and the logset and logtopic of the cluster with event collection enabled.
3. In "Raw Logs", search for the event.message field, which contains the event information generated by resource objects in the cluster, as shown in the figure below.
﻿
Step 3: Creating an Alarm Policy
Take the Ingress event alarm as an example, similarly for Service.
1. Log in to the CLS console. Choose Monitoring Alarm > Alarm Policy.
2. On the Alarm Policy page, click Create, as shown in the figure below.
﻿
3. On the Create Alarm Policy page, set the policy based on the following information:
Log Topic: Select the topic created in Step 1.
Execution Statement: Add the execution statement (event.message:"Ingress Sync ClientError." OR event.message:"Ingress Sync DependencyError." OR event.message:"IngressError. ErrorCode:") | SELECT count(*) as ErrCount
Note:
It indicates all Ingress event information is obtained.
Trigger Conditions: Add the trigger condition $1.ErrCount > 0.
Note:
It indicates an alarm is triggered immediately upon event information.
Multidimensional Analysis: Select Custom Definition search and analysis.
Name: You can define a name.
Search and Analysis Statement: Add the search and analysis statement: (event.message:"Ingress Sync ClientError." OR event.message:"Ingress Sync DependencyError." OR event.message:"IngressError. ErrorCode:") | SELECT clusterId, event.involvedObject.namespace, event.involvedObject.name, split(split(event.message, 'ErrorCode: ')[2], ' ')[1] as ErrorCode, count(*) as ErrCount group by (clusterId, event.involvedObject.namespace, event.involvedObject.name, ErrorCode)
Notification Content: Add the notification content "Ingress alarm: The following cluster resources encounter an error concurrently:"
For more parameter information, refer to Configuring Alarm Policies.
Step 4: Viewing the Alarm
You need to ensure that new events have occurred in Step 2, and that the execution cycle and alarm notification frequency of the alarm policy in Step 2 are appropriate (for example, set it to once every minute during testing), so that you can view the alarm content in the alarm notification channel. In this example, alarms are set to be sent by emails, and you can refer to the alarm content in the email, as shown in the figure below.
﻿

도움말 및 지원

문제 해결에 도움이 되었나요?

더 자세한 내용은 문의하기 또는 티겟 제출 을 통해 문의할 수 있습니다.

피드백

tencent cloud

Tencent Kubernetes Engine

Using CLS to Report Abnormal Resources

Applicable Scenarios

Operation Steps

Step 1: Enabling Event Collection for a Cluster

Step 2: Checking for Event Collection

Step 3: Creating an Alarm Policy

Step 4: Viewing the Alarm

도움말 및 지원