Auto Scaling

Last updated: 2024-12-19 21:49:45

This document describes how to use auto scaling so that services make full use of available resources. The recommendations are based on production experience; adjust the configurations to suit your own workloads.
Coping with Abrupt Traffic Spikes
Typically, services have peak and off-peak hours of resource usage. To use resources efficiently, you can define a Horizontal Pod Autoscaler (HPA) for a service so that the number of pods scales out automatically during peak hours and scales in during off-peak hours. For example, when the traffic of online services is low at night, the HPA can scale in those services and free their resources for offline big data tasks.
To use the HPA, the resource metrics API (metrics.k8s.io) or the custom metrics API (custom.metrics.k8s.io) must be available in the cluster in advance. The HPA controller queries these APIs to obtain the resource usage data (metric data) of a service and uses that data to decide whether to scale.
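As a rough illustration of what "installing" the resource metrics API means, the sketch below shows an APIService object that registers metrics.k8s.io with the API aggregation layer. It assumes the upstream metrics-server defaults (a Service named metrics-server in the kube-system namespace); on a managed cluster such as TKE, the platform may register this API for you under different names.

```yaml
# Sketch only: registers the resource metrics API with the aggregation layer.
# The service name and namespace are assumptions based on upstream metrics-server defaults.
apiVersion: apiregistration.k8s.io/v1
kind: APIService
metadata:
  name: v1beta1.metrics.k8s.io
spec:
  group: metrics.k8s.io
  version: v1beta1
  groupPriorityMinimum: 100
  versionPriority: 100
  insecureSkipTLSVerify: true   # acceptable for a test cluster; use a CA bundle in production
  service:
    name: metrics-server        # assumed backend Service
    namespace: kube-system
```

Once an object like this reports as available, the HPA controller can read CPU and memory usage through metrics.k8s.io. The custom metrics API is registered the same way under the group custom.metrics.k8s.io.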
Originally, the HPA could obtain metric data only from resource metrics. After custom metrics became available, the HPA could use more flexible metrics to control scaling. The metrics APIs are served by different implementations: Kubernetes provides metrics-server for resource metrics, the community commonly uses prometheus-adapter for custom metrics, and cloud vendors that offer managed Kubernetes clusters usually provide their own implementations. For example, TKE provides CPU, memory, hard disk, and network metrics for the HPA. You can create an HPA on the web console and view it as a Kubernetes YAML file, as shown below:
# autoscaling/v2 is available from Kubernetes 1.23; use autoscaling/v2beta2 on older clusters
apiVersion: autoscaling/v2
kind: HorizontalPodAutoscaler
metadata:
  name: nginx
spec:
  scaleTargetRef:
    apiVersion: apps/v1
    kind: Deployment
    name: nginx
  minReplicas: 1
  maxReplicas: 10
  metrics:
  - type: Pods
    pods:
      metric:
        name: k8s_pod_rate_cpu_core_used_request
      target:
        averageValue: "100"
        type: AverageValue
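For comparison, the same Deployment could be scaled on the built-in resource metrics API instead of a vendor-specific pod metric. The following is a minimal sketch, not a TKE-generated file: the name nginx-cpu and the 70% target are illustrative assumptions, and it is meant as an alternative to the HPA above, since two HPAs should not target the same Deployment.

```yaml
# Sketch: HPA driven by CPU utilization from metrics.k8s.io.
# Assumes autoscaling/v2 (Kubernetes 1.23+) and that the nginx containers declare CPU requests.
apiVersion: autoscaling/v2
kind: HorizontalPodAutoscaler
metadata:
  name: nginx-cpu               # hypothetical name
spec:
  scaleTargetRef:
    apiVersion: apps/v1
    kind: Deployment
    name: nginx
  minReplicas: 1
  maxReplicas: 10
  metrics:
  - type: Resource
    resource:
      name: cpu
      target:
        type: Utilization
        averageUtilization: 70  # illustrative target, as a percentage of the CPU request
```

The HPA controller computes the desired replica count as ceil(currentReplicas × currentMetricValue / targetMetricValue). With 4 pods averaging 90% utilization against a 70% target, this gives ceil(4 × 90 / 70) = 6 replicas. Utilization is measured relative to the container requests, so pods without a CPU request cannot be scaled this way.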
Reducing Costs
The HPA scales pods horizontally. When node resources are insufficient, the newly created pods remain in the Pending state. Preparing a large number of nodes in advance avoids pending pods, but at a high cost.
Kubernetes clusters managed by cloud vendors typically support cluster-autoscaler, which dynamically adds or removes nodes based on resource usage to maximize the utilization of computing resources. Combined with pay-as-you-go billing, this reduces costs. For example, TKE provides scaling groups and node pools, an extended feature built on scaling groups.
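cluster-autoscaler reacts to pods that cannot be scheduled, and it judges whether a new node would help by looking at the pods' resource requests. Workloads therefore need explicit requests for node scaling to work well. The sketch below shows a Deployment with requests set; the image tag and the request values are illustrative assumptions, not recommendations.

```yaml
# Sketch: a Deployment whose pods declare resource requests.
# If no existing node can fit a pod's requests, the pod stays Pending
# and cluster-autoscaler can add a node from a scaling group or node pool.
apiVersion: apps/v1
kind: Deployment
metadata:
  name: nginx
spec:
  replicas: 2
  selector:
    matchLabels:
      app: nginx
  template:
    metadata:
      labels:
        app: nginx
    spec:
      containers:
      - name: nginx
        image: nginx:stable     # illustrative image
        resources:
          requests:
            cpu: 250m           # illustrative values
            memory: 256Mi
          limits:
            cpu: 500m
            memory: 512Mi
```

When the HPA raises the replica count beyond what the current nodes can hold, the extra pods become pending and trigger a scale-out. When nodes later sit underused, they can be removed again.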
Using Vertical Scaling
For applications that do not support horizontal scaling, or whose optimal request and limit ratios are uncertain, you can use the Vertical Pod Autoscaler (VPA) for vertical scaling. The VPA automatically updates the request and limit values and restarts the pods to apply them, which may make the service unavailable for a short period. We do not recommend using it on a large scale in production environments.
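One cautious way to try the VPA is to run it in recommendation-only mode, so that it suggests request values without restarting pods. The sketch below assumes the VPA custom resource definitions and controllers are installed in the cluster (they are not part of core Kubernetes); the object name and the minAllowed and maxAllowed bounds are illustrative assumptions.

```yaml
# Sketch: VPA in recommendation-only mode for the nginx Deployment.
# Assumes the VerticalPodAutoscaler CRD (autoscaling.k8s.io/v1) is installed.
apiVersion: autoscaling.k8s.io/v1
kind: VerticalPodAutoscaler
metadata:
  name: nginx-vpa               # hypothetical name
spec:
  targetRef:
    apiVersion: apps/v1
    kind: Deployment
    name: nginx
  updatePolicy:
    updateMode: "Off"           # only compute recommendations; "Auto" would evict and recreate pods
  resourcePolicy:
    containerPolicies:
    - containerName: "*"
      minAllowed:
        cpu: 100m
        memory: 128Mi
      maxAllowed:
        cpu: "1"
        memory: 1Gi
```

With updateMode set to "Off", the recommendations appear in the status of the VPA object and can be applied manually during a maintenance window. Avoid letting the VPA and an HPA act on the same CPU or memory metric for one workload, because the two controllers can work against each other.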
