Tencent Cloud

TDMQ for CKafka


Data Compression

Last updated: 2026-01-20 17:02:40

Scenarios

Data compression reduces network I/O and disk usage. This document describes the message formats supported for data compression and explains how to configure compression as needed.

Message Format

Currently, TDMQ for CKafka (CKafka) supports two types of message formats: V1 and V2 (introduced in version 0.11.0.0). CKafka is compatible with the formats for versions 0.9, 0.10, 1.1, 2.4, 2.8, and 3.2.
Different versions correspond to different configurations. The details are as follows:
The purpose of message format conversion is primarily to ensure compatibility with early versions of consumer programs. In a CKafka cluster, multiple versions of message formats (V1/V2) are typically stored simultaneously.
The broker side converts new version messages to an early version format, which involves decompression and recompression of messages.
Message format conversion significantly impacts performance: in addition to the extra compression and decompression operations, it prevents CKafka from using its zero-copy optimization. It is therefore essential to keep message formats uniform.
Zero-copy: when data is transferred between the disk and the network, it stays in kernel space and avoids expensive copies to and from user space, enabling fast transmission.

Compression Algorithm Comparison

The Snappy algorithm is officially recommended, as it reduces the impact on CPU performance and keeps CPU usage stable.
The analysis process is as follows:
A compression algorithm is evaluated based on two major metrics: compression ratio and compression/decompression throughput. CKafka versions earlier than 2.1.0 support three compression algorithms: GZIP, Snappy, and LZ4. In actual usage of CKafka, the performance metrics of the three algorithms are compared as follows:
Compression ratio: LZ4 > GZIP > Snappy
Throughput: LZ4 > Snappy > GZIP
The physical resource usage is as follows:
Bandwidth: Snappy occupies the most network bandwidth as it has the lowest compression ratio.
CPU: During compression, Snappy uses more CPU; during decompression, GZIP uses more CPU.
Under normal circumstances, the recommended order of the three compression algorithms is: LZ4 > GZIP > Snappy.
Long-term testing in the production network shows that this conclusion holds in most cases. However, in certain extreme scenarios, the LZ4 algorithm may cause increased CPU load.
Analysis shows that compression performance varies with the characteristics of each service's source data. Users sensitive to CPU metrics are therefore advised to use the more stable Snappy algorithm.
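Why source data matters can be illustrated with a small stdlib-only sketch (GZIP via java.util.zip; the class and payload names are illustrative, and this only approximates what the Kafka client does per batch): highly repetitive payloads compress far better than high-entropy ones, which is why the same algorithm can behave very differently across services.

```java
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.UncheckedIOException;
import java.util.Arrays;
import java.util.Random;
import java.util.zip.GZIPOutputStream;

public class CompressionRatioDemo {
    // GZIP-compress a payload and return the compressed size in bytes.
    static int gzipSize(byte[] data) {
        ByteArrayOutputStream bos = new ByteArrayOutputStream();
        try (GZIPOutputStream gz = new GZIPOutputStream(bos)) {
            gz.write(data);
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        return bos.size();
    }

    public static void main(String[] args) {
        byte[] repetitive = new byte[64 * 1024];
        Arrays.fill(repetitive, (byte) 'a');      // log-like, highly compressible
        byte[] highEntropy = new byte[64 * 1024];
        new Random(42).nextBytes(highEntropy);    // e.g. already-compressed or encrypted data

        System.out.printf("repetitive:   %d -> %d bytes%n", repetitive.length, gzipSize(repetitive));
        System.out.printf("high-entropy: %d -> %d bytes%n", highEntropy.length, gzipSize(highEntropy));
    }
}
```

Running this shows the repetitive payload shrinking by orders of magnitude while the high-entropy payload does not shrink at all, i.e. the CPU spent on compression can be wasted entirely depending on the data.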
Note:
The GZIP algorithm is not recommended for CKafka. Enabling GZIP compression consumes additional CPU resources on the CKafka server. According to load testing data, if GZIP compression is enabled, it is advised to reserve approximately 75% bandwidth buffer. (The reserved ratio is for reference only. The actual ratio is to be determined based on monitoring data.)
For example, for an instance with a bandwidth of 40 MB/s, after enabling GZIP compression, you are advised to increase the bandwidth to 160 MB/s (40/(1 - 75%) = 160).
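The arithmetic above generalizes: with a reserved buffer ratio r, the required bandwidth is current / (1 - r). A minimal sketch (the class name is illustrative):

```java
public class BandwidthPlanner {
    // Bandwidth needed so that `reservedRatio` of the total remains as buffer,
    // e.g. 40 MB/s with a 75% buffer -> 40 / (1 - 0.75) = 160 MB/s.
    static double requiredBandwidth(double currentMBps, double reservedRatio) {
        return currentMBps / (1 - reservedRatio);
    }

    public static void main(String[] args) {
        System.out.println(requiredBandwidth(40, 0.75)); // prints 160.0
    }
}
```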

Data Compression Configuration

Producers can configure data compression as instructed below:
Properties props = new Properties();
props.put("bootstrap.servers", "localhost:9092");
props.put("acks", "all");
props.put("key.serializer", "org.apache.kafka.common.serialization.StringSerializer");
props.put("value.serializer", "org.apache.kafka.common.serialization.StringSerializer");

// After the producer starts, each message set produced will be compressed. This effectively saves network transmission bandwidth and disk usage of the Kafka broker.

// Note that different versions correspond to different configurations. Version 0.9 and earlier do not allow compression. Version 1.1 and earlier do not support the GZIP compression format by default.

props.put("compression.type", "snappy");
Producer<String, String> producer = new KafkaProducer<>(props);

In most cases, the broker only stores messages received from the producer without any modifications.
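On the consumer side, no compression setting is needed: each record batch carries its codec, and the client decompresses transparently. A configuration sketch (stdlib Properties only; the bootstrap address and group name are placeholders):

```java
import java.util.Properties;

public class ConsumerConfigSketch {
    static Properties consumerProps() {
        Properties props = new Properties();
        props.put("bootstrap.servers", "localhost:9092"); // placeholder address
        props.put("group.id", "compression-demo");        // placeholder group name
        props.put("key.deserializer", "org.apache.kafka.common.serialization.StringDeserializer");
        props.put("value.deserializer", "org.apache.kafka.common.serialization.StringDeserializer");
        // Note: no compression-related key. The consumer reads the codec
        // from each record batch and decompresses automatically.
        return props;
    }
}
```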

Notes

When data is sent to CKafka, compression.codec cannot be set.
Version 1.1 and earlier do not support the GZIP compression format by default.
GZIP compression involves high CPU consumption and is not recommended for CKafka.
Once enabled, GZIP consumes significant CPU resources, and CPU rather than bandwidth can become the bottleneck. If GZIP is enabled, it is recommended to increase the producer's linger.ms and batch.size values.
If programs fail to run normally after LZ4 is enabled, the likely cause is an incorrect message format. Check the CKafka version and confirm that the message format in use is correct.
The SDK settings vary with the CKafka client. For how to set the message format version, see the open-source community documentation (for example, the instructions for the C/C++ client).
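If GZIP must be used, larger batches amortize its CPU cost, since the producer compresses one batch at a time. A hedged sketch of the linger.ms and batch.size tuning (the values shown are illustrative starting points, not recommendations from this document):

```java
import java.util.Properties;

public class GzipBatchTuning {
    static Properties gzipProducerProps() {
        Properties props = new Properties();
        props.put("bootstrap.servers", "localhost:9092"); // placeholder address
        props.put("compression.type", "gzip");
        // Wait up to 100 ms so batches can fill before being sent...
        props.put("linger.ms", "100");
        // ...and allow batches up to 128 KB (the Kafka default is 16 KB):
        // larger batches compress better and spread GZIP's CPU cost.
        props.put("batch.size", String.valueOf(128 * 1024));
        return props;
    }
}
```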
