Tencent Cloud

Data Lake Compute


Quick Start with Data Analytics in Data Lake Compute

Last updated: 2026-01-30 15:52:43
Data Lake Compute lets you query and analyze COS data in as little as a minute. CSV, ORC, Parquet, JSON, Avro, and text files are currently supported.

Preliminary Preparations

Before initiating a query, you need to activate the internal permissions of Data Lake Compute and configure the path for query results.

Step 1: Establish the necessary internal permissions for Data Lake Compute.

Note
If you already have the necessary permissions, or you are the root account administrator, you can skip this step.
If you are logging in as a sub-account for the first time, then in addition to the necessary CAM authorization, you must ask a Data Lake Compute admin or the root account admin to grant you the required Data Lake Compute permissions on the Permission Management page in the left sidebar of the Data Lake Compute console (for a detailed explanation of permissions, see DLC Permission Overview).
1. Table permissions: Grant read and write permissions on the corresponding catalog, database, table, and view.
2. Engine permissions: Grant usage, monitoring, and modification permissions on the compute engine.
Note
The system automatically provides each user with a shared public engine based on the Presto kernel, so you can try Data Lake Compute quickly without purchasing a private cluster first.
For detailed steps on granting permissions, please refer to Sub-account Permission Management.

Step 2: Configure the path for query results.

Method 1:

Upon initial use of Data Lake Compute, you must first configure the path for query results. Once configured, the query results will be saved to this COS path.
1. Log in to the Data Lake Compute DLC console and select the service region.
2. Navigate to Data Exploration via the left sidebar menu.
3. In the upper-right corner of the Data Exploration page, click Storage Configuration, then click Modify configuration to set the query result path.
For example, if you select DLC internal storage, SELECT query results are stored in storage space managed by Data Lake Compute (DLC), with Cloud Object Storage (COS) as the underlying storage. Results are retained for 36 hours.
If there is no available COS bucket in your account, you can create one in the COS console.


Method 2:

1. Log in to the DLC console and select the Region for the service.
2. Enter Storage Configuration from the left sidebar.
3. On the Storage Configuration page, click Modify configuration to set the query result path.
For example, if you select DLC internal storage, SELECT query results are stored in storage space managed by DLC, with Cloud Object Storage (COS) as the underlying storage. Results are retained for 36 hours.
If there is no available COS bucket in your account, you can create one in the COS console.


Analysis Steps

Step 1. Create a database

If you are familiar with SQL statements, you can write the CREATE DATABASE statement directly in the query editor and skip the creation wizard.
1. Log in to the Data Lake Compute console and select the service region.
2. Select Data Exploration on the left sidebar.
3. Select Database and Tables, click +, then select Create a database to create a database. Enter the database name and description.

4. After selecting an execution engine in the top-right corner, run the CREATE DATABASE statement.
For detailed operation steps and configuration modes, refer to Database Management Center (DMC).
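For reference, the statement generated by the wizard is of the following form (Spark syntax is assumed; the database name and comment are examples, so replace them with your own):

```sql
-- Create a database in the default DLC catalog; the name is an example.
CREATE DATABASE IF NOT EXISTS `demo2` COMMENT 'Demo database for quick start';
```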

Step 2. Create an external table

If you are familiar with SQL statements, write the CREATE TABLE statement in the query and skip the creation wizard.
1. Log in to the Data Lake Compute console and select the service region.
2. Select Data Exploration on the left sidebar.
3. On the Database and Tables page, click the database you created, hover over the three-dot icon on the right, click it, and then select Create external table.
Note:
An external table points to data files stored in a COS bucket under your account. You can create it directly in Data Lake Compute for analysis, with no need to load additional data. Because the table is external, running DROP TABLE deletes only its metadata; your original data remains intact.

4. Generate the table creation statement according to the wizard. Complete each step in the order of Data path > Data format > Table format configuration item > Partitioning.
Step 1. Select the COS path of the data file (this must be a directory in a COS bucket, not the bucket itself). You can also quickly upload a file to COS from here. These operations require the relevant COS permissions.
Step 2. Select the data file format. Under Advanced options, you can enable automatic structure inference: the backend parses the file and generates the table's column definitions for you.
Note:
Structure inference is an auxiliary tool for table creation and may not be 100% accurate. You need to check and modify the field names and types as needed.



Step 3. Skip this step if the data is not partitioned. Partitioning can significantly improve query performance when queries filter on the partition columns. For more information on partitioning, refer to Querying Partition Table.

5. Click Complete to generate the SQL statement for table creation. Select a data engine and run the generated statement to complete the table creation.
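As a sketch, a wizard-generated statement for CSV data might look like the following (Hive-style Spark syntax is assumed; the columns, partition column, and bucket path are placeholders for illustration):

```sql
-- External table over CSV files in a COS directory; all names and the path are examples.
CREATE EXTERNAL TABLE IF NOT EXISTS `demo2`.`demo_audit_table` (
    `id`     BIGINT,
    `action` STRING,
    `c5`     STRING
)
PARTITIONED BY (`dt` STRING)  -- optional; omit if the data is not partitioned
ROW FORMAT DELIMITED FIELDS TERMINATED BY ','
STORED AS TEXTFILE
LOCATION 'cosn://examplebucket-1250000000/audit/';  -- must be a directory, not a bucket root
```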


Step 3. Run the SQL analysis

After the data is prepared, write the SQL analysis statement, select an appropriate compute engine, and start data analysis.

Sample

Example Note:
1. In the example, `dlc_test` is the database name and `dlc_test_v2` is the table name. Replace them based on your actual business scenario.
2. The WHERE condition in the example is for code reference only. Adjust the SQL analysis as needed for your business scenario.
Write a SQL statement that queries all records whose result is successful, select a compute engine, and run the statement.
SELECT * FROM `DataLakeCatalog`.`dlc_test`.`dlc_test_v2` WHERE c5 = 'successful'
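If the table is partitioned, including the partition column in the WHERE clause lets the engine scan only the matching partitions. A sketch, assuming a hypothetical partition column `dt` that is not part of the original example:

```sql
-- Filtering on a partition column prunes partitions; `dt` is a hypothetical example.
SELECT * FROM `DataLakeCatalog`.`dlc_test`.`dlc_test_v2`
WHERE dt = '2024-01-01' AND c5 = 'successful';
```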





