Python Access

Last updated: 2024-07-31 17:33:57
DLC provides a connector compliant with the Python DB-API 2.0 standard (PEP 249). You can connect to DLC's Presto/Spark engines from Python and conveniently run SQL operations on DLC database tables.

Environment preparations

1. Python 3.9 or later.
2. Install tencentcloud-dlc-connector:
pip install -i https://mirrors.tencent.com/pypi/simple/ tencentcloud-dlc-connector
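
To verify the installation, you can check the installed package with pip (a standard pip command; note that some features described below require connector version 1.1.0 or 1.1.3 and later):
pip show tencentcloud-dlc-connector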

Usage Examples

Step 1: Connect to the engine

Code:
import tdlc_connector
import datetime
from tdlc_connector import constants

conn = tdlc_connector.connect(
    region="<REGION>",
    secret_id="<SECRET_ID>",
    secret_key="<SECRET_KEY>",
    token=None,
    endpoint=None,
    catalog=constants.Catalog.DATALAKECATALOG,
    engine="<ENGINE>",
    engine_type=constants.EngineType.AUTO,
    result_style=constants.ResultStyles.LIST,
    download=False,
    mode=constants.Mode.LASY,
    database='',
    config={},
    callback=None,
    callback_events=None,
)
Parameter description:
region: Engine region, such as ap-nanjing, ap-beijing, ap-guangzhou, ap-shanghai, ap-chengdu, ap-chongqing, na-siliconvalley, ap-singapore, ap-hongkong.
secret_id: Tencent Cloud SecretId.
secret_key: Tencent Cloud SecretKey.
token: (Optional) Temporary secret token.
endpoint: (Optional) Service endpoint to connect to.
engine: Name of the engine to use, for example "test_python".
engine_type: (Optional) Engine type, matching the type of the named engine. Default value: constants.EngineType.AUTO. Possible values: AUTO, PRESTO, SPARK, SPARK_BATCH.
result_style: (Optional) Format of returned results; possible values: LIST, DICT.
download: (Optional) Whether to download data directly, True/False; see the download mode description below.
mode: (Optional) Fetch mode; possible values: ALL, LASY, STREAM (note that the library spells the constant LASY).
database: (Optional) Default database.
config: (Optional) Configuration submitted to the cluster.
callback: (Optional) Callback function, with signature def cb(statement_id, status).
callback_events: (Optional) Events that trigger the callback; used together with callback. See the callback mechanism description below.
driver_size: (Optional) Driver node size. Default value: constants.PodSize.SMALL (only valid for SPARK_BATCH clusters). Possible values: SMALL, MEDIUM, LARGE, XLARGE, M_SMALL, M_MEDIUM, M_LARGE, M_XLARGE.
executor_size: (Optional) Executor node size. Default value: constants.PodSize.SMALL (only valid for SPARK_BATCH clusters). Possible values: SMALL, MEDIUM, LARGE, XLARGE, M_SMALL, M_MEDIUM, M_LARGE, M_XLARGE.
executor_num: (Optional) Number of executor nodes. Default value: 1 (only valid for SPARK_BATCH clusters).
executor_max_num: (Optional) Maximum number of executor nodes. If not equal to executor_num, dynamic resource allocation is enabled (only valid for SPARK_BATCH clusters).

Download mode description:

1. download=False, mode=ALL: Fetch all data through the API; data can be fetched only after the task completes.
2. download=False, mode=LASY: Fetch data through the API, deferring server-side fetches according to how much data has been consumed.
3. download=False, mode=STREAM: Same as LASY mode.
4. download=True, mode=ALL: Download all results from COS (requires COS read permission) using local temporary storage; recommended for large data volumes.
5. download=True, mode=LASY: Download results from COS (requires COS read permission), deferring file downloads according to how much data has been fetched.
6. download=True, mode=STREAM: Read the result stream from COS in real time (requires COS read permission); slower, but with extremely low local memory and disk usage.
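
For example, for a large result set you can combine download=True with LASY mode so that result files are pulled from COS only as rows are consumed (a minimal sketch based on the parameters above; placeholders as in Step 1, and COS read permission is required):
import tdlc_connector
from tdlc_connector import constants

conn = tdlc_connector.connect(
    region="<REGION>",
    secret_id="<SECRET_ID>",
    secret_key="<SECRET_KEY>",
    engine="<ENGINE>",
    result_style=constants.ResultStyles.LIST,
    download=True,               # read results from COS instead of through the API
    mode=constants.Mode.LASY,    # defer downloading result files until rows are fetched
)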

Step 2: Execute SQL

Code:
# Basic operations

cursor = conn.cursor()
count = cursor.execute("SELECT 1")
print(cursor.fetchone())    # Read one row of data
for row in cursor.fetchall():    # Read the remaining rows
    print(row)

# Use pyformat-style parameter binding

cursor.execute("SELECT * FROM dummy WHERE date < %s", datetime.datetime.now())
cursor.execute("SELECT * FROM dummy WHERE status in %s", (('SUCCESS', 'INIT', 'FAIL'),))
cursor.execute("SELECT * FROM dummy WHERE date < %(date)s AND status = %(status)s", {'date': datetime.datetime.now(), 'status': 'SUCCESS'})

# Bulk insert with executemany

cursor.executemany("INSERT INTO dummy VALUES(%s, %s)", [('Zhang San', 18), ('Li Si', 20)])

Basic Operation Procedure

The preceding code works as follows:
1. A cursor object is created with conn.cursor().
2. A SQL statement is executed with cursor.execute("SELECT 1"), and its return value is assigned to the variable count.
3. One row of the result is read with cursor.fetchone() and printed; the remaining rows are read with cursor.fetchall().
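
Since the connector follows DB-API 2.0, large results can also be consumed in batches with the standard cursor.fetchmany() method (a sketch; assumes the dummy table from the examples above):
cursor.execute("SELECT * FROM dummy")
while True:
    rows = cursor.fetchmany(1000)   # fetch up to 1,000 rows per batch
    if not rows:
        break
    for row in rows:
        print(row)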

Features

Description of the callback mechanism

import tdlc_connector
import datetime
from tdlc_connector import constants


def tdlc_connector_callback(statement_id, state):
    '''
    params: statement_id Task ID
    params: state Task status; an enumeration value from constants.TaskStatus
    '''
    print(statement_id, state)


conn = tdlc_connector.connect(
    region="<REGION>",
    secret_id="<SECRET_ID>",
    secret_key="<SECRET_KEY>",
    engine="<ENGINE>",
    engine_type=constants.EngineType.SPARK,
    result_style=constants.ResultStyles.LIST,
    callback=tdlc_connector_callback,
    callback_events=[constants.CallbackEvent.ON_INIT, constants.CallbackEvent.ON_SUCCESS]
)

cursor = conn.cursor()
cursor.execute("SELECT 1")
cursor.fetchone()

# The callback is invoked when the task is initialized and when it succeeds
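
Beyond printing, a callback can collect state transitions for later inspection (a minimal sketch using only the documented callback signature; the variable and function names are illustrative):
import tdlc_connector
from tdlc_connector import constants

transitions = []

def audit_callback(statement_id, state):
    # Record each (statement_id, state) pair; state is a constants.TaskStatus value
    transitions.append((statement_id, state))

conn = tdlc_connector.connect(
    region="<REGION>",
    secret_id="<SECRET_ID>",
    secret_key="<SECRET_KEY>",
    engine="<ENGINE>",
    result_style=constants.ResultStyles.LIST,
    callback=audit_callback,
    callback_events=[constants.CallbackEvent.ON_INIT, constants.CallbackEvent.ON_SUCCESS]
)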


Submit the task to the job cluster

Currently, tasks can be submitted to Spark job engines. See the following example.
import tdlc_connector
from tdlc_connector import constants

conn = tdlc_connector.connect(
    region="<REGION>",
    secret_id="<SECRET_ID>",
    secret_key="<SECRET_KEY>",
    engine="<ENGINE>",                          # select a Spark job engine
    result_style=constants.ResultStyles.LIST,
    driver_size=constants.PodSize.SMALL,        # driver specification
    executor_size=constants.PodSize.SMALL,      # executor specification
    executor_num=1,                             # number of executors
    executor_max_num=1,                         # maximum number of executors; a value different from executor_num enables dynamic resource allocation
)
Note:
Upgrade the connector to >= 1.1.0 to use this feature.
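
To actually enable dynamic resource allocation, set executor_max_num higher than executor_num (a sketch; the specific numbers are illustrative only):
import tdlc_connector
from tdlc_connector import constants

conn = tdlc_connector.connect(
    region="<REGION>",
    secret_id="<SECRET_ID>",
    secret_key="<SECRET_KEY>",
    engine="<ENGINE>",
    result_style=constants.ResultStyles.LIST,
    executor_num=2,         # baseline number of executors
    executor_max_num=10,    # differs from executor_num, so dynamic resource allocation is enabled
)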

Automatically infer engine type

You do not need to specify the engine type; the connector can infer it automatically. See the following example.
import tdlc_connector
from tdlc_connector import constants

conn = tdlc_connector.connect(
    region="<REGION>",
    secret_id="<SECRET_ID>",
    secret_key="<SECRET_KEY>",
    engine="<ENGINE>",
    engine_type=constants.EngineType.AUTO   # set this parameter to AUTO or omit it to let the driver infer the engine type
)
Note:
Upgrade the connector to >= 1.1.0 to use this feature.

Null conversion

Result sets are currently stored in CSV format, and by default the engine converts NULL values into empty strings. If you need to distinguish NULL values, specify a NULL marker string such as "\\1": the engine will then write "\\1" for NULL values in query results, and the driver will convert "\\1" fields back into None. See the following example.
import tdlc_connector
from tdlc_connector import constants, formats

formats.FORMAT_STRING_NULL = '\\1'

conn = tdlc_connector.connect(
    region="<REGION>",
    secret_id="<SECRET_ID>",
    secret_key="<SECRET_KEY>",
    engine="<ENGINE>",
    result_style=constants.ResultStyles.LIST
)
Note:
Null conversion currently supports only SparkSQL clusters. Upgrade the connector to version 1.1.3 or later.
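
With the marker configured, NULL and the empty string can be told apart in fetched rows (a minimal sketch; assumes the connection above against a SparkSQL engine):
cursor = conn.cursor()
cursor.execute("SELECT NULL, ''")
print(cursor.fetchone())   # the NULL column is returned as None, while the empty string stays ''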

