UDF Function Development Guide

Last updated: 2024-07-31 18:03:19

UDF Description

You can write UDFs, package them into JAR files, and register them as functions in Data Lake Compute (DLC) for use in query analysis. Currently, DLC UDFs follow the Hive format: the class extends org.apache.hadoop.hive.ql.exec.UDF and implements an evaluate method. Example: a simple array UDF that returns the difference between each element and the one before it.
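import java.util.ArrayList;
import org.apache.hadoop.hive.ql.exec.UDF;

// Returns the difference between each element and its predecessor; the first entry is 0.
public class MyDiff extends UDF {
    public ArrayList<Integer> evaluate(ArrayList<Integer> input) {
        if (input == null) {
            return null; // a NULL column value stays NULL instead of raising an exception
        }
        ArrayList<Integer> result = new ArrayList<Integer>();
        result.add(0, 0);
        for (int i = 1; i < input.size(); i++) {
            result.add(i, input.get(i) - input.get(i - 1));
        }
        return result;
    }
}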
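Before packaging the JAR, it can help to check the evaluate logic locally without the Hive dependency. The following is a minimal sketch under that assumption: MyDiffLogic and its diff method are hypothetical names that mirror the loop in MyDiff, and the null check is an added assumption rather than part of the DLC example.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

// Hypothetical helper: mirrors the loop of MyDiff.evaluate without extending the Hive UDF class.
class MyDiffLogic {
    // Returns the difference between each element and its predecessor; the first entry is 0.
    // Like the loop it mirrors, an empty input still yields a single 0.
    static List<Integer> diff(List<Integer> input) {
        if (input == null) {
            return null; // assumption: treat a missing array as a missing result
        }
        List<Integer> result = new ArrayList<>();
        result.add(0);
        for (int i = 1; i < input.size(); i++) {
            result.add(input.get(i) - input.get(i - 1));
        }
        return result;
    }

    public static void main(String[] args) {
        System.out.println(diff(Arrays.asList(3, 5, 10, 4))); // prints [0, 2, 5, -6]
    }
}
```

Keeping the logic in a plain static method like this is only one possible design; it allows a quick local run before the class is wrapped in a UDF and uploaded.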
Reference dependencies for the POM file:
<dependencies>
    <dependency>
        <groupId>org.slf4j</groupId>
        <artifactId>slf4j-log4j12</artifactId>
        <version>1.7.16</version>
        <scope>test</scope>
    </dependency>
    <dependency>
        <groupId>org.apache.hive</groupId>
        <artifactId>hive-exec</artifactId>
        <version>1.2.1</version>
    </dependency>
</dependencies>

Creating a Function

Note:
If you are creating a UDAF or UDTF, append the _udaf or _udtf suffix to the function name, respectively.
If you are familiar with SQL syntax, you can create a function by executing a CREATE FUNCTION statement in Data Exploration. Alternatively, use the visual interface as follows:
1. Log in to the Data Lake Compute Console and select the service region.
2. Open Data Management from the left sidebar and select the database in which to create the function. To create a new database, see Data Catalog and DMC.
3. Click Function to open the function management page.
4. Click Create Function.


The UDF package can be uploaded from your local machine or selected from a COS path (this requires the relevant COS permissions). For example, you can create the function by selecting a COS path. Function Class Name consists of the package information followed by the function execution class name.
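The Function Class Name described above appears to correspond to what Java calls a fully qualified class name: the package, a dot, then the class name. The sketch below illustrates that composition only. The package com.example.udf is a hypothetical value, and the helper class FunctionClassName is not part of DLC or Hive.

```java
// Sketch: composing and splitting a fully qualified Java class name.
// "com.example.udf" is a hypothetical package used for illustration only.
class FunctionClassName {
    // Joins package information and execution class name with a dot.
    static String join(String packageName, String className) {
        return packageName.isEmpty() ? className : packageName + "." + className;
    }

    // Returns the package part, or "" when the class is in the default package.
    static String packageOf(String qualifiedName) {
        int dot = qualifiedName.lastIndexOf('.');
        return dot < 0 ? "" : qualifiedName.substring(0, dot);
    }

    // Returns the class part after the last dot.
    static String simpleNameOf(String qualifiedName) {
        return qualifiedName.substring(qualifiedName.lastIndexOf('.') + 1);
    }

    public static void main(String[] args) {
        String name = join("com.example.udf", "MyDiff");
        System.out.println(name);               // prints com.example.udf.MyDiff
        System.out.println(packageOf(name));    // prints com.example.udf
        System.out.println(simpleNameOf(name)); // prints MyDiff
    }
}
```

If the class is declared without a package statement, the class name alone would be the full name under this reading.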

Function Usage

1. Log in to the Data Lake Compute Console and select the service region.
2. Open Data Exploration from the left navigation menu, select a compute engine, and invoke the function in SQL.



