
Tencent Cloud TCHouse-D

Importing Data into Doris from Logstash

Last updated: 2024-06-27 11:05:22
To import data from Logstash into Doris, you need the Doris output plugin for Logstash. The plugin talks to the Doris FE HTTP interface over HTTP and uses Doris Stream Load to import the data.
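To make the HTTP interaction concrete, the following is a minimal Python sketch of how a Stream Load request could be assembled. It is not the plugin's actual code: the function name build_stream_load_request is hypothetical, the host, database, table and credentials are the placeholders used in the sample configuration, and the request is only built, not sent.

```python
import base64
import urllib.request


def build_stream_load_request(fe_host, db, table, user, password, label, body,
                              column_separator=","):
    """Build (but do not send) an HTTP PUT request for Doris Stream Load.

    Hypothetical helper for illustration; the plugin's real code may differ.
    """
    # Stream Load endpoint exposed by the FE HTTP interface.
    url = f"{fe_host}/api/{db}/{table}/_stream_load"
    # HTTP Basic authentication with the Doris user and password.
    token = base64.b64encode(f"{user}:{password}".encode("utf-8")).decode("ascii")
    headers = {
        "Authorization": f"Basic {token}",
        "Expect": "100-continue",
        "label": label,                        # import identification
        "column_separator": column_separator,  # import option sent as a header
    }
    return urllib.request.Request(url, data=body, headers=headers, method="PUT")


# Placeholder values taken from the sample configuration in this document.
req = build_stream_load_request(
    "http://fehost:8030", "db_name", "table_name",
    "user_name", "password", "label_prefix_demo", b"a,b,c,d,e\n",
)
print(req.get_method(), req.full_url)
```

Import options such as column_separator travel as HTTP headers, which is why the plugin settings described later map closely onto the Stream Load parameters.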

Installation and Compilation

1. Download Source Code

The plugin source code is part of the Doris source tree, so download the Doris source code first.

2. Compile

Run the following command in the extension/logstash/ directory of the Doris source code:
gem build logstash-output-doris.gemspec
This produces the file logstash-output-doris-{version}.gem in the same directory.

3. Plugin Installation

Copy logstash-output-doris-{version}.gem to the Logstash installation directory, then run the following command to install the logstash-output-doris plugin:
./bin/logstash-plugin install logstash-output-doris-{version}.gem

Configuration

Sample code

Create a new configuration file named logstash-doris.conf in the config directory with the following content:
output {
    doris {
        http_hosts => [ "http://fehost:8030" ]
        user => "user_name"
        password => "password"
        db => "db_name"
        table => "table_name"
        label_prefix => "label_prefix"
        column_separator => ","
    }
}

Configuration Instructions

Connection-related configuration:

| Configuration | Description |
|---|---|
| http_hosts | FE's HTTP interaction address. For example: ["http://fe1:8030", "http://fe2:8030"] |
| user | Username. This user needs import permissions on the corresponding Doris database and table. |
| password | Password |
| db | Database name |
| table | Table name |
| label_prefix | Import identification prefix. The final identification is {label_prefix}_{db}_{table}_{time_stamp}. |
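
As an illustration of the label pattern {label_prefix}_{db}_{table}_{time_stamp}, here is a small Python sketch. The function name make_label is hypothetical, and the timestamp format (milliseconds since the epoch) is an assumption, since this document does not specify how the plugin formats it.

```python
import time


def make_label(label_prefix, db, table, now=None):
    """Compose an import label following {label_prefix}_{db}_{table}_{time_stamp}.

    Assumption: the timestamp is epoch milliseconds; the plugin may use
    a different format.
    """
    seconds = time.time() if now is None else now
    time_stamp = int(seconds * 1000)
    return f"{label_prefix}_{db}_{table}_{time_stamp}"


# The label changes on every call because the timestamp changes.
print(make_label("doris", "logstash_output_test", "output"))
```

Because the timestamp differs between batches, each batch gets its own import identification.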
Import-related configuration (see the Stream Load Manual for details):

| Configuration | Description |
|---|---|
| column_separator | Column delimiter. The default is \t. |
| columns | Specifies the mapping between the columns in the import file and the columns in the table. |
| where | Filter condition for the import task. |
| max_filter_ratio | Maximum tolerated ratio of filtered rows for the import task. The default is 0 (zero tolerance). |
| partition | Partition information of the table to be imported. |
| timeout | Timeout. The default is 600s. |
| strict_mode | Strict mode. The default is false. |
| timezone | Time zone used for this import. The default is UTC+8 (the East Eight time zone). |
| exec_mem_limit | Import memory limit, in bytes. The default is 2GB. |
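
To show what max_filter_ratio means in practice, the following Python sketch checks whether a batch stays within the tolerance. The function name within_filter_tolerance is hypothetical, and computing the ratio as filtered rows divided by total rows is an assumption made for illustration; the Stream Load Manual is the reference for how Doris actually counts rows.

```python
def within_filter_tolerance(total_rows, filtered_rows, max_filter_ratio=0.0):
    """Return True if the share of filtered rows does not exceed the tolerance.

    Assumption: ratio = filtered_rows / total_rows. With the default of 0,
    a single filtered row is enough to exceed the tolerance.
    """
    if total_rows == 0:
        return True  # nothing was read, so nothing was filtered
    return filtered_rows / total_rows <= max_filter_ratio


print(within_filter_tolerance(100, 1))        # default: zero tolerance
print(within_filter_tolerance(100, 1, 0.05))  # up to 5% of rows may be filtered
```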
Other configurations:

| Configuration | Description |
|---|---|
| save_on_failure | Whether to save the data locally if the import fails. The default is true. |
| save_dir | Local save directory. The default is /tmp. |
| automatic_retries | Maximum number of retries on failure. The default is 3. |
| batch_size | Maximum number of events processed in each batch. The default is 100,000. |
| idle_flush_time | Maximum interval between flushes, in seconds. The default is 20. |
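
The interplay of batch_size and idle_flush_time can be sketched as a buffer that flushes either when it is full or when enough time has passed. This is a simplified Python model, not the plugin's implementation: the class BatchBuffer and its methods are hypothetical, and the exact flushing rules of the plugin may differ.

```python
import time


class BatchBuffer:
    """Collect events and release them as batches (illustrative model only)."""

    def __init__(self, batch_size=100_000, idle_flush_time=20, clock=time.monotonic):
        self.batch_size = batch_size            # max events per batch
        self.idle_flush_time = idle_flush_time  # max seconds between flushes
        self.clock = clock                      # injectable for testing
        self.events = []
        self.last_flush = clock()

    def add(self, event):
        """Buffer an event; return a batch if the size limit is reached."""
        self.events.append(event)
        if len(self.events) >= self.batch_size:
            return self.flush()
        return None

    def tick(self):
        """Call periodically; return a batch if the time limit has elapsed."""
        idle = self.clock() - self.last_flush
        if self.events and idle >= self.idle_flush_time:
            return self.flush()
        return None

    def flush(self):
        """Hand over all buffered events and reset the timer."""
        batch, self.events = self.events, []
        self.last_flush = self.clock()
        return batch


buf = BatchBuffer(batch_size=2)
buf.add("a,b,c,d,e")
print(buf.add("f,g,h,i,j"))  # the second event fills the batch
```

With a low event rate, idle_flush_time bounds how long data waits before being imported; with a high event rate, batch_size bounds how much data goes into one import.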

Startup

Run the following command to start Logstash with the Doris output plugin:
{logstash-home}/bin/logstash -f {logstash-home}/config/logstash-doris.conf --config.reload.automatic

Complete Usage Example

1. Compile doris-output-plugin

1. Download the Ruby package from the Ruby official website. The version used here is 2.7.1.
2. Compile and install Ruby, and configure its environment variables.
3. Go to the extension/logstash/ directory of the Doris source code and run:
gem build logstash-output-doris.gemspec
This produces the file logstash-output-doris-0.1.0.gem, which completes the compilation.

2. Install and configure filebeat

Note: Filebeat is used as the input source here.
1. Download the Filebeat tar package from the ES official website and decompress it.
2. Enter the Filebeat directory and modify the configuration file filebeat.yml as follows:
filebeat.inputs:
- type: log
  paths:
    - /tmp/doris.data
output.logstash:
  hosts: ["localhost:5044"]
3. Start filebeat:
./filebeat -e -c filebeat.yml -d "publish"

3. Install logstash and doris-output-plugin

1. Download the Logstash tar package from the ES official website and unpack it.
2. Copy the logstash-output-doris-0.1.0.gem file obtained in Step 1 to the Logstash installation directory.
3. Run the following command to install the plugin:
./bin/logstash-plugin install logstash-output-doris-0.1.0.gem
4. Create a new configuration file named logstash-doris.conf in the config directory with the following content:
input {
    beats {
        port => "5044"
    }
}

output {
    doris {
        http_hosts => [ "http://127.0.0.1:8030" ]
        user => "doris"
        password => "doris"
        db => "logstash_output_test"
        table => "output"
        label_prefix => "doris"
        column_separator => ","
        columns => "a,b,c,d,e"
    }
}
Adjust these values for your environment according to the configuration instructions.
5. Start logstash:
./bin/logstash -f ./config/logstash-doris.conf --config.reload.automatic

4. Test Features

Append a line of data to /tmp/doris.data:
echo a,b,c,d,e >> /tmp/doris.data
Observe the Logstash log. If the Status of the returned response is Success, the import succeeded, and you can view the imported data in the logstash_output_test.output table.
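
If you prefer to check the result programmatically rather than reading the log, the Status check can be sketched in Python as follows. The function name load_succeeded is hypothetical, and the sample response is an illustrative assumption, not real output copied from Doris; only the Status field and its Success value are taken from the description above.

```python
import json


def load_succeeded(response_text):
    """Return True if a Stream Load JSON response reports Status "Success"."""
    result = json.loads(response_text)
    return result.get("Status") == "Success"


# Illustrative response body (assumed shape, not captured from a real cluster).
sample = '{"Status": "Success", "Message": "OK", "NumberLoadedRows": 1}'
print(load_succeeded(sample))
```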
