How to Write a Custom Flume Sink That Stores Data in MySQL

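I. Introduction

A sink continuously polls its channel for events, removes them in batches, and either writes those batches to a storage or indexing system or sends them to another Flume agent.

Sinks are fully transactional. Before removing a batch of events from the channel, each sink starts a transaction with the channel. Once the batch has been written successfully to the storage system or to the next Flume agent, the sink commits the transaction through the channel. Only after the commit does the channel delete those events from its internal buffer.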
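The take/commit/rollback behavior described above can be illustrated with a toy model. This is a minimal sketch using only the JDK; ToyChannel and its methods are hypothetical and are not Flume's Channel or Transaction API. It only shows why a rolled-back event is still available for the next attempt.

```java
import java.util.ArrayDeque;
import java.util.ArrayList;
import java.util.Deque;
import java.util.List;

/** Toy model of a transactional channel; NOT Flume's implementation. */
final class ToyChannel {
    // Events waiting to be delivered.
    private final Deque<String> buffer = new ArrayDeque<>();
    // Events taken in the current transaction but not yet committed.
    private final List<String> inFlight = new ArrayList<>();

    void put(String event) {
        buffer.addLast(event);
    }

    /** Takes the next event inside the open transaction, or null if the buffer is empty. */
    String take() {
        String event = buffer.pollFirst();
        if (event != null) {
            inFlight.add(event);
        }
        return event;
    }

    /** Commit: the taken events are dropped for good. */
    void commit() {
        inFlight.clear();
    }

    /** Rollback: the taken events go back to the head of the buffer in their original order. */
    void rollback() {
        for (int i = inFlight.size() - 1; i >= 0; i--) {
            buffer.addFirst(inFlight.get(i));
        }
        inFlight.clear();
    }

    int size() {
        return buffer.size();
    }
}
```

In this model a failed write followed by rollback() leaves the buffer unchanged, which matches the delivery guarantee the paragraph describes: nothing is deleted until the sink has committed.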

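Sink destinations include hdfs, logger, avro, thrift, ipc, file, null, HBase, solr, and custom sinks. Flume ships with many sink types, but they do not always cover what a real project needs. In that case we have to write a custom sink for the requirement at hand.

The official documentation describes the interface for custom sinks:

https://flume.apache.org/FlumeDeveloperGuide.html#sink

According to that documentation, a custom sink such as MySink must extend the AbstractSink class and implement the Configurable interface.

It implements the following methods:

configure(Context context) // initializes the sink from the context (reads values from the configuration file)
process() // takes data (events) from the channel; Flume calls this method in a loop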

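Use case:

Reading data from a channel and writing it to MySQL or to a file system.

II. Requirement

Use Flume to receive records of the form (id,name,age), split each record in the sink, and save the fields to a MySQL database through JDBC.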
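The splitting step can be sketched on its own, before any Flume or JDBC code is involved. The example below assumes each record is one line with exactly three comma-separated fields and an integer third field, as in the test input 1,ttt,8 used later in this article. The class name ParsedRow is hypothetical.

```java
/** Hypothetical value class for one parsed "id,name,age" record. */
final class ParsedRow {
    final String id;
    final String name;
    final int age;

    private ParsedRow(String id, String name, int age) {
        this.id = id;
        this.name = name;
        this.age = age;
    }

    /** Parses one line; throws IllegalArgumentException if the line is malformed. */
    static ParsedRow parse(String line) {
        // The -1 limit keeps trailing empty fields so "1,ttt," is still seen as three fields.
        String[] parts = line.trim().split(",", -1);
        if (parts.length != 3) {
            throw new IllegalArgumentException(
                    "expected 3 fields but got " + parts.length + ": " + line);
        }
        try {
            return new ParsedRow(parts[0].trim(), parts[1].trim(),
                    Integer.parseInt(parts[2].trim()));
        } catch (NumberFormatException e) {
            throw new IllegalArgumentException("age is not an integer: " + line, e);
        }
    }
}
```

Validating the field count up front turns a malformed line into one clear error instead of an ArrayIndexOutOfBoundsException in the middle of the sink.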

III. Writing MySink

package com.flume.flume;
 
import org.apache.flume.*;
import org.apache.flume.conf.Configurable;
import org.apache.flume.sink.AbstractSink;
 
import java.nio.charset.StandardCharsets;
import java.sql.Connection;
import java.sql.DriverManager;
import java.sql.PreparedStatement;
import java.sql.SQLException;
 
public class MySink extends AbstractSink implements Configurable {
 
    private String msgPrefix;
 
    /**
     * Takes one event from the channel and saves it to MySQL.
     * Flume calls this method repeatedly.
     * @return READY if an event was stored; BACKOFF if the channel was empty or the write failed
     * @throws EventDeliveryException declared by the Sink interface
     */
    @Override
    public Status process() throws EventDeliveryException {
        // Get the channel bound to this sink
        Channel channel = getChannel();
        // Get the transaction object
        Transaction transaction = channel.getTransaction();
        try {
            // Begin the transaction
            transaction.begin();
            // Take one event from the channel; null means the channel is currently empty
            Event event = channel.take();
            if (event == null) {
                transaction.commit();
                return Status.BACKOFF;
            }
 
            // Split the record into its fields
            String data = new String(event.getBody(), StandardCharsets.UTF_8).trim();
            String[] arr = data.split(",");
            String id = arr[0];
            String name = arr[1];
            int age = Integer.parseInt(arr[2].trim());
 
            // Save to MySQL
            // 1. Open the connection and 2. prepare the statement.
            // A PreparedStatement with ? placeholders is used instead of connection.createStatement()
            // to avoid SQL injection, e.g. select * from table where name='zhangsan' or 1=1
            // 5. try-with-resources closes the statement and the connection
            try (Connection connection = DriverManager.getConnection(
                         "jdbc:mysql://hadoop102:3306/test?useSSL=false", "root", "123321");
                 PreparedStatement statement = connection.prepareStatement(
                         "insert into test values(?,?,?)")) {
                saveToMysql(id, name, age, statement);
            }
            // Simulated save, useful for debugging without a database
            //System.out.println(msgPrefix + ":" + data);
            // Commit the transaction so the channel can drop the event
            transaction.commit();
            return Status.READY;
        } catch (Exception e) {
            // Roll back so the event stays in the channel.
            // Note: a malformed record is rolled back too and will be retried.
            transaction.rollback();
            e.printStackTrace();
            return Status.BACKOFF;
        } finally {
            // Close the transaction
            transaction.close();
        }
    }
 
    public void saveToMysql(String id, String name, int age, PreparedStatement statement) throws SQLException {
        // 3. Bind the parameters
        statement.setString(1, id);
        statement.setString(2, name);
        statement.setInt(3, age);
        System.out.println(id + "," + name + "," + age);
        // 4. Execute the insert
        statement.executeUpdate();
    }
 
    /**
     * Reads the sink's configuration properties.
     * @param context the sink's configuration context
     */
    @Override
    public void configure(Context context) {
        msgPrefix = context.getString("msg.prefix");
    }
}
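The sink above hard-codes the JDBC URL and credentials. One option is to read them from the sink's properties in configure(Context), for example with context.getString(key, defaultValue). The sketch below shows only the assembly logic and uses a plain Map as a stand-in for the Flume Context so that it runs without Flume on the classpath. The class name JdbcSettings and the jdbc.* property keys are assumptions made for this example and are not defined by Flume.

```java
import java.util.Map;

/** Hypothetical holder for JDBC settings read from sink properties. */
final class JdbcSettings {
    final String url;
    final String user;
    final String password;

    private JdbcSettings(String url, String user, String password) {
        this.url = url;
        this.user = user;
        this.password = password;
    }

    /** Builds the settings; props plays the role of the sink's configuration. */
    static JdbcSettings from(Map<String, String> props) {
        // Host and port fall back to defaults; the rest must be configured.
        String host = props.getOrDefault("jdbc.host", "localhost");
        String port = props.getOrDefault("jdbc.port", "3306");
        String database = require(props, "jdbc.database");
        String url = "jdbc:mysql://" + host + ":" + port + "/" + database + "?useSSL=false";
        return new JdbcSettings(url, require(props, "jdbc.user"), require(props, "jdbc.password"));
    }

    /** Returns the value for key or fails fast with a clear message. */
    private static String require(Map<String, String> props, String key) {
        String value = props.get(key);
        if (value == null || value.isEmpty()) {
            throw new IllegalArgumentException("missing required property: " + key);
        }
        return value;
    }
}
```

With such a helper, moving the agent to another database host would only require changing the agent's properties file, not recompiling the sink.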

IV. Writing the Flume Configuration

# Define the agent
a1.sources = r1
a1.channels = c1
a1.sinks = k1
 
# Define the source
a1.sources.r1.type = netcat
a1.sources.r1.bind = 0.0.0.0
a1.sources.r1.port = 9999
 
# Define the channel
a1.channels.c1.type = memory
a1.channels.c1.capacity = 1000
a1.channels.c1.transactionCapacity = 1000
 
# Define the sink; the type must be the fully qualified class name of MySink
a1.sinks.k1.type = com.flume.flume.MySink
a1.sinks.k1.msg.prefix = message
 
# Bind the source and the sink to the channel
a1.sources.r1.channels = c1
a1.sinks.k1.channel = c1
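Flume loads a custom sink by the class name given in the sink's type property, so that value has to match the package and class name of the compiled sink (com.flume.flume.MySink for the class shown earlier). A quick check can be sketched with java.util.Properties, since the agent file uses the standard properties format. The helper class AgentConfigCheck is hypothetical and is not part of Flume.

```java
import java.io.IOException;
import java.io.StringReader;
import java.io.UncheckedIOException;
import java.util.Properties;

/** Hypothetical helper that reads a sink's type from agent configuration text. */
final class AgentConfigCheck {
    /** Returns the configured type of the given sink, or null if it is not set. */
    static String sinkType(String configText, String agent, String sink) {
        Properties props = new Properties();
        try {
            // Properties.load skips lines starting with '#' and splits each entry on '='.
            props.load(new StringReader(configText));
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        String value = props.getProperty(agent + ".sinks." + sink + ".type");
        return value == null ? null : value.trim();
    }
}
```

Comparing the returned value with MySink.class.getName() before starting the agent would surface a mismatched package name early.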

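V. Testing

1. Start Flume

[hadoop@hadoop102 ~]$ cd /opt/module/flume/
[hadoop@hadoop102 flume]$ bin/flume-ng agent -c conf/ -n a1 -f job/mysik.config -Dflume.root.logger=INFO,console

2. Connect to the port with nc and send a record

[hadoop@hadoop102 ~]$ nc hadoop102 9999
1,ttt,8
OK

3. Check the agent's console output

4. Check the MySQL database

Summary

This write-up is based on personal experience. I hope it serves as a useful reference, and thank you for supporting IT俱乐部.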
