Flow collection and analysis: an introduction to ClickHouse and its installation
Published: 2019-06-28


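Preface

I have recently been working on a project that collects and analyzes network flows. It follows the NetFlow and sFlow protocols to collect traffic from firewalls and core switches, store it, and analyze it afterwards. Both the concurrency and the volume of flow data are high, so storage is the bottleneck. Our first approach, storing the data in Prometheus, failed, and so did the InfluxDB approach we tested next. Prometheus does not seem well suited to large data volumes; once the volume grows too large, federation has to be considered.

After surveying related community projects, we decided to test ClickHouse as the storage backend.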
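On the collection side we did not use vflow. Instead, we wrote a dedicated flow input plugin for Telegraf, which outputs to a Kafka cluster. Consumers then read the data from Kafka and store it in ClickHouse for later analysis.
ClickHouse is a very good choice for a column-oriented analytical database, and its performance is strong. The official site publishes many performance comparisons with mainstream databases, which you can consult for more detail.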
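
ClickHouse generally handles a few large inserts better than many single-row inserts, so a consumer sitting between Kafka and ClickHouse would typically buffer records and write them in batches. The following is a minimal sketch of that buffering step only, not the project's actual consumer. The FlowRecord fields, the Batcher type, and the flush callback are all hypothetical names introduced for illustration; Kafka and ClickHouse are deliberately left out so the example runs on its own.

```go
package main

import (
	"fmt"
)

// FlowRecord is a hypothetical, simplified flow record; the real schema
// depends on the NetFlow/sFlow fields the collector emits.
type FlowRecord struct {
	SrcAddr string
	DstAddr string
	Bytes   uint64
}

// Batcher buffers records and hands them to flush in groups of up to size.
type Batcher struct {
	size  int
	buf   []FlowRecord
	flush func([]FlowRecord) error
}

// NewBatcher creates a Batcher that calls flush once size records are buffered.
func NewBatcher(size int, flush func([]FlowRecord) error) *Batcher {
	return &Batcher{size: size, buf: make([]FlowRecord, 0, size), flush: flush}
}

// Add appends one record and flushes when the buffer is full.
func (b *Batcher) Add(r FlowRecord) error {
	b.buf = append(b.buf, r)
	if len(b.buf) >= b.size {
		return b.Flush()
	}
	return nil
}

// Flush writes out whatever is buffered, even a partial batch.
func (b *Batcher) Flush() error {
	if len(b.buf) == 0 {
		return nil
	}
	batch := b.buf
	b.buf = make([]FlowRecord, 0, b.size)
	return b.flush(batch)
}

func main() {
	// In a real consumer, flush would run one INSERT transaction against
	// ClickHouse; here it only reports the batch size.
	b := NewBatcher(3, func(batch []FlowRecord) error {
		fmt.Printf("flushing %d records\n", len(batch))
		return nil
	})
	for i := 0; i < 7; i++ {
		rec := FlowRecord{SrcAddr: "10.0.0.1", DstAddr: "10.0.0.2", Bytes: uint64(i)}
		if err := b.Add(rec); err != nil {
			fmt.Println("flush failed:", err)
		}
	}
	if err := b.Flush(); err != nil { // write the final partial batch
		fmt.Println("flush failed:", err)
	}
}
```

With a batch size of 3 and 7 records, the callback runs three times: twice with a full batch and once with the single remaining record. A production consumer would probably also flush on a timer so that a quiet period does not leave records stuck in the buffer.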

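Installation process

Download the required RPM packages

There are five packages in total:

  • clickhouse-server-common-1.1.54236-4.el7.x86_64.rpm
  • clickhouse-server-1.1.54236-4.el7.x86_64.rpm
  • clickhouse-debuginfo-1.1.54236-4.el7.x86_64.rpm
  • clickhouse-client-1.1.54236-4.el7.x86_64.rpm
  • clickhouse-compressor-1.1.54236-4.el7.x86_64.rpm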

Resolving dependencies

Installing the server package fails with the following error:

    rpm -ivh clickhouse-server-1.1.54236-4.el7.x86_64.rpm
    error: Failed dependencies:
        libicudata.so.50()(64bit) is needed by clickhouse-server-1.1.54236-4.el7.x86_64
        libicui18n.so.50()(64bit) is needed by clickhouse-server-1.1.54236-4.el7.x86_64
        libicuuc.so.50()(64bit) is needed by clickhouse-server-1.1.54236-4.el7.x86_64
        libltdl.so.7()(64bit) is needed by clickhouse-server-1.1.54236-4.el7.x86_64
        libodbc.so.2()(64bit) is needed by clickhouse-server-1.1.54236-4.el7.x86_64

First, resolve the first three errors (the libicu libraries) by running:

    yum install libicu-devel

Then download and install libtool-ltdl, which provides libltdl.so.7:

    rpm -ivh libtool-ltdl-2.4.2-22.el7_3.x86_64.rpm

Then download and install unixODBC, which provides libodbc.so.2:

    rpm -ivh unixODBC-2.3.1-11.el7.x86_64.rpm

Installation

Install the five packages in the following order.

1

    rpm -ivh clickhouse-server-common-1.1.54236-4.el7.x86_64.rpm
    Preparing...                          ################################# [100%]
    Updating / installing...
       1:clickhouse-server-common-1.1.5423################################# [100%]

2

    rpm -ivh clickhouse-server-1.1.54236-4.el7.x86_64.rpm
    Preparing...                          ################################# [100%]
    Updating / installing...
       1:clickhouse-server-1.1.54236-4.el7################################# [100%]

3

    rpm -ivh clickhouse-debuginfo-1.1.54236-4.el7.x86_64.rpm
    Preparing...                          ################################# [100%]
    Updating / installing...
       1:clickhouse-debuginfo-1.1.54236-4.################################# [100%]

4

    rpm -ivh clickhouse-client-1.1.54236-4.el7.x86_64.rpm
    Preparing...                          ################################# [100%]
    Updating / installing...
       1:clickhouse-client-1.1.54236-4.el7################################# [100%]

5

    rpm -ivh clickhouse-compressor-1.1.54236-4.el7.x86_64.rpm
    Preparing...                          ################################# [100%]
    Updating / installing...
       1:clickhouse-compressor-1.1.54236-4################################# [100%]

Startup

Starting the service

    clickhouse-server --config-file=/etc/clickhouse-server/config.xml
    Include not found: clickhouse_remote_servers
    Include not found: clickhouse_compression
    2018.03.19 17:17:25.113898 [ 1 ] Application: Logging to console
    2018.03.19 17:17:25.117332 [ 1 ] : Starting daemon with revision 54236
    2018.03.19 17:17:25.117444 [ 1 ] Application: starting up
    2018.03.19 17:17:25.118273 [ 1 ] Application: rlimit on number of file descriptors is 1024000
    2018.03.19 17:17:25.118299 [ 1 ] Application: Initializing DateLUT.
    2018.03.19 17:17:25.118307 [ 1 ] Application: Initialized DateLUT with time zone `Asia/Shanghai'.
    2018.03.19 17:17:25.120309 [ 1 ] Application: Configuration parameter 'interserver_http_host' doesn't exist or exists and empty. Will use 'xxxx' as replica host.
    2018.03.19 17:17:25.120471 [ 1 ] ConfigReloader: Loading config `/etc/clickhouse-server/users.xml'
    2018.03.19 17:17:25.125606 [ 1 ] ConfigProcessor: Include not found: networks
    2018.03.19 17:17:25.125636 [ 1 ] ConfigProcessor: Include not found: networks
    2018.03.19 17:17:25.126753 [ 1 ] Application: Loading metadata.
    2018.03.19 17:17:25.127259 [ 1 ] DatabaseOrdinary (default): Total 0 tables.
    2018.03.19 17:17:25.127348 [ 1 ] DatabaseOrdinary (system): Total 0 tables.
    2018.03.19 17:17:25.127894 [ 1 ] Application: Loaded metadata.
    2018.03.19 17:17:25.128699 [ 1 ] Application: Listening http://[::1]:8123
    2018.03.19 17:17:25.128749 [ 1 ] Application: Listening tcp: [::1]:9000
    2018.03.19 17:17:25.128783 [ 1 ] Application: Listening interserver: [::1]:9009
    2018.03.19 17:17:25.128816 [ 1 ] Application: Listening http://10.xx.xx.136:8123
    2018.03.19 17:17:25.128845 [ 1 ] Application: Listening tcp: 10.xx.xx.136:9000
    2018.03.19 17:17:25.128872 [ 1 ] Application: Listening interserver: 10.xx.xx.136:9009
    2018.03.19 17:17:25.129116 [ 1 ] Application: Ready for connections.
    2018.03.19 17:17:27.120687 [ 2 ] ConfigReloader: Loading config `/etc/clickhouse-server/config.xml'
    2018.03.19 17:17:27.127614 [ 2 ] ConfigProcessor: Include not found: clickhouse_remote_servers
    2018.03.19 17:17:27.127701 [ 2 ] ConfigProcessor: Include not found: clickhouse_compression

Connecting with the client

    clickhouse-client --host=10.xx.xx.136 --port=9000
    ClickHouse client version 1.1.54236.
    Connecting to 10.xx.xx.136:9000.
    Connected to ClickHouse server version 1.1.54236.

    :)
    :)
    :)
    :)
    :)
    :) show tables;

    SHOW TABLES

    Ok.

    0 rows in set. Elapsed: 0.011 sec.

    :)

A simple test query

    :) select now()

    SELECT now()

    ┌───────────────now()─┐
    │ 2018-03-19 17:22:55 │
    └─────────────────────┘

    1 rows in set. Elapsed: 0.002 sec.

Running as a systemd service

Unit file /etc/systemd/system/clickhouse.service:

    [Unit]
    Description=clickhouse
    After=syslog.target
    After=network.target

    [Service]
    LimitAS=infinity
    LimitRSS=infinity
    LimitCORE=infinity
    LimitNOFILE=65536
    User=root
    Type=simple
    Restart=on-failure
    KillMode=control-group
    ExecStart=/usr/bin/clickhouse-server --config-file=/etc/clickhouse-server/config.xml
    RestartSec=10s

    [Install]
    WantedBy=multi-user.target

Performance test

Hardware configuration:

  • CPU Intel Core Processor (Haswell, no TSX) cores = 8, 2.6GHz, x86_64
  • Memory 16G
  • Drive SSD in software RAID

[Figures: performance test results (images not available)]

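A Golang driver for the ClickHouse column-oriented database

The Kafka consumers need to write data into ClickHouse, and our technology stack is mainly Golang, so we need a Golang driver for ClickHouse. This section introduces one.

Key features

  • Uses the native ClickHouse TCP client-server protocol
  • Compatible with the database/sql package
  • Load balancing based on a round-robin algorithm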

DSN

  • username/password - auth credentials
  • database - select the current default database
  • read_timeout/write_timeout - timeout in seconds
  • no_delay - disable/enable the Nagle Algorithm for tcp socket (default
    is 'true' - disable)
  • alt_hosts - comma separated list of single address host for
    load-balancing
  • connection_open_strategy - random/in_order (default random).

    • random - choose random server from set
    • in_order - first live server is chosen in specified order
  • block_size - maximum rows in block (default is 1000000). If there
    are more rows than this, the data is split into several blocks
    before being sent to the server
  • debug - enable debug output (boolean value)
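
To make the two connection_open_strategy values concrete, here is a small sketch of the selection semantics described in the list above. It is an illustration only and not the driver's own code: the pickHost function and the alive callback are hypothetical, and the liveness check is faked so the example runs without any server.

```go
package main

import (
	"errors"
	"fmt"
	"math/rand"
)

// pickHost returns the first host for which alive reports true.
// With strategy "random" the candidates are tried in a shuffled order;
// with any other value (such as "in_order") they are tried as listed.
func pickHost(hosts []string, strategy string, alive func(string) bool) (string, error) {
	candidates := append([]string(nil), hosts...) // copy so the caller's slice is untouched
	if strategy == "random" {
		rand.Shuffle(len(candidates), func(i, j int) {
			candidates[i], candidates[j] = candidates[j], candidates[i]
		})
	}
	for _, h := range candidates {
		if alive(h) {
			return h, nil
		}
	}
	return "", errors.New("no live server")
}

func main() {
	hosts := []string{"host1:9000", "host2:9000", "host3:9000"}
	// Hypothetical liveness check: pretend host1 is down.
	alive := func(h string) bool { return h != "host1:9000" }

	h, err := pickHost(hosts, "in_order", alive)
	fmt.Println(h, err) // prints: host2:9000 <nil>

	h, err = pickHost(hosts, "random", alive)
	fmt.Println(h, err) // host2:9000 or host3:9000, never the dead host1
}
```

The practical difference is that in_order gives a stable primary with the other hosts as fallbacks, while random spreads new connections across all live hosts.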

SSL/TLS parameters

  • secure - establish a secure connection (default is false)
  • skip_verify - skip certificate verification (default is true)

example

tcp://host1:9000?username=user&password=qwerty&database=clicks&read_timeout=10&write_timeout=20&alt_hosts=host2:9000,host3:9000

Supported data types

  • UInt8, UInt16, UInt32, UInt64, Int8, Int16, Int32, Int64
  • Float32, Float64
  • String
  • FixedString(N)
  • Date
  • DateTime
  • Enum
  • UUID
  • Nullable(T)

Install

go get -u github.com/kshvakov/clickhouse

Example

    package main

    import (
        "database/sql"
        "fmt"
        "log"
        "time"

        "github.com/kshvakov/clickhouse"
    )

    func main() {
        // Opening the handle does not connect yet; Ping does.
        connect, err := sql.Open("clickhouse", "tcp://127.0.0.1:9000?debug=true")
        if err != nil {
            log.Fatal(err)
        }
        if err := connect.Ping(); err != nil {
            // Server-side errors carry a code, a message and a stack trace.
            if exception, ok := err.(*clickhouse.Exception); ok {
                fmt.Printf("[%d] %s \n%s\n", exception.Code, exception.Message, exception.StackTrace)
            } else {
                fmt.Println(err)
            }
            return
        }

        _, err = connect.Exec(`
            CREATE TABLE IF NOT EXISTS example (
                country_code FixedString(2),
                os_id        UInt8,
                browser_id   UInt8,
                categories   Array(Int16),
                action_day   Date,
                action_time  DateTime
            ) engine=Memory
        `)
        if err != nil {
            log.Fatal(err)
        }

        // Rows are collected inside a transaction and sent on Commit.
        tx, err := connect.Begin()
        if err != nil {
            log.Fatal(err)
        }
        stmt, err := tx.Prepare("INSERT INTO example (country_code, os_id, browser_id, categories, action_day, action_time) VALUES (?, ?, ?, ?, ?, ?)")
        if err != nil {
            log.Fatal(err)
        }
        defer stmt.Close()

        for i := 0; i < 100; i++ {
            if _, err := stmt.Exec(
                "RU",
                10+i,
                100+i,
                clickhouse.Array([]int16{1, 2, 3}),
                time.Now(),
                time.Now(),
            ); err != nil {
                log.Fatal(err)
            }
        }
        if err := tx.Commit(); err != nil {
            log.Fatal(err)
        }

        rows, err := connect.Query("SELECT country_code, os_id, browser_id, categories, action_day, action_time FROM example")
        if err != nil {
            log.Fatal(err)
        }
        defer rows.Close()

        for rows.Next() {
            var (
                country               string
                os, browser           uint8
                categories            []int16
                actionDay, actionTime time.Time
            )
            if err := rows.Scan(&country, &os, &browser, &categories, &actionDay, &actionTime); err != nil {
                log.Fatal(err)
            }
            log.Printf("country: %s, os: %d, browser: %d, categories: %v, action_day: %s, action_time: %s", country, os, browser, categories, actionDay, actionTime)
        }
        // Report any error that ended the iteration early.
        if err := rows.Err(); err != nil {
            log.Fatal(err)
        }

        if _, err := connect.Exec("DROP TABLE example"); err != nil {
            log.Fatal(err)
        }
    }

Summary

Follow-up posts will cover the Go client library for ClickHouse and lessons learned from using ClickHouse in the flow project.

Reposted from: http://vdntl.baihongyu.com/
