kafka源码编译
Ⅰ 如何保证kafka 的消息机制 ack-fail 源码跟踪
Kafka is a distributed, partitioned, replicated commit log service. It provides the functionality of a messaging system, but with a unique design.(Kafka布式、区(partitioned)、基于备份(replicated)commit-log存储服务.提供类似于messaging system特性,设计实现完全同)kafka种高吞吐量布式发布订阅消息系统特性:
(1)、通O(1)磁盘数据结构提供消息持久化种结构于即使数TB消息存储能够保持间稳定性能
(2)、高吞吐量:即使非普通硬件kafka支持每秒数十万消息
(3)、支持通kafka服务器消费机集群区消息
(4)、支持Hadoop并行数据加载
、用Kafka面自带脚本进行编译
载Kafka源码面自带gradlew脚本我利用编译Kafka源码:
1 # wget
2 # tar -zxf kafka-0.8.1.1-src.tgz
3 # cd kafka-0.8.1.1-src
4 # ./gradlew releaseTarGz
运行面命令进行编译现异信息:
01 :core:signArchives FAILED
02
03 FAILURE: Build failed with an exception.
04
05 * What went wrong:
06 Execution failed for task ':core:signArchives'.
07 > Cannot perform signing task ':core:signArchives' because it
08 has no configured signatory
09
10 * Try:
11 Run with --stacktrace option to get the stack trace. Run with
12 --info or --debug option to get more log output.
13
14 BUILD FAILED
bug()用面命令进行编译
1 ./gradlew releaseTarGzAll -x signArchives
候编译功(编译程现)编译程我指定应Scala版本进行编译:
1 ./gradlew -PscalaVersion=2.10.3 releaseTarGz -x signArchives
编译完core/build/distributions/面kafka_2.10-0.8.1.1.tgz文件网载直接用
二、利用sbt进行编译
我同用sbt编译Kafka步骤:
01 # git clone
02 # cd kafka
03 # git checkout -b 0.8 remotes/origin/0.8
04 # ./sbt update
05 [info] [SUCCESSFUL ] org.eclipse.jdt#core;3.1.1!core.jar (2243ms)
06 [info] downloading ...
07 [info] [SUCCESSFUL ] ant#ant;1.6.5!ant.jar (1150ms)
08 [info] Done updating.
09 [info] Resolving org.apache.hadoop#hadoop-core;0.20.2 ...
10 [info] Done updating.
11 [info] Resolving com.yammer.metrics#metrics-annotation;2.2.0 ...
12 [info] Done updating.
13 [info] Resolving com.yammer.metrics#metrics-annotation;2.2.0 ...
14 [info] Done updating.
15 [success] Total time: 168 s, completed Jun 18, 2014 6:51:38 PM
16
17 # ./sbt package
18 [info] Set current project to Kafka (in build file:/export1/spark/kafka/)
19 Getting Scala 2.8.0 ...
20 :: retrieving :: org.scala-sbt#boot-scala
21 confs: [default]
22 3 artifacts copied, 0 already retrieved (14544kB/27ms)
23 [success] Total time: 1 s, completed Jun 18, 2014 6:52:37 PM
于Kafka 0.8及版本需要运行命令:
01 # ./sbt assembly-package-dependency
02 [info] Loading project definition from /export1/spark/kafka/project
03 [warn] Multiple resolvers having different access mechanism configured with
04 same name 'sbt-plugin-releases'. To avoid conflict, Remove plicate project
05 resolvers (`resolvers`) or rename publishing resolver (`publishTo`).
06 [info] Set current project to Kafka (in build file:/export1/spark/kafka/)
07 [warn] Credentials file /home/wyp/.m2/.credentials does not exist
08 [info] Including slf4j-api-1.7.2.jar
09 [info] Including metrics-annotation-2.2.0.jar
10 [info] Including scala-compiler.jar
11 [info] Including scala-library.jar
12 [info] Including slf4j-simple-1.6.4.jar
13 [info] Including metrics-core-2.2.0.jar
14 [info] Including snappy-java-1.0.4.1.jar
15 [info] Including zookeeper-3.3.4.jar
16 [info] Including log4j-1.2.15.jar
17 [info] Including zkclient-0.3.jar
18 [info] Including jopt-simple-3.2.jar
19 [warn] Merging 'META-INF/NOTICE' with strategy 'rename'
20 [warn] Merging 'org/xerial/snappy/native/README' with strategy 'rename'
21 [warn] Merging 'META-INF/maven/org.xerial.snappy/snappy-java/LICENSE'
22 with strategy 'rename'
23 [warn] Merging 'LICENSE.txt' with strategy 'rename'
24 [warn] Merging 'META-INF/LICENSE' with strategy 'rename'
25 [warn] Merging 'META-INF/MANIFEST.MF' with strategy 'discard'
26 [warn] Strategy 'discard' was applied to a file
27 [warn] Strategy 'rename' was applied to 5 files
28 [success] Total time: 3 s, completed Jun 18, 2014 6:53:41 PM
我sbt面指定scala版本:
01 <!--
02 User: 往记忆
03 Date: 14-6-18
04 Time: 20:20
05 bolg:
06 本文址:/archives/1044
07 往记忆博客专注于hadoop、hive、spark、shark、flume技术博客量干货
08 往记忆博客微信公共帐号:iteblog_hadoop
09 -->
10 sbt "++2.10.3 update"
11 sbt "++2.10.3 package"
12 sbt "++2.10.3 assembly-package-dependency"
Ⅱ apache atlas独立部署(hadoop、hive、kafka、hbase、solr、zookeeper)
在CentOS 7虚拟机(IP: 192.168.198.131)上部署Apache Atlas,独立运行时需要以下步骤:
Apache Atlas 独立部署(集成Hadoop、Hive、Kafka、HBase、Solr、Zookeeper)
**前提环境**:Java 1.8、Hadoop-2.7.4、JDBC驱动、Zookeeper(用于Atlas的HBase和Solr)
一、Hadoop 安装
- 设置主机名为 master
- 关闭防火墙
- 设置免密码登录
- 解压Hadoop-2.7.4
- 安装JDK
- 查看Hadoop版本
- 配置Hadoop环境
- 格式化HDFS(确保路径存在)
- 设置环境变量
- 生成SSH密钥并配置免密码登录
- 启动Hadoop服务
- 访问Hadoop集群
二、Hive 安装
三、Kafka 伪分布式安装
- 安装并启动Kafka
- 测试Kafka(使用kafka-console-procer.sh与kafka-console-consumer.sh)
- 配置多个Kafka server属性文件
四、HBase 安装与配置
- 解压HBase
- 配置环境变量
- 修改配置文件
- 启动HBase
- 访问HBase界面
- 解决配置问题(如JDK版本兼容、ZooKeeper集成)
五、Solr 集群安装
- 解压Solr
- 启动并测试Solr
- 配置ZooKeeper与SOLR_PORT
- 创建Solr collection
六、Apache Atlas 独立部署
- 编译Apache Atlas源码,选择独立部署版本
- 不使用内置的HBase和Solr
- 编译完成后,使用集成的Solr到Apache Atlas
- 修改配置文件以指向正确的存储位置
七、Apache Atlas 独立部署问题解决
- 确保HBase配置文件位置正确
- 解决启动时的JanusGraph和HBase异常
- 确保Solr集群配置正确
部署完成后,Apache Atlas将独立运行,与Hadoop、Hive、Kafka、HBase、Solr和Zookeeper集成,提供数据湖和元数据管理功能。