Storm vs. Yahoo! S4: A Comparison

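Hadoop keeps growing in popularity, but it was designed for static data and batch jobs, so stream processing is awkward to implement on it. The most widely used distributed stream-processing frameworks at present are Storm and S4. Each has its own strengths; below is a rough summary of comparisons found online, to help choose the right framework for a given task or requirement.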

 

1. Major open-source big data solutions today

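Solution   Developer             Type        Description
Storm      Twitter               Streaming   Twitter's new streaming big data analytics solution
S4         Yahoo!                Streaming   Distributed stream computing platform from Yahoo!
Hadoop     Apache                Batch       First open-source implementation of the MapReduce paradigm
Spark      UC Berkeley AMPLab    Batch       Recent analytics platform supporting in-memory data sets and resiliency
Disco      Nokia                 Batch       Nokia's distributed MapReduce framework
HPCC       LexisNexis            Batch       HPC cluster for big data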

PS: An overview of current big data processing frameworks: http://pan.baidu.com/share/link?shareid=828559866&uk=2248644272

An introduction to Spark: http://blog.csdn.net/dellme99/article/details/17076045

2. Main differences

Summary.

There are many other differences, but for the sake of brevity I present only a short summary of the pros of each platform that the other one lacks.

S4 pros:

  • Clean programming model.
  • State recovery.
  • Inter-app communication.
  • Classpath isolation.
  • Tools for packaging and deployment.
  • Apache incubation.
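
To give a rough idea of what "state recovery" means in practice, here is a minimal sketch in plain Python. It is not S4's API: the `CountingPE` class, the `store` dict standing in for durable storage, and the `checkpoint_every` parameter are all hypothetical names made up for illustration. The idea shown is simply that a processing element saves its state every so often, and a restarted instance reloads the last saved copy.

```python
import json


class CountingPE:
    """Hypothetical processing element that counts events per key."""

    def __init__(self, store, checkpoint_every=2):
        self.store = store                    # dict standing in for durable storage
        self.checkpoint_every = checkpoint_every
        self.seen = 0                         # events handled by this instance
        # Recover the last saved state if there is one, else start empty.
        self.counts = json.loads(store.get("counts", "{}"))

    def on_event(self, key):
        """Update the in-memory state and checkpoint it periodically."""
        self.counts[key] = self.counts.get(key, 0) + 1
        self.seen += 1
        if self.seen % self.checkpoint_every == 0:
            self.store["counts"] = json.dumps(self.counts)


store = {}
pe = CountingPE(store)
for key in ["a", "b", "a"]:
    pe.on_event(key)

# Simulate a crash: drop the instance and build a new one on the same store.
recovered = CountingPE(store)
```

In this sketch the recovered instance sees the state as of the last checkpoint, so the third event (which arrived after the checkpoint) is not reflected. Checkpointing more often narrows that window at the cost of more writes.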

Storm pros:

  • Pull model.
  • Guaranteed processing.
  • More mature, more traction, larger community.
  • High performance.
  • Thread programming support.
  • Advanced features (transactional topologies, Trident).
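
The first two items can be illustrated with a small sketch in plain Python. This is not Storm code: `ReplayingSource`, `run`, and `make_flaky_upper` are hypothetical names, although `next_tuple`, `ack`, and `fail` loosely mirror the callbacks a Storm spout exposes. The sketch assumes a source that hands out a tuple only when the consumer asks for one (pull) and that keeps each emitted tuple pending until it is acknowledged; a failed tuple goes back on the queue and is replayed, which gives at-least-once processing.

```python
from collections import deque


class ReplayingSource:
    """Hypothetical source: hands out tuples on request and replays failures."""

    def __init__(self, items):
        self._queue = deque(enumerate(items))  # (message id, payload) pairs
        self._pending = {}                     # emitted but not yet acked

    def next_tuple(self):
        """Pull model: the consumer asks for the next tuple when it is ready."""
        if not self._queue:
            return None
        msg_id, payload = self._queue.popleft()
        self._pending[msg_id] = payload
        return msg_id, payload

    def ack(self, msg_id):
        """The tuple was fully processed, so it can be forgotten."""
        self._pending.pop(msg_id)

    def fail(self, msg_id):
        """Processing failed, so the tuple is queued again for replay."""
        self._queue.append((msg_id, self._pending.pop(msg_id)))

    def done(self):
        return not self._queue and not self._pending


def run(source, process):
    """Drain the source, acking successes and failing errors for replay."""
    results = []
    while not source.done():
        msg_id, payload = source.next_tuple()
        try:
            results.append(process(payload))
            source.ack(msg_id)
        except Exception:
            source.fail(msg_id)
    return results


def make_flaky_upper():
    """Example processor that fails the first time it sees each word."""
    seen = set()

    def process(word):
        if word not in seen:
            seen.add(word)
            raise RuntimeError("transient failure")
        return word.upper()

    return process


source = ReplayingSource(["storm", "s4"])
results = run(source, make_flaky_upper())
```

Every word fails once and is still processed successfully after replay. Because a tuple may be handed to `process` more than once, the processing step in a design like this should be safe to repeat.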

3. Storm is impressive, with a well-chosen blend of open-source technologies in its architecture. Writing high-performance real-time distributed applications is much easier on Storm than on S4.

References:

[1]:http://www.ibm.com/developerworks/cn/opensource/os-twitterstorm/

[2]:http://gdfm.me/2013/01/02/distributed-stream-processing-showdown-s4-vs-storm/

[3]:http://www.quora.com/What-would-you-choose-between-Flume-Yahoo-S4-and-Backtype-Twitter-Storm-and-why#

posted on 2013-12-15 21:52 by hequn8128
