
Comparison between different system architectures for Big Data


This post compares the system architectures of several Big Data platforms.


Let's compare the architectures of four popular Big Data systems: the Apache Hadoop ecosystem, Google PowerDrill, IBM InfoSphere Streams, and Huawei FusionInsight.


Apache Hadoop is a collection of open-source software utilities that facilitate using a network of many computers to solve problems involving massive amounts of data and computation. It provides a software framework for distributed storage and processing of big data using the MapReduce programming model. Originally designed for computer clusters built from commodity hardware—still the common use—it has also found use on clusters of higher-end hardware. All the modules in Hadoop are designed with a fundamental assumption that hardware failures are common occurrences and should be automatically handled by the framework.


The core of Apache Hadoop consists of a storage part, the Hadoop Distributed File System (HDFS), and a processing part based on the MapReduce programming model. Hadoop splits files into large blocks and distributes them across the nodes in a cluster. It then ships packaged code to those nodes so that they process the data in parallel. This approach takes advantage of data locality: each node works on the data it stores locally. As a result, the dataset can be processed faster and more efficiently than in a more conventional supercomputer architecture, which relies on a parallel file system where computation and data are distributed via high-speed networking.
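The block-splitting and map/shuffle/reduce flow described above can be illustrated with a small single-process sketch. This is not Hadoop's API: the function names (`split_into_blocks`, `map_block`, `shuffle`, `reduce_groups`, `word_count`) and the line-based block size are assumptions made for illustration, and the "nodes" are simulated by ordinary function calls.

```python
from collections import defaultdict


def split_into_blocks(lines, block_size):
    """Split a file (here: a list of lines) into fixed-size blocks.

    HDFS splits by bytes; splitting by lines is a simplification.
    """
    return [lines[i:i + block_size] for i in range(0, len(lines), block_size)]


def map_block(block):
    """Map phase: conceptually runs on the node that stores the block
    (data locality) and emits (word, 1) pairs."""
    return [(word.lower(), 1) for line in block for word in line.split()]


def shuffle(mapped_outputs):
    """Shuffle phase: group the emitted values by key across all mappers."""
    groups = defaultdict(list)
    for pairs in mapped_outputs:
        for key, value in pairs:
            groups[key].append(value)
    return groups


def reduce_groups(groups):
    """Reduce phase: sum the counts for each word."""
    return {key: sum(values) for key, values in groups.items()}


def word_count(lines, block_size=2):
    blocks = split_into_blocks(lines, block_size)
    # In a real cluster each map_block call would run on a different node.
    mapped = [map_block(block) for block in blocks]
    return reduce_groups(shuffle(mapped))


if __name__ == "__main__":
    print(word_count(["big data big", "data systems", "big clusters"]))
```

The point of the sketch is that only the small `(word, 1)` pairs cross block boundaries during the shuffle; the raw lines never leave the block they were read from.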


Figure: Apache Hadoop


Google PowerDrill combines column-oriented storage with in-memory computing. It is reported to reach a query rate of 10,000 data cells per second, which is 10 to 100 times the performance of traditional column stores.


PowerDrill can quickly skip blocks of data that a query does not need, which can be up to 100 times faster than a full scan.

It also optimizes memory usage: compression and encoding techniques can reduce the memory footprint by a factor of up to 16.
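To make the ideas of block skipping and encoding concrete, here is a minimal sketch of a dictionary-encoded column split into chunks. It is a toy model under stated assumptions, not PowerDrill's actual implementation: the `Chunk` class, `encode_column`, `count_equal`, and the chunk size are all hypothetical names and choices made for this example.

```python
from dataclasses import dataclass


@dataclass
class Chunk:
    dictionary: list  # sorted distinct values that occur in this chunk
    codes: list       # one small integer per row, indexing into dictionary


def encode_column(values, chunk_size):
    """Split a column into chunks and dictionary-encode each chunk."""
    chunks = []
    for start in range(0, len(values), chunk_size):
        part = values[start:start + chunk_size]
        dictionary = sorted(set(part))
        index = {value: code for code, value in enumerate(dictionary)}
        chunks.append(Chunk(dictionary, [index[value] for value in part]))
    return chunks


def count_equal(chunks, target):
    """Count rows equal to target; return (matches, chunks_scanned).

    A chunk whose dictionary lacks the target is skipped without
    looking at any of its rows.
    """
    matches = 0
    scanned = 0
    for chunk in chunks:
        if target not in chunk.dictionary:
            continue  # skip the whole chunk
        scanned += 1
        code = chunk.dictionary.index(target)
        matches += sum(1 for c in chunk.codes if c == code)
    return matches, scanned


if __name__ == "__main__":
    column = ["de", "de", "fr", "fr", "us", "us", "us", "de"]
    chunks = encode_column(column, chunk_size=2)
    print(count_equal(chunks, "us"))
```

Storing small integer codes instead of repeated strings is where the memory saving comes from, and the per-chunk dictionary doubles as a cheap filter that lets a query skip chunks entirely.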


Figure: Google PowerDrill


IBM InfoSphere Streams is designed to uncover meaningful patterns from information in motion (data flows) during a window of minutes to hours. The platform provides business value by supporting low-latency insight and better outcomes for time-sensitive applications, such as fraud detection or network management. InfoSphere Streams can also fuse streams, enabling you to derive new insights from multiple streams.

The main design goals of InfoSphere Streams are to:

  • Respond quickly to events and changing business conditions and requirements.

  • Support continuous analysis of data at rates that are orders of magnitude greater than existing systems.

  • Adapt rapidly to changing data forms and types.

  • Manage high availability, heterogeneity, and distribution for the new stream paradigm.

  • Provide security and information confidentiality for shared information.
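The ideas of analyzing data in motion over a time window and fusing several streams can be sketched in a few lines. This is plain Python, not InfoSphere Streams or its SPL language: the event layout `(timestamp, key, amount)`, the function names `fuse` and `detect_bursts`, and the threshold rule are assumptions chosen to mimic a simple fraud-detection check.

```python
import heapq
from collections import deque


def fuse(*streams):
    """Merge several time-ordered streams of (timestamp, key, amount)
    events into a single time-ordered stream."""
    return heapq.merge(*streams, key=lambda event: event[0])


def detect_bursts(events, window_seconds, threshold):
    """Yield (timestamp, key) whenever a key has more than `threshold`
    events inside the sliding time window."""
    window = deque()  # (timestamp, key) pairs currently inside the window
    counts = {}       # number of events per key inside the window
    for timestamp, key, _amount in events:
        window.append((timestamp, key))
        counts[key] = counts.get(key, 0) + 1
        # Evict events that have fallen out of the window.
        while window and window[0][0] <= timestamp - window_seconds:
            _, old_key = window.popleft()
            counts[old_key] -= 1
        if counts[key] > threshold:
            yield timestamp, key


if __name__ == "__main__":
    card_stream = [(0, "A", 10), (20, "A", 5), (100, "B", 7)]
    web_stream = [(10, "A", 3), (30, "A", 9), (200, "A", 1)]
    fused = fuse(card_stream, web_stream)
    print(list(detect_bursts(fused, window_seconds=60, threshold=3)))
```

Each event is handled once as it arrives and only the current window is kept in memory, which is the property that makes low-latency, continuous analysis possible. Neither input stream alone would exceed the threshold here; the alert only appears after the two streams are fused.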


Figure: IBM InfoSphere Streams


Huawei’s Big Data Solution consists of two products: FusionInsight HD and FusionInsight LibrA. FusionInsight HD is a Hadoop enterprise edition containing many components: HDFS, Yarn, HBase, Spark, MapReduce, Flink, Storm, Elk, Solr, Kafka, Loader, Flume, and so on. FusionInsight LibrA is a massively parallel processing database that features elastic scalability, excellent performance, rock-solid reliability, and superior cost-effectiveness.


Figure: Huawei big data and AI architecture

