Topic discussion | What are the advantages of Spark?

Latest reply: Dec 18, 2021 15:39:23

Hello, everyone!

Spark is a general-purpose parallel computing framework similar to Hadoop MapReduce. It is an open-source framework that originated at the UC Berkeley AMPLab and retains the advantages of Hadoop MapReduce. Spark differs from MapReduce in that intermediate job results can be kept in memory instead of being written to HDFS, which makes Spark well suited to algorithms that need many iterations (such as those used in data mining and machine learning).
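
To make the point about iterative jobs concrete, here is a toy sketch in plain Python. It is not Spark's API: `SlowStorage`, `iterate_without_cache`, and `iterate_with_cache` are hypothetical names invented for this illustration, and `SlowStorage.load` merely stands in for an expensive read from HDFS. The sketch only counts how often the input is read when each iteration reloads it versus when it is loaded once and kept in memory.

```python
class SlowStorage:
    """Hypothetical stand-in for disk/HDFS; counts how often it is read."""

    def __init__(self, values):
        self._values = list(values)
        self.reads = 0

    def load(self):
        self.reads += 1
        return list(self._values)


def iterate_without_cache(storage, iterations):
    """Every iteration re-reads its input from storage."""
    estimate = 0.0
    for _ in range(iterations):
        data = storage.load()
        mean = sum(data) / len(data)
        estimate += 0.5 * (mean - estimate)  # move halfway toward the mean
    return estimate


def iterate_with_cache(storage, iterations):
    """The input is read once, kept in memory, and reused by every iteration."""
    data = storage.load()
    estimate = 0.0
    for _ in range(iterations):
        mean = sum(data) / len(data)
        estimate += 0.5 * (mean - estimate)
    return estimate


disk_a = SlowStorage([1.0, 2.0, 3.0, 4.0])
disk_b = SlowStorage([1.0, 2.0, 3.0, 4.0])
result_a = iterate_without_cache(disk_a, 5)
result_b = iterate_with_cache(disk_b, 5)
print(disk_a.reads, disk_b.reads)  # 5 reads versus 1 read, same result
```

Both functions compute the same value; the only difference is the number of storage reads, which is where an in-memory engine saves time on iterative workloads.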

In the 21st century, we are surrounded by data. As data grows, big data emerges.

In big data, Spark is a new-generation lightweight big data processing platform that integrates various big data-related capabilities. Today, let's discuss the advantages of Spark.

olive.zhao
Admin Created Sep 13, 2021 02:01:35

wissal
wissal Created Sep 13, 2021 16:31:10 (0) (0)
Thanks  
gzzz
Admin Created Sep 13, 2021 02:03:39

Advantage: Compared with Hadoop MapReduce, Spark's memory-based computing can be more than 100 times faster, and its disk-based computing more than 10 times faster. Spark implements an efficient DAG (directed acyclic graph) execution engine that processes data flows in memory.

piaopiaoran
Created Sep 13, 2021 02:06:33

Four Advantages of Spark 

Fast: Compared with Hadoop MapReduce, Spark's memory-based computing can be more than 100 times faster, and its disk-based computing more than 10 times faster. Spark implements an efficient DAG (directed acyclic graph) execution engine that processes data flows in memory.
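
The idea behind a DAG-style engine can be sketched in plain Python. This is a toy, not Spark's implementation: `LazyDataset` is a hypothetical class made up for this example, and it handles only a linear chain of steps, which is the simplest case of a DAG. Transformations are only recorded, and nothing runs until an action (`collect`) is called. At that point all recorded steps are applied in one pass, without materializing intermediate results between steps.

```python
class LazyDataset:
    """Records transformations; nothing runs until collect() is called."""

    def __init__(self, source, steps=()):
        self._source = source
        self._steps = tuple(steps)  # ordered (kind, function) pairs

    def map(self, fn):
        return LazyDataset(self._source, self._steps + (("map", fn),))

    def filter(self, predicate):
        return LazyDataset(self._source, self._steps + (("filter", predicate),))

    def collect(self):
        """Action: apply every recorded step in a single pass over the source."""
        results = []
        for item in self._source:
            keep = True
            for kind, fn in self._steps:
                if kind == "map":
                    item = fn(item)
                elif not fn(item):  # a filter step rejected this item
                    keep = False
                    break
            if keep:
                results.append(item)
        return results


plan = LazyDataset(range(1, 6)).map(lambda x: x * x).filter(lambda x: x % 2 == 1)
# No work has been done yet; `plan` only describes the computation.
print(plan.collect())  # [1, 9, 25]
```

Because the whole plan is known before execution starts, an engine built this way can chain steps together and avoid writing intermediate data out between them.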

Easy to use: Spark provides Java, Python, and Scala APIs and more than 80 high-level operators, enabling users to quickly build different applications. Spark also supports interactive Python and Scala shells, so a solution can be tried out against a Spark cluster directly from the shell, without first packaging the application, uploading it to the cluster, and then verifying it. This is important for prototyping. 

Universality: Spark provides a unified solution. It can be used for batch processing, interactive queries (via Spark SQL), real-time stream processing (via Spark Streaming), machine learning (via Spark MLlib), and graph computing (via Spark GraphX).
These different types of processing can all be used seamlessly in the same application. A unified solution is attractive because a company can handle all of these problems on one platform, which reduces the labor cost of development and maintenance and the hardware cost of deployment. As a unified solution, Spark does not sacrifice performance; on the contrary, performance is one of its major advantages. 

Convergence: Spark can be easily integrated with other open-source products. For example, Spark can use Hadoop YARN or Apache Mesos as its resource manager and scheduler, and it can process all data supported by Hadoop, including data in HDFS, HBase, and Cassandra. This is especially important for users who have already deployed Hadoop clusters, because they can use Spark's processing capabilities without any data migration. Spark can also run independently of third-party resource managers and schedulers. It implements Standalone mode as its built-in resource manager and scheduling framework, which further lowers the barrier to entry and makes Spark easy for everyone to deploy and use. In addition, Spark provides a tool for deploying a standalone Spark cluster on EC2. 

---------------- 

Copyright Notice: This is the original article by the CSDN blogger "Explosion of the Small Universe", in accordance with the CC 4.0 BY-SA copyright agreement. Please attach a link to the original source and this notice for reprinting.
Original link: https://blog.csdn.net/yu0_zhang0/article/details/80056951

Rumana
Rumana Created Sep 13, 2021 04:30:19 (0) (0)
 
olive.zhao
olive.zhao Created Sep 13, 2021 09:35:19 (0) (0)
Thanks for your sharing!  
Y_T_Z
Admin Created Sep 13, 2021 02:12:59

Spark executes much faster by caching data in memory across multiple parallel operations.

user_4310111
Created Sep 13, 2021 09:20:39

Spark is a memory-based distributed computing framework. In iterative computing scenarios, intermediate data can be kept in memory during processing, giving performance 10 to 100 times higher than that of MapReduce. Spark can use HDFS as its underlying storage, so users can quickly switch from MapReduce to the Spark computing platform. Spark provides one-stop data analysis capabilities, including micro-batch stream processing, offline batch processing, SQL queries, and data mining. Users can seamlessly combine these capabilities in the same application.

Spark has the following features:
· Uses distributed in-memory computing and a DAG (directed acyclic graph) execution engine to improve data processing performance, which is 10 to 100 times higher than that of MapReduce.
· Provides development interfaces in multiple languages (Scala, Java, and Python) and dozens of highly abstract operators, which makes it easier to build distributed data processing applications.
· Integrates SQL and Streaming into one data processing stack to provide one-stop data processing capabilities.
· Fits well into the Hadoop ecosystem. Spark applications can run in standalone mode, on Mesos, or on YARN, can access multiple data sources such as HDFS, HBase, and Hive, and support smooth migration of MapReduce programs.
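
As a rough illustration of what "highly abstract operators" buy you, here is a word count written in plain Python with two small helper functions. The helpers `flat_map` and `reduce_by_key` are hypothetical stand-ins written for this sketch; they mimic the shape of Spark's `flatMap` and `reduceByKey` operators but run locally on ordinary lists and know nothing about clusters.

```python
from collections import defaultdict
from functools import reduce


def flat_map(fn, items):
    """Apply fn to each item and flatten the resulting sequences into one list."""
    return [out for item in items for out in fn(item)]


def reduce_by_key(fn, pairs):
    """Group (key, value) pairs by key, then combine each group's values with fn."""
    groups = defaultdict(list)
    for key, value in pairs:
        groups[key].append(value)
    return {key: reduce(fn, values) for key, values in groups.items()}


def word_count(lines):
    """Count word occurrences: split lines, emit (word, 1), sum per word."""
    words = flat_map(str.split, lines)
    pairs = [(word.lower(), 1) for word in words]
    return reduce_by_key(lambda a, b: a + b, pairs)


counts = word_count(["Spark is fast", "spark is general"])
print(counts)
```

The whole job is three short steps (split, pair, sum per key). With operators of this kind, the user describes what to compute and leaves the distribution of the work to the framework.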

olive.zhao
olive.zhao Created Sep 13, 2021 09:29:09 (0) (0)
Great!  
zaheernew
zaheernew Created Sep 15, 2021 08:30:59 (0) (0)
useful info  
little_fish
Admin Created Sep 13, 2021 09:32:03

Benefits of Apache Spark:
1. Speed
2. Ease of Use
3. Advanced Analytics
4. Dynamic in Nature
5. Multilingual
6. Apache Spark is powerful
7. Increased access to Big data
8. Demand for Spark Developers
9. Open-source community

olive.zhao
olive.zhao Created Sep 13, 2021 09:38:56 (0) (0)
Great!  
olive.zhao
Admin Created Sep 13, 2021 09:41:24

@MahMush Welcome to discuss this topic!

MahMush
MahMush Created Sep 13, 2021 18:47:04 (0) (0)
Thanks dear for the tag  
MahMush
MahMush Created Sep 13, 2021 18:47:48 (0) (0)
Very good topic of discussion... Of the king of big data tools  
MahMush
MahMush Created Sep 13, 2021 18:48:54 (0) (0)
Apache Spark, as an open-source big data platform, provides a number of advantages over other big data solutions such as Hadoop. Apache Spark is a dynamic system that allows RDDs to be computed in memory. Reusability, fault tolerance, real-time stream processing, and many other features are included.  
CyberTec
Created Sep 13, 2021 10:37:32

Very helpful thanks.

Unicef
MVE Created Sep 13, 2021 11:53:38

Spark focuses on
Technology:
AI
IoT
SaaS
Edge Computing
5G
===========
Industry:
eCommerce
Fintech
Energy
Manufacture
Smart City