Why is Spark faster than Hadoop MapReduce?

0 votes
Can anyone explain why certain programs are faster in Spark than in MapReduce?
Apr 30, 2018 in Apache Spark by Data_Nerd
• 2,390 points
1,364 views

1 answer to this question.

0 votes

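Firstly, it's the in-memory computation. When a workload is split across multiple MR jobs, each job loads its input from HDFS, does the processing and stores the result back to HDFS, so every intermediate result costs another round of disk and network I/O. Spark can keep intermediate data in memory across a chain of transformations (and a dataset that is reused can be cached explicitly), so typically only the final result is written to HDFS, once an action triggers the job. This saves a lot of time, especially for iterative and multi-step programs.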
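To make the I/O difference concrete, here is a toy sketch in plain Python. It does not use Spark or Hadoop at all: local JSON files stand in for HDFS, and the function names (`run_disk_chained`, `run_in_memory`) and the three steps are made up for illustration. It only counts how often each style touches storage.

```python
import json
import os
import tempfile


def add_one(x):
    return x + 1


def square(x):
    return x * x


def double(x):
    return x * 2


def _write(path, records):
    with open(path, "w") as f:
        json.dump(records, f)


def _read(path):
    with open(path) as f:
        return json.load(f)


def run_disk_chained(input_path, steps, workdir):
    """MapReduce-style toy: every step reads a file and writes a new file."""
    io_ops = 0
    path = input_path
    for i, step in enumerate(steps):
        current = _read(path)                      # read previous output
        io_ops += 1
        path = os.path.join(workdir, f"step{i}.json")
        _write(path, [step(x) for x in current])   # write this step's output
        io_ops += 1
    return _read(path), io_ops


def run_in_memory(input_path, steps, workdir):
    """Spark-style toy: read once, keep intermediates in memory, write once."""
    current = _read(input_path)
    io_ops = 1
    for step in steps:
        current = [step(x) for x in current]       # stays in memory
    _write(os.path.join(workdir, "output.json"), current)
    io_ops += 1
    return current, io_ops


steps = [add_one, square, double]
with tempfile.TemporaryDirectory() as workdir:
    input_path = os.path.join(workdir, "input.json")
    _write(input_path, [1, 2, 3, 4, 5])            # stands in for data already in HDFS
    disk_result, disk_ops = run_disk_chained(input_path, steps, workdir)
    mem_result, mem_ops = run_in_memory(input_path, steps, workdir)

print(disk_result, disk_ops)
print(mem_result, mem_ops)
```

Both runs produce the same result, but the disk-chained version does two storage operations per step while the in-memory version does two in total. On a real cluster each of those operations also involves HDFS replication and network transfer, which this sketch does not model.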

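Secondly, Spark uses lazy evaluation: transformations are only recorded in a DAG (Directed Acyclic Graph), and nothing runs until an action is called. Because Spark sees the whole chain of consecutive transformations before executing it, it can pipeline them into a single stage instead of materializing each intermediate result, and it can plan the job so that unnecessary data shuffling is avoided.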
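The idea of "record first, run on action" can be sketched in a few lines of plain Python. This is a toy model, not Spark's API: the `LazyDataset` class and its `map`, `filter` and `collect` methods are hypothetical names chosen to resemble the RDD style, and the sketch only covers per-record (narrow) transformations, not shuffles.

```python
class LazyDataset:
    """Toy lazy dataset: transformations are recorded, not executed."""

    def __init__(self, source, plan=()):
        self._source = source
        self._plan = plan  # tuple of ("map" | "filter", function)

    def map(self, fn):
        # Transformation: only extends the recorded plan.
        return LazyDataset(self._source, self._plan + (("map", fn),))

    def filter(self, pred):
        # Transformation: only extends the recorded plan.
        return LazyDataset(self._source, self._plan + (("filter", pred),))

    def collect(self):
        # Action: run the whole plan in a single pass over the source,
        # pushing each record through every recorded step in order.
        out = []
        for record in self._source:
            keep = True
            for kind, fn in self._plan:
                if kind == "map":
                    record = fn(record)
                elif not fn(record):
                    keep = False
                    break
            if keep:
                out.append(record)
        return out


calls = []


def traced_double(x):
    calls.append(x)  # lets us observe when the function really runs
    return x * 2


ds = LazyDataset(range(1, 7)).map(traced_double).filter(lambda x: x % 3 == 0)
calls_before_action = len(calls)  # still 0: nothing has executed yet

result = ds.collect()             # the action triggers execution
print(calls_before_action, result, len(calls))
```

Building the chain runs nothing. Only `collect()` executes it, and it does so in one pass without storing the output of `map` as a separate intermediate list. Spark's scheduler does considerably more (stages, shuffle boundaries, and in Spark SQL a query optimizer), so treat this only as an illustration of the principle.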

Lastly, Spark ships with its own SQL, Machine Learning, Graph and Streaming components. With Hadoop you have to install other frameworks separately for these, and moving data between those frameworks is a nasty job.

Hope it helps.

answered Apr 30, 2018 by shams
• 3,670 points
