Spark code takes too much time to run on cluster

+2 votes
I have written a Spark application. My code works fine for smaller size population (dataset) but it takes too much time for larger population (dataset).
Jan 4, 2020 in Apache Spark by asif
• 140 points
1,901 views
Hi @asif,

Can you please share your spark application code and the approach. Also please mention the size of the dataset when the application starts getting slower.

1 answer to this question.

+1 vote
Hi @asif,

Share with us please the application code and some data sample if possible.
answered Jan 22, 2020 by Alexandru
• 510 points

Related Questions In Apache Spark

0 votes
1 answer

When running Spark on Yarn, do I need to install Spark on all nodes of Yarn Cluster?

No, it is not necessary to install ...READ MORE

answered Jun 14, 2018 in Apache Spark by nitinrawat895
• 11,380 points
7,255 views
0 votes
1 answer

Unable to run select query with selected columns on a temp view registered in spark application

from pyspark.sql.types import FloatType fname = [1.0,2.4,3.6,4.2,45.4] df=spark.createDataFrame(fname, ...READ MORE

answered Mar 29, 2020 in Apache Spark by GAURAV
• 140 points
4,599 views
0 votes
1 answer

How to stop messages from being displayed on spark console?

In your log4j.properties file you need to ...READ MORE

answered Apr 24, 2018 in Apache Spark by kurt_cobain
• 9,350 points
6,322 views
0 votes
1 answer

Setting textinputformat.record.delimiter in spark

I got this working with plain uncompressed ...READ MORE

answered Oct 10, 2018 in Big Data Hadoop by Omkar
• 69,180 points
3,085 views
+1 vote
2 answers
+1 vote
1 answer

Hadoop Mapreduce word count Program

Firstly you need to understand the concept ...READ MORE

answered Mar 16, 2018 in Data Analytics by nitinrawat895
• 11,380 points
13,558 views
+2 votes
11 answers

hadoop fs -put command?

Hi, You can create one directory in HDFS ...READ MORE

answered Mar 16, 2018 in Big Data Hadoop by nitinrawat895
• 11,380 points
116,578 views
+1 vote
1 answer

Spark: java.io.FileNotFoundException

Hello, From the error I get that the ...READ MORE

answered Dec 13, 2019 in Apache Spark by Alexandru
• 510 points
4,884 views
+1 vote
1 answer

Cannot resolve Error In Spark when filter records with two where condition

Try df.where($"cola".isNotNull && $"cola" =!= "" && !$"colb".isin(2,3)) your ...READ MORE

answered Dec 13, 2019 in Apache Spark by Alexandru
• 510 points

edited Dec 13, 2019 by Alexandru 3,337 views
webinar REGISTER FOR FREE WEBINAR X
REGISTER NOW
webinar_success Thank you for registering Join Edureka Meetup community for 100+ Free Webinars each month JOIN MEETUP GROUP