Trending questions in Apache Spark

0 votes
1 answer

13)Refer the input and identify the output if the below code is run

Option c)  Run time error - A READ MORE

Nov 25, 2020 in Apache Spark by Gitika
• 65,770 points
1,072 views
0 votes
1 answer

What class is declared in the blow code?

Option D: String class READ MORE

Nov 26, 2020 in Apache Spark by Gitika
• 65,770 points
994 views
0 votes
1 answer

From the following graph code ,which code snippet will return the no.of flight routes?

Hey, @Ritu, I am getting error in your ...READ MORE

Nov 25, 2020 in Apache Spark by Gitika
• 65,770 points
983 views
0 votes
1 answer

How to read Avro Partition Data?

Hi@akhtar, When we try to retrieve the data ...READ MORE

Nov 4, 2020 in Apache Spark by MD
• 95,460 points
1,868 views
0 votes
1 answer

What is the output of the following code?

After executing your code, there is an ...READ MORE

Nov 25, 2020 in Apache Spark by Gitika
• 65,770 points
953 views
0 votes
1 answer

What is the output of the following code?

rror: expected class or object definition sc.parallelize(Array(1L,("SFO")),(2L,("ORD")),(3L,("DFW")))) ^ one error ...READ MORE

Nov 26, 2020 in Apache Spark by Gitika
• 65,770 points
885 views
0 votes
0 answers

What is the output of the following code? [closed]

What is the output of the following ...READ MORE

Nov 25, 2020 in Apache Spark by Edureka
• 200 points

closed Nov 26, 2020 by MD 775 views
0 votes
1 answer

2)What will be printed when the below code is executed ?

Hi, @Ritu, List(5,100,10) is printed. The take method returns the first n elements in ...READ MORE

Nov 23, 2020 in Apache Spark by Gitika
• 65,770 points
774 views
0 votes
1 answer

Spark - how the solve the below question?

option d, Runtime error READ MORE

Nov 23, 2020 in Apache Spark by Gitika
• 65,770 points
762 views
+1 vote
1 answer

How to assign a column in Spark Dataframe (PySpark) as a Primary Key?

spark do not have any concept of ...READ MORE

Jan 12, 2020 in Apache Spark by Sirish
• 160 points
13,897 views
0 votes
1 answer

Facing issue while reading tsv file in pyspark

Hi@khyati, You are getting this type of output ...READ MORE

Sep 28, 2020 in Apache Spark by MD
• 95,460 points
2,469 views
0 votes
1 answer

How to implement my clustering algorithm in pyspark (without using the ready library for example k-means)?

Hi@dani, As you said you are a beginner ...READ MORE

Oct 14, 2020 in Apache Spark by MD
• 95,460 points
1,666 views
+1 vote
1 answer

How to convert pyspark Dataframe to pandas Dataframe?

Hi@akhtar, To convert pyspark dataframe into pandas dataframe, ...READ MORE

May 7, 2020 in Apache Spark by MD
• 95,460 points
8,184 views
0 votes
1 answer

File not found exception while processing the spark job in yarn cluster mode with multinode hadoop cluster

Hi@Ganendra, I am not sure what's the issue, ...READ MORE

Jul 30, 2020 in Apache Spark by MD
• 95,460 points
4,597 views
0 votes
1 answer
0 votes
1 answer

Ranger KMS - Curl command

Hi@Shllpa, In general, we get the 401 status code ...READ MORE

Sep 29, 2020 in Apache Spark by MD
• 95,460 points
1,326 views
0 votes
1 answer

Py4JJavaError: An error occurred while calling o310.csv. : java.net.ConnectException: Call From master/192.168.56.101 to master:9000

Hi@akhtar, I think your HDFS cluster is not ...READ MORE

May 7, 2020 in Apache Spark by MD
• 95,460 points
7,435 views
0 votes
2 answers

Error : split value is not a member of org.apache.spark.sql.Row

var d=rdd2col.rdd.map(x=>x.split(",")) or val names=rd ...READ MORE

Aug 5, 2020 in Apache Spark by Ramkumar Ramasamy.
11,932 views
0 votes
1 answer

I am not able to run the apache spark program in mac oc

Hi@Srinath, It seems you didn't set Hadoop for ...READ MORE

Sep 21, 2020 in Apache Spark by MD
• 95,460 points
1,366 views
0 votes
1 answer

Unable to submit the spark job in deployment mode - multinode cluster(using ubuntu machines) with yarn master

Hi@Ganendra, As you said you launched a multinode cluster, ...READ MORE

Jul 29, 2020 in Apache Spark by MD
• 95,460 points
2,279 views
0 votes
1 answer

how can I get all executors' pending jobs and stages of particular sparksession?

Hi@Neha, You can find all the job status ...READ MORE

Aug 19, 2020 in Apache Spark by MD
• 95,460 points
1,174 views
0 votes
0 answers

Unable to get the Job status and Group ID java- spark standalone program with databricks

package com.dataguise.test; import java.io.IOException; import java.util.concurrent.CountDownLatch; import java.util.concurrent.TimeUnit; import org.apache.spark.SparkContext; import org.apache.spark.SparkJobInfo; import ...READ MORE

Jul 23, 2020 in Apache Spark by kamboj
• 140 points

recategorized Jul 28, 2020 by Gitika 2,280 views
0 votes
1 answer

How to create a not null column in case class in spark

Hi@Deepak, In your test class you passed empid ...READ MORE

May 14, 2020 in Apache Spark by MD
• 95,460 points
5,061 views
0 votes
1 answer

env: ‘python’: No such file or directory in pyspark.

Hi@akhtar, This error occurs because your python version ...READ MORE

Apr 7, 2020 in Apache Spark by MD
• 95,460 points
6,407 views
0 votes
1 answer

Spark: java.sql.SQLException: No suitable driver

The missing driver is the JDBC one ...READ MORE

Jul 24, 2019 in Apache Spark by John
17,427 views
0 votes
1 answer

org.apache.spark.sql.AnalysisException: cannot resolve "`id`" given input columns

I have used a header-less csv file ...READ MORE

Jul 14, 2019 in Apache Spark by Puneet
17,689 views
0 votes
1 answer

how to run spark job from EC2 to EMR?

Hi, You can follow the below-given steps to ...READ MORE

Jun 25, 2020 in Apache Spark by MD
• 95,460 points
2,469 views
0 votes
2 answers

java.lang.StringIndexOutOfBoundsException: String index out of range: 1

When using the Java substring() method, a ...READ MORE

Mar 13, 2020 in Apache Spark by evanbung
• 180 points
7,644 views
0 votes
1 answer

Can the executor core be greater than the total number of spark tasks?

Hi@Rishi, Yes, it is possible. If executor no. ...READ MORE

Jun 17, 2020 in Apache Spark by MD
• 95,460 points
2,204 views
0 votes
1 answer

Can number of Spark task be greater than the executor core?

Hi@Rishi, Yes, number of spark tasks can be ...READ MORE

Jun 17, 2020 in Apache Spark by MD
• 95,460 points
2,031 views
0 votes
1 answer

How to unzip a folder to individual files in HDFS?

Hi, @Amey, You can go through this regarding ...READ MORE

May 26, 2020 in Apache Spark by Gitika
• 65,770 points
2,926 views
0 votes
1 answer

error: Caused by: org.apache.spark.SparkException: Failed to execute user defined function.

Hi@akhtar, I think you got this error due to version mismatch ...READ MORE

Apr 22, 2020 in Apache Spark by MD
• 95,460 points
4,287 views
+2 votes
1 answer

Type mismatch error in scala

Hello, Your problem is here: val df_merge_final = df_merge .withColumn("version_key", ...READ MORE

Dec 13, 2019 in Apache Spark by Alexandru
• 510 points
12,405 views
0 votes
1 answer

What's the difference between 'filter' and 'where' in Spark SQL?

Both 'filter' and 'where' in Spark SQL ...READ MORE

May 23, 2018 in Apache Spark by nitinrawat895
• 11,380 points
34,397 views
0 votes
3 answers

Sorting rows in descending order in Spark SQL

df.orderBy($"col".desc) - this works as well READ MORE

Jul 5, 2020 in Apache Spark by Sai
• 160 points
16,675 views
0 votes
1 answer

"java.lang.ClassNotFoundException" in Spark on Amazon EMR

Hi@akhtar, In /etc/spark/conf/spark-defaults.conf, append the path of your custom ...READ MORE

Apr 29, 2020 in Apache Spark by MD
• 95,460 points
3,702 views
0 votes
1 answer

How to parse a textFile to csv in pyspark?

Hi, Use this below given code, it will ...READ MORE

Apr 13, 2020 in Apache Spark by MD
• 95,460 points
4,119 views
0 votes
1 answer

after installing hadoop 3.0.1 I can's access spark shell or hive shell.

Hi@abdul, Hadoop 3.0.1 has lots of new features. ...READ MORE

Jun 16, 2020 in Apache Spark by MD
• 95,460 points
1,187 views
+1 vote
1 answer

Optimal column count for ORC and Parquet

Hi@Amey, It depends on your use case. Both ...READ MORE

May 8, 2020 in Apache Spark by MD
• 95,460 points
2,475 views
+1 vote
1 answer

How to read .mp4 (video file) stored at HDFS using pyspark?

Hi@Amey, You can enable WebHDFS to do this ...READ MORE

May 29, 2020 in Apache Spark by MD
• 95,460 points
1,857 views
0 votes
1 answer

Not enough space to cache rdd_80_1 in memory!

Hi@akhtar, Currently, you are running with the default ...READ MORE

Jul 22, 2020 in Apache Spark by MD
• 95,460 points
2,750 views
0 votes
1 answer

Difference between map() and mapPartitions() function in Spark.

Hi@ akhtar, Both map() and mapPartitions() are the ...READ MORE

Jan 29, 2020 in Apache Spark by MD
• 95,460 points
6,509 views
0 votes
1 answer

Unable to run select query with selected columns on a temp view registered in spark application

from pyspark.sql.types import FloatType fname = [1.0,2.4,3.6,4.2,45.4] df=spark.createDataFrame(fname, ...READ MORE

Mar 29, 2020 in Apache Spark by GAURAV
• 140 points
3,691 views
0 votes
1 answer

ERROR thriftserver.SparkExecuteStatementOperation: Error executing query, currentState RUNNING, org.apache.spark.sql.catalyst.errors.package$TreeNodeException

Hi@akhtar, You may resolve this exception, by increasing the ...READ MORE

Apr 29, 2020 in Apache Spark by MD
• 95,460 points
1,999 views
0 votes
1 answer
0 votes
1 answer

if i want to see my public key after running cat <path> command in gitbash but saying no such file or directory.

Hey, @KK, You can fix this issue may be ...READ MORE

May 26, 2020 in Apache Spark by Gitika
• 65,770 points
859 views
0 votes
1 answer

env : R : No such file or directory

Hi@akhtar, I also got this error. I am able to ...READ MORE

Jul 22, 2020 in Apache Spark by MD
• 95,460 points
1,790 views
0 votes
1 answer

Where can I get best spark tutorials for beginners?

Hi@akhtar There are lots of online courses available ...READ MORE

May 14, 2020 in Apache Spark by MD
• 95,460 points
744 views