Trending questions in Apache Spark

0 votes
0 answers

Do we have any platform where we can submit spark application.

looking for a platform where we can ...READ MORE

May 12, 2020 in Apache Spark by anonymous
• 120 points
854 views
0 votes
1 answer

How can I remove headers from dataframe?

You can use filter to do this. ...READ MORE

Feb 15, 2019 in Apache Spark by Aryan
20,222 views
0 votes
1 answer

What is pageRank in graphX??

Hi@akhtar, The PageRank algorithm outputs a probability distribution ...READ MORE

Jul 22, 2020 in Apache Spark by MD
• 95,460 points
1,167 views
0 votes
1 answer

Caused by: java.lang.NumberFormatException: Empty String

Hi@akhtar, As we know text files are in ...READ MORE

Jan 31, 2020 in Apache Spark by MD
• 95,460 points
4,826 views
0 votes
2 answers

Difference between createOrReplaceTempView and registerTempTable

I am pretty sure createOrReplaceTempView just replaced ...READ MORE

Sep 18, 2020 in Apache Spark by Nathan Mott
13,643 views
0 votes
1 answer

Why do we use sc.parallelize?

Spark revolves around the concept of a ...READ MORE

Jul 11, 2019 in Apache Spark by Suman
13,473 views
0 votes
1 answer

How to create multiple producers in apache kafka?

Hi@akhtar, To create multiple producer you have to ...READ MORE

Feb 6, 2020 in Apache Spark by MD
• 95,460 points
4,157 views
–1 vote
0 answers

How to parse an S3 XML file to find tags using apache spark

How can one parse an S3 XML ...READ MORE

Mar 18, 2020 in Apache Spark by anonymous
• 110 points
2,017 views
0 votes
1 answer

What is the difference between spark streaming and spark structured streaming?

Hi@akhtar Generally, Spark streaming  is used for real time ...READ MORE

Feb 4, 2020 in Apache Spark by MD
• 95,460 points
3,778 views
0 votes
1 answer

What is the use of App class in Scala?

Hi, Scala provides a helper class, called App, that ...READ MORE

Jul 31, 2019 in Apache Spark by Gitika
• 65,770 points
11,375 views
0 votes
1 answer

Cannot load file to spark: "org.apache.spark.sql.AnalysisException: Path does not exist"

Since the file is in HDFS so ...READ MORE

Jul 31, 2019 in Apache Spark by Tina
11,334 views
0 votes
0 answers

One Hot Encoding in Apache Spark

The following code that I wrote for ...READ MORE

Feb 11, 2020 in Apache Spark by Manish
• 120 points
2,682 views
0 votes
1 answer

What is Action in Spark?

Hi, Actions are RDD’s operation, that value returns ...READ MORE

Jul 3, 2019 in Apache Spark by Gitika
• 65,770 points
11,843 views
+1 vote
0 answers

How to create a list of RDDs(or RDD of RDDs, if possible) from a single JavaRDD<List<Integers>> in Java?

Hi, I have the input RDD as a ...READ MORE

Jan 11, 2020 in Apache Spark by itsroops
• 130 points
2,940 views
0 votes
1 answer

what is Paired RDD and how to create paired RDD in Spark?

Hi, Paired RDD is a distributed collection of ...READ MORE

Aug 2, 2019 in Apache Spark by Gitika
• 65,770 points
9,521 views
0 votes
1 answer

Does spark streaming provides checkpoint?

Hi@akhtar, Yes, Spark streaming uses checkpoint. Checkpoint is ...READ MORE

Feb 4, 2020 in Apache Spark by MD
• 95,460 points
1,453 views
0 votes
1 answer

What are Dstreams?

Hi@akhtar, Dstreams are the basic abstraction that is ...READ MORE

Feb 4, 2020 in Apache Spark by MD
• 95,460 points
1,044 views
0 votes
1 answer

Difference between cogroup and full outer join in spark

Please go through the below explanation : Full ...READ MORE

Jul 14, 2019 in Apache Spark by Kiran
9,882 views
0 votes
1 answer

Is Spark Sql provides indexing to improve processing speed?

Hi@akhtar, There is no concept of indexing in ...READ MORE

Feb 4, 2020 in Apache Spark by MD
• 95,460 points
910 views
0 votes
1 answer

Pyspark dataframe with random values

Hey @Esha, you can use this code. ...READ MORE

Aug 1, 2019 in Apache Spark by Zed
8,931 views
0 votes
0 answers

not able to get output in spark streaming??

Hi everyone, I tried to count individual words ...READ MORE

Feb 4, 2020 in Apache Spark by akhtar
• 38,260 points
848 views
0 votes
0 answers

Error: Package: R-core-devel-3.6.0-1el7.x86_64 (epel) Requires: pcre2-devel

Hi, I am getting this error when try ...READ MORE

Jan 31, 2020 in Apache Spark by Hasid
• 370 points
1,009 views
0 votes
1 answer

Spark, Scala: Load custom delimited file

You can load a DAT file into ...READ MORE

Jul 16, 2019 in Apache Spark by Shri
9,570 views
0 votes
1 answer

Cannot create directory /hive/xzxz/_temporary/0. Name node is in safe mode.

Hi@akhtar, Here you are trying to save csv ...READ MORE

Feb 3, 2020 in Apache Spark by MD
• 95,460 points
793 views
+1 vote
0 answers

how to access hive view using spark2

We do not have access to hive ...READ MORE

Dec 29, 2019 in Apache Spark by anonymous
• 130 points
2,038 views
0 votes
2 answers

How to execute a function in apache-scala?

Function Definition : def test():Unit{ var a=10 var b=20 var c=a+b } calling ...READ MORE

Aug 5, 2020 in Apache Spark by Ramkumar Ramasamy
1,028 views
+2 votes
1 answer

Spark code takes too much time to run on cluster

Hi @asif, Share with us please the application ...READ MORE

Jan 22, 2020 in Apache Spark by Alexandru
• 510 points
1,207 views
0 votes
1 answer

Join in RDD using keys

Suppose you have two dataset results( id, ...READ MORE

Aug 2, 2019 in Apache Spark by Trisha
8,313 views
0 votes
1 answer

How fault tolerance is achieved in Apache Spark?

Hey, In Apache Spark, the data storage model is ...READ MORE

Jul 22, 2019 in Apache Spark by Gitika
• 65,770 points
8,639 views
0 votes
1 answer

Spark: Error while instantiating "org.apache.spark.sql.hive.HiveSessionState"

Seems like you have not started the ...READ MORE

Jul 25, 2019 in Apache Spark by Rohit
8,180 views
0 votes
1 answer

Spark Error: java.lang.IllegalStateException: Cannot call methods on a stopped SparkContext.

There seems to be a problem with ...READ MORE

May 24, 2019 in Apache Spark by Jishan
10,768 views
+1 vote
2 answers

How do I get number of columns in each line from a delimited file??

Instead of spliting on '\n'. You should ...READ MORE

Aug 7, 2019 in Apache Spark by ashish
5,521 views
0 votes
1 answer

How to work with Matrix Multiplication in Apache Spark?

Hey, You can follow this below solution for ...READ MORE

Jul 31, 2019 in Apache Spark by Gitika
• 65,770 points
7,786 views
0 votes
1 answer

What does the command df.registerTempTable() do?

df.registerTempTable(“airports”) This command is used to register ...READ MORE

Jul 14, 2019 in Apache Spark by James
8,367 views
0 votes
1 answer

How to select all columns with group by?

You can use the following to print ...READ MORE

Feb 19, 2019 in Apache Spark by Omkar
• 69,220 points
14,084 views
+1 vote
1 answer

How to convert a json file structure with values in single quotes to quoteless ?

You can do this by turning off ...READ MORE

Oct 4, 2019 in Apache Spark by Jisha
4,197 views
+1 vote
1 answer

Cannot resolve Error In Spark when filter records with two where condition

Try df.where($"cola".isNotNull && $"cola" =!= "" && !$"colb".isin(2,3)) your ...READ MORE

Dec 13, 2019 in Apache Spark by Alexandru
• 510 points

edited Dec 13, 2019 by Alexandru 2,677 views
0 votes
1 answer

Spark error: Caused by: java.lang.RuntimeException: The root scratch dir: /tmp/hive on HDFS should be writable.

Give  read-write permissions to  C:\tmp\hive folder Cd to winutils bin folder ...READ MORE

Jul 11, 2019 in Apache Spark by Rajiv
7,710 views
+1 vote
2 answers

Spark: Can we add column to dataframe?

Yes we can add columns to the ...READ MORE

Oct 24, 2019 in Apache Spark by Siva
• 160 points
4,646 views
+1 vote
1 answer

Spark: java.io.FileNotFoundException

Hello, From the error I get that the ...READ MORE

Dec 13, 2019 in Apache Spark by Alexandru
• 510 points
4,092 views
+1 vote
1 answer

Primary keys in Apache Spark

import sqlContext.implicits._ import org.apache.spark.sql.Row import org.apache.spark.sql.types.{StructType, StructField, LongType} val df ...READ MORE

Aug 9, 2019 in Apache Spark by ravikiran
• 4,620 points
6,100 views
0 votes
1 answer

How Foreach Operation works in Apache Spark?

Hi, foreach() operation is an action. It does not ...READ MORE

Aug 2, 2019 in Apache Spark by Gitika
• 65,770 points
6,379 views
0 votes
1 answer

Removing the header of a text file in SparkRDD

1) First we loaded the data to ...READ MORE

Jul 31, 2019 in Apache Spark by Namitha
6,383 views
0 votes
1 answer

How SortBykey() operation works in Spark?

Hey, sortByKey() is a transformation. It returns an RDD sorted ...READ MORE

Aug 2, 2019 in Apache Spark by Gitika
• 65,770 points
6,098 views
0 votes
1 answer

"main" java.lang.NoClassDefFoundError: org/apache/hadoop/fs/FSDataInputStream

1. We will check whether master and ...READ MORE

Jul 29, 2019 in Apache Spark by Yogi
6,207 views
0 votes
3 answers

How to transpose Spark DataFrame?

Please check the below mentioned links for ...READ MORE

Jan 1, 2019 in Apache Spark by anonymous
19,910 views
0 votes
1 answer

How to call the Debug Mode in PySpark?

As far as I understand your intentions ...READ MORE

Jul 26, 2019 in Apache Spark by ravikiran
• 4,620 points
6,168 views
0 votes
1 answer

How do I connect to a HIVE Meta store through a program in SparkSQL?

In spark 2.0.+ it should look something ...READ MORE

Sep 5, 2019 in Apache Spark by ravikiran
• 4,620 points
4,388 views
0 votes
1 answer

Date formats : how to cast string to date?

Try this, it should work: > from pyspark.sql.functions ...READ MORE

Jul 29, 2019 in Apache Spark by Niall
5,985 views