Trending questions in Apache Spark

0 votes
1 answer

Spark-shell not working

First, reboot the system. And after reboot, ...READ MORE

Jul 15, 2019 in Apache Spark by Mahesh
5,141 views
0 votes
1 answer

Load .xlsx files to hive tables with spark scala

This should work: def readExcel(file: String): DataFrame = ...READ MORE

Jul 22, 2019 in Apache Spark by Kishan
4,821 views
+1 vote
2 answers

How can I convert Spark Dataframe to Spark RDD?

Assuming your RDD[row] is called rdd, you ...READ MORE

Jul 9, 2018 in Apache Spark by zombie
• 3,790 points
20,992 views
0 votes
1 answer

How do I find Max and Min values in a set in Scala?

Hey, here is an example which will return ...READ MORE

Jul 30, 2019 in Apache Spark by Gitika
• 65,770 points
4,263 views
0 votes
1 answer

How to remove the elements with a key present in any other RDD?

Hey, you can use the subtractByKey() function to ...READ MORE

Jul 22, 2019 in Apache Spark by Gitika
• 65,770 points
4,457 views
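
The answer above is truncated, so what follows is only a minimal plain-Python sketch of the semantics that `subtractByKey` provides on pair RDDs; it does not use Spark. The helper name `subtract_by_key` and the sample data are hypothetical and chosen purely for illustration.

```python
def subtract_by_key(left, right):
    """Keep the (key, value) pairs from `left` whose key is absent from `right`."""
    # Only the keys of `right` matter; its values are ignored.
    right_keys = {key for key, _ in right}
    return [(key, value) for key, value in left if key not in right_keys]


# Hypothetical sample data.
orders = [("a", 1), ("b", 2), ("a", 3), ("c", 4)]
cancelled = [("a", "x"), ("d", "y")]

remaining = subtract_by_key(orders, cancelled)
print(remaining)
```

With real pair RDDs in PySpark, the corresponding call would be `orders_rdd.subtractByKey(cancelled_rdd)`; unlike this sketch, Spark does not guarantee the order of the resulting elements.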
0 votes
1 answer

Spark: How can I create temp views in a user-defined database instead of the default database?

You can try the below code: df.registerTempTable("airports") sqlContext.sql(" create ...READ MORE

Jul 14, 2019 in Apache Spark by Ishan
4,684 views
0 votes
1 answer

How to increase worker timeout in Spark application?

By default, the timeout is set to ...READ MORE

Mar 25, 2019 in Apache Spark by Hari
9,368 views
+1 vote
1 answer

Scala: Convert text file data into ORC format using data frame

Converting text file to Orc: Using Spark, the ...READ MORE

Aug 1, 2019 in Apache Spark by Esha
3,763 views
0 votes
1 answer

Spark Installation problem

After downloading Spark, you need to set ...READ MORE

Jul 5, 2019 in Apache Spark by Rishi
4,962 views
0 votes
1 answer

Copy file from local to hdfs from the spark job in yarn mode

Refer to the below code: import org.apache.hadoop.conf.Configuration import org.apache.hadoop.fs.FileSystem import ...READ MORE

Jul 24, 2019 in Apache Spark by Yogi
4,084 views
+1 vote
1 answer

How to extract record from one RDD using another RDD

Hey, you can use "contains" filter to extract ...READ MORE

Aug 23, 2019 in Apache Spark by Karan
2,570 views
0 votes
1 answer

Scala: 30: error: value partitions is not a member of String

Try this code: val rdd = sc.textFile("file.txt", 5) rdd.partitions.size Output ...READ MORE

Jul 29, 2019 in Apache Spark by Nijit
3,658 views
+1 vote
1 answer

Error: value textfile is not a member of org.apache.spark.SparkContext

Hi, Regarding this error, you just need to change ...READ MORE

Jul 4, 2019 in Apache Spark by Gitika
• 65,770 points
4,605 views
0 votes
1 answer

What is Spark UI and how to monitor a spark job?

Hey, Jobs- to view all the spark jobs Stages- ...READ MORE

Aug 6, 2019 in Apache Spark by Gitika
• 65,770 points
3,197 views
0 votes
1 answer

How to save RDD in Apache Spark?

Hey, there are a few methods provided by the ...READ MORE

Jul 23, 2019 in Apache Spark by Gitika
• 65,770 points
3,756 views
0 votes
1 answer

Primary keys in Apache Spark

I found the following solution to be ...READ MORE

Sep 11, 2019 in Apache Spark by ravikiran
• 4,620 points
1,509 views
0 votes
1 answer

How to add package com.databricks.spark.avro in spark?

Start spark shell using below line of ...READ MORE

Jul 23, 2019 in Apache Spark by Ritu
3,665 views
0 votes
1 answer

load/save text file in spark

The reason you are able to load ...READ MORE

Jul 22, 2019 in Apache Spark by Giri
3,649 views
0 votes
1 answer

What is RDD Lineage in Spark?

Hey, Lineage is an RDD process to reconstruct ...READ MORE

Jul 4, 2019 in Apache Spark by Gitika
• 65,770 points
4,427 views
+1 vote
1 answer

Need to load 40 GB data to elasticsearch using spark

Did you find any documents or example ...READ MORE

Nov 5, 2019 in Apache Spark by Begum
1,421 views
0 votes
1 answer

Scala: error: value unary_+ is not a member of (Int, Int)

All prefix operators' symbols are predefined: +, -, ...READ MORE

Jul 22, 2019 in Apache Spark by karan
3,590 views
+1 vote
1 answer

By default, how many partitions are created in an RDD in Apache Spark?

Well, it depends on the block of ...READ MORE

Aug 2, 2019 in Apache Spark by Gitika
• 65,770 points
2,968 views
0 votes
1 answer

Scala: Add user input to array

You can try this:  object printarray { ...READ MORE

Jun 19, 2019 in Apache Spark by Dinesha
4,897 views
+1 vote
0 answers

Difference between RDD, DataFrame, and Dataset [closed]

Sep 13, 2019 in Apache Spark by Rajesh pagadala

closed Sep 13, 2019 by Omkar
1,210 views
0 votes
1 answer

Unable to use ml library in pyspark

The error message you have shared with ...READ MORE

Jul 30, 2019 in Apache Spark by Karan
3,081 views
0 votes
1 answer

Error : split value is not a member of org.apache.spark.sql.Row

spark.read.csv is used when loading into a ...READ MORE

Jul 22, 2019 in Apache Spark by Firoz
3,333 views
0 votes
1 answer

How to start spark history server?

Hey, you can use this command to start ...READ MORE

Jul 25, 2019 in Apache Spark by Gitika
• 65,770 points
3,241 views
0 votes
1 answer

How to create paired RDD using subString method in Spark?

Hi, If you have a file with id ...READ MORE

Aug 2, 2019 in Apache Spark by Gitika
• 65,770 points
2,845 views
0 votes
3 answers

I don't understand the reason behind Spark RDD being immutable.

There are a few reasons for keeping RDD ...READ MORE

Apr 18, 2019 in Apache Spark by santlal561987@gmail.com
13,050 views
0 votes
1 answer

How SparkSQL is different from HQL and SQL?

Hi, SparkSQL is a special component on the ...READ MORE

Jul 3, 2019 in Apache Spark by Gitika
• 65,770 points
4,097 views
0 votes
1 answer

What are these in scala : _* & @_*

As is widely used, and has different ...READ MORE

Jul 31, 2019 in Apache Spark by Turic
2,874 views
+1 vote
0 answers

What is the use case of map and flatMap? [closed]

What is the major use case for ...READ MORE

Aug 25, 2019 in Apache Spark by anonymous
• 130 points

closed Aug 26, 2019 by Omkar
1,789 views
0 votes
1 answer

How to declare an empty Scala Map?

Hi, You can either declare an empty Scala ...READ MORE

Jul 29, 2019 in Apache Spark by Gitika
• 65,770 points

edited Jul 29, 2019 by Gitika
2,798 views
0 votes
1 answer

Spark: Read from Hive, store in HDFS

Below is an example of reading data ...READ MORE

Jul 26, 2019 in Apache Spark by Lohit
2,935 views
0 votes
1 answer

How to concatenate Maps in Scala?

Hey, You can concatenate/join two Maps in more than ...READ MORE

Jul 29, 2019 in Apache Spark by Gitika
• 65,770 points

edited Jul 29, 2019 by Gitika
2,760 views
0 votes
1 answer

error: reassignment to val

Hi, This error will only generate when you ...READ MORE

Jul 5, 2019 in Apache Spark by Gitika
• 65,770 points
3,713 views
0 votes
1 answer

Create dataframe for Avro file

Yes, we can work with Avro files ...READ MORE

Jul 22, 2019 in Apache Spark by Rishi
2,830 views
0 votes
1 answer

Passing condition dynamically to Spark application.

You can try this: d.filter(col("value").isin(desiredThings: _*)) and if you ...READ MORE

Feb 19, 2019 in Apache Spark by Omkar
• 69,220 points
9,398 views
0 votes
1 answer

Average function is not commutative and associative?

Hey, I guess the only problem with the ...READ MORE

Jul 23, 2019 in Apache Spark by Gitika
• 65,770 points
2,752 views
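
The answer above is truncated, so this is only an illustrative plain-Python sketch of the issue the question title raises, not the answer's own code. A pairwise average is commutative but not associative, so using it as a reduce function gives results that depend on how the elements are grouped. The names `avg` and `combine` and the sample values are hypothetical.

```python
from functools import reduce


def avg(a, b):
    """Pairwise average: commutative, but not associative."""
    return (a + b) / 2


values = [1.0, 2.0, 6.0]

# Two different groupings of the same three values.
left_grouping = avg(avg(1.0, 2.0), 6.0)   # avg(1.5, 6.0)
right_grouping = avg(1.0, avg(2.0, 6.0))  # avg(1.0, 4.0)


def combine(x, y):
    """Merge two (sum, count) pairs; this operation is associative and commutative."""
    return (x[0] + y[0], x[1] + y[1])


# Reduce over (value, 1) pairs, then divide once at the end.
total, count = reduce(combine, [(v, 1) for v in values])
mean = total / count

print(left_grouping, right_grouping, mean)
```

Carrying a (sum, count) pair through the reduction and dividing only at the end is one common way to compute a mean safely with an operation such as `reduce`, since the pairwise merge gives the same result under any grouping.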
0 votes
1 answer

RDD word count with line numbers

df = spark.createDataFrame([("A", 2000), ("A", 2002), ("A", ...READ MORE

Jul 25, 2019 in Apache Spark by Siri
2,609 views
0 votes
1 answer

Spark + Hive connectivity

The problem is probably with the command. ...READ MORE

Aug 1, 2019 in Apache Spark by Rishni
2,294 views
0 votes
1 answer

Spark comparing two big data files using scala

Try this and see if this does ...READ MORE

Apr 2, 2019 in Apache Spark by Omkar
• 69,220 points
7,503 views
0 votes
1 answer

How to concatenate sets in Scala?

Hey, Yes, there are two ways of doing ...READ MORE

Jul 31, 2019 in Apache Spark by Gitika
• 65,770 points
2,245 views
0 votes
1 answer

error: only classes can have declared but undefined members.

Hi, This happens in Scala whenever you won't ...READ MORE

Jul 24, 2019 in Apache Spark by Gitika
• 65,770 points
2,505 views
0 votes
1 answer

What is Piping in Spark?

Hi, Spark provides a pipe() method on RDDs. ...READ MORE

Jul 3, 2019 in Apache Spark by Gitika
• 65,770 points
3,361 views
–2 votes
1 answer

What is the difference in Java’s “If..Else” and Scala’s “If..Else”? [closed]

Hey, Java’s “If. Else”: In Java, “If. Else” is a statement, ...READ MORE

Jul 31, 2019 in Apache Spark by Gitika
• 65,770 points
2,184 views
0 votes
1 answer

Spark to check if a particular string exists in a file

You can use this: lines = sc.textFile("hdfs://path/to/file/filename.txt"); def isFound(line): if ...READ MORE

Mar 15, 2019 in Apache Spark by Raj
8,043 views
0 votes
1 answer

Scala: save filtered data row by row using saveAsTextFile

Try this code, it worked for me: val ...READ MORE

Aug 2, 2019 in Apache Spark by Karan
1,842 views
0 votes
0 answers

WARN NativeCodeLoader: Unable to load native-hadoop library for your platform... using builtin-java classes where applicable [closed]

Hi All I am running Scala program on ...READ MORE

May 5, 2019 in Apache Spark by Vishal

closed May 6, 2019 by Omkar
5,704 views