Apache Spark Questions | Edureka Community

0 votes

1 answer

Internal work of Spark

Spark revolves around the concept of a ...READ MORE

Oct 11, 2018 in Apache Spark by nitinrawat895
• 11,380 points • 1,243 views

0 votes

1 answer

Persistence Levels in Spark

Spark has various persistence levels to store ...READ MORE

Jun 8, 2018 in Apache Spark by kurt_cobain
• 9,350 points • 6,555 views

0 votes

1 answer

Efficient way to read specific columns from a Parquet file in Spark

As Parquet is a column-based storage ...READ MORE

Apr 20, 2018 in Apache Spark by kurt_cobain
• 9,350 points • 8,527 views

0 votes

1 answer

When not to use foreachPartition and mapPartitions?

With mapPartitions() or foreachPartition(), you can only ...READ MORE

Apr 30, 2018 in Apache Spark by Data_Nerd
• 2,390 points • 7,785 views

0 votes

1 answer

In what kind of use cases has Spark outperformed Hadoop in processing?

I can list some but there can ...READ MORE

Sep 19, 2018 in Apache Spark by zombie
• 3,790 points • 1,538 views

0 votes

1 answer

How to stop INFO messages displaying on Spark console?

Just do the following: Edit your conf/log4j.properties file ...READ MORE

Aug 21, 2018 in Apache Spark by nitinrawat895
• 11,380 points • 2,752 views

0 votes

1 answer

What happens to RDD when one of the nodes goes down?

Whenever a node goes down, Spark knows ...READ MORE

Sep 3, 2018 in Apache Spark by nitinrawat895
• 11,380 points • 2,179 views

0 votes

2 answers

Which cluster type should I choose for Spark?

Spark is agnostic to the underlying cluster ...READ MORE

Aug 21, 2018 in Apache Spark by zombie
• 3,790 points • 2,504 views

0 votes

1 answer

Functions of Spark SQL?

Spark SQL is capable of: Loading data from ...READ MORE

Sep 3, 2018 in Apache Spark by nitinrawat895
• 11,380 points • 1,899 views

0 votes

1 answer

Does Spark provide the storage layer too?

No, it doesn’t provide storage layer but ...READ MORE

Sep 3, 2018 in Apache Spark by nitinrawat895
• 11,380 points • 1,836 views

0 votes

1 answer

Ways to create RDD in Apache Spark

There are two popular ways using which ...READ MORE

Jun 19, 2018 in Apache Spark by nitinrawat895
• 11,380 points • 4,488 views

0 votes

1 answer

What do we mean by an RDD in Spark?

The full form of RDD is a ...READ MORE

Jun 18, 2018 in Apache Spark by nitinrawat895
• 11,380 points • 4,471 views

+1 vote

1 answer

Getting null values in Spark DataFrame while reading data from HBase

Can you share the screenshots for the ...READ MORE

Jul 31, 2018 in Apache Spark by kurt_cobain
• 9,350 points • 2,727 views

0 votes

1 answer

What is the difference between Apache Spark SQLContext vs HiveContext?

Spark 2.0+ Spark 2.0 provides native window functions ...READ MORE

May 26, 2018 in Apache Spark by nitinrawat895
• 11,380 points • 5,072 views

+1 vote

3 answers

Which cluster type should I choose for Spark?

According to me, start with a standalone ...READ MORE

Jun 27, 2018 in Apache Spark by nitinrawat895
• 11,380 points • 2,198 views

0 votes

1 answer

How to stop messages from being displayed on Spark console?

In your log4j.properties file you need to ...READ MORE

Apr 24, 2018 in Apache Spark by kurt_cobain
• 9,350 points • 6,084 views

0 votes

1 answer

What makes Spark faster than MapReduce?

Let's first look at mapper side differences Map ...READ MORE

Jul 27, 2018 in Apache Spark by Neha
• 6,300 points • 2,005 views

0 votes

1 answer

How is RDD in Spark different from Distributed Storage Management? Can anyone help me with this?

Some of the key differences between an RDD and ...READ MORE

Jul 26, 2018 in Apache Spark by zombie
• 3,790 points • 2,007 views

0 votes

1 answer

How to convert an RDD object to a DataFrame in Spark

SQLContext has a number of createDataFrame methods ...READ MORE

May 30, 2018 in Apache Spark by nitinrawat895
• 11,380 points • 4,396 views

0 votes

1 answer

Difference between Spark ML & Spark MLlib package

org.apache.spark.mllib is the old Spark API while ...READ MORE

Jul 5, 2018 in Apache Spark by Shubham
• 13,490 points • 2,739 views

0 votes

1 answer

How to get Spark dataset metadata?

There are a bunch of functions that ...READ MORE

Apr 26, 2018 in Apache Spark by kurt_cobain
• 9,350 points • 5,383 views

0 votes

2 answers

Parquet Files Advantages

Parquet is a columnar format supported by ...READ MORE

Jul 4, 2018 in Apache Spark by zombie
• 3,790 points • 2,589 views

0 votes

1 answer

PySpark Config?

Mainly, we use SparkConf because we need ...READ MORE

Jul 26, 2018 in Apache Spark by kurt_cobain
• 9,350 points • 1,215 views

0 votes

1 answer

How can I compare the elements of the RDD using MapReduce?

You have to use the comparison operator ...READ MORE

May 24, 2018 in Apache Spark by Shubham
• 13,490 points • 3,850 views

0 votes

1 answer

Spark streaming with Kafka dependency error

Your error is with the version of ...READ MORE

Jul 5, 2018 in Apache Spark by Shubham
• 13,490 points • 1,708 views

0 votes

1 answer

Getting error while connecting ZooKeeper in Kafka - Spark Streaming integration

I guess you need to provide this kafka.bootstrap.servers ...READ MORE

May 24, 2018 in Apache Spark by Shubham
• 13,490 points • 3,184 views

+1 vote

1 answer

Can anyone explain what is RDD in Spark?

RDD is a fundamental data structure of ...READ MORE

May 24, 2018 in Apache Spark by Shubham
• 13,490 points • 3,139 views

0 votes

2 answers

map() and flatMap()

map(): Return a new distributed dataset formed by ...READ MORE

Jul 4, 2018 in Apache Spark by zombie
• 3,790 points • 1,568 views

0 votes

1 answer

What is Sliding Window?

Sliding Window controls transmission of data packets ...READ MORE

May 28, 2018 in Apache Spark by nitinrawat895
• 11,380 points • 2,846 views

0 votes

1 answer

Minimizing Data Transfers in Spark

Minimizing data transfers and avoiding shuffling helps ...READ MORE

Jun 19, 2018 in Apache Spark by Data_Nerd
• 2,390 points • 1,816 views

0 votes

1 answer

Cache tables in Apache Spark SQL

Caching the tables puts the whole table ...READ MORE

May 4, 2018 in Apache Spark by Data_Nerd
• 2,390 points • 3,783 views

+1 vote

1 answer

Kafka Feature

Here are some of the important features of ...READ MORE

Jun 7, 2018 in Apache Spark by Data_Nerd
• 2,390 points • 2,276 views

0 votes

1 answer

How does an RDD persist data in Spark?

There are two methods to persist the ...READ MORE

Jun 18, 2018 in Apache Spark by nitinrawat895
• 11,380 points • 1,794 views

0 votes

1 answer

What is Spark Piping?

Spark provides a pipe() method on RDDs. ...READ MORE

May 31, 2018 in Apache Spark by kurt_cobain
• 9,350 points • 2,498 views

0 votes

1 answer

Akka in Spark

Spark uses Akka basically for scheduling. All ...READ MORE

May 31, 2018 in Apache Spark by Data_Nerd
• 2,390 points • 2,435 views

0 votes

1 answer

Spark standalone client mode

spark-submit \ --class org.apache.spark.examples.SparkPi \ --deploy-mode client \ --master spark://$SPARK_MASTER_IP:$SPARK_MASTER_PORT ...READ MORE

Jun 20, 2018 in Apache Spark by Ashish
• 2,650 points • 1,516 views

0 votes

1 answer

Which is better in terms of speed, Shark or Spark?

Spark is a framework for distributed data ...READ MORE

Jun 26, 2018 in Apache Spark by nitinrawat895
• 11,380 points • 1,268 views

0 votes

1 answer

How to import the dependencies of Spark MLlib into an Eclipse project?

I would recommend you create & build ...READ MORE

May 31, 2018 in Apache Spark by Shubham
• 13,490 points • 2,346 views

0 votes

1 answer

Spark Driver roles

A Spark driver (aka an application’s driver ...READ MORE

Jun 21, 2018 in Apache Spark by Ashish
• 2,650 points • 1,373 views

0 votes

1 answer

Why is collect in SparkR slow?

It's not the collect() that is slow. ...READ MORE

May 3, 2018 in Apache Spark by Data_Nerd
• 2,390 points • 3,416 views

0 votes

1 answer

Is there any way to uncache RDD?

An RDD can be uncached using unpersist(). So, use ...READ MORE

May 30, 2018 in Apache Spark by nitinrawat895
• 11,380 points • 2,039 views

0 votes

1 answer

How to set keys & access tokens for Twitter Spark streaming?

Either you have to create a Twitter4j.properties ...READ MORE

May 24, 2018 in Apache Spark by Shubham
• 13,490 points • 2,186 views

0 votes

1 answer

Is it mandatory to start Hadoop to run spark application?

No, it is not mandatory, but there ...READ MORE

Jun 14, 2018 in Apache Spark by nitinrawat895
• 11,380 points • 1,189 views

0 votes

1 answer

Can I read a CSV represented as a string into Apache Spark?

You can use the following command. This ...READ MORE

May 3, 2018 in Apache Spark by kurt_cobain
• 9,350 points • 2,977 views

0 votes

1 answer

Convert the given Spark RDD object to a Spark DataFrame.

You can create a DataFrame from the ...READ MORE

Jun 6, 2018 in Apache Spark by Shubham
• 13,490 points • 1,526 views

0 votes

1 answer

What is Shark?

Shark is a tool, developed for people ...READ MORE

Jun 8, 2018 in Apache Spark by kurt_cobain
• 9,350 points • 1,368 views

0 votes

1 answer

How to get the number of elements in partition?

rdd.mapPartitions(iter => Array(iter.size).iterator, true) This command will ...READ MORE

May 8, 2018 in Apache Spark by kurt_cobain
• 9,350 points • 2,670 views

0 votes

1 answer

start-master and start-all?

sbin/start-master.sh : Starts a master instance on ...READ MORE

May 7, 2018 in Apache Spark by kurt_cobain
• 9,350 points • 2,672 views

0 votes

1 answer

How does partitioning work in Spark?

By default a partition is created for ...READ MORE

May 31, 2018 in Apache Spark by nitinrawat895
• 11,380 points • 1,578 views

0 votes

1 answer

Why does sortBy transformation trigger a Spark job?

Actually, sortBy/sortByKey depends on RangePartitioner (JVM). So ...READ MORE

May 8, 2018 in Apache Spark by kurt_cobain
• 9,350 points • 2,559 views
