Spark in-memory processing on a non-temporary table

0 votes
When not using a temporary table, I am assuming the data is written to a file in HDFS. Spark does in-memory processing, so this would differ from Spark's regular approach, since the data will now be read from the file for further processing. Is this assumption correct?
Jul 14, 2019 in Apache Spark by Kunal
1,546 views

1 answer to this question.

0 votes
A temporary table is more like a session-scoped name for a DataFrame than a stored table: Spark does not persist any metadata for it in the metastore, which is why it is called temporary, and we always create a temp table from a DataFrame. Once the temp table's data is saved into a Hive table, it is stored in the Hive warehouse, which lives on HDFS, so that part of your assumption is correct. Whenever you read that table afterwards, the SQLContext object reads it from HDFS instead of from memory. If you are only reading the data, as in "select * from table_hive", there won't be much difference in processing time between reading from memory (for example, from a cached DataFrame) and reading from HDFS. Measured at a fine enough granularity, say microseconds, the difference does show up, and in-memory processing takes the lead.
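To make the distinction concrete, here is a minimal Scala sketch, not a definitive setup. It assumes a local SparkSession without Hive support, so the saved table lands in the local spark-warehouse directory rather than in a Hive warehouse on HDFS. The object name, sample data, and the names events_tmp and events_saved are hypothetical, chosen only for illustration.

```scala
import org.apache.spark.sql.SparkSession

object TempViewVsSavedTable {
  def main(args: Array[String]): Unit = {
    // Assumption: a local session for illustration. On a cluster with Hive
    // support, the warehouse directory would normally live on HDFS.
    val spark = SparkSession.builder()
      .appName("temp-view-vs-saved-table")
      .master("local[*]")
      .getOrCreate()
    import spark.implicits._

    // Hypothetical sample data.
    val df = Seq((1, "a"), (2, "b"), (3, "c")).toDF("id", "label")

    // A temp view is only a session-scoped name for the DataFrame;
    // registering it writes no files.
    df.createOrReplaceTempView("events_tmp")

    // Caching is what actually keeps the rows in memory. cacheTable is lazy,
    // so run an action to materialize the cache.
    spark.catalog.cacheTable("events_tmp")
    spark.sql("SELECT COUNT(*) FROM events_tmp").show()

    // saveAsTable writes the rows out as files under the warehouse directory
    // and registers the table in the catalog.
    df.write.mode("overwrite").saveAsTable("events_saved")

    // Compare the physical plans: the cached view is typically served by an
    // in-memory scan, the saved table by a file scan.
    spark.sql("SELECT * FROM events_tmp").explain()
    spark.sql("SELECT * FROM events_saved").explain()

    spark.stop()
  }
}
```

Comparing the two explain() outputs is usually the quickest way to confirm where a given query reads its data from; if the saved table is queried repeatedly, it can be cached in the same way as the view.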
answered Jul 14, 2019 by Suri
