How can I import zip files and process the Excel files inside them using PySpark connected to MongoDB (pymongo)?

+2 votes
How can I import zip files and process the Excel files inside them using PySpark connected to MongoDB through pymongo?

I have installed Spark, MongoDB, and Python to process files (Excel, CSV, or JSON).

I used this code to connect PySpark to MongoDB:

from pyspark.sql import SparkSession

my_spark = SparkSession \
    .builder \
    .appName("myApp") \
    .config("spark.mongodb.input.uri", "mongodb://127.0.0.1/test.coll") \
    .config("spark.mongodb.output.uri", "mongodb://127.0.0.1/test.coll") \
    .getOrCreate()

After that I tried to import the zip files, but I don't want to have to open every file individually to process it.
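
For illustration, here is a minimal sketch of the zip-handling step only, not a confirmed solution. It assumes each archive fits in memory and that the Excel files can be recognised by their `.xlsx` or `.xls` extension. The helper name `extract_excel_members` is hypothetical and does not come from Spark or pymongo.

```python
import io
import zipfile

# Assumption: Excel files are identified purely by file extension.
EXCEL_SUFFIXES = (".xlsx", ".xls")


def extract_excel_members(zip_bytes):
    """Return (member_name, member_bytes) pairs for Excel files in one zip.

    Hypothetical helper: takes the raw bytes of a single zip archive and
    reads the matching members in memory, without unpacking to disk.
    """
    results = []
    with zipfile.ZipFile(io.BytesIO(zip_bytes)) as archive:
        for info in archive.infolist():
            # Skip directory entries; keep only files with an Excel suffix.
            if info.is_dir():
                continue
            if info.filename.lower().endswith(EXCEL_SUFFIXES):
                results.append((info.filename, archive.read(info)))
    return results
```

With the session above, `my_spark.sparkContext.binaryFiles(...)` pointed at the folder of zip files (the path is whatever applies in your setup) yields `(path, bytes)` pairs, so something like `.flatMap(lambda kv: extract_excel_members(kv[1]))` could apply this helper to every archive. Each member's bytes could then be parsed, for example with `pandas.read_excel(io.BytesIO(data))`, which requires an Excel engine such as openpyxl to be installed.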
Aug 6, 2019 in Apache Spark by Ahmed
832 views

No answer to this question. Be the first to respond.
