Hi @Raunak. You can change the replication level for the files Spark uploads to HDFS when running on YARN (application jars, files passed with --files, and so on) by setting the spark.yarn.submit.file.replication property. Pass it with the --conf flag when you submit your job:

./bin/spark-submit <all your existing options> --conf spark.yarn.submit.file.replication=5
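Alternatively, you can set the same property in code on the SparkConf before creating the SparkContext, so the setting travels with the application. A minimal sketch (the app name here is just a placeholder):

```scala
import org.apache.spark.{SparkConf, SparkContext}

// spark.yarn.submit.file.replication controls the HDFS replication
// factor for files Spark uploads to HDFS on YARN (jars, --files, etc.).
val conf = new SparkConf()
  .setAppName("my-app") // placeholder name
  .set("spark.yarn.submit.file.replication", "5")

val sc = new SparkContext(conf)
```

Note that this property only affects files Spark stages into HDFS at submit time; it does not change the replication factor of data your job writes, which is governed by the HDFS configuration (dfs.replication).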