Hi @Raunak. You can change the replication level for the files Spark uploads to HDFS when running on YARN (application jars, files passed with --files, and so on) by setting the spark.yarn.submit.file.replication property. Pass it with the --conf flag when you submit your job:

./bin/spark-submit <all your existing options> --conf spark.yarn.submit.file.replication=5
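Alternatively, you can set the same property in code on the SparkConf before creating the SparkContext, so the setting travels with the application. A minimal sketch (the app name here is just a placeholder):

```scala
import org.apache.spark.{SparkConf, SparkContext}

// spark.yarn.submit.file.replication controls the HDFS replication
// factor for files Spark uploads to HDFS on YARN (jars, --files, etc.).
val conf = new SparkConf()
  .setAppName("my-app") // placeholder name
  .set("spark.yarn.submit.file.replication", "5")

val sc = new SparkContext(conf)
```

Note that this property only affects files Spark stages into HDFS at submit time; it does not change the replication factor of data your job writes, which is governed by the HDFS configuration (dfs.replication).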