Average function is not commutative and associative

Question

Hi,

I can't figure out what is wrong with my code for computing an average. The average function does not appear to be commutative and associative. Can someone help me change the code so that it works properly?

Here is the code given:

    def sum(x, y):
        return x + y

    total = myrdd.reduce(sum)

    avg = total / myrdd.count()

Gitika · Answer 1 · Jul 23, 2019

Hey,

I think the only problem with the code is that the total can become very large and overflow. The addition passed to reduce is itself commutative and associative, so that part is fine. To keep the running total small, I would rather divide each number by the count first and then sum the results:

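    cnt = myrdd.count()

    def divideByCnt(x):
        # float() avoids integer division on Python 2
        return x / float(cnt)

    myrdd1 = myrdd.map(divideByCnt)

    # sum is the two-argument function defined in the question
    avg = myrdd1.reduce(sum)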
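To see why an average cannot be used directly as the reduce function, here is a small plain-Python sketch (no Spark needed). The names pairwise_avg and merge and the sample values are my own, just for illustration. It shows that averaging two values at a time gives different results depending on grouping, while combining (sum, count) pairs does not.

```python
from functools import reduce

def pairwise_avg(x, y):
    # Averaging two values at a time: commutative, but NOT associative.
    return (x + y) / 2.0

def merge(a, b):
    # a and b are (running_sum, running_count) pairs; adding them
    # component-wise is both commutative and associative.
    return (a[0] + b[0], a[1] + b[1])

data = [1, 2, 3, 6]  # illustrative values, not from the original post

# Grouping changes the result of the pairwise average.
left = pairwise_avg(pairwise_avg(1, 2), 3)   # (1.5 + 3) / 2 = 2.25
right = pairwise_avg(1, pairwise_avg(2, 3))  # (1 + 2.5) / 2 = 1.75

# Reducing (value, 1) pairs gives the same total and count in any grouping.
total, count = reduce(merge, [(x, 1) for x in data])
avg = total / float(count)  # 12 / 4 = 3.0
```

On an RDD the same idea would probably look like myrdd.map(lambda x: (x, 1)).reduce(merge), followed by the final division. If you only need the average, PySpark RDDs also have a built-in mean() method.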


