How to select all columns with group by

How can I select all columns when using groupBy in Spark?

df.select(*).groupby("id").agg(sum("salary"))

I tried using select but could not make it work.

Feb 19, 2019 in Apache Spark by Ishan
• 16,577 views

1 answer to this question.

Aggregate first, then join the result back to the original DataFrame on id so that all of its columns are kept:

val resultset = df.groupBy("id").sum("salary")
val joinedDS = df.join(resultset, "id")
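
For anyone who wants to try the aggregate-then-join approach end to end, here is a minimal Scala sketch. The sample rows, the name column, the total_salary alias, and the local SparkSession setup are assumptions made for illustration; they are not part of the original question.

```scala
import org.apache.spark.sql.SparkSession
import org.apache.spark.sql.functions.sum

object GroupByKeepColumns {
  def main(args: Array[String]): Unit = {
    // Local session for experimentation only (assumed setup).
    val spark = SparkSession.builder()
      .master("local[*]")
      .appName("groupby-keep-columns")
      .getOrCreate()
    import spark.implicits._

    // Hypothetical sample data: id, name, salary.
    val df = Seq(
      (1, "Asha", 100),
      (1, "Ravi", 200),
      (2, "Meera", 50)
    ).toDF("id", "name", "salary")

    // One row per id, holding the summed salary for that id.
    val resultset = df.groupBy("id").agg(sum("salary").as("total_salary"))

    // Join back on id so every original column sits next to the total.
    val joinedDS = df.join(resultset, "id")

    joinedDS.show()
    spark.stop()
  }
}
```

The joined result should carry the columns id, name, salary and total_salary, with one row per original row rather than one row per id.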

answered Feb 19, 2019 by Omkar
• 69,180 points

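Try selecting every column first. Note that the method is spelled groupBy in Scala, and that the aggregated result still contains only id and the summed salary, so the other columns have to be joined back in afterwards:

df.select(df("*")).groupBy("id").agg(sum("salary"))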
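
If the goal is to keep each original row next to its group total without a join, a window function is one possible alternative to grouping. This is a different technique from the one suggested in this answer; the sketch below assumes hypothetical sample data with a name column and uses a made-up total_salary alias.

```scala
import org.apache.spark.sql.SparkSession
import org.apache.spark.sql.expressions.Window
import org.apache.spark.sql.functions.sum

object WindowKeepColumns {
  def main(args: Array[String]): Unit = {
    // Local session for experimentation only (assumed setup).
    val spark = SparkSession.builder()
      .master("local[*]")
      .appName("window-keep-columns")
      .getOrCreate()
    import spark.implicits._

    // Hypothetical sample data: id, name, salary.
    val df = Seq(
      (1, "Asha", 100),
      (1, "Ravi", 200),
      (2, "Meera", 50)
    ).toDF("id", "name", "salary")

    // Window covering all rows that share the same id.
    val byId = Window.partitionBy("id")

    // Add the per-id salary total to every row; no rows are collapsed.
    val withTotal = df.withColumn("total_salary", sum("salary").over(byId))

    withTotal.show()
    spark.stop()
  }
}
```

Whether this or the join is preferable depends on the data and the Spark version, so it is worth comparing the two query plans with explain() before settling on one.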

answered Sep 17, 2021 by Parimi Pavan

edited Mar 5, 2025
