How to transpose Spark DataFrame

0 votes

I have Spark 2.1. My Spark Dataframe is as follows:

COLUMN                          VALUE
Column-1                       value-1
Column-2                       value-2
Column-3                       value-3
Column-4                       value-4
Column-5                       value-5

I have to transpose these columns and values. It should look like:

Column-1  Column-2  Column-3  Column-4  Column-5
value-1   value-2   value-3   value-4   value-5

Can anyone help me out with this? Preferably in Scala.

May 24, 2018 in Apache Spark by anonymous
19,910 views

3 answers to this question.

0 votes

In this situation, collect all the entries of the COLUMN column, which will help you in creating the schema of the new dataframe, and then collect all the entries of the VALUE column to form the single row.

import org.apache.spark.sql.Row
import org.apache.spark.sql.functions.collect_list
import org.apache.spark.sql.types.{StringType, StructField, StructType}

// Build the new schema from the entries of the "Column" column
val new_schema = StructType(df.select(collect_list("Column")).first().getAs[Seq[String]](0).map(z => StructField(z, StringType)))
// Collect the entries of the "Value" column into a single Row
val new_values = sc.parallelize(Seq(Row.fromSeq(df.select(collect_list("Value")).first().getAs[Seq[String]](0))))
sqlContext.createDataFrame(new_values, new_schema).show(false)

Hope this helps.

answered May 24, 2018 by Shubham
• 13,490 points
0 votes
Here's how to do it in Python:

dt1 = {'one': [<insert data>], 'two': [<insert data>]}

# Each dict entry becomes one row: the key followed by its values
dt = sc.parallelize([(k,) + tuple(v[0:]) for k, v in dt1.items()]).toDF()

dt.show()
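As a side note, the core of the transpose itself (independent of Spark) is just splitting the key/value pairs into a list of column names and a list of values, which is what the accepted answer's two collect_list calls do. A minimal plain-Python sketch of that step:

```python
# Plain-Python sketch of the transpose step: a list of (column, value)
# rows becomes one list of column names and one list of values.
rows = [("Column-1", "value-1"), ("Column-2", "value-2"), ("Column-3", "value-3")]

header = [c for c, _ in rows]  # becomes the schema of the new DataFrame
values = [v for _, v in rows]  # becomes its single row

print(header)  # ['Column-1', 'Column-2', 'Column-3']
print(values)  # ['value-1', 'value-2', 'value-3']
```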
answered Dec 7, 2018 by shri
+1 vote
Please check the links below for Dynamic Transpose and Reverse Transpose:
1. https://dzone.com/articles/how-to-use-reverse-transpose-in-spark
2. https://dzone.com/articles/dynamic-transpose-in-spark
answered Jan 1, 2019 by anonymous
Hey! Thanks for those links. Do you know how to implement a static transpose?
