Hey,
sortByKey() is a transformation.
- It returns a new RDD sorted by key.
- Sorting can be (1) ascending (the default), (2) descending, or (3) based on a custom ordering.
It works with any key type K that has an implicit Ordering[K] in scope. Ordering objects already exist for all of the standard primitive types. You can also define your own orderings for custom types, or override the default ordering. The implicit ordering in the closest scope is the one that gets used.
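To make the implicit Ordering rules a bit more concrete, here is a minimal sketch in plain Scala (no Spark needed), since `sorted` on a collection resolves its Ordering the same way. The names `OrderingSketch`, `Version` and `sortedByLength` are made up for illustration only.

```scala
object OrderingSketch {
  // Hypothetical custom key type, for illustration only.
  final case class Version(major: Int, minor: Int)

  object Version {
    // An implicit Ordering in the companion object is found automatically for Version.
    implicit val ordering: Ordering[Version] =
      Ordering.by((v: Version) => (v.major, v.minor))
  }

  def sortedByLength(names: Seq[String]): Seq[String] = {
    // A local implicit is in a closer scope than the built-in Ordering[String], so it is used here.
    implicit val byLength: Ordering[String] = Ordering.by((s: String) => s.length)
    names.sorted
  }

  def main(args: Array[String]): Unit = {
    val names = Seq("India", "USA", "Brazil")
    println(names.sorted)          // built-in ordering: List(Brazil, India, USA)
    println(sortedByLength(names)) // local override:    List(USA, India, Brazil)
    println(Seq(Version(1, 10), Version(1, 2), Version(0, 9)).sorted)
    // custom type: List(Version(0,9), Version(1,2), Version(1,10))
  }
}
```

The same lookup applies to sortByKey(): whichever Ordering[K] is implicitly visible at the call site decides the sort.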
When called on a dataset of (K, V) pairs where K implements Ordered, it returns a dataset of (K, V) pairs sorted by keys in ascending or descending order, as specified in the boolean ascending argument.
Here is an example:
val rdd1 = sc.parallelize(Seq(("India",91),("USA",1),("Brazil",55),("Greece",30),("China",86),("Sweden",46),("Turkey",90),("Nepal",977)))
val rdd2 = rdd1.sortByKey()
rdd2.collect()
Output:
Array[(String, Int)] = Array((Brazil,55), (China,86), (Greece,30), (India,91), (Nepal,977), (Sweden,46), (Turkey,90), (USA,1))
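The example above only covers the default ascending sort. For the descending and custom cases, something like the following sketch should work. It assumes Spark is on the classpath and runs with a local master; the object name `SortByKeySketch`, the helper methods and the `LengthThenName` ordering are all invented for illustration. In spark-shell you would reuse the existing `sc` instead of creating one.

```scala
import org.apache.spark.{SparkConf, SparkContext}
import org.apache.spark.rdd.RDD

object SortByKeySketch {
  // Hypothetical custom ordering: shorter keys first, ties broken alphabetically.
  object LengthThenName extends Ordering[String] {
    def compare(a: String, b: String): Int = {
      val byLength = Integer.compare(a.length, b.length)
      if (byLength != 0) byLength else a.compareTo(b)
    }
  }

  // (2) Descending: pass ascending = false.
  def sortedDescending(rdd: RDD[(String, Int)]): Array[(String, Int)] =
    rdd.sortByKey(ascending = false).collect()

  // (3) Custom: a local implicit Ordering[String] overrides the default one.
  def sortedByLengthThenName(rdd: RDD[(String, Int)]): Array[(String, Int)] = {
    implicit val keyOrdering: Ordering[String] = LengthThenName
    rdd.sortByKey().collect()
  }

  def main(args: Array[String]): Unit = {
    // App name and local master are assumptions for a standalone run.
    val conf = new SparkConf().setAppName("sortByKey-sketch").setMaster("local[*]")
    val sc = new SparkContext(conf)
    try {
      val rdd1 = sc.parallelize(Seq(("India", 91), ("USA", 1), ("Brazil", 55), ("Greece", 30),
        ("China", 86), ("Sweden", 46), ("Turkey", 90), ("Nepal", 977)))

      println(sortedDescending(rdd1).map(_._1).mkString(", "))
      // USA, Turkey, Sweden, Nepal, India, Greece, China, Brazil

      println(sortedByLengthThenName(rdd1).map(_._1).mkString(", "))
      // USA, China, India, Nepal, Brazil, Greece, Sweden, Turkey
    } finally {
      sc.stop()
    }
  }
}
```

The tie-break on the name is there so that keys of equal length come out in a predictable order.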