How can I remove duplicate rows in R?

Apr 27, 2018 in Data Analytics by zombie
• 3,790 points • 4,062 views

1 answer to this question.

The distinct() function in the dplyr package removes duplicate rows, judged either on all columns or only on the columns you specify.

Data:

dat <- data.frame(m = rep(c(1, 2), 4), n = rep(LETTERS[1:4], 2))

Remove rows where specified columns have been duplicated:

library(dplyr)
dat %>% distinct(m, .keep_all = TRUE)

  m n
1 1 A
2 2 B

Remove rows which are complete duplicates of other rows:

dat %>% distinct()

  m n
1 1 A
2 2 B
3 1 C
4 2 D
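
If you would rather not load dplyr for this case, base R's unique() should give the same result for complete duplicates. A minimal sketch, rebuilding the sample data from above (the name dedup is only illustrative):

```r
# Same sample data as in the dplyr example
dat <- data.frame(m = rep(c(1, 2), 4), n = rep(LETTERS[1:4], 2))

# unique() on a data frame drops every row that repeats an earlier row in full
dedup <- unique(dat)

nrow(dedup)  # 4
```

Unlike distinct(m, .keep_all = TRUE), unique() always compares whole rows, so it cannot deduplicate on a single column by itself.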

General answer for duplicate row removal (base R):

m <- c(rep("A", 3), rep("B", 3), rep("C",2))
n <- c(1,1,2,4,1,1,2,2)
df <- data.frame(m, n)

# TRUE marks a row that repeats an earlier row
duplicated(df)
[1] FALSE  TRUE FALSE FALSE FALSE  TRUE FALSE  TRUE

# Show only the duplicated rows
df[duplicated(df), ]
  m n
2 A 1
6 B 1
8 C 2

# Keep only the first occurrence of each row
df[!duplicated(df), ]
  m n
1 A 1
3 A 2
4 B 4
5 B 1
7 C 2
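
The same duplicated() approach can also be narrowed to one column or told to keep the last occurrence instead of the first. A hedged sketch using the df defined above; the names first_per_m and last_kept are made up for illustration:

```r
m <- c(rep("A", 3), rep("B", 3), rep("C", 2))
n <- c(1, 1, 2, 4, 1, 1, 2, 2)
df <- data.frame(m, n)

# Judge duplicates on column m only: keeps the first row for each value of m
first_per_m <- df[!duplicated(df$m), ]

# Judge duplicates on whole rows, but scan from the end:
# keeps the last occurrence of each repeated row instead of the first
last_kept <- df[!duplicated(df, fromLast = TRUE), ]

rownames(first_per_m)  # "1" "4" "7"
rownames(last_kept)    # "2" "3" "4" "6" "8"
```

Passing several columns works the same way, for example duplicated(df[, c("m", "n")]).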

answered Apr 27, 2018 by shams
• 3,670 points
