Manipulate character strings using gsub() and perform multivariate data cleaning efficiently in R

0 votes

How can I manipulate character strings using gsub() and perform multivariate data cleaning efficiently in R? Specifically, I need to:

  • Convert all values given in millions to billions (1 M = 0.001 B).
  • Remove the unit symbols B and M so that only numeric values remain.
Nov 13, 2018 in Data Analytics by Ali
• 11,360 points
842 views

1 answer to this question.

0 votes

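The gsubfn package works well for this task. gsubfn() replaces each unit letter with a multiplier, and the resulting expression is then evaluated:

library(gsubfn)

# Sample data
x <- c("1.2 B", "2.5 B", "808 M")

# Replace B with "* 1" and M with "* 1e-3", then evaluate each expression
as.vector(sapply(gsubfn("[A-Z]", list(B = "* 1", M = "* 1e-3"), x),
                 function(s) eval(parse(text = s))))
#[1] 1.200 2.500 0.808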
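
Since the question asks about gsub() specifically, the same conversion can also be done in base R without evaluating strings. Here is a minimal sketch, assuming every value ends in either B or M; the helper name to_billions and the data frame df with its columns revenue and profit are made up for illustration:

```r
# Hypothetical helper: convert strings such as "1.2 B" or "808 M" to billions.
to_billions <- function(x) {
  # Strip the unit letters and any whitespace, keeping only the numeric part
  value <- as.numeric(gsub("[BM[:space:]]", "", x))
  # Scale values marked with M by 0.001 (1 M = 0.001 B); leave B values as is
  multiplier <- ifelse(grepl("M", x, fixed = TRUE), 1e-3, 1)
  value * multiplier
}

x <- c("1.2 B", "2.5 B", "808 M")
to_billions(x)
#[1] 1.200 2.500 0.808

# Applying the helper to several columns at once (illustrative data frame)
df <- data.frame(revenue = c("1.2 B", "808 M"),
                 profit  = c("300 M", "0.4 B"),
                 stringsAsFactors = FALSE)
df[] <- lapply(df, to_billions)
```

Because the helper is vectorized, lapply() can run it over every column of a data frame in one step, which is probably what is meant by cleaning several variables at once.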
answered Nov 13, 2018 by Maverick
• 10,840 points
