I'm doing a small project on sentiment analysis using Twitter data. I have a sample CSV file containing the data, but before doing the sentiment analysis I have to clean it up. There is one part where I am stuck. Here's what one value looks like:
tweets['source'][2] ## Source is an attribute in csv file containing values
Out[51]: u'<a href="http://twitter.com/download/android" rel="nofollow">Twitter for Android</a>'
I want to clean the source column. I don't want the values to be shown with the web links and the HTML tags; I only want the text, e.g. Twitter for Android.
Here's the code for cleaning the source:
tweets['source_new'] = ''
for i in range(len(tweets['source'])):
    m = re.search('(?)(.*)', tweets['source'][i])
    try:
        tweets['source_new'][i]=m.group(0)
    except AttributeError:
        tweets['source_new'][i]=tweets['source'][i]
tweets['source_new'] = tweets['source_new'].str.replace('', ' ', case=False)
But when I executed the code, I got this error:
Traceback (most recent call last):
  File "<ipython-input-50-f92a7f05ad1d>", line 2, in <module>
    m = re.search('(?)(.*)', tweets['source'][i])
  File "C:\Users\aneeq\Anaconda2\lib\re.py", line 146, in search
    return _compile(pattern, flags).search(string)
  File "C:\Users\aneeq\Anaconda2\lib\re.py", line 251, in _compile
    raise error, v # invalid expression
error: unexpected end of pattern
The error message is 'error: unexpected end of pattern'. Can anyone help me with this? I can't find what is wrong with the code I am working on.
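For context, the traceback points at the pattern itself rather than at the data: in Python's `re` syntax, `(?` opens an extension group such as `(?:...)` or `(?<=...)`, so `(?)` is rejected when the pattern is compiled. Below is a minimal sketch of one possible way to get the link text. It assumes the intent is to capture whatever sits between the opening `<a ...>` tag and `</a>`; the names `ANCHOR_TEXT` and `clean_source` are made up for illustration, and the regex is only meant for simple, single-anchor values like the one shown above.

```python
import re

# Assumption: the goal is the text between the opening <a ...> tag and </a>.
# ANCHOR_TEXT and clean_source are hypothetical names, not part of the original code.
ANCHOR_TEXT = re.compile(r'<a\b[^>]*>(.*?)</a>', re.IGNORECASE)


def clean_source(value):
    """Return the anchor's link text, or the value unchanged if no anchor is found."""
    m = ANCHOR_TEXT.search(value)
    return m.group(1) if m else value


# The pattern from the question fails at compile time, before any tweet is read.
try:
    re.compile('(?)(.*)')
    original_compiles = True
except re.error:
    original_compiles = False

sample = u'<a href="http://twitter.com/download/android" rel="nofollow">Twitter for Android</a>'
cleaned = clean_source(sample)
```

With a pandas DataFrame, the same helper could presumably be applied to the whole column in one step with `tweets['source_new'] = tweets['source'].apply(clean_source)`, which also avoids the element-by-element loop and the chained assignment `tweets['source_new'][i] = ...`.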