src.features.preprocessDataFrame

src.features.preprocessDataFrame(df)
Runs the preprocessing pipeline on all tweets to generate the feature “full_text_processed”. The pipeline translates tweets to English, removes stopwords and lemmatizes the remaining words, removes URLs and reserved words, lowercases the text and strips punctuation, and applies VADER sentiment analysis.
- Parameters:
  - df (DataFrame): Transformed DataFrame with original tweets
- Returns:
  - df (DataFrame): DataFrame with processed tweets
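
The text-cleaning steps named in the description can be illustrated with a minimal, standard-library sketch. This is not the actual implementation of preprocessDataFrame: the helper name clean_tweet, the tiny stopword list, and the reserved-word set are all hypothetical, and translation, lemmatization, and VADER scoring are left out because they depend on external libraries.

```python
import re
import string

# Hypothetical, deliberately tiny stopword list for illustration only;
# a real pipeline would presumably take a full list from an NLP library.
STOPWORDS = {"a", "an", "the", "is", "are", "to", "of", "and", "in", "on", "this"}

# Assumed set of Twitter reserved words (retweet and favourite markers).
RESERVED = {"rt", "fav"}

# Matches http(s) links and bare www. links up to the next whitespace.
URL_RE = re.compile(r"https?://\S+|www\.\S+")

# Translation table that deletes every ASCII punctuation character.
PUNCT_TABLE = str.maketrans("", "", string.punctuation)


def clean_tweet(text: str) -> str:
    """Remove URLs, lowercase, strip punctuation, drop reserved words and stopwords."""
    text = URL_RE.sub(" ", text)                 # URLs first, before punctuation is stripped
    text = text.lower().translate(PUNCT_TABLE)   # lowercase, then delete punctuation
    tokens = [t for t in text.split() if t not in STOPWORDS and t not in RESERVED]
    return " ".join(tokens)


print(clean_tweet("RT This is GREAT news! https://t.co/abc123 #happy"))
# prints: great news happy
```

With pandas, a helper like this could be applied column-wise, for example df["full_text_processed"] = df["full_text"].apply(clean_tweet), where the input column name "full_text" is an assumption rather than something this page documents.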