Research review - 'Clues in Tweets: Twitter-Guided Discovery and Analysis of SMS Spam'

Review of the research paper 'Clues in Tweets: Twitter-Guided Discovery and Analysis of SMS Spam' (2022)

As is often the case with spam research, collecting a dataset is very challenging. This paper proposes a novel way to collect a spam dataset and keep it up to date: leverage social media chatter to track SMS spam trends.

The authors observed that some Twitter users voluntarily post the spam messages they receive and try to warn others. They built an automated pipeline called SpamHunter that collects such tweets and extracts the spam message content. Their dataset is published at https://sites.google.com/corp/view/twitterspamsms. Using this dataset, the authors categorize the types of spam; this section was a good read. The authors also examine whether the spammy URLs in their dataset could be used to stop spam in real time, which seems possible.
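To make the idea concrete, here is a minimal sketch of how tweets that report SMS spam could be filtered and mined for URLs. This is my own illustration and not the authors' SpamHunter code: the keyword patterns, the function names (`is_spam_report`, `extract_urls`, `domain_counts`), and the sample tweets are all assumptions, and it works on plain tweet text only.

```python
import re
from collections import Counter
from urllib.parse import urlparse

# Hypothetical keyword heuristics (my assumption, not the paper's filter):
# a tweet counts as a spam report if it mentions both the medium and a label.
MEDIUM = re.compile(r"\b(sms|texts?)\b", re.IGNORECASE)
LABEL = re.compile(r"\b(spam|scam|phishing|smishing)\b", re.IGNORECASE)
URL = re.compile(r"https?://[^\s\"'<>]+")


def is_spam_report(tweet: str) -> bool:
    """Return True if the tweet looks like a user reporting SMS spam."""
    return bool(MEDIUM.search(tweet)) and bool(LABEL.search(tweet))


def extract_urls(tweet: str) -> list[str]:
    """Pull URLs out of the tweet text, trimming trailing punctuation."""
    return [u.rstrip(".,;:!?)") for u in URL.findall(tweet)]


def domain_counts(tweets: list[str]) -> Counter:
    """Count how often each host appears across tweets that report spam."""
    counts: Counter = Counter()
    for tweet in tweets:
        if not is_spam_report(tweet):
            continue
        for url in extract_urls(tweet):
            host = urlparse(url).hostname
            if host:
                counts[host.lower()] += 1
    return counts


if __name__ == "__main__":
    # Made-up sample tweets for illustration only.
    sample = [
        "Got this scam SMS today: 'Your parcel is held, pay at http://parcel-fee.example/pay.' Beware!",
        "Lovely weather, see https://weather.example",
        "Another phishing text message pointing to http://parcel-fee.example/x",
    ]
    print(domain_counts(sample).most_common())
```

Hosts that keep showing up across independent reports are the kind of signal that could feed a blocklist, which is roughly why stopping spam in real time from such data seems plausible to me.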

I found their evaluation study of anti-spam services faulty. The evaluation framework and test set used are not really representative of a real-world spam attack, so the inferences drawn in this section are questionable.

But overall, kudos to the authors for finding a novel way to collect SMS spam data.

Arun Narasimhan's Blog
