How Machine Learning Can Find Extremists on Social Media

In October 2015, a British ISIS supporter named Sally Jones posted a tweet with the hashtag #RunRobertRun. The tweet included a link to another message containing the supposed home address of Robert O’Neill, the former Navy SEAL who claims to have killed Osama bin Laden. By the time Twitter suspended Jones’ account, the information had spread to other ISIS supporters.

O’Neill did not live at that address, and he remained unharmed. But Jones’ tweet is a troubling example of how extremist groups exploit social media to rally others to their cause and incite violence. “Twitter used to be a joke, like a fun thing for kids,” says Tauhid Zaman, an associate professor of operations management at Yale SOM. “Now it’s a national security issue.”

In a recent study, Zaman’s team investigated how to identify ISIS affiliates on Twitter so their accounts can be quickly shut down. The researchers used machine learning to predict which users were likely to be extremists, based on features such as who the person followed. Suspended users often sign up again under a slightly different name, so the team also developed strategies to detect these new accounts.

While the study focused on Twitter, Zaman says the method is general enough to apply to other online social networks. And he believes the strategies should work for other extremist groups such as white supremacists, who exhibit similar behavior such as creating duplicate accounts. “They play the same game,” Zaman says.

In 2014, Christopher Marks, a lieutenant colonel in the U.S. Army, was a PhD student in Zaman’s lab and wanted to study social networks. At the time, ISIS had a growing presence on Twitter.

Zaman and Marks decided to try to detect ISIS accounts even before the user posted any messages. By the time the person tweeted damaging content, Zaman says, “it might be too late.”

To investigate, they collaborated with Jytte Klausen, a researcher at Brandeis University who studies Western jihadism. Klausen provided a list of about 100 Twitter users known to be affiliated with ISIS. The team then identified those people’s followers, the people they followed, people connected to the followers, and so on, which yielded more than 1.3 million accounts. However, not all of those users were extremists; for example, some were researchers studying ISIS.
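The article doesn't describe the crawl mechanics, but the expansion it sketches is a standard snowball sample: start from the seed accounts and repeatedly collect their neighbors out to a fixed number of hops. A minimal sketch in Python; `get_connections` is a hypothetical stand-in for the API calls that return an account's followers and friends:

```python
# Snowball sampling outward from seed accounts (breadth-first search).
# `get_connections` is a hypothetical stand-in for API calls returning
# an account's followers and friends.
from collections import deque

def snowball(seeds, get_connections, max_hops=2):
    """Collect every account within max_hops of the seed list."""
    seen = set(seeds)
    frontier = deque((s, 0) for s in seeds)
    while frontier:
        account, hops = frontier.popleft()
        if hops == max_hops:
            continue
        for neighbor in get_connections(account):
            if neighbor not in seen:
                seen.add(neighbor)
                frontier.append((neighbor, hops + 1))
    return seen

# Toy usage with a hard-coded graph standing in for the live network:
graph = {"seed": ["a", "b"], "a": ["c"], "b": [], "c": ["d"]}
print(snowball(["seed"], lambda u: graph.get(u, [])))
```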

Zaman’s team then tracked about 647,000 of the accounts for several months; by September 2015, Twitter had suspended roughly 35,000 of them, presumably because those users had posted extremist content. The researchers then trained a machine-learning model to identify features typical of the suspended accounts. For example, following certain users or concealing one’s location was linked to a higher likelihood of extremism.
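The article doesn't publish the model or its exact features, but the core idea, scoring accounts by traits correlated with later suspension, can be sketched with an off-the-shelf classifier. The features and synthetic labels below are illustrative assumptions, not the study's data:

```python
# Sketch: train a classifier to predict which accounts will later be
# suspended. Feature choices and data are illustrative assumptions.
import numpy as np
from sklearn.linear_model import LogisticRegression

rng = np.random.default_rng(0)
n = 1000

# One row per account: [known-extremist accounts followed,
# location concealed, account age in days, follower count]
X = np.column_stack([
    rng.integers(0, 20, n),
    rng.integers(0, 2, n),
    rng.exponential(200, n),
    rng.poisson(150, n),
])

# Synthetic labels (1 = later suspended), correlated with the first two
# features so the example trains end to end.
logit = 0.3 * X[:, 0] + 1.5 * X[:, 1] - 3.0
y = (rng.random(n) < 1 / (1 + np.exp(-logit))).astype(int)

model = LogisticRegression(max_iter=1000).fit(X, y)
risk = model.predict_proba(X)[:, 1]  # estimated P(account is extremist)
print("highest-risk accounts:", np.argsort(risk)[-5:])
```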

Based on those measures, the researchers could automatically identify about 60% of accounts that were later suspended. About 10% of flagged users were false positives. (The software can be tuned to flag accounts more aggressively, Zaman says, in which case it would identify more ISIS affiliates but also produce more false positives.)
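That tradeoff is the ordinary effect of moving a classifier's decision threshold: flag only high-scoring accounts and you catch fewer extremists with fewer mistakes; lower the bar and you catch more of both. A self-contained sketch with made-up risk scores standing in for the model's output:

```python
# Sketch of the threshold tradeoff, using synthetic risk scores.
import numpy as np

rng = np.random.default_rng(1)
y = rng.integers(0, 2, 1000)  # 1 = actually extremist (synthetic)
risk = np.clip(rng.normal(0.3 + 0.4 * y, 0.2), 0, 1)

for threshold in (0.8, 0.5, 0.3):  # stringent -> permissive
    flagged = risk >= threshold
    caught = (flagged & (y == 1)).sum() / (y == 1).sum()
    wrong = (flagged & (y == 0)).sum() / max(flagged.sum(), 1)
    print(f"threshold {threshold}: caught {caught:.0%} of extremists; "
          f"{wrong:.0%} of flags were false positives")
```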

Next, the team wanted to detect new accounts created by suspended users. Often, “when you kill an ISIS account, it comes back,” Zaman says.

Software that simply looked for similarities in names and photos worked fairly well. Suspended users often choose a similar screen name and image for their new account because they want previous followers to find them, he says.
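For the name half of that matching, a standard string-similarity measure goes a long way. A minimal sketch using Python's difflib; the account names and the 0.8 cutoff are made up, and photo comparison is omitted:

```python
# Sketch: flag candidate reincarnations of a suspended account by
# screen-name similarity alone. Names and threshold are made up.
from difflib import SequenceMatcher

suspended = "ex_supporter_7"
candidates = ["ex_supporter_8", "exsupporter_77", "daily_news_feed"]

for name in candidates:
    score = SequenceMatcher(None, suspended, name).ratio()  # 0..1
    if score > 0.8:
        print(f"{name}: similarity {score:.2f} -> possible same user")
```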

But the researchers eventually developed a more efficient search strategy. When a suspended user created a new account, that person would probably re-follow many of the same people they had previously followed. So one way to find that user was to search the networks of accounts previously followed by the suspended account.

Using machine learning, the team assigned each such account a score, which captured the likelihood that a suspended user would re-follow it. The best approach, Zaman says, was to prioritize searching the networks of accounts with a high score and relatively few followers. After combing through one account’s network for a follower resembling the suspended account, the software moved on to the next friend on the list and repeated the process. “That gives you the fastest way to find these accounts,” Zaman says.
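A rough sketch of that search order, under stated assumptions: each former friend carries an estimated re-follow probability (the team's actual scoring model isn't reproduced here), friends are ranked so that high-probability, small-audience accounts come first, and each friend's follower list is scanned for a name resembling the suspended account:

```python
# Sketch: prioritized search for a suspended user's new account.
# Scores, counts, and names are illustrative assumptions.
from difflib import SequenceMatcher

suspended_name = "ex_supporter_7"

# (friend, estimated P(re-follow), follower count, current followers)
friends = [
    ("hub_account",   0.9, 50_000, ["fan_1", "fan_2"]),
    ("small_account", 0.8,    120, ["ex_supporter_8", "fan_3"]),
    ("other_account", 0.2,    300, ["fan_4"]),
]

# High re-follow probability over a small audience means a likely,
# cheap-to-check hit, so search those networks first.
friends.sort(key=lambda f: f[1] / f[2], reverse=True)

for name, p, n_followers, followers in friends:
    match = max(followers,
                key=lambda c: SequenceMatcher(None, suspended_name, c).ratio())
    sim = SequenceMatcher(None, suspended_name, match).ratio()
    if sim > 0.8:  # illustrative threshold
        print(f"possible new account: {match} "
              f"(found via {name}, similarity {sim:.2f})")
        break
```

Ranking by score divided by audience size is one simple way to encode "high score and relatively few followers"; the study's actual prioritization rule may differ.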

Zaman notes that while law enforcement agencies could use the software to root out extremists, authoritarian governments could do the same to quash resistance. “If you use it improperly, it’s suppression of dissent,” he says.

And a person should always review the output to confirm whether the software made the right call. “You want a human to be a final checkpoint,” Zaman says.

The software likely will not outperform Twitter’s internal methods for flagging extremists because the company has access to more data, such as IP addresses. But Zaman says the team’s method will help any social network combat dangerous groups.

“New types of extremist groups are going to continue popping up in different social networks and use them for propaganda and recruitment,” he says. “Our research provides a set of tools that can detect and monitor these groups no matter what network they are in and what dangerous message they espouse.”
