Center of Excellence Column: Boosting OSINT with NLP
This week, in our Center of Excellence Column, we’re diving into the use of NLP (Natural Language Processing) in OSINT investigations. To show how it works in practice, we share a case study from research into an online extremist community.
Let’s walk through the investigation!
Disclaimer: We have changed all the names involved in the original situation for privacy reasons.
Introducing the Suspect
Meet the main character of our investigation, Michael Francis. We already know that he has a few social media accounts and actively participates in online right-wing extremist groups. Our goal is to detect hate speech and offensive language within Michael’s posts and understand if he poses a threat. To do this, we’re equipped with ML-driven NLP modules that help us analyze a large number of posts and assess the tone of each text.
The Investigative Process
Step 1: Identifying Posts
To start, we select several social media groups in which Michael participates. Next, we extract his posts from those groups and run the NLP module to analyze what he wrote. As a result, we get 45 incoming links marked as “not toxic or offensive.” However, the ML analysis also flags eight posts as “toxic and offensive.”
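To make the sorting step concrete, here is a minimal sketch of how posts could be split into the two labels used above. It is not the NLP module from our investigation: the scorer is a toy word-list stand-in for a trained classifier, and the names `FLAGGED_TERMS`, `toy_toxicity_score`, `triage_posts`, and the threshold value are all assumptions made for illustration.

```python
from typing import Callable, Dict, Iterable, List

# Hypothetical placeholder lexicon. A real NLP module would rely on a
# trained ML classifier, not on a fixed word list.
FLAGGED_TERMS = {"vermin", "exterminate", "traitors"}


def toy_toxicity_score(text: str) -> float:
    """Stand-in scorer: the fraction of tokens found in FLAGGED_TERMS."""
    tokens = [t.strip(".,!?\"'").lower() for t in text.split()]
    tokens = [t for t in tokens if t]
    if not tokens:
        return 0.0
    hits = sum(1 for t in tokens if t in FLAGGED_TERMS)
    return hits / len(tokens)


def triage_posts(
    posts: Iterable[str],
    scorer: Callable[[str], float] = toy_toxicity_score,
    threshold: float = 0.1,  # assumed cut-off, chosen only for this example
) -> Dict[str, List[str]]:
    """Sort posts into the two labels by comparing each score to the threshold."""
    buckets: Dict[str, List[str]] = {
        "toxic and offensive": [],
        "not toxic or offensive": [],
    }
    for post in posts:
        if scorer(post) >= threshold:
            buckets["toxic and offensive"].append(post)
        else:
            buckets["not toxic or offensive"].append(post)
    return buckets
```

Because the scorer is passed in as a parameter, the toy function can be swapped for a call to a real model without changing the sorting logic. Either way, flagged posts still need a human review before any conclusions are drawn.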
Step 2: Identifying Partners
In the next stage, we look at the contents of the flagged texts and see the people who liked Michael’s offensive posts: Peggy Roland and Walter Nichols. Then, we start looking into their profiles. We check user groups, educational backgrounds, and workplaces. Finally, we cross-search using facial recognition to find any overlaps in the groups and pages they follow.
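The partner-identification logic can be sketched in a few lines. This is an illustrative outline only, assuming the likes and group memberships have already been collected into plain dictionaries. The function names `rank_likers` and `shared_groups` and the data layout are hypothetical, and the facial recognition cross-search is not covered here.

```python
from collections import Counter
from typing import Dict, Iterable, List, Set


def rank_likers(
    likes_by_post: Dict[str, List[str]],
    flagged_post_ids: Iterable[str],
) -> Counter:
    """Count how many flagged posts each user liked (one count per post)."""
    counts: Counter = Counter()
    for post_id in flagged_post_ids:
        # set() guards against the same like being recorded twice for a post
        counts.update(set(likes_by_post.get(post_id, [])))
    return counts


def shared_groups(
    groups_by_user: Dict[str, Set[str]],
    users: Iterable[str],
) -> Set[str]:
    """Return the groups and pages that every listed user follows."""
    sets = [groups_by_user.get(user, set()) for user in users]
    return set.intersection(*sets) if sets else set()
```

Calling `rank_likers(...).most_common()` lists the users who engage most often with the flagged posts, and `shared_groups` shows where their memberships overlap.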
Our analysis concludes when we discover that Peggy and Walter are active followers of none other than Michael Francis! When we piece together all the posts, we notice that Michael is aiming for a leadership position in the extremist user group, and Peggy and Walter support him.
So, now we have a clear idea of what is going on. Michael is trying to take the organization in a more violent direction, which members seem to support. Armed with this information, we can keep our eye on Michael Francis and monitor his posts to see if he starts posing a threat.
Step 3: Summary
Our investigation allowed us to identify a potential change in leadership for an extremist organization, which threatens to take the group in a more violent direction. With ML-driven NLP analysis, we quickly gathered the names of the members supporting Michael Francis as the group’s new leader. In addition, we uncovered the personal networks of two of Michael’s partners, Peggy Roland and Walter Nichols.
Unlocking Further Insights Through NLP
Natural Language Processing techniques allow investigators to analyze large amounts of data in a very short period. The obvious advantage is saving time and effort, since doing all the work manually is slow and labor-intensive. Moreover, using ML-driven NLP modules makes it possible to increase the scope of the inquiry. In the above example, we extended our research with additional social media accounts, phone numbers, and more, all thanks to a close analysis of Michael’s, Peggy’s, and Walter’s accounts.