Detection and analysis of Internet scams promoted on social media platforms (SFERA)

In the SFERA project, funded by NCBR's LIDER program, we work on real data from documented cases of online scams. We analyze false advertisements, manipulated statements, and cases of scammers impersonating well-known brands, public figures, and institutions.

Challenge

The SFERA project addresses the many, often very complex research challenges that arise from the rapidly changing nature of digital scams. We deal with content generated or partially modified using artificial intelligence, including so-called deepfake techniques. This content is scattered, republished repeatedly in different places, often short-lived, and contextually diverse. In current marketing trends, the line between real and manipulated content can be very thin, especially where fake recommendations, advertisements, or editing techniques are involved. The SFERA project confronts these problems with an interdisciplinary approach that combines machine learning with media and social analysis.

What we do

Our approach is based on a thorough analysis of how scammers operate, what technologies they use, and how they structure their messages to elicit specific actions from users. In the project, we work with real data from documented cases of online fraud. We analyze false advertisements, manipulated statements, and cases of scammers impersonating well-known brands, public figures, and institutions.

Our solutions are multimodal and analyze audiovisual content on several levels:

  • visual layer – we examine whether the recordings show traces of manipulation (e.g., face swaps, mouth movements matched to a new voice track, visible artifacts),
  • audio layer – we check whether the voice was generated using voice cloning techniques,
  • spoken content – we analyze the language, narratives, social engineering techniques, speech patterns, and characteristic phrases used in deception.
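
The description above does not say how the three levels are combined into a single decision. As a minimal sketch of one common option, late fusion, the example below takes a weighted average of per-modality scores and compares it with a threshold. The names `ModalityScores`, `fuse`, and `flag`, the weights, and the threshold are all hypothetical and serve only as an illustration; they are not taken from the project.

```python
from dataclasses import dataclass


@dataclass
class ModalityScores:
    """Hypothetical per-modality manipulation scores, each in [0, 1]."""
    visual: float  # e.g. face-swap or lip-sync artifacts in the video
    audio: float   # e.g. likelihood that the voice is cloned
    text: float    # e.g. social-engineering language in the transcript


# Illustrative weights only (assumption, not the project's values).
WEIGHTS = {"visual": 0.4, "audio": 0.3, "text": 0.3}


def fuse(scores: ModalityScores, weights: dict[str, float] = WEIGHTS) -> float:
    """Late fusion: weighted average of the per-modality scores."""
    total = sum(weights.values())
    weighted = sum(getattr(scores, name) * w for name, w in weights.items())
    return weighted / total


def flag(scores: ModalityScores, threshold: float = 0.5) -> bool:
    """Flag content when the fused score reaches the assumed threshold."""
    return fuse(scores) >= threshold


if __name__ == "__main__":
    sample = ModalityScores(visual=0.9, audio=0.8, text=0.7)
    print(f"fused score: {fuse(sample):.2f}, flagged: {flag(sample)}")
```

A weighted average keeps each modality's contribution easy to inspect, which may matter when results feed into reports for human reviewers; a learned fusion model would be an alternative when enough labeled cases are available.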

The project also takes a data-centric approach: we focus not only on building models but primarily on the quality, variety, and characteristics of the data, which we subject to in-depth case analysis. We are building a collection of scam cases that helps not only to train the models, but also to better understand the nature of digital manipulation.

Our models draw on recent advances in deep learning, natural language processing, audio analysis, pattern recognition, and computer vision. By combining these elements, we are building a system capable of detecting threats early and generating reports that can be used by institutions fighting disinformation and cyber fraud.

The project is funded by the National Centre for Research and Development as part of the 15th edition of its LIDER programme.

Project implementation period: June 1, 2025 – October 1, 2027
Contract number: LIDER15/0362/2024
Funding amount: 1,800,000.00 PLN