Back to Projects

Milena (Innocent Old Lady) - Anti-Scam AI - Gemini Competition 2024

🏆 Gemini API Developer Competition 2024 Submission

Date

June 2024

Team Size

2 members

Technologies

PythonFlaskGemini 2.0 FlashText-to-SpeechSpeech-to-TextTwilioElevenLabs

Project Videos

Overview

An innovative anti-scam application that uses AI to waste scammers' time by mimicking conversations with a virtual persona, protecting potential victims and making scams less profitable.

The Problem

Phone scams targeting elderly and vulnerable populations are a growing problem worldwide, causing billions of dollars in losses annually. Traditional approaches to combat scams are reactive - blocking known numbers or educating potential victims. However, scammers constantly adapt their tactics, and vulnerable populations remain at risk. There's a need for a proactive solution that can actively waste scammers' time, making their operations less profitable and reducing the number of victims they can target.

Our Solution

Milena is an AI-powered system that mimics a real conversation with a scammer through a virtual persona. When a scammer calls, the system seamlessly takes over the call, simulating a conversation that mirrors the typical phases of a scam: Problem, Manipulation, and Extraction. By intelligently analyzing the scammer's speech using the Gemini API, Milena responds in a way that prolongs the interaction, effectively stalling the scammer. This not only prevents them from targeting others during that time but also increases their operational costs, making the scamming process less profitable.

Key Features

  • Real-Time Speech Analysis: Uses Gemini API to analyze scammer's speech patterns and respond appropriately
  • Natural Conversation: Generates convincing responses using Gemini 2.0 Flash that sound like a real person
  • Text-to-Speech: Converts AI-generated responses into natural-sounding speech using ElevenLabs
  • Scam Phase Recognition: Identifies which phase of the scam (Problem, Manipulation, Extraction) the scammer is in and responds accordingly
  • Scammer Engagement Strategy: Strategically prolongs conversations to keep scammers occupied, reducing the time they can spend targeting real victims

The Gemini API's rapid natural language processing capabilities combined with ElevenLabs' realistic text-to-speech ensure that responses are timely and convincing, creating a believable conversational experience.

My Role

As part of a 2-person team, I participated in the idea creation, software development, and AI pipeline design. My contributions included designing the conversational AI logic, integrating the Gemini API for natural language understanding and generation, implementing the speech-to-text and text-to-speech pipelines, and collaborating on the overall system architecture. The project represents our approach to using AI for social good - turning scammers' tactics against them to protect vulnerable populations.