Toxicity and profanity detection for user-generated content

Detect toxic language, hate speech, and profanity in real time across 200+ languages. Lasso's four-layer pipeline classifies harmful text, catches evasion tactics like leet speak and unicode tricks, and uses context-aware AI to distinguish genuine threats from trash talk.

  • Leet-speak evasion in public chat

    A toxic insult disguised with leet speak and symbol swaps tries to slip past a basic filter.

    xX_ShadowRogue: gg
    PixelMage92: nice game everyone
    G4merGoblin: g0 b4ck wh3re u c4me fr0m u H4TE-worthy tr4sh  [Evasion decoded]
    CyberNinja: reported lol
    NoobSlayer77: EZ
  • Same phrase, different intent

    “I'll destroy you” reads as trash talk in a gaming lobby and as a genuine threat in a dating DM after rejection.

    "I’ll destroy you"
    Gaming lobby: "gg I’ll destroy you next round lol" → ✓ Banter
    Dating DM: "reject me again and I’ll destroy you" → ⚠ Threat (threat detected)
    Same phrase. Different intent. Classified by context.
  • Hate speech scored across protected characteristics

    Each detection returns severity scores across race, religion, gender, sexual orientation, and other protected categories.

    anon_user_4821: people like you don’t belong here. go back where you came from
    Hate speech: 0.92
    Identity attack: 0.87
    Toxicity: 0.95
    Profanity: 0.12
    Protected categories: Race / ethnicity, National origin
    Hate speech flagged
  • Custom word lists tuned to your platform

    Define platform-specific terms and thresholds. Lasso enforces them in the pipeline alongside the AI models.

    Gaming platform word list (247 terms)
    n00b, noob, nub: Allow
    gg ez, get rekt: Flag
    kys, neck yourself: Block (auto-blocked)
    trash, garbage, bot: Allow
    swatting, doxxing: Block

Lasso's four-layer toxicity and profanity detection pipeline

1. ML classification

ML models classify text in real time across toxicity, severe toxicity, insult, profanity, threat, identity attack, and obscene content.

Each returns a confidence score. Known evasion patterns (leet speak, unicode substitution, character spacing) are decoded before classification.

See Lasso in action
Public chat with leet-speak toxicity message
Toxicity detected
ML classification: 7 categories scored
Toxicity: 0.94 (High)
Insult: 0.87 (High)
Threat: 0.12 (Low)
Classification confidence: 94%
48 ms · English
2. Custom rules

Your platform rules applied as a working layer in the pipeline.

Custom word lists for industry-specific terms, profanity thresholds calibrated to your community, and escalation rules per content type. What's acceptable on an adult platform is not acceptable on a children's game. Configured in the dashboard, enforced automatically.

Layer 1 output
User: xDarkLord99
Message: ur all H4TE'd here lol g3t r3kt n00b$
Channel: #lobby
Custom rules
Block: Hate speech variants
  • H4TE, H8, HAT3 (leet variants)
  • kys / neck-yourself phrases
  • Targeted slur derivatives
  • Doxxing / swatting triggers
Blocked: leet variant matched custom word list (Rule #14)
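
As a rough illustration of how a custom word list with Allow, Flag, and Block actions might be evaluated, here is a minimal Python sketch. It is not Lasso's implementation: the `WORD_LIST` structure, the `apply_word_list` function, and the "most severe action wins" rule are all assumptions made for this example, and the terms are taken from the sample list shown earlier on this page.

```python
import re

# Hypothetical word list, mirroring the sample entries shown on this page.
WORD_LIST = {
    "n00b": "allow", "noob": "allow", "nub": "allow",
    "gg ez": "flag", "get rekt": "flag",
    "kys": "block", "neck yourself": "block",
    "trash": "allow", "garbage": "allow", "bot": "allow",
    "swatting": "block", "doxxing": "block",
}

# Assumption: when several terms match, the most severe action wins.
SEVERITY = {"allow": 0, "flag": 1, "block": 2}


def apply_word_list(message: str) -> str:
    """Return the most severe action triggered by any listed term."""
    text = message.lower()
    action = "allow"
    for term, term_action in WORD_LIST.items():
        # Match whole terms only, so "kys" does not fire inside "skyscraper".
        pattern = r"(?<!\w)" + re.escape(term) + r"(?!\w)"
        if re.search(pattern, text) and SEVERITY[term_action] > SEVERITY[action]:
            action = term_action
    return action


print(apply_word_list("gg ez noob"))        # flag
print(apply_word_list("just kys already"))  # block
```

In a real pipeline a lookup like this would presumably run on text that has already been decoded for leet speak and similar tricks, so that variants map back to the listed terms.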
3. AI moderator

Context-aware AI resolves grey areas in toxicity detection.

The AI Moderator understands sarcasm, coded language, trash-talk norms, and community-specific slang. It is the reason Lasso reaches 99% automation.

Context-aware split showing Gaming lobby and Dating DM
Threat detected
AI Moderator: analyzing
Input text: “I'll destroy you”
Context: Dating DM after rejection
Intent analysis: Targeted threat directed at user
Classification: Severe toxicity, threatening behavior
Auto-removed: threat in private DM after rejection (94%)
4. Human review

Edge cases queued for human review with full context.

Each decision feeds back into the AI Moderator. Toxicity classification improves over time.

Queued for review
AI Moderator reasoning: Phrase is ambiguous. It reads as a threat in a DM after rejection, but the tone is closer to venting than direct intent. Confidence below auto-action threshold.
Human review (1 of 3)
Flagged text: “reject me again and I'll destroy you”
AI assessment: Possible threat. Ambiguous DM context, community-specific tone.
AI confidence: Severity 78%, below the auto-action threshold (90%)
This decision will train the AI for future similar content
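
The hand-off between the AI Moderator and human review can be pictured as a simple confidence check. The sketch below is only an illustration under stated assumptions: the `Assessment` type and `route` function are hypothetical names, and the 0.90 threshold is taken from the example above rather than from any documented default.

```python
from dataclasses import dataclass

# Taken from the example above (90%); a real threshold would be configurable.
AUTO_ACTION_THRESHOLD = 0.90


@dataclass
class Assessment:
    """Hypothetical shape of an AI assessment: a label plus a confidence."""
    label: str
    confidence: float


def route(assessment: Assessment, threshold: float = AUTO_ACTION_THRESHOLD) -> str:
    """Act automatically at or above the threshold, otherwise queue for a human."""
    if assessment.confidence >= threshold:
        return "auto_action"
    return "human_review"


print(route(Assessment("threat", 0.94)))  # auto_action
print(route(Assessment("threat", 0.78)))  # human_review
```

With these numbers, the 94% example is actioned automatically and the 78% example lands in the review queue, which matches the two cases shown on this page.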

Toxicity and profanity detection capabilities

Catches leet speak, unicode tricks, character spacing, phonetic substitutions, and symbol replacements designed to bypass text filters. Decodes the evasion before classifying the content.

Public chat · #lobby (47 online)
PixelMage92: gg everyone, nice round
G4merGoblin: y0u $uck k!ddo, g0 b4ck wh3re u c4me fr0m  [⚠ Evasion pattern detected]
xLossLordx: k y s  n0  0  b  [⚠ Char spacing + leet]
CyberNinja: reported ¯\_(ツ)_/¯
Evasion decoder
y0u $uck k!ddo → you suck kiddo
Leet 0→o · Sym $→s · Sym !→i
Patterns caught
Leet speak: 128
Unicode tricks: 96
Character spacing: 54
Phonetic subs: 37
Symbol replace: 81
Zero-width chars: 22
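
To make the idea of decoding before classifying more concrete, here is a minimal Python sketch of a text normalizer. It is an assumption-laden illustration, not Lasso's decoder: the `LEET_MAP` table, the `ZERO_WIDTH` set, and the `decode_evasion` function are invented for this example, and it covers only unicode folding, zero-width characters, and simple leet or symbol swaps.

```python
import unicodedata

# Illustrative substitution table; a production decoder would be far larger.
LEET_MAP = str.maketrans({
    "0": "o", "1": "i", "3": "e", "4": "a", "5": "s", "7": "t",
    "$": "s", "!": "i", "@": "a",
})

# Common zero-width code points sometimes used to split words invisibly.
ZERO_WIDTH = {"\u200b", "\u200c", "\u200d", "\u2060", "\ufeff"}


def decode_evasion(text: str) -> str:
    """Normalize common evasion tricks before the text is classified."""
    # NFKC folds compatibility characters such as fullwidth letters to ASCII.
    text = unicodedata.normalize("NFKC", text)
    # Drop zero-width characters.
    text = "".join(ch for ch in text if ch not in ZERO_WIDTH)
    # Lowercase, then undo leet and symbol substitutions.
    return text.lower().translate(LEET_MAP)


print(decode_evasion("y0u $uck k!ddo"))  # you suck kiddo
```

A blanket substitution table like this over-decodes ordinary text (real numbers and exclamation marks are rewritten too), and it does not handle character spacing or phonetic substitutions, so it should be read as a sketch of the idea rather than a usable filter.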

Distinguishes trash talk from genuine threats, sarcasm from real abuse, and coded language from innocent use. Considers conversation history and community norms.

Classifies hate speech targeting race, religion, gender, sexual orientation, disability, and other protected characteristics. Returns severity and confidence per detection.

Define blocked terms for your platform, region, or industry. Custom word lists work alongside AI models so both known terms and novel variations are caught.

Lasso: next-gen AI content moderation

99%

On autopilot, and getting smarter every day.

Three layers of AI handle the volume, your rules, and the grey areas. Your team only sees what truly needs them, and every decision they make improves the system.

Keep users safe without driving them away.

Customizable moderation that lets you find the right balance between safety and user experience. So you protect your community without suppressing the culture that makes it worth joining.

Complexity removed from content moderation.

One API. Clear dashboards. A moderation pipeline built around one-click actions and the right context, right where you need it.

★★★★★
4.9

Highest rated in content moderation on G2.

Toxicity and profanity look different on every platform.

Gaming

Evasion tactics in chat: leet speak insults, unicode tricks, spaced characters.

  • Trash talk distinguished from genuine threats
  • Profanity in usernames and clan tags
  • Toxicity patterns across lobby chat, forums, and voice channels
More on Gaming
Dating

Toxic messages in private conversations.

  • Profanity and threats after rejected connections
  • Off-platform solicitation disguised with coded language
  • Repeat offenders cycling through new accounts with abusive patterns
More on Dating
Social

Toxicity across comments, posts, and group discussions.

  • Coded hate speech in community forums
  • Users evading bans with new accounts and altered language
  • Escalating hostility in political and cultural debates
More on Social
Marketplaces

Hostile language in transaction disputes.

  • Toxic seller-to-buyer messages pressuring off-platform payment
  • Profanity in product reviews and Q&A
  • Coordinated fake reviews with abusive language targeting competing sellers
More on Marketplaces
Publishing

Toxic comments on news articles and opinion pieces.

  • Organized harassment campaigns against writers
  • Profanity and threats in letters and feedback sections
  • Comment quality maintained without suppressing legitimate debate
More on Publishing
Adult entertainment

Abusive messages targeting performers.

  • Hate speech based on identity characteristics in comments
  • Toxicity in live chat during streams
  • Custom profanity rules calibrated to adult entertainment norms where standard thresholds don't apply
More on Adult entertainment
FAQs

Toxicity and profanity detection, answered

Detection categories include toxicity, severe toxicity, insult, profanity, threat, identity attack, obscene, and hate speech across protected characteristics. Each message returns confidence scores per category. Evasion detection decodes leet speak, unicode tricks, and character spacing before classification. Context-aware AI handles sarcasm, coded language, and community-specific slang.

Lasso detects multiple evasion techniques: leet speak, unicode substitution, character spacing, phonetic swaps, symbol replacements, and repeated characters. ML models decode known evasion patterns automatically. The AI Moderator adds context-aware detection for novel evasion attempts, including coded language and emerging slang that static filters would miss.

Four sequential layers. ML models handle classification and evasion decoding. Custom rules enforce your platform's profanity lists and toxicity standards. The AI Moderator resolves ambiguous content (sarcasm, coded language, trash talk) with context awareness. Human moderators handle the final edge cases and retrain the system. Result: 99% automation that improves over time.

Yes. The AI Moderator considers conversation context, community norms, and platform type when classifying ambiguous messages. Competitive gaming language, sarcastic comments, and reclaimed terms are evaluated by intent. The same phrase can be harmless in one context and threatening in another. Lasso's context-aware classification handles this distinction automatically.

Yes. Custom word lists let you define blocked terms for your platform, region, and industry. Lists are enforced in the pipeline alongside AI models, so both your defined terms and novel variations are caught. Update lists in the dashboard without code. Different platforms have different standards. A word that's fine in adult entertainment may be blocked on a kids' platform. Custom rules handle this.

See what toxicity detection catches in your community

Book a demo and see Lasso's four-layer toxicity and profanity detection pipeline in action.

Protect your brand and safeguard your user experience.

© 2026. All rights reserved.