Violation Detection System

VoxelAI's ChatWarden detects and classifies chat violations in real time, using the surrounding conversation to interpret each message.

Advanced AI Analysis

The system uses Google's Gemini AI to understand context, intent, and nuance in chat messages for accurate violation detection.

Violation Types

Spam

Weight: 0.5

• Repeated identical messages

• Excessive caps

• Chat flooding

• Server advertising

Toxicity

Weight: 1.0

• General insults

• Personal attacks

• Negative behavior

• Threats

Harassment

Weight: 1.4

• Targeted behavior

• Persistent attacks

• Following/stalking

• Coordinated targeting

Hate Speech

Weight: 2.0

• Discriminatory language

• Slurs

• Targeted group attacks

• Extremist content

Profanity

Weight: 0.8

• Inappropriate language

• Sexual content

• Explicit material

• Contextual swearing

AI Analysis Process

Batch Processing Configuration

batch_processing:
  enabled: true
  chunk_size: 75                # Messages per batch
  process_interval_seconds: 15  # Seconds between processing runs
  max_batch_delay_minutes: 1    # Maximum minutes a message waits in the queue
  queue_limit: 300              # Maximum queued messages
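
The sketch below is one possible reading of how these settings interact. It is not ChatWarden's actual implementation. The `BatchQueue` class, its method names, and the choice to drop new messages when the queue is full are all hypothetical. It assumes a batch is released when a full chunk is available or when the oldest queued message has waited past the maximum delay, and that `next_batch` is called once per processing interval.

```python
from collections import deque
from dataclasses import dataclass, field


@dataclass
class BatchQueue:
    """Hypothetical in-memory queue mirroring the batch_processing keys."""
    chunk_size: int = 75
    max_batch_delay_seconds: float = 60.0  # max_batch_delay_minutes: 1
    queue_limit: int = 300
    _queue: deque = field(default_factory=deque)

    def enqueue(self, message: str, now: float) -> bool:
        """Queue a message; return False when the queue is full.

        Assumption: new messages are dropped once queue_limit is reached.
        """
        if len(self._queue) >= self.queue_limit:
            return False
        self._queue.append((now, message))
        return True

    def next_batch(self, now: float) -> list[str]:
        """Return up to chunk_size messages, or [] if no batch is due yet.

        Intended to be called every process_interval_seconds.
        """
        if not self._queue:
            return []
        oldest_timestamp = self._queue[0][0]
        full = len(self._queue) >= self.chunk_size
        overdue = now - oldest_timestamp >= self.max_batch_delay_seconds
        if not (full or overdue):
            return []
        count = min(self.chunk_size, len(self._queue))
        return [self._queue.popleft()[1] for _ in range(count)]
```

Under this reading, a quiet server still has its messages analyzed within a minute, while a busy one is processed in chunks of 75 as soon as they fill.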

Severity Levels

Spam

Weight: 0.5

Level 1: 2-3 repeated messages

Level 2: 4-5 repeated messages

Level 3: 6+ repeated messages

Level 4: Extreme flooding (10+ messages)

Level 5: Bot-like behavior or advertising

Toxicity

Weight: 1.0

Level 1: Mild criticism

Level 2: General insults

Level 3: Personal attacks

Level 4: Severe attacks/threats

Level 5: Extreme toxicity campaigns

Harassment

Weight: 1.4

Level 1: Mild targeting

Level 2: Clear targeting pattern

Level 3: Persistent harassment

Level 4: Severe harassment/threats

Level 5: Extreme coordinated harassment

Hate Speech

Weight: 2.0

Level 1: Mild discriminatory language

Level 2: Clear discriminatory intent

Level 3: Targeted hate speech

Level 4: Violent hate speech

Level 5: Extreme hate speech/calls for violence

Profanity

Weight: 0.8

Level 1: Mild swearing

Level 2: Moderate profanity

Level 3: Directed profanity

Level 4: Extreme profanity

Level 5: Graphic sexual content
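
This page does not state how a type's weight combines with a severity level. As an illustration only, the sketch below assumes that the five entries per type correspond to severity levels 1 to 5 and that the weighted score is the level multiplied by the weight. Both the formula and the `weighted_score` function are hypothetical. The dictionary keys follow the names used in the `confidence_thresholds` configuration.

```python
# Weights copied from the violation type list on this page.
WEIGHTS = {
    "spam": 0.5,
    "toxicity": 1.0,
    "harassment": 1.4,
    "hate_speech": 2.0,
    "profanity": 0.8,
}


def weighted_score(violation_type: str, severity: int) -> float:
    """Assumed formula: severity level (1-5) multiplied by the type weight."""
    if not 1 <= severity <= 5:
        raise ValueError("severity must be between 1 and 5")
    return severity * WEIGHTS[violation_type]
```

With this assumed formula, level 4 spam (4 x 0.5 = 2.0) scores the same as level 1 hate speech (1 x 2.0 = 2.0), which shows how the weights let less serious categories escalate more slowly.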

Confidence Thresholds

Threshold Configuration

Minimum AI confidence required before a message is flagged, set per violation type

confidence_thresholds:
  toxicity: 0.8       # High threshold
  harassment: 0.85    # Very high
  hate_speech: 0.9    # Highest threshold
  spam: 0.8           # High threshold
  profanity: 0.75     # Moderate
  custom_rules: 0.8   # High threshold
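
A minimal sketch of how such thresholds might be applied to a classification result. The `passes_threshold` function is hypothetical, and the inclusive comparison (`>=`) is an assumption based on the word "minimum".

```python
# Values copied from the confidence_thresholds configuration.
CONFIDENCE_THRESHOLDS = {
    "toxicity": 0.8,
    "harassment": 0.85,
    "hate_speech": 0.9,
    "spam": 0.8,
    "profanity": 0.75,
    "custom_rules": 0.8,
}


def passes_threshold(violation_type: str, confidence: float) -> bool:
    """Return True when the confidence meets the configured minimum."""
    return confidence >= CONFIDENCE_THRESHOLDS[violation_type]
```

For example, a hate speech classification with confidence 0.88 would be discarded under these settings, while a profanity classification with confidence 0.78 would be kept.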

False Positive Prevention

Gaming Context

• Understands competitive banter

• Recognizes gaming terminology

• Considers game events

• Adapts to server culture

Friendly Banter

• Distinguishes intent

• Considers relationships

• Analyzes conversation flow

• Respects friend dynamics

Message Context

• Conversation flow

• Player relationships

• Game events

• Server events

Custom Rules

Server-Specific Rules

Add custom violation types for your server

custom_rules:
  enabled: true
  rules:
    - name: "server_promotion"
      pattern: "play.competitor.com"
      severity: 4
      confidence: 1.0
    - name: "forbidden_words"
      pattern: ["word1", "word2"]
      severity: 2
      confidence: 0.9
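
The example configuration gives `pattern` as either a single string or a list of strings. The sketch below shows one way a loader could normalize both shapes and match them against a message. The `CustomRule`, `load_rule`, and `match_rules` names are hypothetical, and case-insensitive substring matching is an assumption: this page does not say whether patterns are treated as literal text or as regular expressions.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class CustomRule:
    name: str
    patterns: tuple[str, ...]
    severity: int
    confidence: float


def load_rule(raw: dict) -> CustomRule:
    """Normalize a rule entry; `pattern` may be one string or a list."""
    pattern = raw["pattern"]
    patterns = (pattern,) if isinstance(pattern, str) else tuple(pattern)
    return CustomRule(raw["name"], patterns, raw["severity"], raw["confidence"])


def match_rules(message: str, rules: list[CustomRule]) -> list[CustomRule]:
    """Return every rule with at least one pattern found in the message.

    Assumption: case-insensitive substring match.
    """
    text = message.lower()
    return [r for r in rules if any(p.lower() in text for p in r.patterns)]


# The two rules from the configuration above, as parsed dictionaries.
RULES = [
    load_rule({"name": "server_promotion", "pattern": "play.competitor.com",
               "severity": 4, "confidence": 1.0}),
    load_rule({"name": "forbidden_words", "pattern": ["word1", "word2"],
               "severity": 2, "confidence": 0.9}),
]
```

Normalizing to a tuple up front keeps the matching code free of type checks, whichever shape the configuration uses.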

Monitoring & Debugging

Debug Mode

core:
  debug_mode: true
  log_violations: true

Enables detailed logging for analysis and debugging

Statistics Command

/voxelai stats

• Detection rates

• False positive rates

• Most common violations

• AI response times
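
This page does not define how these statistics are calculated. The sketch below shows one plausible set of definitions, all of them assumptions: detection rate as flagged messages divided by all analyzed messages, and false positive rate as flags overturned by staff divided by all flags. The `Record` type and `summarize` function are hypothetical.

```python
from collections import Counter
from dataclasses import dataclass
from typing import Optional


@dataclass
class Record:
    violation_type: Optional[str]  # None when the message was not flagged
    overturned: bool = False       # True when staff marked the flag as wrong
    response_ms: float = 0.0       # AI response time for this message


def summarize(records: list[Record]) -> dict:
    """Compute summary statistics under the assumed definitions."""
    flagged = [r for r in records if r.violation_type is not None]
    total = len(records)
    return {
        "detection_rate": len(flagged) / total if total else 0.0,
        "false_positive_rate": (
            sum(r.overturned for r in flagged) / len(flagged) if flagged else 0.0
        ),
        "most_common": Counter(r.violation_type for r in flagged).most_common(3),
        "avg_response_ms": (
            sum(r.response_ms for r in records) / total if total else 0.0
        ),
    }
```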

Best Practices

Regular Monitoring

• Check violation patterns

• Review false positives

• Adjust thresholds as needed

• Monitor system performance

Configuration Tuning

• Start with high confidence thresholds

• Adjust based on server needs

• Balance sensitivity vs. accuracy

• Test changes gradually

Staff Training

• Understand detection logic

• Know how to handle appeals

• Monitor for abuse patterns

• Review the system regularly