Violation Detection System
VoxelAI's ChatWarden uses AI to detect and classify chat violations in real time, taking the surrounding conversation context into account.
Advanced AI Analysis
Violation Types
Spam
Weight: 0.5
• Repeated identical messages
• Excessive caps
• Chat flooding
• Server advertising
Toxicity
Weight: 1.0
• General insults
• Personal attacks
• Negative behavior
• Threats
Harassment
Weight: 1.4
• Targeted behavior
• Persistent attacks
• Following/stalking
• Coordinated targeting
Hate Speech
Weight: 2.0
• Discriminatory language
• Slurs
• Targeted group attacks
• Extremist content
Profanity
Weight: 0.8
• Inappropriate language
• Sexual content
• Explicit material
• Contextual swearing
AI Analysis Process
Batch Processing Configuration
batch_processing:
  enabled: true
  chunk_size: 75                 # Messages per batch
  process_interval_seconds: 15   # Processing frequency
  max_batch_delay_minutes: 1     # Maximum wait time
  queue_limit: 300               # Maximum queued messages
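To make the interaction of these settings concrete, here is a minimal Python sketch of a batching queue driven by the values above. The `MessageQueue` class, its method names, and the choice to reject messages once the queue is full are illustrative assumptions, not ChatWarden's actual implementation.

```python
from collections import deque
from dataclasses import dataclass, field

# Values taken from the batch_processing configuration above.
CHUNK_SIZE = 75           # chunk_size
PROCESS_INTERVAL = 15.0   # process_interval_seconds
MAX_BATCH_DELAY = 60.0    # max_batch_delay_minutes, expressed in seconds
QUEUE_LIMIT = 300         # queue_limit


@dataclass
class MessageQueue:
    """Hypothetical in-memory queue, used only to illustrate the settings."""
    pending: deque = field(default_factory=deque)  # (enqueue_time, message)
    last_run: float = 0.0

    def enqueue(self, message: str, now: float) -> bool:
        # Assumption: a message arriving at a full queue is rejected.
        if len(self.pending) >= QUEUE_LIMIT:
            return False
        self.pending.append((now, message))
        return True

    def should_process(self, now: float) -> bool:
        # Process when the interval has elapsed, or when the oldest
        # queued message has waited for the maximum batch delay.
        if not self.pending:
            return False
        interval_elapsed = now - self.last_run >= PROCESS_INTERVAL
        oldest_overdue = now - self.pending[0][0] >= MAX_BATCH_DELAY
        return interval_elapsed or oldest_overdue

    def take_batch(self, now: float) -> list[str]:
        # Remove and return up to CHUNK_SIZE messages, oldest first.
        self.last_run = now
        size = min(CHUNK_SIZE, len(self.pending))
        return [self.pending.popleft()[1] for _ in range(size)]
```

Under these assumptions, a burst of 310 messages leaves 300 in the queue, and each processing pass drains at most 75 of them, so clearing a full queue takes four passes.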
Severity Levels
Spam
Level 1: 2-3 repeated messages
Level 2: 4-5 repeated messages
Level 3: 6+ repeated messages
Level 4: Extreme flooding (10+ messages)
Level 5: Bot-like behavior or advertising
Toxicity
Level 1: Mild criticism
Level 2: General insults
Level 3: Personal attacks
Level 4: Severe attacks/threats
Level 5: Extreme toxicity campaigns
Harassment
Level 1: Mild targeting
Level 2: Clear targeting pattern
Level 3: Persistent harassment
Level 4: Severe harassment/threats
Level 5: Extreme coordinated harassment
Hate Speech
Level 1: Mild discriminatory language
Level 2: Clear discriminatory intent
Level 3: Targeted hate speech
Level 4: Violent hate speech
Level 5: Extreme hate speech/calls for violence
Profanity
Level 1: Mild swearing
Level 2: Moderate profanity
Level 3: Directed profanity
Level 4: Extreme profanity
Level 5: Graphic sexual content
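This section does not state how the per-type weights and the severity entries combine. One plausible reading, sketched below in Python, treats the five entries listed for each type as severity levels 1 to 5 and multiplies the level by the type's weight. The formula and the `weighted_score` function are assumptions for illustration only; ChatWarden may combine these values differently.

```python
# Weights as listed under Violation Types.
WEIGHTS = {
    "spam": 0.5,
    "profanity": 0.8,
    "toxicity": 1.0,
    "harassment": 1.4,
    "hate_speech": 2.0,
}


def weighted_score(violation_type: str, severity: int) -> float:
    """Hypothetical score: type weight multiplied by severity level (1-5)."""
    if not 1 <= severity <= 5:
        raise ValueError("severity must be between 1 and 5")
    return WEIGHTS[violation_type] * severity


print(weighted_score("spam", 5))         # highest spam level
print(weighted_score("hate_speech", 2))  # second hate speech level
```

Under this assumed formula, the most severe spam (0.5 × 5 = 2.5) still scores below clear discriminatory intent (2.0 × 2 = 4.0), which matches the intent suggested by the weights: some violation types matter more than others at any level.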
Confidence Thresholds
Threshold Configuration
Minimum confidence required for violation detection
confidence_thresholds:
  toxicity: 0.8       # High threshold
  harassment: 0.85    # Very high
  hate_speech: 0.9    # Highest threshold
  spam: 0.8           # High threshold
  profanity: 0.75     # Moderate
  custom_rules: 0.8   # High threshold
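As a rough illustration of how these thresholds gate detections, the Python sketch below accepts a detection only when the model's confidence meets the minimum for its type. The `is_actionable` helper and the inclusive comparison (`>=`) are assumptions; the source only says the values are minimums.

```python
# Thresholds copied from the confidence_thresholds configuration above.
CONFIDENCE_THRESHOLDS = {
    "toxicity": 0.8,
    "harassment": 0.85,
    "hate_speech": 0.9,
    "spam": 0.8,
    "profanity": 0.75,
    "custom_rules": 0.8,
}


def is_actionable(violation_type: str, confidence: float) -> bool:
    """Assumed rule: a detection counts when confidence >= the type's minimum."""
    return confidence >= CONFIDENCE_THRESHOLDS[violation_type]


print(is_actionable("profanity", 0.78))    # meets the 0.75 minimum
print(is_actionable("hate_speech", 0.88))  # falls short of the 0.9 minimum
```

The same confidence value can therefore pass for one type and fail for another: 0.88 is enough for harassment (0.85) but not for hate speech (0.9).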
False Positive Prevention
Gaming Context
• Understands competitive banter
• Recognizes gaming terminology
• Considers game events
• Adapts to server culture
Friendly Banter
• Distinguishes intent
• Considers relationships
• Analyzes conversation flow
• Respects friend dynamics
Message Context
• Conversation flow
• Player relationships
• Game events
• Server events
Custom Rules
Server-Specific Rules
Add custom violation types for your server
custom_rules:
  enabled: true
  rules:
    - name: "server_promotion"
      pattern: "play.competitor.com"
      severity: 4
      confidence: 1.0
    - name: "forbidden_words"
      pattern: ["word1", "word2"]
      severity: 2
      confidence: 0.9
Monitoring & Debugging
Debug Mode
core:
  debug_mode: true
  log_violations: true
Enables detailed logging for analysis and debugging
Statistics Command
/voxelai stats
• Detection rates
• False positive rates
• Most common violations
• AI response times
Best Practices
Regular Monitoring
• Check violation patterns
• Review false positives
• Adjust thresholds as needed
• Monitor system performance
Configuration Tuning
• Start with high confidence thresholds
• Adjust based on server needs
• Balance sensitivity vs. accuracy
• Test changes gradually
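One way to ground the sensitivity versus accuracy trade-off is to replay staff-reviewed detections against candidate thresholds before changing the configuration. The Python sketch below is a hypothetical offline helper, not a ChatWarden feature; the `Reviewed` record, the `precision_recall` function, and the sample data are invented for illustration.

```python
from typing import NamedTuple


class Reviewed(NamedTuple):
    confidence: float   # confidence the AI reported for the message
    is_violation: bool  # verdict from a staff review


def precision_recall(samples: list[Reviewed], threshold: float) -> tuple[float, float]:
    """Return (precision, recall) if only detections >= threshold were flagged."""
    flagged = [s for s in samples if s.confidence >= threshold]
    true_positives = sum(s.is_violation for s in flagged)
    actual_violations = sum(s.is_violation for s in samples)
    precision = true_positives / len(flagged) if flagged else 1.0
    recall = true_positives / actual_violations if actual_violations else 1.0
    return precision, recall


# Made-up review data, for illustration only.
SAMPLES = [
    Reviewed(0.95, True),
    Reviewed(0.85, True),
    Reviewed(0.82, False),
    Reviewed(0.78, True),
    Reviewed(0.60, False),
]

for threshold in (0.9, 0.8, 0.75):
    p, r = precision_recall(SAMPLES, threshold)
    print(f"threshold={threshold}: precision={p:.2f} recall={r:.2f}")
```

With this sample data, lowering the threshold from 0.9 to 0.75 raises recall from one in three real violations to all of them, while precision drops from 1.00 to 0.75. Starting high and stepping down gradually lets staff see that trade-off before false positives reach players.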
Staff Training
• Understand detection logic
• Know how to handle appeals
• Monitor for abuse patterns
• Regular system reviews