Violation Weights & Scoring System

Advanced

VoxelAI's ChatWarden assigns each violation a numeric score by combining its severity, a per-type weight, and any special modifiers, then scales that score by an escalation multiplier and maps the result to a punishment.

Sophisticated Scoring

The scoring system balances violation severity, type weights, and escalation multipliers to ensure fair and proportionate punishments.

Core Concepts

Base Score Formula

Base Score = Violation Severity × Violation Weight × Special Modifiers

Final Score Formula

Final Score = Base Score × Escalation Multiplier
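The two formulas can be sketched in Python as follows. This is a minimal illustration, not ChatWarden's actual implementation: the `Violation` class and the `base_score` and `final_score` functions are hypothetical names, and multiple special modifiers are assumed to be pre-multiplied into a single `modifier` value.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Violation:
    """Hypothetical container for the inputs to the base score formula."""
    severity: int           # severity level, 1-5
    weight: float           # violation type weight, e.g. 0.5 for spam
    modifier: float = 1.0   # product of special modifiers; 1.0 means none


def base_score(v: Violation) -> float:
    """Base Score = Violation Severity x Violation Weight x Special Modifiers."""
    return v.severity * v.weight * v.modifier


def final_score(v: Violation, escalation_multiplier: float = 1.0) -> float:
    """Final Score = Base Score x Escalation Multiplier."""
    return base_score(v) * escalation_multiplier


spam = Violation(severity=2, weight=0.5, modifier=1.5)
print(base_score(spam))        # 1.5
print(final_score(spam, 2.0))  # 3.0
```

The escalation multiplier of 2.0 in the last line is an arbitrary value chosen for illustration.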

Violation Weights

Violation Weight Configuration

violation_weights:
  spam: 0.5        # Lower weight due to frequency
  toxicity: 1.0    # Standard weight
  harassment: 1.4  # Higher due to targeting
  hate_speech: 2.0 # Highest weight
  profanity: 0.8   # Moderate weight
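As a rough sketch of how these weights affect scoring, the lookup below mirrors the configuration in a Python dictionary. The `weight_for` helper and the fallback weight of 1.0 for unlisted violation types are assumptions made for illustration; how ChatWarden treats unknown types is not documented here.

```python
# Mirrors the violation_weights configuration.
VIOLATION_WEIGHTS = {
    "spam": 0.5,
    "toxicity": 1.0,
    "harassment": 1.4,
    "hate_speech": 2.0,
    "profanity": 0.8,
}

DEFAULT_WEIGHT = 1.0  # assumption: unlisted types fall back to the standard weight


def weight_for(violation_type: str) -> float:
    """Look up the weight for a violation type (hypothetical helper)."""
    return VIOLATION_WEIGHTS.get(violation_type, DEFAULT_WEIGHT)


# The same severity-3 violation scores differently depending on its type.
for name in VIOLATION_WEIGHTS:
    print(f"{name}: {3 * weight_for(name):.1f}")
```

With these values, a hate_speech violation scores four times as much as a spam violation of the same severity, before modifiers and escalation.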

Severity Levels (1-5)

Level 1 (Minor Violation): Accidental or unintentional, minimal impact, first occurrence, easy to correct

Level 2 (Low Violation): Mild disruption, some intent, minor impact, quick to resolve

Level 3 (Medium Violation): Clear intent, moderate impact, repeated behavior, needs intervention

Level 4 (High Violation): Strong intent, significant impact, pattern of behavior, requires action

Level 5 (Severe Violation): Malicious intent, maximum impact, coordinated action, immediate response
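One way to represent the five levels in code is an integer enumeration, so that a level can be used directly in the score arithmetic. The `Severity` enum and `parse_severity` helper below are hypothetical names for illustration, not part of ChatWarden.

```python
from enum import IntEnum


class Severity(IntEnum):
    """The five severity levels, usable directly as integers."""
    MINOR = 1
    LOW = 2
    MEDIUM = 3
    HIGH = 4
    SEVERE = 5


def parse_severity(level: int) -> Severity:
    """Convert a raw level into a Severity, rejecting values outside 1-5."""
    try:
        return Severity(level)
    except ValueError:
        raise ValueError(f"severity must be between 1 and 5, got {level!r}") from None


print(parse_severity(4).name)  # HIGH
print(Severity.SEVERE * 2.0)   # 10.0
```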

Escalation Calculation Methods

New Severity-Weighted Method

VoxelAI supports two escalation methods: the traditional count-based method and a newer severity-weighted method, which scales escalation by how severe a user's recent violations were rather than by how many there were.

Severity-Weighted Escalation

Escalation based on cumulative severity of violations

escalation:
  calculation_method: "severity"
  base_multiplier: 1.0
  severity_factor: 0.1
  max_multiplier: 3.0

Formula:
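Escalation = base_multiplier + (severity_factor × Sum of Recent Severities), capped at max_multiplier
With the values above: Escalation = 1.0 + (0.1 × Sum of Recent Severities), up to 3.0×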

Example:
Recent violations: severity 3, 2, 4
Escalation = 1.0 + (0.1 × 9) = 1.9×
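The calculation can be sketched as a small Python function. The function name is hypothetical, and two points are assumptions: that `max_multiplier` acts as an upper cap on the result, and that the caller decides which violations count as "recent" (the time window is not specified here).

```python
from typing import Iterable


def severity_escalation(
    recent_severities: Iterable[int],
    base_multiplier: float = 1.0,
    severity_factor: float = 0.1,
    max_multiplier: float = 3.0,
) -> float:
    """Escalation from the cumulative severity of recent violations."""
    raw = base_multiplier + severity_factor * sum(recent_severities)
    # Assumption: max_multiplier is an upper bound on the escalation.
    return min(raw, max_multiplier)


print(round(severity_escalation([3, 2, 4]), 2))  # 1.9
print(severity_escalation([5, 5, 5, 5, 5]))      # 3.0 (capped; uncapped would be 3.5)
```

A user with no recent violations gets the base multiplier of 1.0, so the final score equals the base score.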

Advantages

• Accurate: Escalation reflects the severity of past violations, not just their count

• Fair: Minor violations don't escalate as much as severe ones

• Responsive: Severe violations immediately increase future escalation

Special Modifiers

Spam Modifier

Applied to consolidated spam violations

spam_modifier: 1.5

Batch Modifiers

For multiple violations in a single batch

batch_modifiers:
  enabled: true
  threshold: 5      # Messages in batch
  multiplier: 1.2   # Score multiplier

Time-based Modifiers

For rapid or persistent violations

time_modifiers:
  rapid_repeat: 1.3  # Quick repeated violations
  persistent: 1.4    # Violations over time
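The sketch below shows one plausible way the special modifiers could combine into the single Special Modifiers factor of the base score. It rests on assumptions not confirmed by this page: that applicable modifiers are multiplied together, that the batch multiplier applies when the batch size is at or above `threshold`, and that the caller supplies the flags. The `ModifierConfig` class and `combined_modifier` function are hypothetical.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class ModifierConfig:
    """Values copied from the modifier settings in this section."""
    spam_modifier: float = 1.5
    batch_enabled: bool = True
    batch_threshold: int = 5
    batch_multiplier: float = 1.2
    rapid_repeat: float = 1.3
    persistent: float = 1.4


def combined_modifier(
    cfg: ModifierConfig,
    *,
    consolidated_spam: bool = False,
    batch_size: int = 1,
    rapid_repeat: bool = False,
    persistent: bool = False,
) -> float:
    """Multiply together every modifier that applies (assumed behavior)."""
    modifier = 1.0
    if consolidated_spam:
        modifier *= cfg.spam_modifier
    # Assumption: the batch multiplier applies at or above the threshold.
    if cfg.batch_enabled and batch_size >= cfg.batch_threshold:
        modifier *= cfg.batch_multiplier
    if rapid_repeat:
        modifier *= cfg.rapid_repeat
    if persistent:
        modifier *= cfg.persistent
    return modifier


cfg = ModifierConfig()
print(combined_modifier(cfg))                          # 1.0
print(combined_modifier(cfg, consolidated_spam=True))  # 1.5
```

If the real system adds modifiers or takes only the largest one, the combined value would differ, so it is worth verifying against logged scores.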

Scoring Examples

Simple Spam

Severity: 2 (low)

Weight: 0.5 (spam)

Modifier: 1.5 (spam)

Base Score: 2 × 0.5 × 1.5 = 1.5

Toxic Behavior

Severity: 4 (high)

Weight: 1.0 (toxicity)

Modifier: 1.0 (none)

Base Score: 4 × 1.0 × 1.0 = 4.0

Hate Speech

Severity: 5 (severe)

Weight: 2.0 (hate speech)

Modifier: 1.0 (none)

Base Score: 5 × 2.0 × 1.0 = 10.0

Threshold Mapping

Score to Punishment

punishment_score_thresholds:
  warn: 1.0     # Even low scores
  mute: 3.0     # Medium severity
  tempban: 8.0  # High severity
  ban: 20.0     # Extreme cases

Example Mappings

Score 1.5: Warning

Score 4.0: Mute

Score 9.0: Tempban

Score 21.0: Ban
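The mapping from score to punishment can be sketched as a lookup that picks the harshest punishment whose threshold the score reaches. The function name is hypothetical, and treating each threshold as inclusive (a score of exactly 3.0 results in a mute) is an assumption; scores below the lowest threshold are assumed to result in no punishment.

```python
from typing import Optional

# Mirrors the punishment_score_thresholds configuration.
PUNISHMENT_SCORE_THRESHOLDS = {
    "warn": 1.0,
    "mute": 3.0,
    "tempban": 8.0,
    "ban": 20.0,
}


def punishment_for(score: float, thresholds=PUNISHMENT_SCORE_THRESHOLDS) -> Optional[str]:
    """Return the harshest punishment whose threshold the score meets, or None."""
    result = None
    # Walk thresholds from lowest to highest, keeping the last one reached.
    for name, minimum in sorted(thresholds.items(), key=lambda item: item[1]):
        if score >= minimum:
            result = name
    return result


for score in (1.5, 4.0, 9.0, 21.0):
    print(score, punishment_for(score))
```

Sorting by threshold value means the result does not depend on the order in which the punishments are listed in the configuration.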

Configuration Tips

Weight Balancing Guidelines

• Keep weights proportionate to each other (with the defaults, hate_speech at 2.0 counts twice as much as toxicity at 1.0)

• Consider community impact of each violation type

• Test combinations thoroughly

• Monitor punishment distribution

Best Practices

Regular Review

• Monitor punishment distribution

• Check false positive rates

• Review edge cases

• Adjust as needed
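To monitor punishment distribution, one simple approach is to tally which punishment each recent final score maps to. The sketch below assumes you can export a list of final scores; the function names are hypothetical, the thresholds are the example values from this page, and the sample scores are made-up illustration data.

```python
from collections import Counter
from typing import Optional

THRESHOLDS = {"warn": 1.0, "mute": 3.0, "tempban": 8.0, "ban": 20.0}


def punishment_for(score: float) -> Optional[str]:
    """Harshest punishment whose threshold the score meets (inclusive, assumed)."""
    reached = [name for name, minimum in THRESHOLDS.items() if score >= minimum]
    return max(reached, key=THRESHOLDS.get) if reached else None


def punishment_distribution(scores) -> dict:
    """Fraction of scores that fall under each punishment ("none" if below warn)."""
    counts = Counter(punishment_for(s) or "none" for s in scores)
    total = sum(counts.values())
    return {name: count / total for name, count in counts.items()} if total else {}


# Made-up scores, for illustration only.
sample_scores = [0.5, 1.5, 1.5, 2.0, 3.5, 4.0, 9.0, 21.0]
for name, share in punishment_distribution(sample_scores).items():
    print(f"{name}: {share:.0%}")
```

A distribution dominated by bans or by "none" can be a signal that weights or thresholds need adjusting.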

Testing

• Use test environment

• Try edge cases

• Verify escalation

• Check combinations

Documentation

• Log changes

• Track effectiveness

• Document rationale

• Share findings

Troubleshooting

Scores Too High

• Lower weights for problematic violation types

• Reduce special modifiers

• Adjust punishment thresholds upward

• Check escalation multipliers

Scores Too Low

• Increase weights for under-punished violations

• Add appropriate modifiers

• Lower punishment thresholds

• Review severity assignments

Inconsistent Results

• Check calculation logic

• Verify weight configurations

• Test edge cases manually

• Enable detailed logging
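When checking a result by hand, it can help to recompute the score outside the plugin and log each factor separately. The sketch below is a standalone reproduction of the formulas on this page using Python's standard `logging` module; it does not use ChatWarden's own logging, and the logger name and `explain_score` function are hypothetical.

```python
import logging

logger = logging.getLogger("chatwarden.scoring")  # hypothetical logger name


def explain_score(severity: int, weight: float,
                  modifier: float = 1.0, escalation: float = 1.0) -> dict:
    """Recompute a score and return every factor alongside the result."""
    base = severity * weight * modifier
    final = base * escalation
    breakdown = {
        "severity": severity,
        "weight": weight,
        "modifier": modifier,
        "escalation": escalation,
        "base_score": base,
        "final_score": final,
    }
    logger.debug("score breakdown: %s", breakdown)
    return breakdown


logging.basicConfig(level=logging.DEBUG)
result = explain_score(severity=2, weight=0.5, modifier=1.5, escalation=2.0)
print(result["base_score"], result["final_score"])  # 1.5 3.0
```

Comparing this breakdown with the score the plugin reports shows which factor (weight, modifier, or escalation) accounts for a discrepancy.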