ChatWarden Batch Processing System


The batch processing system is a core component of VoxelAI's ChatWarden module. It analyzes large volumes of chat messages efficiently while keeping API usage and server impact under control.

System Overview

Message Grouping

Groups chat messages into optimized batches for efficient processing.

API Management

Manages API rate limits and quotas automatically for optimal usage.

Timely Detection

Ensures timely violation detection with configurable delay limits.

Server Performance

Prevents server performance impact through efficient processing.

Configuration

Batch Processing Settings

Core configuration for the batch processing system:

batch_processing:
  enabled: true
  chunk_size: 25               # Messages per batch
  process_interval_seconds: 20 # Process every 20 seconds
  max_batch_delay_minutes: 2   # Maximum message delay
  queue_limit: 200             # Maximum queue size
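The settings above can be represented as a typed config object. This is a minimal illustrative sketch: the class name, loader function, and defaults mirror the YAML keys shown, but they are assumptions, not ChatWarden's actual API.

```python
from dataclasses import dataclass

@dataclass
class BatchConfig:
    # Defaults mirror the documented YAML values
    enabled: bool = True
    chunk_size: int = 25
    process_interval_seconds: int = 20
    max_batch_delay_minutes: int = 2
    queue_limit: int = 200

def load_batch_config(raw: dict) -> BatchConfig:
    # Pull only the known keys from the batch_processing section,
    # falling back to the documented defaults for anything missing
    section = raw.get("batch_processing", {})
    known = {k: v for k, v in section.items()
             if k in BatchConfig.__dataclass_fields__}
    return BatchConfig(**known)

cfg = load_batch_config({"batch_processing": {"chunk_size": 25, "queue_limit": 200}})
```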

Key Parameters Explained

Chunk Size (25)

• Optimized for Gemma 3 27B token limits

• Balances processing speed and accuracy

• Adjustable based on server activity

• Prevents API token overflow

Process Interval (20s)

• Regular processing cycles

• Prevents API rate limit issues

• Maintains responsive moderation

• Configurable based on server needs

Max Batch Delay (2m)

• Maximum time a message waits

• Ensures timely violation detection

• Prevents delayed moderation

• Emergency processing trigger
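Taken together, the three parameters above define when a batch is dispatched: the chunk fills up, the regular interval elapses, or the oldest queued message hits the maximum delay (the emergency trigger). A minimal sketch, assuming hypothetical function and parameter names:

```python
def should_process(queue_len: int, oldest_age_s: float, since_last_run_s: float,
                   chunk_size: int = 25, interval_s: int = 20,
                   max_delay_s: int = 120) -> bool:
    # A full chunk is ready for processing
    if queue_len >= chunk_size:
        return True
    # Emergency trigger: the oldest message has waited the full 2 minutes
    if oldest_age_s >= max_delay_s:
        return True
    # Regular 20-second processing cycle, if anything is queued
    if queue_len > 0 and since_last_run_s >= interval_s:
        return True
    return False
```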

200
Queue Limit

• Prevents memory issues

• Handles traffic spikes

• Auto-scales processing

• Overflow protection
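One common way to implement this kind of overflow protection is a bounded queue that discards the oldest entry when full, so memory stays flat during traffic spikes. The sketch below is illustrative; the class and counter names are assumptions, not ChatWarden internals.

```python
from collections import deque

class BoundedQueue:
    def __init__(self, limit: int = 200):
        self.limit = limit
        self.items = deque()
        self.dropped = 0  # overflow counter, useful for monitoring

    def offer(self, msg: str) -> None:
        # At the limit: drop the oldest message to make room
        if len(self.items) >= self.limit:
            self.items.popleft()
            self.dropped += 1
        self.items.append(msg)

q = BoundedQueue(limit=3)
for i in range(5):
    q.offer(f"msg{i}")
# queue now holds only the 3 newest messages; 2 were dropped
```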

System Components

BatchPunishmentScorer

• Calculates violation scores

• Applies escalation rules

• Determines punishment levels

• Manages violation history
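An escalation rule like the one described might scale a base violation score by the player's recent history and map the result to a punishment tier. The multipliers and tier cut-offs below are purely illustrative assumptions, not ChatWarden's real scoring rules.

```python
def punishment_level(base_score: float, prior_violations: int) -> str:
    # Escalation: each prior violation raises the effective score by 50%
    score = base_score * (1.0 + 0.5 * prior_violations)
    # Illustrative tier cut-offs
    if score >= 9.0:
        return "ban"
    if score >= 6.0:
        return "mute"
    if score >= 3.0:
        return "warn"
    return "none"
```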

MessageChunker

• Groups messages efficiently

• Optimizes batch sizes

• Handles message priorities

• Manages processing order
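The chunker's job can be sketched as two steps: order messages by priority, then cut the ordered list into fixed-size chunks. Function and parameter names here are assumptions for illustration.

```python
def make_chunks(messages: list[tuple[int, str]],
                chunk_size: int = 25) -> list[list[str]]:
    # messages are (priority, text); higher priority is processed first
    ordered = [text for _, text in sorted(messages, key=lambda m: -m[0])]
    # Cut the ordered stream into fixed-size chunks
    return [ordered[i:i + chunk_size] for i in range(0, len(ordered), chunk_size)]
```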

ViolationCache

• Stores recent violations

• Tracks escalation patterns

• Manages violation expiry

• Optimizes lookup speed
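A violation cache with expiry is essentially a TTL map: entries older than the window stop counting toward escalation. A minimal sketch, with the class name, TTL value, and entry layout all assumed for illustration:

```python
class ViolationCache:
    def __init__(self, ttl_seconds: float = 3600.0):
        self.ttl = ttl_seconds
        # player -> (timestamp of last violation, running count)
        self.entries: dict[str, tuple[float, int]] = {}

    def record(self, player: str, now: float) -> None:
        _, count = self.entries.get(player, (now, 0))
        self.entries[player] = (now, count + 1)

    def count(self, player: str, now: float) -> int:
        entry = self.entries.get(player)
        # Expired or missing entries count as a clean slate
        if entry is None or now - entry[0] > self.ttl:
            self.entries.pop(player, None)
            return 0
        return entry[1]
```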

Processing Flow

Message Reception Flow

Player Message → Queue → Batch Formation → AI Analysis → Violation Detection → Punishment
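The flow above can be sketched as a sequential pipeline. The AI analysis step is stubbed out here, and all names are illustrative assumptions rather than ChatWarden's actual interfaces.

```python
def analyze(chunk: list[str]) -> list[tuple[str, bool]]:
    # Stub for the AI analysis stage: flags messages containing "spam"
    return [(m, "spam" in m) for m in chunk]

def process_pipeline(queue: list[str], chunk_size: int = 25) -> list[str]:
    punished = []
    for i in range(0, len(queue), chunk_size):   # batch formation
        chunk = queue[i:i + chunk_size]
        for msg, violation in analyze(chunk):    # AI analysis (stubbed)
            if violation:                        # violation detection
                punished.append(msg)             # punishment step
    return punished
```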

Performance Optimization

API Usage

• Automatic model switching

• Rate limit management

• Quota optimization

• Emergency fallback options

Memory Management

• Efficient message storage

• Automatic queue cleanup

• Cache optimization

• Resource monitoring

Server Impact

• Minimal CPU usage

• Controlled memory usage

• Non-blocking operations

• Async processing

Monitoring

Health Metrics

• Queue size

• Processing latency

• API quota usage

• Error rates

Performance Indicators

• Messages per second

• Average processing time

• Batch completion rate

• Queue wait times

Emergency Handling

Queue Overflow

1. Emergency processing mode

2. Increased batch size

3. Reduced analysis depth

4. Priority message handling
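The overflow steps above amount to a mode switch driven by queue usage: past a threshold, batches grow and analysis depth drops. The threshold, multiplier, and mode names below are illustrative assumptions.

```python
def emergency_params(queue_len: int, limit: int = 200,
                     base_chunk: int = 25) -> tuple[int, str]:
    usage = queue_len / limit
    # Near capacity: double the batch size and reduce analysis depth
    if usage >= 0.9:
        return base_chunk * 2, "shallow"
    # Normal operation: standard batch size, full analysis
    return base_chunk, "full"
```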

API Issues

1. Model fallback

2. Reduced batch size

3. Increased intervals

4. Local cache usage
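The API degradation ladder above can be sketched as a function of consecutive failures: each step moves to the next fallback model, shrinks the batch, and widens the interval. Tier names and scaling factors are assumptions for illustration.

```python
def degrade(failures: int,
            models: tuple[str, ...] = ("primary", "fallback", "local-cache")):
    # Each consecutive failure steps one tier down the ladder
    tier = min(failures, len(models) - 1)
    chunk = max(5, 25 >> tier)       # roughly halve the batch size per tier
    interval = 20 * (tier + 1)       # widen the processing interval
    return models[tier], chunk, interval
```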

Best Practices

Configuration

• Adjust based on server size

• Monitor and tune parameters

• Balance speed vs. accuracy

• Consider peak hours

Maintenance

• Regular cache cleanup

• Monitor queue size

• Check error logs

• Update thresholds

Troubleshooting

• Check queue status

• Monitor API errors

• Verify rate limits

• Review batch logs