Moderation Pipeline (Code Level)
C4 Level 4 detail of the PSI moderation system. This page shows the internal structure of the moderation module, its two-pass AI flow, and the standalone moderation service.
Module Structure
The moderation system spans two codebases: the built-in moderation module in the PSI server, and the standalone moderation service (moderation-service/).
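The two adapter interfaces that appear in the class diagram below can be sketched in TypeScript roughly as follows. Only the method names `chatCompletion(messages, options)` and `analyze(text, languages)` come from this page; the message, option, and score shapes, the `FakeLLMAdapter` test double, and the `selectLLMAdapter` helper are illustrative assumptions rather than the actual PSI types.

```typescript
// Hypothetical type shapes; the diagram only names the methods and their parameters.
interface ChatMessage {
  role: "system" | "user" | "assistant";
  content: string;
}

interface ChatOptions {
  model: string;
  temperature?: number;
}

interface LLMResponse {
  content: string;
}

// Mirrors the <<interface>> LLMAdapter in the class diagram.
interface LLMAdapter {
  chatCompletion(messages: ChatMessage[], options: ChatOptions): Promise<LLMResponse>;
}

// Assumed score shape: attribute name mapped to a score.
type ToxicityScores = Record<string, number>;

// Mirrors the <<interface>> JigsawAdapter in the class diagram.
interface JigsawAdapter {
  analyze(text: string, languages: string[]): Promise<ToxicityScores>;
}

// Test double (an assumption, not part of PSI): returns a canned reply and never
// contacts a real provider.
class FakeLLMAdapter implements LLMAdapter {
  constructor(private readonly cannedReply: string) {}

  chatCompletion(_messages: ChatMessage[], _options: ChatOptions): Promise<LLMResponse> {
    return Promise.resolve({ content: this.cannedReply });
  }
}

// Provider keys are assumed names for the three LLM implementations in the diagram.
type LLMProvider = "openai" | "azure" | "zdf";

// Looks up the adapter for a provider; callers depend only on the interface.
function selectLLMAdapter(
  provider: LLMProvider,
  registry: Record<LLMProvider, LLMAdapter>,
): LLMAdapter {
  return registry[provider];
}
```

Because callers see only `LLMAdapter`, swapping OpenAI for Azure or the ZDF variant in this sketch is a one-line change in whatever code picks the provider.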
classDiagram
class ModerationModule {
+publicFunctions
+adminFunctions
checkCommentWithGpt(store, params)
checkCommentWithGptHeavy(store, params)
recordJudgement(store, params)
getJudgementHistory(store, params)
}
class JigsawModule {
+publicFunctions
measureToxic(store, params)
}
class PremodReviewModule {
+publicFunctions
+adminFunctions
recordPremodData(store, params)
getPremodReviewData(store, params)
annotatePremodData(store, params)
}
class LLMAdapter {
<<interface>>
+chatCompletion(messages, options) Promise~Response~
}
class OpenAIAdapter {
+chatCompletion(messages, options)
}
class AzureOpenAIAdapter {
+chatCompletion(messages, options)
}
class ZDFOpenAIAdapter {
+chatCompletion(messages, options)
}
class JigsawAdapter {
<<interface>>
+analyze(text, languages) Promise~ToxicityScores~
}
class PerspectiveAPIAdapter {
+analyze(text, languages)
}
class ZDFJigsawAdapter {
+analyze(text, languages)
}
ModerationModule --> LLMAdapter : uses for AI checks
ModerationModule --> JigsawAdapter : uses for toxicity
ModerationModule --> PremodReviewModule : stores flagged data
JigsawModule --> JigsawAdapter : direct toxicity measurement
LLMAdapter <|.. OpenAIAdapter
LLMAdapter <|.. AzureOpenAIAdapter
LLMAdapter <|.. ZDFOpenAIAdapter
JigsawAdapter <|.. PerspectiveAPIAdapter
JigsawAdapter <|.. ZDFJigsawAdapter
Two-Pass Moderation Flow (Detailed)
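As a companion to the sequence diagram below, here is a minimal sketch of the two-pass control flow: the light model and the toxicity check run in parallel, and the heavy model is consulted only when the light model returns `"flag"`. The `"approve"` and `"reject"` decision values, the `TwoPassDeps` shape, and the function name `moderateTwoPass` are assumptions made for illustration. Storage, pre-moderation review, and email steps are left out.

```typescript
// "flag" comes from the sequence diagram; the other two values are assumed.
type Decision = "approve" | "reject" | "flag";

interface Judgement {
  decision: Decision;
  explanation: string;
  confidence: number;
}

type ToxicityScores = Record<string, number>;

// Hypothetical dependency bundle so the flow can be exercised without real providers.
interface TwoPassDeps {
  light(comment: string): Promise<Judgement>;
  heavy(comment: string, lightResult: Judgement): Promise<Judgement>;
  toxicity(comment: string): Promise<ToxicityScores>;
}

interface TwoPassResult {
  final: Judgement;
  toxicityScores: ToxicityScores;
  usedHeavyModel: boolean;
}

async function moderateTwoPass(comment: string, deps: TwoPassDeps): Promise<TwoPassResult> {
  // Pass 1: light model and toxicity scoring run concurrently (the "par" block).
  const [lightResult, toxicityScores] = await Promise.all([
    deps.light(comment),
    deps.toxicity(comment),
  ]);

  // A clear approve or reject from the light model ends the flow here.
  if (lightResult.decision !== "flag") {
    return { final: lightResult, toxicityScores, usedHeavyModel: false };
  }

  // Pass 2: the heavy model re-evaluates and receives the light result as context.
  const heavyResult = await deps.heavy(comment, lightResult);
  return { final: heavyResult, toxicityScores, usedHeavyModel: true };
}
```

Injecting the three calls as plain functions keeps the sketch independent of any particular adapter or provider.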
sequenceDiagram
participant Client as PSI Frontend
participant Backend as PSI Backend (Hono)
participant Prompts as server/prompts/
participant Light as Light Model (gpt-4o-mini)
participant Heavy as Heavy Model (gpt-4o)
participant Jigsaw as Perspective API
participant Store as ServerStore
participant PremodReview as PremodReview Module
participant Email as Email Adapter
Client->>Backend: POST /api/moderation/checkCommentWithGpt
Backend->>Prompts: Load moderate.txt (system prompt)
Prompts-->>Backend: System prompt with community guidelines
par Parallel AI checks
Backend->>Light: chatCompletion(system + user prompt)
Note over Light: User prompt includes:<br/>comment text, parent context,<br/>community guidelines, response format
Light-->>Backend: JSON: {decision, explanation, confidence}
Backend->>Jigsaw: analyze(commentText, [language])
Jigsaw-->>Backend: {toxicity, severeToxicity, insult, threat, ...}
end
Backend->>Store: Record light model result + toxicity scores
alt decision = "flag" (needs heavy review)
Backend->>Prompts: Load heavy/moderate.txt
Backend->>Heavy: chatCompletion(system + user prompt + light result)
Heavy-->>Backend: JSON: {decision, explanation, confidence}
Backend->>Store: Record heavy model result
end
alt PREMOD_REVIEW_ENABLED = true AND comment rejected
Backend->>PremodReview: recordPremodData(comment, decision)
Note over PremodReview: Stores comment without user ID<br/>30-day TTL for GDPR compliance
end
alt Comment rejected
Backend->>Email: Send rejection notification to user
end
Backend-->>Client: {decision, explanation, toxicityScores}
Comment State Machine
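The transitions in the state diagram below can be encoded as a lookup table, which makes illegal jumps (for example, `Submitted` straight to `Published`) easy to reject. The state names are taken from the diagram. The `TRANSITIONS` table and the `canTransition` and `transition` helpers are a hypothetical sketch, not the PSI implementation.

```typescript
type CommentState =
  | "Submitted"
  | "LightModelCheck"
  | "HeavyModelCheck"
  | "PendingReview"
  | "Approved"
  | "Rejected"
  | "Published"
  | "Hidden";

// Allowed next states for each state, copied from the state diagram.
const TRANSITIONS: Record<CommentState, readonly CommentState[]> = {
  Submitted: ["LightModelCheck"],
  LightModelCheck: ["Approved", "Rejected", "HeavyModelCheck"],
  HeavyModelCheck: ["Approved", "Rejected", "PendingReview"],
  PendingReview: ["Approved", "Rejected"],
  Approved: ["Published"],
  Rejected: ["Hidden"],
  Published: [], // terminal
  Hidden: [], // terminal
};

function canTransition(from: CommentState, to: CommentState): boolean {
  return TRANSITIONS[from].includes(to);
}

// Returns the new state, or throws if the diagram has no such edge.
function transition(from: CommentState, to: CommentState): CommentState {
  if (!canTransition(from, to)) {
    throw new Error(`Illegal transition: ${from} -> ${to}`);
  }
  return to;
}
```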
stateDiagram-v2
[*] --> Submitted: User posts comment
Submitted --> LightModelCheck: Sent to gpt-4o-mini
LightModelCheck --> Approved: Light model approves
LightModelCheck --> Rejected: Light model rejects
LightModelCheck --> HeavyModelCheck: Light model flags
HeavyModelCheck --> Approved: Heavy model approves
HeavyModelCheck --> Rejected: Heavy model rejects
HeavyModelCheck --> PendingReview: Heavy model uncertain
PendingReview --> Approved: Moderator approves
PendingReview --> Rejected: Moderator rejects
Approved --> Published: Visible to all users
Rejected --> Hidden: Not visible, user notified
Published --> [*]
Hidden --> [*]
Pre-Moderation Review Data Flow
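Two steps in the flow below, removing the user ID and storing with a 30-day TTL, are simple enough to sketch directly. The field names, the `toPremodRecord` and `isExpired` helpers, and the use of millisecond timestamps are assumptions for illustration. The actual storage schema is not shown on this page.

```typescript
// Hypothetical input shape for a comment that was blocked by AI moderation.
interface ModeratedComment {
  commentId: string;
  userId: string;
  text: string;
  decision: string;
  explanation: string;
}

// Hypothetical stored shape: same data minus the user ID, plus expiry metadata.
interface PremodRecord {
  commentId: string;
  text: string;
  decision: string;
  explanation: string;
  recordedAt: number; // epoch milliseconds
  expiresAt: number; // epoch milliseconds
}

const PREMOD_TTL_MS = 30 * 24 * 60 * 60 * 1000; // 30 days

// Copies only the fields needed for review; userId is deliberately not copied.
function toPremodRecord(comment: ModeratedComment, now: number): PremodRecord {
  return {
    commentId: comment.commentId,
    text: comment.text,
    decision: comment.decision,
    explanation: comment.explanation,
    recordedAt: now,
    expiresAt: now + PREMOD_TTL_MS,
  };
}

// A record is considered expired once the TTL has fully elapsed.
function isExpired(record: PremodRecord, now: number): boolean {
  return now >= record.expiresAt;
}
```

Building the record field by field, rather than copying the comment and deleting `userId`, means a field added to the comment later does not leak into review data by default.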
graph TD
subgraph "Comment Submission"
A[User submits comment] --> B{AI moderation}
B -->|Approved| C[Published]
B -->|Rejected/Flagged| D[Comment blocked]
end
subgraph "Pre-Mod Review Pipeline"
D --> E[Strip PII: remove user ID]
E --> F[Store with 30-day TTL]
F --> G[Available in Moderation Dashboard]
end
subgraph "Human Review"
G --> H[Moderator reviews AI decision]
H --> I{Correct?}
I -->|Yes| J[Mark as true positive/negative]
I -->|No| K[Mark as false positive/negative]
end
subgraph "Feedback Loop"
J --> L[AI Evaluation Dataset]
K --> L
L --> M[Model accuracy metrics]
L --> N[psi-ai-eval analysis]
end
Prompt Architecture
The moderation system uses structured prompts stored as text files:
| Prompt File | Purpose | Used By |
|---|---|---|
| moderate.txt | Core moderation: evaluate against community guidelines | Light model (gpt-4o-mini) |
| heavy/moderate.txt | Detailed re-evaluation with additional context | Heavy model (gpt-4o) |
| moderatePerspective.txt | Perspective-aware moderation variant | Alternative flow |
| comment_quality.txt | Score comment quality (constructiveness) | Ranking module |
| comment_bridging.txt | Score bridging potential (cross-perspective) | Ranking module |
| name_check.txt | Check if display name violates guidelines | Profile module |
| name_looks_real.txt | Assess if a name appears realistic | Profile module |
| tag_article.txt | Auto-tag articles for categorization | Article module |
| translate.txt | Translate user content between languages | Translation module |
| conversationhelper_*.txt (5) | AI-assisted conversation guidance | ZDF conversation helper |
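To illustrate how a prompt file from the table might be combined with a comment, the sketch below builds the system and user messages for one moderation call. The `PromptLoader` callback, the `CommentInput` shape, the section labels in the user message, and `buildModerationMessages` itself are all assumptions. The real prompt templates and loading code are not reproduced here.

```typescript
// Hypothetical loader: maps a prompt file name (e.g. "moderate.txt") to its contents.
// Injecting it keeps the sketch independent of the file system.
type PromptLoader = (fileName: string) => string;

interface ChatMessage {
  role: "system" | "user";
  content: string;
}

interface CommentInput {
  text: string;
  parentText?: string; // parent comment, when the comment is a reply
}

function buildModerationMessages(
  promptFile: string,
  input: CommentInput,
  loadPrompt: PromptLoader,
): ChatMessage[] {
  // The prompt file supplies the system message.
  const systemPrompt = loadPrompt(promptFile);

  // The user message carries the comment and, if present, its parent for context.
  const userParts: string[] = [];
  if (input.parentText !== undefined) {
    userParts.push(`Parent comment:\n${input.parentText}`);
  }
  userParts.push(`Comment to review:\n${input.text}`);

  return [
    { role: "system", content: systemPrompt },
    { role: "user", content: userParts.join("\n\n") },
  ];
}
```

In this sketch, switching from the light to the heavy pass is only a matter of passing `heavy/moderate.txt` instead of `moderate.txt`.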
Standalone Moderation Service
The standalone service (moderation-service/) mirrors the adapter pattern:
graph TD
subgraph "Moderation Service API (Express 4)"
Routes[Express Routes]
ModerationService[ModerationService class]
BlocklistService[BlocklistService class]
end
subgraph "Adapter Layer"
DBAdapter[Database Adapter]
LLMAdapter2[LLM Adapter]
JigsawAdapter2[Jigsaw Adapter]
EmailAdapter[Email Adapter]
end
subgraph "External Services"
Firestore[(Firebase Firestore)]
MongoDB[(MongoDB 7)]
OpenAI2[OpenAI API]
Perspective[Perspective API]
Sentry[Sentry]
end
Routes --> ModerationService
Routes --> BlocklistService
ModerationService --> LLMAdapter2
ModerationService --> JigsawAdapter2
ModerationService --> DBAdapter
ModerationService --> EmailAdapter
BlocklistService --> DBAdapter
DBAdapter --> Firestore
DBAdapter -.-> MongoDB
LLMAdapter2 --> OpenAI2
JigsawAdapter2 --> Perspective
Further Reading
- Moderation Pipeline Overview -- C4 Level 3 view
- Adapter Pattern -- how adapters enable provider switching
- Pre-Moderation Review (psi-product) -- detailed review workflow