POST /api/v2/moderation/check
Runs the content-moderation engine against the supplied text without blocking and returns structured findings, so a buyer agent can validate input before submitting it to a blocking surface. It uses the same engine as the brief screening on POST /campaigns, so wouldBlock: true predicts a 422 on the real call.
Request
Parameters
| Field | Type | Required | Notes |
|---|---|---|---|
| text | string | Yes | Text to evaluate, 1–10,000 chars. Exceeding 10,000 chars returns a 400 VALIDATION_ERROR |
| direction | enum | No | input (default) runs brief/jailbreak/policy patterns; output runs LLM-output (refusal/identity/PII) patterns |
| surface | string | No | Surface label for metric attribution, 1–100 chars. Defaults to moderation.check |
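As a rough illustration of the parameters above, the following sketch validates the documented limits client-side and builds the POST request without sending it. It assumes the fields travel as a JSON body, which this page does not state explicitly. The helper names and the example base URL are hypothetical, and authentication is omitted because it is not documented here.

```python
import json
import urllib.request

# Limits taken from the parameter table above.
MAX_TEXT_CHARS = 10_000
MAX_SURFACE_CHARS = 100
DIRECTIONS = ("input", "output")


def build_check_payload(text, direction="input", surface=None):
    """Check arguments against the documented limits and return the body dict."""
    if not 1 <= len(text) <= MAX_TEXT_CHARS:
        raise ValueError("text must be 1 to 10,000 chars")
    if direction not in DIRECTIONS:
        raise ValueError("direction must be 'input' or 'output'")
    payload = {"text": text, "direction": direction}
    if surface is not None:
        if not 1 <= len(surface) <= MAX_SURFACE_CHARS:
            raise ValueError("surface must be 1 to 100 chars")
        payload["surface"] = surface
    return payload


def build_check_request(base_url, payload):
    """Build (but do not send) the request. base_url is a placeholder;
    a JSON body is an assumption, and any auth header is left out."""
    return urllib.request.Request(
        url=base_url.rstrip("/") + "/api/v2/moderation/check",
        data=json.dumps(payload).encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="POST",
    )
```

Validating locally avoids a round trip for the cases the endpoint would reject with 400 VALIDATION_ERROR. Omitting surface lets the server apply its moderation.check default.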
Response
passed is true only when no patterns matched. wouldBlock is true when at least one finding would trigger a 422 on a blocking surface. findings reports both blocking and non-blocking matches, so medium-severity signals are visible even when wouldBlock is false. Each finding carries a category, a severity (low, medium, high, critical), and a suggestion.
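One way a caller might act on these fields is sketched below. The function name, the decision labels, and the sample response are hypothetical. The sketch also assumes that the severity list above is in ascending order, and the category values shown are placeholders rather than documented values.

```python
# Assumed ascending order, following the order the severities are listed in.
SEVERITY_ORDER = {"low": 0, "medium": 1, "high": 2, "critical": 3}


def summarize_check(response):
    """Return (decision, findings) with the most severe findings first."""
    findings = sorted(
        response.get("findings", []),
        key=lambda f: SEVERITY_ORDER.get(f["severity"], -1),
        reverse=True,
    )
    if response["wouldBlock"]:
        decision = "revise"  # a blocking surface would answer 422
    elif response["passed"]:
        decision = "submit"  # no patterns matched
    else:
        decision = "submit_with_warnings"  # only non-blocking matches
    return decision, findings


# Hand-written sample shaped like the description above, not real API output.
sample = {
    "passed": False,
    "wouldBlock": False,
    "findings": [
        {"category": "policy", "severity": "low", "suggestion": "Soften the claim."},
        {"category": "policy", "severity": "medium", "suggestion": "Remove the guarantee."},
    ],
}
decision, ordered = summarize_check(sample)
```

The decision is keyed on wouldBlock rather than on severity, since this page does not say which severities trigger a block.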
Errors
400 VALIDATION_ERROR: empty text, text over 10,000 chars, or an unknown direction.
Related
Moderation tasks: all moderation operations
Moderation overview: what it is and when to use it