How anonymize.today Works

Deterministic, regex-based PII detection that delivers 100% reproducible results. Same input, same output, every time. No AI, no guessing, just transparent pattern matching.

Why Regex, Not AI?

Our Approach

  • 100% reproducible results
  • Fully auditable for compliance
  • No training data required
  • Transparent decision making
  • Fast, predictable performance
  • No model drift over time

AI/ML Approaches

  • Results vary between runs
  • Black box decision making
  • Requires training data
  • Difficult to audit
  • Higher compute costs
  • Model drift over time

The 10-Step Process

From input to output, here's exactly what happens to your document

1. Input Text

Submit your document via the web interface, the API, or the Word Add-in

2. Language Detection

The system identifies the document language for optimal processing

3. Tokenization

Text is broken into tokens for pattern matching
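As a rough illustration of this step, the sketch below splits text into tokens and keeps each token's character offsets, so that later matches can be mapped back to positions in the original document. The `TOKEN_RE` pattern and the `tokenize` helper are assumptions made for this example, not the service's actual tokenizer.

```python
import re
from typing import List, Tuple

# Hypothetical token rule: a run of word characters, or one punctuation mark.
TOKEN_RE = re.compile(r"\w+|[^\w\s]")


def tokenize(text: str) -> List[Tuple[str, int, int]]:
    """Return (token, start, end) triples with offsets into the original text."""
    return [(m.group(), m.start(), m.end()) for m in TOKEN_RE.finditer(text)]


for token, start, end in tokenize("Call Ana: 555-0100"):
    print(f"{start:>2}-{end:<2} {token}")
```

Keeping offsets alongside each token is what allows a later step to report where in the document a detection was found.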

4. Pattern Matching

Regex patterns scan for 256 entity types
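To make the idea concrete, here is a minimal sketch of a regex scan over a text. The two patterns, the entity names, and the `Detection` and `scan` names are illustrative assumptions; the service's actual 256 patterns are not shown on this page.

```python
import re
from dataclasses import dataclass
from typing import List


@dataclass(frozen=True)
class Detection:
    entity_type: str
    start: int
    end: int
    text: str


# Illustrative patterns only (assumed for this example).
PATTERNS = {
    "EMAIL_ADDRESS": re.compile(r"\b[A-Za-z0-9._%+-]+@[A-Za-z0-9.-]+\.[A-Za-z]{2,}\b"),
    "US_SSN": re.compile(r"\b\d{3}-\d{2}-\d{4}\b"),
}


def scan(text: str) -> List[Detection]:
    """Run every pattern over the text and return matches in document order."""
    found = [
        Detection(name, m.start(), m.end(), m.group())
        for name, rx in PATTERNS.items()
        for m in rx.finditer(text)
    ]
    return sorted(found, key=lambda d: (d.start, d.end))


sample = "Mail jane@example.com, SSN 123-45-6789."
for d in scan(sample):
    print(d.entity_type, d.start, d.end, d.text)
```

Because the scan is a pure function of the text and a fixed pattern table, calling it twice on the same input yields the same list of detections.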

5. Context Analysis

Surrounding text improves detection accuracy
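One common way to use surrounding text is to raise the score of a match when a related keyword appears nearby. The sketch below shows that idea under stated assumptions: the keyword list, the base score, the boost, and the window size are all hypothetical values chosen for illustration, not the service's actual settings.

```python
import re
from typing import List, Tuple

SSN_RE = re.compile(r"\b\d{3}-\d{2}-\d{4}\b")

# All values below are assumptions for this example.
CONTEXT_WORDS = {"ssn", "social", "security"}
BASE_SCORE = 0.5
CONTEXT_BOOST = 0.35
WINDOW = 30  # characters inspected on each side of a match


def score_ssn_candidates(text: str) -> List[Tuple[str, float]]:
    """Score each SSN-shaped match, boosting it when a context word is nearby."""
    results = []
    for m in SSN_RE.finditer(text):
        window = text[max(0, m.start() - WINDOW): m.end() + WINDOW].lower()
        nearby_words = set(re.findall(r"[a-z]+", window))
        boost = CONTEXT_BOOST if nearby_words & CONTEXT_WORDS else 0.0
        results.append((m.group(), round(min(BASE_SCORE + boost, 1.0), 2)))
    return results


print(score_ssn_candidates("SSN: 123-45-6789"))
print(score_ssn_candidates("Order 123-45-6789 shipped"))
```

The same digits score higher next to "SSN" than next to "Order", and the rule stays deterministic because it depends only on the text and fixed constants.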

6. Confidence Scoring

Each detection receives a confidence score

7. Entity Classification

Detected items are categorized by type

8. Review Results

See all detections with positions and scores

9. Apply Anonymization

Choose your method: Replace, Redact, Hash, Encrypt, or Mask
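The sketch below shows one plausible reading of four of these methods applied to detected spans. The function names, the placeholder format, the choice of SHA-256, and the "keep last four characters" masking rule are assumptions for illustration. Encrypt is left out because it needs a cryptography library and key handling that this page does not describe.

```python
import hashlib
from typing import Callable, List, Tuple

Span = Tuple[int, int, str]  # (start, end, entity_type)


def replace(value: str, entity_type: str) -> str:
    return f"<{entity_type}>"  # assumed placeholder format


def redact(value: str, entity_type: str) -> str:
    return ""  # remove the value entirely


def hash_value(value: str, entity_type: str) -> str:
    return hashlib.sha256(value.encode("utf-8")).hexdigest()


def mask(value: str, entity_type: str, keep_last: int = 4, mask_char: str = "*") -> str:
    keep = value[-keep_last:] if keep_last else ""
    return mask_char * (len(value) - len(keep)) + keep


def anonymize(text: str, detections: List[Span], operator: Callable[[str, str], str]) -> str:
    """Apply one operator to every non-overlapping span, working right to left
    so that earlier offsets stay valid while the text changes length."""
    out = text
    for start, end, entity_type in sorted(detections, reverse=True):
        out = out[:start] + operator(text[start:end], entity_type) + out[end:]
    return out


text = "Card 4111111111111111 for jane@example.com"
spans = [(5, 21, "CREDIT_CARD"), (26, 42, "EMAIL_ADDRESS")]
print(anonymize(text, spans, replace))
print(anonymize(text, spans, mask))
```

Hashing maps equal values to equal digests, which can preserve joins across documents, while Redact and Replace discard the value entirely.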

10. Output Document

Download your anonymized document

Frequently Asked Questions

Why does anonymize.today use regex instead of AI for PII detection?
Regex-based detection gives 100% reproducible results: the same input always produces the same output. AI and machine learning models can produce different results between runs, suffer from model drift over time, and operate as black boxes that are difficult to audit. For regulatory compliance under GDPR and ISO 27001, organizations need explainable, repeatable processes, which is exactly what regex-based pattern matching delivers.
How accurate is the PII detection?
anonymize.today provides confidence scores from 0.0 to 1.0 for each detection. Users can set minimum confidence thresholds to control sensitivity. Pattern-based entities like credit card numbers and SSNs achieve 95-99% accuracy, while NLP-based entities like names and locations achieve 85-95% accuracy. The platform supports 256 entity types with carefully crafted patterns for each.
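A minimum confidence threshold can be pictured as a simple filter over scored detections. The following sketch assumes detections are dictionaries with a `score` key; that shape, the `filter_by_confidence` name, and the sample scores are made up for illustration.

```python
from typing import Any, Dict, List


def filter_by_confidence(detections: List[Dict[str, Any]], min_score: float) -> List[Dict[str, Any]]:
    """Keep only detections whose score is at or above the threshold."""
    if not 0.0 <= min_score <= 1.0:
        raise ValueError("min_score must be between 0.0 and 1.0")
    return [d for d in detections if d["score"] >= min_score]


# Made-up sample scores, used only to show the effect of the threshold.
detections = [
    {"entity_type": "CREDIT_CARD", "score": 0.97},
    {"entity_type": "PERSON", "score": 0.62},
    {"entity_type": "LOCATION", "score": 0.41},
]

strict = filter_by_confidence(detections, 0.9)   # fewer, more certain detections
lenient = filter_by_confidence(detections, 0.4)  # more detections, more false positives
print(len(strict), len(lenient))
```

A higher threshold trades recall for precision, and a lower one does the reverse.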
Can I audit how anonymize.today processes my data?
Yes, every detection in anonymize.today shows the exact pattern matched, the confidence score, and the entity type identified. The Analyzer highlights detected entities with category-specific colors and positions within the text. This full transparency makes it straightforward to explain detection decisions to auditors, compliance officers, or data protection authorities.
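For teams that want to keep their own audit trail, one option is to log a record per detection that contains the fields named above but not the sensitive value itself. This is a sketch under assumptions: the field names and the `audit_line` helper are hypothetical and do not describe the service's export format.

```python
import json
from typing import Any, Dict

# Hypothetical field names; the matched text is deliberately not included.
AUDIT_FIELDS = ("entity_type", "pattern_name", "score", "start", "end")


def audit_line(detection: Dict[str, Any]) -> str:
    """Serialize one detection as a JSON line without the matched value."""
    record = {field: detection[field] for field in AUDIT_FIELDS}
    return json.dumps(record, sort_keys=True)


detection = {
    "entity_type": "US_SSN",
    "pattern_name": "us_ssn_dashed",
    "score": 0.85,
    "start": 27,
    "end": 38,
    "text": "123-45-6789",  # present in memory, omitted from the log
}
print(audit_line(detection))
```

Sorting the keys keeps the log line byte-for-byte stable across runs, which makes two audit logs easy to diff.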
What happens to my data during processing?
Text submitted to anonymize.today is sent over TLS 1.3-encrypted connections to ISO 27001-certified servers in Germany. The text is processed in memory using Microsoft Presidio, and results are returned immediately. No user content is stored on the servers after processing, and data never leaves the European Union.
How does anonymize.today handle multiple languages in one text?
anonymize.today uses automatic language detection to identify the primary language of a document. For multi-language texts, users can create custom presets that combine entity types across language boundaries. The platform supports PII detection in 27 languages using spaCy, Stanza, and Transformer models, which allows country-specific entities such as German tax IDs, French NIR numbers, or Japanese My Number IDs to be detected within the same document.
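A preset that spans several languages can be thought of as a merged table of per-language patterns. The sketch below is heavily simplified and hypothetical: the entity names, the `build_preset` and `scan` helpers, and the digit-count patterns are assumptions, and real identifiers carry extra structure such as checksums that is ignored here.

```python
import re
from typing import Dict, Iterable, List, Tuple

# Simplified, assumed patterns keyed by language code. Lookarounds are used
# instead of \b so that digits directly next to letters (for example in
# Japanese text, where there are no spaces) can still be matched.
LANGUAGE_PATTERNS: Dict[str, Dict[str, re.Pattern]] = {
    "de": {"DE_TAX_ID": re.compile(r"(?<!\d)[1-9]\d{10}(?!\d)")},   # 11 digits
    "fr": {"FR_NIR": re.compile(r"(?<!\d)[12]\d{14}(?!\d)")},       # 15 digits
    "ja": {"JP_MY_NUMBER": re.compile(r"(?<!\d)\d{12}(?!\d)")},     # 12 digits
}


def build_preset(languages: Iterable[str]) -> Dict[str, re.Pattern]:
    """Merge the pattern tables of the chosen languages into one preset."""
    preset: Dict[str, re.Pattern] = {}
    for lang in languages:
        preset.update(LANGUAGE_PATTERNS[lang])
    return preset


def scan(text: str, preset: Dict[str, re.Pattern]) -> List[Tuple[str, str]]:
    """Return (entity_type, value) pairs in document order."""
    hits = [
        (m.start(), name, m.group())
        for name, rx in preset.items()
        for m in rx.finditer(text)
    ]
    return [(name, value) for _, name, value in sorted(hits)]


mixed = "Steuer-ID 12345678901, NIR 185057800608436, My Number 123456789012"
print(scan(mixed, build_preset(["de", "fr", "ja"])))
```

Restricting the preset to a single language, for example `build_preset(["de"])`, returns only the German match from the same text.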

See It in Action

Try our PII detection and anonymization free with 300 tokens per month.