Schema drift detection and real-time quality checks at the edge. Every payload screened. Every decision returned in under 10ms.
Pipelines don't fail loudly. They succeed with corrupt, drifted, or incomplete data. By the time a dashboard breaks, the damage is done.
Upstream services add, remove, or rename fields silently. Your ingestion layer has no idea until something breaks downstream.
A field drops from 99% to 40% completeness between batches. No alert fires. Thousands of bad rows enter your warehouse unchecked.
A numeric field becomes a string. ML features break. Aggregations silently return wrong results. Downstream effects cascade for hours.
By the time a dashboard shows NaN, the cause was at ingestion four hours ago. The cost is time, trust, and a war-room call.
One endpoint between your sources and your storage. No dashboards. No manual rules. No delays.
One POST per payload. No SDK required. Works with any language, framework, or data format. Integrate in under 30 minutes.
Schema fingerprinting, null rates, type stability, percentile distribution — all computed in-memory at the edge in under 10ms.
A structured JSON verdict with health score, issue breakdown, and drift flags. Your pipeline acts on it. Bad data never reaches storage.
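The request/verdict loop above can be sketched in a few lines. This is an illustrative sketch only: the endpoint URL, auth header, and verdict field names (`decision`, `health_score`, `issues`) are assumptions, not the documented API shape — check the API reference for the real ones.

```python
import json
from urllib.request import Request, urlopen

# Hypothetical endpoint and field names, for illustration only.
SCREEN_URL = "https://api.datascreeniq.com/v1/screen"

def screen(payload: dict, api_key: str) -> dict:
    """POST one payload and return the structured JSON verdict."""
    req = Request(
        SCREEN_URL,
        data=json.dumps(payload).encode(),
        headers={"Content-Type": "application/json",
                 "Authorization": f"Bearer {api_key}"},
    )
    with urlopen(req) as resp:
        return json.load(resp)

def route(verdict: dict, payload: dict, sink, quarantine) -> str:
    """Act on the decision: PASS writes through, WARN writes with a
    quality tag, BLOCK diverts to quarantine so bad data never
    reaches storage."""
    decision = verdict["decision"]
    if decision == "PASS":
        sink(payload)
    elif decision == "WARN":
        sink({**payload, "_quality": verdict["health_score"]})
    else:  # BLOCK
        quarantine(payload, verdict["issues"])
    return decision
```

The `route` function is where your pipeline acts on the verdict — swap `sink` and `quarantine` for your own warehouse writer and dead-letter queue.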
Use the public test key below. Edit the JSON, hit Send, and see a real quality decision returned instantly.
Hit Send to screen your payload
18 quality signals computed in real time at the edge. No rules to write. No thresholds to configure. Works from the first request.
Fields added, removed, or renamed between batches. Catches upstream API changes before they silently break your models.
A numeric field suddenly contains strings. A boolean becomes an integer. Caught per column, per batch — before your warehouse casts it wrong.
Null rates and empty strings tracked against baselines. When a field drops from 99% complete to 40%, you know instantly — not four hours later.
Values outside expected ranges flagged automatically. Revenue of $-999, ages of 350, timestamps from the future — all caught.
Row count deviations, duplicate records, and unexpected cardinality changes. A batch that normally has 10K rows shows up with 47? Blocked.
All signals combined into a single health score and a clear decision — PASS, WARN, or BLOCK. Globally, at the edge, without touching your storage.
No SDK. No configuration. Drop one request into your existing pipeline and you're live.
Computed at the edge. Adds negligible latency to your pipeline. Not a bottleneck.
One HTTP POST. Works with any language, stack, or data format you already use.
Raw payloads are discarded immediately. Only aggregated quality metrics are stored. Full details in our privacy architecture.
Every quality check uses a versioned, auditable sampling strategy. The version is returned in every response for full traceability.
One API call wraps any ingest point. Bad payloads get blocked before they corrupt a single row. No custom validation logic required.
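Wrapping an ingest point can look like the sketch below: a decorator that screens every payload before the existing writer runs. `fake_screen` stands in for the real API call (its `decision` field is an assumed verdict shape), and `warehouse` is a hypothetical in-memory sink.

```python
from functools import wraps

def screened(screen, on_block=None):
    """Wrap an existing ingest function so every payload is checked
    first; BLOCKed payloads never reach the wrapped writer."""
    def deco(ingest):
        @wraps(ingest)
        def wrapper(payload):
            verdict = screen(payload)
            if verdict["decision"] == "BLOCK":
                if on_block:
                    on_block(payload, verdict)
                return None
            return ingest(payload)
        return wrapper
    return deco

# Demo with a hypothetical local screen function and in-memory sink.
warehouse = []

def fake_screen(p):
    # Stand-in for the DataScreenIQ call; blocks an impossible age.
    return {"decision": "BLOCK" if p.get("age", 0) > 150 else "PASS"}

@screened(fake_screen)
def write_row(payload):
    warehouse.append(payload)

write_row({"age": 30})   # passes screening, reaches the warehouse
write_row({"age": 350})  # out-of-range, blocked before storage
```

No validation logic lives in `write_row` itself — the wrapper is the only change to the existing pipeline.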
Schema drift and type mismatches caught at the source. Spend time building models — not debugging why last night's job failed silently.
Every metric you trust starts with data verified at ingestion. No more NaN totals, no more war-room calls, no more bad numbers in exec reports.
One bad data incident costs more than a year of this plan. Pay for what you screen. Scale linearly. Cancel anytime. Questions? Email app@datascreeniq.com
Your payload is processed entirely in-memory inside a Cloudflare Worker. Nothing is ever written to disk or logged.
Processing happens in Cloudflare's global edge network. No central server ever receives your raw data.
We store schema fingerprints, null rates, and type statistics. Non-reversible. No row-level data, no PII, ever.
API requests are not logged at the edge layer. Only billing counters (request count, not content) are tracked per customer.
No setup. No configuration. Send your first request and get a quality report back immediately.
Get your free API key