Privacy Architecture — DataScreenIQ

Table of contents

1. Overview — privacy by design
2. Data flow architecture
3. Edge processing layer
4. What we store (and what we don't)
5. Storage layer detail
6. No request logging
7. Data retention periods
8. Security controls
9. Compliance & certifications
10. Contact

1. Overview — privacy by design

DataScreenIQ is built on a fundamental principle: we cannot misuse data we never retain. This isn't a policy decision — it's an architectural constraint baked into every layer of the system.

Your raw data payloads pass through our edge compute layer and are immediately discarded after statistical analysis. At no point is any raw payload written to disk, replicated across nodes, or accessible to DataScreenIQ personnel. What we retain is limited to aggregated quality signals that cannot be reverse-engineered into your original data.

Core privacy guarantee

DataScreenIQ never stores, logs, or transmits your raw data payload. Processing is entirely in-memory. The only persistent outputs are non-reversible aggregated statistics: null rates, type distributions, schema hashes, and quality scores.

2. Data flow architecture

The following diagram represents the complete path of a data payload from your system to our response. Every step is described below.

// privacy_architecture — full data flow

┌─────────────────────────────────────────┐
│  Your System                            │
│  (API / Events / Webhooks / Data)       │
└──────────────────┬──────────────────────┘
                   ↓ HTTPS only
┌─────────────────────────────────────────┐
│  Secure API Endpoint                    │
│  ✔ TLS 1.3 encrypted in transit         │
│  ✔ API key authentication               │
│  ✔ Rate limiting                        │
│  ✗ No request body logging              │
└──────────────────┬──────────────────────┘
                   ↓
┌─────────────────────────────────────────┐
│  Edge Processing (Cloudflare Workers)   │
│  IN-MEMORY ONLY: never touches disk     │
│  • Format parsing (JSON / CSV)          │
│  • Statistical profiling                │
│  • Schema fingerprinting                │
│  • Drift detection                      │
│  ✗ Raw data NEVER persisted             │
│  ✔ Raw data discarded after processing  │
└──────────────────┬──────────────────────┘
                   ↓ aggregates only
┌─────────────────────────────────────────┐
│  Aggregation & Anonymisation Layer      │
│  ✔ Null rates per column                │
│  ✔ Type distribution                    │
│  ✔ Schema hash (non-reversible SHA-256) │
│  ✔ Statistical summaries only           │
│  ✗ No PII retained at this layer        │
└───────┬─────────────────────┬───────────┘
        ↓                     ↓
┌───────────────┐ ┌────────────────────────┐
│ KV (hashes)   │ │ D1 (QA summaries only) │
│ schema hashes │ │ no raw payloads        │
│ config only   │ │ aggregated metrics     │
└───────────────┘ └────────────────────────┘

3. Edge processing layer

All compute happens inside Cloudflare Workers — a serverless edge runtime that executes code in isolated V8 contexts. Key privacy properties of this environment:

In-memory execution

Workers do not have access to a persistent filesystem. Every variable and data structure is created in memory and destroyed when the Worker execution completes. This is not a configuration choice — it is a hard constraint of the runtime environment.

This means your raw payload cannot be written to disk, even if we wanted to. The architecture makes it physically impossible at the edge layer.

Stateless isolation

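Cloudflare runs each Worker in its own V8 isolate, and isolates do not share memory with one another. A single isolate can serve many requests in sequence, so request-to-request isolation depends on keeping each payload in request-scoped variables rather than in module-level (global) state. Handled that way, data from one customer's payload is never reachable from the code path serving another customer's request.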

What the Worker computes

The Worker performs only statistical operations on your payload. It extracts:

Column names and their inferred types (no values)
Null rate per column (a ratio, not which rows are null)
Min, max, approximate p50, p95 for numeric fields (statistics, not values)
A SHA-256 hash of the schema structure (non-reversible)
Approximate distinct count per column via HyperLogLog (a count, not the values)

The raw values in your payload — email addresses, order amounts, user IDs, and any other field content — are read once for computation and then discarded. They are never stored, copied, or transmitted elsewhere.
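To make the list above concrete, here is a minimal sketch of value-free column profiling. The names `Row`, `ColumnProfile`, and `profileRows` are hypothetical and chosen for illustration; this is not DataScreenIQ's actual implementation, and it covers only types, null rates, and min/max.

```typescript
// Hypothetical types for illustration only.
type Row = Record<string, unknown>;

interface ColumnProfile {
  inferredType: string; // most frequent non-null JavaScript type in the column
  nullRate: number;     // nulls divided by total rows, between 0 and 1
  min?: number;         // set only when the column contains numbers
  max?: number;
}

function profileRows(rows: Row[]): Record<string, ColumnProfile> {
  // Collect column names across all rows.
  const columns: string[] = [];
  for (const row of rows) {
    for (const key of Object.keys(row)) {
      if (columns.indexOf(key) === -1) columns.push(key);
    }
  }

  const profiles: Record<string, ColumnProfile> = {};
  for (const column of columns) {
    let nulls = 0;
    let min = Infinity;
    let max = -Infinity;
    const typeCounts: Record<string, number> = {};

    for (const row of rows) {
      const value = row[column];
      // A missing key is treated the same as an explicit null.
      if (value === null || value === undefined) {
        nulls += 1;
        continue;
      }
      const t = typeof value;
      typeCounts[t] = (typeCounts[t] || 0) + 1;
      if (typeof value === "number") {
        if (value < min) min = value;
        if (value > max) max = value;
      }
    }

    // Pick the most frequent type; "null" when the column has no values.
    let inferredType = "null";
    let best = 0;
    for (const t of Object.keys(typeCounts)) {
      if (typeCounts[t] > best) {
        best = typeCounts[t];
        inferredType = t;
      }
    }

    const profile: ColumnProfile = {
      inferredType,
      nullRate: rows.length === 0 ? 0 : nulls / rows.length,
    };
    if (min !== Infinity) {
      profile.min = min;
      profile.max = max;
    }
    profiles[column] = profile;
  }
  return profiles;
}
```

The returned object holds only names, types, and numbers, so serialising it cannot expose a string value such as an email address from the input rows.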

Important: What "discarded" means technically

In V8 (the JavaScript engine used by Cloudflare Workers), objects that are no longer referenced become eligible for garbage collection. A payload held only in request-scoped variables has no remaining references once the request handler returns, so its memory is reclaimed by the runtime and never written anywhere. A Worker can write to persistent storage only through explicit calls to bound services such as KV, D1, or Durable Objects, and we make those calls only with the aggregated metrics described above.
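The scoping idea can be illustrated with a short sketch. It assumes a JSON array payload, and `BatchSummary` and `summariseBatch` are hypothetical names, not the production handler. The point is that the parsed payload lives in a local variable and only counts leave the function.

```typescript
// Hypothetical summary shape; contains counts only.
interface BatchSummary {
  rowCount: number;
  columnCount: number;
}

// Contrast case, deliberately left commented out: module-level state such as
//   const lastPayload: unknown[] = [];
// could outlive a single request, so a payload should never be assigned to it.

function summariseBatch(body: string): BatchSummary {
  // `rows` exists only inside this call; nothing outside references it.
  const rows = JSON.parse(body) as Array<Record<string, unknown>>;

  const columns: string[] = [];
  for (const row of rows) {
    for (const key of Object.keys(row)) {
      if (columns.indexOf(key) === -1) columns.push(key);
    }
  }

  // Only aggregate counts are returned. Once this function returns, `rows`
  // is unreachable and becomes eligible for garbage collection.
  return { rowCount: rows.length, columnCount: columns.length };
}
```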

4. What we store (and what we don't)

Data type	Stored?	Where	Purpose
Raw payload values	NEVER	—	—
Row-level data	NEVER	—	—
Email / PII fields in payloads	NEVER	—	—
API request bodies	NEVER	—	—
Schema fingerprint (hash)	YES	KV	Drift detection
Column null rates	YES	D1	QA reports
Type distribution	YES	D1	QA reports
Request count (billing)	YES	Durable Objects	Usage metering
Health score per batch	YES	D1	Historical QA
API keys (hashed)	YES	KV	Authentication
Customer email	YES	D1	Account management

5. Storage layer detail

Cloudflare KV (Key-Value)

KV stores two categories of data: schema fingerprints (SHA-256 hashes of field name structures, not values) and API key material (hashed with bcrypt). KV is eventually consistent: a newly written fingerprint can take up to 60 seconds or more to propagate globally. For authoritative schema enforcement, Durable Objects are the source of truth.
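As an illustration of a fingerprint built from structure rather than values, the sketch below hashes a canonical list of column names and inferred types. The canonical form (sorted `name:type` lines) and the function name `schemaFingerprint` are assumptions made for this example. It uses Node's `node:crypto` module so it runs standalone; inside a Worker, the Web Crypto call `crypto.subtle.digest("SHA-256", ...)` would be the usual equivalent.

```typescript
import { createHash } from "node:crypto";

// Assumed canonical form: "name:type" pairs, sorted by column name and joined
// with newlines, so that column order does not change the fingerprint.
function schemaFingerprint(columns: Record<string, string>): string {
  const canonical = Object.keys(columns)
    .sort()
    .map((name) => `${name}:${columns[name]}`)
    .join("\n");
  // The input is schema structure only; no field values are hashed.
  return createHash("sha256").update(canonical, "utf8").digest("hex");
}
```

Two batches with the same columns and types produce the same 64-character hex string regardless of column order, while a changed type or an added column produces a different one, which is the signal drift detection needs.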

Cloudflare Durable Objects

One Durable Object is provisioned per customer, keyed as do:tenant:{tenant_id}. DOs store billing counters (request count, rows processed) and authoritative schema state. They contain no row-level data and no field values.

Cloudflare D1 (SQLite)

D1 is our control-plane store. It holds: QA report summaries (health scores, null rates, type stats per batch), job history, and customer account information. D1 is never used for data-plane storage — no payload data flows through it. It receives only post-aggregation metrics.

D1 write constraint

DataScreenIQ enforces a strict architectural rule: D1 writes occur only after the aggregation layer. The Worker that receives your payload never writes directly to D1, so the code that handles raw data has no D1 write path through which a code change or bug could send that data to persistent storage.
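A rule like this can also be backed by a runtime check just before the write. The sketch below is one hypothetical way to do that: the `QaSummary` shape and `validateQaSummary` are illustrative names, not the actual schema, and the check simply rejects anything that is not a plain number in the expected range.

```typescript
// Hypothetical shape of a post-aggregation record destined for D1.
interface QaSummary {
  tenantId: string;
  batchId: string;
  healthScore: number;
  nullRates: Record<string, number>; // column name -> ratio between 0 and 1
}

// Returns a list of problems; an empty list means the record holds only
// aggregate numbers and is safe to write.
function validateQaSummary(summary: QaSummary): string[] {
  const problems: string[] = [];

  if (typeof summary.healthScore !== "number" || !isFinite(summary.healthScore)) {
    problems.push("healthScore is not a finite number");
  }

  for (const column of Object.keys(summary.nullRates)) {
    const rate = summary.nullRates[column];
    // A string here would indicate that a raw value slipped past aggregation.
    if (typeof rate !== "number" || !(rate >= 0 && rate <= 1)) {
      problems.push(`nullRates.${column} is not a rate between 0 and 1`);
    }
  }
  return problems;
}
```

The type system already prevents most mistakes at compile time; the runtime check covers data that arrives through casts or untyped boundaries.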

6. No request logging

Cloudflare Workers support request logging via Logpush or Workers Logs. DataScreenIQ does not enable request logging for data-plane Workers. This means:

Request bodies are never logged
Request headers (which may carry sensitive metadata) are not logged
Response bodies are not logged
IP addresses are not retained beyond Cloudflare's own infrastructure logging

Only the control-plane Workers (authentication, billing counter updates) emit structured logs, and those logs contain only tenant ID, request count, and timestamp — never payload content.
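For illustration, a control-plane log entry limited to those three fields could be built as follows. The field names and the function `buildLogEntry` are assumptions for this sketch; the actual log schema may differ.

```typescript
// Hypothetical log shape: exactly the three fields named above.
interface ControlPlaneLog {
  tenantId: string;
  requestCount: number;
  timestamp: string; // ISO 8601, UTC
}

// The function accepts no request, header, or body argument, so payload
// content cannot reach the log line through it.
function buildLogEntry(tenantId: string, requestCount: number, now: Date): string {
  const entry: ControlPlaneLog = {
    tenantId,
    requestCount,
    timestamp: now.toISOString(),
  };
  return JSON.stringify(entry);
}
```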

7. Data retention periods

Data item	Retention period	Deletion mechanism
Schema fingerprints	90 days from last use	KV TTL expiry
QA report summaries	90 days (Starter), 365 days (Growth+)	D1 scheduled deletion
Billing counters	Current billing period + 12 months	DO TTL + D1 archival
Customer account data	Duration of account + 30 days	Manual deletion on request
Raw payload values	Never stored	N/A

Customers can request deletion of all retained data at any time by contacting app@datascreeniq.com. Deletion is completed within 7 business days.
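The retention periods in the table translate into simple arithmetic. The sketch below is illustrative: the plan identifiers and function names are hypothetical, and any tier above Starter is treated as Growth. The 90-day fingerprint window is expressed in seconds because Cloudflare KV accepts an `expirationTtl` in seconds when a key is written; refreshing that TTL on each use is one way to approximate "90 days from last use".

```typescript
// Hypothetical plan identifiers; tiers above "growth" are treated as "growth".
type Plan = "starter" | "growth";

const DAY_SECONDS = 24 * 60 * 60;

// 90 days expressed in seconds, suitable for a KV expirationTtl value.
const SCHEMA_FINGERPRINT_TTL_SECONDS = 90 * DAY_SECONDS;

function qaSummaryRetentionDays(plan: Plan): number {
  return plan === "starter" ? 90 : 365;
}

// Earliest moment at which a QA summary created at `createdAt` may be deleted.
function qaSummaryExpiry(createdAt: Date, plan: Plan): Date {
  const retentionMs = qaSummaryRetentionDays(plan) * DAY_SECONDS * 1000;
  return new Date(createdAt.getTime() + retentionMs);
}
```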

8. Security controls

Transport security

All API endpoints enforce TLS 1.3. TLS 1.0 and 1.1 are disabled. Certificates are managed by Cloudflare with automatic renewal. HTTP requests are redirected to HTTPS with HSTS enforced.

Authentication

API keys are generated using cryptographically secure random bytes (256 bits). Keys are stored hashed using bcrypt with a work factor of 12. The plaintext key is displayed once at creation and never stored. Key rotation can be performed at any time from the dashboard.
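A minimal sketch of the key generation step, assuming hex encoding and a `dsiq_` prefix (both hypothetical, chosen for this example). It uses `randomBytes` from Node's `node:crypto` module; the bcrypt hashing step is omitted because it requires a third-party library.

```typescript
import { randomBytes } from "node:crypto";

// 32 bytes = 256 bits of cryptographically secure randomness.
const API_KEY_BYTES = 32;

// The prefix and hex encoding are illustrative assumptions.
function generateApiKey(): string {
  return "dsiq_" + randomBytes(API_KEY_BYTES).toString("hex");
}
```

With this encoding the key is 69 characters long, which stays under the 72-byte input limit that bcrypt applies, so the whole key contributes to the stored hash.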

Rate limiting

Rate limiting is enforced at the edge using Cloudflare's Rate Limiting API, before any payload processing occurs. This prevents abuse without requiring payload inspection for the rate-limiting decision.
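To show why no payload inspection is needed, here is a generic fixed-window limiter. It is an illustration of the decision logic only and is not Cloudflare's Rate Limiting API; the class name, the limit, and the window length are assumptions. The decision takes a caller key and a timestamp, and never sees the request body.

```typescript
interface WindowState {
  windowStart: number; // ms timestamp at which the current window began
  count: number;       // requests seen in the current window
}

// Illustrative fixed-window limiter; not Cloudflare's Rate Limiting API.
class FixedWindowLimiter {
  private readonly limit: number;
  private readonly windowMs: number;
  private readonly windows: Record<string, WindowState> = {};

  constructor(limit: number, windowMs: number) {
    this.limit = limit;
    this.windowMs = windowMs;
  }

  // Decides from the caller's key and the clock alone.
  allow(key: string, nowMs: number): boolean {
    const state = this.windows[key];
    if (state === undefined || nowMs - state.windowStart >= this.windowMs) {
      // First request for this key, or the previous window has elapsed.
      this.windows[key] = { windowStart: nowMs, count: 1 };
      return true;
    }
    if (state.count >= this.limit) return false;
    state.count += 1;
    return true;
  }
}
```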

Tenant isolation

All stored data is partitioned by tenant ID. KV keys are prefixed with tenant scope. D1 rows include non-null tenant_id foreign keys. Durable Objects are provisioned per-tenant. Because every storage access is scoped this way, normal code paths have no means of reading or writing another tenant's data.
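Tenant-scoped naming can be sketched as below. The Durable Object name follows the `do:tenant:{tenant_id}` pattern given earlier; the KV prefix format `tenant:{tenant_id}:` and all function names are assumptions for illustration. Validating the tenant ID keeps the `:` separator out of it, so one tenant's ID cannot be crafted to look like another tenant's prefix.

```typescript
// Assumed tenant ID alphabet; excludes ":" so IDs cannot contain the separator.
const TENANT_ID_PATTERN = /^[A-Za-z0-9_-]+$/;

function requireTenantId(tenantId: string): string {
  if (!TENANT_ID_PATTERN.test(tenantId)) {
    throw new Error("invalid tenant id");
  }
  return tenantId;
}

// Matches the do:tenant:{tenant_id} naming described in the storage section.
function durableObjectName(tenantId: string): string {
  return `do:tenant:${requireTenantId(tenantId)}`;
}

// Hypothetical KV key layout: tenant:{tenant_id}:{name}
function kvKey(tenantId: string, name: string): string {
  return `tenant:${requireTenantId(tenantId)}:${name}`;
}

// True only if the key sits under this tenant's prefix. The trailing ":" in
// the prefix stops "t_4" from matching keys that belong to "t_42".
function belongsToTenant(key: string, tenantId: string): boolean {
  return key.indexOf(`tenant:${requireTenantId(tenantId)}:`) === 0;
}
```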

9. Compliance & certifications

GDPR

Because DataScreenIQ does not store personal data from your payloads, the GDPR obligations around data subject rights (access, deletion, portability) apply only to account-level data (your name, email, billing information). Requests concerning that data are fulfilled within 30 days. For payloads processed at the edge, DataScreenIQ acts as a data processor as defined in GDPR Article 4(8), and we can discuss specific compliance requirements on request.

CCPA

DataScreenIQ does not sell personal information. The statistical quality signals we retain do not constitute personal information under the CCPA definition. California residents may request deletion of account data by contacting us at app@datascreeniq.com.

SOC 2

DataScreenIQ is an early-stage product. While formal SOC 2 certification is not yet in place, our architecture is designed with security and privacy as foundational constraints — not afterthoughts. Cloudflare, our infrastructure provider, maintains SOC 2 Type II, ISO 27001, and PCI DSS certifications, which cover the underlying infrastructure your data transits.

Enterprise compliance inquiries

If your organisation has specific compliance requirements before integration, contact us at app@datascreeniq.com and we will work with you directly.