Building Embedded Fintech Dashboards: Architecture Guide for Fintech-SaaS Engineering Teams (2026)

Reference architecture for embedded fintech dashboards in 2026 - multi-tenancy with row-level security, PCI-DSS scope minimization, real-time + historical data layering, audit logging for SOC 2 evidence, and code samples for the canonical reconciliation-dashboard build.

Vishnupriya B
Data Analyst specializing in data visualization, SQL, Python, and data modeling.
Published On:
May 6, 2026
Updated On:
May 6, 2026

Key Takeaways

  • The reference architecture for embedded fintech dashboards has 6 layers: data ingestion (Plaid / Stripe / direct bank API / ledger system), warehouse (Postgres → Snowflake / ClickHouse at scale), reconciliation engine, analytics layer (embedded SDK or platform), tenant isolation + RLS, audit log + compliance trail. Each layer has at least one fintech-specific constraint that generic SaaS analytics doesn't have, which is why the in-house build typically takes 6–12 engineering months even for teams that have shipped analytics in other verticals.
  • Multi-tenancy in fintech requires three layers of isolation: database (RLS), application (guest tokens with per-tenant context propagation), and infrastructure (whale-tenant dedicated-cluster pattern for top tenants by volume or data size). Skipping any layer fails enterprise security review. The standard pattern in 2026 starts shared-schema + RLS and graduates "whale" customers to dedicated clusters as they cross 1% of total query volume.
  • PCI-DSS scope minimization is the single biggest architectural decision for payments and lending fintech embedded analytics. Tokenization at ingestion (PAN never enters the analytics layer), PCI-vs-non-PCI database segregation, and audit-log retention turn an in-scope-everywhere disaster into a tractable PCI Level 2 footprint. The PCI Security Standards Council v4.0 framework codifies the patterns; under-implementing them means a quarterly six-figure operational cost.
  • Real-time + historical data must be layered, not unified. Change-data-capture (Debezium, Fivetran HVR) for the live layer; materialized views in the columnar OLAP layer (ClickHouse, Snowflake) for the historical layer; deliberate cache invalidation between them. Trying to serve both from a single OLTP database breaks at ~$10M ARR or ~10M rows per tenant.
  • The reconciliation dashboard is the canonical fintech embedded-analytics use case - multi-source matching across sales channels, payment gateways, bank statements, and the internal ledger; exception workflow; audit-trail visibility. This is the architecture Zenstatement shipped to retail and F&B finance teams, and the architecture every CFO-stack fintech SaaS builds variants of.
  • An in-house build typically takes 6–12 engineering months to ship the full layered architecture. Embedded analytics platforms abstract the analytics layer + RLS + audit, leaving teams responsible for ingestion + reconciliation logic - the parts that are domain expertise rather than commodity infrastructure. Net effect: weeks-to-ship instead of quarters-to-ship.
  • Self-hosted deployment is a first-class architectural concern, not a deployment afterthought. Most enterprise fintech customers (banking, lending-tech, insurance-tech, regulated CFO-stack) require the entire analytics stack - embedded SDK runtime, data warehouse, audit-log layer, any AI/ML pipelines - to deploy inside their own VPC with no data egress to a third-party SaaS. Per EU DORA (in force January 2025), GLBA + FFIEC (US), and RBI / MAS frameworks (APAC), this isn't a "nice to have" - it's a hard regulatory requirement that determines whether the analytics layer can be deployed at all. Architectures that don't decouple control plane (vendor-managed) from data plane (customer-VPC) fail enterprise security review.

Stripe Atlas's payments-architecture references and Plaid's State of Fintech reports both surface the same pattern in 2026: the fintech SaaS companies that ship customer-facing analytics fastest are the ones that decoupled the four hard problems (tenant isolation, PCI-DSS scope, real-time + historical layering, audit logging) into modular layers, while the ones that conflated them into a single "dashboard system" requirements document spent multiple quarters re-architecting after their first enterprise security review.

This guide is the technical companion to the embedded analytics for fintech SaaS strategy guide. It walks through the reference architecture, the multi-tenancy patterns, the PCI-DSS scope-minimization decisions, the real-time + historical layering, code samples for the canonical reconciliation-dashboard build, and the operational concerns (SOC 2 evidence, GDPR data residency, ISO 20022 message handling, SLA monitoring) that production deployments need to handle.

For the strategy / SaaS-PM angle (build-vs-embed, 5 maturity levels, tools comparison), see embedded analytics for fintech SaaS. For the broader fintech analytics strategy, see fintech data analytics.


Reference Architecture for an Embedded Fintech Analytics Layer

The 2026 reference architecture has six layers. Each one is a distinct deployment and operational concern. Conflating them into a single "analytics service" is the most common cause of the 12-month re-architecture cycle.

Layer 1: Data ingestion

Sources for a CFO-stack fintech SaaS like Zenstatement: sales channels (Shopify, Amazon, marketplace platforms), payment gateways (Stripe, Razorpay, Adyen), banks (direct bank API or aggregators like Plaid), logistics platforms (for COD reconciliation), internal ledger.

Ingestion patterns: REST API polling (typical for newer SaaS APIs), webhooks (for event-driven sources like payment gateways), CDC streaming (for partner systems that support it via Debezium / Fivetran HVR), file-based ingestion (for legacy bank statement files via SFTP).

The fintech-specific constraint here is credential management: ingestion connectors typically need to manage OAuth refresh tokens (Plaid), API keys (Stripe), or signed webhook secrets - and the credential rotation policy needs to align with what regulators expect (typically rotation at least every 90 days for PCI-DSS-scoped credentials; OAuth-based open-banking integrations often allow longer grant lifetimes).
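A sketch of what that rotation-policy check can look like in connector tooling. The `CredentialRecord` shape and the per-kind thresholds here are illustrative assumptions, not any vendor's API:

```typescript
// Flag connector credentials that exceed a rotation-policy window.
// The record shape and thresholds are illustrative.
interface CredentialRecord {
  connector: string;                     // e.g. 'stripe' | 'plaid'
  kind: 'api_key' | 'oauth_refresh' | 'webhook_secret';
  issuedAt: Date;
}

const ROTATION_DAYS: Record<CredentialRecord['kind'], number> = {
  api_key: 90,        // PCI-DSS-aligned 90-day rotation
  webhook_secret: 90,
  oauth_refresh: 180, // open-banking OAuth grants often rotate less frequently
};

function credentialsDueForRotation(creds: CredentialRecord[], now: Date): CredentialRecord[] {
  const msPerDay = 24 * 60 * 60 * 1000;
  return creds.filter(
    c => (now.getTime() - c.issuedAt.getTime()) / msPerDay > ROTATION_DAYS[c.kind],
  );
}
```

A report like this would feed an alerting job rather than block ingestion, since forcibly expiring a live connector credential breaks the pipeline.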

Layer 2: Data lake / warehouse

The Postgres → Snowflake / ClickHouse migration is the dominant pattern past ~$5M ARR or ~10M rows per tenant. Postgres works for the transactional layer; the columnar OLAP layer is where customer-facing analytics queries land.

The split is structural: writes go to Postgres, a CDC pipeline replicates writes into the OLAP layer (typically with a ~1–60 second lag), reads for analytics happen in the OLAP layer.

GitLab's ClickHouse migration writeup is the canonical case study for this pattern at scale. The standard hesitation around the migration ("we don't have anyone with ClickHouse experience") is increasingly outdated in 2026 - managed offerings (ClickHouse Cloud, Snowflake) reduce the operational burden meaningfully. See ClickHouse migration guide for the migration pattern.

Layer 3: Reconciliation engine

For CFO-stack and payments fintech, reconciliation is the canonical data-transformation use case. Multi-way matching across sales channels, payment gateways, bank statements, and the internal ledger - with exception cases that need human-in-the-loop workflow.

The reconciliation engine is typically a separate service (not a layer of the warehouse) because its requirements diverge: stateful match logic, exception queues, retry policies, manual-override workflows. Reconciliation engines own their own state and push reconciled / unmatched / exception events into the warehouse for the analytics layer to read.

The Zenstatement-shaped multi-source reconciliation pattern (sales channels + payment gateways + banks + logistics platforms) is the architecture pattern most CFO-stack fintech SaaS converge on.

Layer 4: Analytics layer (embedded SDK or platform)

The customer-facing dashboard rendering, KPI cards, drill-down workflows, export controls. This is where build-vs-embed decisions land most consequentially - the analytics layer is the surface area customers interact with, and it's the layer that requires the broadest set of fintech-specific features (white-label rendering, RLS enforcement, audit logging, embedded SDK for in-product rendering).

In an in-house build, this layer is typically Recharts / Highcharts + custom backend + custom RLS enforcement. In an embedded analytics platform deployment, this layer is the platform itself (Databrain, Sisense Compose, Embeddable, or similar).

Layer 5: Tenant isolation + RLS

Three sub-layers:

  1. Database layer: row-level security policies on every fact table, filtering rows by tenant_id. Postgres RLS and Snowflake row access policies are the two standard implementations. ClickHouse supports row policies via its security features.
  2. Application layer: every analytics request carries a signed guest token that propagates tenant context to the database session. Tokens are short-lived (typically 60 minutes), claim-scoped, and verified on every request.
  3. Infrastructure layer: "whale" tenants (those exceeding roughly 1% of total query volume or 5% of stored data) graduate to dedicated clusters or schemas, isolating their workload from shared-tenant infrastructure.

For the deeper multi-tenant analytics architecture detail, including the tradeoffs between shared-schema, schema-per-tenant, and database-per-tenant, see the dedicated guide.

Layer 6: Audit log + compliance trail

Every dashboard view, drill-down, export, and admin action produces an immutable log entry: user ID, tenant ID, timestamp, IP, accessed-data-scope, and (for high-stakes views) the underlying query. Logs are tamper-evident (write-once or append-only storage) and retained per the longest-applicable rule (12 months for SOC 2 Type II evidence, 5+ years for AML transaction monitoring per FATF guidance, 7+ years for most fintech default-retention policies).

The audit-log layer is typically the most under-built primitive in in-house implementations. Embedded analytics platforms typically provide it natively as a first-class concern.

Multi-Tenancy in a Fintech Context

RLS implementation patterns

Postgres example:

CREATE POLICY tenant_isolation_orders ON orders
  FOR ALL
  TO app_role
  USING (tenant_id = current_setting('app.current_tenant')::uuid);

ALTER TABLE orders ENABLE ROW LEVEL SECURITY;
-- Without FORCE, the table owner bypasses RLS; forcing it closes that gap
ALTER TABLE orders FORCE ROW LEVEL SECURITY;

-- Set on every connection at session start (typically in connection middleware)
SET app.current_tenant = '<tenant-uuid>';

The current_setting('app.current_tenant') value is set per-session by the application layer, sourced from the verified guest-token claim. The policy applies on every query, regardless of whether the application code remembers to filter - which is the entire point of RLS as a defense-in-depth pattern.
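The application-layer half of that pattern - setting the session variable from the verified claim - can be sketched as connection middleware. `DbClient` here is a hypothetical stand-in for a Postgres client; the GUC name matches the policy above:

```typescript
// Per-request middleware that pins the verified tenant claim onto the
// database session before any analytics query runs.
interface DbClient {
  query(sql: string, params?: unknown[]): Promise<unknown>;
}

async function withTenantContext<T>(
  client: DbClient,
  tenantId: string, // sourced from the verified guest-token claim, never the request body
  fn: (client: DbClient) => Promise<T>,
): Promise<T> {
  // set_config(..., true) scopes the setting to the current transaction, so a
  // pooled connection can't leak one tenant's context into the next request.
  await client.query('BEGIN');
  try {
    await client.query("SELECT set_config('app.current_tenant', $1, true)", [tenantId]);
    const result = await fn(client);
    await client.query('COMMIT');
    return result;
  } catch (err) {
    await client.query('ROLLBACK');
    throw err;
  }
}
```

The transaction-scoped `set_config` is the design choice that matters with connection pooling: a session-scoped `SET` would survive the checkout and hand one tenant's context to the next borrower of the connection.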

Snowflake example:

CREATE ROW ACCESS POLICY tenant_isolation_policy AS (tenant_id varchar) RETURNS BOOLEAN ->
  tenant_id = current_tenant_setting();  -- placeholder, see note below

ALTER TABLE orders
  ADD ROW ACCESS POLICY tenant_isolation_policy ON (tenant_id);

Snowflake's row access policy primitive is more declarative than Postgres RLS. current_tenant_setting() above is a placeholder for however the deployment exposes tenant context to the policy - in practice, a session variable read via GETVARIABLE('CURRENT_TENANT') or a mapping table keyed on CURRENT_ROLE(), both set up by the application layer.

ClickHouse example:

CREATE ROW POLICY tenant_isolation ON orders
  USING tenant_id = currentUser()
  TO app_role;

ClickHouse's row policies can use the user identity directly (currentUser()), which works when each tenant maps to a database user - one common pattern. The alternative is a session-context approach using custom settings: the application runs SET custom_tenant_id = '...' on the connection, and the policy condition filters on getSetting('custom_tenant_id').

Guest token + per-tenant context propagation

The guest token is the application-layer mechanism for moving tenant context from the embedded dashboard request into the database session. Standard pattern:

  1. Customer logs into the fintech product. The product's auth issues a session token.
  2. When the customer opens an embedded dashboard, the fintech's backend calls the embedded analytics platform's /auth/guestToken endpoint with the master API key + tenant context (tenant ID, user role, view permissions).
  3. The platform returns a JWT signed with the platform's private key, scoped to the specific tenant + view + 60-minute expiry.
  4. The fintech's frontend embeds the dashboard with the JWT in the iframe URL or SDK-init payload.
  5. Every request from the embedded dashboard carries the JWT. The platform verifies the signature, extracts the tenant context, sets the database session variable, and the RLS policies fire.

The end-user's browser never sees the master API key. The JWT is short-lived (so leaked tokens have a 60-minute blast radius). The signature verification happens on every request.
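A minimal sketch of the mint/verify cycle. HMAC (HS256) is used here purely to keep the example self-contained and dependency-free; as the flow above describes, production platforms typically sign with an asymmetric private key (RS256/ES256):

```typescript
import { createHmac, timingSafeEqual } from 'node:crypto';

const b64url = (buf: Buffer): string => buf.toString('base64url');

interface GuestClaims {
  tenantId: string;
  dashboardId: string;
  exp: number; // unix seconds; mint with a ~60-minute expiry
}

function mintGuestToken(claims: GuestClaims, secret: string): string {
  const header = b64url(Buffer.from(JSON.stringify({ alg: 'HS256', typ: 'JWT' })));
  const payload = b64url(Buffer.from(JSON.stringify(claims)));
  const sig = b64url(createHmac('sha256', secret).update(`${header}.${payload}`).digest());
  return `${header}.${payload}.${sig}`;
}

function verifyGuestToken(token: string, secret: string, nowSec: number): GuestClaims | null {
  const [header, payload, sig] = token.split('.');
  if (!header || !payload || !sig) return null;
  const expected = b64url(createHmac('sha256', secret).update(`${header}.${payload}`).digest());
  const a = Buffer.from(sig);
  const b = Buffer.from(expected);
  // constant-time comparison avoids a signature timing oracle
  if (a.length !== b.length || !timingSafeEqual(a, b)) return null;
  // pin the algorithm rather than trusting the token's header (RFC 8725)
  const hdr = JSON.parse(Buffer.from(header, 'base64url').toString());
  if (hdr.alg !== 'HS256') return null;
  const claims = JSON.parse(Buffer.from(payload, 'base64url').toString()) as GuestClaims;
  return claims.exp > nowSec ? claims : null; // reject expired tokens
}
```

Only `mintGuestToken`'s secret lives server-side; the browser holds the token alone, which is what bounds the blast radius of a leak to the expiry window.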

For deeper coverage of secure embedded analytics with guest tokens, including JWT best-current-practice patterns (RFC 8725), see the dedicated guide.

"Whale" customer dedicated-cluster pattern

The standard threshold for graduating a tenant to dedicated infrastructure is when one tenant exceeds 1% of total query volume or 5% of stored data. At that point, the noisy-neighbor problem becomes a real SLA risk - one whale running heavy queries slows down dashboards for every other tenant on the shared cluster.

Most fintech SaaS in 2026 tier this in the pricing model: shared-cluster on Starter / Professional plans, dedicated-cluster on Enterprise. The dedicated tier also unlocks customer-specific configuration (regional warehouse selection, custom retention, BYOC bring-your-own-cloud) that most enterprise contracts now require.
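The graduation check itself is simple. A sketch, with an illustrative `TenantStats` shape and the 1% query-volume / 5% data-size thresholds from above:

```typescript
// Identify tenants that cross the shared-cluster thresholds and should
// graduate to dedicated infrastructure. The stats shape is illustrative.
interface TenantStats {
  tenantId: string;
  queries30d: number;   // queries over a trailing 30-day window
  storedBytes: number;
}

function whaleTenants(stats: TenantStats[]): string[] {
  const totalQueries = stats.reduce((sum, t) => sum + t.queries30d, 0);
  const totalBytes = stats.reduce((sum, t) => sum + t.storedBytes, 0);
  return stats
    .filter(t => t.queries30d > 0.01 * totalQueries || t.storedBytes > 0.05 * totalBytes)
    .map(t => t.tenantId);
}
```

In practice this runs as a scheduled job whose output feeds an account-review queue, not an automatic migration - moving a tenant to a dedicated cluster is a contract conversation as much as an infrastructure one.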

PCI-DSS Scope Minimization

The PCI-DSS scope of an embedded analytics layer is determined by what cardholder data flows through it. Three architectural decisions reduce scope:

Tokenization at ingestion

PAN (primary account number) gets tokenized at the earliest point - typically at the payment gateway boundary, before data lands in the warehouse. Tokens (typically opaque references managed by Stripe, Adyen, or a token vault) flow through the analytics layer; PAN never does. Per PCI Security Standards Council v4.0, tokenized data outside the cardholder data environment (CDE) is typically out-of-scope, dramatically reducing the audit footprint.
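A sketch of the ingestion-boundary scrub. The tokenization call is simulated with a salted hash purely for illustration - real deployments obtain the opaque token from the gateway or a token vault, and a hash of a PAN by itself does not take data out of PCI scope:

```typescript
import { createHash } from 'node:crypto';

// Scrub PAN at the ingestion boundary so only an opaque token and safe
// display fields (last4) ever reach the analytics warehouse.
interface RawGatewayEvent {
  pan?: string;         // present only on raw ingest payloads
  amountCents: number;
  currency: string;
}

interface WarehouseEvent {
  panToken: string | null; // opaque reference; never the PAN itself
  panLast4: string | null;
  amountCents: number;
  currency: string;
}

function scrubForWarehouse(ev: RawGatewayEvent): WarehouseEvent {
  const pan = ev.pan ?? null;
  return {
    // stand-in for a vault/gateway tokenization call, for illustration only
    panToken: pan ? createHash('sha256').update(`vault-salt:${pan}`).digest('hex') : null,
    panLast4: pan ? pan.slice(-4) : null,
    amountCents: ev.amountCents,
    currency: ev.currency,
  };
}
```

The structural point is that `WarehouseEvent` has no field capable of carrying a PAN, so the type system documents the scope boundary.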

Database segregation (PCI vs non-PCI)

Even with tokenization, some fintech use cases need a system of record for cardholder data. Standard pattern: a small, isolated PCI-scope database (often the payment gateway itself, or a hardened token vault), plus the broader non-PCI analytics warehouse. The two never share connections, credentials, or network segments.

Audit log requirements

PCI-DSS requires logging of access to cardholder data (Requirement 10). For embedded analytics where some queries touch tokenized data, the audit log layer needs to capture every such access - and the logs themselves need to be in PCI scope (immutable, retained, tamper-evident).

The end state of these three decisions: a payments fintech's embedded analytics layer is typically out of PCI scope entirely, with only the payment gateway in scope. A lending fintech's analytics layer is typically out of scope for PCI but in scope for SOC 2 and GLBA. The scope mapping decision should be documented and reviewed quarterly with the compliance team - it's not a set-it-and-forget-it primitive.

Real-Time + Historical Data Layering

CDC for the live layer

Change-data-capture (Debezium for Postgres / MySQL, Fivetran HVR for general-purpose, native CDC in Snowflake / ClickHouse for some sources) reads transaction logs from the OLTP database and replicates writes into the analytics warehouse with sub-minute lag. The standard cadence is 1–60 seconds depending on tooling and workload.

Materialized views for the historical layer

Frequently-queried aggregations (e.g., daily revenue by tenant, monthly cohort retention) are precomputed via materialized views. Reads against materialized views are sub-second; reads against the underlying fact tables are typically multi-second on large datasets. The pattern is: live data via CDC, historical aggregates via materialized views, with the analytics layer reading from whichever is appropriate for the dashboard query.

Cache invalidation strategy

The cache-invalidation problem (when CDC has updated underlying data but the materialized view is stale) is solved one of three ways: (a) scheduled refresh (rebuild materialized views every N minutes; simplest but introduces lag), (b) event-driven invalidation (listen for CDC events that touch the materialized view's source rows; trigger refresh), (c) read-time decision (frontend checks data freshness timestamps and renders the appropriate layer).

Most fintech embedded analytics platforms in 2026 abstract this - the cache-invalidation policy is a configuration option per-dashboard rather than a custom-engineered system.
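Option (c), the read-time decision, can be sketched in a few lines. The freshness-metadata shape is an assumption:

```typescript
// Serve from the materialized view when it is fresh enough for the
// dashboard's staleness budget; otherwise fall back to live CDC tables.
interface FreshnessMeta {
  matViewRefreshedAt: Date; // last materialized-view rebuild
  cdcHighWaterMark: Date;   // newest change replicated by CDC
}

type ReadLayer = 'materialized_view' | 'live_tables';

function chooseReadLayer(meta: FreshnessMeta, maxStalenessSec: number): ReadLayer {
  const lagSec = (meta.cdcHighWaterMark.getTime() - meta.matViewRefreshedAt.getTime()) / 1000;
  return lagSec <= maxStalenessSec ? 'materialized_view' : 'live_tables';
}
```

Per-dashboard staleness budgets map naturally onto this: a reconciliation-status KPI card might tolerate a 5-minute lag, while an exceptions queue wants the live layer.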

Code Walkthrough: Building a Reconciliation Dashboard

The canonical fintech embedded-analytics use case is multi-source reconciliation. Here's the walkthrough for a CFO-stack fintech (Zenstatement-shaped pattern).

Data model

-- Sales channel orders
CREATE TABLE orders (
  order_id uuid PRIMARY KEY,
  tenant_id uuid NOT NULL,
  channel text NOT NULL,            -- 'shopify' | 'amazon' | 'marketplace'
  external_id text NOT NULL,         -- channel-specific order ID
  amount_cents bigint NOT NULL,
  currency char(3) NOT NULL,
  created_at timestamptz NOT NULL
);
CREATE INDEX orders_tenant_idx ON orders (tenant_id, created_at);

-- Payment gateway events
CREATE TABLE gateway_events (
  event_id uuid PRIMARY KEY,
  tenant_id uuid NOT NULL,
  gateway text NOT NULL,             -- 'stripe' | 'razorpay' | 'adyen'
  external_id text NOT NULL,         -- gateway's payment ID
  amount_cents bigint NOT NULL,
  fee_cents bigint NOT NULL,
  currency char(3) NOT NULL,
  status text NOT NULL,              -- 'succeeded' | 'refunded' | 'disputed'
  occurred_at timestamptz NOT NULL,
  related_order_external_id text     -- nullable; gateway-reported reference
);
CREATE INDEX gateway_events_tenant_idx ON gateway_events (tenant_id, occurred_at);

-- Bank statement entries
CREATE TABLE bank_entries (
  entry_id uuid PRIMARY KEY,
  tenant_id uuid NOT NULL,
  bank_account_id uuid NOT NULL,
  description text,
  amount_cents bigint NOT NULL,
  currency char(3) NOT NULL,
  posted_at timestamptz NOT NULL
);
CREATE INDEX bank_entries_tenant_idx ON bank_entries (tenant_id, posted_at);

-- Reconciled matches (the output of the reconciliation engine)
CREATE TABLE reconciliation_matches (
  match_id uuid PRIMARY KEY,
  tenant_id uuid NOT NULL,
  order_id uuid REFERENCES orders(order_id),
  gateway_event_id uuid REFERENCES gateway_events(event_id),
  bank_entry_id uuid REFERENCES bank_entries(entry_id),
  match_status text NOT NULL,        -- 'auto_matched' | 'manual_matched' | 'partial' | 'unmatched'
  matched_amount_cents bigint NOT NULL,
  variance_cents bigint NOT NULL,    -- mismatch amount, if any
  matched_at timestamptz NOT NULL
);
CREATE INDEX matches_tenant_idx ON reconciliation_matches (tenant_id, matched_at);

-- RLS policies on every fact table
CREATE POLICY tenant_isolation_orders ON orders FOR ALL TO app_role
  USING (tenant_id = current_setting('app.current_tenant')::uuid);
CREATE POLICY tenant_isolation_gateway ON gateway_events FOR ALL TO app_role
  USING (tenant_id = current_setting('app.current_tenant')::uuid);
CREATE POLICY tenant_isolation_bank ON bank_entries FOR ALL TO app_role
  USING (tenant_id = current_setting('app.current_tenant')::uuid);
CREATE POLICY tenant_isolation_matches ON reconciliation_matches FOR ALL TO app_role
  USING (tenant_id = current_setting('app.current_tenant')::uuid);

ALTER TABLE orders ENABLE ROW LEVEL SECURITY;
ALTER TABLE gateway_events ENABLE ROW LEVEL SECURITY;
ALTER TABLE bank_entries ENABLE ROW LEVEL SECURITY;
ALTER TABLE reconciliation_matches ENABLE ROW LEVEL SECURITY;

Multi-way reconciliation match query

The dashboard's "reconciliation status" view answers: of the orders in the last 30 days, how many are auto-matched, manually-matched, partial, or unmatched?

SELECT
  match_status,
  COUNT(*) AS match_count,
  SUM(matched_amount_cents) / 100.0 AS matched_amount,
  SUM(variance_cents) / 100.0 AS variance_amount
FROM reconciliation_matches
WHERE matched_at > now() - interval '30 days'
GROUP BY match_status
ORDER BY match_status;

The query is tenant-scoped automatically by RLS - there's no WHERE tenant_id = ... clause in the application code, because the database enforces it. Even if the application code has a bug that forgets to filter by tenant, the database returns rows for the current session's tenant only.

Exception list query

The "exceptions queue" view answers: which orders haven't matched in the last 7 days, and what's blocking them?

SELECT
  o.order_id,
  o.external_id AS order_reference,
  o.amount_cents / 100.0 AS order_amount,
  o.currency,
  o.created_at,
  CASE
    WHEN ge.event_id IS NULL THEN 'no_gateway_event'
    WHEN be.entry_id IS NULL THEN 'no_bank_entry'
    ELSE 'amount_variance'
  END AS exception_reason
FROM orders o
LEFT JOIN gateway_events ge
  ON ge.related_order_external_id = o.external_id
  AND ge.tenant_id = o.tenant_id
LEFT JOIN bank_entries be
  ON be.amount_cents = ge.amount_cents - ge.fee_cents
  AND be.posted_at BETWEEN ge.occurred_at AND ge.occurred_at + interval '5 days'
  AND be.tenant_id = o.tenant_id
WHERE o.created_at > now() - interval '7 days'
  AND NOT EXISTS (
    SELECT 1 FROM reconciliation_matches rm
    WHERE rm.order_id = o.order_id
      AND rm.match_status IN ('auto_matched', 'manual_matched')
  )
ORDER BY o.created_at DESC
LIMIT 100;

This query produces the exception queue the finance team works through. Each row has a reason, an order amount, and a reference back to the source order - and clicking a row drills into the underlying data without leaving the dashboard.

Frontend SDK guest-token handshake (React example)

import { DashboardEmbed } from '@databrain/react-sdk';
import { useEffect, useState } from 'react';

function ReconciliationDashboard({ tenantId, userId }: { tenantId: string; userId: string }) {
  const [guestToken, setGuestToken] = useState<string | null>(null);

  useEffect(() => {
    fetch('/api/embed/guest-token', {
      method: 'POST',
      headers: { 'Content-Type': 'application/json' },
      body: JSON.stringify({
        tenantId,
        userId,
        dashboardId: 'reconciliation-status',
        // The fintech's backend exchanges this for a Databrain-signed JWT
      }),
    })
      .then(r => r.json())
      .then(data => setGuestToken(data.guestToken));
  }, [tenantId, userId]);

  if (!guestToken) return <div>Loading…</div>;

  return (
    <DashboardEmbed
      dashboardId="reconciliation-status"
      guestToken={guestToken}
      theme="zenstatement-light"
      onError={(err) => console.error('Dashboard render error', err)}
    />
  );
}

The fintech's backend mints the guest token (server-side, master-API-key-authenticated). The frontend never sees the master key. The guest token is short-lived (60 minutes) and scoped to the specific dashboard + tenant context.

Architecture Pattern: Zenstatement

Zenstatement implements exactly the layered architecture described above for retail and F&B finance teams. Their data ingestion layer pulls from sales channels, payment gateways, banks, and logistics platforms. Their reconciliation engine handles multi-way matching including fees, refunds, chargebacks, and COD payments. Their tenant-isolation model uses RLS at the database layer + guest tokens at the application layer. The analytics + dashboards + NL-query layers are delivered via Databrain - letting Zenstatement focus engineering on the ingestion and reconciliation logic that's their domain expertise, rather than rebuilding tenant-isolated analytics infrastructure.

The full architecture pattern, including the multi-source reconciliation match logic and the embedded analytics handshake, is the canonical shape for CFO-stack fintech SaaS in 2026.

Operational Concerns

SOC 2 evidence collection

SOC 2 Type II audits typically require 6–12 months of operating evidence. For embedded analytics, the relevant controls are typically: access logging (every dashboard view + export logged with user, tenant, timestamp), change management (deployment audit trail), incident response (alerting + runbook documentation), and vendor management (subprocessor inventory + SLAs). Embedded analytics platforms typically provide audit-log primitives that satisfy the access-logging control natively; in-house builds typically under-implement this and pay the audit-prep cost in subsequent quarters.

GDPR data residency (multi-region routing)

EU-resident customer data is typically required - by enterprise DPAs, if not by GDPR itself - to reside in EU data centers. Three architectural patterns: (1) database-per-region (one warehouse per region, route tenants to their region's warehouse - the most common), (2) row-level region tagging with regional query routing, (3) full database-per-tenant in a customer-chosen region (for enterprise SLA requirements). GDPR Articles 44–49 govern data transfers; the architecture choice is typically driven by enterprise customers' DPA (Data Processing Addendum) requirements.
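A sketch of pattern (1), database-per-region routing. Region names and connection strings are illustrative; the residency assignment is made once at tenant onboarding and every warehouse connection resolves through it:

```typescript
// Route each tenant's analytics queries to the warehouse in its residency
// region. Regions and DSNs are illustrative placeholders.
type Region = 'eu-central' | 'us-east' | 'ap-south';

const WAREHOUSE_DSN: Record<Region, string> = {
  'eu-central': 'postgres://warehouse.eu.internal/analytics',
  'us-east': 'postgres://warehouse.us.internal/analytics',
  'ap-south': 'postgres://warehouse.ap.internal/analytics',
};

function warehouseForTenant(tenantRegion: Region | undefined): string {
  if (!tenantRegion) {
    // fail closed: an unmapped tenant must not silently land in a default region
    throw new Error('tenant has no residency region assigned');
  }
  return WAREHOUSE_DSN[tenantRegion];
}
```

Failing closed on an unassigned region is the important design choice: a "default to US" fallback is exactly the bug that turns into a cross-border-transfer finding during a DPA review.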

ISO 20022 message handling

SWIFT's ISO 20022 migration is in production for cross-border payments and continues rolling forward. ISO 20022 messages carry richer structured data than legacy MT messages (party identifiers, purpose codes, structured remittance information). Analytics architectures that parse only amount-and-timestamp leave the most valuable signal on the floor. Mature fintech analytics layers in 2026 have a dedicated ISO 20022 parser layer that extracts structured fields into dimensional columns in the warehouse.
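A sketch of the extraction step against a pacs.008-style credit-transfer fragment. The regex-based parsing is purely illustrative - production parsers use schema-aware XML tooling, and the handful of tags shown here is a small subset of the message:

```typescript
// Pull a few structured ISO 20022 fields into flat warehouse columns.
// Tag names follow the pacs.008 credit-transfer message.
interface Iso20022Fields {
  endToEndId: string | null;  // <EndToEndId>
  purposeCode: string | null; // <Purp><Cd>
  amount: string | null;      // <IntrBkSttlmAmt>
  currency: string | null;    // Ccy attribute of the amount
}

function extractIso20022Fields(xml: string): Iso20022Fields {
  const tag = (name: string): string | null => {
    const m = xml.match(new RegExp(`<${name}[^>]*>([^<]*)</${name}>`));
    return m ? m[1] : null;
  };
  // amount carries its currency as an attribute, so it needs its own pattern
  const amt = xml.match(/<IntrBkSttlmAmt Ccy="([A-Z]{3})">([^<]*)<\/IntrBkSttlmAmt>/);
  return {
    endToEndId: tag('EndToEndId'),
    purposeCode: tag('Cd'), // naive: matches the first <Cd> anywhere, fine for this sketch
    amount: amt ? amt[2] : null,
    currency: amt ? amt[1] : null,
  };
}
```

Each extracted field lands in its own dimensional column, which is what makes purpose-code or counterparty analytics queryable at all.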

SLA + monitoring

Customer-facing analytics has SLA expectations (typically 99.9% uptime for business-hour dashboard rendering, sub-2-second median dashboard load time, no more than 60-second data-freshness lag). These are observability concerns: dashboard-render latency monitoring (P50, P90, P99), CDC pipeline lag monitoring (event-time vs ingest-time), and embedded-platform vendor SLA tracking (uptime + incident history). Most in-house builds under-instrument observability on the first iteration.
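The P50/P90/P99 computation behind those latency dashboards can be sketched with the nearest-rank percentile method over a window of render-time samples:

```typescript
// Nearest-rank percentile over a window of dashboard-render latency samples.
function percentile(samplesMs: number[], p: number): number {
  if (samplesMs.length === 0) throw new Error('no samples');
  const sorted = [...samplesMs].sort((a, b) => a - b);
  const rank = Math.ceil((p / 100) * sorted.length); // nearest-rank method
  return sorted[Math.min(rank, sorted.length) - 1];
}

function latencyReport(samplesMs: number[]): { p50: number; p90: number; p99: number } {
  return {
    p50: percentile(samplesMs, 50),
    p90: percentile(samplesMs, 90),
    p99: percentile(samplesMs, 99),
  };
}
```

The P99 is the number worth alerting on: averages hide the whale-tenant queries that make a handful of dashboards unusable while the median looks healthy.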

Self-Hosted Deployment Architecture

For fintech customers under banking, lending, insurance, or PCI-DSS-heavy payments regulation, customer-VPC self-hosted deployment is typically a hard requirement - driven by GLBA + FFIEC (US), EU DORA + GDPR Article 44–49 (EU), RBI IT Framework for NBFCs (India), and MAS Technology Risk Management Guidelines (Singapore). The architectural pattern that satisfies these constraints is the control-plane / data-plane split.

Control plane vs data plane

  • Control plane (vendor-managed, runs in vendor's infrastructure): authentication for vendor-side admins, metering / billing, software-update distribution, marketplace UI, customer-success tooling. Holds no customer financial data.
  • Data plane (customer-VPC, runs in the customer's AWS / Azure / GCP account): the embedded SDK runtime, the analytics warehouse, all dashboard rendering, all customer data, all RLS enforcement, all audit logs. No data egress - the customer's data never leaves their VPC.

Vendor software updates flow control-plane → data-plane via signed software bundles. Customer data flows the opposite way (customer ingestion → analytics → customer's own end-users) without ever transiting vendor infrastructure.

What this means for the 6 architecture layers

Each layer from the reference architecture above deploys differently in self-hosted mode:

  • Data ingestion (Plaid / Stripe / bank API) - Customer VPC. Webhooks / CDC connect from external sources directly to the customer's ingestion service.
  • Warehouse (Postgres / Snowflake / ClickHouse) - Customer VPC. The customer owns and operates the database.
  • Reconciliation engine - Customer VPC. The vendor ships the engine as a deployable artifact (Docker image / Helm chart).
  • Analytics layer (embedded SDK) - Customer VPC. The vendor ships the SDK as a containerized service for the customer's K8s cluster.
  • Tenant isolation + RLS - Customer VPC. RLS policies live in the customer's database.
  • Audit log + compliance trail - Customer VPC. Logs land in the customer's chosen tamper-evident storage (S3 + Object Lock, Azure Blob immutable storage, GCS Bucket Lock).

Customer-VPC self-hosted vs BYOC: the operational distinction

  • Customer-VPC self-hosted: vendor manages the lifecycle (software updates, version pinning, patches, monitoring) but the customer owns the infrastructure. Vendor has limited operational access via a vendor-side bastion or via a "break-glass" workflow auditable by the customer.
  • BYOC (Bring Your Own Cloud): customer manages the entire lifecycle; vendor provides only the software bundle + documentation. Full control to the customer; full operational burden on the customer's platform team.

Most regulated fintech customers prefer customer-VPC self-hosted because it preserves vendor-provided lifecycle management while satisfying GLBA / DORA / GDPR / RBI / MAS data-control requirements. BYOC is reserved for the most regulated tier (banking, government-adjacent) where the customer's compliance team won't accept any vendor operational access.

Audit-trail considerations under self-hosted

Audit logs (every dashboard view, export, drill-down, admin action) must be tamper-evident in the customer's storage. Three patterns satisfy this:

  • Object-store immutability: S3 Object Lock (Compliance mode), Azure Blob immutable storage with time-based retention, GCS Bucket Lock. Logs are append-only; no admin (vendor or customer) can delete or modify within the retention window.
  • Append-only database tables: Postgres tables with revoke-DELETE / revoke-UPDATE grants, or columnar warehouses that don't natively support row-level UPDATE (ClickHouse).
  • Hash-chain or blockchain-anchored logs: rare in production fintech but starting to appear for highest-tier banking deployments.

For most fintech embedded analytics, S3 Object Lock + structured JSON log records satisfies SOC 2 Type II, PCI-DSS, and DORA requirements without additional cryptographic infrastructure.
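The hash-chain pattern can be sketched compactly - each record commits to its predecessor's hash, so truncation or in-place edits fail verification on replay:

```typescript
import { createHash } from 'node:crypto';

// Append-only audit log where each record carries the hash of its
// predecessor; any tampering breaks the chain on replay.
interface AuditRecord {
  entry: string;    // structured JSON audit payload
  prevHash: string; // hash of the previous record ('' for the genesis record)
  hash: string;
}

function appendAudit(chain: AuditRecord[], entry: string): AuditRecord[] {
  const prevHash = chain.length ? chain[chain.length - 1].hash : '';
  const hash = createHash('sha256').update(prevHash + entry).digest('hex');
  return [...chain, { entry, prevHash, hash }];
}

function verifyChain(chain: AuditRecord[]): boolean {
  let prev = '';
  for (const rec of chain) {
    const expect = createHash('sha256').update(prev + rec.entry).digest('hex');
    if (rec.prevHash !== prev || rec.hash !== expect) return false;
    prev = rec.hash;
  }
  return true;
}
```

Anchoring the latest chain head in immutable object storage (the S3 Object Lock pattern above) is what upgrades this from tamper-evident-in-theory to tamper-evident-against-an-admin.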

Databrain's self-hosted architecture

Databrain's financial-services product line ships customer-VPC self-hosted with the control-plane / data-plane split described above. Specific guarantees: no user data storage in vendor infrastructure, multi-region routing for jurisdiction-specific deployments, VPC peering for hybrid setups (customer's analytics warehouse in their own VPC, plus existing private databases connected via SSH tunnel), and 4-level multi-tenancy isolation (customer → division → tenant → end-user). The strategy-level deployment-decision framework (when self-hosted is required vs cloud-hosted is acceptable) is in embedded analytics for fintech SaaS.

Tools & Platforms (Technical Comparison)

The fintech embedded analytics tools landscape in 2026 from a technical-fit perspective:

  • Databrain - embedded analytics platform with native multi-tenancy, RLS enforcement at the database layer, white-label SDK, JWT-based guest-token authentication, SOC 2 Type II + PCI-DSS-aligned. Strong fit for fintech SaaS prioritizing speed-to-ship and SOC 2 / PCI evidence cost. Customer base includes Zenstatement and other CFO-stack and finance-tech-adjacent SaaS.
  • Sisense Compose - embedded analytics with deep customization options; longer implementation cycles compared to native-SaaS-first platforms.
  • Embeddable - newer entrant focused on developer-experience for embedded dashboards.
  • Luzmo (formerly Cumul.io, rebranded 2024) - European embedded analytics platform with strong GDPR positioning.
  • In-house build with Recharts / Highcharts + custom backend - viable for Level 2 dashboards (one-size-fits-all generic embedded dashboard); structurally hits the wall at Level 3 (RLS + audit + multi-tenancy enforcement). Most teams that build at Level 2 re-architect to a platform within 12–18 months.

The right choice depends on the maturity level you're targeting and the regulatory exposure of your customer base. For a deeper SaaS-PM-side decision framework, see embedded analytics for fintech SaaS.

Related resources

For complementary fintech analytics resources, see embedded analytics for fintech SaaS, fintech data analytics, multi-tenant analytics, secure embedded analytics with guest tokens, ClickHouse migration guide, embedded data visualization, and fintech data visualization.

About the author

Vishnupriya B is a Data Analyst at Databrain specializing in data visualization, SQL, Python, and data modeling. She works on fintech, procurement, and supply-chain analytics implementations across the Databrain customer base - including the Zenstatement implementation referenced in this guide - and writes about the architectural patterns that separate fintech analytics layers that scale from ones that hit a wall at Series A. Connect on the author page.

Frequently Asked Questions

What's the difference between embedded analytics and a separate analytics product for a fintech SaaS?

Embedded analytics renders inside the fintech product's UI with the fintech's branding, and tenant context is propagated from the fintech's own auth. A separate analytics product requires customers to log into a different system, manage a separate set of credentials, and tolerate a branding mismatch. The friction difference (one product vs two) typically translates to 2–3× higher adoption rates for embedded versus separate.

How do I implement RLS for a multi-tenant fintech analytics layer?

Three-layer pattern. Database layer: row-level security policies on every fact table filtering by tenant ID (Postgres CREATE POLICY ... USING (tenant_id = current_setting('app.current_tenant')), Snowflake ROW ACCESS POLICY, ClickHouse ROW POLICY). Application layer: guest tokens carry the tenant claim, the application sets the database session variable from the verified claim. Infrastructure layer: dedicated clusters for whale tenants over 1% of query volume.
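The database- and application-layer halves of that pattern fit in a few lines. This is a minimal sketch assuming a Postgres fact table named transactions and a psycopg-style (query, params) driver convention; table, variable, and function names are illustrative, not a real schema.

```python
# Database-layer policy this code relies on (run once as a migration):
RLS_POLICY_DDL = """
ALTER TABLE transactions ENABLE ROW LEVEL SECURITY;
CREATE POLICY tenant_isolation ON transactions
    USING (tenant_id = current_setting('app.current_tenant'));
"""

def tenant_scoped_statements(verified_tenant_id: str, query: str) -> list:
    """Statements to run inside ONE transaction for a verified guest token.

    set_config(..., true) scopes the variable to the transaction, so a pooled
    connection can never leak one tenant's context into the next query. The
    tenant_id must come from the verified token claim, never from user input.
    """
    return [
        ("SELECT set_config('app.current_tenant', %s, true)", (verified_tenant_id,)),
        (query, ()),
    ]
```

Note the use of `set_config(...)` rather than a literal `SET` statement: `SET` can't take bind parameters, so `set_config` is what keeps the tenant ID out of string interpolation.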

What database should I use for the historical layer of a fintech analytics product?

Postgres for under ~$5M ARR or ~10M rows per tenant. Past that, columnar OLAP: ClickHouse (cost-leader, the GitLab migration writeup is the canonical case study), Snowflake (operationally easiest, most managed), BigQuery / Databricks (for teams already on GCP / Azure). The hybrid is most common: Postgres for OLTP writes, CDC pipeline into the OLAP layer for analytics.

How do I propagate tenant context through guest tokens?

Backend mints a JWT signed with the platform's private key; the JWT carries a tenant_id claim and is scoped to specific dashboards + roles + a 60-minute expiry. The frontend embeds the dashboard with the JWT in the iframe URL or SDK init payload. The platform verifies the signature on every request, extracts the tenant claim, sets the database session variable, and the RLS policies fire automatically. End-user browsers never see the master API key.
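The mint/verify round trip looks roughly like this. The sketch below uses stdlib-only HS256 (shared secret) for brevity; the asymmetric private-key flow described above swaps the HMAC for an RS256 signature via a JWT library. The tenant_id claim matches the pattern above; the dashboards claim name and TTL default are illustrative assumptions.

```python
import base64
import hashlib
import hmac
import json
import time

def _b64url(data: bytes) -> str:
    return base64.urlsafe_b64encode(data).rstrip(b"=").decode()

def mint_guest_token(secret: bytes, tenant_id: str, dashboards: list, ttl_s: int = 3600) -> str:
    """Backend-side: mint a short-lived guest token carrying the tenant claim."""
    header = _b64url(json.dumps({"alg": "HS256", "typ": "JWT"}).encode())
    payload = _b64url(json.dumps({
        "tenant_id": tenant_id,
        "dashboards": dashboards,          # scope: only these dashboards render
        "exp": int(time.time()) + ttl_s,   # 60-minute expiry by default
    }).encode())
    sig = _b64url(hmac.new(secret, f"{header}.{payload}".encode(), hashlib.sha256).digest())
    return f"{header}.{payload}.{sig}"

def verify_guest_token(secret: bytes, token: str) -> dict:
    """Platform-side: verify signature + expiry on every request.

    The returned tenant_id is what gets written into the database session
    variable, which is what makes the RLS policies fire.
    """
    header, payload, sig = token.split(".")
    expected = _b64url(hmac.new(secret, f"{header}.{payload}".encode(), hashlib.sha256).digest())
    if not hmac.compare_digest(sig, expected):
        raise ValueError("bad signature")
    claims = json.loads(base64.urlsafe_b64decode(payload + "=" * (-len(payload) % 4)))
    if claims["exp"] < time.time():
        raise ValueError("token expired")
    return claims
```

The key security property: the signing secret lives only on the backend and the platform, so a browser holding the token can present it but never forge one with a different tenant_id.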

What's the PCI-DSS impact of adding embedded analytics?

Goal is scope minimization. Tokenize PAN at ingestion (PAN never enters the analytics layer), segregate PCI-scope databases from non-PCI databases, and route only token-or-aggregate data into the customer-facing analytics layer. Done correctly, the embedded analytics layer is typically out of PCI scope entirely. Done incorrectly, the entire analytics layer is in scope and the quarterly audit becomes a six-figure operational cost.

Make analytics your competitive advantage

Get in touch with us and see how Databrain can take your customer-facing analytics to the next level.
