Where to start with a data warehouse – a practical guide
If you’re rethinking your fraud or AML stack, you’ll keep bumping into the same question: “Do we need a proper data warehouse before we fix anything else?” For most fast-growing financial institutions, the answer is now yes. This guide is a practical overview of why a modern warehouse matters, how to think about it if you work in fraud or compliance, and where to start if you’re still early in your journey.
1. What is a modern data warehouse (or lakehouse)?
At its core, a data warehouse is a central, governed store for structured data from across your business – built to make analysis, reporting and decision-making easier and more reliable.
Modern cloud warehouses (like Snowflake) add:
• Elastic storage and compute – scale up and down with demand
• Strong security and governance
• Support for many workloads (analytics, reporting, ML, applications) on the same underlying data
A data lakehouse (for example, the Databricks approach) combines elements of a data lake and a warehouse: it keeps large volumes of raw and curated data in one place, but with the reliability, transactions and governance you’d expect from a warehouse.
For you, the labels matter less than the outcomes:
• A single, trusted environment that brings together customer, KYC, transaction, device and case data
• Scalable storage and compute so you can handle spikes in transaction volume or investigative work
• Built-in governance and security that can stand up to regulatory and internal scrutiny
2. Why fraud and AML teams should care
Fraud and AML teams are already “data teams in disguise”.
You’re trying to:
• Detect suspicious behaviour across multiple products and channels
• Build a joined-up picture of customers, merchants and counterparties
• Prove to regulators that decisions are based on complete, accurate data
Without a central warehouse, you typically end up with:
• Fragmented feeds into each vendor system
• Conflicting numbers between risk, finance and product
• Endless Excel workbooks to “fix” gaps before an audit or remediation project
A modern warehouse changes this in three important ways:
a) It unifies data for detection and investigations
A warehouse brings together core financial crime data in one place:
• Customer and account master
• Transactions across cards, payments, wallets and acquiring
• KYC and onboarding information
• Alerts, cases and outcomes
• External reference data (sanctions, PEP lists, device intel, merchant categories)
When this data is scattered across systems instead, compliance teams lose hours wrangling data feeds and Excel workbooks rather than doing the job they actually care about: stopping financial crime. Fraud teams spend their time reconciling feeds, arguing over which number is “right”, and waiting on new data instead of experimenting with sharper detection. With a warehouse, investigations and analytics all point to the same underlying truth.
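As a rough illustration of what "one place" can mean in practice, the sketch below uses Python's built-in sqlite3 module as a stand-in for a real warehouse. The table names, columns and sample rows are all hypothetical; the point is only that customer, transaction and alert data can be answered from a single query once they live side by side.

```python
import sqlite3


def build_demo_warehouse() -> sqlite3.Connection:
    """Create an in-memory database with three hypothetical tables."""
    conn = sqlite3.connect(":memory:")
    conn.executescript(
        """
        CREATE TABLE customers (customer_id TEXT PRIMARY KEY, kyc_risk TEXT);
        CREATE TABLE transactions (
            txn_id TEXT PRIMARY KEY, customer_id TEXT, channel TEXT, amount REAL
        );
        CREATE TABLE alerts (alert_id TEXT PRIMARY KEY, customer_id TEXT, outcome TEXT);

        INSERT INTO customers VALUES ('C1', 'high'), ('C2', 'low');
        INSERT INTO transactions VALUES
            ('T1', 'C1', 'card', 120.0),
            ('T2', 'C1', 'wallet', 80.0),
            ('T3', 'C2', 'card', 40.0);
        INSERT INTO alerts VALUES
            ('A1', 'C1', 'confirmed_fraud'),
            ('A2', 'C1', 'false_positive');
        """
    )
    return conn


# Transactions and alerts are aggregated separately before joining, so that a
# customer with several alerts does not have their transaction total multiplied.
CUSTOMER_VIEW_SQL = """
    SELECT c.customer_id,
           c.kyc_risk,
           COALESCE(t.total_amount, 0) AS total_amount,
           COALESCE(t.channels, 0)     AS channels,
           COALESCE(a.alert_count, 0)  AS alert_count
    FROM customers c
    LEFT JOIN (
        SELECT customer_id, SUM(amount) AS total_amount,
               COUNT(DISTINCT channel) AS channels
        FROM transactions GROUP BY customer_id
    ) t ON t.customer_id = c.customer_id
    LEFT JOIN (
        SELECT customer_id, COUNT(*) AS alert_count
        FROM alerts GROUP BY customer_id
    ) a ON a.customer_id = c.customer_id
    ORDER BY c.customer_id
"""


def customer_view(conn: sqlite3.Connection) -> list:
    """Return one row per customer combining KYC, transaction and alert data."""
    return conn.execute(CUSTOMER_VIEW_SQL).fetchall()


if __name__ == "__main__":
    for row in customer_view(build_demo_warehouse()):
        print(row)
```

On a real platform the same query shape would run against governed warehouse tables rather than an in-memory database, and the joined view would be what investigators and analysts both work from.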
b) It improves governance and regulatory comfort
Regulators increasingly expect firms to show:
• Where data came from
• How it was transformed
• How it fed into a decision, rule or model
A good warehouse makes this easier by centralising:
• Data lineage – how fields flow from source systems to reports and models
• Fine-grained access controls – who can see what, and why
• Audit trails – who changed what, and when
This is exactly the kind of evidence that makes audits and regulatory reviews more straightforward.
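To make the idea of lineage more concrete, here is a minimal sketch that assumes lineage is recorded as a simple mapping from each derived field to the fields it is built from. The field names are invented for illustration, and real platforms expose lineage through their own metadata tooling rather than a hand-written dictionary.

```python
# Hypothetical lineage map: each derived field lists its direct upstream fields.
# Fields that do not appear as keys are treated as source-system fields.
LINEAGE = {
    "report.fraud_loss_gbp": [
        "model.txn_enriched.amount_gbp",
        "model.cases.outcome",
    ],
    "model.txn_enriched.amount_gbp": [
        "raw.card_processor.amount",
        "raw.fx_rates.rate",
    ],
    "model.cases.outcome": ["raw.case_tool.disposition"],
}


def trace_to_sources(field: str, lineage: dict) -> set:
    """Walk upstream from a field until only source-system fields remain.

    Assumes the lineage map contains no cycles.
    """
    upstream = lineage.get(field)
    if not upstream:
        return {field}
    sources = set()
    for parent in upstream:
        sources |= trace_to_sources(parent, lineage)
    return sources


if __name__ == "__main__":
    for source in sorted(trace_to_sources("report.fraud_loss_gbp", LINEAGE)):
        print(source)
```

Being able to answer "which source fields feed this reported number?" in one call is the kind of evidence an auditor typically asks for.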
c) It prepares you for AI-driven fraud and AML
Machine learning and AI agents are only as good as the data you feed them.
A modern warehouse gives you:
• Clean, well-modelled data sets to train and monitor models
• Enough history to understand rare events and long-tail patterns
• A controlled environment to run AI-assisted investigations without copying data into yet another tool
If you want to reduce false positives, test new models safely and use AI to support human analysts, a warehouse (or lakehouse) is the foundation you’ll need.
3. How to know you’re ready for a warehouse
You don’t need to be a Tier-1 bank to justify this. The patterns we see in fast-growing fintechs, processors and acquirers are similar:
• Data is scattered across core banking, card processors, PSPs, CRMs, case tools and spreadsheets
• Different teams produce different numbers for the same metric (fraud loss, chargebacks, SARs)
• Every new tool needs its own data integration project, which stalls delivery
• Investigations and thematic reviews mean manual stitching of CSVs and Excel sheets
• Compliance projects (e.g. new AML rules, KYC refresh) are blocked on data engineering capacity
If two or three of these feel familiar, you’re usually at the point where a warehouse will make fraud/AML projects faster and safer, and continuing without one will compound technical and regulatory risk over the next 12–24 months.
4. Warehouse vs lakehouse: does it matter?
Short answer: less than you think. Both approaches can work very well for fraud and AML as long as you can:
• Bring all relevant data together (customer, transaction, KYC, behavioural, external lists)
• Apply strong governance and security suitable for regulated workloads
• Support both BI and advanced analytics / ML on the same foundation
Snowflake leans towards a highly scalable, fully managed warehouse model and has a strong ecosystem of “powered by Snowflake” applications. Databricks leans towards a lakehouse model built around open formats and unified batch/streaming processing.
For most fast-growing institutions, the more useful questions are:
• Which platform best fits our existing stack and skills?
• Which has the governance posture our regulators and security teams are happiest with?
• Where can we plug in warehouse-native applications (like AML or fraud tooling) without creating yet another data silo?
5. A practical roadmap: getting started in 5 steps
Here’s a simple, risk-friendly way to move from idea to implementation.
Step 1 – Pick 2–3 high-value use cases
Start with fraud/AML problems that really hurt today, for example:
• A single, trusted view of fraud losses and SARs
• Faster investigations across products and channels
• Better backtesting and tuning of existing rules
Make these your first success criteria. They will guide how you design your warehouse and what you load first.
Step 2 – Confirm your warehouse platform
For many organisations, this ends up being Snowflake; for others, it may be Databricks, BigQuery or a similar cloud platform.
The key is that your chosen warehouse:
• Meets your security and compliance requirements
• Can support analytics and ML, not just static reporting
• Has an ecosystem that supports warehouse-native applications
You don’t have to make every decision on day one – but you do need a clear direction of travel.
Step 3 – Define a “minimum viable” financial crime data model
Don’t try to solve every data problem at once. Start by listing the core datasets needed for high-value fraud and AML work:
• Customer and account data
• Transaction streams (cards, payments, wallets, acquiring)
• KYC / onboarding data
• Alerts, cases and outcomes
• Relevant external data
Then decide:
• The level of detail (transaction-level, account-day, customer-product)
• Latency targets (e.g. intraday for monitoring, daily for investigations and reporting)
• Basic data quality checks and ownership
“Good enough and usable” beats “perfect and never delivered”.
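The basic data quality checks mentioned in this step can start very small. The sketch below assumes a hypothetical transaction record with four fields and counts a handful of common problems. The specific checks, including the assumption that amounts are stored as positive values, are illustrative choices and not a recommended standard.

```python
from dataclasses import dataclass
from datetime import datetime, timezone
from typing import Optional


@dataclass(frozen=True)
class Transaction:
    """Hypothetical transaction-level record; field names are assumptions."""
    txn_id: str
    customer_id: Optional[str]
    amount: float
    booked_at: datetime


def quality_issues(rows: list, now: datetime) -> dict:
    """Count simple data quality problems in a batch of transactions."""
    issues = {
        "missing_customer": 0,
        "non_positive_amount": 0,
        "duplicate_txn_id": 0,
        "future_dated": 0,
    }
    seen_ids = set()
    for row in rows:
        if not row.customer_id:
            issues["missing_customer"] += 1
        # Assumes amounts are stored as positive values; adjust if refunds
        # or reversals are represented as negative amounts.
        if row.amount <= 0:
            issues["non_positive_amount"] += 1
        if row.txn_id in seen_ids:
            issues["duplicate_txn_id"] += 1
        seen_ids.add(row.txn_id)
        if row.booked_at > now:
            issues["future_dated"] += 1
    return issues


if __name__ == "__main__":
    now = datetime(2024, 1, 2, tzinfo=timezone.utc)
    batch = [
        Transaction("T1", "C1", 10.0, datetime(2024, 1, 1, tzinfo=timezone.utc)),
        Transaction("T1", None, -5.0, datetime(2024, 1, 3, tzinfo=timezone.utc)),
    ]
    print(quality_issues(batch, now))
```

Assigning each counter to a named owner, and agreeing what count is acceptable, is usually more valuable at this stage than adding more checks.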
Step 4 – Land a first slice of data and prove value
Load a narrow but meaningful slice of data into the warehouse and use it to:
• Rebuild one or two key reports that are painful today
• Run a first set of backtests on existing rules
• Let investigators, fraud and compliance teams explore live data
This is where confidence starts to build – and where you start to see the gap between “Excel-driven” and “warehouse-driven” work.
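A first backtest of an existing rule can be sketched as follows. The rule itself (a high amount combined with a country mismatch), the field names and the threshold values are all hypothetical. The sketch only shows the mechanics: replay a rule over labelled history and count what it would have caught, missed and falsely flagged.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class LabelledTxn:
    """Historical transaction with a known outcome; fields are assumptions."""
    amount: float
    country_mismatch: bool
    confirmed_fraud: bool


def rule_fires(txn: LabelledTxn, amount_threshold: float) -> bool:
    """Hypothetical rule: a large amount combined with a country mismatch."""
    return txn.amount >= amount_threshold and txn.country_mismatch


def backtest(history: list, amount_threshold: float) -> dict:
    """Replay the rule over labelled history and summarise its performance."""
    tp = fp = fn = 0
    for txn in history:
        fired = rule_fires(txn, amount_threshold)
        if fired and txn.confirmed_fraud:
            tp += 1
        elif fired:
            fp += 1
        elif txn.confirmed_fraud:
            fn += 1
    alerts = tp + fp
    frauds = tp + fn
    return {
        "alerts": alerts,
        "true_positives": tp,
        "false_positives": fp,
        "missed_fraud": fn,
        "precision": tp / alerts if alerts else 0.0,
        "recall": tp / frauds if frauds else 0.0,
    }


if __name__ == "__main__":
    history = [
        LabelledTxn(1000.0, True, True),
        LabelledTxn(600.0, True, False),
        LabelledTxn(300.0, True, True),
        LabelledTxn(50.0, False, False),
    ]
    # Compare two candidate thresholds on the same history.
    for threshold in (500.0, 700.0):
        print(threshold, backtest(history, threshold))
```

Running the same replay at several thresholds is a simple way to see the trade-off between false positives and missed fraud before changing a live rule.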
Step 5 – Introduce warehouse-native fraud and AML tooling
Once the basics are working, shift from using the warehouse only for reporting to using it as the engine for financial crime detection and prevention:
• Design and test new scenarios directly on warehouse data
• Use ML and AI to cut false positives and improve detection
• Run investigations, analytics and evidence-gathering without copying data into yet another system
This is where warehouse-native applications – tools that run directly on Snowflake or your chosen platform – become powerful. They let you use the trusted data you already have, instead of building and maintaining yet another set of feeds.
6. Where Fortify fits: warehouse-native fraud and AML
Once you have a strong warehouse in place, you face a choice:
• Continue to plug in black-box vendor tools that require their own data pipelines, or
• Move towards warehouse-native applications that live inside (or directly on top of) your Snowflake environment
We believe warehouse-native is the future for fraud and AML:
• No extra data plumbing – applications work on the trusted data you already have
• Aligned metrics – risk, finance and product teams see the same numbers
• Better governance – access control and lineage stay inside your data platform
• Faster iteration – new rules and models can be tested and deployed quickly on full-history data
Fortify AML is being built specifically for this world: warehouse-native AML for institutions that already see Snowflake as their central nervous system for data.
7. If the symptoms we describe in this article feel familiar, your warehouse journey has effectively already started.
The question now is how to turn it into something concrete for fraud and AML.
• If you’re rethinking your fraud or AML stack and already have a warehouse strategy, we’re always happy to compare notes on what warehouse-native could look like in practice for your team, from quick wins to a full roadmap.
• If you’re still early in your warehouse journey, you can use the steps above as your starting plan – and we’re glad to walk through them with you and tailor them to your context.
👉 Get in touch if you’d like to talk through what a modern data warehouse could do for your fraud and AML teams, or how warehouse-native AML on Snowflake might fit into your roadmap.