AI-powered Power BI data modeling from CRM Knowledge Graph with ABIS
Agentic BI on Power BI data modeling Case — from a raw CRM graph and messy status fields to a clean star schema, dataset plan, and dashboard.

Agentic BI on Power BI data modeling Case — from a raw CRM graph and messy status fields to a clean star schema, dataset plan, and dashboard.
Most Power BI projects don't fail at the visual layer. They fail two steps earlier — in the data nobody verified and the model nobody planned. A dashboard built on an inconsistent status column and overlapping geography fields will look finished and quietly mislead everyone who reads it.
This article walks through a real exercise in AI-powered Agentic Power BI data modeling : taking a raw CRM export, exploring it with an AI agent, finding the data-quality landmines before they reach a report, and planning a clean star schema for a lead-funnel dashboard. The point isn't the tooling. It's the discipline the tooling makes faster — verify the data, then model it, then visualize it, in that order.
The phrase gets thrown around loosely, so it's worth being precise. AI in business intelligence is moving from suggesting visuals to participating in the modeling work itself. At Microsoft Build 2026, Power BI introduced agent capabilities that let developers prompt an AI agent to build and refine semantic models and reports directly, rather than only answering questions about finished ones.
That shift matters because the hardest part of any BI engagement was never writing DAX. It was the unglamorous middle: understanding what's actually in the source, deciding what's a fact and what's a dimension, and cleaning the fields that will become slicers. An AI agent that can inspect a raw graph, count distinct values, and flag inconsistencies compresses days of discovery into minutes — but only if you still apply judgment to what it surfaces. The agent is a confident guesser without business context. The modeling decisions stay human.
So the workflow below is deliberately ordered: explore, verify, model, plan. The AI accelerates each step; it doesn't skip any of them.
The starting point was a dummy CRM dataset loaded as a graph — roughly 7,600 records spanning a single tenant. The funnel-relevant entities were all present:
| Entity | Count | Funnel role |
|---|---|---|
| Lead Person | 203 | Top of funnel |
| Contact | 200 | Engaged / nurtured |
| Account | 145 | Company-level |
| Opportunity | 133 | Qualified pipeline |
| Deal | 149 | Closing stage |
| Campaign | 7 | Acquisition source |
| Activity | 578 | Touchpoints / engagement |
| Support Ticket | 143 | Post-sale health |
| Consent State | 447 | GDPR opt-in/out |
| Risk Signal | 89 | Churn / at-risk flags |
A useful first signal: the source had no pre-modeled tables — no fact/dimension classification, no defined relationships beyond a flat parent-child hierarchy. This is exactly what a typical CRM export looks like. Everything is technically there, but nothing is shaped for analysis yet. Treating that export as if it were a ready-made model is the most common — and most expensive — mistake in a Power BI build.
This is the step teams skip under deadline pressure, and it's the one that pays for itself. Inspecting the actual field values turned up four issues that would have silently broken a dashboard.
Intuition said the funnel stage would be in a column called stage . It wasn't — or rather, stage was contaminated. It mixed two completely different concepts: buyer roles (Champion, Decision Maker, Influencer, Procurement) and actual funnel stages (MQL, SQL, Proposal). The real stage progression lived in the status field instead:
New → MQL → SQL → Discovery → Proposal → Negotiation → Closed Won / Closed Lost
Had we charted stage as the funnel axis — the obvious choice — every conversion number would have been wrong, and no visual would have revealed the error.
The status and stage values carried duplicates that look identical to a human and count as separate categories to Power BI:
Open vs open
Negotiation vs Negotation (a typo persisted across records)
Procurement vs Procurem. (an abbreviation)
Proposal vs proposal
Each pair splits one real category into two bars on a chart. Conversion rates, counts, and any measure grouped by status would be quietly understated.
The value_eur field was populated on only a handful of records. Any "pipeline value" or weighted-forecast measure built on it would be technically valid and practically meaningless. The honest move is to flag the coverage gap to stakeholders rather than present a confident-looking number on top of mostly-empty data.
Geography was split across two overlapping fields. country held clean ISO codes:
| Country | Leads |
|---|---|
| DE (Germany) | 607 |
| NL (Netherlands) | 602 |
| AT (Austria) | 586 |
| CH (Switzerland) | 576 |
| BE (Belgium) | 563 |
But region mixed three different levels in one column — "DACH" (multi-country), "Austria" (country-level), and "Germany North / Germany South" (sub-country). Because records carry both fields, naive totals on region exceed the country totals and confuse anyone reading the report. A clean geographic hierarchy has to be built deliberately, not inherited from the source.
The lesson generalizes: the fields that become your slicers are the fields most likely to be dirty. Verify them first.
With the data understood, the model takes shape. A star schema is still the right target for a lead-funnel dashboard, and it remains the recommended foundation for AI-ready semantic models — clean relationships, predefined measures, consistent naming, and descriptive fields are what let both humans and AI agents query the model without guessing.
Here is the dataset plan for the conversion-focused dashboard.
Fact_Funnel_Stage — one row per funnel entity (Lead Person, Opportunity, Deal), carrying:
entity_id
normalized_status (cleaned in Power Query — see below)
value_eur (with a null-coverage flag)
last_activity_date
foreign keys to each dimension
| Table | Built from | Key fields | Grain |
|---|---|---|---|
Dim_Stage | curated mapping | stage_name, stage_order (1–8), is_won, is_lost | One row per canonical stage |
Dim_Geography | country + region | country_code, country_name, region, super_region | One row per country |
Dim_Source | source_system | source_system | One per source |
Dim_Campaign | Campaign records | campaign_id, name | One per campaign |
Dim_Date | from last_activity_date | date, month, quarter, year | Standard date table |
Dim_Geography is the fix for the grain problem: it maps each country to a single canonical region and super-region, so slicers behave predictably instead of double-counting.
Total Leads = COUNTROWS(Fact_Funnel_Stage)
Closed Won = CALCULATE([Total Leads], Dim_Stage[is_won] = TRUE())
Win Rate = DIVIDE([Closed Won], [Total Leads])
Stage Conversion % — ratio of count at stage n+1 to stage n , using Dim_Stage[stage_order]
Avg Days in Stage — average gap between stage-entry dates
Power Query is where the verification findings turn into enforceable rules. The data preparation layer should, at minimum:
Normalize text — trim, fix casing, and map known variants ( Negotation → Negotiation , Procurem. → Procurement , lowercase open → Open ).
Separate concerns — strip buyer-role values out of the stage logic so only true funnel stages flow into Dim_Stage .
Map stage order — join to the curated Dim_Stage so charts sort by funnel position, not alphabetically.
Flag sparse fields — add a boolean column marking whether value_eur is populated, so report consumers can see coverage at a glance.
Build the geography key — resolve country to a single Dim_Geography row and derive region cleanly.
Do this once, in the query layer, and every downstream visual inherits clean data. Patch it visual-by-visual and you'll be chasing the same typo for the life of the report.
Once the model is honest, the report almost designs itself. The lead-funnel page lands on:
KPI cards: total leads, new leads, closed won, win rate, pipeline value (with the coverage caveat)
A stage funnel using the normalized status progression, with Closed Lost broken out separately
Stage-to-stage conversion, color-coded so weak transitions are obvious
Average days in stage, to expose where deals stall
A geographic page driven by Dim_Geography — a country choropleth plus a region breakdown that finally totals correctly
None of these visuals are clever. They're just trustworthy, because the three steps before them did the real work.
The takeaways travel well beyond this one dataset:
Verify before you visualize. The dirtiest fields are usually the ones you most want to slice by. Inspect distinct values early.
Don't trust field names. A column called stage held buyer roles; the real stages were in status . Read the data, not the labels.
Model deliberately. A star schema with clean dimensions and named measures is what makes a report maintainable — and what makes it usable by the AI agents now entering the BI stack.
Let AI accelerate, not decide. An agent can count, classify, and flag inconsistencies in minutes. The modeling judgment — what's a fact, how geography rolls up, which measures matter — stays with you.
AI-powered Power BI data modeling isn't about handing the build to a machine. It's about spending your saved time on the decisions that actually determine whether the dashboard is right.
DBI Analytics GmbH provides Power BI consulting and AI-assisted analytics, helping teams turn raw operational data into governed, decision-ready models. dbi-analytics.de