Querying Azure Data Explorer using Power BI

This is the first post in this series, discussing the usage of Power BI to visualize Azure Data Explorer data. You can read the rest of the posts here:

I’ve been at Microsoft for several years now, working as part of the Azure Data Explorer (ADX) team. Although my focus is mainly behind the scenes working on the Control Plane of the service, one of my responsibilities is the Power BI Connector for Azure Data Explorer. The connector is a newer-style M-based connector. Its purpose is to allow you to easily query your ADX cluster, in either Import or Direct Query mode, and generates native Kusto Query Language (KQL) queries. The connector allows you to work mainly in Power BI if that’s your cup-of-tea, but also support more advanced scenarios, in case you’re an Azure Data Explorer / KQL kind of guy.

In this post series I’ll use one of our demo Azure Data Explorer clusters – it contains a database named “Samples”, which contains a “StormEvents” table, which contains a list of storm events that happened, along with their details. Although the table has only around 65K rows (ADX is able to handle billions of rows without any issue), it will serve well for what I have to show you.

Retrieve data from Azure Data Explorer

Let’s start with something simple. I’ll create a new data source, and select “Azure Data Explorer (Kusto)” as my source:

Selecting Azure Data Explorer connector in Power BI

Then, I’ll provide my cluster’s query URL. At this point, I can just press the “Connect” button, and use the user interface to select a database and a table (or function, more on that in a future post). However, it usually gives better results if you provide the database and table or query manually:

Connecting to Azure Data Explorer with Power BI

Notice that I’ve selected to load the data in Direct Query mode. without going into too much details, I’d usually use Import mode for dimension tables, and Direct Query mode for facts. It’s also the recommended way if you have big data, or if you need near-real time data in your report.

The M query generated looks like the following:

let
    Source = AzureDataExplorer.Contents("https://XXX.kusto.windows.net", "Samples", "StormEvents", [MaxRows=null, MaxSize=null, NoTruncate=null, AdditionalSetStatements=null])
in
    Source

I’ll discuss the meaning of the various settings in a future post. Suffice to see that the cluster URL, database name, and the table name I provided were used to invoke the AzureDataExplorer.Contents() method.

Displaying data in the PowerBI report

At this point, the data is ready to be loaded into my report:

Azure Data Explorer table in Power BI

I can now use the data to add some visualizations. I’ll start by adding a couple of filters, one for State, and another for Source – both are dimension columns, which we’ll meet again when we’ll discuss dimension tables:

Create filters when querying Azure Data Explorer using Power BI

Last, I’ll add a pie chart, showing how many events of each type we have:

Power BI Pie-chart based on Azure Data Explorer data

I used the EventType column in the “Legend” area, and the EventId column for the Values area. Notice how EventId is aggregated to count the number of values. I could have used any column in this case.

If you’ll look at the queries being sent to your Azure Data Explorer cluster, you can see that PBI, with the help of the ADX connector, generated something along the following queries (for the 2 filters, and the pie-chart):

StormEvents
| project ["State"]
| summarize by ["State"]
| order by ["State"] asc
| limit 101

StormEvents
| project ["Source"]
| summarize by ["Source"]
| order by ["Source"] asc
| limit 101

StormEvents
| summarize ["a0"]=countif(isnotnull(["EventId"])) by ["EventType"]
| limit 1000001

In the next post, I’ll discuss using a combination of Import and Direct Query mode, to work with dimension values, and improve performance of your reports.