Using Power BI DirectLake on Microsoft Fabric

I am very excited to publish this blog post and can't wait to start using Microsoft Fabric and all the different components.

The goal of my blog post is to see how you might use Microsoft Fabric from the point of view of a Power BI developer.

Microsoft Fabric is a new all-in-one platform that includes Power BI. It's a single place where you can now complete all of your data requirements without the overhead of having to configure and manage multiple components, and the various components work together smoothly.


As you can see below, this is how Power BI Direct Lake works: you don't need to import the data into Power BI, because the data is read directly from OneLake.


In this blog post I'll show you how I made it work.

My approach is that of a Power BI user who is comfortable using a no-code or low-code approach to getting things done.

In my example below, data is being inserted into an Azure SQL database every minute.
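
To make the setup concrete, below is a minimal sketch of what such a feed could look like. The table name tb_TridentData matches the table used later in this post, but the columns (EventTime, Reading) and the connection details are my own illustrative assumptions.

```python
# Minimal sketch of a feed inserting one row per minute into Azure SQL.
# The schema and connection string are assumptions for illustration only.
import random
import time
from datetime import datetime, timezone

import pyodbc

conn = pyodbc.connect(
    "DRIVER={ODBC Driver 18 for SQL Server};"
    "SERVER=yourserver.database.windows.net;"  # placeholder server
    "DATABASE=yourdb;UID=youruser;PWD=yourpassword"
)
cursor = conn.cursor()

while True:
    # One reading per minute, matching the cadence described above.
    cursor.execute(
        "INSERT INTO tb_TridentData (EventTime, Reading) VALUES (?, ?)",
        datetime.now(timezone.utc), random.uniform(0, 100),
    )
    conn.commit()
    time.sleep(60)
```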

Then I'll show you how I use the new features in Fabric to keep the whole process flowing.

I will use the following components:

  • Lakehouse

    • This is where all the data will be stored.
  • Dataflow Gen2

    • This is the new version of dataflows, which uses Power Query Online.
  • Direct Lake dataset

    • This is the Power BI dataset that reads the data directly from the Lakehouse without having to import it.
  • Power BI report

    • This is where I create my Power BI reports.
  • Data pipeline

    • The data pipeline is what I use to refresh or load my data.
  • Power Automate (outside of Fabric, but still used to refresh the Power BI dataset)

The first step in the process for me is to create a new Lakehouse.

Lakehouse is where all the files and tables will be stored.

When I think of the Lakehouse, I think of Files as source files that are simply landed in their original form from the source, while Tables are the Delta/Parquet tables that have been processed through Dataflow Gen2 or Spark notebooks.
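
For the Spark notebook route, here is a minimal sketch of that Files-to-Tables flow in a Fabric notebook. The source file path, the columns, and the light transformation are assumptions; the spark session is provided by the notebook.

```python
# Minimal sketch of the Files -> Tables flow in a Fabric Spark notebook.
# Paths and columns are illustrative; `spark` is provided by the notebook.
from pyspark.sql import functions as F

# Read a raw source file landed as-is in the Lakehouse Files area.
raw = spark.read.option("header", True).csv("Files/raw/trident_data.csv")

# Light processing, then save as a Delta table in the Tables area,
# which is the format Direct Lake datasets read from.
processed = raw.withColumn("LoadedAt", F.current_timestamp())
processed.write.format("delta").mode("overwrite").saveAsTable("tb_TridentData")
```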

I navigated to my app workspace, which is now configured as a Microsoft Fabric app workspace.

In the bottom left corner I clicked on the Power BI icon and selected Data Engineering.


Then at the top, under New, I clicked on Lakehouse.


I then entered the name LH_FourMoo and clicked Create.


Once set up I was able to see my Lakehouse as shown below.


I had now created my Lakehouse.

The next step was to create my Dataflow Gen2, which would pull data from my Azure SQL database and load it directly into a Lakehouse table.

I clicked on New Dataflow Gen2.

You will notice that this is very similar to dataflows in Power BI.

I then clicked Import from SQL Server.


I then entered all the necessary credentials for the server and database and clicked Login to authenticate. Once I was logged in, I clicked Next.


I then browsed to my "tb_TridentData" table, selected it, and clicked Create.


I then added a new column called "Blog Post" to Dataflow Gen2 as shown below.


Before proceeding to load data into Lakehouse, all columns must have a defined data type.

If they don't, as was the case with my new column "Blog Post", the load fails: as shown below, a column with the Any data type cannot be loaded into the Lakehouse.


I went to my column called "Blog Post" and set the data type to Text as shown below.


Another thing to note is that column names in tables going to the Lakehouse can't contain spaces, so I also had to change the column name to Blog_Post. If it still had a space, you would see the same screen as above, where the column would not be part of the table loaded into the Lakehouse.
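
As an aside, if you were doing this step in a Spark notebook rather than in the Dataflow Gen2 UI, the same two fixes could look like the minimal sketch below: give the new column an explicit type and strip spaces from column names before writing. The literal value and the target table name are assumptions.

```python
# Minimal sketch of the same fixes in PySpark: explicit column type,
# no spaces in column names. Values and names are illustrative.
from pyspark.sql import functions as F

df = spark.table("tb_TridentData")

# Add the new column with an explicit string type (never the Any type).
df = df.withColumn("Blog_Post", F.lit("Using Power BI DirectLake").cast("string"))

# Replace spaces in any remaining column names before the Delta write.
for name in df.columns:
    if " " in name:
        df = df.withColumnRenamed(name, name.replace(" ", "_"))

df.write.format("delta").mode("overwrite").saveAsTable("tb_TridentData_clean")
```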


The next step was to add a data destination, which is where I want the data to go.

In my example I want to put this data in my Lakehouse named "LH_FourMoo". To do that, on the ribbon I clicked Home, then Add data destination, and selected Lakehouse as shown below.


It should then detect that it will be a Lakehouse connection as shown below. Then I clicked Next.


Next, I needed to select the Lakehouse location where I wanted to create my new Lakehouse table. As shown below, I selected the Lakehouse that I created earlier.

I also chose to create a new table and gave it the table name "tb_TridentData".


Then I clicked Next.

On the "Select target options" screen, I then selected the update method to replace and in the column mapping I could see that all my columns would be mapped.


I then clicked Save settings, which saved all my data destination details.

Now I could see the details of the data destination at the bottom right of my Dataflow Gen2.


Then I clicked Publish to publish my Dataflow Gen2.


By default, publishing saves the dataflow with a default name, as shown below.


Once I clicked Publish, I also had to wait for the publish of the dataflow to complete.

NOTE: This is the first publish of the dataflow; no data has been loaded yet.

To change the name of the dataflow, I clicked the three dots next to it and then selected "Properties".


I then renamed it to "DF - Azure SQL Data" and clicked Save.


I then clicked the Refresh now button, which loads the data into my Lakehouse table.


I could then see that my Dataflow Gen2 refresh completed successfully.


To confirm this, I opened my Lakehouse "LH_FourMoo".


Then I could see that my table was in the Lakehouse under Tables, as shown below.


I could also confirm that it is in the required Delta/Parquet format used for Lakehouse tables (note the little triangle on the table icon, plus the _delta_log folder in the file hierarchy behind the table). If it were plain Parquet, we wouldn't be able to take advantage of Direct Lake mode. I verified the Delta file format by clicking the three dots next to "tb_TridentData" and selecting "View table files".


So I was able to see all the files in my table.
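
If you prefer to verify the format from a notebook rather than through the UI, here is a minimal sketch using standard Delta Lake SQL. It assumes the tb_TridentData table created above sits in the notebook's default Lakehouse.

```python
# Minimal sketch: confirm the table really is Delta, not plain Parquet.
# DESCRIBE DETAIL is standard Delta Lake SQL.
detail = spark.sql("DESCRIBE DETAIL tb_TridentData").collect()[0]
print(detail["format"])    # should print "delta"
print(detail["location"])  # folder holding the parquet files + _delta_log
```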


One other thing to note: there's a great new feature where, on the left-hand side, I can see my app workspace and, below that, the Lakehouse that I have open.


If you were to open the SQL endpoint for LH_FourMoo, you would see that it also appears on the left-hand side.
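
As a side note, that SQL endpoint can also be queried from outside Fabric with any SQL Server client. Below is a minimal sketch using pyodbc; the server address is a placeholder you would copy from the SQL endpoint's connection settings.

```python
# Minimal sketch of querying the Lakehouse SQL endpoint from outside Fabric.
# The server name below is a placeholder for your endpoint's address.
import pyodbc

conn = pyodbc.connect(
    "DRIVER={ODBC Driver 18 for SQL Server};"
    "SERVER=your-sql-endpoint.datawarehouse.fabric.microsoft.com;"
    "DATABASE=LH_FourMoo;"
    "Authentication=ActiveDirectoryInteractive;"
)
for row in conn.execute("SELECT TOP 5 * FROM tb_TridentData"):
    print(row)
```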


I had now completed the creation of my Dataflow Gen2, published it, and loaded the data into the Lakehouse in the required Delta/Parquet format.

Now for the most amazing part: creating my Direct Lake dataset, where I don't have to import the data into Power BI but can create my dataset directly from the tables in the Lakehouse.

To do this, I first went to my Lakehouse "LH_FourMoo".


Once in the Lakehouse, I clicked on New Power BI dataset.


I then selected my table "tb_TridentData" and clicked Continue.


Then I was able to see how my dataset was created.


Once it was created, I was able to see the dataset.


The first thing I wanted to do was rename the default dataset "LH_FourMoo (1)" at the top left.

I clicked on the name and updated it as shown below.


To save the new name, I clicked the name at the top again.


The next step for me was to create a Power BI report.

In my dataset, I clicked Create a report and from the dropdown menu chose "Start from scratch".


Since this dataset is not complex, I created a single visual with the details. That will show me whether the data is updating once I put the whole process together. Here's what the Power BI report looks like.


I then saved my report as shown below.


The final piece was to create a data pipeline that would refresh the data via my Dataflow Gen2 and then refresh the Power BI dataset as well.

Although the Power BI dataset uses Direct Lake, I still need to refresh the dataset; this performs a metadata refresh so that the report reads the refreshed data directly from the parquet files.
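
The Power Automate flow I use below is one way to trigger that refresh; under the hood it calls the Power BI REST API. For reference, here is a minimal sketch of the same refresh done directly against the REST API. The workspace ID, dataset ID, and access token are placeholders you would supply (for example via an Azure AD app registration).

```python
# Minimal sketch of a dataset refresh via the Power BI REST API.
# IDs and token are placeholders; the endpoint itself is the real API.
import requests

ACCESS_TOKEN = "<AAD token with Dataset.ReadWrite.All scope>"
WORKSPACE_ID = "<workspace guid>"
DATASET_ID = "<dataset guid>"

url = (
    f"https://api.powerbi.com/v1.0/myorg/groups/{WORKSPACE_ID}"
    f"/datasets/{DATASET_ID}/refreshes"
)
resp = requests.post(url, headers={"Authorization": f"Bearer {ACCESS_TOKEN}"})
resp.raise_for_status()  # 202 Accepted means the refresh was queued
```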

Before completing my data pipeline, I had already created a Power Automate flow that refreshes the Power BI dataset, starting with the "When a HTTP request is received" trigger and then refreshing the Power BI dataset.

Below is an overview of what I created in Power Automate. I won't go into detail on how this is done, as I feel it's beyond the scope of this blog post.


On the bottom left I made sure I had selected "Data Engineering".


Then I clicked on Data pipeline.


When the new pipeline window appeared, I entered the name "DP - Azure SQL" and clicked Create.


I then clicked Add Pipeline Activity.


Then I chose Dataflow.


On the General tab, I renamed "Dataflow 1" to "DF - Azure SQL" and left all the other default values.


I then clicked on Settings, clicked on the drop down menu and selected "DF - Azure SQL Data" as shown below.


I then clicked Activities on the ribbon and selected "Web"


In the General settings I named it "PA - Update URL" and left the other settings at their defaults.


I then clicked Settings, pasted in the HTTPS URL of my Power Automate flow that I showed earlier, selected the "GET" method, and left the rest of the defaults.
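
For what it's worth, the Web activity is doing nothing more than an HTTP GET against the flow's trigger URL, so you can also test the flow outside the pipeline with a couple of lines. The URL below is a placeholder for the one Power Automate generates.

```python
# Minimal sketch: trigger the Power Automate flow by hand, exactly as
# the pipeline's Web activity does. Replace the placeholder URL.
import requests

FLOW_URL = "https://<your-flow-trigger-url>"
resp = requests.get(FLOW_URL)
print(resp.status_code)  # 200/202 indicates the flow was triggered
```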


My last step was to drag "On success" from the dataflow activity to the web activity, which means the pipeline will first refresh the dataflow (fetching the fresh data) and then refresh the Power BI dataset (via the Power Automate flow).


I then clicked save to save my pipeline activity.


The last step was to run the pipeline to make sure everything worked as expected, so I clicked Run.


I was then able to see my pipeline details in the output as it ran.


And then when it completed, I was able to confirm it in the output window as below.


Then when I went to my Power BI report, I could see that it updated the report with the details.


In this blog post, I've demonstrated step-by-step how to use Microsoft Fabric to create a Direct Lake Power BI dataset and Power BI report.

I'm really excited about Microsoft Fabric and what it has to offer.

Please let me know if there's any other content you'd like me to blog about and I'll see what I can do.
