I recently had a customer ask about a specific scenario involving Power BI’s On-Prem Data Gateway and thought it would be helpful to clear the air about how the gateway is used and the different types of roles and data connectivity methods involved.
To quote official Microsoft docs, the On-Premises Data Gateway “acts as a bridge providing quick and secure data between on-premises data and the Power BI, Microsoft Flow, Logic Apps, and PowerApps services.” Personally I love that Microsoft uses the term bridge, because I believe that’s the best analogy to describe the tool, and will be the analogy that I follow.
(Side note: the Data Gateway consists of three components – Gateway Cloud Service, Azure Service Bus, and Application Gateway, as seen in the picture below – but for the sake of keeping technical depth limited, I will be not be addressing them individually in this post. See official documentation here if interested in reading more.)
Growing up in Savannah, GA, I’m way too familiar with the Talmadge Bridge that connects Savannah to South Carolina. I imagine architects were involved building the bridge, and some sort of administrative body (for simplicity, let’s say the government) is responsible for owning and maintaining it once it’s built.
That being said, the Talmadge Bridge is simply a connector – a process, so to say – between Georgia and South Carolina; it doesn’t realize its true value until the bridge allows access to vehicles, enabling movement of goods across Savannah and South Carolina with greater ease. Each vehicle is assigned a registrant, but it can be used by many drivers to transport goods with the appropriate permissions.
So let’s take that analogy and apply it to Power BI’s architecture:
- Talmadge Bridge = Data Gateway
- Savannah = Power BI Cloud Service
- South Carolina = On-Prem Network
- Architects = Microsoft
- Government = Gateway Administrators
- Vehicles = data sources
- Goods = data
- Registrant = Stored Credential
- Drivers = Data Source Users
Rewriting the above paragraphs while substituting in the Power BI nomenclature and adjusting grammar to fit, we get:
Growing up in the Power BI Cloud Service, I’m way too familiar with the Data Gateway that connects the Power BI Cloud Service to an On-Prem network. I imagine Microsoft was involved in building the Data Gateway, and some sort of administrative body (Gateway Administrators) is responsible for owning and maintaining it once it’s built.
That being said, the Data Gateway is simply a connector – a process, so to say – between the Power BI Cloud Service and an On-Prem Network; it doesn’t realize its true value until the Data Gateway allows access to data sources, enabling movement of data across the Power BI Cloud Service and an On-Prem Network with greater ease. Each data source is assigned a Stored Credential, but it can be used by many Data Source Users to transport data with the appropriate permissions.
Knowing that, let’s elaborate on some of the terms used:
- Gateway Administrators are users who can administer the Data Gateway itself, and have no impact on or communication with any data source added to the gateway
- A Stored Credential must be designated anytime a data source is created and configured within the data source settings. (Each Stored Credential is encrypted using the Recovery Key stored on the Data Gateway’s machine.) Unless the report/dashboard uses a Live Connection to Analysis Services or DirectQuery with Kerberos for SSO enabled (more info here), all data will be transported across the gateway using this account.
- Data Source Users are simply users who are empowered to publish Reports and Dashboards in the service using the data source. To state the converse, an individual cannot use a data source to publish content unless he/she is added as a Data Source User. (Note: there is no need to create Data Source Users for individuals who are only consuming content.)
Finally, let’s cover some common scenarios and unpack what’s important to consider when configuring a Data Gateway. I typically recommend that, due to Power BI’s high performance query engine and fully featured data experience, customers import data whenever possible, especially when using Power BI Premium. That being said, there are certainly reasons where it makes more sense to utilize DirectQuery, and will defer you to Microsoft’s official documentation to read more, as it explains it much better than I can.
Scenario #1 – Data Import
The simplest real-world scenario to consider is when data is imported into the Power BI Cloud Service. In this case, the Stored Credential is the only account who needs access to the data source. Any report developer will also need to be a Data Source User in order to publish datasets or reports connecting to a data source added to the gateway.
Scenario #2 – DirectQuery without Kerberos SSO
This scenario operates in the same fashion as Data Import. Power BI connects to the data source using the Stored Credential, and any user who needs to publish reports using the data source needs to be added to the list of Data Source Users.
Scenario #3 – DirectQuery with Kerberos SSO
By configuring a Data Gateway with Keberos, the gateway facilitates single sign-on (SSO) using DirectQuery, which it uses to connect to on-prem data sources. In this case, queries execute under the identity of the user interacting with the Power BI Cloud Service, and thus sees precisely the data for which they have permissions in the underlying data source. Per how DirectQuery functions, each cross-filter, slice, sort, and report edit operation can result in queries hitting the data source. Like the above, all report publishers will need to be Data Source Users, but regardless all consumers’ identities will be passed through to the data source.
Scenario #4 – Live Connection to Analysis Services
One important distinction of using Live Connection across a Data Gateway is that the Stored Credential must have Server Admin permissions for the Analysis Service instance. Outside of that, this works similar to scenario #3 in that the effective username (read: the email address used to sign in to Power BI) is passed through the Data Gateway to the on-prem Analysis Services server each time a user interacts with a report inside the Power BI cloud service. Depending on how your organization manages identities, you may need to map Power BI users to Windows accounts in Analysis Services, which is covered in more detail in this article.
That’s it! In my experience, the Data Gateway is a simple process to understand, but complexity can increase drastically as you expand the number of users, data sources, and methods to connect to data. As always, please consult official Microsoft Power BI documentation for overall guidance, and specifically this link to get started on a thorough walkthrough of using the Data Gateway.
One Reply to “Power BI Data Gateway: What is it and who needs access?”
Comments are closed.