Connect to Power BI published reports with Python without API
Power BI is a great tool to import and transform data from various sources. Exporting data is however more difficult.
This Python function returns a Pandas DataFrame from Power BI service, given a dataset id and a table name (or Dax query), allowing you for exemple to save Power BI data in a file or in a SQL database.
This method uses Windows identity, also used in Analyse in Excel
, and should be used as a quick and convenient way to extract Power BI data. Power BI rest API execute queries should be preferred if you have Azure and Power BI relevant rights.
Requirements
- Test if
Analyse in Excel
is working for a report you have access to - Analysis Services client library MSOLAP should be version 15 (recent version 16 is not working as today)
- Install pywin32 via
pip install pywin32
orconda install pywin32
- Go to app.powerbi.com/datahub/datasets, select a dataset: end of url is your dataset id
- Update the code your dataset id and table name (list of tables available via
Show tables
button on Power BI dataset page) or use DAX code to generate a custom table with measures
The code
"""Connect to published Power BI datasets and export data with Python
Author:
https://LittleBigFrog.xyz - 13.01.2022
"""
import adodbapi #Install via pip install pywin32 or conda install pywin32
import pandas as pd
def GetData(DatasetId,TableQuery):
ConnectionString = f"""Provider=MSOLAP.8;
Integrated Security=ClaimsToken;
Persist Security Info=True;
Initial Catalog=sobe_wowvirtualserver-{DatasetId};
Data Source=pbiazure://api.powerbi.com;
MDX Compatibility=1;
Safety Options=2;
MDX Missing Member Mode=Error;
Identity Provider=https://login.microsoftonline.com/common, https://analysis.windows.net/powerbi/api, 929d0ec0-7a41-4b1e-bc7c-b754a28bddcc;
Update Isolation Level=2"""
Query = TableQuery if TableQuery.lower().startswith("select") else f"evaluate {TableQuery}"
conn = adodbapi.connect(ConnectionString)
DataFrame=pd.read_sql(Query, conn)
return DataFrame
# Find your DatasetId
# go to https://app.powerbi.com/datahub/datasets
# click on your dataset: end of url is your DatasetId
# click on "Show tables" to see available tables
DatasetId = "ee30a2fc-aezb-4eb8-81d2-d1ec21eec060"
QueryTablesList="Select TABLE_NAME from $SYSTEM.DBSCHEMA_TABLES where TABLE_SCHEMA='MODEL' and TABLE_TYPE='SYSTEM TABLE' "
TableName = "DatasetTableName" #not case sensitive
# List of Dataset tables
AvailableTables=GetData(DatasetId,QueryTablesList)['TABLE_NAME'].tolist()
# Get pandas dataframe for specific table
Data=GetData(DatasetId,TableName)
# Get data from a DAX query
Dax='SUMMARIZE (Sales,Sales[Color],"Sales", SUM ( Sales[Amount] ))'
DataDax=GetData(DatasetId,Dax)
# Export to Excel (requires openpyxl)
Data.to_excel("PowerBIexport.xlsx", index = False)
# Export to csv
Data.to_csv('out.csv')
# Export table to SQL server
from sqlalchemy import create_engine
#Build your connection: https://docs.sqlalchemy.org/en/14/dialects/mssql.html#module-sqlalchemy.dialects.mssql.pyodbc
connection_uri = create_engine(
"mssql+pyodbc://scott:tiger@myhost:49242/databasename"
"?driver=ODBC+Driver+17+for+SQL+Server"
"&authentication=ActiveDirectoryIntegrated"
)
engine = create_engine(connection_uri, fast_executemany=True)
Data.to_sql("SQL_Table_Name", engine, if_exists="replace", index=False)
Notes
- The function uses the connection generated for Power BI
Analyse in Excel
- The Windows identity of the caller is used to connect to Analysis Services (
Integrated Security=ClaimsToken
) - Beauty of the solution is avoiding authentification required for premium XLMA and Power BI REST api (you/IT need to register an Azure AD application and I'm still looking for a clear documentation)
- I'm investigating the issue with latest MSOLAP client library. Meanwhile, previous version 15.1.108.23 is available here: x64 and x86 (thank you Rahul Tenany for the links)
- It was tested on different machines and configurations: more testing and feedback is welcome