Fast and convenient connection to SharePoint file in Power BI

A Power Query function that returns the file when providing its SharePoint link: Faster, less verbose and simpler authentification than default SharePoint connector.

A Power Query function that returns the file when providing its SharePoint link. Paste the function in you report, call it with any SharePoint file url.
Et voilà! No more endless waiting at every new editing step, a faster refresh, and only one source settings to manage.
 

Pros

  • Faster than default SharePoint.Files connector
  • Get your file with one simple step: SharePoint_File(YourFileURL)
  • The same function works for all your company SharePoint sites
  • Only one authentification and data source setting to manage for all your files (not by SharePoint site)

Cons

  • Requires hardcoding your tenant/company name in the function to avoid hand-authored queries issue
  • Requires to "Skip test connection" in Data source credentials to enable refresh in Power BI service

How To

  • Paste the function code in a new query and call it "SharePoint_File"
  • Update TenantName (currently "WayneCorp") in the first row of code with your tenant name (your SharePoint url starts with https://TenantName.sharepoint.com)
  • Find your file url : Click File –> Copy path from Excel application or use "People With Existing Access" on SharePoint online sharing menu
  • Get your file in any query by calling SharePoint_File(YourFileURL) in any query
  • Get credentials: choose organisational account. Power BI should only ask it on your first call of the function

The Code

let
  Source = 
    let TenantName    = "WayneCorp", // update it with your tenant name, usually your company name (your sharepoint url starts with TenantName.sharepoint.com)
      BaseUrl         = "https://" & TenantName & ".sharepoint.com/", 
      // SharePoint root site url has to be hardcoded to avoid hand-authored queries that cannot be refreshed in the Power BI service
      // see https://docs.microsoft.com/en-us/power-bi/connect-data/refresh-data#refresh-and-dynamic-data-sources
      BaseUrl_Updated = BaseUrl <> "https://WayneCorp.sharepoint.com/", 
      fn              = (SharePoint_File_url as text) as binary =>
        let
          FileUrl_ValidTenant = Text.StartsWith(SharePoint_File_url, BaseUrl),
          FileUrl_IsSharePoint =  Text.Contains(SharePoint_File_url, ".sharepoint.com/",Comparer.OrdinalIgnoreCase) and Text.StartsWith(SharePoint_File_url, "https://",Comparer.OrdinalIgnoreCase), 
          FileUrl_ValidFormat = FileUrl_IsSharePoint and Text.Contains(SharePoint_File_url, "/Shared%20Documents/",Comparer.OrdinalIgnoreCase), 
          FileUrl_IsValid     = FileUrl_ValidTenant and FileUrl_ValidFormat, 
          Path                = Uri.Parts(SharePoint_File_url)[Path],
          FileUrl_Host        = Uri.Parts(SharePoint_File_url)[Host],
          FileUrl_Tenant      = Text.BeforeDelimiter(Uri.Parts(SharePoint_File_url)[Host],".sharepoint.com"),
          PathList            = List.Buffer(List.Select(Text.Split(Path, "/"), each _ <> "")), 
          PathListEnd         = List.RemoveFirstN(PathList, List.PositionOf(PathList, "sites")), 
          RelativeUrl         = Text.Combine(List.FirstN(PathListEnd, 2), "/") & "/_api/web/getfilebyserverrelativeurl('/" & Text.Combine(PathListEnd, "/") & "')/$value", 
          FileBinary          = Web.Contents(BaseUrl, [RelativePath = RelativeUrl]), 
          LineBreak           = Character.FromNumber(10), 
          output              = 
            if FileUrl_IsValid and BaseUrl_Updated then
              FileBinary
            else if BaseUrl_Updated = false then
              error Error.Record(
                "TenantName not updated", null, 
                LineBreak & "TenantName not updated" & LineBreak & "Function will only work once you update TenantName variable in first row of function code" & LineBreak
                & (if FileUrl_IsSharePoint then "According to link provided, you should update the function code with TenantName=""" & FileUrl_Tenant &"""" else "")
              )
            else if FileUrl_ValidFormat and FileUrl_ValidTenant = false then
              error Error.Record("Url issue", null, LineBreak & "File link is for a file on  " & FileUrl_Host & LineBreak & "But function is setup for " & BaseUrl)
            else if FileUrl_IsValid = false then
              error Error.Record(
                "Url issue", 
                null, 
                 "Wrong SharePoint link" & LineBreak
                  & "The Sharepoint direct link should look like:"& LineBreak
                  & BaseUrl & "sites/SITENAME/Shared%20Documents/FOLDERS/FILENAME"
                  & LineBreak & LineBreak
                  & "Find the correct link:" & LineBreak
                  & "Click File –> Copy path from Excel or Office application" & LineBreak
                  & "Or use ""People With Existing Access"" on SharePoint online sharing menu " & LineBreak 
                  & "(and ask your IT to set it as default SharePoint option: https://docs.microsoft.com/en-us/sharepoint/change-default-sharing-link)"
              )
            else
              null
        in
          output, 
      fnType          = type function (SharePoint_File_url as text) as list
        meta [
          Documentation.Name            = "SharePoint_File", 
          Documentation.LongDescription = "Returns the SharePoint file (binary) when given its direct SharePoint link <br>"
            & (
              if BaseUrl_Updated then
                ""
              else
                "<br>SharePoint root site url has to be hardcoded to avoid hand-authored queries that cannot be refreshed in the Power BI service.<br>"
                  & "<b>Please update TenantName variable in the first row of the function code (in Formula Bar or Advanced Editor).</b><br>"
                  & "Your tenant name is usually your company name (your sharepoint url starts with TenantName.sharepoint.com).<br>"
            )
            & "<br>Refresh in Power Bi service requires to check ""Skip test connection"" in Data source credentials.<br>"
            & "Explanations and updates on https://littlebigfrog.xyz"
        ]
    in
      Value.ReplaceType(fn, fnType)
in
  Source
Code available on GitHub

Behind the scenes, the function will generate the following code:

Web.Contents(
"https://WayneCorp.sharepoint.com/",
[RelativePath=
"sites/Batcave/_api/web/getfilebyserverrelativeurl('/sites/Batcave/Shared Documents/General/Folder/Folder/file.xlsx')/$value"
])

The Performance Story

Connecting Power BI to a SharePoint file the default way is slow, time-consuming, cumbersome, inefficient and generates ugly M code.
There are three methods in Power Query to connect to a SharePoint file:

SharePoint.Files
Default connector used by the GUI, it generates a call to get all the files in the SharePoint and returns a table (technically, one API call by each 1000 files if you check Query Diagnostic). To get the targeted file, you must filter the table on Folder Path column, which is inconveniently the folder url, i.e. too long for convenient filtering in the GUI.

Slow performance seems to confirm there is no query folding, and you can expect one minute execution time for a SharePoint with thousands of files. Just don't be afraid of the Advanced Editor and never use SharePoint.Files again.

Icing on the cake: there is a limit on the number of files the query can return (50,000 items?). Your file may never appear, don't waste a day as I did to investigate why (usually the fact there is 50,000 files on a SharePoint is not your choice).

SharePoint.Contents
According to the documentation: Returns a table containing a row for each folder and document found at the specified SharePoint site. Each row contains properties of the folder or file and a link to its content. Meaning you first get a view of the root folder, and navigate to correct location like:

let
  Source=SharePoint.Contents(SharepointRootUrl)
          {[Name="Shared Documents"]}[Content]
          {[Name="General"]}[Content]
          {[Name="Folder"]}[Content]
          {[Name="Subfolder"]}[Content]
          {[Name="Subfolder"]}[Content]
          {[Name="FileName.xls"]}[Content]
in
  Source

Behind the scenes: more API calls, but faster as they only have to return folder items every time. Few seconds should be maximum execution time.
SharePoint.Contents is also you best move is you want to combine files in a folder. Paste your folder link in the tool to get the code to connect to a folder instead of a specific file.

Important caveat: you must have access to the root folder, i.e. you can't use SharePoint.Contents if you only have access to a folder via sharing.

Web.Contents
This method doesn't use SharePoint connectors, you use the web connector to generate the direct call to SharePoint API. And it's faster, a lot faster.
If you look at query diagnostic, this method is a shortcut: it is in fact the last step of the SharePoint.Files method. You spare Power BI to build a map of your SharePoint by giving him the location of the file.

What's the rush?

One minute refresh isn't an issue when refreshing a report, especially on Power BI service. One minute refresh time in Query Editor is painful but manageable. But just try the following with a crowded SharePoint:

  • Load the same table from the same Excel or csv file in two separate queries, one with SharePoint.Files and one with Web.Contents
  • Filter any value on any column on both queries
  • Pay attention to the time required for the same operation on the same file

You will notice every subsequent step have a noticeable lag when you are using SharePoint.Contents, the same simple operation can takes seconds when it's nearly instant if you connected via Web.Contents.

Why this additional lag? Doesn't Power query cache the same file as query result? Why is this lag only few seconds if the full query is one minute? From my understanding, Power BI uses the cache to recreate all steps of your query, not just the last output. So, to add a new step to your SharePoint.Files query, even a simple filtering, Power BI will re-run all the internal steps but using the cache: building the map of your SharePoint by merging cached chunk lists of files and their details, filtering the map to the desired path and file name, reading the file.

SHAREPOINT.FILES TO CONNECT TO A SHAREPOINT FILE IS INEFFICIENT, SLOW TO REFRESH AND SLOWS DOWN POWER QUERY EDITING EXPERIENCE FOR ALL SUBSEQUENT STEPS.
STOP USING THE GUI TO CONNECT TO A SHAREPOINT FILE TO GET A HEALTHIER EDITING EXPERIENCE.