SharePoint Online source connector
The SharePoint Online source connector consists of multiple flows, which are described below. These flows use the MS Graph API v1 and the SharePoint Online REST API.
Graph API
In order to connect to the Graph API you need to register an application using the Azure portal. For more information see MS Graph API v1 authentication.
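As a rough illustration of what the registered application is used for, the sketch below builds an app-only (client credentials) token request against the Microsoft identity platform. It only assembles the URL and form body and sends nothing. The function name `build_token_request` and the sample values are hypothetical; the connector's own implementation is not shown in this document.

```python
from urllib.parse import urlencode


def build_token_request(tenant_id: str, client_id: str, client_secret: str):
    """Return (url, form_body) for an app-only Graph token request.

    Assumption: the standard OAuth 2.0 client credentials flow of the
    Microsoft identity platform (v2.0 endpoint) is used.
    """
    url = f"https://login.microsoftonline.com/{tenant_id}/oauth2/v2.0/token"
    body = urlencode({
        "grant_type": "client_credentials",
        "client_id": client_id,
        "client_secret": client_secret,
        # ".default" requests the application permissions already
        # granted to the app registration.
        "scope": "https://graph.microsoft.com/.default",
    })
    return url, body


# Placeholder values, for illustration only.
url, body = build_token_request("my-tenant-id", "my-client-id", "my-secret")
print(url)
```

The returned body would be sent as an `application/x-www-form-urlencoded` POST; the `access_token` in the JSON response is then passed as a Bearer token on Graph requests.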
The following API permissions are required, added on the Application Permissions tab:
- Sites.Selected for specific sites, or Sites.Read.All for the entire tenant
- User.Read.All
- Group.Read.All (optional, for retrieving groups)
- TermStore.Read.All (optional, for retrieving managed metadata)
When limited site access is given with the Sites.Selected permission, an administrator has to grant the application access to each site. For example, the PnP PowerShell command Grant-PnPAzureADAppSitePermission can be used.
For more information see Updates on controlling app specific access on specific SharePoint sites (Sites.Selected).
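Besides PowerShell, Microsoft Graph exposes a site permissions endpoint that can grant an application access to a single site. The sketch below only builds the request URL and JSON body for a read grant and does not call the API. The helper name `build_site_grant` and all sample identifiers are hypothetical, and the caller needs sufficient administrative rights of its own to create the grant.

```python
import json


def build_site_grant(site_id: str, app_id: str, app_name: str, role: str = "read"):
    """Return (url, json_body) for granting an app access to one site.

    Assumption: the Graph v1.0 "create permission" call on a site is used,
    i.e. POST /sites/{site-id}/permissions.
    """
    url = f"https://graph.microsoft.com/v1.0/sites/{site_id}/permissions"
    payload = {
        "roles": [role],
        "grantedToIdentities": [
            {"application": {"id": app_id, "displayName": app_name}}
        ],
    }
    return url, json.dumps(payload)


# Placeholder identifiers, for illustration only.
grant_url, grant_body = build_site_grant(
    "contoso.sharepoint.com,site-guid,web-guid",
    "my-client-id",
    "SharePoint source connector",
)
print(grant_url)
```

A read role matches the read-only nature of a source connector.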
REST API
When managed metadata (Term Store) is used or when permissions have to be set, access to the SharePoint Online REST API is required. For this purpose permissions have to be granted to the previously registered Graph API application.
On site level
This can be done by going to this page: https://<tenantName>.sharepoint.com/<site/>_layouts/15/appinv.aspx (replace <tenantName> with the name of the tenant and <site/> with the relative site URL).
For the App ID field, use the client id value of the previously created Graph API application. Paste this XML snippet in the App's permission request field:
<AppPermissionRequests AllowAppOnlyPolicy="true">
<AppPermissionRequest
Scope="http://sharepoint/content/sitecollection"
Right="Read"
/>
</AppPermissionRequests>
On tenant level
This can be done by going to this page: https://<tenantName>-admin.sharepoint.com/_layouts/15/appinv.aspx (replace <tenantName> with the name of the tenant).
For the App ID field, use the client id value of the previously created Graph API application. Paste this XML snippet in the App's permission request field:
<AppPermissionRequests AllowAppOnlyPolicy="true">
<AppPermissionRequest Scope="http://sharepoint/content/tenant" Right="Read" />
</AppPermissionRequests>
For more information see Granting access using SharePoint App-Only
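The two appinv.aspx addresses and their permission snippets differ only in host, path, and scope. A minimal sketch of how they could be generated follows; the helper names `appinv_url` and `permission_request_xml` are hypothetical and not part of the connector.

```python
from typing import Optional


def appinv_url(tenant_name: str, site_path: Optional[str] = None) -> str:
    """Build the appinv.aspx URL for a site, or for the tenant if no site is given."""
    if site_path is None:
        # Tenant level: the page lives on the "-admin" host.
        return f"https://{tenant_name}-admin.sharepoint.com/_layouts/15/appinv.aspx"
    # Site level: normalize the relative site URL to "path/" (or "" for the root site).
    relative = site_path.strip("/")
    prefix = f"{relative}/" if relative else ""
    return f"https://{tenant_name}.sharepoint.com/{prefix}_layouts/15/appinv.aspx"


def permission_request_xml(scope: str) -> str:
    """Return the read-only, app-only permission request for the given scope."""
    return (
        '<AppPermissionRequests AllowAppOnlyPolicy="true">\n'
        f'  <AppPermissionRequest Scope="{scope}" Right="Read" />\n'
        "</AppPermissionRequests>"
    )


# "contoso" and "sites/hr" are placeholder values.
print(appinv_url("contoso", "sites/hr"))
print(appinv_url("contoso"))
print(permission_request_xml("http://sharepoint/content/tenant"))
```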
Features
- Exporting content structure
- Exporting users
- Exporting groups
- Exporting user OneDrives
- Exporting content types
- Exporting managed metadata
Rate limit
Every application has its own limits in a tenant, which are based on the number of licenses purchased per organization. In the HTTP request components, the limits are set so that throttling is avoided for the lowest license count tier (0 - 1k), with a safe margin. More info can be found here.
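The preconfigured limits aim to prevent throttling altogether. For context, the sketch below shows how a client might compute a wait time if a request is throttled anyway, assuming the service answers with HTTP 429 or 503 and an optional Retry-After header given in seconds. The function `retry_delay` and its defaults are hypothetical and do not describe the connector's HTTP request components.

```python
def retry_delay(status_code, headers, attempt, base=1.0, cap=60.0):
    """Return seconds to wait before retrying, or None if no retry is needed.

    headers: a plain dict keyed exactly by "Retry-After" (real HTTP
    libraries usually offer case-insensitive header lookups).
    attempt: zero-based retry counter used for the exponential fallback.
    """
    if status_code not in (429, 503):
        return None
    retry_after = headers.get("Retry-After")
    if retry_after is not None:
        try:
            # Honor the server's hint when it is a number of seconds.
            return max(0.0, float(retry_after))
        except ValueError:
            pass  # e.g. an HTTP date; fall through to backoff
    # No usable hint: exponential backoff, capped.
    return min(cap, base * (2 ** attempt))


print(retry_delay(429, {"Retry-After": "12"}, attempt=0))
```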
Flows
SharePoint Online (1. Content)
The first flow exports the tenant content and users.
Settings
mongoConnection
The Mongo connection string including the database name to connect to.
tenantID
The ID of the tenant to connect to.
clientID
The client ID of the application.
clientSecret
The client secret of the application.
rootID
The ID of the root site to start crawling, in the following format: <tenantName>.sharepoint.com,<siteId>,<webId>
To get the siteID and webID, use the following URLs:
- <siteURL>/_api/site/id to get the siteID
- <siteURL>/_api/web/id to get the webID
Setting root as value will retrieve all sites and underlying content directly under the tenant; sites that are in /sites are excluded.
When getAllSites is set to true, rootID is ignored, because the requested root site is already retrieved when getting all sites including sub sites.
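The rootID format and its interaction with getAllSites can be sketched as follows. Both helper names and the sample GUIDs are hypothetical, and the settings are modelled as a plain dict with a boolean getAllSites, which may differ from how the flow stores them.

```python
def build_root_id(tenant_name: str, site_id: str, web_id: str) -> str:
    """Compose a rootID as <tenantName>.sharepoint.com,<siteId>,<webId>."""
    return f"{tenant_name}.sharepoint.com,{site_id},{web_id}"


def crawl_start(settings: dict) -> str:
    """Describe where crawling starts for the given flow settings.

    Mirrors the documented rule: getAllSites takes precedence over rootID.
    """
    if settings.get("getAllSites"):
        return "all sites and sub sites"
    return settings["rootID"]


# Placeholder GUIDs, as returned by <siteURL>/_api/site/id and <siteURL>/_api/web/id.
root_id = build_root_id(
    "contoso",
    "11111111-2222-3333-4444-555555555555",
    "66666666-7777-8888-9999-000000000000",
)
print(crawl_start({"rootID": root_id, "getAllSites": False}))
print(crawl_start({"rootID": root_id, "getAllSites": True}))
```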
getAllSites
When set to true, all sites and sub sites within the tenant are retrieved.
Microsoft Teams are also retrieved when this setting is enabled.
getOneDrives
When set to true, the OneDrives of the users are retrieved.
getGroups
When set to true, the groups are retrieved.
getHiddenLists
When set to true, the hidden lists are retrieved.
Origin
Specifies the origin of the document in the Content Store.
SharePoint Online (2. Content Types & Term Store)
This flow exports the content types for each stored site and document library in the Content Store. Furthermore, for each site it also exports the term store.
Settings
mongoConnection
The Mongo connection string including the database name to connect to.
tenantID
The ID of the tenant to connect to.
tenantName
The name of the tenant to connect to.
clientID
The client ID of the application.
clientSecret
The client secret of the application.
Origin
Specifies the origin of the document in the Content Store.
SharePoint Online (3. Permission Levels)
This flow exports all permission levels of each site in the Content Store. It is only required to run this flow when (custom) permissions are set on objects that are migrated.
Settings
mongoConnection
The Mongo connection string including the database name to connect to.
tenantID
The ID of the tenant to connect to.
tenantName
The name of the tenant to connect to.
clientID
The client ID of the application.
clientSecret
The client secret of the application.
Origin
Specifies the origin of the document in the Content Store.