Bulk Delete assets in Microsoft Purview with Power Automate flow

Posted by

We’re exploring Microsoft Purview and since the scanning of on-premises SQL Servers is free (for a limited time) we decided to scan our test environments, ending up with 1.1 million assets taking up 38 GB of metadata storage and therefore allocating 4 (instead of 1) capacity units. To reduce the costs we wanted to delete all assets of this collection, but it seemed that there was no way to do that in a user-friendly way in the user interface itself (maximum of 25 selected items). Searching for clues (idea 1, idea 2, idea 3, workaround) gave me the impression that it was only doable through the Azure Purview REST API: Bulk Delete.

So I’ve created a Power Automate flow to automate the deletion of assets in Microsoft Purview, although this can be done in the same way in Azure Logic Apps if you’re looking for an Azure-only approach. Here’s the overview of the flow:

Prerequisites:
To get started we need to create a service principal (application) in Azure AD (Entra ID), so we’re able to get an access token to use in calling the Azure Purview REST API. This token is valid for 60 minutes.

There are some challenges to take into account for the flow design:

  1. Keep on deleting batches of 50 assets per request until all (selected) assets are deleted
  2. The expiration of the access token after 60 minutes if the flow is still running.

To be able to take these two hurdles I ended up with a ‘Do until’ control within a ‘Do until’ control.

The flow is manually triggered with two input fields for selecting the relevant assets. In my case I used ‘entityType’ and ‘collectionId’. Next action is to initialize a string variable called ‘continuationToken’.

The first ‘Do until’ control has to run until the continuationToken is empty. The limits are set to 5000 for Count (maximum) and P1D for Timeout (1 day), since the flow might need to run for a long time.

Then we get the access token from Azure AD (Entra ID) with an HTTP request to: https://login.microsoftonline.com/{your-tenant-id}/oauth2/token
Set the client_id and client_secret of the service principal (application) in the body.

The second ‘Do until’ control will have to run until the continuationToken is empty or the access token is expired, so adding the following expression as the condition in advanced mode:

@or(lessOrEquals(int(body('HTTP_|_Get_access_token')?['expires_on']), div(sub(ticks(utcNow()), ticks('1970-01-01Z00:00:00')), 10000000)),empty(variables('continuationToken')))

FYI: the “expires_on” attribute is in Unix timestamp format, so utcNow() is converted to the same format and used in the logical comparison function lessOrEquals.

The limits are set to 5000 for Count (maximum) and P1D for Timeout (1 day).

To get a selection of to be deleted assets, we use the Discovery – Query endpoint of the Azure Purview REST API. The search query is based on this example Discovery_Query_EntityType, combined with this example Discovery_Query_PaginationContinuationPage. You can fine tune the query yourself to get the result you want. See the Type – List endpoint to get a list of all entity types (typeName).
According to the documentation the default value for limit is 50, with a maximum value of 1000, but the flow will give an error when limit is set greater than 50.

{
  "keywords": null,
  "limit": 50,
  "continuationToken": "@{variables('continuationToken')}",
  "filter": {
    "and": [
      {
        "entityType": "@{triggerBody()['text']}",
        "collectionId": "@{triggerBody()['text_1']}",
        "includeSubTypes": false
      }
    ]
  }
}

To call the Bulk Delete endpoint we need to build a request url like this:

DELETE {endpoint}/datamap/api/atlas/v2/entity/bulk?guid=18e06957-e265-967a-07f1-e14e2ab8940f&guid=cc0730ba-9b30-41f0-6953-559d17626d2b

A ‘Select’ and ‘Compose’ action will do the trick to generate the right formatted output.

The last HTTP request will delete the selected assets. And the continuationToken from the search query output is set to the variable, if there are more than 50 results. This will determine if the ‘Do until’ loop is exited or not.

This flow is now ready to remain active for a long time and can refresh the access token so that all selected assets can be deleted. Even millions 😉

In hindsight: is all for nothing?

While I was building and testing the flow, I found out by navigating Microsoft Purview that it is possible to start a bulk delete of assets in the user interface itself. Hahahahah. Well it was fun to build the flow anyway. So if you delete the right entityType the flow probably doesn’t need to run for long. 🤣

See my next blog post about: Bulk delete assets in Microsoft Purview via the user interface.

One comment

Leave a Reply

Your email address will not be published. Required fields are marked *