Azure Databricks managed Service Principal and PAT automation
In this guide, we walk through how we set up Azure Databricks Service Principal token automation for FFA TITAN 2.0. Due to a limitation in Azure Databricks, we cannot use the databricks_obo_token
Terraform resource: it only supports Databricks on AWS. That is why we need a separate PowerShell script to generate a Personal Access Token (PAT) and store it securely in Azure Key Vault. This process ensures secure authentication and a consistent, repeatable deployment workflow.
A Databricks Managed Service Principal is a service identity created and managed within Databricks, rather than externally in Microsoft Entra ID or another identity provider.
A service principal is used for automated access to Databricks workspaces and their resources, such as clusters, workflows, jobs, and Unity Catalog objects.
Microsoft Entra ID Service Principal (SPN) tokens are short-lived, typically valid for only 1 hour (3600 seconds), requiring frequent re-authentication. Databricks PATs, on the other hand, can be long-lived (e.g., valid for months), making them more practical for automated tasks that need persistent access over a longer period, without repeated round trips to an identity provider.
| Scenario | Use PAT | Use Microsoft Entra ID token |
| --- | --- | --- |
| Long-running jobs | ✅ | ❌ (tokens expire quickly) |
| API authentication | ✅ | ❌ (some APIs require PATs) |
| Temporary authentication | ❌ | ✅ (short-lived token is fine) |
| Automated scripts (CI/CD) | ✅ | ❌ (avoids refresh logic) |
| One-time API access | ❌ | ✅ (short-lived is fine) |
| Service Principal in Databricks | ✅ | ❌ (SPN cannot log in with OAuth) |
In our case we are running an application that requires persistent access to the Azure Databricks workspace instance. Hence, we need a managed Databricks Service Principal token automation process.
Terraform does not natively support generating PATs for Azure Databricks service principals via the databricks_obo_token
resource; it only works with Databricks on AWS. Without a workaround for Azure Databricks, this would require manual token creation, increasing security risk and deployment complexity.
To overcome this challenge, we combine Terraform for infrastructure provisioning with a PowerShell script to generate and store the PAT securely. This ensures a fully automated and repeatable setup whenever we need it.
We want a Databricks managed Service Principal to connect securely and privately to our Databricks workspace through its PAT, enabling it to query external tables in Unity Catalog or perform other actions (based on permissions) without human interaction. Before we start, let's clarify which FFA TITAN 2.0 platform components are involved in reaching our goal:
| # | Platform component | Location | Public access |
| --- | --- | --- | --- |
| 1 | FFA Titan workspace instance | FFA Titan Azure VNET | Disabled |
| 2 | Azure Databricks Service Principal | Databricks account level | Disabled |
To achieve secure and private connectivity between [1] and [2] we need to achieve the following in our Infrastructure-as-Code:
| # | IaC goal | Language |
| --- | --- | --- |
| 1 | Create Databricks Service Principal in Databricks account | terraform |
| 2 | Generate a secret for the Databricks Service Principal | terraform |
| 3 | Store the secret in Azure Key Vault | terraform |
| 4 | Add the Databricks Service Principal to the Databricks workspace group | terraform |
| 5 | Assign permissions to allow the Databricks Service Principal to generate a PAT | terraform |
| 6 | Execute a PowerShell script to generate a PAT | terraform |
This step creates a Databricks managed Service Principal at the Databricks account level.
resource "databricks_service_principal" "ffa_titan_databricks_asktitan_spn" {
provider = databricks.accounts
display_name = "${var.titan_prefix}${var.customer_abb}${var.environment}-databricks-asktitan-spn"
force = true
}
This step creates a secret for the Databricks managed Service Principal on the Databricks account level.
resource "databricks_service_principal_secret" "ffa_titan_databricks_asktitan_spn_secret" {
provider = databricks.accounts
service_principal_id = databricks_service_principal.ffa_titan_databricks_asktitan_spn.id
}
This step stores the secret created in step 2 in Azure Key Vault for safekeeping and later use.
resource "azurerm_key_vault_secret" "ffa_titan_databricks_asktitan_spn_keyvault_secret" {
name = "ffatitan-databricks-asktitan-spn-secret"
value = databricks_service_principal_secret.ffa_titan_databricks_asktitan_spn_secret.secret
key_vault_id = data.azurerm_key_vault.ds_ffa_titan_key_vault.id
}
This step adds the Databricks managed Service Principal to the respective Databricks workspace group.
resource "databricks_group_member" "ffa_titan_databricks_workspace_spns_group_asktitan_spn_member" {
provider = databricks.accounts
group_id = data.databricks_group.ffa_titan_databricks_workspace_group_spns.id
member_id = databricks_service_principal.ffa_titan_databricks_asktitan_spn.id
}
This step grants the Databricks managed Service Principal the rights to generate a PAT in the workspace.
resource "databricks_permissions" "token_usage" {
provider = databricks.workspace
authorization = "tokens"
access_control {
service_principal_name = databricks_service_principal.ffa_titan_databricks_asktitan_spn.application_id
permission_level = "CAN_USE"
}
}
This step executes the PowerShell script that generates the PAT for the Databricks managed Service Principal (replacing the Terraform databricks_obo_token resource).
# params
param (
[Parameter(Mandatory = $true)][string]$azureKeyVaultName,
[Parameter(Mandatory = $true)][string]$azureKeyVaultSecretName,
[Parameter(Mandatory = $true)][string]$azureDatabricksAccountId,
[Parameter(Mandatory = $true)][string]$azureDatabricksHost,
[Parameter(Mandatory = $true)][string]$azureDatabricksSpnClientId,
[Parameter(Mandatory = $true)][string]$azureDatabricksSpnSecret
)
# get env vars
$azureTenantId = "${env:ARM_TENANT_ID}"
$azurePrincipalAppId = "${env:ARM_CLIENT_ID}"
$azurePrincipalSecret = "${env:ARM_CLIENT_SECRET}"
$azureSubscriptionId = "${env:VAR_AZURESUBSCRIPTIONID}"
# Install required Azure PowerShell modules (Az.Accounts for Connect-AzAccount, Az.KeyVault for secret management)
foreach ($moduleName in @("Az.Accounts", "Az.KeyVault")) {
if (-not (Get-Module $moduleName -ListAvailable)) {
Install-Module -Name $moduleName -Force -Scope CurrentUser
}
}
# Functions
# Function to authenticate to Microsoft Azure with the Terraform Service Principal (SPN)
function Connect-AzServicePrincipal {
param (
[Parameter(Mandatory = $true)][string]$AzurePrincipalAppId,
[Parameter(Mandatory = $true)][string]$AzurePrincipalSecret,
[Parameter(Mandatory = $true)][string]$AzureTenantId,
[Parameter(Mandatory = $true)][string]$AzureSubscriptionId
)
try {
# Create a secure string for the Service Principal's secret
$SecuredPassword = $AzurePrincipalSecret | ConvertTo-SecureString -AsPlainText -Force
# Create a PSCredential object using the App ID and secure password
$Credential = New-Object -TypeName System.Management.Automation.PSCredential -ArgumentList $AzurePrincipalAppId, $SecuredPassword
# Connect to Azure using the Service Principal
Connect-AzAccount -ServicePrincipal -TenantId $AzureTenantId -Credential $Credential
Write-Output "Successfully connected to Azure as Service Principal: $AzurePrincipalAppId"
# Select the specified subscription
Select-AzSubscription -Subscription $AzureSubscriptionId
Write-Output "Successfully selected subscription: $AzureSubscriptionId"
} catch {
Write-Host "Error while connecting to Azure or selecting subscription:" -ForegroundColor Red
Write-Host $_.Exception.Message
}
}
# Function to get a Databricks AccessToken based on Service Principal secret (SPN)
function Get-AccessToken {
param (
[Parameter(Mandatory = $true)][string]$databricksAccountId,
[Parameter(Mandatory = $true)][string]$ClientId,
[Parameter(Mandatory = $true)][string]$ClientSecret,
[Parameter(Mandatory = $false)][string]$Scope = "all-apis" # Default scope
)
# construct TokenEndpointUrl
$TokenEndpointUrl = "https://accounts.azuredatabricks.net/oidc/accounts/${databricksAccountId}/v1/token"
try {
# Encode the client ID and secret for Basic Auth
$EncodedAuth = [Convert]::ToBase64String([Text.Encoding]::UTF8.GetBytes("${ClientId}:${ClientSecret}"))
# Define the body of the POST request
$Body = @{
grant_type = "client_credentials"
scope = $Scope
}
# Perform the POST request
$Response = Invoke-RestMethod -Uri $TokenEndpointUrl -Method Post -Headers @{
Authorization = "Basic $EncodedAuth"
} -Body $Body -ContentType "application/x-www-form-urlencoded"
# Return the access token
return $Response.access_token
} catch {
Write-Host "Error while fetching access token:" -ForegroundColor Red
Write-Host $_.Exception.Message
return $null
}
}
function Create-ServicePrincipalPersonalAccessToken {
param (
[Parameter(Mandatory = $true)][string]$DatabricksHost,
[Parameter(Mandatory = $true)][string]$ServicePrincipalAccessToken,
[Parameter(Mandatory = $false)][int]$LifetimeSeconds = 3600, # Default to 1 hour
[Parameter(Mandatory = $false)][string]$Comment = "Service Principal PAT" # Default comment
)
try {
# Define the headers
$Headers = @{
Authorization = "Bearer $ServicePrincipalAccessToken"
"Content-Type" = "application/json"
}
# Define the body (lifetime_seconds is intentionally omitted here so the PAT
# is long-lived; see the tip at the end of this guide for setting an expiry)
$Body = @{
# lifetime_seconds = $LifetimeSeconds
comment = $Comment
} | ConvertTo-Json -Depth 2
# Make the POST request
$Response = Invoke-RestMethod -Uri "https://$DatabricksHost/api/2.0/token/create" -Method Post -Headers $Headers -Body $Body
# Output the token
Write-Host "Token created successfully:" -ForegroundColor Green
return $Response.token_value
} catch {
Write-Host "Error while creating token:" -ForegroundColor Red
Write-Host $_.Exception.Message
return $null
}
}
# Create Azure KeyVault Secret
function Create-AzKeyVaultSecret {
param (
[Parameter(Mandatory = $true)][string]$AzureKeyVaultName,
[Parameter(Mandatory = $true)][string]$SecretName,
[Parameter(Mandatory = $true)][string]$SecretValue
)
try {
# Get existing Key Vaults
$PresentKV = Get-AzKeyVault
# Check if the Key Vault exists
if ($PresentKV.VaultName -contains $AzureKeyVaultName) {
Write-Output "Key Vault exists: ${AzureKeyVaultName}"
# Get existing secrets from the Key Vault
$PresentSecrets = Get-AzKeyVaultSecret -VaultName $AzureKeyVaultName
# Check if the secret already exists
if ($PresentSecrets.Name -contains $SecretName) {
Write-Output "Secret '${SecretName}' already exists in Key Vault '${AzureKeyVaultName}'. Nothing to do."
} else {
Write-Output "Creating secret '${SecretName}' in Key Vault '${AzureKeyVaultName}'..."
# Create the new secret
$SecureSecret = ConvertTo-SecureString -String $SecretValue -AsPlainText -Force
Set-AzKeyVaultSecret -VaultName $AzureKeyVaultName -Name $SecretName -SecretValue $SecureSecret
Write-Output "=> Finished creating secret '${SecretName}' in Key Vault '${AzureKeyVaultName}'."
}
} else {
# If the Key Vault does not exist
Write-Output "Key Vault '${AzureKeyVaultName}' does not exist. Secret '${SecretName}' cannot be created."
Write-Output "=> !!!Please create the Key Vault '${AzureKeyVaultName}'!!!"
}
} catch {
Write-Host "Error while managing the Key Vault secret: $_" -ForegroundColor Red
}
}
# MAIN LOGIC
# Step 1: Authenticate to Azure for Keyvault action with Terraform Service Principal
Write-Host "Connecting to Microsoft Azure with Terraform Service Principal..." -ForegroundColor Cyan
$AzureConnection = $null
try {
$AzureConnection = Connect-AzServicePrincipal -AzurePrincipalAppId $azurePrincipalAppId `
-AzurePrincipalSecret $azurePrincipalSecret `
-AzureTenantId $azureTenantId `
-AzureSubscriptionId $azureSubscriptionId
if ($AzureConnection) {
Write-Host "Azure connection successful." -ForegroundColor Green
} else {
Write-Host "Failed to connect to Azure. Exiting the script." -ForegroundColor Red
return
}
} catch {
Write-Host "Error during Azure connection:" -ForegroundColor Red
Write-Host $_.Exception.Message
return
}
# Step 2: Retrieve Databricks Access Token
Write-Host "Retrieving Databricks AccessToken for Service Principal ID from Databricks..." -ForegroundColor Cyan
$databricksAccessToken = $null
try {
$databricksAccessToken = Get-AccessToken -databricksAccountId $azureDatabricksAccountId `
-ClientId $azureDatabricksSpnClientId `
-ClientSecret $azureDatabricksSpnSecret
if ($databricksAccessToken) {
Write-Output "Access token successfully retrieved."
} else {
Write-Host "Failed to retrieve Databricks Access Token. Exiting the script." -ForegroundColor Red
return
}
} catch {
Write-Host "Error while retrieving Databricks Access Token:" -ForegroundColor Red
Write-Host $_.Exception.Message
return
}
# Step 3: Create Personal Access Token (PAT) for Databricks Service Principal
Write-Host "Creating a Personal Access Token for the ASK TITAN Service Principal..." -ForegroundColor Cyan
$PersonalAccessToken = $null
try {
$PersonalAccessToken = Create-ServicePrincipalPersonalAccessToken -DatabricksHost $azureDatabricksHost `
-ServicePrincipalAccessToken $databricksAccessToken `
-Comment "pat-ask-titan-created-by-terraform"
if ($PersonalAccessToken) {
Write-Host "Personal Access Token successfully created." -ForegroundColor Green
} else {
Write-Host "Failed to create Personal Access Token. Exiting the script." -ForegroundColor Red
return
}
} catch {
Write-Host "Error while creating Personal Access Token:" -ForegroundColor Red
Write-Host $_.Exception.Message
return
}
# Step 4: Store Personal Access Token in Azure Key Vault
Write-Host "Storing Databricks Personal Access Token for Service Principal as Azure Key Vault secret..." -ForegroundColor Cyan
try {
Create-AzKeyVaultSecret -AzureKeyVaultName $azureKeyVaultName `
-SecretName $azureKeyVaultSecretName `
-SecretValue $PersonalAccessToken
Write-Host "Successfully stored the secret in Azure Key Vault." -ForegroundColor Green
} catch {
Write-Host "Error while storing secret in Azure Key Vault:" -ForegroundColor Red
Write-Host $_.Exception.Message
return
}
Additionally, we have the following null_resource Terraform block to execute the PowerShell script as part of the Terraform run.
resource "null_resource" "ffa_titan_databricks_asktitan_pat" {
provisioner "local-exec" {
command = ".'${path.cwd}/${var.stack}/helper/06_azure_databricks_pat_generation.ps1' -azureKeyVaultName '${data.azurerm_key_vault.ds_ffa_titan_key_vault.name}' -azureKeyVaultSecretName 'ffatitan-databricks-asktitan-spn-pat' -azureDatabricksAccountId '${var.databricks_account_id}' -azureDatabricksHost '${data.azurerm_databricks_workspace.ds_ffa_titan_databricks_workspace.workspace_url}' -azureDatabricksSpnClientId '${databricks_service_principal.ffa_titan_databricks_asktitan_spn.application_id}' -azureDatabricksSpnSecret '${databricks_service_principal_secret.ffa_titan_databricks_asktitan_spn_secret.secret}'"
interpreter = ["pwsh", "-Command"]
}
}
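One caveat: by default, a null_resource provisioner runs only when the resource is first created. If the PAT script should re-run whenever the Service Principal secret is rotated, a triggers map can be added. The sketch below is an assumption-based variation (not part of our stack as shown); it reuses the resource names from above and elides the command for brevity:

```hcl
resource "null_resource" "ffa_titan_databricks_asktitan_pat" {
  # Re-create this resource (and re-run the script) whenever the SPN secret changes
  triggers = {
    spn_secret_id = databricks_service_principal_secret.ffa_titan_databricks_asktitan_spn_secret.id
  }

  provisioner "local-exec" {
    # ... same command and interpreter as above ...
  }
}
```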
By automating the setup of a Databricks Service Principal and PAT using Terraform and PowerShell, we ensure a seamless and secure deployment process. This approach eliminates manual intervention, enhances security, and ensures consistency across different environments.
The PAT is now present in the respective Databricks workspace instance (visible via a Databricks REST API call):
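For illustration, here is a minimal Python sketch of that REST call against the Tokens API (GET /api/2.0/token/list). The helper names, host, and PAT values are placeholders, not part of the Terraform stack:

```python
# Sketch: list the PATs present in a Databricks workspace via the Tokens API.
import json
import urllib.request


def build_token_list_request(databricks_host: str, pat: str) -> urllib.request.Request:
    """Construct the authenticated GET request for /api/2.0/token/list."""
    return urllib.request.Request(
        url=f"https://{databricks_host}/api/2.0/token/list",
        headers={"Authorization": f"Bearer {pat}"},
        method="GET",
    )


def list_workspace_tokens(databricks_host: str, pat: str) -> list:
    """Call the endpoint and return the token_infos array (may be empty)."""
    req = build_token_list_request(databricks_host, pat)
    with urllib.request.urlopen(req) as resp:
        return json.load(resp).get("token_infos", [])
```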
In addition, the Databricks Service Principal's PAT has been safely stored in Azure Key Vault.
We can now safely use the PAT by sourcing it from the Azure Key Vault secret in downstream steps or applications where we want to authenticate to the Databricks workspace instance as the Databricks managed Service Principal.
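As a sketch of such a downstream consumer (assuming the Azure CLI is installed and logged in; the helper names here are illustrative, not part of the stack):

```python
# Sketch: read the PAT from Key Vault via the Azure CLI and build the
# Bearer-token headers expected by the Databricks REST API.
import json
import subprocess


def get_pat_from_key_vault(vault_name: str, secret_name: str) -> str:
    """Fetch a secret value with `az keyvault secret show` (requires az login)."""
    result = subprocess.run(
        ["az", "keyvault", "secret", "show",
         "--vault-name", vault_name, "--name", secret_name, "--output", "json"],
        check=True, capture_output=True, text=True,
    )
    return json.loads(result.stdout)["value"]


def databricks_headers(pat: str) -> dict:
    """Standard auth headers for Databricks REST calls."""
    return {"Authorization": f"Bearer {pat}", "Content-Type": "application/json"}


# Example (requires real infrastructure):
# pat = get_pat_from_key_vault("my-vault", "ffatitan-databricks-asktitan-spn-pat")
# headers = databricks_headers(pat)
```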
Automating this process with Terraform ensures that our platform remains ISO 27001 compliant while leveraging a fully managed SQL environment.
Would you like assistance in setting up a secure Databricks environment? Contact us!
Set an explicit expiration date using the lifetime_seconds parameter. Example API request:

$Body = @{ lifetime_seconds = 7776000; comment = "90-day PAT for SPN" } | ConvertTo-Json
Invoke-RestMethod -Uri "https://$DatabricksHost/api/2.0/token/create" -Method Post `
    -Headers @{ Authorization = "Bearer $accessToken"; "Content-Type" = "application/json" } `
    -Body $Body
Tooling
- Azure Subscription: Active subscription with necessary permissions.
- Azure Databricks: Workspace with the ability to create Service Principals and generate PATs.
- Terraform: Installed locally or in CI/CD, compatible with Azure/Databricks providers.
- Azure CLI: Installed for managing Azure resources.
- PowerShell: Required for executing the PAT generation script.
Variables
- Azure Subscription & Resource Group IDs: Needed for deploying Databricks and Key Vault resources.
- Databricks Workspace URL: Required for API interactions (https://adb-xxxx.x.azuredatabricks.net).
- Azure Key Vault Name: Used to store the Databricks PAT securely.
- Tenant ID, Client ID, Client Secret: Credentials for the Azure Service Principal.
- Environment Variables: Terraform authentication variables (ARM_CLIENT_ID, ARM_TENANT_ID, etc.).
Libraries
- Terraform providers:
  - azurerm: for managing Azure resources.
  - databricks: for configuring Databricks Service Principals.
- Azure PowerShell modules:
  - Az.Accounts: authentication to Azure.
  - Az.KeyVault: managing Key Vault secrets.
Permissions
- Azure RBAC: the Service Principal must have Contributor or Owner access to the Databricks workspace and Key Vault.
- Databricks workspace permissions: must have the CAN_USE permission on token management, and be assigned to a Databricks group with appropriate access.
- Key Vault access: the SPN must have the "Key Vault Secrets User" role.
Networking Setup
- Azure Databricks Workspace Connectivity: Ensure connectivity from Terraform execution environment.
- Azure Key Vault Access: Ensure no firewall restrictions preventing access to secrets.