Disaster recovery

This document provides disaster recovery procedures for the App Store for Intune, including backup strategies, recovery runbooks, and high availability options.

Last updated: May 2026


  1. Recovery Objectives
  2. Built-in Protection (Tier 2)
  3. Recovery Procedures
  4. High Availability Setup (Tier 3)
  5. Backup Verification
  6. Emergency Contacts

Recovery Objectives

Tier 2 (default deployment):

Metric | Target
RTO (Recovery Time Objective) | 1-2 hours
RPO (Recovery Point Objective) | 5 minutes (SQL), 0 minutes (Storage with GRS)
Data Loss Risk | Minimal, geo-redundant backups

Tier 3 (high availability):

Metric | Target
RTO (Recovery Time Objective) | 15-30 minutes
RPO (Recovery Point Objective) | Near-zero with active geo-replication
Data Loss Risk | Near-zero

Built-in Protection (Tier 2)

The default ARM template deployment includes these disaster recovery features:

Azure SQL Database

Feature | Configuration
Automated Backups | Enabled by default
Point-in-Time Restore | 7 days (Basic tier)
Backup Storage | Geo-redundant (cross-region)
Full Backups | Weekly
Differential Backups | Every 12-24 hours
Transaction Log Backups | Every 5-10 minutes

What’s protected:

  • All application data (apps, requests, settings)
  • User configurations and branding
  • Audit logs and history
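
Because point-in-time restore only reaches back through the retention window (7 days on the Basic tier, per the table above), it can help to check a candidate restore time before starting a restore. The following is a minimal sketch, assuming GNU `date`; `within_pitr_window` is a hypothetical helper, not an Azure CLI command.

```shell
# Hypothetical helper (not part of the Azure CLI): succeed if a restore time
# falls inside the point-in-time restore window. Assumes GNU date.
# Usage: within_pitr_window <restore-time-utc> <now-utc> [retention-days]
within_pitr_window() (
  restore_epoch=$(date -u -d "$1" +%s) || exit 2
  now_epoch=$(date -u -d "$2" +%s) || exit 2
  retention_days=${3:-7}   # 7 days is the Basic tier value from the table
  oldest=$((now_epoch - retention_days * 86400))
  [ "$restore_epoch" -ge "$oldest" ] && [ "$restore_epoch" -le "$now_epoch" ]
)
```

The service also reports the earliest restorable time (the `earliestRestoreDate` property in `az sql db show` output), which should be treated as the authoritative value.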

Azure Storage

Feature | Configuration
Redundancy | Geo-Redundant Storage (GRS)
Replication | 6 copies (3 primary + 3 secondary region)
Failover | Customer-initiated (manual); RA-GRS adds read access to the secondary region

What’s protected:

  • WinGet packaging queue messages
  • Package blobs (temporary)

Azure Key Vault

Feature | Configuration
Soft Delete | Enabled (7-day retention)
Purge Protection | Optional (prevents permanent deletion)

What’s protected:

  • Entra ID client secret
  • SQL connection string
  • Storage connection string

Application code

Feature | Protection
Source Code | GitHub repository
Deployment Package | GitHub Releases (immutable)
ARM Template | GitHub repository

Recovery: Redeploy from ARM template. No code backup needed.


Recovery Procedures

Scenario 1: Data deletion

Symptoms: User accidentally deleted requests, apps, or settings.

Recovery steps:

  1. Identify the deletion time from audit logs or user report

  2. Restore SQL Database to point-in-time:

    # Azure CLI
    az sql db restore \
    --resource-group <resource-group> \
    --server <sql-server-name> \
    --name AppRequestPortal \
    --dest-name AppRequestPortal-Restored \
    --time "2026-02-14T10:00:00Z"
  3. Verify restored data by connecting to the restored database

  4. Option A - Full restore: Swap the restored database

    # Rename original (backup)
    az sql db rename --resource-group <rg> --server <server> \
    --name AppRequestPortal --new-name AppRequestPortal-Old
    # Rename restored to production
    az sql db rename --resource-group <rg> --server <server> \
    --name AppRequestPortal-Restored --new-name AppRequestPortal
  5. Option B - Selective restore: Copy specific records from restored DB to production

  6. Restart App Service to reconnect:

    az webapp restart --resource-group <rg> --name <app-name>

Estimated recovery time: 30-60 minutes
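
The `--time` value in step 2 should fall shortly before the deletion identified in step 1. As a sketch under stated assumptions (GNU `date`, and an illustrative 5-minute safety margin), a hypothetical `restore_point` helper could compute it:

```shell
# Hypothetical helper: pick a restore point a few minutes before the
# reported deletion time. Assumes GNU date; the default margin is illustrative.
# Usage: restore_point <deletion-time-utc> [margin-minutes]
restore_point() (
  deleted_epoch=$(date -u -d "$1" +%s) || exit 2
  margin_min=${2:-5}
  date -u -d "@$((deleted_epoch - margin_min * 60))" +"%Y-%m-%dT%H:%M:%SZ"
)

# Example (same placeholders as step 2):
# az sql db restore --resource-group <resource-group> --server <sql-server-name> \
#   --name AppRequestPortal --dest-name AppRequestPortal-Restored \
#   --time "$(restore_point "2026-02-14T10:05:00Z")"
```

Converting to epoch seconds first avoids ambiguity in how relative offsets are parsed next to a time zone suffix.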


Scenario 2: Key Vault secret deleted

Symptoms: Application fails to start, “Key Vault reference failed” errors.

Recovery steps:

  1. Check if secret is soft-deleted:

    az keyvault secret list-deleted --vault-name <vault-name>
  2. Recover the deleted secret:

    az keyvault secret recover \
    --vault-name <vault-name> \
    --name AzureAdClientSecret
  3. If purged (permanently deleted), recreate the secret:

    # Get new client secret from Entra ID App Registration
    az keyvault secret set \
    --vault-name <vault-name> \
    --name AzureAdClientSecret \
    --value "<new-secret-value>"
  4. Restart App Service:

    az webapp restart --resource-group <rg> --name <app-name>

Estimated recovery time: 15-30 minutes
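
The choice between step 2 (recover) and step 3 (recreate) depends on whether the secret is still in the soft-deleted state. A minimal sketch of that check follows; `secret_action` is a hypothetical helper that only reads state, and it assumes any non-zero exit from `az` means "not found" (a permissions error would look the same, so confirm before acting).

```shell
# Hypothetical helper: suggest the next runbook step for a secret.
# Prints "present", "recover" (soft-deleted), or "recreate" (not found).
# Assumption: a non-zero exit from az is treated as "not found".
# Usage: secret_action <vault-name> <secret-name>
secret_action() {
  if az keyvault secret show --vault-name "$1" --name "$2" >/dev/null 2>&1; then
    echo "present"
  elif az keyvault secret show-deleted --vault-name "$1" --name "$2" >/dev/null 2>&1; then
    echo "recover"
  else
    echo "recreate"
  fi
}
```

For example, `secret_action <vault-name> AzureAdClientSecret` printing `recover` would point to step 2.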


Scenario 3: Region failure

Symptoms: All Azure services in primary region unavailable.

Recovery steps:

  1. Create new resource group in secondary region:

    az group create --name apprequest-dr --location westus2
  2. Deploy ARM template to secondary region:

    az deployment group create \
    --resource-group apprequest-dr \
    --template-uri https://raw.githubusercontent.com/powerstacks-corp/app-store-for-intune/main/azuredeploy.json \
    --parameters \
    environmentName=prod \
    apiClientId=<client-id> \
    apiClientSecret=<client-secret> \
    frontendClientId=<frontend-client-id> \
    sqlAdminPassword=<password>
  3. Restore SQL Database from geo-backup:

    # List available geo-backups
    az sql db geo-backup list \
    --resource-group <original-rg> \
    --server <original-server>
    # Restore to new server
    az sql db geo-backup restore \
    --resource-group apprequest-dr \
    --dest-server <new-server-name> \
    --dest-database AppRequestPortal \
    --geo-backup-id <backup-id>
  4. Update Entra ID redirect URIs:

    • Add new App Service URL to Frontend SPA app registration
    • Add new URL to Backend API if needed
  5. Update DNS (if using custom domain):

    • Point DNS to new App Service
    • Request new SSL certificate or migrate existing
  6. Verify application functionality

Estimated recovery time: 2-4 hours
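
For step 6, a simple availability probe can confirm that the new App Service answers before users are redirected. This is a sketch only: `wait_for_http_ok` is a hypothetical helper, and the URL to probe is an assumption because this document does not name a health endpoint.

```shell
# Hypothetical helper: poll a URL until it returns HTTP 200 or attempts run out.
# Usage: wait_for_http_ok <url> [attempts] [delay-seconds]
wait_for_http_ok() (
  url=$1; attempts=${2:-10}; delay=${3:-15}
  i=1
  while [ "$i" -le "$attempts" ]; do
    status=$(curl -s -o /dev/null -w '%{http_code}' --max-time 10 "$url")
    if [ "$status" = "200" ]; then
      echo "OK after $i attempt(s)"
      exit 0
    fi
    i=$((i + 1))
    # Pause between attempts, but not after the last one.
    [ "$i" -le "$attempts" ] && sleep "$delay"
  done
  echo "Not healthy after $attempts attempt(s) (last status: $status)" >&2
  exit 1
)
```

An HTTP 200 only shows that the site responds; sign-in and a test request should still be checked by hand.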


Scenario 4: Storage account failure

Symptoms: WinGet packaging jobs fail, queue errors.

Recovery steps:

  1. For GRS accounts, initiate failover:

    az storage account failover \
    --name <storage-account-name> \
    --resource-group <rg>

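    Note: Failover may take up to 1 hour and makes the secondary region the new primary. After a customer-initiated failover the account is locally redundant (LRS) until geo-redundancy is re-enabled.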

  2. Alternative - Create new storage account:

    az storage account create \
    --name <new-storage-name> \
    --resource-group <rg> \
    --location <location> \
    --sku Standard_GRS
  3. Update Key Vault with new connection string:

    CONNECTION_STRING=$(az storage account show-connection-string \
    --name <new-storage-name> --resource-group <rg> -o tsv)
    az keyvault secret set \
    --vault-name <vault-name> \
    --name StorageConnectionString \
    --value "$CONNECTION_STRING"
  4. Restart App Service:

    az webapp restart --resource-group <rg> --name <app-name>

Estimated recovery time: 1-2 hours
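
Step 1 applies only to geo-redundant accounts, so it is worth checking the account SKU before choosing between failover and a new account. A minimal sketch, where `supports_failover` is a hypothetical helper:

```shell
# Hypothetical helper: succeed if the SKU is geo-redundant, which is the
# precondition for customer-initiated account failover.
# Usage: supports_failover <sku-name>
supports_failover() {
  case "$1" in
    Standard_GRS|Standard_RAGRS|Standard_GZRS|Standard_RAGZRS) return 0 ;;
    *) return 1 ;;
  esac
}

# Example (same placeholders as above):
# sku=$(az storage account show --name <storage-account-name> \
#   --resource-group <rg> --query sku.name -o tsv)
# supports_failover "$sku" && echo "failover possible" || echo "create a new account"
```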


Scenario 5: Application corruption / bad deployment

Symptoms: Application crashes, unexpected behavior after update.

Recovery steps:

  1. Roll back the application package:

    # Point the App Service at the previous version on the package host.
    # Each release is published with a versioned filename in its channel folder, e.g.:
    # https://bi.powerstacks.com/appstoreforintune/bin/latest/AppRequestPortal-1.33.0.zip
    # (preview-channel builds live under .../bin/preview/)
    az webapp config appsettings set \
    --resource-group <rg> \
    --name <app-name> \
    --settings WEBSITE_RUN_FROM_PACKAGE="https://bi.powerstacks.com/appstoreforintune/bin/latest/AppRequestPortal-<version>.zip"
  2. Restart App Service:

    az webapp restart --resource-group <rg> --name <app-name>
  3. Verify the rollback was successful.

Estimated recovery time: 5-15 minutes (add 30-60 minutes if a database restore is required)
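
To avoid typos in the package URL during a rollback, the URL can be built from the version number. The sketch below follows the filename pattern shown in step 1 for the `latest` channel; `package_url` is a hypothetical helper, and the digits-and-dots check is an assumption based on the `1.33.0` example.

```shell
# Hypothetical helper: build the package URL for a release version.
# Accepts only digits separated by single dots (assumed from "1.33.0").
# Usage: package_url <version>
package_url() {
  case "$1" in
    ''|*[!0-9.]*|.*|*.|*..*)
      echo "invalid version: $1" >&2
      return 1 ;;
  esac
  echo "https://bi.powerstacks.com/appstoreforintune/bin/latest/AppRequestPortal-$1.zip"
}
```

The result can be passed to the `WEBSITE_RUN_FROM_PACKAGE` setting in step 1, for example `--settings WEBSITE_RUN_FROM_PACKAGE="$(package_url 1.33.0)"`.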


High Availability Setup (Tier 3)

For organizations requiring higher availability, implement these additional measures manually.

SQL Database active geo-replication

Creates a readable secondary database in another region with continuous replication.

Setup:

  1. Create geo-replica:

    az sql db replica create \
    --resource-group <primary-rg> \
    --server <primary-server> \
    --name AppRequestPortal \
    --partner-server <secondary-server> \
    --partner-resource-group <secondary-rg>
  2. Configure failover group (recommended):

    az sql failover-group create \
    --resource-group <primary-rg> \
    --server <primary-server> \
    --name apprequest-fog \
    --partner-server <secondary-server> \
    --partner-resource-group <secondary-rg> \
    --add-db AppRequestPortal \
    --failover-policy Automatic \
    --grace-period 1
  3. Update connection string to use failover group endpoint:

    Server=tcp:apprequest-fog.database.windows.net,1433;...

Cost: ~$5/month additional (Basic tier replica)

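Recovery time: Once triggered, failover typically completes in 1-2 minutes. With --grace-period 1, automatic failover starts only after the outage has lasted at least 1 hour; a manual failover can be initiated sooner.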
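
The failover group endpoint in step 3 is derived from the group name, so the hostname stays the same after a failover. A small sketch of that naming follows; `fog_endpoint` is a hypothetical helper, and the read-only form uses the `.secondary` listener that failover groups expose.

```shell
# Hypothetical helper: derive listener hostnames from a failover group name.
# Usage: fog_endpoint <failover-group-name> [rw|ro]
fog_endpoint() {
  case "${2:-rw}" in
    rw) echo "$1.database.windows.net" ;;            # read-write listener
    ro) echo "$1.secondary.database.windows.net" ;;  # read-only listener
    *) echo "mode must be rw or ro" >&2; return 1 ;;
  esac
}

# A planned failover test can be run against the secondary server, e.g.:
# az sql failover-group set-primary --name apprequest-fog \
#   --resource-group <secondary-rg> --server <secondary-server>
```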

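Azure Traffic Manager

Routes traffic across App Service instances in multiple regions. With the Priority routing method used below, all traffic goes to the primary endpoint until it is reported unhealthy.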

Setup:

  1. Deploy secondary App Service in another region using ARM template

  2. Create Traffic Manager profile:

    az network traffic-manager profile create \
    --resource-group <rg> \
    --name apprequest-tm \
    --routing-method Priority \
    --unique-dns-name apprequest-portal
  3. Add endpoints:

    # Primary
    az network traffic-manager endpoint create \
    --resource-group <rg> \
    --profile-name apprequest-tm \
    --name primary \
    --type azureEndpoints \
    --target-resource-id <primary-app-id> \
    --priority 1
    # Secondary
    az network traffic-manager endpoint create \
    --resource-group <rg> \
    --profile-name apprequest-tm \
    --name secondary \
    --type azureEndpoints \
    --target-resource-id <secondary-app-id> \
    --priority 2

Cost: $0.75/million queries + secondary App Service ($55/month)
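
To make the effect of the two `--priority` values concrete, the sketch below models Priority routing as "the healthy endpoint with the lowest priority number wins". `pick_endpoint` is a hypothetical illustration of the rule, not how Traffic Manager is implemented.

```shell
# Hypothetical illustration of Priority routing.
# Input on stdin, one endpoint per line: "<name> <priority> <healthy|unhealthy>"
# Prints the chosen endpoint name; fails if no endpoint is healthy.
pick_endpoint() {
  awk '$3 == "healthy" && (best == "" || $2 + 0 < bestp) { best = $1; bestp = $2 + 0 }
       END { if (best == "") exit 1; print best }'
}
```

With both endpoints healthy the primary (priority 1) is chosen; if the primary is unhealthy the secondary (priority 2) takes over.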

Read-access geo-redundant storage (RA-GRS)

Enables read access to the secondary region for faster failover decisions.

Setup:

Change storage redundancy parameter during deployment:

"storageRedundancy": "Standard_RAGRS"

Cost: ~$0.50/month additional over GRS
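
With RA-GRS, the secondary region is read through a separate endpoint whose hostname is the account name with a `-secondary` suffix. A minimal sketch of that naming, where `secondary_endpoint` is a hypothetical helper:

```shell
# Hypothetical helper: build the read-only secondary endpoint of an RA-GRS account.
# Usage: secondary_endpoint <storage-account-name> [blob|queue|table]
secondary_endpoint() {
  case "${2:-blob}" in
    blob|queue|table) echo "https://$1-secondary.${2:-blob}.core.windows.net" ;;
    *) echo "unsupported service: $2" >&2; return 1 ;;
  esac
}
```

Reading from this endpoint during an incident shows whether the secondary copy is reachable before a failover is started.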


Backup Verification

  • SQL Point-in-Time Restore Test

    • Restore database to 24 hours ago
    • Verify data integrity
    • Delete test database
  • Key Vault Recovery Test

    • List soft-deleted secrets
    • Verify recovery capability
    • Document secret expiration dates
  • ARM Template Deployment Test

    • Deploy to test resource group
    • Verify all resources created
    • Delete test deployment
  • Document Recovery Time

    • Record actual time for each recovery step
    • Update estimates if needed
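
For the "Document Recovery Time" item, each test step can be wrapped so that its duration is logged automatically. This is a sketch; `timed_step` and its tab-separated log format are hypothetical.

```shell
# Hypothetical helper: run a command, then append its label, duration in
# seconds, and exit code to a log file. Returns the command's exit code.
# Usage: timed_step <log-file> <label> <command> [args...]
timed_step() {
  local log label start end rc
  log=$1; label=$2; shift 2
  start=$(date +%s)
  "$@"; rc=$?
  end=$(date +%s)
  printf '%s\t%ss\texit=%s\n' "$label" "$((end - start))" "$rc" >> "$log"
  return "$rc"
}

# Example:
# timed_step dr-test.log "restart app" \
#   az webapp restart --resource-group <rg> --name <app-name>
```

The resulting log gives measured times to compare against the estimates in the recovery procedures.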

Configure Azure Monitor alerts for:

Alert | Condition | Action
SQL DTU | > 90% for 15 min | Email admins
App Service Health | Unhealthy | Email + webhook
Key Vault Access Denied | Any | Email security team
Storage Availability | < 99% | Email admins
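
As one example, the SQL DTU alert could be created with the Azure CLI roughly as follows. This is a sketch: `create_dtu_alert` is a hypothetical wrapper, and the alert name, the 5-minute evaluation frequency, and the action group are illustrative assumptions not taken from this document.

```shell
# Hypothetical wrapper: create a metric alert for SQL DTU above 90% averaged
# over a 15-minute window.
# Usage: create_dtu_alert <resource-group> <sql-database-resource-id> <action-group-id>
create_dtu_alert() {
  az monitor metrics alert create \
    --name "sql-dtu-high" \
    --resource-group "$1" \
    --scopes "$2" \
    --condition "avg dtu_consumption_percent > 90" \
    --window-size 15m \
    --evaluation-frequency 5m \
    --action "$3" \
    --description "SQL DTU above 90% for 15 minutes"
}
```

The action group referenced by the third argument is where the "Email admins" action would be configured.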

Emergency Contacts

Role | Contact | Responsibility
Primary Admin | [Your contact] | First responder for incidents
Azure Support | https://portal.azure.com/#blade/Microsoft_Azure_Support | Severity A for production down
Security Team | [Security contact] | Data breach or security incidents

Azure support plans:

Plan | Response Time (Sev A) | Cost
Basic | No SLA | Free
Developer | 8 hours | ~$29/month
Standard | 1 hour | ~$100/month
Professional Direct | 15 minutes | ~$1,000/month

Recommendation: Standard support for production deployments.


Is the application accessible?
├── YES: Can users sign in?
│   ├── YES: Is data correct?
│   │   ├── YES: No DR needed
│   │   └── NO: → Scenario 1 (Data Deletion)
│   └── NO: Key Vault issue? → Scenario 2
└── NO: Is it a single resource or entire region?
    ├── Single resource:
    │   ├── SQL → Restore from backup
    │   ├── Storage → Scenario 4
    │   ├── App Service → Redeploy from ARM template
    │   └── Key Vault → Scenario 2
    └── Entire region: → Scenario 3 (Region Failure)
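
The same decision tree can be written as a small function for use in an incident checklist. This is a sketch; `dr_scenario` and its argument names are hypothetical, and the answers are supplied by the operator.

```shell
# Hypothetical helper: map the decision-tree answers to a runbook entry.
# Usage: dr_scenario <accessible:yes|no> <sign-in:yes|no> <data-ok:yes|no> \
#                    <scope:region|sql|storage|appservice|keyvault>
dr_scenario() {
  if [ "$1" = "yes" ]; then
    if [ "$2" != "yes" ]; then
      echo "Scenario 2 (if Key Vault issue)"
    elif [ "$3" != "yes" ]; then
      echo "Scenario 1 (Data Deletion)"
    else
      echo "No DR needed"
    fi
    return 0
  fi
  case "$4" in
    region)     echo "Scenario 3 (Region Failure)" ;;
    sql)        echo "Restore from backup" ;;
    storage)    echo "Scenario 4" ;;
    appservice) echo "Redeploy from ARM template" ;;
    keyvault)   echo "Scenario 2" ;;
    *) echo "unknown scope: $4" >&2; return 1 ;;
  esac
}
```

For example, `dr_scenario no - - storage` prints `Scenario 4`; the sign-in and data answers are ignored when the application is not accessible.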

After deployment, record these values for emergency use:

Resource | How to Find
Resource Group | Azure Portal → Resource Groups
SQL Server Name | Deployment outputs or Portal
Storage Account Name | Deployment outputs or Portal
Key Vault Name | Deployment outputs or Portal
App Service Name | Deployment outputs or Portal

Store this information securely outside of Azure (e.g., password manager, printed runbook).