Azure Role Deep Dives - Complete Guide

Role Deep Dive: Azure Administrator


Role Overview

Azure Administrators implement, manage, and monitor an organization’s Azure environment. They are the day-to-day operators of Azure infrastructure — provisioning resources, managing identities, configuring networking, and ensuring everything runs smoothly. Think of them as the sysadmin of the cloud.

Alternative Titles: Cloud Administrator, Azure Infrastructure Administrator, Cloud Ops Engineer

Typical Salary Range: $75,000 – $130,000 (US)


Core Responsibilities

1. Identity & Access Management (20% of role)

Granular Tasks: - Create dynamic groups based on user attributes (e.g., department == “Engineering”) - Configure Conditional Access: “Require MFA for all users accessing Azure Management Portal from outside corporate IP range” - Set up PIM: Make User Access Administrator eligible (not permanent), require approval and MFA on activation, 4-hour max duration - Configure named locations in Entra ID for trusted IP ranges - Set up Entra Connect / Cloud Sync for hybrid identity (on-prem AD → Entra ID sync) - Manage password writeback for hybrid environments - Configure Entra ID audit logs → Log Analytics workspace - Create custom roles when built-in roles don’t fit

2. Infrastructure Management (25% of role)

Granular Tasks: - Create VM from marketplace image or custom image - Attach managed disks (Premium SSD, Standard SSD, Ultra Disk) to VMs - Configure VMSS with autoscale rules: scale out when CPU > 70%, scale in when CPU < 30% - Set up Azure Backup: daily backup at 2 AM, retain 30 days, configure vault with geo-redundancy - Apply VM extensions: Custom Script Extension, Azure Monitor Agent, MDE extension - Configure Proximity Placement Groups for latency-sensitive workloads - Enable boot diagnostics and serial console access - Manage Spot VMs for non-critical batch workloads - Resize VMs based on Azure Advisor recommendations

3. Storage Management (15% of role)

Granular Tasks: - Create storage account with appropriate SKU: Standard_GRS for production, Standard_LRS for dev/test - Configure lifecycle management: move blobs to Cool after 30 days, Archive after 90, delete after 365 - Generate SAS tokens with minimal required permissions and short expiry (1 hour for dev, 24 hours max for prod) - Configure storage account firewall: allow only specific VNets and IP ranges - Set up private endpoints for storage accounts - Enable soft delete on blob containers (7-365 day retention) - Configure Azure File Sync: sync on-prem file server to Azure Files, enable cloud tiering - Monitor storage metrics: ingress, egress, availability, latency

4. Virtual Networking (20% of role)

Granular Tasks: - Design VNet address space: 10.0.0.0/16 with subnets: web (/24), app (/24), data (/24), management (/28) - Create NSG rules: allow 443 from Internet to web subnet, allow 8080 from web to app subnet, deny all other inbound - Set up VNet peering between hub and spokes, enable “Allow gateway transit” on hub - Configure Private DNS Zone for internal resolution: link to VNet, enable auto-registration - Deploy Azure Bastion in AzureBastionSubnet (/26 minimum) - Create route table: 0.0.0.0/0 → Azure Firewall for forced tunneling - Configure VPN Gateway: route-based, VpnGw2, with BGP for dynamic routing - Set up Application Gateway: listener on port 443, SSL certificate, backend pool with VMs, WAF enabled in prevention mode - Configure private endpoints for Storage, SQL, Key Vault — create DNS A records in private zone

5. Monitoring, Backup, and Disaster Recovery (15% of role)

Granular Tasks: - Create Log Analytics workspace, configure data retention (30-730 days) - Deploy Azure Monitor Agent (AMA) via Data Collection Rules (DCR) - Create alert rules: CPU > 80% for 5 minutes → email ops team via action group - Configure diagnostic settings: send resource logs to Log Analytics + Storage Account - Set up Azure Site Recovery: replicate VMs to secondary region, configure failover policies, run test failovers quarterly - Create Azure Backup: daily backup, 30-day retention, enable soft delete on vault - Build workbooks for operational dashboards - Configure VM Insights for performance monitoring - Write KQL queries for common investigations: failed logins, resource changes, security events

6. Governance & Compliance (5% of role)


Azure Services Used Daily

Category Services
Compute VMs, VM Scale Sets, Azure Bastion
Storage Blob Storage, Azure Files, Managed Disks
Networking VNet, NSG, VNet Peering, VPN Gateway, Load Balancer, App Gateway, Azure Firewall, DNS, Private Link
Identity Entra ID, PIM, Managed Identities
Monitoring Azure Monitor, Log Analytics, Application Insights, Network Watcher
Backup/DR Recovery Services Vault, Azure Site Recovery, Azure Backup
Governance Azure Policy, Resource Locks, Tags, Management Groups
Management ARM, Azure Portal, CLI, PowerShell

Day-in-the-Life Scenarios

Scenario 1: New Application Onboarding

  1. Receive request: “Deploy infrastructure for new web app”
  2. Create resource group in appropriate subscription
  3. Provision VNet/subnet or use existing
  4. Deploy App Service or VMs based on requirements
  5. Configure NSG rules, private endpoints for backend services
  6. Set up monitoring (App Insights, alerts)
  7. Configure backup
  8. Apply tags (project, environment, cost-center)
  9. Document in runbook

Scenario 2: Security Incident Response

  1. Alert fires: “Unusual sign-in from Russia on admin account”
  2. Check Entra ID sign-in logs → confirm suspicious activity
  3. Disable user account / revoke sessions
  4. Enable Conditional Access: block sign-in from non-approved countries
  5. Review audit logs for actions taken by compromised account
  6. Escalate to security team
  7. Document incident, update policies

Scenario 3: Cost Optimization Sprint

  1. Run Azure Advisor cost recommendations
  2. Identify underutilized VMs → right-size or schedule auto-shutdown
  3. Find unattached managed disks → delete
  4. Review storage accounts → move cool data to Cool/Archive tier
  5. Check for idle public IPs → release
  6. Implement budget alerts for each resource group

Certification Path

Certification Level Focus
AZ-900 Foundational Azure fundamentals (start here)
AZ-104 Associate Core cert for this role — Azure Administrator
AZ-500 Associate Azure Security Engineer (good complement)
AZ-305 Expert Azure Solutions Architect (next step)

AZ-104 Exam Breakdown

Domain Weight
Manage Azure identities and governance 20-25%
Implement and manage storage 15-20%
Deploy and manage Azure compute resources 20-25%
Implement and manage virtual networking 20-25%
Monitor and maintain Azure resources 10-15%

Interview Focus Areas

Must-Know Questions

  1. How do you implement least privilege in Azure? → RBAC at appropriate scope, PIM for admin roles, Conditional Access, just-in-time VM access

  2. How do you recover a deleted VM? → If soft-delete enabled on vault: recover from backup. If not: redeploy from image/template, restore data from backup.

  3. How do you force all outbound traffic through Azure Firewall? → UDR 0.0.0.0/0 → Azure Firewall IP, associate with all subnets

  4. Explain VNet peering and its limitations. → Non-transitive, both sides must agree, charge for cross-peering traffic, gateway transit possible

  5. How do you automate VM deployment? → ARM templates / Bicep / Terraform, custom images in Shared Image Gallery, VM extensions for post-deploy config

  6. Storage account access: key vs SAS vs Entra ID? → Key = full access, avoid in prod. SAS = granular, time-limited. Entra ID = recommended, RBAC-based, audit-logged.

  7. How do you monitor Azure resources at scale? → Azure Monitor + Log Analytics + diagnostic settings on all resources, alert rules with action groups, workbooks for dashboards

  8. How do you handle DR for VMs? → Azure Site Recovery: replicate to paired region, configure RTO/RPO, test failover quarterly, automate failover with runbooks

  9. What’s the difference between Availability Set and Availability Zone? → Set = FD+UD within DC (99.95% SLA). Zone = separate DCs in region (99.99% SLA). Zone is stronger but not all regions support it.

  10. How do you manage secrets for applications? → Key Vault + Managed Identity. App gets token from IMDS, authenticates to Key Vault. No credentials in code or config.


Tools & Scripts

Common Azure CLI Commands

# Create resource group
az group create --name myRG --location eastus

# Create VM
az vm create --resource-group myRG --name myVM --image Ubuntu2204 --size Standard_D2s_v5 --admin-username azureuser --generate-ssh-keys

# Create storage account
az storage account create --name mystorageacct --resource-group myRG --location eastus --sku Standard_GRS --kind StorageV2

# Create NSG rule
az network nsg rule create --resource-group myRG --nsg-name myNSG --name Allow-HTTP --priority 100 --protocol Tcp --destination-port-ranges 80 443 --access Allow

# Enable backup
az backup protection enable-for-vm --resource-group myRG --vault-name myVault --vm myVM --policy-name DefaultPolicy

# List all VMs by size
az vm list --query "[].{Name:name, Size:hardwareProfile.vmSize, RG:resourceGroup}" -o table

Common PowerShell Commands

# Create resource group
New-AzResourceGroup -Name myRG -Location eastus

# Create VM
New-AzVM -ResourceGroupName myRG -Name myVM -Image Ubuntu2204 -Size Standard_D2s_v5

# Get all public IPs (cost review)
Get-AzPublicIpAddress | Select-Object Name, IpAddress, ResourceGroupName

# Stop VMs in a resource group
Get-AzVM -ResourceGroupName myRG | Stop-AzVM -Force

# Set diagnostic settings
Set-AzDiagnosticSetting -ResourceId $vm.Id -WorkspaceId $workspace.Id -Enabled $true

Role Deep Dive: Azure Consultant


Role Overview

Azure Consultants advise organizations on cloud strategy, architecture, migration, and optimization. They bridge business requirements and technical implementation. Part strategist, part architect, part translator between business and technology.

Alternative Titles: Cloud Consultant, Azure Advisory Consultant, Cloud Strategy Consultant, Digital Transformation Consultant

Typical Salary Range: $100,000 – $175,000 (US)


Core Responsibilities

1. Cloud Strategy & Advisory (25% of role)

Granular Tasks: - Conduct cloud readiness assessment: inventory on-prem workloads, categorize by migration strategy (6 Rs: Rehost, Refactor, Rearchitect, Rebuild, Replace, Retain) - Build TCO calculator: compare on-prem 5-year cost (hardware, power, cooling, staff, licensing) vs Azure 3-year reserved instance pricing - Create executive presentation: projected savings, timeline, risk mitigation, ROI - Define landing zone architecture (enterprise-scale or smaller) - Write cloud adoption plan: phases, priorities, dependencies, quick wins - Assess organizational readiness: skills gap analysis, training plan, change management - Define governance baseline: policies, RBAC model, cost management, security baseline - Create RACI matrix for cloud operations

2. Migration Planning & Execution (25% of role)

Granular Tasks: - Deploy Azure Migrate appliance in VMware/Hyper-V environment - Run dependency mapping to understand application interconnections - Categorize workloads: Wave 1 (easy wins — dev/test, web apps), Wave 2 (databases), Wave 3 (complex/legacy) - Define migration strategy per workload: - Rehost (lift & shift): VMs → Azure VMs, minimal changes - Refactor: VMs → App Service, SQL Server → Azure SQL - Rearchitect: Monolith → microservices on AKS - Rebuild: Rewrite as cloud-native (Functions, Cosmos DB) - Replace: Move to SaaS (e.g., on-prem CRM → Dynamics 365) - Retain: Keep on-prem (regulatory, not ready, not worth migrating) - Create runbook for each migration: pre-migration checks, migration steps, validation, rollback plan - Plan DNS cutover strategy (low TTL, staged cutover) - Define testing approach: smoke tests, performance tests, user acceptance testing - Plan for data migration: offline (Data Box) vs online (ExpressRoute), estimate transfer time

3. Architecture Design (20% of role)

Granular Tasks: - Design hub-spoke network topology: hub VNet with Azure Firewall, VPN/ExpressRoute gateway; spoke VNets per workload/department - Choose compute platform: App Service (web apps) vs AKS (microservices) vs VMs (legacy/custom OS) vs Container Apps (serverless microservices) - Design data architecture: Azure SQL (relational) vs Cosmos DB (NoSQL) vs Synapse (analytics) based on workload profile - Design HA/DR: Availability Zones for compute, geo-redundant storage, Azure Site Recovery, active-active or active-passive multi-region - Apply Azure Well-Architected Framework 5 pillars: - Reliability: multi-region, auto-failover, health probes, circuit breakers - Security: zero-trust, private endpoints, WAF, MFA, encryption at rest and in transit - Cost Optimization: right-sizing, reserved instances, auto-shutdown, lifecycle policies - Operational Excellence: IaC, CI/CD, monitoring, alerting, runbooks - Performance Efficiency: autoscaling, CDN, caching (Redis), partitioning (Cosmos DB) - Create architecture decision records (ADRs) for key decisions - Present design to ARB: justify choices, trade-offs, alternatives considered

4. Governance & Compliance Advisory (15% of role)

Granular Tasks: - Design management group hierarchy: Tenant root → Platform (Identity, Connectivity, Management) → Landing Zones (Corp, Online, SAP, etc.) - Define Azure Policy initiatives per landing zone: allowed regions, required tags, enforce encryption, prevent public endpoints - Design RBAC model: platform ops (Owner at platform MG), workload ops (Contributor at RG), auditors (Reader at subscription), break-glass account (Owner at root, PIM-activated) - Create governance scoreboard: track policy compliance, cost vs budget, security score - Define tagging strategy: Environment, Project, CostCenter, Owner, DataClassification - Design sandbox subscriptions for dev/experimentation (relaxed policies, budget caps) - Map compliance controls to Azure Policy: e.g., PCI DSS 3.4 → enforce encryption at rest, HIPAA → audit logging enabled

5. Cost Optimization & FinOps Advisory (10% of role)

Granular Tasks: - Review Azure Cost Management: top spenders by service, resource group, tag - Identify quick wins: unused resources, right-sizing recommendations, dev/test auto-shutdown - Model Reserved Instance savings: 1-year vs 3-year, analyze usage patterns for RI recommendations - Design chargeback/showback model: cost allocation by tags, budget per department - Implement budget alerts: 50% warning, 80% critical, 100% action - Review storage tiering: move infrequent data to Cool/Archive - Evaluate Azure Hybrid Benefit for Windows and SQL Server licensing

6. Stakeholder Communication (5% of role)


Azure Services Used

Category Services
Migration Azure Migrate, Data Migration Assistant, Database Migration Service, Data Box
Governance Management Groups, Azure Policy, Blueprints, RBAC, PIM, Cost Management
Networking VNet, VNet Peering, ExpressRoute, VPN Gateway, Azure Firewall, Private Link
Compute VMs, App Service, AKS, Functions, Container Apps
Storage Blob Storage, Azure Files, ADLS Gen2
Databases Azure SQL, SQL MI, PostgreSQL, Cosmos DB
Monitoring Azure Monitor, Log Analytics, Application Insights, Advisor
Security Entra ID, Key Vault, Defender for Cloud, Sentinel
IaC ARM Templates, Bicep, Terraform

Key Frameworks & Methodologies

Microsoft Cloud Adoption Framework (CAF)

Azure Well-Architected Framework (WAF)

5 pillars: Reliability, Security, Cost Optimization, Operational Excellence, Performance Efficiency

6 Rs of Migration

  1. Rehost — Lift & shift. VM to VM. Fastest, least risk.
  2. Refactor — Minor changes. VM to App Service. Some optimization.
  3. Rearchitect — Major redesign. Monolith to microservices. Significant effort.
  4. Rebuild — Rewrite from scratch. Cloud-native. Highest effort.
  5. Replace — Move to SaaS. On-prem CRM → Dynamics 365.
  6. Retain — Don’t migrate. Regulatory, not ready, or not worth it.

Day-in-the-Life Scenarios

Scenario 1: Cloud Migration Assessment

  1. Meet with client CTO to understand business drivers (reduce datacenter costs, improve DR)
  2. Deploy Azure Migrate appliance in their VMware environment
  3. Run 2-week discovery: 200 VMs, 15 SQL Servers, 5 legacy apps
  4. Analyze dependency maps: identify app groupings
  5. Categorize: 120 VMs → Rehost, 30 → Refactor (App Service), 15 SQL → Azure SQL MI, 5 legacy → Replace, 30 → Retain
  6. Build TCO: $2.1M on-prem (3-year) vs $1.4M Azure (3-year RI) = 33% savings
  7. Create phased migration plan: Wave 1 (dev/test, 4 weeks), Wave 2 (web apps, 8 weeks), Wave 3 (databases, 12 weeks)
  8. Present to stakeholders with risk assessment and mitigation plan

Scenario 2: Landing Zone Design

  1. Client needs enterprise-scale landing zone for 50 workloads
  2. Design management group hierarchy (CAF enterprise-scale)
  3. Define platform landing zones: Identity, Connectivity, Management
  4. Define application landing zones: Corp (internal apps), Online (external apps)
  5. Implement Azure Policies per landing zone
  6. Design hub-spoke network with Azure Firewall Premium
  7. Implement ExpressRoute for hybrid connectivity
  8. Define RBAC model and PIM configuration
  9. Document everything, get sign-off, hand off to implementation team

Certification Path

Certification Level Focus
AZ-900 Foundational Azure fundamentals
AZ-104 Associate Azure Administration (practical foundation)
AZ-305 Expert Core cert for this role — Solutions Architect
DP-900 Foundational Data fundamentals (complement)
SC-900 Foundational Security fundamentals (complement)

Interview Focus Areas

  1. Walk me through a cloud migration you led. → Assessment → 6 Rs categorization → phased plan → landing zone → execution → validation

  2. How do you build a business case for Azure migration? → TCO calculator, 3-year cost comparison, include indirect benefits (DR, agility, security), present ROI timeline

  3. What is the Cloud Adoption Framework? → Microsoft’s proven guidance: Strategy, Plan, Ready, Adopt, Govern, Manage. Landing zones for scalable foundation.

  4. How do you handle a client who wants to lift-and-shift everything? → Acknowledge it’s valid for speed, but explain they’ll miss cloud benefits (auto-scaling, PaaS, serverless). Propose phased approach: lift-and-shift Phase 1, optimize in Phase 2.

  5. How do you design a landing zone? → CAF enterprise-scale: management groups, platform LZs (connectivity, identity, management), application LZs (corp, online), policies, RBAC, networking hub-spoke.

  6. What governance would you implement from day 1? → Allowed locations policy, required tags, prevent public endpoints on PaaS, RBAC with PIM, budget alerts, diagnostic settings on all resources.

  7. How do you handle multi-region architecture? → Active-passive (primary region + DR with ASR/failover groups) vs active-active (Front Door, multi-region writes with Cosmos DB). Trade-off: cost vs RTO.

  8. A client has regulatory requirements (HIPAA/PCI). How do you approach? → Compliance matrix, Azure Policy for compliance controls, Defender for Cloud regulatory compliance dashboard, private endpoints everywhere, encryption at rest/transit/use, audit logging to immutable storage.

  9. How do you estimate migration timelines? → Per wave: assessment (2 weeks), landing zone (2-4 weeks), migration (2-8 weeks per wave depending on complexity), testing (1-2 weeks), optimization (ongoing). Total 3-12 months for enterprise.

  10. What’s your approach to cost optimization? → Right-size VMs, Reserved Instances for steady-state, Spot for fault-tolerant, auto-shutdown dev/test, storage tiering, delete orphaned resources, chargeback model with budgets.

Role Deep Dive: Azure Cloud Support Engineer


Role Overview

Azure Cloud Support Engineers troubleshoot, diagnose, and resolve technical issues in Azure environments. They are the frontline responders when things break — working tickets, performing root cause analysis, and ensuring service reliability. Strong debugging skills and deep Azure knowledge are essential.

Alternative Titles: Azure Support Engineer, Cloud Operations Engineer, Azure SRE (Site Reliability Engineer — overlaps), Cloud Infrastructure Support

Typical Salary Range: $70,000 – $120,000 (US)


Core Responsibilities

1. Incident Response & Troubleshooting (35% of role)

Granular Tasks: - VM won’t start: Check boot diagnostics, serial console, review activity log for recent changes, check disk health, verify quota limits, check if deallocated or stopped - App Service returns 502/503: Check App Service logs, verify backend health, check if scaling is maxed, review CPU/memory metrics, check deployment slot status, verify application errors in App Insights - VNet connectivity issue: Use Network Watcher IP Flow Verify, Next Hop, Connection Troubleshoot. Check NSG rules, UDR routes, VNet peering status, DNS resolution - Storage account access denied: Check SAS token expiry, firewall rules, private endpoint configuration, RBAC permissions, CORS settings - AKS pod in CrashLoopBackOff: Check pod logs (kubectl logs), describe pod for events, check resource limits, verify image pull, check ConfigMap/Secret references - VPN tunnel down: Check gateway health, shared key, BGP status, on-prem device configuration, bandwidth utilization - SQL Database connection failures: Check firewall rules, VNet service endpoints, connection limits, DTU/vCore utilization, failover status - Entra ID sign-in failures: Check Conditional Access policies, MFA status, user account status, sign-in logs for error codes

2. Monitoring & Alerting (20% of role)

Granular Tasks: - Create metric alert: VM CPU > 85% for 5 minutes → email + PagerDuty - Create log alert: KQL query for failed logins > 10 in 5 minutes → Security team - Configure action groups: email, SMS, webhook, Logic App for auto-remediation - Review alert fatigue: merge overlapping alerts, adjust thresholds, suppress during maintenance windows - Build operational dashboard: top 10 resources by CPU, storage utilization, active alerts, cost trend - Configure Service Health alerts: track Azure service incidents affecting your resources - Monitor SLA: track uptime per service, create SLA breach alerts

3. Platform Maintenance (15% of role)

Granular Tasks: - Configure Update Management Center: schedule monthly patching, exclude production during business hours - Rotate storage account access keys (key1 and key2 rotation strategy) - Renew App Service SSL certificates (use Key Vault auto-renewal) - Scale up App Service Plan during peak events, scale down after - Clean up old backups beyond retention period - Rotate service principal secrets before expiry - Update VM extensions (Azure Monitor Agent, MDE) - Perform scheduled maintenance: resize VMs, migrate to newer SKU, update networking

4. User & Access Support (10% of role)

Granular Tasks: - User can’t access Azure Portal → Check RBAC assignment, Conditional Access policy, MFA registration, account enabled status - User locked out of VM → Reset via Azure Bastion or VM access extension, verify NSG allows their IP - Service principal expired → Renew secret/certificate, update application config - User needs temporary elevated access → Activate via PIM with approval workflow - External user can’t access resource → Check B2B guest account, external collaboration settings, RBAC assignment

5. Documentation & Knowledge Management (10% of role)

Granular Tasks: - Write runbook: “Troubleshooting App Service 502 Errors” with step-by-step diagnosis - Create SOP: “New Environment Provisioning Checklist” - Post-incident report: timeline, root cause, impact, remediation, preventive measures - Document escalation procedures: when to open Microsoft support ticket, severity levels, required information - Create FAQ for common user requests

6. Automation & Tooling (10% of role)

Granular Tasks: - Automation runbook: auto-restart App Service when health check fails - Logic App: when alert fires → create ticket in ServiceNow → notify team on Teams - PowerShell script: bulk tag resources, bulk stop/start dev VMs on schedule - CLI script: generate weekly cost report and email to stakeholders - Self-service: Logic App that lets users request temporary NSG rule opening (with approval)


Azure Services Used Daily

Category Services
Compute VMs, App Service, AKS, Functions
Networking VNet, NSG, VPN Gateway, Load Balancer, App Gateway, DNS, Private Link, Network Watcher
Storage Blob Storage, Azure Files, Managed Disks
Databases Azure SQL, PostgreSQL, Cosmos DB, Redis
Identity Entra ID, PIM, RBAC, Key Vault
Monitoring Azure Monitor, Log Analytics, App Insights, Service Health, Network Watcher
Automation Azure Automation, Logic Apps, CLI, PowerShell
Support Azure Support Plans, Service Health, Resource Health

Troubleshooting Decision Trees

VM Issues

VM Issue
├── Can't Start
│   ├── Check Activity Log → recent changes?
│   ├── Check Boot Diagnostics → screenshot of console
│   ├── Check Serial Console → login and check logs
│   ├── Check Disk → attached? healthy? size?
│   ├── Check Quota → vCPU limit reached?
│   └── Check Platform → Azure service incident?
├── Slow Performance
│   ├── Check Metrics → CPU, Memory, Disk I/O, Network
│   ├── Check Disk Type → Standard HDD? Upgrade to Premium SSD
│   ├── Check VM Size → undersized? Right-size
│   ├── Check Network → bandwidth, latency, DNS
│   └── Check Processes → top/htop, runaway process?
└── Can't Connect
    ├── Check NSG → inbound rule allows your IP/port
    ├── Check Public IP → assigned? DNS resolves?
    ├── Check Bastion → configured? subnet correct?
    ├── Check VPN → if accessing via VPN, is tunnel up?
    └── Check Just-in-Time → is port access enabled?

App Service Issues

App Service Issue
├── 502/503 Errors
│   ├── Check App Service Logs → application errors
│   ├── Check App Insights → dependency failures
│   ├── Check CPU/Memory → instance overwhelmed?
│   ├── Check Scaling → at max instances?
│   ├── Check Backend → database/storage accessible?
│   └── Check Deployment → recent deployment caused issue?
├── Slow Response
│   ├── Check Response Time metrics
│   ├── Check App Insights → slow dependencies
│   ├── Check Database → query performance
│   ├── Check Cold Start → if using Functions
│   └── Check Scaling → need more instances?
└── Deployment Fails
    ├── Check Deployment Logs → error message
    ├── Check App Service Plan → disk space?
    ├── Check Build → dependency issues?
    └── Check Permissions → deployment credential valid?

Networking Issues

Network Issue
├── Can't Reach VM
│   ├── NSG: IP Flow Verify → allowed?
│   ├── UDR: Next Hop → where does traffic go?
│   ├── VNet Peering: Connected? Allow forwarded traffic?
│   ├── DNS: Resolves correctly? nslookup/dig
│   └── Firewall: Azure Firewall rules allow?
├── Can't Reach PaaS Service
│   ├── Private Endpoint: Configured? DNS resolves to private IP?
│   ├── Service Endpoint: Enabled on subnet?
│   ├── Firewall: Storage/SQL firewall allows your IP/VNet?
│   └── RBAC: You have access to the resource?
└── VPN Not Working
    ├── Gateway: Running? BGP session up?
    ├── Shared Key: Match on both sides?
    ├── Routes: BGP advertising correct prefixes?
    ├── On-prem: Device configured correctly?
    └── Bandwidth: Tunnel at capacity?

Key Diagnostic Commands

# Check VM connectivity
az network nic show-effective-route-table -g myRG -n myNIC -o table
az network nic show-effective-network-security-groups -g myRG -n myNIC -o table

# Network Watcher diagnostics
az network watcher test-ip-flow --resource-group myRG --vm myVM --direction Inbound --protocol TCP --local 10.0.0.4:80 --remote 1.2.3.4:50000
az network watcher show-next-hop --resource-group myRG --vm myVM --source-ip 10.0.0.4 --dest-ip 8.8.8.8
az network watcher test-connectivity --resource-group myRG --source-resource myVM --dest-resource myOtherVM --protocol TCP --port 443

# App Service diagnostics
az webapp log tail --resource-group myRG --name myApp
az webapp show --resource-group myRG --name myApp --query "state"

# AKS troubleshooting
kubectl get pods -A
kubectl describe pod <pod-name> -n <namespace>
kubectl logs <pod-name> -n <namespace> --previous
kubectl get events -n <namespace> --sort-by='.lastTimestamp'

# SQL connectivity
az sql db show-connection-string --server myServer --name myDB --client ado.net

Certification Path

Certification Level Focus
AZ-900 Foundational Azure fundamentals
AZ-104 Associate Core cert — Azure Administrator
AZ-500 Associate Azure Security (complement)
AZ-801 Associate Windows Server Hybrid Administrator

Interview Focus Areas

  1. Walk me through troubleshooting a VM that can’t be reached. → Check NSG (IP Flow Verify), check UDR (Next Hop), check VNet peering, check DNS, check platform health, check if VM is running

  2. How do you handle a P1 incident? → Acknowledge, assess impact, assemble team, communicate status, troubleshoot using decision trees, implement fix, verify recovery, post-incident review

  3. A web app is returning 502 errors. How do you diagnose? → Check App Service logs, App Insights for dependency failures, check if instances are healthy, check scaling, check backend DB, check recent deployments

  4. How do you reduce alert fatigue? → Tune thresholds, consolidate overlapping alerts, use action groups wisely, suppress during maintenance, implement severity-based routing, review monthly

  5. How do you escalate to Microsoft support? → Open support ticket with: subscription ID, resource ID, error message, activity log entries, repro steps, business impact. Choose appropriate severity (Critical = business impact, Minimal = no business impact).

Role Deep Dive: Azure Solutions Architect


Role Overview

Azure Solutions Architects design end-to-end Azure solutions that meet business requirements for reliability, security, scalability, cost, and operations. They are the most senior technical role in the Azure ecosystem — translating business needs into architecture, making critical design decisions, and ensuring solutions align with the Well-Architected Framework.

Alternative Titles: Cloud Architect, Azure Architect, Cloud Solutions Architect, Enterprise Architect (broader), Technical Architect

Typical Salary Range: $130,000 – $210,000 (US)


Core Responsibilities

1. Solution Design & Architecture (30% of role)

Granular Tasks: - Design multi-tier web application: Front Door (global) → App Gateway (WAF) → App Service (web tier) → Azure SQL (data tier) + Redis Cache - Design microservices architecture: Front Door → AKS (ingress controller) → microservices → Cosmos DB + Service Bus + Redis - Design event-driven architecture: Event Grid → Functions → Service Bus → Logic Apps → Cosmos DB - Design data pipeline: IoT Hub → Event Hubs → Stream Analytics → Synapse → Power BI - Design hybrid architecture: ExpressRoute → Hub VNet (Azure Firewall) → Spoke VNets → On-prem resources - Design multi-region DR: Active-passive (primary + ASR) vs Active-active (Front Door + multi-region App Service + Cosmos DB multi-write) - Create ADRs (Architecture Decision Records): context, decision, consequences, alternatives considered - Define NFRs (Non-Functional Requirements): RPO, RTO, throughput, latency, availability %

Architecture Decision Framework:

Requirement IaaS (VMs) PaaS (App Service/AKS) Serverless (Functions/Container Apps)
Full OS control
Custom software ✅ (containers) ✅ (containers)
Management overhead High Medium Low
Auto-scaling VMSS Built-in Built-in (scale to zero)
Cost model VM hours Plan + instances Per-execution
Cold start No No Yes (Consumption)
VNet integration Native Regional/ASE Native
Best for Legacy, custom OS Web apps, APIs, containers Event-driven, bursty

2. Network Architecture (20% of role)

Granular Tasks: - Hub-Spoke Design: - Hub VNet: 10.0.0.0/16 (AzureFirewallSubnet /26, GatewaySubnet /27, AzureBastionSubnet /26, Management /28) - Spoke-Web: 10.1.0.0/16 (WebFrontend /24, WebBackend /24, DataSubnet /24) - Spoke-App: 10.2.0.0/16 (AppFrontend /24, AppBackend /24, DataSubnet /24) - VNet peering: Hub ↔︎ each spoke (gateway transit enabled on hub) - UDR on all spoke subnets: 0.0.0.0/0 → Azure Firewall - Azure Firewall Premium: network rules, app rules, TLS inspection, IDPS

3. Data Architecture (15% of role)

Decision Matrix:

Requirement Azure SQL Cosmos DB PostgreSQL Synapse Cache (Redis)
Relational data
NoSQL/documents
Global distribution Geo-replication Native (multi-write) Read replicas N/A Geo-replication
Analytics/warehouse Hyperscale Change Feed N/A
Sub-ms latency
Real-time streaming Change Feed Pub/Sub
Consistency Strong only 5 levels Strong Strong Eventual

Cosmos DB Partition Key Design: - Choose key with high cardinality and even distribution - Align with most common query patterns (filter by partition key = most efficient) - Avoid hot partitions (e.g., partition by status where 90% are “active”) - Cannot change partition key after creation — design carefully - Cross-partition queries cost more RUs and are slower

Data Pipeline Architecture:

Source → Ingest → Process → Store → Serve
─────────────────────────────────────────────
IoT Hub  → Event Hubs → Stream Analytics → Cosmos DB → App Service
Blob     → Data Factory → Databricks    → Synapse   → Power BI
API      → Event Grid  → Functions      → ADLS Gen2 → Synapse SQL

4. Security Architecture (15% of role)

Security Architecture Checklist: - [ ] All users have MFA enabled - [ ] Conditional Access policies for all privileged roles - [ ] PIM for all admin roles (no permanent elevated access) - [ ] Managed Identities for all service-to-service auth (no credentials in code) - [ ] Key Vault for all secrets, keys, certificates - [ ] Private Endpoints for all PaaS services - [ ] Azure Firewall Premium with IDPS and TLS inspection - [ ] WAF on all web entry points (Prevention mode) - [ ] DDoS Protection Standard for public-facing workloads - [ ] Encryption at rest (AES-256, customer-managed keys for sensitive data) - [ ] Encryption in transit (TLS 1.2+, enforce via App Service) - [ ] Microsoft Defender for Cloud enabled on all resources - [ ] Sentinel for SIEM + SOAR - [ ] Azure Policy enforcing security baseline - [ ] NSGs with least-privilege rules (deny-all default) - [ ] JIT VM access for management - [ ] Audit logging to immutable storage - [ ] Network segmentation (separate subnets per tier) - [ ] Data classification and protection (sensitivity labels, DLP) - [ ] Regular access reviews

5. High Availability & Disaster Recovery (10% of role)

HA/DR Patterns:

Pattern RTO RPO Cost Complexity
Single region + AZ Minutes Zero |Low||**ActivePassive(ASR)**|Minutes − Hours|Minutes| Medium
Active-Passive (Geo-replicated) Seconds-Minutes Seconds $$$ | Medium | | **Active-Active (Multi-region)** | Seconds | Zero | $$$$ High

Design by Service: - Compute (App Service): Availability Zones (Zone-redundant), auto-scaling. DR: paired region with App Service clone + Traffic Manager/Front Door - Compute (AKS): Multi-zone node pools, cluster autoscaler. DR: multi-region AKS + Front Door - Database (Azure SQL): Zone-redundant configuration. DR: Geo-replication + Failover Groups - Database (Cosmos DB): Multi-region writes (strong or multi-master). Automatic failover - Storage (Blob): ZRS/GZRS. Read-access secondary for failover - Storage (Files): GRS. File sync for on-prem DR

6. Cost Architecture (5% of role)

Cost Optimization Strategies: - Right-size VMs (Advisor recommendations, actual usage analysis) - Reserved Instances for 1+ year steady-state workloads (up to 72% savings) - Spot VMs for fault-tolerant workloads (up to 90% savings) - Auto-shutdown for dev/test environments - Storage tiering (Hot → Cool → Archive lifecycle policies) - Serverless for bursty workloads (pay per use) - AKS cluster autoscaler (don’t pay for idle nodes) - Azure Hybrid Benefit for Windows/SQL licensing - Enterprise Dev/Test subscriptions for non-prod (discounted rates)

7. Governance Architecture (5% of role)


Architecture Patterns

Pattern 1: Enterprise Web Application

Users → Front Door (WAF, global routing)
      → App Gateway (WAF, SSL termination, path routing)
      → App Service (web/frontend)
      → Azure SQL (Hyperscale) + Redis Cache
      → Blob Storage (static assets)
      → Key Vault (secrets) via Managed Identity

HA: Zone-redundant App Service, SQL zone-redundant
DR: Geo-replicated SQL + Failover Group, Front Door failover to secondary region
Monitoring: App Insights + Log Analytics + Alerts
Security: Private Endpoints for SQL/Storage/Key Vault, WAF, Managed Identity

Pattern 2: Microservices on AKS

Users → Front Door (global, WAF)
      → App Gateway (AGIC ingress)
      → AKS (microservices with Istio/Cilium)
      → Cosmos DB (NoSQL) + Azure SQL (relational)
      → Service Bus (async messaging)
      → Event Grid (event routing)
      → Redis Cache (caching)
      → Key Vault via CSI Driver + Workload Identity

HA: Multi-zone AKS, Cosmos DB multi-region
DR: Multi-region AKS + Front Door, Cosmos DB multi-master
Monitoring: Container Insights + App Insights + Prometheus + Grafana
Security: Network policies, Workload Identity, Private Endpoints, WAF

Pattern 3: Event-Driven Serverless

Event Source (HTTP/API/Blob/Queue/Timer)
      → Event Grid / Service Bus
      → Azure Functions (Durable Functions for orchestration)
      → Cosmos DB / Azure SQL
      → Blob Storage / ADLS Gen2
      → Logic Apps (integration/orchestration)

HA: Functions auto-scale, Cosmos DB multi-region
DR: Multi-region Functions + Cosmos DB, Front Door for HTTP
Cost: Pay per execution (Consumption), scale to zero
Security: Managed Identity, Private Endpoints, VNet-integrated Functions (Premium)

Pattern 4: Data Analytics Platform

Sources (IoT, APIs, Databases, Files)
      → Event Hubs / Data Factory (ingest)
      → ADLS Gen2 (raw/curated zones)
      → Databricks / Synapse Spark (process)
      → Synapse SQL / Azure SQL (serve)
      → Power BI (visualize)

Governance: Purview (catalog, lineage, classification)
Monitoring: Azure Monitor + Log Analytics
Security: Private Endpoints, Managed Identity, encryption, RBAC on data

Pattern 5: Hybrid Enterprise

On-Prem Datacenter ←→ ExpressRoute (1 Gbps primary) + VPN (backup)
                      → Hub VNet
                        ├── Azure Firewall Premium (IDPS, TLS inspection)
                        ├── VPN Gateway / ExpressRoute Gateway
                        └── Azure Bastion
                      → Spoke-VNet (Corp Apps)
                        ├── App Service (internal apps)
                        ├── Azure SQL MI (migrated databases)
                        └── Azure Files (file shares, File Sync to on-prem)
                      → Spoke-VNet (Web Apps)
                        ├── AKS (external-facing microservices)
                        └── Cosmos DB (NoSQL)

Azure Arc: manage on-prem servers from Azure
Azure Migrate: ongoing migration assessment
Microsoft Defender: unified security posture

Certification Path

Certification Level Focus
AZ-900 Foundational Azure fundamentals
AZ-104 Associate Azure Administration (recommended prerequisite)
AZ-305 Expert Core cert — Solutions Architect
AZ-500 Associate Security Engineer (strongly recommended complement)
DP-203 Associate Data Engineering (complement for data-heavy roles)

AZ-305 Exam Breakdown

Domain Weight
Design identity, governance, and monitoring solutions 25-30%
Design data storage solutions 20-25%
Design business continuity solutions 10-15%
Design infrastructure solutions 25-30%
Design application architectures 10-15%

Interview Focus Areas

Architecture Design Questions

  1. Design a highly available web application on Azure. → Front Door → App Gateway → App Service (zone-redundant) → Azure SQL (geo-replication + failover group) + Redis. Private endpoints, WAF, managed identity, Key Vault.

  2. Design a microservices architecture on Azure. → AKS with Azure CNI Overlay, multi-zone node pools, Workload Identity, CSI Driver for Key Vault, AGIC for ingress, Service Bus for async, Cosmos DB for state. Monitoring with Container Insights.

  3. Design a multi-region DR strategy. → Active-passive: primary region + paired region with ASR/geo-replication, Front Door for failover (health probes). Active-active: multi-region App Service, Cosmos DB multi-write, Front Door routing. Trade-offs: cost vs RTO.

  4. Design a hybrid cloud architecture. → ExpressRoute (primary) + VPN (backup) → hub VNet with Azure Firewall → spoke VNets. Azure Arc for on-prem management. Entra Connect for identity sync. Azure File Sync for file DR.

  5. How do you choose between Azure SQL and Cosmos DB? → SQL for relational, transactional, strong consistency, existing SQL Server skills. Cosmos DB for NoSQL, global distribution, flexible consistency, massive scale, multi-model APIs, sub-ms latency.

Decision-Making Questions

  1. App Service vs AKS vs Container Apps — when to use which? → App Service: simple web apps, standard frameworks, fast time-to-market. AKS: complex microservices, custom networking, full K8s control. Container Apps: serverless containers, scale-to-zero, simpler than AKS.

  2. How do you design for compliance (HIPAA/PCI/GDPR)? → Private endpoints everywhere, encryption at rest/transit, customer-managed keys, audit logging to immutable storage, Azure Policy for compliance, Defender for Cloud regulatory dashboard, data residency via allowed-locations policy.

  3. How do you handle secrets across 50 microservices? → Key Vault per environment, Managed Identity for each service, CSI Driver for AKS, no secrets in environment variables or config files. Automate rotation. Monitor access with Key Vault audit logs.

  4. How do you design a cost-optimized architecture? → Right-size from day 1, Reserved Instances for steady-state, serverless for bursty, auto-scaling, storage tiering, dev/test auto-shutdown, chargeback model with tags and budgets.

  5. What’s your approach to the Well-Architected Framework? → Review all 5 pillars: Reliability (HA/DR, health probes, circuit breakers), Security (zero-trust, private endpoints, WAF, MFA), Cost (right-sizing, RI, lifecycle), Ops (IaC, CI/CD, monitoring, runbooks), Performance (autoscaling, CDN, caching, partitioning).

Role Deep Dive: Azure Developer


Role Overview

Azure Developers build cloud-native applications on Azure. They write code that leverages Azure PaaS and serverless services — building APIs, web apps, event-driven functions, and microservices. They focus on application code, SDKs, and Azure-integrated development patterns.

Alternative Titles: Cloud Developer, Backend Developer (Azure), Full-Stack Cloud Developer

Typical Salary Range: $90,000 – $160,000 (US)


Core Responsibilities

1. Application Development (40% of role)

Granular Tasks: - Build REST API using ASP.NET Core / Node.js / Python, deploy to App Service - Write Azure Functions with proper triggers and bindings: - HTTP trigger for API endpoints - Blob trigger for file processing - Service Bus trigger for message processing - Timer trigger for scheduled jobs - Event Grid trigger for reactive processing - Implement Durable Functions for stateful orchestration (chaining, fan-out/fan-in, human interaction) - Use Azure SDKs: Azure.Identity (DefaultAzureCredential), Azure.Storage.Blobs, Azure.Messaging.ServiceBus, Microsoft.Azure.Cosmos, Azure.Security.KeyVault.Secrets - Implement Managed Identity in code: DefaultAzureCredential auto-detects environment (dev = VS/CLI credentials, prod = managed identity) - Build containerized microservices, deploy to Container Apps or AKS - Implement pub/sub patterns with Service Bus topics - Build real-time applications with SignalR Service

Code Pattern — Managed Identity + Key Vault:

// C# - DefaultAzureCredential handles dev + prod
var credential = new DefaultAzureCredential();
var client = new SecretClient(new Uri("https://myvault.vault.azure.net/"), credential);
var secret = await client.GetSecretAsync("DatabaseConnectionString");
# Python
from azure.identity import DefaultAzureCredential
from azure.keyvault.secrets import SecretClient

credential = DefaultAzureCredential()
client = SecretClient(vault_url="https://myvault.vault.azure.net/", credential=credential)
secret = client.get_secret("DatabaseConnectionString")

Code Pattern — Blob Storage Upload:

var blobServiceClient = new BlobServiceClient(new Uri("https://mystorage.blob.core.windows.net"), new DefaultAzureCredential());
var container = blobServiceClient.GetBlobContainerClient("uploads");
await container.UploadBlobAsync("file.pdf", stream);

2. API Design & Development (15% of role)

Granular Tasks: - Design API with OpenAPI/Swagger specification - Implement JWT validation in API middleware - Configure APIM policies: rate-limit-by-key, validate-jwt, cors, cache-lookup - Version APIs (URL path: /v1/, /v2/) with APIM revisions - Implement backend for frontend (BFF) pattern for different client types - Use API Management developer portal for documentation

3. Data Access & Storage (15% of role)

Granular Tasks: - Cosmos DB: design partition key, implement stored procedures, use change feed for real-time processing - Azure SQL: use Entity Framework with migrations, implement connection pooling, optimize queries - Redis: implement cache-aside pattern, session storage, distributed locking - Blob Storage: upload/download with SAS tokens or Managed Identity, implement blob lease for concurrency - Implement repository pattern with dependency injection - Use Azure SDK pagination for large result sets

4. Messaging & Event-Driven Development (10% of role)

Granular Tasks: - Service Bus: send/receive messages, implement dead-letter handling, use sessions for FIFO - Event Grid: publish custom events, subscribe to Azure resource events - Event Hubs: produce/consume event streams, use Kafka-compatible endpoint - Implement saga pattern for distributed transactions (compensating transactions) - Implement outbox pattern for reliable event publishing - Handle idempotency in message consumers

5. Testing & Quality (10% of role)

Granular Tasks: - Unit test Functions: mock bindings and triggers - Integration test: use Azurite (local Blob/Queue/Table), use Cosmos DB emulator - Load test: Azure Load Testing service, JMeter/k6 - Set up CI pipeline: build → test → deploy to staging slot → smoke test → swap - Implement code quality: SonarQube, linting, code coverage > 80%

6. DevOps & Deployment (10% of role)

Granular Tasks: - Write Bicep templates for infrastructure - Configure GitHub Actions: build → test → deploy to App Service slot → swap - Use App Configuration for dynamic settings (no redeployment needed) - Implement feature flags for gradual rollout - Set up deployment slots for zero-downtime deployment - Configure managed identity for deployment (no service principal secrets)


Azure Services Used Daily

Category Services
Compute App Service, Azure Functions, Container Apps, AKS
Storage Blob Storage, Azure Files, ADLS Gen2, Table Storage
Databases Azure SQL, Cosmos DB, Redis Cache
Messaging Service Bus, Event Grid, Event Hubs
Identity Entra ID, Managed Identity, Key Vault
API API Management, App Configuration
DevOps Azure DevOps, GitHub Actions, Bicep
Monitoring Application Insights, Log Analytics
Real-time SignalR Service

Key Design Patterns for Azure Developers

1. Cache-Aside Pattern

App → Check Redis Cache → Hit? Return cached data
                         → Miss? Query DB → Store in Redis → Return data

2. Competing Consumers Pattern

Service Bus Queue → Multiple Function instances consume messages concurrently

3. Saga Pattern (Distributed Transactions)

Order Service → Payment Service → Inventory Service → Shipping Service
     ↓ (fail)         ↓ (fail)          ↓ (fail)
  Compensate       Compensate        Compensate

4. Outbox Pattern

1. Write to Database + Write event to Outbox table (same transaction)
2. Background process reads Outbox → Publishes to Event Grid/Service Bus

5. Gateway Routing Pattern

Client → API Management → Route to different backends based on path/version

6. Strangler Fig Pattern (Migration)

Old Monolith → API Management → Route new features to new services, old features to monolith
Gradually replace all monolith endpoints → Full migration

Certification Path

Certification Level Focus
AZ-900 Foundational Azure fundamentals
AZ-204 Associate Core cert — Azure Developer
AZ-400 Expert DevOps Engineer (next step)

AZ-204 Exam Breakdown

Domain Weight
Develop Azure compute solutions 25-30%
Develop for Azure storage 15-20%
Implement Azure security 20-25%
Monitor, troubleshoot, and optimize 15-20%
Connect to and consume Azure services 15-20%

Interview Focus Areas

  1. How do you authenticate to Azure services from code? → DefaultAzureCredential (Managed Identity in prod, VS/CLI credentials in dev). Never hardcode credentials. Key Vault for secrets via Managed Identity.

  2. Explain the Function triggers and bindings. → Triggers start the function (HTTP, Timer, Blob, Service Bus, Event Grid). Bindings connect to data declaratively (input: read from Blob/SQL, output: write to Queue/Service Bus). No SDK code needed for bindings.

  3. How do you handle distributed transactions in microservices? → Saga pattern with Durable Functions or Service Bus. Compensating transactions for rollback. No distributed locks. Eventual consistency.

  4. How do you implement feature flags in Azure? → Azure App Configuration with feature flags. Check flag in code, no redeployment needed. Use for canary releases, A/B testing, gradual rollout.

  5. How do you deploy with zero downtime? → App Service deployment slots. Deploy to staging, test, swap to production. Or blue/green with AKS + Front Door traffic weighting.

  6. How do you handle message ordering and idempotency? → Ordering: Service Bus sessions (same session ID = same consumer, FIFO). Idempotency: track processed message IDs, implement deduplication logic, make operations idempotent by design.

  7. Explain the change feed in Cosmos DB. → Persistent log of changes (inserts, updates). Read from a point in time, process changes incrementally. Use for ETL, materialized views, notifications, event-driven processing.

  8. How do you secure an API? → Validate JWT in APIM or middleware, use Entra ID for OAuth2/OIDC, rate limit with APIM policies, HTTPS only, CORS configuration, input validation, OWASP best practices.

Role Deep Dive: Azure DevOps Engineer


Role Overview

Azure DevOps Engineers design and implement CI/CD pipelines, infrastructure as code, and DevOps practices on Azure. They automate everything — builds, tests, deployments, infrastructure provisioning — and ensure reliable, repeatable, secure delivery of software to production.

Alternative Titles: DevOps Engineer, Cloud DevOps Engineer, Platform Engineer, CI/CD Engineer, Release Engineer

Typical Salary Range: $100,000 – $165,000 (US)


Core Responsibilities

1. CI/CD Pipeline Design & Implementation (30% of role)

Granular Tasks: - Azure DevOps Pipeline (YAML): yaml trigger: branches: include: [main] pool: vmImage: 'ubuntu-latest' stages: - stage: Build jobs: - job: Build steps: - task: DotNetCoreCLI@2 inputs: command: 'build' - task: DotNetCoreCLI@2 inputs: command: 'test' - task: DotNetCoreCLI@2 inputs: command: 'publish' publishWebProjects: true - publish: $(Build.ArtifactStagingDirectory) artifact: webapp - stage: Deploy_Dev dependsOn: Build condition: succeeded() jobs: - deployment: DeployDev environment: 'dev' strategy: runOnce: deploy: steps: - download: current artifact: webapp - task: AzureWebApp@1 inputs: azureSubscription: 'myServiceConnection' appName: 'myapp-dev' package: '$(Pipeline.Workspace)/webapp/**/*.zip' - stage: Deploy_Prod dependsOn: Deploy_Dev condition: succeeded() jobs: - deployment: DeployProd environment: 'prod' strategy: canary: increments: [10, 50, 100] preDeploy: steps: - script: echo "Canary deployment starting" deploy: steps: - task: AzureWebApp@1 inputs: azureSubscription: 'myServiceConnection' appName: 'myapp-prod' package: '$(Pipeline.Workspace)/webapp/**/*.zip'

2. Infrastructure as Code (IaC) (25% of role)

Granular Tasks: - Bicep Template: ```bicep param location string = resourceGroup().location param appName string param environment string

var tags = { Environment: environment Application: appName ManagedBy: ‘bicep’ }

resource appServicePlan ‘Microsoft.Web/serverfarms@2023-01-01’ = { name: ‘appName − plan{environment}’ location: location sku: { name: environment == ‘prod’ ? ‘P1v3’ : ‘B1’ tier: environment == ‘prod’ ? ‘PremiumV3’ : ‘Basic’ } tags: tags }

resource webApp ‘Microsoft.Web/sites@2023-01-01’ = { name: ‘appName{environment}’ location: location properties: { serverFarmId: appServicePlan.id httpsOnly: true siteConfig: { minTlsVersion: ‘1.2’ ftpsState: ‘Disabled’ remoteDebuggingEnabled: false } } tags: tags }

resource appInsights ‘Microsoft.Insights/components@2020-02-02’ = { name: ‘appName − ai{environment}’ location: location kind: ‘web’ properties: { Application_Type: ‘web’ WorkspaceResourceId: logAnalytics.id } tags: tags } ```

3. Container & Kubernetes Operations (15% of role)

Granular Tasks: - Build Docker images in CI pipeline, push to ACR - ACR tasks: auto-build on git push, image scanning (Defender for Containers) - ACR geo-replication for multi-region deployments - AKS deployment: Helm chart with values per environment - AKS upgrades: plan upgrade, cordon/drain nodes, upgrade node pools - Configure AKS: private cluster, Azure CNI Overlay, network policies, workload identity - Implement pod disruption budgets for zero-downtime AKS upgrades

4. Environment Management (10% of role)

Granular Tasks: - Create environment per PR: deploy to temporary App Service slot or AKS namespace - Environment configuration: App Configuration stores + Key Vault per environment - Feature flags for environment-specific features - Auto-cleanup of ephemeral environments when PR is closed - Production environment: approval gates, manual checks, deployment windows

5. Monitoring & Observability for DevOps (10% of role)

Granular Tasks: - Track deployment frequency, lead time, MTTR, change failure rate (DORA metrics) - App Insights: track deployments, correlate errors with deployments - Alert on pipeline failures, long-running deployments - Dashboard: deployment pipeline health, environment status, DORA metrics

6. Security in DevOps (DevSecOps) (10% of role)

Granular Tasks: - Pipeline Security: - SAST (Static Analysis): SonarQube, CodeQL in CI - SCA (Software Composition Analysis): Dependabot, Snyk, Defender for DevOps - Container scanning: Trivy, Defender for Containers - IaC scanning: checkov, tfsec - DAST (Dynamic Analysis): OWASP ZAP in CD (staging) - Secret scanning: GitHub secret scanning, detect-secrets - Secret Management: - Variable groups linked to Key Vault in Azure DevOps - GitHub Secrets for sensitive values - Never commit secrets to repo (pre-commit hooks, branch policies) - Rotate service principal secrets before expiry - Use Managed Identity for pipeline-to-Azure auth (no service principals) - Supply Chain Security: - Sign container images (Notation/Notary) - Pin image digests (not tags) in production - Review Dependabot PRs promptly - SBOM (Software Bill of Materials) generation


Azure Services Used Daily

Category Services
CI/CD Azure DevOps (Pipelines, Repos, Boards, Artifacts), GitHub Actions
IaC Bicep, Terraform, ARM Templates
Containers ACR, AKS, Container Apps
Compute App Service, Functions
Config App Configuration, Key Vault, Feature Flags
Security Defender for DevOps, Key Vault, Managed Identity
Monitoring Application Insights, Log Analytics, Azure Monitor
Networking VNet, Private Endpoints (for ACR, Key Vault, etc.)

DORA Metrics (Key Performance Indicators)

Metric What It Measures Elite High Medium Low
Deployment Frequency How often code deploys to prod On demand Weekly Monthly Yearly
Lead Time for Changes Commit to production <1 hour <1 day <1 week >1 month
Change Failure Rate % of deployments causing failures 0-15% 16-30% 31-45% >45%
MTTR Time to restore service after failure <1 hour <1 day <1 week >1 week

Certification Path

Certification Level Focus
AZ-900 Foundational Azure fundamentals
AZ-204 Associate Azure Developer (recommended prerequisite)
AZ-400 Expert Core cert — DevOps Engineer

AZ-400 Exam Breakdown

Domain Weight
Develop an instrumentation strategy 5-10%
Develop a Site Reliability Engineering (SRE) strategy 5-10%
Develop a security and compliance plan 10-15%
Manage source control 10-15%
Facilitate communication and collaboration 10-15%
Define and implement continuous integration 20-25%
Define and implement continuous delivery 10-15%

Interview Focus Areas

  1. Walk me through your CI/CD pipeline. → PR triggers build → compile, unit test, SAST scan → publish artifact → deploy to staging → integration tests → approval gate → deploy to production (canary/blue-green) → smoke tests → monitoring

  2. How do you implement blue/green deployment? → App Service: deploy to staging slot, test, swap to production. AKS: two deployments, switch service selector. Front Door: shift traffic weight.

  3. How do you manage secrets in pipelines? → Key Vault linked to Azure DevOps variable groups. GitHub Secrets for Actions. Managed Identity for pipeline-to-Azure (no service principals). Never in repo or plain text.

  4. How do you implement IaC? → Bicep/Terraform in git repo. PR triggers plan/validate. Apply in CD with approval. State in Azure Storage (Terraform). What-if/plan review before apply.

  5. How do you handle AKS upgrades? → Plan upgrade window. Upgrade control plane first, then node pools. Cordon/drain nodes. Pod disruption budgets ensure availability. Test in staging first.

  6. What is GitOps? → Git = single source of truth for infrastructure/app state. Flux/ArgoCD automatically syncs cluster to git. No manual kubectl. PR-based changes. Audit trail in git.

  7. How do you implement DevSecOps? → Shift security left: SAST in CI, SCA for dependencies, container scanning, IaC scanning, DAST in staging. Secret scanning in repos. SBOM generation. Sign images.

  8. How do you handle database migrations in CI/CD? → Entity Framework migrations as part of deployment. Run migrations in staging first. Use deployment slots (App Service) or init containers (AKS). Rollback: backward-compatible migrations only, or restore database backup.

  9. How do you manage multi-environment deployments? → Same IaC template with parameter files per environment. App Configuration per environment. Key Vault per environment. Pipeline stages with approval gates for prod.

  10. What DORA metrics do you track? → Deployment frequency, lead time, change failure rate, MTTR. Track in dashboards. Target: elite performance (deploy on demand, <1 hour lead time, <15% failure rate, <1 hour MTTR).

Role Deep Dive: Azure Security Engineer


Role Overview

Azure Security Engineers design, implement, and maintain the security posture of Azure environments. They protect data, identities, applications, and infrastructure from threats. They implement zero-trust architecture, manage identity security, configure network defenses, and respond to security incidents.

Alternative Titles: Cloud Security Engineer, Azure Security Architect, Cloud Security Analyst, Information Security Engineer

Typical Salary Range: $110,000 – $180,000 (US)


Core Responsibilities

1. Identity & Access Security (25% of role)

Granular Tasks: - Conditional Access Policies (Build in this order): 1. Block legacy authentication protocols (IMAP, POP, SMTP) 2. Require MFA for all users (with exclusions for break-glass accounts) 3. Require MFA for Azure management (all admin portals) 4. Block access from untrusted countries 5. Require compliant device for corporate resources 6. Require MFA + compliant device for privileged roles 7. Block sign-in for high-risk users (Identity Protection) 8. Require MFA for medium-risk sign-ins (Identity Protection) 9. Require terms of use for guest access 10. Session controls: limited access for unmanaged devices

2. Network Security (20% of role)

Granular Tasks: - Zero-Trust Network Architecture: - Verify explicitly: authenticate and authorize every request (MFA, Conditional Access, RBAC) - Use least privilege: minimal access, just-in-time, PIM - Assume breach: microsegmentation, Private Endpoints, blast radius containment - All PaaS services: Private Endpoints (disable public access) - All management: Bastion (no public RDP/SSH), JIT VM access - All outbound: Azure Firewall (inspect, filter, log) - All web traffic: WAF (Prevention mode) - Network segmentation: separate subnets per tier, NSGs deny-all default

3. Data Protection & Encryption (15% of role)

Granular Tasks: - Key Vault Security: - RBAC authorization model (not access policies) - Private Endpoint for all access - Firewall: deny public access, allow specific VNets - Soft delete: enabled (default), purge protection: enabled - Audit logging: send to Log Analytics + Storage Account - Key rotation: auto-rotate on creation (set rotation policy) - Certificate auto-renewal: integrate with DigiCert/GlobalSign - HSM-backed keys for cryptographic operations (Premium or Managed HSM)

4. Threat Detection & Response (15% of role)

Granular Tasks: - Defender for Cloud: - Enable all Defender plans (Servers, App Service, SQL, Storage, Containers, Key Vault, DNS, IoT, Databases) - Review Secure Score weekly, remediate critical recommendations - Configure email notifications for critical alerts - Enable JIT VM access for all internet-facing VMs - Export alerts to Sentinel for centralized investigation

5. Governance & Compliance (15% of role)

Granular Tasks: - Key Security Policies: - Deny public endpoints on PaaS services (Storage, SQL, Key Vault) - Require encryption at rest (audit resources without encryption) - Require HTTPS only on App Service - Deny resource creation in non-approved regions - Require tags (DataClassification, Owner) - Audit diagnostic settings (ensure logging enabled) - Deny privileged containers in AKS - Require SQL TDE enabled - Audit NSG rules allowing unrestricted inbound access

6. Application Security (10% of role)

Granular Tasks: - App Service: HTTPS only, min TLS 1.2, disable FTP, disable remote debugging, managed identity, private endpoint - AKS: private cluster, network policies (Calico/Cilium), workload identity, pod security standards, secret store CSI driver, image scanning - APIM: validate JWT, rate limit, CORS policy, IP filtering, client certificate auth - Container security: scan images (Defender for Containers), sign images, pin digests


Azure Services Used Daily

Category Services
Identity Entra ID, PIM, Identity Protection, Conditional Access, Managed Identity
Network Azure Firewall, WAF, NSG, Private Link, Bastion, DDoS Protection, Front Door
Data Protection Key Vault, Managed HSM, TDE, Always Encrypted, Dynamic Data Masking, Purview
Threat Detection Defender for Cloud, Sentinel, Defender for Endpoint, Identity Protection
Governance Azure Policy, Blueprints, Management Groups, RBAC
Compliance Microsoft Purview, Compliance Manager, Regulatory Compliance

Security Architecture Checklist

Identity

Network

Data

Monitoring

Governance


Certification Path

Certification Level Focus
SC-900 Foundational Security fundamentals
AZ-500 Associate Core cert — Azure Security Engineer
SC-100 Expert Cybersecurity Architect
SC-200 Associate Security Operations Analyst (Sentinel focus)
SC-300 Associate Identity & Access Administrator (Entra ID focus)

AZ-500 Exam Breakdown

Domain Weight
Manage identity and access 20-25%
Secure networking 20-25%
Secure compute, storage, and databases 25-30%
Manage security operations 25-30%

Interview Focus Areas

  1. How do you implement zero-trust in Azure? → Verify explicitly (MFA, Conditional Access, RBAC), use least privilege (PIM, JIT), assume breach (Private Endpoints, microsegmentation, WAF, blast radius containment).

  2. Walk me through your Conditional Access strategy. → Block legacy auth → MFA for all → MFA for Azure management → geo-block → compliant device → risk-based policies. Report-only mode first, then enforce.

  3. How do you secure PaaS services? → Private Endpoints (no public access), Azure Firewall for outbound, WAF for web entry, Managed Identity for auth, CMK for encryption, diagnostic logging, Azure Policy to enforce.

  4. How do you detect and respond to threats? → Defender for Cloud for detection, Sentinel for SIEM+SOAR, automated playbooks for response (block IP, disable user, isolate VM), incident response process (detect-triage-contain-investigate-remediate-recover).

  5. How do you manage secrets across the organization? → Key Vault per environment, RBAC authorization, Private Endpoints, audit logging, auto-rotation, Managed Identity for all service auth, no credentials in code or config.

  6. How do you implement compliance (HIPAA/PCI/GDPR)? → Azure Policy for compliance controls, Defender for Cloud regulatory dashboard, Private Endpoints, encryption, audit logging to immutable storage, data classification, DLP, access reviews.

  7. How do you secure AKS? → Private cluster, network policies, workload identity, CSI driver for secrets, pod security standards, image scanning, RBAC with Entra ID, Azure Policy for AKS, Defender for Containers.

  8. What is PIM and why is it critical? → Just-in-time privileged access. No permanent admin roles. Eligible assignments require approval + MFA + time limit. Reduces attack surface from compromised admin accounts.

  9. How do you handle a security incident? → Detect (alert) → Triage (severity, scope) → Contain (isolate resources) → Investigate (logs, timeline) → Remediate (fix, rotate, patch) → Recover (restore, verify) → Post-incident (RCA, update policies).

  10. How do you implement encryption strategy? → At rest: AES-256, PMK default, CMK for sensitive data (Key Vault auto-rotate). In transit: TLS 1.2+ enforced. In use: Confidential Computing (SGX/TEE) for sensitive workloads. Keys in Key Vault/Managed HSM.

Role Deep Dive: Cloud Auditor / Compliance Specialist


Role Overview

Cloud Auditors assess Azure environments against regulatory standards, organizational policies, and best practices. They ensure compliance, identify gaps, and provide remediation guidance. They bridge the gap between regulatory requirements and technical implementation.

Alternative Titles: Cloud Compliance Analyst, IT Auditor, GRC Specialist (Governance, Risk, Compliance), Cloud Auditor, Compliance Engineer

Typical Salary Range: $85,000 – $145,000 (US)


Core Responsibilities

1. Compliance Assessment & Auditing (30% of role)

Granular Tasks: - Use Microsoft Defender for Cloud Regulatory Compliance dashboard: select frameworks (ISO 27001, SOC 2, PCI DSS 4.0, HIPAA, NIST 800-53, CIS Benchmarks, GDPR, Azure CIS 1.4.0) - Review each control: passed/failed/Not applicable. For failed: identify resources, assess risk, assign remediation owner - Export compliance report (PDF/CSV) for auditors - Map Azure controls to framework requirements (control mapping document) - Track findings in a register: Finding ID, Severity, Resource, Control, Status, Owner, Due Date, Evidence - Re-assess after remediation to verify closure

2. Policy & Governance Review (25% of role)

Granular Tasks: - Azure Policy compliance report: % compliant per policy, list non-compliant resources - RBAC audit: list all Owner/Contributor assignments, verify justification, flag unused assignments, verify PIM usage - Review permanent admin role assignments (should be PIM-eligible only) - Tag audit: resources missing required tags (Environment, Owner, CostCenter, DataClassification) - Management group audit: verify policy inheritance, verify no policies accidentally applied at wrong scope - Subscription audit: orphaned resources, unattached disks, idle public IPs, unused VNets - Cross-subscription consistency: same policies applied across all subscriptions in management group

3. Data Governance & Privacy (20% of role)

Granular Tasks: - Data classification audit: verify all storage accounts/databases have data classification tags - Data residency: Azure Policy “allowed locations” enforced, verify no resources in non-compliant regions - Encryption audit: verify AES-256 at rest, TLS 1.2+ in transit, CMK for sensitive data - Retention audit: verify backup retention matches policy, verify log retention (7 years for compliance) - GDPR: verify data processing agreements, privacy impact assessments, right-to-erasure capability, data breach notification process - Data access audit: review who has access to sensitive data, verify least privilege, review sharing controls - Microsoft Purview: review data catalog, classification scan results, sensitivity labels

4. Audit Evidence & Documentation (15% of role)

Granular Tasks: - Evidence collection per control: - Screenshots of Azure Policy compliance - Export of RBAC assignments - Defender for Cloud compliance reports - Diagnostic settings configuration (proof of logging) - Key Vault access policies / RBAC assignments - Network configuration (NSG rules, Private Endpoints) - Conditional Access policy configurations - PIM configuration and assignment reports - Organize evidence by framework control (ISO 27001 A.9.2.1 → evidence: RBAC report, PIM config) - Maintain evidence repository (SharePoint/Teams) with version control - Quarterly evidence refresh (re-collect to show ongoing compliance)

5. Continuous Monitoring & Improvement (10% of role)

Granular Tasks: - Azure Monitor alerts for compliance drift (new non-compliant resources) - Sentinel analytics rules for suspicious compliance-related activity (new Owner assignments without PIM) - Monthly compliance scorecard per subscription - Quarterly trend report: compliance score over time, top failing controls, remediation velocity - Recommend new policies based on audit findings


Regulatory Frameworks & Azure Mapping

ISO 27001 (Information Security Management)

Control Area Azure Implementation
Access Control (A.9) Entra ID, RBAC, PIM, Conditional Access, MFA
Cryptography (A.10) Key Vault, TDE, Always Encrypted, TLS enforcement, CMK
Operations Security (A.12) Azure Monitor, Log Analytics, Defender for Cloud, Sentinel
Communications Security (A.13) NSGs, Azure Firewall, Private Endpoints, ExpressRoute, WAF
System Maintenance (A.11) Update Management, AKS upgrades, patch management

PCI DSS 4.0 (Payment Card Industry)

Requirement Azure Implementation
Network segmentation VNet, NSG, Azure Firewall, Private Endpoints
Encryption of cardholder data TDE, Always Encrypted, TLS 1.2+, Key Vault CMK
Access control on need-to-know RBAC, PIM, Conditional Access, Row-Level Security
Track and monitor access Azure Monitor, Log Analytics, Sentinel, SQL Auditing
Regular security testing Defender for Cloud, vulnerability assessment, penetration testing

HIPAA (Health Insurance Portability)

Rule Azure Implementation
Access controls Entra ID, RBAC, PIM, Conditional Access, MFA
Audit controls Azure Monitor, Log Analytics, SQL Auditing, Activity Log
Integrity controls Immutable Storage, versioning, backup, checksums
Transmission security TLS 1.2+, ExpressRoute, VPN, Private Endpoints
Encryption AES-256 at rest, TLS in transit, CMK, Always Encrypted

GDPR (General Data Protection Regulation)

Article Azure Implementation
Data minimization Store only necessary data, retention policies, auto-delete
Right to erasure Implement data deletion workflows, verify deletion
Data portability Export capabilities, standard formats
Breach notification Sentinel alerts, incident response process, 72-hour notification
Data Processing Agreement Microsoft DPA, sub-processor list
Cross-border transfer Data residency via allowed-locations policy, EU regions only

Certification Path

Certification Level Focus
SC-900 Foundational Security, Compliance, and Identity fundamentals
AZ-500 Associate Azure Security Engineer
SC-300 Associate Identity & Access Administrator
CISA Professional Certified Information Systems Auditor (ISACA)
CISM Professional Certified Information Security Manager (ISACA)
ISO 27001 Lead Auditor Professional ISO 27001 auditing certification

Interview Focus Areas

  1. How do you assess Azure compliance against ISO 27001? → Defender for Cloud regulatory compliance dashboard, select ISO 27001 benchmark, review each control, identify non-compliant resources, track remediation. Map Azure controls to ISO clauses.

  2. How do you audit RBAC assignments? → Export all role assignments, identify Owner/Contributor, verify PIM usage, flag permanent assignments, review least privilege, quarterly access reviews.

  3. How do you ensure data residency compliance? → Azure Policy “allowed locations” enforced at management group level. Verify no resources in non-compliant regions. Monitor with compliance dashboard.

  4. What evidence do you collect for an audit? → Policy compliance reports, RBAC exports, Defender for Cloud reports, diagnostic settings proof, network configs, Conditional Access configs, PIM reports, Key Vault audit logs.

  5. How do you handle a compliance finding? → Document finding (severity, resource, control, risk), assign remediation owner, set due date, track in register, verify remediation, close with evidence.

Role Deep Dive: FinOps / Cost Optimization Specialist


Role Overview

FinOps Specialists optimize cloud spending while ensuring teams get the resources they need. They implement cost governance, drive accountability, identify savings, and build a cost-conscious culture across engineering, finance, and leadership.

Alternative Titles: Cloud FinOps Analyst, Cloud Cost Engineer, Cloud Economics Specialist, FinOps Practitioner

Typical Salary Range: $90,000 – $155,000 (US)


Core Responsibilities

1. Cost Visibility & Allocation (25% of role)

Granular Tasks: - Azure Cost Management: configure cost analysis views by tag, resource group, service - Tagging strategy: Environment (prod/staging/dev), Project, CostCenter, Owner, Application - Enforce tagging with Azure Policy: “Require tag CostCenter on resource creation” (deny if missing) - Create Power BI dashboard from Cost Management exports: spend by team, service trend, forecast - Set up Cost Management exports: daily export to Storage Account, import to Power BI - Implement showback reports: monthly email to each team showing their Azure spend - Implement chargeback: allocate costs to business units based on tags, integrate with finance systems - Anomaly detection: alert when daily spend exceeds 20% above 7-day average

2. Cost Optimization (30% of role)

Granular Tasks: - Right-Sizing: - Azure Advisor recommendations: review VMs with <5% CPU for 7 days - Analyze actual CPU/memory/disk I/O: right-size from D4s to D2s if underutilized - Use Azure Monitor metrics to validate before downsizing - Test in staging first, monitor for 48 hours after resize

3. Budget Management & Governance (20% of role)

Granular Tasks: - Budget per subscription: monthly budget with 50%/80%/100% alerts - Budget per resource group: team-level budget tracking - Action groups: email team lead at 80%, email VP at 100%, Logic App to auto-stop dev VMs at 120% - Azure Policy: restrict VM sizes (deny expensive VMs like M-series in dev subscriptions) - Sandbox subscriptions: monthly budget cap, auto-delete resources when budget exceeded - EA/MCA commitment tracking: ensure committed spend is being utilized

4. Pricing & Licensing Optimization (15% of role)

Granular Tasks: - Azure Hybrid Benefit (AHB): - Windows Server: bring existing licenses, save up to 40% on VM compute - SQL Server: bring existing licenses, save up to 55% on SQL compute - Verify license coverage before enabling (don’t double-count) - Use AHUB for all production VMs with existing licenses

5. FinOps Culture & Process (10% of role)

Granular Tasks: - Monthly FinOps review: each team presents their spend, optimizations, and forecast - Cost as a first-class citizen in architecture reviews (right-size from day 1) - Cost estimation template for new projects: expected monthly cost, growth projection, RI eligibility - “Cost of delay” analysis: show cost of delaying optimization - Gamification: reward teams with highest cost savings


Cost Optimization Checklist

Compute

Storage

Networking

Databases

General


Certification Path

Certification Level Focus
AZ-900 Foundational Azure fundamentals (pricing, SLA, lifecycle)
AZ-104 Associate Azure Administrator (includes cost management)
FinOps Certified Practitioner Professional Core cert — FinOps Foundation
FinOps Certified Professional Professional Advanced FinOps

Interview Focus Areas

  1. How do you approach cost optimization? → Visibility (tagging, dashboards) → Optimize (right-size, RI, waste elimination) → Govern (budgets, policies) → Culture (accountability, training)

  2. What’s your Reserved Instance strategy? → Analyze 30-day usage, identify 24/7 workloads, buy 1-year for uncertain, 3-year for committed. Track utilization >80%. Exchange when needed.

  3. How do you implement chargeback/showback? → Tag all resources (CostCenter, Project). Export cost data to Power BI. Showback: monthly report. Chargeback: integrate with finance system. Enforce tagging with Azure Policy.

  4. A team’s Azure spend doubled this month. How do you investigate? → Cost Analysis: compare this month vs last by service/resource. Check for new resources, scale-up events, data transfer costs, storage growth. Review activity log for changes. Identify root cause, implement fix.

  5. How do you prevent cost overruns? → Budgets with alerts, Azure Policy restricting VM sizes and regions, auto-shutdown for dev/test, sandbox with budget caps, monthly FinOps reviews.

Role Deep Dive: Azure Network Engineer


Role Overview

Azure Network Engineers design, implement, and manage Azure networking infrastructure. They build the backbone that connects everything — VNets, firewalls, load balancers, DNS, hybrid connectivity, and traffic routing. Networking is the foundation of every Azure architecture.

Alternative Titles: Cloud Network Engineer, Azure Network Architect, Network Security Engineer (overlap)

Typical Salary Range: $95,000 – $155,000 (US)


Core Responsibilities

1. Virtual Network Design & Implementation (25% of role)

Granular Tasks: - Hub-Spoke Topology: - Hub VNet: 10.0.0.0/16 - GatewaySubnet: /27 (VPN/ExpressRoute) - AzureFirewallSubnet: /26 (Azure Firewall) - AzureBastionSubnet: /26 (Bastion) - ManagementSubnet: /28 (jumpboxes, management) - Spoke-Web: 10.1.0.0/16 - WebFrontend: /24 - WebBackend: /24 - DataSubnet: /24 - Spoke-App: 10.2.0.0/16 (same structure) - Never overlap address spaces (plan for peering from day 1) - Reserve room for growth: /16 per VNet minimum for production

2. Hybrid Connectivity (20% of role)

Granular Tasks: - ExpressRoute Design: - Choose provider: exchange provider (Equinix) or network service provider (AT&T, Verizon) - Circuit: 1 Gbps Standard, geo-redundant (2 circuits at different peering locations) - Private peering: connect to hub VNet via ExpressRoute gateway (ErGw1Az or ErGw3Az for 10 Gbps) - Microsoft peering: access M365 and Azure public services - ExpressRoute Global Reach: connect branch offices via Microsoft backbone - Failover: ExpressRoute primary + S2S VPN backup (automatic via BGP AS-path)

3. Load Balancing & Traffic Management (15% of role)

Decision Matrix:

Need Service
L4 TCP/UDP load balancing Azure Load Balancer
L7 HTTP/HTTPS with WAF Application Gateway
Global L7 with edge caching Front Door
DNS-based global routing (any protocol) Traffic Manager
Internal traffic distribution Internal Load Balancer

4. DNS Architecture (10% of role)

Granular Tasks: - Public DNS: host external domains in Azure DNS (A, CNAME, MX, TXT records) - Private DNS Zones: one zone per service type (privatelink.blob.core.windows.net, privatelink.database.windows.net) - Auto-registration: enable for VM A records in private zones - Split-brain DNS: same domain (contoso.com) → different records internally vs externally - Private Endpoint DNS: create A record in private zone pointing to private IP - Hybrid DNS: on-prem DNS → forward to Azure (168.63.129.16), Azure → forward to on-prem via ExpressRoute - Azure Firewall DNS proxy: enable for FQDN-based network rules

5. Network Security (15% of role)

Granular Tasks: - Azure Firewall Premium: deploy in hub, configure TLS inspection, IDPS (alert + deny), DNS proxy - WAF: Prevention mode, OWASP 3.2, custom rules (geo-filter, rate limit), exclusions for false positives - DDoS Protection Standard: enable on public IPs for production workloads - Private Endpoints: create for all PaaS services, disable public network access - Network Watcher: IP Flow Verify (debug NSG), Next Hop (debug routing), Connection Troubleshoot (end-to-end connectivity), NSG Flow Logs (traffic analytics) - JIT VM Access: open RDP/SSH ports only when needed

6. Network Monitoring & Troubleshooting (15% of role)

Granular Tasks: - NSG Flow Logs → Traffic Analytics: visualize traffic flows, identify top talkers, verify rules - Connection Monitor: monitor connectivity between endpoints (on-prem ↔︎ Azure, VNet ↔︎ VNet) - ExpressRoute monitor: track bandwidth utilization, BGP route changes - VPN monitor: tunnel status, bandwidth, packet drops - Network Performance Monitor: monitor bandwidth, latency, packet loss - Alert on: VPN tunnel down, ExpressRoute BGP session down, Firewall health issues


Network Architecture Patterns

Pattern 1: Simple Hub-Spoke

On-Prem ←→ ExpressRoute/VPN ←→ Hub VNet (Azure Firewall, Gateway, Bastion)
                                    ↕ (VNet Peering)
                                  Spoke-1 (App Workload)
                                  Spoke-2 (Data Workload)

Pattern 2: Multi-Region Hub-Spoke

Region A Hub ←→ Region B Hub (VNet Peering / Virtual WAN)
     ↕                    ↕
  Spoke-A1             Spoke-B1
  Spoke-A2             Spoke-B2

Pattern 3: Virtual WAN (Enterprise Scale)

Branch Offices ←→ Virtual WAN Hub ←→ VNet Spokes
Remote Users   ←→ (P2S VPN)     ←→ Azure Firewall
Partners       ←→ (S2S VPN)     ←→ ExpressRoute to on-prem

Certification Path

Certification Level Focus
AZ-900 Foundational Azure fundamentals
AZ-104 Associate Azure Administrator (includes networking)
AZ-700 Associate Core cert — Azure Network Engineer
AZ-500 Associate Security (complement)
AZ-305 Expert Solutions Architect (next step)

AZ-700 Exam Breakdown

Domain Weight
Design, implement, and manage hybrid networking 10-15%
Design and implement core networking infrastructure 20-25%
Design and implement routing 25-30%
Secure and monitor networks 15-20%
Design and implement Private Access to Azure Services 10-15%

Interview Focus Areas

  1. Design a hub-spoke network for a 3-workload organization. → Hub VNet with Firewall/Gateway/Bastion. Spoke per workload. VNet peering hub↔︎spokes. UDR on spokes → Firewall. Private Endpoints for PaaS.

  2. How do you implement hybrid connectivity with redundancy? → Dual ExpressRoute at different peering locations + S2S VPN backup. BGP for automatic failover. ExpressRoute preferred (shorter AS-path). VPN activates on ExpressRoute failure.

  3. When to use Front Door vs App Gateway vs Load Balancer? → Front Door = global, anycast, CDN. App Gateway = regional, WAF, L7. Load Balancer = L4, any protocol, high performance. Many use Front Door → App Gateway for global + regional.

  4. How do you implement zero-trust networking? → Private Endpoints for all PaaS (no public access), Azure Firewall for all outbound, WAF for all web traffic, NSGs deny-all default, Bastion for management, JIT VM access, network segmentation.

  5. How do you troubleshoot a VNet connectivity issue? → Network Watcher: IP Flow Verify (NSG check), Next Hop (routing check), Connection Troubleshoot (end-to-end). Check NSG rules, UDR routes, VNet peering status, DNS resolution.

  6. How do you design DNS for Private Endpoints? → Create Private DNS Zone per service (privatelink.blob.core.windows.net). Link to VNet. Create A record for private endpoint IP. Auto-registration for VMs. Azure Firewall DNS proxy for FQDN rules.

  7. What’s the difference between VNet peering and VPN? → Peering: connects VNets within Azure (Microsoft backbone), high bandwidth, low latency, not transitive. VPN: connects VNet to on-prem over internet (encrypted IPsec), lower bandwidth, higher latency, can be transitive with BGP.

  8. How do you implement Azure Firewall in a hub-spoke? → Deploy in AzureFirewallSubnet (/26) in hub. UDR on all spoke subnets: 0.0.0.0/0 → Firewall. Configure network rules, app rules, NAT rules. Premium for TLS inspection + IDPS. Firewall Manager for policy management.

Role Deep Dive: Azure Data Engineer


Role Overview

Azure Data Engineers design and implement data pipelines, data storage, and data processing solutions on Azure. They build the infrastructure that moves data from source to destination — transforming, cleaning, and organizing it along the way. They are the plumbers of the data world.

Alternative Titles: Data Engineer, Cloud Data Engineer, ETL Engineer, Data Platform Engineer

Typical Salary Range: $100,000 – $165,000 (US)


Core Responsibilities

1. Data Pipeline Design & Implementation (30% of role)

Granular Tasks: - Data Factory Pipeline: - Linked services: connections to sources (SQL Server, Blob, API, SFTP) and sinks (Synapse, ADLS, Cosmos DB) - Datasets: define data structure (schema, format, compression) - Activities: Copy Data (move data), Get Metadata (check existence), ForEach (iterate), If Condition (branch), Stored Procedure, Databricks Notebook - Triggers: Schedule (daily at 2 AM), Tumbling Window (hourly), Event-based (blob created) - Parameters: make pipelines reusable across environments - Error handling: set dependency conditions (on success, on failure, on completion) - Logging: pipeline run history, activity output, error messages

2. Data Storage Design (25% of role)

Granular Tasks: - Medallion Architecture: - Bronze (Raw): ingest all data as-is. Partition by source/date. Parquet/Delta format. - Silver (Cleaned): deduplicated, validated, conformed. Schema enforced. Delta format. - Gold (Business): aggregated, business-ready. Star schema. Delta format. Serve to BI. - Each layer in separate ADLS container or folder hierarchy

3. Data Integration & Orchestration (15% of role)

Granular Tasks: - CDC patterns: read change feed (Cosmos DB), query CDC tables (SQL Server), use watermark columns (UpdatedDate) - Incremental load: only process new/changed data (more efficient than full load) - Full load: truncate and reload (simple but slower, use for small tables) - SCD (Slowly Changing Dimension): Type 1 (overwrite), Type 2 (add row with valid dates), Type 3 (add column) - Orchestrate: Data Factory pipeline chains activities, handles dependencies, retries on failure - Self-hosted Integration Runtime: for on-prem data sources (install on on-prem server)

4. Data Quality & Governance (15% of role)

Granular Tasks: - Data quality checks in pipeline: - Row count validation (source count ≈ destination count) - Null check on required columns - Schema validation (column exists, correct type) - Business rule validation (values within expected range) - Referential integrity (foreign keys exist) - On failure: alert team, quarantine bad data, don’t load to production - Microsoft Purview: scan data sources, auto-classify sensitive data, track lineage (source → transform → destination) - Data catalog: document datasets, owners, freshness, quality score

5. Data Processing with Databricks (15% of role)

Granular Tasks: - Delta Lake operations: - MERGE INTO target USING source ON target.id = source.id WHEN MATCHED THEN UPDATE WHEN NOT MATCHED THEN INSERT - Time travel: SELECT * FROM table VERSION AS OF 5 - VACUUM: remove old versions (retain 168 hours minimum) - OPTIMIZE: compact small files into larger ones - Z-ORDER: cluster data by frequently filtered columns - Spark optimization: - Partition data by frequently filtered column - Use broadcast joins for small-large table joins - Cache frequently accessed DataFrames - Avoid UDFs when possible (use built-in functions) - Right-size cluster: auto-scale, use job clusters (ephemeral)


Data Architecture Patterns

Pattern 1: Lambda Architecture (Batch + Real-time)

Source → Event Hubs → Stream Analytics → Cosmos DB (real-time serving)
                                    → ADLS (real-time archival)
Source → Data Factory → ADLS (Bronze) → Databricks (Silver) → Synapse (Gold) → Power BI

Pattern 2: Modern Data Warehouse

Sources → Data Factory → ADLS Gen2 (raw) → Synapse/Spark (transform) → Synapse SQL (serve) → Power BI

Pattern 3: Real-time Analytics

IoT Devices → IoT Hub → Event Hubs → Stream Analytics → Cosmos DB (hot path)
                                                  → ADLS (cold path) → Synapse (batch analytics)

Certification Path

Certification Level Focus
DP-900 Foundational Azure Data fundamentals
DP-203 Associate Core cert — Azure Data Engineer
DP-300 Associate Azure Database Administrator

DP-203 Exam Breakdown

Domain Weight
Design and implement data storage 10-15%
Design and develop data processing 25-30%
Design and implement data security 10-15%
Monitor and optimize data storage and data processing 10-15%
Design and implement data governance 10-15%

Interview Focus Areas

  1. Design a data pipeline from on-prem SQL to Azure. → Self-hosted IR → Data Factory → Copy Activity → ADLS (Bronze) → Databricks (Silver/Gold) → Synapse SQL → Power BI. Incremental load using watermark.

  2. Explain the medallion architecture. → Bronze (raw, as-is), Silver (cleaned, validated, conformed), Gold (business aggregates, star schema). Each layer adds quality and structure.

  3. How do you handle incremental data loads? → Watermark column (UpdatedDate), store last watermark value, next run reads > last watermark. CDC for real-time. Delta Lake MERGE for upserts.

  4. When to use Synapse vs Databricks? → Synapse: integrated platform (SQL + Spark + Pipelines), good for SQL-heavy teams. Databricks: superior Spark experience, Delta Lake native, collaborative notebooks, better for data science. Many use both.

  5. How do you ensure data quality in pipelines? → Validation at each layer: row counts, null checks, schema validation, business rules. Quarantine bad data. Alert on failures. Track quality metrics over time.

  6. What is Delta Lake and why use it? → Storage layer on data lake adding ACID transactions, time travel, schema enforcement, MERGE (upsert). Solves data reliability. Default in Databricks.

  7. How do you optimize Synapse dedicated SQL pool? → Choose correct distribution (Hash for large facts, Replicated for small dims), use Columnstore indexes, partition by date, use PolyBase for loading, CTAS for transforms, avoid data movement.

  8. How do you design for real-time vs batch? → Real-time: Event Hubs + Stream Analytics + Cosmos DB (sub-second latency). Batch: Data Factory + ADLS + Databricks + Synapse (minutes-hours latency). Lambda: both paths, merge at serving layer.

Role Deep Dive: Azure Data Scientist


Role Overview

Azure Data Scientists build and deploy machine learning models on Azure. They experiment with algorithms, train models at scale, and operationalize ML solutions. They combine data science expertise with Azure ML platform knowledge.

Alternative Titles: ML Engineer, Applied Data Scientist, Cloud Data Scientist

Typical Salary Range: $110,000 – $175,000 (US)


Core Responsibilities

1. Model Development & Experimentation (35% of role)

Granular Tasks: - Data preparation in Databricks notebooks (PySpark, pandas) - Feature engineering: create derived features, handle missing values, encode categoricals, scale numericals - Model training: scikit-learn, XGBoost, LightGBM, PyTorch, TensorFlow - Experiment tracking: log parameters, metrics, artifacts with MLflow - Hyperparameter tuning: Azure ML SweepJob (random, grid, Bayesian sampling) - Cross-validation: stratified k-fold for classification, time-series split for forecasting - Model evaluation: accuracy, precision, recall, F1, AUC-ROC, RMSE, MAE - Interpretability: SHAP values, feature importance, partial dependence plots

2. Azure Machine Learning Platform (25% of role)

Granular Tasks: - AML Workspace: central resource for all ML artifacts - Compute Instance: dev environment (Jupyter, VS Code integrated) - Compute Cluster: auto-scaling training cluster (GPU/CPU), set min/max nodes - Datastores: connect to Blob, ADLS, SQL, PostgreSQL - Datasets: versioned, typed data references (Tabular, File) - Model registration: register trained model with version, description, tags - AML Pipelines: orchestrate training workflow (data prep → train → evaluate → register) - AutoML: automated algorithm selection + hyperparameter tuning + feature engineering - Responsible AI: assess fairness, interpretability, error analysis

3. Model Deployment & Operationalization (20% of role)

Granular Tasks: - Real-time Endpoint (Managed Online Endpoint): - Deploy model with scoring script (entry_script), environment (conda/Docker), compute - Blue/green deployment: deploy v2 alongside v1, shift traffic gradually - Auto-scaling: based on request count - Authentication: key-based or AAD token

4. Responsible AI & Ethics (10% of role)

Granular Tasks: - Fairness assessment: compare model performance across groups (gender, race, age) - SHAP explanations: show which features influenced each prediction - Error analysis: identify subgroups with higher error rates - Model cards: document intended use, limitations, performance, ethical considerations - Privacy: differential privacy, data anonymization

5. Advanced Analytics (10% of role)

Granular Tasks: - Recommendations: collaborative filtering, content-based, Azure AI Search with personalizer - NLP: text classification, NER, sentiment analysis (Language Service), custom models - Computer Vision: image classification, object detection (Custom Vision, Vision Studio) - LLM: Azure OpenAI for text generation, summarization, RAG (Retrieval-Augmented Generation) - RAG architecture: Azure OpenAI + AI Search (index documents) → ground LLM responses in your data


Certification Path

Certification Level Focus
DP-900 Foundational Data fundamentals
AI-900 Foundational AI fundamentals
DP-100 Associate Core cert — Azure Data Scientist
AI-102 Associate Azure AI Engineer (complement)

DP-100 Exam Breakdown

Domain Weight
Set up an Azure Machine Learning workspace 5-10%
Manage data in Azure Machine Learning 5-10%
Run experiments and train models 20-25%
Optimize and manage models 15-20%
Deploy and consume models 25-30%

Interview Focus Areas

  1. Walk me through an ML project lifecycle on Azure. → Define problem → Explore data (Databricks) → Feature engineering → Train models (AML + MLflow) → Evaluate → Register → Deploy (managed endpoint) → Monitor (drift) → Retrain

  2. How do you handle model drift? → Monitor input distribution and prediction distribution. When drift exceeds threshold, trigger retraining pipeline. Compare new model vs champion model on test set. Deploy if improved.

  3. How do you deploy a model with zero downtime? → Blue/green deployment on managed online endpoint. Deploy v2 alongside v1. Shift traffic gradually (10% → 50% → 100%). Rollback = route traffic back to v1.

  4. Explain MLOps. → ML lifecycle automation: CI (test code), CD (deploy model), CT (continuous training). Pipeline: data change → retrain → evaluate → register → deploy. Model registry with approval gates.

  5. What is RAG and how do you implement it on Azure? → Retrieval-Augmented Generation. Index documents in AI Search. When user queries: retrieve relevant docs → pass as context to Azure OpenAI → generate grounded response. Reduces hallucinations.

Role Deep Dive: Azure AI Engineer


Role Overview

Azure AI Engineers build AI-powered solutions using Azure AI Services (Cognitive Services), Azure OpenAI, and Azure AI Search. They integrate pre-built AI capabilities into applications, build custom models, and implement generative AI solutions. They focus on applied AI — making AI work in production.

Alternative Titles: AI Engineer, Applied AI Engineer, Cognitive Services Engineer, GenAI Engineer

Typical Salary Range: $110,000 – $180,000 (US)


Core Responsibilities

1. Azure OpenAI Solutions (30% of role)

Granular Tasks: - Azure OpenAI Setup: - Create Azure OpenAI resource - Deploy model: GPT-4o (general), GPT-4 (complex reasoning), GPT-3.5-turbo (fast/cheap), DALL-E 3 (image generation), text-embedding-ada-002 (embeddings) - Configure content filters: hate, sexual, violence, self-harm (default medium). Adjust per use case. - Private Endpoint for secure access - RBAC: Cognitive Services OpenAI User, Cognitive Services OpenAI Contributor

2. Azure AI Services (Cognitive Services) (25% of role)

Granular Tasks: - Vision: - Computer Vision: analyze images (tags, objects, faces, description), OCR (read text), spatial analysis - Custom Vision: train custom image classifier or object detector with your labeled images - Face: detect faces, verify identity, find similar faces, group faces

3. Azure AI Search (Cognitive Search) (20% of role)

Granular Tasks: - Index Design: - Fields: define searchable, filterable, sortable, facetable, retrievable per field - Analyzer: choose language analyzer (English, etc.) or custom analyzer - Suggesters: configure autocomplete/suggestions - Scoring profiles: boost results by recency, location, or custom weight

4. Bot Development (10% of role)

Granular Tasks: - Bot Framework SDK: build bot in C#/Python/Node.js - Bot Service: host bot, configure channels (Teams, Web Chat, Slack, Facebook, etc.) - Conversational flow: dialogs, waterfalls, prompts - Connect to CLU (Language Understanding) for intent recognition - Connect to Azure OpenAI for generative responses - Copilot Studio: low-code bot builder (formerly Power Virtual Agents)

5. AI Solution Architecture (15% of role)

Granular Tasks: - Architecture decision: pre-built API (fast, less control) vs custom model (slow, more control) - Latency optimization: cache responses, use smaller models for real-time, batch for offline - Cost optimization: choose right pricing tier, throttle API calls, cache embeddings - Responsible AI: assess fairness, reliability, privacy, transparency, accountability - Multi-region: deploy AI services in nearest region for low latency


AI Solution Architecture Patterns

Pattern 1: Document Intelligence Pipeline

Documents (PDF/images) → Document Intelligence (extract structure)
                       → AI Search (index + search)
                       → Azure OpenAI (summarize, Q&A over documents)
                       → Web App (user interface)

Pattern 2: Conversational AI

User → Bot Service → CLU (intent/entity recognition)
                   → Azure OpenAI (generative response)
                   → AI Search (knowledge base lookup)
                   → Custom APIs (action execution)
                   → User

Pattern 3: Real-time Vision Pipeline

Camera/IoT → Event Hubs → Functions → Computer Vision API
                                            → Cosmos DB (store results)
                                            → Alerts (anomaly detection)

Pattern 4: Enterprise RAG

Internal Documents → AI Search Indexer (chunk + embed) → AI Search Index
User Query → Frontend → API → Embed query → AI Search (vector + semantic)
                                           → Azure OpenAI (generate response)
                                           → Frontend (display with citations)

Certification Path

Certification Level Focus
AI-900 Foundational AI fundamentals
AI-102 Associate Core cert — Azure AI Engineer
DP-100 Associate Data Scientist (complement for ML)

AI-102 Exam Breakdown

Domain Weight
Plan and manage an Azure AI solution 5-10%
Implement decision support solutions 15-20%
Implement computer vision solutions 15-20%
Implement natural language processing solutions 20-25%
Implement knowledge mining and document intelligence solutions 15-20%
Implement generative AI solutions 15-20%

Interview Focus Areas

  1. How do you build a RAG solution on Azure? → Chunk documents → embed with text-embedding-ada-002 → store in AI Search index (vector field). On query: embed query → vector search → retrieve context → send to GPT-4o with system prompt + context → return grounded response with citations.

  2. How do you prevent hallucinations in LLM responses? → RAG (ground responses in retrieved context), prompt engineering (explicit instructions to use only provided context), content safety filters, groundedness detection, citation tracking, lower temperature (0-0.3 for factual).

  3. How do you choose between pre-built AI services and custom models? → Pre-built: fast to implement, no training data needed, good for common tasks (sentiment, OCR, translation). Custom: when pre-built doesn’t cover your domain, need higher accuracy, unique classification. Start with pre-built, build custom only if needed.

  4. How do you implement document processing at scale? → Document Intelligence for extraction → AI Search for indexing → AI enrichment skillset for deeper analysis → Azure OpenAI for Q&A. Indexer runs on schedule. Batch processing for large volumes.

  5. How do you secure Azure OpenAI in production? → Private Endpoint (no public access), RBAC (OpenAI User/Contributor), content filters, prompt shields, rate limiting, API management for external consumers, log all requests/responses, monitor for abuse.

  6. What is semantic search vs keyword search vs vector search? → Keyword: exact term match. Vector: semantic similarity via embeddings. Semantic: re-ranks keyword results by meaning. Best: hybrid (keyword + vector) + semantic ranker. Semantic understanding improves relevance significantly.

  7. How do you handle multi-language AI solutions? → Translator API for translation, language-specific analyzers in AI Search, multi-language Document Intelligence models, Azure OpenAI works in 50+ languages, store language preference per user, auto-detect language.

  8. How do you implement responsible AI? → Fairness: test across demographic groups. Reliability: test edge cases, adversarial inputs. Privacy: no PII in prompts, data retention policies. Transparency: document model limitations. Accountability: human review for high-stakes decisions. Content safety filters.