Category: Windows Azure

Microsoft Azure Devops – Dynamic Docker Agent (Ansible)

Often you may require a unique custom build/release agent with a specific set of tools.

A good example is a dynamic Ansible agent that can manage post-deployment configuration, which helps minimise configuration drift.

Secondly, this part of a release is not time-critical, so we can afford to spend a bit of time downloading a Docker image if it is not already cached.

This article demonstrates how you can dynamically spawn a Docker container during your release pipeline to apply configuration with Ansible. It will also demonstrate how to use Ansible dynamic inventory to detect Azure virtual machine scale set instances – in the past you would resort to hacks with Facter.

Prerequisites

You will require:

  • A docker image with ansible – You can use mine as a starting point – https://github.com/Romiko/DockerUbuntuDev
    The above is hosted at: dockerhub – romiko/ansible:latest (See reference at bottom of this page)
  • A self-hosted Azure DevOps agent – Linux
  • Docker installed on the self-hosted agent
  • Docker configured to expose the Docker socket (a quick sanity check is shown below)
    docker run -v /var/run/docker.sock:/var/run/docker.sock -d --name some_container some_image
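
With the Docker socket mounted, any docker command issued from inside the agent container talks to the host's Docker daemon, so containers it starts become siblings of the agent rather than children. A minimal sanity check, assuming the Docker CLI is installed inside the agent container:

docker ps    # should list the agent container itself, proving access to the host daemon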

Release Pipeline

Configure a CLI Task in your release pipeline.

variables:
  env: 'dev'

steps:
- task: AzureCLI@2
  displayName: 'Azure CLI Ansible'
  inputs:
    azureSubscription: 'RangerRom'
    scriptType: bash
    scriptLocation: inlineScript
    inlineScript: |
     set -x
     
     docker run --rm -v $(System.DefaultWorkingDirectory)/myproject/config:/playbooks/ romiko/ansible:latest \
      "cd  /playbooks/ansible; ansible-playbook --version; az login --service-principal --username $servicePrincipalId --password $servicePrincipalKey --tenant $tenantId; az account set --subscription $subscription;ansible-playbook my-playbook.yaml -i inventory_$(env)_azure_rm.yml --extra-vars \"ansible_ssh_pass=$(clientpassword)\""
    addSpnToEnvironment: true
    workingDirectory: '$(System.DefaultWorkingDirectory)/myproject/config/ansible'

In the above, the code that causes a SIBLING container to spawn on the self-hosted DevOps agent is:

docker run --rm -v $(System.DefaultWorkingDirectory)/myproject/config:/playbooks/ romiko/ansible:latest \ <command to execute inside the container>

Here we have a mount point occurring, where the config folder in the repo is mounted into the Docker container.

-v <SourceFolder>:<MountPointInDockerContainer>

The rest of the code after the \ will execute inside the Docker container. So in the above:

  • The container becomes a sibling of the agent container
  • It enters a bash shell
  • The container mounts a /playbooks folder containing the source code from the build artifacts
  • It connects to Azure
  • It runs an Ansible playbook
  • The playbook finds all virtual machine scale sets in a resource group matching a name pattern
  • It applies configuration by setting Logstash to auto-reload config files when they change
  • It applies configuration by copying files

Ansible

The above is used to deploy configurations to an Azure Virtual Machine Scale Set. Ansible has a feature called dynamic inventory. We will leverage this feature to detect all active nodes/instances in a VMSS.

The Ansible project structure is made up of a dynamic inventory file, a group_vars folder for host group variables, and the playbook itself, each shown below.

Ansible Dynamic Inventory

So let's see how Ansible can be used to detect all running instances in an Azure Virtual Machine Scale Set.

inventory_dev_azure_rm.yml

Below, it will detect any VMSS cluster in resource group rom-dev-elk-stack that has logstash in the name.

plugin: azure_rm

include_vmss_resource_groups:
- rom-dev-elk-stack

conditional_groups:
  logstash_hosts: "'logstash' in name"

auth_source: auto
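
Before wiring the inventory into the pipeline, you can sanity-check it locally. This is a hedged example; it assumes the azure_rm inventory plugin is available and you are already authenticated against Azure (auth_source: auto picks up az login or service principal environment variables):

ansible-inventory -i inventory_dev_azure_rm.yml --graph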

logstash_hosts.yml (Ensure this lives in a group_vars folder)

Now, I can configure ssh using a username or ssh keys.

---
ansible_connection: ssh
ansible_ssh_user: logstash

logstash-playbook.yaml

Below I now have ansible doing some configuration checks for me on a logstash pipeline (upstream/downstream architecture).


    - name: Logstash auto reloads check interval
      lineinfile:
        path: /etc/logstash/logstash.yml
        regexp: '^config\.reload\.interval'
        line: "config.reload.interval: 30s"
      become: true
      notify:
        - restart_service

    - name: Copy pipeline configs
      copy:
        src: ../pipelines/conf.d/
        dest: /etc/logstash/conf.d/
        owner: logstash
        group: logstash
      become: true
    
    - name: Copy pipeline settings
      copy:
        src: ../pipelines/register/
        dest: /etc/logstash/
        owner: logstash
        group: logstash
      become: true
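
The notify directives above assume the playbook defines a restart_service handler. A minimal sketch of such a handler is shown below; the service name is an assumption based on the Logstash context:

  handlers:
    - name: restart_service
      service:
        name: logstash
        state: restarted
      become: true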

To improve security – replace user/password ansible login with an SSH key pair.

References

To read up more about Docker socket mount points, check out:

https://www.develves.net/blogs/asd/2016-05-27-alternative-to-docker-in-docker/

https://docs.ansible.com/ansible/latest/user_guide/intro_dynamic_inventory.html

Thanks to Shawn Wang and Ducas Francis for the inspirations on Docker Socket.

https://azure.microsoft.com/en-au/services/devops/

Azure Streaming Analytics – Beware – JSON Engine

After investigating an issue with Azure Streaming Analytics, we discovered it cannot deserialise JSON documents that contain property names differing only in case, e.g.

{
  "Proxy": "abc",
  "proxy": "def"
}

If you send the above payload to a Streaming Analytics Job, it will fail.

Source ‘<unknown_location>’ had 1 occurrences of kind ‘InputDeserializerError.InvalidData’ between processing times ‘2020-03-30T00:19:27.8689879Z’ and ‘2020-03-30T00:19:27.8689879Z’. Could not deserialize the input event(s) from resource ‘Partition: [8], Offset: [1], SequenceNumber: [1]’ as Json. Some possible reasons: 1) Malformed events 2) Input source configured with incorrect serialization format

We opened a ticket with Microsoft. This was the response.

“Hi Romiko,

Thank you for being patience with us. I had further discussion with our ASA PG and here’s our findings.

Findings

ASA unfortunately does not support case sensitive column. We understand it is possible for json documents to add to have two columns that differ only in case and that some libraries support it. However there hasn’t been a compelling use case to support it. We will update the documentation as well. 

We are sorry for the inconvenience. If you have any questions or concerns, please feel free to reach out to me. I will be happy to assist you.”

Indeed, other libraries do support this, such as PowerShell, C#, Python, etc.

C#

using Newtonsoft.Json;

var data = "{'name': 'Ashley', 'Name': 'Romiko'}";
dynamic message = JsonConvert.DeserializeObject(data);
var text = JsonConvert.SerializeObject(message);

I have built the following tool on github as a workaround for streaming data to elastic. https://github.com/Romiko/EventHub-CheckMalformedEvents

A significant reason why Microsoft should support it is the Elastic Common Schema (ECS), a specification that provides a consistent and customizable way to structure your data in Elasticsearch, facilitating the analysis of data from diverse sources. With ECS, analytics content such as dashboards and machine learning jobs can be applied more broadly, searches can be crafted more narrowly, and field names are easier to remember.

When introducing a new schema, there is always existing/custom data to deal with. Elastic have an ingenious way to solve this: all fields in ECS are lower case, so your existing data is guaranteed not to conflict if you use uppercase custom field names.

Let us reference Elastic’s advice

https://www.elastic.co/guide/en/ecs/current/ecs-custom-fields-in-ecs.html


Elastic, who deal with big data all the time, recommend using Proxy vs proxy to ensure migration to ECS is a viable, conflict-free exercise.

Conclusion

If you are migrating huge amounts of data to the Elastic Common Schema (ECS), consider whether Azure Streaming Analytics is a good fit, given these JSON limitations.

You can also vote to fix this issue here and improve Microsoft’s product offering 🙂

https://feedback.azure.com/forums/270577-stream-analytics/suggestions/40122079-azure-stream-analytics-to-be-able-to-handle-case-s

Debugging Azure Event Hubs and Stream Analytics Jobs

When you are dealing with millions of events per day (in JSON format), you need a debugging tool for events that do not behave as expected.

Recently we had an issue where an Azure Streaming analytics job was in a degraded state. A colleague eventually found the issue to be the output of the Azure Streaming Analytics Job.

The error message was very misleading.

[11:36:35] Source 'EventHub' had 76 occurrences of kind 'InputDeserializerError.TypeConversionError' between processing times '2020-03-24T00:31:36.1109029Z' and '2020-03-24T00:36:35.9676583Z'. Could not deserialize the input event(s) from resource 'Partition: [11], Offset: [86672449297304], SequenceNumber: [137530194]' as Json. Some possible reasons: 1) Malformed events 2) Input source configured with incorrect serialization format\r\n"

The source of the issue was Cosmos DB – we needed to increase the RUs. However, the error seemed to indicate a serialization issue.

We developed a tool that could subscribe to events at exactly the same time of the error, using the sequence number and partition.

We also wanted to be able to use the tool for a large number of events – around 1 million per hour.

Please click the link to the Event Hub .NET client. This tool is optimised to use as little memory as possible and leverages asynchronous file writes for an optimal event subscription experience (a console app, of course).

I have purposely avoided the Newtonsoft library for the final file write to improve performance.

The output will be a json array of events.

The next time you need to be able to subscribe to event hubs to diagnose an issue with a particular event, I would recommend using this tool to get the events you are interested in analysing.

Thank you.

Creating a Cloud Architecture Roadmap


Overview

When a product has been proved to be a success and has just come out of a MVP (Minimal Viable Product) or MMP (Minimal Marketable Product) state, usually a lot of corners would have been cut in order to get a product out and act on the valuable feedback. So inevitably there will be technical debt to take care of.

What is important is having a technical vision that reduces costs and delivers value, impact, scalability, resilience and reliability, and that can then be communicated to all stakeholders.

A lot of cost savings can be made when scaling out by putting together a Cloud Architecture Roadmap. The roadmap can then be communicated to your stakeholders, development teams and, most importantly, finance. It will provide a high level “map” of where you are now and where you want to be at some point in the future.

A roadmap is ever changing, just like when my wife and I go travelling around the world. We will have a roadmap of where we want to go for a year, but are open to making changes halfway through the trip, e.g. an earthquake hits a country we planned to visit. The same is true in IT; sometimes budgets are cut or a budget surplus needs to be consumed, and such events can affect your roadmap.

It is something that you want to review on a regular schedule. Most importantly you want to communicate the roadmap and get feedback from others.

Feedback from other engineers and stakeholders is crucial – they may spot something that you did not or provide some better alternative solutions.

Decomposition

The first stage is to decompose your ideas. Below is a list that helps get me started in the right direction. This is by no means an exhaustive list; it will differ based on your industry.

Component | Description | Example
Application Run-time | Where apps are hosted | Azure Kubernetes
Persistent Storage | Non-volatile data | File store, block store, object store, CDN, message, database, cache
Backup/Recovery | Backup/redundant solutions | Managed services, Azure OMS, Recovery Vaults, volume images, GEO redundancy
Data/IOT | Connected devices / sensors | Streaming Analytics, Event Hubs, AI/Machine Learning
Gateway | How services are accessed | Azure Front Door, NGINX, Application Gateway, WAF, Kubernetes Ingress Controllers
Hybrid Connectivity | On-premise access, cross cloud | Express Route, jumpboxes, VPN, Citrix
Source Control | Where code lives, build – CI/CD | Github, Bitbucket, Azure Devops, Octopus Deploy, Jenkins
Certificate Management | SSL certificates | Azure Key Vault, SSL offloading strategies
Secret Management | Store sensitive configuration | Puppet (Hiera), Azure Keyvault, Lastpass, 1Password
Mobile Device Management | Google Play, AppStore | G-Suite Enterprise MDM etc

Once you have an idea of all your components, the next step is to break down your road-map into milestones that will ultimately assist in reaching your final/target state – which of course will not be final in a few years' time 😉 or even months!

Sample Roadmap

Below is a link to a google slide presentation that you can use for your roadmap.

https://docs.google.com/presentation/d/1Hvw46vcWJyEW5b7o4Xet7jrrZ17Q0PVzQxJBzzmcn2U/edit?usp=sharing

Query Azure AppInsights with Powershell

In order to query AppInsights using PowerShell, you will need your AppInsights AppId and API key.

The important consideration is to ensure your JSON is valid, so always run it through a parser and use the correct escape characters for both JSON and PowerShell. Have a look at the string in $queryData.

The following code will query AppInsights and generate CSV files based on the batch size. It also uses paging by leveraging:

| serialize | extend rn = row_number()

Happy DevOps Reporting 🙂

param (
[Parameter(Mandatory = $true)]
[string] $AppInsightsId,

[Parameter(Mandatory = $true)]
[string] $apiKey,

[Parameter(Mandatory = $false)]
[string]
$Timespan = "P7D",

[Parameter(Mandatory = $false)]
[int]
$batchSize = 10000,

[Parameter(Mandatory = $false)]
[string]
$OutputFolder = "C:\Output\",

[Parameter(Mandatory = $false)]
[string]$logFileName = "AppQuery.log",

[Parameter(Mandatory = $false)]
[string]$logFolder = "C:\Logs\"
)

Add-Type -AssemblyName System.Web

Import-Module .\Helpers.psm1 -Force -ErrorAction Stop
Import-Module .\Shared.Logging.psm1 -Force -Global
Import-Module .\Security.Helpers.psm1 -Force -Global

function prepareFileHeader($filenumber, $columnNames) {
    $csvString = ""
    ForEach ( $Property in $columnNames )
    {
        $csvString += "$($Property.Name),"
    }
    $csvString = $csvString.Substring(0,$csvString.Length -1)
    $file = Join-Path $OutputFolder "batch-$i.csv"
    $csvString | Out-File -filepath $file -Encoding utf8
    $csvString = $null
}

function writeRecordsToFile($records) {
    ForEach ( $record in $records )
    {
        $csvString = ""
        foreach($cell in $record) {
            $csvString += "$cell,"
        }
        $csvString = $csvString.Substring(0,$csvString.Length -1)
        $file = Join-Path $OutputFolder "batch-$i.csv"
        $csvString | Out-File -filepath $file -Encoding utf8 -Append -NoClobber
        $csvString = $null
    }
}

$logFilePath = PrepareToLog $logFolder $logFileName

try {
$url = "https://api.applicationinsights.io/v1/apps/$AppInsightsId/query"
$headers = @{"Content-Type" = "application/json"}
$headers.add("x-api-key", $apiKey)
$queryString = "?timespan=$Timespan"
$fullUrl = $url + $queryString

$queryTotalMessageCount = "traces\r\n | where message contains \`"Max Retry Count reached\`" and message contains \`"MessageService\`"\r\n | summarize count()"
$queryTotalMessageCountBody = "{
`"query`": `"$queryTotalMessageCount`"
}"
$resultCount = Invoke-WebRequest -Uri $fullUrl -Headers $headers -Method POST -Body $queryTotalMessageCountBody -ErrorAction Continue
$totalObject = ConvertFrom-Json $resultCount.Content

$totalRecords = $totalObject.tables.rows[0]
$pages = [math]::ceiling($totalRecords/$batchSize)
$startRow = 0
$endRow = $batchSize

Write-Host "Total Files: $pages for Batch Size: $batchSize"
For ($i=1; $i -le $pages; $i++) {
Write-Host "Processing File: C:\batch-$i.csv"
$queryData = "traces\r\n | extend TenantId = extract(\`"Tenant Id \\\\[[a-z0-9A-Z-.]*\\\\]\`", 0, message) | extend UniqueTransactionId = extract(\`"\\\\[[a-z0-9A-Z-. _\\\\^]*\\\\]\`",0 ,extract(\`"Message Transaction \\\\[[a-z0-9A-Z-._]*\\\\]\`", 0, message))\r\n | extend TransactionId = trim_start(\`"\\\\[\`", tostring(split(UniqueTransactionId, \`"_\`") [0]))\r\n | extend TransactionDateTicks = tostring(split(UniqueTransactionId, \`"_\`") [1])\r\n | extend PrincipalId = trim_end(\`"\\\\]\`", tostring(split(UniqueTransactionId, \`"_\`") [2]))\r\n | where message contains \`"Max Retry Count reached\`" and message contains \`"MessageService\`"\r\n | project TransactionId, TransactionDateTicks, PrincipalId, TenantId\r\n | summarize ErrorCount = count(TransactionId) by TransactionId, TransactionDateTicks, PrincipalId, TenantId\r\n | serialize | extend rn = row_number()\r\n | where rn > $startRow and rn <= $endRow"
$queryBody = "{
`"query`": `"$queryData`"
}"
$result = Invoke-WebRequest -Uri $fullUrl -Headers $headers -Method POST -Body $queryBody -ErrorAction Continue
$data = ConvertFrom-Json $result.Content
$startRow += $batchSize
$endRow += $batchSize

if($i -eq 1) {
$columnNames = $data.tables.columns | select name
}

prepareFileHeader $i $columnNames
writeRecordsToFile $data.tables.rows
}
} catch {
LogErrorMessage -msg $error[0] -filePath $logFilePath -fatal $true
}
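
A hedged usage example, assuming the script above is saved as Query-AppInsights.ps1 alongside the helper modules it imports (the AppId and API key below are placeholders; an API key can be generated from the API Access blade of the Application Insights resource):

.\Query-AppInsights.ps1 -AppInsightsId "00000000-0000-0000-0000-000000000000" `
    -apiKey "<your-api-key>" -Timespan "P30D" -batchSize 5000 -OutputFolder "C:\Output\"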

ARM – Modular Templates – Reference resources already created

Hi,

I noticed the Microsoft documentation related to the following function is a little bit vague.

reference(resourceName or resourceIdentifier, [apiVersion], ['Full'])

The second issue I see a lot of people having is how to reference a resource that was already created in ARM and read some of that object's properties, e.g. the FQDN on a public IP that already exists.

The clue to solve this issue, so that ARM Template B can reference a resource created in ARM Template A can be found here:

By using the reference function, you implicitly declare that one resource depends on another resource if the referenced resource is provisioned within same template and you refer to the resource by its name (not resource ID). You don’t need to also use the dependsOn property. The function isn’t evaluated until the referenced resource has completed deployment.

Or use linked templates (linked templates are a big rework and you need to host the files somewhere accessible). Let's see if we can do it via resourceId.

Therefore, if we reference a resource by resourceId, we remove the implicit “depends on”, allowing ARM Template B to use a resource created in a totally different ARM template.

A great example might be the FQDN on an IP Address.

Imagine ARM Template A creates the IP Address


"resources": [
{
"apiVersion": "[variables('publicIPApiVersion')]",
"type": "Microsoft.Network/publicIPAddresses",
"name": "[variables('AppPublicIPName')]",
"location": "[variables('computeLocation')]",
"properties": {
"dnsSettings": {
"domainNameLabel": "[variables('AppDnsName')]"
},
"publicIPAllocationMethod": "Dynamic"
},
"tags": {
"resourceType": "Service Fabric",
"scaleSetName": "[parameters('scaleSetName')]"
}
}]

Now Imagine we need to get the FQDN of the IP Address in a ARM Template B

What we are going to do is use the resourceId form:

reference(resourceIdentifier, [apiVersion]) ->
reference(resourceId(...), [apiVersion])

Here is an example where ARM template B references a resource in A and gets a property.


"managementEndpoint": "[concat('https://',reference(resourceId('Microsoft.Network/publicIPAddresses/',variables('AppPublicIPName')), variables('publicIPApiVersion')).dnsSettings.fqdn,':',variables('nodeTypePrimaryServiceFabricHttpPort'))]",

The important thing here is to ensure you always include the API Version. This pattern is a very powerful way to create smaller and more modular ARM templates.

Note: In the above pattern, you do not need to define DependsOn in ARM Template B, as we are explicitly defining a reference to an existing resource. ARM Template B is not responsible for creating a public IP. If you need it, you run ARM Template A.

So if you need a reference to existing resources use the above. If you need a reference to resources created in the SAME ARM template use:

reference(resourceName)
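
For example, within the same template that declares the public IP above, the FQDN can be read without an apiVersion (the output name below is purely illustrative):

"fqdn": "[reference(variables('AppPublicIPName')).dnsSettings.fqdn]"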

Cheers

Service Fabric – Upgrading VMSS Disks, Operating System on Primary Node Type

How do you upgrade the existing Data Disk on a primary Node Type Virtual Machine ScaleSet in Service Fabric?

How do you upgrade the existing Operating System on a primary Node Type VMSS in Service Fabric?

How do you move the Data Disk on a primary Node Type VMSS in Service Fabric?

How do you monitor the status during the upgrade, so you know exactly how many seed nodes have migrated over to the new scale set?

Note – we successfully increased the SKU size as well; however, this is not supported by Microsoft. Just increase your SKU in ARM and later, after the successful transfer to the new VMSS, run Update-AzureRmServiceFabricDurability.

Considerations

  • You have knowledge to use ARM to deploy an Azure Load Balancer
  • You have knowledge to use ARM to deploy a VMSS Scale Set
  • Service Fabric Durability Tier/Reliability Tier must be at least Silver
  • Keep the original Azure DNS name on the Load Balancer that is used to connect to the Service Fabric Endpoint. Very Important to write it down as a backup
  • You will need to reduce the TTL on all your DNS records before the upgrade; the downtime during the cut-over will then just be the TTL value, e.g. 10 minutes. (Ensure you have access to your primary DNS provider to do this.)
  • Prepare an ARM template to add the new Azure Load Balancer that the new VMSS scaleset will attach to (Backend Pool)
  • Prepare an ARM template to add the new VMSS to an existing Service Fabric primary Node Type
  • Deploy the new Azure Load Balancer + Virtual Machine Scale Set to the Service Fabric Primary node
  • Run the RemoveScaleSetFromClusterController.ps1 – run this script on a NEW node in the NEW VMSS. It will monitor and facilitate moving the primary node type to the new VMSS for you, showing the status of the seed nodes moving from the original primary node type to the new VMSS.
  • When it completes, the last step will be to update DNS.
  • Run MoveDNSToNewPublicIPController.ps1

ARM Templates

You will need only 2 templates. One to Deploy a new Azure Load Balancer and one to Deploy the new VMSS Scale Set to the existing Service Fabric Cluster.

You will also need a powershell script that will run a custom script extension.

Custom Script – prepare_sf_vm.ps1


$disks = Get-Disk | Where partitionstyle -eq 'raw' | sort number

$letters = 70..89 | ForEach-Object { [char]$_ }
$count = 0
$label = "datadisk"

foreach ($disk in $disks) {
    $driveLetter = $letters[$count].ToString()
    $disk |
        Initialize-Disk -PartitionStyle GPT -PassThru |
        New-Partition -UseMaximumSize -DriveLetter $driveLetter |
        Format-Volume -FileSystem NTFS -NewFileSystemLabel "$label$count" -Confirm:$false -Force
    $count++
}

# Disable Windows Update
Set-ItemProperty -Path 'HKLM:\SOFTWARE\Policies\Microsoft\Windows\WindowsUpdate\AU' -Name NoAutoUpdate -Value 1

 

Load Balancer – azuredeploy_servicefabric_loadbalancer.json

Use your particular Load Balancer ARM templates. There is no need to attach a backend pool, as this will be done by the VMSS template below.

Service Fabric attach new VMSS – azuredeploy_add_new_VMSS_to_nodeType.json

Create your own VMSS scale set that you attach to Service Fabric. The important aspects are the following:

nodeTypeRef (to attach the VMSS to the existing primary node type)
dataPath (to use a new disk for data)
dataDisk (to add a new managed physical disk)

We use F:\ onwards, as D: is reserved for temp storage and E: is reserved for a CD ROM in Azure VMs.


{
                                "name": "[concat('ServiceFabricNodeVmExt',variables('vmNodeType0Name'))]",
                                "properties": {
                                    "type": "ServiceFabricNode",
                                    "autoUpgradeMinorVersion": true,
                                    "protectedSettings": {
                                        "StorageAccountKey1": "[listKeys(resourceId('Microsoft.Storage/storageAccounts', variables('supportLogStorageAccountName')),'2015-05-01-preview').key1]",
                                        "StorageAccountKey2": "[listKeys(resourceId('Microsoft.Storage/storageAccounts', variables('supportLogStorageAccountName')),'2015-05-01-preview').key2]"
                                    },
                                    "publisher": "Microsoft.Azure.ServiceFabric",
                                    "settings": {
                                        "clusterEndpoint": "[parameters('existingClusterConnectionEndpoint')]",
                                        "nodeTypeRef": "[parameters('existingNodeTypeName')]",
                                        "dataPath": "F:\\\\SvcFab",
                                        "durabilityLevel": "Silver",
                                        "enableParallelJobs": true,
                                        "nicPrefixOverride": "[variables('subnet0Prefix')]",
                                        "certificate": {
                                            "thumbprint": "[parameters('certificateThumbprint')]",
                                            "x509StoreName": "[parameters('certificateStoreValue')]"
                                        }
                                    },
                                    "typeHandlerVersion": "1.0"
                                }
                            },
....
.......
.........
"storageProfile": {
                        "imageReference": {
                            "publisher": "[parameters('vmImagePublisher')]",
                            "offer": "[parameters('vmImageOffer')]",
                            "sku": "2016-Datacenter-with-Containers",
                            "version": "[parameters('vmImageVersion')]"
                        },
                        "osDisk": {
                            "managedDisk": {
                                "storageAccountType": "[parameters('storageAccountType')]"
                            },
                            "caching": "ReadWrite",
                            "createOption": "FromImage"
                        },
                        "dataDisks": [
                            {
                                "managedDisk": {
                                    "storageAccountType": "[parameters('storageAccountType')]"
                                },
                                "lun": 0,
                                "createOption": "Empty",
                                "diskSizeGB": "[parameters('dataDiskSize')]",
                                "caching": "None"
                            }
                        ]
                    }

...
....
.....
 "virtualMachineProfile": {
                    "extensionProfile": {
                        "extensions": [
                            {
                                "name": "PrepareDataDisk",
                                "properties": {
                                    "publisher": "Microsoft.Compute",
                                    "type": "CustomScriptExtension",
                                    "typeHandlerVersion": "1.8",
                                    "autoUpgradeMinorVersion": true,
                                    "settings": {
                                    "fileUris": [
                                        "[variables('vmssSetupScriptUrl')]"
                                    ],
                                    "commandToExecute": "[concat('powershell -ExecutionPolicy Unrestricted -File prepare_sf_vm.ps1 ')]"
                                    }
                                }
                            },


 

Once you have a new VMSS scale set attached to the existing node type, you should see the extra nodes in Service Fabric. The next step is to disable and remove the existing VMSS scale set. This is an online operation, so you should be fine. However, later we will need to update DNS for the cluster endpoint. This is important so that PowerShell admin tools can still connect to the Service Fabric cluster.

RemoveScaleSetFromClusterController.ps1

Remote into one of the NEW VMSS virtual machines and run the following script. It will make dead sure that your seed nodes migrate over. It can take a long time (the Microsoft docs say it takes a long time, but how long?). It depends; for a cluster with 5 seed nodes it took nearly 4 hours! So be patient and update the loop timeout to match your environment – increase the timeout if you have more than 5 seed nodes. My general rule is to allow 45 minutes per seed node transfer.


#Requires -Version 5.0
#Requires -RunAsAdministrator



param (
    [Parameter(Mandatory = $true)]
    [string]
    $subscriptionName,

    [Parameter(Mandatory = $true)]
    [string] 
    $scaleSetToDisable,

    [Parameter(Mandatory = $true)]
    [string]
    $scaleSetToEnable,

    [Parameter(Mandatory = $true)]
    [string] 
    $resourceGroupName
)

Install-Module AzureRM.Compute -Force

Import-Module ServiceFabric -Force
Import-Module AzureRM.Compute -Force

function Disable-InternetExplorerESC {
    $AdminKey = "HKLM:\SOFTWARE\Microsoft\Active Setup\Installed Components\{A509B1A7-37EF-4b3f-8CFC-4F3A74704073}"
    $UserKey = "HKLM:\SOFTWARE\Microsoft\Active Setup\Installed Components\{A509B1A8-37EF-4b3f-8CFC-4F3A74704073}"
    Set-ItemProperty -Path $AdminKey -Name "IsInstalled" -Value 0
    Set-ItemProperty -Path $UserKey -Name "IsInstalled" -Value 0
    Stop-Process -Name Explorer
    Write-Host "IE Enhanced Security Configuration (ESC) has been disabled." -ForegroundColor Green
}

function Enable-InternetExplorerESC {
    $AdminKey = "HKLM:\SOFTWARE\Microsoft\Active Setup\Installed Components\{A509B1A7-37EF-4b3f-8CFC-4F3A74704073}"
    $UserKey = "HKLM:\SOFTWARE\Microsoft\Active Setup\Installed Components\{A509B1A8-37EF-4b3f-8CFC-4F3A74704073}"
    Set-ItemProperty -Path $AdminKey -Name "IsInstalled" -Value 1
    Set-ItemProperty -Path $UserKey -Name "IsInstalled" -Value 1
    Stop-Process -Name Explorer
    Write-Host "IE Enhanced Security Configuration (ESC) has been enabled." -ForegroundColor Green
}

$ErrorActionPreference = "Stop"

Disable-InternetExplorerESC

Login-AzureRmAccount -SubscriptionName $subscriptionName

Write-Host "Before you continue:  Ensure IE Enhanced Security is off."
Write-Host "Before you continue:  Ensure your new scaleset is ALREADY added to the Service Fabric Cluster"
Pause

try {
    Connect-ServiceFabricCluster
    Get-ServiceFabricClusterHealth
} catch {
    Write-Error "Please run this script from one of the new nodes in the cluster."
}

Write-Host "Please do not continue unless the Cluster is healthy and both Scale Sets are present in the SFCluster."
Pause

$nodesToDisable = Get-ServiceFabricNode | Where NodeName -match "_($scaleSetToDisable)_\d+"
$OldSeedCount = ( $nodesToDisable | Where IsSeedNode -eq  $true | Measure-Object).Count
$nodesToEnable = Get-ServiceFabricNode | Where NodeName -match "_($scaleSetToEnable)_\d+"

if($OldSeedCount -eq 0){
    Write-Error "Node Seed count must be greater than zero."
    exit
}

if($nodesToDisable.Count -eq 0){
    Write-Error "No nodes to disable found."
    exit
}

if($nodesToEnable.Count -eq 0){
    Write-Error "No nodes to enable found."
    exit
}

If (-not ($nodesToEnable.Count -ge $OldSeedCount)) {
    Write-Error "The new VM Scale Set must have at least $OldSeedCount nodes in order for the Seed Nodes to migrate over."
    exit
}

Write-Host "Disabling nodes in VMSS $scaleSetToDisable. Are you sure?"
Pause

foreach($node in $nodesToDisable){
    Disable-ServiceFabricNode -NodeName $node.NodeName -Intent RemoveNode -Force
}

Write-Host "Checking node status..."
$loopTimeout = 360
$loopWait = 60
$oldNodesDeactivated = $false
$newSeedNodesReady = $false

while ($loopTimeout -ne 0) {
    Get-Date -Format o
    Write-Host
    Write-Host "Nodes To Remove"

    # Re-query node state each pass; the objects captured before Disable-ServiceFabricNode never refresh.
    $disableStates = @()
    foreach($nodeToDisable in $nodesToDisable) {
        $state = Get-ServiceFabricNode -NodeName $nodeToDisable.NodeName
        $disableStates += $state
        $msg = "{0} NodeDeactivationInfo: {1} IsSeedNode: {2} NodeStatus {3}" -f $nodeToDisable.NodeName, $state.NodeDeactivationInfo.Status, $state.IsSeedNode, $state.NodeStatus
        Write-Host $msg
    }

    $oldNodesDeactivated = ($disableStates | Where-Object { ($_.NodeStatus -eq [System.Fabric.Query.NodeStatus]::Disabled) -and ($_.NodeDeactivationInfo.Status -eq "Completed") } | Measure-Object).Count -eq $nodesToDisable.Count

    Write-Host
    Write-Host "Nodes To Add Status"

    $enableStates = @()
    foreach($nodeToEnable in $nodesToEnable) {
        $state = Get-ServiceFabricNode -NodeName $nodeToEnable.NodeName
        $enableStates += $state
        $msg = "{0} IsSeedNode: {1}, NodeStatus: {2}" -f $nodeToEnable.NodeName, $state.IsSeedNode, $state.NodeStatus
        Write-Host $msg
    }
    $newSeedNodesReady = ($enableStates | Where-Object { ($_.NodeStatus -eq [System.Fabric.Query.NodeStatus]::Up) -and $_.IsSeedNode} | Measure-Object).Count -ge $OldSeedCount
    if($oldNodesDeactivated -and $newSeedNodesReady) {
        break
    }
    $loopTimeout -= 1
    Start-Sleep $loopWait
}

if (-not ($oldNodesDeactivated)) {
    Write-Error "A node failed to deactivate within the time period specified."
    exit
}

$loopTimeout = 180
while ($loopTimeout -ne 0) {
    Write-Host
    Write-Host "Nodes To Add Status"

    # Again, evaluate against freshly queried node state rather than the stale collection.
    $enableStates = @()
    foreach($nodeToEnable in $nodesToEnable) {
        $state = Get-ServiceFabricNode -NodeName $nodeToEnable.NodeName
        $enableStates += $state
        $msg = "{0} IsSeedNode: {1}, NodeStatus: {2}" -f $nodeToEnable.NodeName, $state.IsSeedNode, $state.NodeStatus
        Write-Host $msg
    }
    $newSeedNodesReady = ($enableStates | Where-Object { ($_.NodeStatus -eq [System.Fabric.Query.NodeStatus]::Up) -and $_.IsSeedNode} | Measure-Object).Count -ge $OldSeedCount
    if($newSeedNodesReady) {
        break
    }
    $loopTimeout -= 1
    Start-Sleep $loopWait
}

$NewSeedNodes = Get-ServiceFabricNode | Where-Object {($_.NodeName -match "_($scaleSetToEnable)_\d+") -and ($_.IsSeedNode -eq $True)}
Write-Host "New Seed Nodes are:"
$NewSeedNodes | Select NodeName
$NewSeedNodesCount = ($NewSeedNodes  | Measure-Object).Count

if($NewSeedNodesCount -ge $OldSeedCount) {
    Write-Host "Removing the scale set $scaleSetToDisable"
    Remove-AzureRmVmss -ResourceGroupName $ResourceGroupName -VMScaleSetName $scaleSetToDisable -Force
    Write-Host "Removed scale set $scaleSetToDisable"

    Write-Host "Removing Node State for old nodes"
    $nodesToDisable | Remove-ServiceFabricNodeState -Force
    Write-Host "Done"

    Get-ServiceFabricClusterHealth
    Get-ServiceFabricNode
} else {
    Write-Host "New Seed Nodes do not match the minimum requirements $NewSeedNodesCount."
    Write-Host "Manually run  Remove-AzureRmVmss"
    Write-Host "Then Manually run  Remove-ServiceFabricNodeState"
    Get-ServiceFabricClusterHealth
    Get-ServiceFabricNode
}

Enable-InternetExplorerESC
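
A hedged usage example (the values are placeholders; run it in an elevated PowerShell session on one of the new nodes, as the script requires administrator rights):

.\RemoveScaleSetFromClusterController.ps1 -subscriptionName "RangerRom" -scaleSetToDisable "oldvmss" `
    -scaleSetToEnable "newvmss" -resourceGroupName "Ranger-Rom-Prod"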

This script is extremely useful; you can see the progress of the transfer of seed nodes and the disabling of the existing primary node type.

You know it is successful when the old nodes have ZERO seed nodes. All SEED nodes must transfer over to the new nodes, and all nodes in the old scale set should have IsSeedNode set to false by the end of the script execution.

MoveDNSToNewPublicIPController.ps1

Lastly, you MUST update DNS to use the original CNAME. This script can help with this; what it actually does is detach the original internal Azure DNS name from the old public IP and move it to your new public IP attached to the new load balancer.




param (
        [Parameter(Mandatory = $true)]
        [string]
        $subscriptionName,

        [Parameter(Mandatory = $true)]
        [string]
        $oldLoadBalancerName,

        [Parameter(Mandatory = $true)]
        [string]
        $resourceGroupName,

        [Parameter(Mandatory = $true)]
        [string]
        $oldPublicIpName,

        [Parameter(Mandatory = $true)]
        [string]
        $newPublicIpName
)

    Install-Module AzureRM.Network -Force
    Import-Module AzureRM.Network -Force

    $ErrorActionPreference = "Stop"
    Login-AzureRmAccount -SubscriptionName $subscriptionName

    Write-Host "Are you sure you want to do this. There will be brief connectivty downtime?"
    Pause

    $oldprimaryPublicIP = Get-AzureRmPublicIpAddress -Name $oldPublicIpName -ResourceGroupName $resourceGroupName
    $primaryDNSName = $oldprimaryPublicIP.DnsSettings.DomainNameLabel
    $primaryDNSFqdn = $oldprimaryPublicIP.DnsSettings.Fqdn
    
    if($primaryDNSName.Length -gt 0 -and $primaryDNSFqdn.Length -gt 0) {
        Write-Host "Found the Primary DNS Name" $primaryDNSName
        Write-Host "Found the Primary DNS FQDN" $primaryDNSFqdn
    } else {
        Write-Error "Could not find the DNS attached to Old IP $oldprimaryPublicIP"
        Exit
    }
    
        Write-Host "Moving the Azure DNS Names to the new Public IP"
    $PublicIP = Get-AzureRmPublicIpAddress -Name $newPublicIpName -ResourceGroupName $resourceGroupName
    $PublicIP.DnsSettings.DomainNameLabel = $primaryDNSName
    $PublicIP.DnsSettings.Fqdn = $primaryDNSFqdn
    Set-AzureRmPublicIpAddress -PublicIpAddress $PublicIP

    Get-AzureRmPublicIpAddress -Name $newPublicIpName -ResourceGroupName $resourceGroupName
    Write-Host "Transfer Done"

    Write-Host "Removing Load Balancer related to old Primary NodeType."
    Write-Host "Are you sure?"
    Pause

    Remove-AzureRmLoadBalancer -Name $oldLoadBalancerName -ResourceGroupName $resourceGroupName -Force
    Remove-AzureRmPublicIpAddress -Name $oldPublicIpName -ResourceGroupName $resourceGroupName -Force

    Write-Host "Done"

Summary

In this article you followed the process to:

  • Configure ARM to add a new VMSS with an OS disk, data disk and operating system
  • Add a new Virtual Machine Scale Set to an existing Service Fabric node type
  • Run a PowerShell controller script to monitor the outcome of the VMSS transfer
  • Transfer the original management DNS CNAME to the new public IP address

Conclusion

This project requires a lot of testing for your environment; allocate at least a few days to test the entire process before you try it out on your production services.

HTH

Puppet – Join Ubuntu 16.04 Servers to an Azure Windows Active Directory Domain

We use Azure Active Directory Domain Services and wanted a single sign on solution for Windows and Linux. The decision was made to join all servers to the Windows Domain in addition to having SSH Key auth.

I assume you know how to use Puppet Code Manager and have a Puppet Control Repository to manage Roles and Profiles.

Ensure you have an Active Directory service account that has permissions to join a computer to the domain. You can store this user in Hiera.

The module we need is called Realmd; however, the current version (version 2.3.0, released Sep 3rd 2018) does not support Ubuntu 16.04, so I have a forked repo here that you can use.

Modules

puppetfile


mod 'romiko-realmd',
  :git    => 'https://github.com/Romiko/realmd.git',
  :branch => 'master'

mod 'saz-resolv_conf',                  '4.0.0'

linuxdomain.pp


class profile::domain::linuxdomain (
  String $domain = 'RANGERROM.COM',
  String $user = 'romiko.derbynew@rangerrom.com',
  String $password = 'info',
  Array  $aadds_dns = ['10.0.103.36','10.0.103.37']
) {

  $hostname = $trusted['hostname']
  $domaingroup = downcase($domain)

  host { 'aaddshost':
    ensure  => present,
    name    => "${hostname}.${domain}",
    comment => 'Azure Active Directory Domain Services',
    ip      => '127.0.0.1',
  }

  class { 'resolv_conf':
    nameservers => $aadds_dns,
    searchpath  => [$domain],
  }

  exec { "waagent":
    path        => ['/usr/bin', '/usr/sbin', '/bin'],
    command     => "waagent -start",
    refreshonly => true,
    subscribe   => [File_line['waagent.conf']],
  }

  exec { "changehostname":
    path        => ['/usr/bin', '/usr/sbin', '/bin'],
    command     => "hostnamectl set-hostname ${hostname}.${domain}",
    refreshonly => true,
    subscribe   => [File_line['waagent.conf']],
  }

  file_line { 'waagent.conf':
    ensure             => present,
    path               => '/etc/waagent.conf',
    line               => 'Provisioning.MonitorHostName=y',
    match              => '^Provisioning.MonitorHostName=n',
    append_on_no_match => false,
  }

  class { 'ntp':
    servers => [ $domain ],
  }

  class { '::realmd':
    domain                => $domain,
    domain_join_user      => $user,
    domain_join_password  => $password
  } ->

  file_line { 'sssd.conf':
    ensure             => present,
    path               => '/etc/sssd/sssd.conf',
    line               => '#use_fully_qualified_names = True',
    match              => '^use_fully_qualified_names',
    append_on_no_match => false,
  }

  file_line { 'common-session':
    path => '/etc/pam.d/common-session',
    line => 'session required        pam_mkhomedir.so skel=/etc/skel/ umask=0077',
  }

  file_line { 'sudo_rule':
    path => '/etc/sudoers',
    line => "%aad\ dc\ administrators@${domaingroup} ALL=(ALL) NOPASSWD:ALL",
  }
}

The above code ensures:

  • The FQDN is in the /etc/hosts file so the server can resolve itself.
  • The Active Directory DNS servers are in resolv.conf and the default search domain is set. This allows nslookup of <computername> to work as well as <computername>.<domainname>
  • Renames the Linux server and sets Azure to monitor host name changes
  • Sets the Network Time Protocol to use the Active Directory servers
  • Uses Realmd to
    • Join the domain
    • Configure SSSD, Samba, etc.
    • (We are using my fork until the pull request is accepted)
  • Makes changes to SSSD and PAM to ensure smooth operation with Azure Active Directory Domain Services
  • Adds the administrators from AADDS to the sudoers file

 

With the above running on your Linux agent, you will have Linux machines using the domain and can leverage single sign on.

CASE SENSITIVE – ensure you use upper case for $domain.

My source of inspiration to do this configuration came from a Microsoft Document here:

https://docs.microsoft.com/en-us/azure/active-directory-domain-services/active-directory-ds-join-ubuntu-linux-vm

However, here is a Hiera config file to get you going for this class if you like using a key value data source.

common.yaml

#AADDS Domain must be Upper Case else Kerberos Tickets fail
profile::domain::linuxdomain::domain: 'RANGERROM.COM'
profile::domain::linuxdomain::user: 'serviceaccounttojoindomain@RANGERROM.COM'
profile::domain::linuxdomain::aadds_dns:
        - '10.0.103.36'
        - '10.0.103.37'

common.eyaml

profile::domain::linuxdomain::password:
    ENC[PKCS7,MIIBiQ.....==]

In my next article, I will write about setting up a flexible Puppet Control Repository and leveraging Hiera.

Now I can login to the server using my Azure Active Directory credentials that I even use for Outlook, Skype and other Microsoft Products.

ssh -l romiko.derbynew@rangerrom.com myserver   (resolv.conf has a search entry for rangerrom.com, so the suffix will get applied)
ssh -l romiko.derbynew@rangerrom.com myserver.rangerrom.com
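
To verify the join on a node, a couple of hedged checks (these assume realmd and SSSD completed the join successfully):

realm list
id 'romiko.derbynew@rangerrom.com'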

Cheers

Azure Database for PostgreSQL – Backup/Restore

PostgreSQL has some awesome tools, pg_dump and pg_restore, that can assist and cut down restore times in the event of a recovery.

Scenario

User error – If a table was accidentally emptied at noon today:

• Restore point in time just before noon and retrieve the missing table and data from that new copy of the server (Azure Portal or CLI; PowerShell is not supported yet).
• Update firewall rules to allow traffic internally on the new SQL Instance

Then log onto a JumpBox in the Azure cloud.
• Dump the restored database to file (pg_dump)


pg_dump -Fc -v -t mytable -h Source.postgres.database.azure.com -U pgadmin@source -d mydatabase > c:\Temp\mydatabase.dump

The -Fc switch (custom format) gives us flexibility with pg_restore later (table-level restore).
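
As a hedged aside, you can inspect what a custom-format dump contains before restoring anything by listing its table of contents:

pg_restore -l "c:\Temp\mydatabase.dump"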

• Copy the table back to the original server using  pg_restore

You may need to truncate the target table first, as we are doing a data-only restore (-a).
Table only (not schema):


pg_restore -v -a -t mytable -h Target.postgres.database.azure.com -p 5432 -U pgadmin@target -d mydatabase "c:\Temp\mydatabase.dump"

note: Use a VM in the Azure cloud that has access to SQL via VNET Rules or SQL Firewalls.

We currently use PostgreSQL 10. There is a known issue with VNet rules – just contact Microsoft if you need them, or use firewall rules for now.

Issues

VNET Rules and PostgreSQL 10

Issue Definition: If using VNET rules to control access to the PostgreSQL 10.0 server, the customer gets the following error: psql 'sslmode=require host=server.postgres.database.azure.com port=5432 dbname=postgres' --username=pgadmin@server psql: FATAL: unrecognized configuration parameter 'connection_Vnet'. This is a known issue: https://social.msdn.microsoft.com/Forums/azure/en-US/0e99fb68-47fd-4053-a8be-5f8b87b3a660/azure-database-for-postgresql-vnet-service-endpoints-not-working?forum=AzureDatabaseforPostgreSQL

DNS CNAME

nslookup restoretest2.postgres.database.azure.com

Non-authoritative answer:
Name:    cr1.westus1-a.control.database.windows.net
Address:  23.9.34.71
Aliases:  restoretest2.postgres.database.azure.com

nslookup restoretest1.postgres.database.azure.com

Non-authoritative answer:
Name:    cr1.westus1-a.control.database.windows.net
Address:  23.9.34.71
Aliases:  restoretest1.postgres.database.azure.com

The host is the same server; the part of the connection string that selects the instance is the USERNAME.

These two commands will connect to the SAME server instance:


pg_restore -v -a -t mytable -h restoretest2.postgres.database.azure.com -p 5432 -U pgadmin@restoretest1 -d mydatabase  "c:\Temp\mydatabase.dump"


pg_restore -v -a -t mytable -h restoretest1.postgres.database.azure.com -p 5432 -U pgadmin@restoretest1 -d mydatabase   "c:\Temp\mydatabase.dump"

The hostname just gets us to the Microsoft PSQL server, it is the username that points us to the correct instance. This is extremely important when updating connection strings on clients!

Happy Life – Happy Wife

Windows Azure – Restore Encrypted VM – BEK and KEK

When restoring an Encrypted Virtual Machine that is managed by Windows Azure, you will need to use PowerShell.

This is a two stage process.

Restore-BackupImageToDisk … | Restore-EncryptedBekVM …

Stage 1

Retrieve the Backup Container – This will contain the encrypted disks and json files with the secret keys.

  • keyEncryptionKey
  • diskEncryptionKey

This will allow you to retrieve the disks and JSON metadata in step 2.

Restore-BackupImageToDisk.ps1

param(
    [Parameter(Mandatory=$true)]
    [string]
    $SubscriptionName="RangerRom",

    [Parameter(Mandatory=$true)]
    [string]
    $ResourceGroup="Ranger-Rom-Prod",

    [Parameter(Mandatory=$true)]
    [string]
    $StorageAccount="RangerRomStorage",

    [Parameter(Mandatory=$true)]
    [string]
    $RecoveryVaultName="RecoveryVault",

    [Parameter(Mandatory=$true)]
    [string]
    $VMName,

    [Parameter(Mandatory=$true)]
    [datetime]
    $FromDate
)

Login-AzureRmAccount
Get-AzureRmSubscription -SubscriptionName $SubscriptionName | Select-AzureRmSubscription
$endDate = Get-Date

$namedContainer = Get-AzureRmRecoveryServicesBackupContainer  -ContainerType "AzureVM" -Status "Registered" | Where { $_.FriendlyName -eq $VMName }
$backupitem = Get-AzureRmRecoveryServicesBackupItem -Container $namedContainer  -WorkloadType "AzureVM"

$rp = Get-AzureRmRecoveryServicesBackupRecoveryPoint -Item $backupitem -StartDate $FromDate.ToUniversalTime() -EndDate $enddate.ToUniversalTime()
$restorejob = Restore-AzureRmRecoveryServicesBackupItem -RecoveryPoint $rp[0] -StorageAccountName $StorageAccount -StorageAccountResourceGroupName $ResourceGroup
Wait-AzureRmRecoveryServicesBackupJob -Job $restorejob -Timeout 43200
$restorejob = Get-AzureRmRecoveryServicesBackupJob -Job $restorejob
$details = Get-AzureRmRecoveryServicesBackupJobDetails -Job $restorejob
$details

Stage 2

We will grab the job details via the pipeline if provided or go find the earliest Restore Job after the FromDate and then initiate a restore of all the encrypted disks to the new Virtual Machine.

Restore-EncryptedBekVM.ps1

[CmdletBinding()]
param(
    [Parameter(ValueFromPipeline=$True, Mandatory=$false, HelpMessage="Requires a AzureVmJobDetails object e.g. Get-AzureRmRecoveryServicesBackupJobDetails. Optional.")]
    $JobDetails,

    [Parameter(Mandatory=$true, HelpMessage="Assumes osDisks, DataDisk, RestorePoints, AvailbilitySet, VM, Nic are all in same resource group.")]
    [string]
    $DefaultResourceGroup="Ranger-Rom-Prod",
    
    [Parameter(Mandatory=$true)]
    [string]
    $SourceVMName="Lisha",

    [Parameter(Mandatory=$true)]
    [string]
    $TargetVMName,

    [Parameter(Mandatory=$true)]
    [string]
    $BackupVaultName="RecoveryVault",

    [Parameter(Mandatory=$true)]
    [datetime]
    $FromDate
)

Begin {
    $FromDate = $FromDate.ToUniversalTime()
    Write-Verbose "Started. $(Get-Date)"
}

Process {
	Write-Verbose "Retrieving Backup Vault."
	if(-not $JobDetails)
	{
		Get-AzureRmRecoveryServicesVault -Name $BackupVaultName -ResourceGroupName $DefaultResourceGroup | Set-AzureRmRecoveryServicesVaultContext
		$Job = Get-AzureRmRecoveryServicesBackupJob -Status Completed -Operation Restore -From $FromDate | Sort -Property StartTime | Where { $_.WorkloadName -eq $SourceVMName} | Select -Last 1
		if($Job -eq $null) {
			throw "Restore job for $SourceVMName not found."
		}
		$JobDetails = Get-AzureRmRecoveryServicesBackupJobDetails -Job $Job
	}

	Write-Verbose "Query the restored disk properties for the job details."
	$properties = $JobDetails.properties
	$storageAccountName = $properties["Target Storage Account Name"]
	$containerName = $properties["Config Blob Container Name"]
	$blobName = $properties["Config Blob Name"]

	Write-Verbose "Found Restore Blob Set at $($Properties['Config Blob Uri'])"
	Write-Verbose "Set the Azure storage context and restore the JSON configuration file."

	Set-AzureRmCurrentStorageAccount -Name $storageaccountname -ResourceGroupName $DefaultResourceGroup
	$folder = New-Item "C:\RangerRom" -ItemType Directory -Force
	$destination_path = "C:\$($folder.Name)\$SourceVMName.config.json"
	Get-AzureStorageBlobContent -Container $containerName -Blob $blobName -Destination $destination_path
	Write-Verbose "Restore config saved to file. $destination_path"
	$restoreConfig = ((Get-Content -Path $destination_path -Raw -Encoding Unicode)).TrimEnd([char]0x00) | ConvertFrom-Json

	# 3. Use the JSON configuration file to create the VM configuration.
	$oldVM = Get-AzureRmVM | Where { $_.Name -eq $SourceVMName }
	$vm = New-AzureRmVMConfig -VMSize $restoreConfig.'properties.hardwareProfile'.vmSize -VMName $TargetVMName -AvailabilitySetId $oldVM.AvailabilitySetReference.Id
	$vm.Location = $oldVM.Location

	# 4. Attach the OS disk and data disks - Managed, encrypted VMs (BEK only
	$bekUrl = $restoreConfig.'properties.storageProfile'.osDisk.encryptionSettings.diskEncryptionKey.secretUrl
	$keyVaultId = $restoreConfig.'properties.storageProfile'.osDisk.encryptionSettings.diskEncryptionKey.sourceVault.id
	$kekUrl = $restoreConfig.'properties.storageProfile'.osDisk.encryptionSettings.keyEncryptionKey.keyUrl
	$storageType = "StandardLRS"
	$osDiskName = $vm.Name + "_Restored_" + (Get-Date).ToString("yyyy-MM-dd-hh-mm-ss") + "_osdisk"
	$osVhdUri = $restoreConfig.'properties.storageProfile'.osDisk.vhd.uri
	$diskConfig = New-AzureRmDiskConfig -AccountType $storageType -Location $restoreConfig.location -CreateOption Import -SourceUri $osVhdUri
	$osDisk = New-AzureRmDisk -DiskName $osDiskName -Disk $diskConfig -ResourceGroupName $DefaultResourceGroup
	Set-AzureRmVMOSDisk -VM $vm -ManagedDiskId $osDisk.Id -DiskEncryptionKeyUrl $bekUrl -DiskEncryptionKeyVaultId $keyVaultId -KeyEncryptionKeyUrl $kekUrl -KeyEncryptionKeyVaultId $keyVaultId -CreateOption "Attach" -Windows

	$count = 0
	foreach($dd in $restoreConfig.'properties.storageProfile'.dataDisks)
	{
		$dataDiskName = $vm.Name + "_Restored_" + (Get-Date).ToString("yyyy-MM-dd-hh-mm-ss") + "_datadisk" + $count ;
		$dataVhdUri = $dd.vhd.uri;
		$dataDiskConfig = New-AzureRmDiskConfig -AccountType $storageType -Location $restoreConfig.location -CreateOption Import -SourceUri $dataVhdUri
		$dataDisk = New-AzureRmDisk -DiskName $dataDiskName -Disk $dataDiskConfig -ResourceGroupName $DefaultResourceGroup
		Add-AzureRmVMDataDisk -VM $vm -Name $dataDiskName -ManagedDiskId $dataDisk.Id -Lun $dd.Lun -CreateOption "Attach"
		$count += 1
	}

	Write-Verbose  "Setting the Network settings."

	$oldNicId = $oldVM.NetworkProfile.NetworkInterfaces[0].Id
	$oldNic = Get-AzureRmNetworkInterface -Name $oldNicId.Substring($oldNicId.LastIndexOf("/")+ 1) -ResourceGroupName $DefaultResourceGroup
	$subnetItems =  $oldNic.IpConfigurations[0].Subnet.Id.Split("/")
	$networkResourceGroup = $subnetItems[$subnetItems.IndexOf("resourceGroups")+1]
	$nicName= $vm.Name + "_nic_" + (New-Guid).Guid
	$nic = New-AzureRmNetworkInterface -Name $nicName -ResourceGroupName $networkResourceGroup -Location $oldNic.location -SubnetId $oldNic.IpConfigurations[0].Subnet.Id
	$vm=Add-AzureRmVMNetworkInterface -VM $vm -Id $nic.Id
	$vm.DiagnosticsProfile = $oldVM.DiagnosticsProfile

	Write-Verbose "Provisioning VM."
	New-AzureRmVM -ResourceGroupName $oldVM.ResourceGroupName -Location $oldVM.Location -VM $vm

}

End {
    Write-Verbose "Completed. $(Get-Date)"
}


 Usage

.\Restore-BackupImageToDisk … | .\Restore-EncryptedBekVM.ps1 …
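
A hedged, fully parameterised example of the same pipeline (the target VM name and dates are placeholders; the parameter names match the two scripts above):

.\Restore-BackupImageToDisk.ps1 -SubscriptionName "RangerRom" -ResourceGroup "Ranger-Rom-Prod" `
    -StorageAccount "RangerRomStorage" -RecoveryVaultName "RecoveryVault" -VMName "Lisha" -FromDate (Get-Date).AddDays(-7) |
    .\Restore-EncryptedBekVM.ps1 -DefaultResourceGroup "Ranger-Rom-Prod" -SourceVMName "Lisha" `
        -TargetVMName "Lisha-Restored" -BackupVaultName "RecoveryVault" -FromDate (Get-Date).AddDays(-7) -Verbose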

Summary

Ensure you have Azure VM backups and that they run on a regular schedule.

Use these scripts to restore a recovery point from a Recovery Vault container.

Remember it will pick the earliest date of a Restore Point relative to your From Date. If there are 4 Restores in the recovery vault from 1 May, it will pick the first one.

Remember to:

  • Decommission the old VM
  • Update DNS with the new ip address
  • Update the Load Balance Sets (If using a Load Balancer)