Category: DevOps

Adobe Experience Manager – Setup Azure DevOps CI/CD

Overview

CI/CD Git Flow

Above we can see that we would like our developers to push to their own Git repo (Customer Git = Azure DevOps).

From here, we can then sync the Azure Git repo with the AEM Cloud Manager Git repo.

Below is a sample build pipeline you can use in Azure DevOps.

azure-pipelines.yml

trigger:
  batch: true
  branches:
    include:
    - master

variables:
- name: remote_git
  value: rangerrom/africa-p46502-uk11112

stages:
- stage: AEM_Cloud_Manager
  jobs:
  - job: Push_To_Cloudmanager
    timeoutInMinutes: 10
    condition: succeeded()
    workspace:
      clean: all
    steps:
    #steps: [ script | bash | pwsh | powershell | checkout | task | templateReference ]
    
    - task: AzureKeyVault@1
      displayName: pull secrets
      inputs:
        azureSubscription: PROD
        KeyVaultName: mykeyvault
        SecretsFilter: aem_dm_cm_credentials
    - checkout: self
      clean: true
    - bash: echo "##vso[task.setvariable variable=git_ref]https://$(aem_dm_cm_credentials)@git.cloudmanager.adobe.com/$(remote_git)/"
      displayName: Set remote adobe git URL 
    - bash: git remote add adobe $(git_ref)
      displayName: Add git remote to Adobe CloudManager
    - bash: cat .git/config
      displayName: Show git config
    - bash: git checkout $(Build.SourceBranchName)
      displayName: Checkout $(Build.SourceBranchName) branch
    - bash: git push -f -v adobe $(Build.SourceBranchName)
      displayName: Push changes from $(Build.SourceBranchName) branch to Adobe CloudManager

That is pretty much the minimum required to sync the two Git repos. Happy AEMing and building your CMS solution.
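If you want to verify the sync manually before wiring up the pipeline, the same steps can be run from a workstation. A minimal sketch, with a placeholder token and repo path (yours come from Cloud Manager):

git remote add adobe https://MY_USER:MY_TOKEN@git.cloudmanager.adobe.com/myorg/myrepo/
git checkout master
git push -f -v adobe master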

Kubernetes – Prometheus – use an existing persistent volume claim

We use the Prometheus Operator chart to deploy the Prometheus, Alertmanager and Grafana stack.

Please note that, as of October 2020, the official Prometheus Operator chart lives in the prometheus-community repository:

https://prometheus-community.github.io/helm-charts

To add this repository to your Helm repos:

helm repo add prometheus-community https://prometheus-community.github.io/helm-charts

What usually happens is that you initially install the chart, and by default your Kubernetes PV will have a reclaim policy of Delete. This means that if you uninstall the chart, the persistent volume in the cloud (Azure, AWS, GCP, etc.) is also deleted. Not a great outcome if you want historic metrics.

What you want is a PV with a reclaim policy of Retain, so that whenever the chart is uninstalled, your managed disks in the cloud are retained.
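If your PV already exists with the wrong policy, you can flip it directly. A minimal sketch, assuming your PV is named pvc-xyz:

kubectl patch pv pvc-xyz -p '{"spec":{"persistentVolumeReclaimPolicy":"Retain"}}'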

So how do you go about doing this?

  • Install the chart initially with a persistent volume configured in the values file for Prometheus (the default way).
  • Configure Grafana correctly on the first install.

Stage 1

Prometheus

We are using managed GKE on GCP, so the standard storage class is fine; your cloud provider may differ.

  • Configure your Prometheus Operator chart with the following in the values file:

prometheus:
  prometheusSpec:
    storageSpec:
      volumeClaimTemplate:
        spec:
          storageClassName: standard
          resources:
            requests:
              storage: 64Gi
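To install the chart with these values, something like the following works. A sketch: the chart was renamed kube-prometheus-stack in the prometheus-community repo, and the release name, namespace and values file name here are assumptions:

helm upgrade --install prometheus-operator prometheus-community/kube-prometheus-stack \
  --namespace monitoring --create-namespace \
  -f values.yaml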

Grafana

With Grafana, you can get away with setting it up correctly the first time round.

Create the PVC

apiVersion: v1
kind: PersistentVolumeClaim
metadata:
  name: pv-claim-grafana
spec:
  accessModes:
    - ReadWriteOnce
  resources:
    requests:
      storage: 10Gi
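Apply it to the namespace the chart will be installed into. A sketch, assuming the monitoring namespace and a file named pvc-grafana.yaml:

kubectl apply -n monitoring -f pvc-grafana.yaml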

Then add the following to the Grafana section of the values file:

grafana:
  persistence:
    enabled: true
    type: pvc
    existingClaim: pv-claim-grafana
  • Deploy your chart

Stage 2

Once the chart is deployed, go to your cloud provider and note the disk IDs. I am using GCP, so I note them down here:

In the above, the Name column is the disk ID for GCP. Azure/AWS will be different, e.g. a disk URI.
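On GCP you can also list the disks from the CLI. A sketch, assuming your PVC-backed disks have "pvc" in their names:

gcloud compute disks list --filter="name~pvc"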

Go back to your Helm chart repository and let's alter the chart so that Prometheus and Grafana are always bound to these disks, even if you uninstall the chart.

Prometheus

If you would like to keep the data of the current persistent volumes, it is possible to attach existing volumes to new PVCs and PVs that are created using the naming conventions of the chart. For example, to reuse an existing disk for a Helm release called `prometheus-operator`, the following resources can be created:

  • Note down the RELEASE NAME of your Prometheus Operator chart. Mine is called prometheus-operator.

Configure the following YAML template. This is a hack: by making the names of the PV and PVC exactly match the ones the chart generates, Prometheus will reuse the existing PV/PVC.

apiVersion: v1
kind: PersistentVolume
metadata:
  name: pvc-prometheus-operator-prometheus-0
spec:
  storageClassName: "standard"
  capacity:
    storage: 64Gi
  accessModes:
    - ReadWriteOnce
  gcePersistentDisk:
    pdName: gke-dev-xyz-aae-pvc-d8971937-85f8-4566-b90e-110dfbc17cbb
    fsType: ext4
---
apiVersion: v1
kind: PersistentVolumeClaim
metadata:
  labels:
    app: prometheus
    prometheus: prometheus-operator-prometheus
  name: prometheus-prometheus-operator-prometheus-db-prometheus-prometheus-operator-prometheus-0
spec:
  storageClassName: "standard"
  accessModes:
    - ReadWriteOnce
  resources:
    requests:
      storage: 64Gi
  • Configure the above to always run as a presync hook, e.g. with Helmfile:

  hooks:
  - events: ["presync"]
    showlogs: true
    command: "kubectl"
    args:
    - apply
    - -n
    - monitoring
    - -f
    - ./pv/pv-prometheus.yaml

Grafana

Grafana is not so fussy, so we can do the following:

Configure the following yaml template.

apiVersion: v1
kind: PersistentVolume
metadata:
  name: pv-grafana
spec:
  storageClassName: "standard"
  capacity:
    storage: 10Gi
  accessModes:
    - ReadWriteOnce
  claimRef:
    namespace: service-compliance
    name: pv-claim-grafana
  gcePersistentDisk:
    pdName: gke-dev-xyz-aae-pvc-4b450590-8ec0-471d-bf1a-4f6aaa9c4e81
    fsType: ext4
---
apiVersion: v1
kind: PersistentVolumeClaim
metadata:
  name: pv-claim-grafana
spec:
  storageClassName: "standard"
  accessModes:
    - ReadWriteOnce
  resources:
    requests:
      storage: 10Gi

Then, finally, set up a presync hook if using Helmfile:

  hooks:
  - events: ["presync"]
    showlogs: true
    command: "kubectl"
    args:
    - apply
    - -n
    - monitoring
    - -f
    - ./pv/pv-grafana.yaml

With the above in place, you will be able to rerun chart installs for updates and to uninstall the chart. Your final check is to ensure the PVs use the Retain reclaim policy and not Delete.
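A quick way to check is to print each PV alongside its reclaim policy:

kubectl get pv -o custom-columns=NAME:.metadata.name,RECLAIM:.spec.persistentVolumeReclaimPolicy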

Site Reliability Engineering with Gitops & Kubernetes – Part 1

Intro

One of the key pillars regarding SRE is being able to make quantitative decisions based on key metrics.

The major challenge is deciding what the key metrics are, and this is testament to the plethora of monitoring software out in the wild today.

At a foundational level you want to ensure your services are always running; however, 100% availability is not practical.

class SRE implements DevOps
{
  MeasureEverything()
}

Availability/Error Budget

You then figure out what availability is practical for your application and services.

Availability

Your error budget is then the allowed downtime, e.g. 99% availability gives you 7.2 hours of downtime a month (30 days × 24 hours × 1% = 7.2 hours) that you can afford to spend.

SLAs, SLOs and SLIs

This will be the starting point on your journey to implementing quantitative analysis for:

  • Service Level Agreements
  • Service Level Objectives
  • Service Level Indicators

This blog post is all about how you can measure Service Level Objectives without breaking the bank. You do not need to spend millions of dollars on bloated monitoring solutions to observe key metrics that really impact your customers.

Just like baking a cake, these are the ingredients we will use to implement an agile, scalable monitoring platform that is solely dedicated to doing one thing well.

Outcome

This is what we want our cake to deliver:

  • Measuring your SLA Compliance Level
  • Measuring your Error Budget Burn Rate
  • Measuring if you have exhausted your error budget
Service Level Compliance – SLAs -> SLOs -> SLIs

If you look at the cake above, you can see all your meaningful information in one dashboard.

  1. Around 11am the error budget burn rate went up. (Akin to your kids spending all their pocket money in one day!)
  2. Compliance was breached (99% availability) – the purple line (burn rate) went above the maximum budget (yellow line).

These are metrics you will want to ALERT on at any time of the day. These sorts of metrics matter: they are violating a Service Level Agreement.
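As a taste of what is coming, here is a minimal sketch of such an alert as a Prometheus rule. The http_requests_total metric, the 99% SLO and the 14x burn-rate threshold are all assumptions you would tune to your own service:

groups:
- name: slo-burn-rate
  rules:
  - alert: ErrorBudgetBurnRateTooHigh
    # burn rate = observed error ratio / allowed error ratio (1% for a 99% SLO)
    expr: |
      (
        sum(rate(http_requests_total{code=~"5.."}[1h]))
        /
        sum(rate(http_requests_total[1h]))
      ) / 0.01 > 14
    for: 5m
    labels:
      severity: page
    annotations:
      summary: Error budget is burning 14x faster than sustainable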

What about my other metrics?

Aaaah, like %Disk Queue Length, Processor Time, Kubernetes Nodes/Pods/Pools etc? Well…

I treat these metrics as second-class citizens, like a layered onion. Your initial question should be: am I violating my SLA? If not, then you can use the other metrics that we have enjoyed over the decades to complement your observability of the systems, and as a visual aid for diagnostics and troubleshooting.

Alerting

Another important consideration is the evolution of infrastructure. In 1999 you would have wanted to receive an alert if a server ran out of disk space. In 2020, you are running container orchestration clusters and other high-availability systems; a container running out of disk space is not as critical as it was in 1999.

  • Evaluate every single alert you have and ask yourself: do I really need to wake someone up at 3am for this problem?
  • Always alert on Service Level Compliance levels ABOUT to breach
  • Always alert on Error Budget Burn Rates going up sharply
  • Do not alert someone out of hours because the CPU is at 100% for 5 minutes, unless Service Level Compliance is being affected too

You will have happier engineers and a more productive team. You will be cool-headed during an incident because you know the difference between a cluster node going down and a Service Level Compliance violation. Always solve the Service Level Compliance issue first, and then fix the other problems.

Ingredients

Where are the ingredients you promised? You said it would not break the bank; I am intrigued.

Summary

In this post we have touched on the FOUNDATION of what we want out of monitoring.

We know exactly what we want to measure – Service Level Compliance, Error Budget Burn Rate and Max Budget. All this revolves around deciding on the level of availability we give a service.

We touched on the basic ingredients that we can use to build this solution.

In my next blog post we will discuss how we mix all these ingredients together to provide a platform that is good at doing one thing well.

Measuring Service Level Compliance & Error Budget Burn Rates

When you give your child $30 to spend a month and they need to save $10 of it, you need to be alerted if they are spending too fast (burn rate).

Microsoft Azure Devops – Dynamic Docker Agent (Ansible)

Often you may require a unique custom build/release agent with a specific set of tools.

A good example is a dynamic Ansible Agent that can manage post deployment configuration. This ensures configuration drift is minimised.

Secondly this part of a release is not too critical, so we can afford to spend a bit of time downloading a docker image if it is not already cached.

This article demonstrates how you can dynamically spawn a docker container during your release pipeline to apply configuration leveraging Ansible. It will also demonstrate how to use Ansible dynamic inventory to detect Azure virtual machine scale set instances – in the past you would have to run hacks with Facter.

Prerequisites

You will require:

  • A docker image with ansible – You can use mine as a starting point – https://github.com/Romiko/DockerUbuntuDev
    The above is hosted at: dockerhub – romiko/ansible:latest (See reference at bottom of this page)
  • A self-hosted Azure DevOps agent – Linux
  • Docker installed on the self-hosted agent
  • Docker configured to expose Docker Socket
    docker run -v /var/run/docker.sock:/var/run/docker.sock -d --name some_container some_image

Release Pipeline

Configure a CLI Task in your release pipeline.

variables:
  env: 'dev'

steps:
- task: AzureCLI@2
  displayName: 'Azure CLI Ansible'
  inputs:
    azureSubscription: 'RangerRom'
    scriptType: bash
    scriptLocation: inlineScript
    inlineScript: |
     set -x
     
     docker run --rm -v $(System.DefaultWorkingDirectory)/myproject/config:/playbooks/ romiko/ansible:latest \
      "cd  /playbooks/ansible; ansible-playbook --version; az login --service-principal --username $servicePrincipalId --password $servicePrincipalKey --tenant $tenantId; az account set --subscription $subscription;ansible-playbook my-playbook.yaml -i inventory_$(env)_azure_rm.yml --extra-vars \"ansible_ssh_pass=$(clientpassword)\""
    addSpnToEnvironment: true
    workingDirectory: '$(System.DefaultWorkingDirectory)/myproject/config/ansible'

In the above, the code that causes a SIBLING container to spawn on the self-hosted DevOps agent is:

docker run --rm -v $(System.DefaultWorkingDirectory)/myproject/config:/playbooks/ romiko/ansible:latest \ <command to execute inside the container>

Here we have a mount point occurring, where the config folder in the repo is mounted into the docker container.

-v <SourceFolder>:<MountPointInDockerContainer>

The rest of the code after the \ will execute inside the docker container. So in the above:

  • The container becomes a sibling of the build agent container
  • We enter a bash shell
  • The container mounts a /playbooks folder containing the source code from the build artifacts
  • We connect to Azure
  • We run an Ansible playbook
  • The playbook finds all virtual machine scale sets in a resource group matching a name pattern
  • It applies configuration by setting Logstash to auto-reload config files when they change
  • It applies configuration by copying files

Ansible

The above is used to deploy configurations to an Azure Virtual Machine Scale Set. Ansible has a feature called dynamic inventory, and we will leverage it to detect all active nodes/instances in a VMSS.

The structure of ansible is as follows:

Ansible Dynamic Inventory

So let's see how Ansible can be used to detect all running instances in an Azure Virtual Machine Scale Set.

inventory_dev_azure_rm.yml

Below, it will detect any VMSS cluster in the resource group rom-dev-elk-stack that has "logstash" in its name.

plugin: azure_rm

include_vmss_resource_groups:
- rom-dev-elk-stack

conditional_groups:
  logstash_hosts: "'logstash' in name"

auth_source: auto

logstash_hosts.yml (Ensure this lives in a group_vars folder)

Now, I can configure SSH using a username or SSH keys.

---
ansible_connection: ssh
ansible_ssh_user: logstash

logstash-playbook.yaml

Below, I now have Ansible doing some configuration checks for me on a Logstash pipeline (upstream/downstream architecture).


    - name: Logstash auto reloads check interval
      lineinfile:
        path: /etc/logstash/logstash.yml
        regexp: '^config\.reload\.interval'
        line: "config.reload.interval: 30s"
      become: true
      notify:
        - restart_service

    - name: Copy pipeline configs
      copy:
        src: ../pipelines/conf.d/
        dest: /etc/logstash/conf.d/
        owner: logstash
        group: logstash
      become: true
    
    - name: Copy pipeline settings
      copy:
        src: ../pipelines/register/
        dest: /etc/logstash/
        owner: logstash
        group: logstash
      become: true

To improve security, replace the user/password Ansible login with an SSH key pair, as sketched below.
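A sketch of what the group_vars file then looks like; the key path is an assumption, and you would distribute the matching public key to the VMSS instances:

---
ansible_connection: ssh
ansible_ssh_user: logstash
ansible_ssh_private_key_file: ~/.ssh/logstash_deploy_key

You can then drop the ansible_ssh_pass extra-var from the pipeline command.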

References

To read more about Docker socket mount points, check out:

https://www.develves.net/blogs/asd/2016-05-27-alternative-to-docker-in-docker/

https://docs.ansible.com/ansible/latest/user_guide/intro_dynamic_inventory.html

Thanks to Shawn Wang and Ducas Francis for the inspirations on Docker Socket.

https://azure.microsoft.com/en-au/services/devops/

Debugging Azure Event Hubs and Stream Analytics Jobs

When you are dealing with millions of events per day (in JSON format), you need a debugging tool for events that do not behave as expected.

Recently we had an issue where an Azure Streaming analytics job was in a degraded state. A colleague eventually found the issue to be the output of the Azure Streaming Analytics Job.

The error message was very misleading.

[11:36:35] Source 'EventHub' had 76 occurrences of kind 'InputDeserializerError.TypeConversionError' between processing times '2020-03-24T00:31:36.1109029Z' and '2020-03-24T00:36:35.9676583Z'. Could not deserialize the input event(s) from resource 'Partition: [11], Offset: [86672449297304], SequenceNumber: [137530194]' as Json. Some possible reasons: 1) Malformed events 2) Input source configured with incorrect serialization format\r\n"

The source of the issue was CosmosDB: we needed to increase the RUs. However, the error seemed to indicate a serialization issue.

We developed a tool that could subscribe to events at exactly the same time as the error, using the sequence number and partition.

We also wanted to be able to use the tool for a large number of events, around ±1 million per hour.

Please click the link to the Event Hub .NET client. This tool is optimised to use as little memory as possible and leverages asynchronous file writes for an optimal event subscription experience (a console app, of course).

I have purposely avoided the Newtonsoft library for the final file write to improve performance.

The output will be a JSON array of events.

The next time you need to subscribe to Event Hubs to diagnose an issue with a particular event, I would recommend using this tool to get the events you are interested in analysing.

Thank you.

What is Devops – Part 1

Patrick Debois from Belgium is the actual culprit to blame for the term DevOps; he wanted more synergy between developers and operations back in 2007.

Fast-forward a few years, and now we have "DevOps" everywhere we go. If you are using the coolest tools in town, such as Kubernetes, Azure DevOps Pipelines, Jenkins, Grafana etc., then you probably reckon that you are heavy into DevOps. This could not be further from the truth.

The fact is that DevOps is more about a set of patterns and practices within a culture that nurtures shared responsibilities across all teams during the software development life-cycle.

Put it this way: if you only have one dude in your team that is "doing DevOps", then you may want to consider whether you are really implementing DevOps or one of its anti-patterns. Ultimately you need to invest in everyone within the SDLC teams to get on board with the cultural shift.

If we cannot get the majority of engineers involved in the SDLC to share responsibilities, then we have failed at our objectives regarding DevOps, even if we are using the latest cool tools, from Prometheus to AKS/GKE. In a recent project that I was engaged in, there was only one DevOps dude; when he fell ill, nobody from any of the other engineering teams could perform his duties, despite the fact that Confluence has numerous playbooks and how-tos. Why?

It comes down to people, process and culture, all of which can be remedied with strong technical leadership and by encouraging your engineers to work with the process and tools in their daily routine. This is why I encourage developers that are hosting their code on Kubernetes to use Minikube on their laptops.

If there is any advice that I can provide teams that want to implement DevOps, it is this: focus on people, then process, and finally the tools.

To set the transition up for success, we will discuss the pillars of DevOps in the next part of this series.

Installing Kubernetes – The Hard Way – Visual Guide

This is a visual guide to complement the process of setting up your own Kubernetes cluster on Google Cloud, following Kelsey Hightower's Git project Kubernetes The Hard Way. It can be challenging to remember all the steps along the way; I found having a visual guide like this valuable for refreshing my memory.

Provision the network in Google Cloud

VPC

Provision Network

Firewall Rules

External IP Address

Provision Controllers and Workers – Compute Instances

Controller and Worker Instances

Workers will have pod CIDR

10.200.0.0/24

10.200.1.0/24

10.200.2.0/24
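A sketch of how each worker is created with its pod CIDR attached as instance metadata (instance names, machine type and network flags are assumptions; see the guide for the exact command):

for i in 0 1 2; do
  gcloud compute instances create worker-${i} \
    --can-ip-forward \
    --image-family ubuntu-2004-lts \
    --image-project ubuntu-os-cloud \
    --machine-type e2-standard-2 \
    --metadata pod-cidr=10.200.${i}.0/24 \
    --subnet kubernetes \
    --tags kubernetes-the-hard-way,worker
done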

Provision a CA and TLS Certificates

Certificate Authority

Client & Server Certificates

Kubelet Client Certificates

Controller Manager Client Certificates

Kube Proxy Client Certificates

Scheduler Client Certificates

Kubernetes API Server Certificate

Reference https://github.com/kelseyhightower/kubernetes-the-hard-way/blob/master/docs/04-certificate-authority.md

Service Account Key Pair

Certificate Distribution – Compute Instances

Generating Kubernetes Configuration Files for Authentication

Generating the Data Encryption Config and Key

Bootstrapping etcd cluster

Use tmux with synchronize-panes on to run commands on multiple instances at the same time. It saves time!

Notice we are using tmux in the Windows Subsystem for Linux (Ubuntu) and running commands in parallel to save a lot of time.

The only manual step is to SSH into each controller; once in, we activate the tmux synchronize feature, so what you type in one pane is duplicated to all the others, as shown below.
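For reference, the toggle itself is a single tmux command:

# run inside the tmux window after ssh'ing into each controller in its own pane
tmux set-window-option synchronize-panes on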

Bootstrapping the Control Plane (services)

Bootstrapping the Control Plane (LB + Health)

Nginx is required, as Google health checks do not support HTTPS.

Bootstrapping the Control Plane (Cluster Roles)

Bootstrapping the Worker Nodes

Configure kubectl remote access

Provisioning Network Routes

DNS Cluster Add-On

First Pod deployed to cluster – using CoreDNS

Smoke Test

Once you have completed the install of your Kubernetes cluster, ensure you tear it down after some time so that you do not get billed for the six compute instances, the load balancer and the public static IP address.
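A sketch of the teardown for the compute instances, assuming the instance names from the guide:

gcloud -q compute instances delete \
  controller-0 controller-1 controller-2 \
  worker-0 worker-1 worker-2 \
  --zone $(gcloud config get-value compute/zone)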

A big thank you to Kelsey for setting up a really comprehensive instruction guide.

Creating a Cloud Architecture Roadmap

Overview

When a product has been proved to be a success and has just come out of an MVP (Minimum Viable Product) or MMP (Minimum Marketable Product) state, a lot of corners will usually have been cut in order to get the product out and act on the valuable feedback. So, inevitably, there will be technical debt to take care of.

What is important is having a technical vision that reduces costs and delivers value and impact in a scalable, resilient and reliable way, and that can be communicated to all stakeholders.

A lot of cost savings can be made when scaling out by putting together a Cloud Architecture Roadmap. The roadmap can then be communicated to your stakeholders, development teams and, most importantly, finance. It provides a high-level "map" of where you are now and where you want to be at some point in the future.

A roadmap is ever changing, just like when my wife and I go travelling around the world. We will have a roadmap of where we want to go for a year, but are open to making changes halfway through the trip, e.g. if an earthquake hits a country we planned to visit. The same is true in IT: sometimes budgets are cut, or a budget surplus needs to be consumed, and such events can affect your roadmap.

It is something that you want to review on a regular schedule. Most importantly you want to communicate the roadmap and get feedback from others.

Feedback from other engineers and stakeholders is crucial – they may spot something that you did not or provide some better alternative solutions.

Decomposition

The first stage is to decompose your ideas. Below is a list that helps get me started in the right direction. This is by no means an exhaustive list; it will differ based on your industry.

| Component | Description | Examples |
| --- | --- | --- |
| Application Run-time | Where apps are hosted | Azure Kubernetes Service |
| Persistent Storage | Non-volatile data | File store, block store, object store, CDN, message, database, cache |
| Backup/Recovery | Backup/redundancy solutions | Managed services, Azure OMS, Recovery Vaults, volume images, geo-redundancy |
| Data/IoT | Connected devices/sensors | Streaming Analytics, Event Hubs, AI/machine learning |
| Gateway | How services are accessed | Azure Front Door, NGINX, Application Gateway, WAF, Kubernetes ingress controllers |
| Hybrid Connectivity | On-premise access, cross-cloud | ExpressRoute, jumpboxes, VPN, Citrix |
| Source Control | Where code lives; build CI/CD | GitHub, Bitbucket, Azure DevOps, Octopus Deploy, Jenkins |
| Certificate Management | SSL certificates | Azure Key Vault, SSL offloading strategies |
| Secret Management | Store sensitive configuration | Puppet (Hiera), Azure Key Vault, LastPass, 1Password |
| Mobile Device Management | Device and app management | Google Play, App Store, G Suite Enterprise MDM, etc. |

Once you have an idea of all your components, the next step is to break down your roadmap into milestones that will ultimately assist in reaching your target state. Which, of course, will not be final in a few years' time 😉 or even months!

Sample Roadmap

Below is a link to a google slide presentation that you can use for your roadmap.

https://docs.google.com/presentation/d/1Hvw46vcWJyEW5b7o4Xet7jrrZ17Q0PVzQxJBzzmcn2U/edit?usp=sharing

Query Azure AppInsights with Powershell

In order to query AppInsights using PowerShell, you will need your AppInsights AppId and API key.

The important consideration is to ensure your JSON is valid, so always run it through a parser and use the correct escape characters for both JSON and PowerShell. Have a look at the string in $queryData.

The following code will query AppInsights and generate CSV files based on the batch size. It also pages through results by leveraging:

| serialize | extend rn = row_number()
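Under the hood this is a plain REST call, so you can smoke-test your AppId and API key with curl first. A sketch; APP_ID and API_KEY are placeholders:

curl -X POST "https://api.applicationinsights.io/v1/apps/$APP_ID/query?timespan=P7D" \
  -H "Content-Type: application/json" \
  -H "x-api-key: $API_KEY" \
  -d '{"query": "traces | serialize | extend rn = row_number() | where rn > 0 and rn <= 100"}'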

Happy DevOps Reporting 🙂

param (
    [Parameter(Mandatory = $true)]
    [string] $AppInsightsId,

    [Parameter(Mandatory = $true)]
    [string] $apiKey,

    [Parameter(Mandatory = $false)]
    [string] $Timespan = "P7D",

    [Parameter(Mandatory = $false)]
    [int] $batchSize = 10000,

    [Parameter(Mandatory = $false)]
    [string] $OutputFolder = "C:\Output\",

    [Parameter(Mandatory = $false)]
    [string] $logFileName = "AppQuery.log",

    [Parameter(Mandatory = $false)]
    [string] $logFolder = "C:\Logs\"
)

Add-Type -AssemblyName System.Web

Import-Module .\Helpers.psm1 -Force -ErrorAction Stop
Import-Module .\Shared.Logging.psm1 -Force -Global
Import-Module .\Security.Helpers.psm1 -Force -Global

# Writes the CSV header row for the given batch file.
function prepareFileHeader($fileNumber, $columnNames) {
    $csvString = ""
    ForEach ($Property in $columnNames) {
        $csvString += "$($Property.Name),"
    }
    $csvString = $csvString.Substring(0, $csvString.Length - 1)
    $file = Join-Path $OutputFolder "batch-$fileNumber.csv"
    $csvString | Out-File -FilePath $file -Encoding utf8
}

# Appends each result row of the current page to the given batch file.
function writeRecordsToFile($records, $fileNumber) {
    $file = Join-Path $OutputFolder "batch-$fileNumber.csv"
    ForEach ($record in $records) {
        $csvString = ""
        foreach ($cell in $record) {
            $csvString += "$cell,"
        }
        $csvString = $csvString.Substring(0, $csvString.Length - 1)
        $csvString | Out-File -FilePath $file -Encoding utf8 -Append -NoClobber
    }
}

$logFilePath = PrepareToLog $logFolder $logFileName

try {
    $url = "https://api.applicationinsights.io/v1/apps/$AppInsightsId/query"
    $headers = @{ "Content-Type" = "application/json" }
    $headers.Add("x-api-key", $apiKey)
    $queryString = "?timespan=$Timespan"
    $fullUrl = $url + $queryString

    # Count the total matching records first, so we know how many pages to fetch.
    $queryTotalMessageCount = "traces\r\n | where message contains \`"Max Retry Count reached\`" and message contains \`"MessageService\`"\r\n | summarize count()"
    $queryTotalMessageCountBody = "{
    `"query`": `"$queryTotalMessageCount`"
    }"
    $resultCount = Invoke-WebRequest -Uri $fullUrl -Headers $headers -Method POST -Body $queryTotalMessageCountBody -ErrorAction Continue
    $totalObject = ConvertFrom-Json $resultCount.Content

    $totalRecords = $totalObject.tables.rows[0]
    $pages = [math]::Ceiling($totalRecords / $batchSize)
    $startRow = 0
    $endRow = $batchSize

    Write-Host "Total Files: $pages for Batch Size: $batchSize"
    For ($i = 1; $i -le $pages; $i++) {
        Write-Host "Processing File: batch-$i.csv"
        $queryData = "traces\r\n | extend TenantId = extract(\`"Tenant Id \\\\[[a-z0-9A-Z-.]*\\\\]\`", 0, message) | extend UniqueTransactionId = extract(\`"\\\\[[a-z0-9A-Z-. _\\\\^]*\\\\]\`",0 ,extract(\`"Message Transaction \\\\[[a-z0-9A-Z-._]*\\\\]\`", 0, message))\r\n | extend TransactionId = trim_start(\`"\\\\[\`", tostring(split(UniqueTransactionId, \`"_\`") [0]))\r\n | extend TransactionDateTicks = tostring(split(UniqueTransactionId, \`"_\`") [1])\r\n | extend PrincipalId = trim_end(\`"\\\\]\`", tostring(split(UniqueTransactionId, \`"_\`") [2]))\r\n | where message contains \`"Max Retry Count reached\`" and message contains \`"MessageService\`"\r\n | project TransactionId, TransactionDateTicks, PrincipalId, TenantId\r\n | summarize ErrorCount = count(TransactionId) by TransactionId, TransactionDateTicks, PrincipalId, TenantId\r\n | serialize | extend rn = row_number()\r\n | where rn > $startRow and rn <= $endRow"
        $queryBody = "{
        `"query`": `"$queryData`"
        }"
        $result = Invoke-WebRequest -Uri $fullUrl -Headers $headers -Method POST -Body $queryBody -ErrorAction Continue
        $data = ConvertFrom-Json $result.Content
        $startRow += $batchSize
        $endRow += $batchSize

        # Column names only need to be captured once, from the first page.
        if ($i -eq 1) {
            $columnNames = $data.tables.columns | select name
        }

        prepareFileHeader $i $columnNames
        writeRecordsToFile $data.tables.rows $i
    }
} catch {
    LogErrorMessage -msg $error[0] -filePath $logFilePath -fatal $true
}

ARM – Modular Templates – Reference resources already created

Hi,

I noticed the Microsoft documentation related to the following function is a little bit vague.

reference(resourceName or resourceIdentifier, [apiVersion], ['Full'])

The second issue I see a lot of people having is: how do you reference a resource already created in ARM and get some of that object's properties, e.g. the FQDN of a public IP that already exists?

The clue to solving this issue, so that ARM template B can reference a resource created in ARM template A, can be found here:

By using the reference function, you implicitly declare that one resource depends on another resource if the referenced resource is provisioned within same template and you refer to the resource by its name (not resource ID). You don’t need to also use the dependsOn property. The function isn’t evaluated until the referenced resource has completed deployment.

Or use linked templates (linked templates are a huge rework, and you need to host the files on the net). Let's see if we can do it via resourceId.

Therefore, if we reference a resource by resourceId, we remove the implicit "dependsOn", allowing ARM template B to use a resource created in a totally different ARM template.

A great example might be the FQDN on an IP Address.

Imagine ARM template A creates the IP address:


"resources": [
{
"apiVersion": "[variables('publicIPApiVersion')]",
"type": "Microsoft.Network/publicIPAddresses",
"name": "[variables('AppPublicIPName')]",
"location": "[variables('computeLocation')]",
"properties": {
"dnsSettings": {
"domainNameLabel": "[variables('AppDnsName')]"
},
"publicIPAllocationMethod": "Dynamic"
},
"tags": {
"resourceType": "Service Fabric",
"scaleSetName": "[parameters('scaleSetName')]"
}
}]

Now imagine we need to get the FQDN of the IP address in ARM template B.

What we are going to do is swap the resource name for a resourceId() call:

reference(resourceIdentifier, [apiVersion]) -> reference(resourceId(...), [apiVersion])

Here is an example where ARM template B references a resource in A and gets a property.


"managementEndpoint": "[concat('https://',reference(resourceId('Microsoft.Network/publicIPAddresses/',variables('AppPublicIPName')), variables('publicIPApiVersion')).dnsSettings.fqdn,':',variables('nodeTypePrimaryServiceFabricHttpPort'))]",

The important thing here is to ensure you always include the API version. This pattern is a very powerful way to create smaller and more modular ARM templates.

Note: in the above pattern, you do not need to define dependsOn in ARM template B, as we are explicitly referencing an existing resource. ARM template B is not responsible for creating the public IP; if you need it, you run ARM template A.
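Deployment order then becomes an operational concern rather than a template concern. A sketch with an assumed resource group and file name, template A having been deployed earlier:

az deployment group create \
  --resource-group my-rg \
  --template-file template-b.json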

So if you need a reference to existing resources, use the above. If you need a reference to resources created in the SAME ARM template, use:

reference(resourceName)

Cheers