Tag: Docker

Optimizing Dockerfile for Web Applications with Multi-Stage Builds

Optimizing Dockerfile for Web Applications with Multi-Stage Builds

Introduction

Docker has revolutionized the way applications are developed and deployed. However, as Docker images grow in complexity, so do their sizes, which can lead to longer build times, increased storage costs, and slower deployment speeds. One way to mitigate these issues is through optimizing Dockerfiles using multi-stage builds. This blog post will explain how to optimize Dockerfiles, reduce image size, and improve security using multi-stage builds and other best practices.

Understanding Multi-Stage Builds

Multi-stage builds allow you to use multiple FROM statements in your Dockerfile. This feature enables you to create intermediate images that are not included in the final image, thereby reducing the final image size.

Best Practices for Dockerfile Optimization

1. Use Small Base Images: Start with a minimal base image like alpine to reduce the overall size.

2. Combine Commands: Use && to chain commands together to reduce the number of layers.

3. Clean Up: Remove unnecessary files and packages to keep the image clean and minimal.

4. Avoid Unnecessary Packages: Only install the packages you need.

5. Multi-Stage Builds: Use multi-stage builds to keep build dependencies out of the final image.

6. Remove SSH and Unnecessary Services: Improve security by not including SSH and other unnecessary services in your image.

Example Web Application Dockerfile (Non-Optimized)

# Non-Optimized Dockerfile
FROM ubuntu:20.04

# Install dependencies
RUN apt-get update && \
    apt-get install -y nginx curl

# Copy application files
COPY . /var/www/html

# Expose port
EXPOSE 80

# Start nginx
CMD ["nginx", "-g", "daemon off;"]


Example Web Application Dockerfile (Optimized)

# Stage 1: Build Stage
FROM node:16-alpine as build

# Set working directory
WORKDIR /app

# Install dependencies
COPY package*.json ./
RUN npm install

# Copy application files and build
COPY . .
RUN npm run build

# Stage 2: Runtime Stage
FROM nginx:alpine

# Remove default nginx website
RUN rm -rf /usr/share/nginx/html/*

# Copy built application from build stage
COPY --from=build /app/build /usr/share/nginx/html

# Stage 3: Install Runtime Dependencies
FROM node:16-alpine as runtime

# Set working directory
WORKDIR /app

# Copy only package.json and package-lock.json to install runtime dependencies
COPY package*.json ./
RUN npm install --production

# Copy built application from build stage
COPY --from=build /app/build /app

# Expose port
EXPOSE 80

# Start the application
CMD ["npm", "start"]

# Start nginx
CMD ["nginx", "-g", "daemon off;"]

Explanation

1. Build Stage: This stage includes all dependencies (both development and production) required to build the application.

2. Runtime Stage: This stage installs only the production dependencies to keep the final image lean and optimized.

3. Separation of Concerns: By separating the build and runtime stages, we ensure that unnecessary development dependencies are not included in the final image.

4. Nginx Configuration: The final image uses Nginx to serve the built application, ensuring a lightweight and secure setup.

Conclusion

Optimizing your Dockerfiles can significantly reduce image size, improve build times, and enhance security. By using multi-stage builds, small base images, combining commands, and cleaning up unnecessary files, you can create efficient and secure Docker images. The example provided demonstrates how to apply these best practices to a simple web application using Nginx and Node.js.

You can do the same with your dev and production environment; stage 1 can include all the dev tools for compilation, e.g. gcc, MSBuild, etc, and stage 2 can remove these dev tools that are not required at runtime.

References

Docker Documentation

Best Practices for Writing Dockerfiles

By following these guidelines, you can ensure that your Docker images are optimized for performance, security, and efficiency.

Microsoft Azure Devops – Dynamic Docker Agent (Ansible)

Microsoft Azure Devops – Dynamic Docker Agent (Ansible)

Often you may require a unique custom build/release agent with a specific set of tools.

A good example is a dynamic Ansible Agent that can manage post deployment configuration. This ensures configuration drift is minimised.

Secondly this part of a release is not too critical, so we can afford to spend a bit of time downloading a docker image if it is not already cached.

This article demonstates how you can dynamically spawn a docker container during your release pipeline to apply configuration leveraging Ansible. It will also demonstrate how to use Ansible Dynamic Inventory to detect Azure Virtual machine scale set instances – in the past you would run hacks on facter.

Prerequsites

You will require:

  • A docker image with ansible – You can use mine as a starting point – https://github.com/Romiko/DockerUbuntuDev
    The above is hosted at: dockerhub – romiko/ansible:latest (See reference at bottom of this page)
  • A Self-host Azure Devops Agent – Linux
  • Docker installed on the self-hosted agent
  • Docker configured to expose Docker Socket
    docker run -v /var/run/docker.sock:/var/run/docker.sock -d –name some_container some_image

Release Pipeline

Configure a CLI Task in your release pipeline.

variables:
  env: 'dev'

steps:
- task: AzureCLI@2
  displayName: 'Azure CLI Ansible'
  inputs:
    azureSubscription: 'RangerRom'
    scriptType: bash
    scriptLocation: inlineScript
    inlineScript: |
     set -x
     
     docker run --rm -v $(System.DefaultWorkingDirectory)/myproject/config:/playbooks/ romiko/ansible:latest \
      "cd  /playbooks/ansible; ansible-playbook --version; az login --service-principal --username $servicePrincipalId --password $servicePrincipalKey --tenant $tenantId; az account set --subscription $subscription;ansible-playbook my-playbook.yaml -i inventory_$(env)_azure_rm.yml --extra-vars \"ansible_ssh_pass=$(clientpassword)\""
    addSpnToEnvironment: true
    workingDirectory: '$(System.DefaultWorkingDirectory)/myproject/config/ansible'

In the above the code that is causing a SIBLING container to spawn on the self-hosted devops agent is:

docker run –rm -v $(System.DefaultWorkingDirectory)/myproject/config:/playbooks/ romiko/ansible:latest \ <command to execute inside the container>

Here we have a mount point occuring where the config folder in the repo will be mounted into the docker container.

-v <SourceFolder>:<MountPointInDockerContainer>

The rest of the code after the \ will execute on the docker container. So in the above,

  • The container will become a sibling,
  • Entry into a bash shell
  • Container will mount a /playbooks folder containing the source code from the build artifacts
  • Connect to azure
  • Run an anisble playbook.
  • The playbook will find all virtual machine scale sets in a resoruce group with a name pattern
  • Apply a configuration by configuring logstash to auto reload config files when they change
  • Apply a configuration by copying files

Ansible

The above is used to deploy configurations to an Azure Virtual Machine Scale Set. Ansible has a feature called dynamica inventory. We will leverage this feature to detect all active nodes/instances in a VMSS.

The structure of ansible is as follows:

Ansible Dynamic Inventory

So lets see how ansible can be used to detect all running instances in an Azure Virtual machine Scale Set

inventory_dev_azure_rm.yml

Below it will detect any VMSS cluster in resourcegroup rom-dev-elk-stack that has logstash in the name

plugin: azure_rm

include_vmss_resource_groups:
- rom-dev-elk-stack

conditional_groups:
  logstash_hosts: "'logstash' in name"

auth_source: auto

logstash_hosts.yml (Ensure this lives in a group_vars folder)

Now, I can configure ssh using a username or ssh keys.

---
ansible_connection: ssh
ansible_ssh_user: logstash

logstash-playbook.yaml

Below I now have ansible doing some configuration checks for me on a logstash pipeline (upstream/downstream architecture).


    - name: Logstash auto reloads check interval
      lineinfile:
        path: /etc/logstash/logstash.yml
        regexp: '^config\.reload\.interval'
        line: "config.reload.interval: 30s"
      become: true
      notify:
        - restart_service

    - name: Copy pipeline configs
      copy:
        src: ../pipelines/conf.d/
        dest: /etc/logstash/conf.d/
        owner: logstash
        group: logstash
      become: true
    
    - name: Copy pipeline settings
      copy:
        src: ../pipelines/register/
        dest: /etc/logstash/
        owner: logstash
        group: logstash
      become: true

To improve security – replace user/password ansible login with an SSH key pair.

References

To read up more about Docker Socket mount points. Check out

https://www.develves.net/blogs/asd/2016-05-27-alternative-to-docker-in-docker/

https://docs.ansible.com/ansible/latest/user_guide/intro_dynamic_inventory.html

Thanks to Shawn Wang and Ducas Francis for the inspirations on Docker Socket.

https://azure.microsoft.com/en-au/services/devops/

Run Azure CLI inside Docker on a Macbook Pro

Laptop Setup

Bootcamp with Windows on one partition and OSX on another.

A great way to manage your Windows Azure environment is to use a Docker Container, instead of powershell.
If you are new to automating your infrastructure and code, then this will be a great way to start on the right foot from day one.

Docker is an open platform for developing, shipping, and running applications. Docker enables you to separate your applications from your infrastructure so you can deliver software quickly. With Docker, you can manage your infrastructure in the same ways you manage your applications. By taking advantage of Docker’s methodologies for shipping, testing, and deploying code quickly, you can significantly reduce the delay between writing code and running it in production.

Install Docker

Grab the latest version of docker here.

After Installing Docker
1. Use bootcamp to boot back into OSX.
2. In OSX restart the machine (warm restart)
3. Hold the Options key down to boot back into Windows

The above looks like a waste of time, However, this will enable Virtualisation in the Bios of the Macbook, since OSX does this by default and windows will not. So it is a small hack to get virtualisation working via a warm reboot from OSX back to Windows.

Grab a Docker Virtual Image with Azure CLI

Run the following command:

docker run -it microsoft/azure-cli

docker-install-azure

The above command will connect to the Docker Repository and download the image to run in a container. This is basically a virtualized environment where you can now manage your windows environment from.

Install Azure Command Line Interface (CLI)

Run the following command:

Azure Help

Look carefully at the image below. Powershell was used to run Docker. However once I run Docker, look at my prompt (root@a28193f1320d:/#). We are now in a Linux virtual machine  (a28193f1320d) and we now have total control over our Azure resources from the command line.

Docker in Windows
Docker in Windows

Now, the Linux guys will start having some respect for us Windows guys. We are now entering an age where we need to be agnostic to technology.

Below we are now running a full blown Kernel of Linux in a Windows Powerhsell prompt.

docker-linux

What is even cooler, we are using a Linux VM to manage the Azure environment, and so we get awesome tools for free.

linuxtools

Good Habits
By using docker with the Azure Command Line interface, you will put yourself into a good position by automating all your infrastructure and code requirements.

You will be using the portal less and less to manage and deploy your azure resources such as Virtual Machines, Blobs and Permissions.

Note, we are now using ARM – Azure Resource Management, some features in ARM will not be compatible with older Azure deployments. Read more about ARM.

Conclusion
You can deploy, update, or delete all the resources for your solution in a single, coordinated operation. You use a template for deployment and that template can work for different environments such as testing, staging, and production. Resource Manager provides security, auditing, and tagging features to help you manage your resources after deployment.

CLI Reference


help: Commands:
help: account Commands to manage your account information and publish settings
help: acs Commands to manage your container service.
help: ad Commands to display Active Directory objects
help: appserviceplan Commands to manage your Azure appserviceplans
help: availset Commands to manage your availability sets.
help: batch Commands to manage your Batch objects
help: cdn Commands to manage Azure Content Delivery Network (CDN)
help: config Commands to manage your local settings
help: datalake Commands to manage your Data Lake objects
help: feature Commands to manage your features
help: group Commands to manage your resource groups
help: hdinsight Commands to manage HDInsight clusters and jobs
help: insights Commands related to monitoring Insights (events, alert rules, autoscale settings, metrics)
help: iothub Commands to manage your Azure IoT hubs
help: keyvault Commands to manage key vault instances in the Azure Key Vault service
help: lab Commands to manage your DevTest Labs
help: location Commands to get the available locations
help: network Commands to manage network resources
help: policy Commands to manage your policies on ARM Resources.
help: powerbi Commands to manage your Azure Power BI Embedded Workspace Collections
help: provider Commands to manage resource provider registrations
help: quotas Command to view your aggregated Azure quotas
help: rediscache Commands to manage your Azure Redis Cache(s)
help: resource Commands to manage your resources
help: role Commands to manage role definitions
help: servermanagement Commands to manage Azure Server Managment resources
help: storage Commands to manage your Storage objects
help: tag Commands to manage your resource manager tags
help: usage Command to view your aggregated Azure usage data
help: vm Commands to manage your virtual machines
help: vmss Commands to manage your virtual machine scale sets.
help: vmssvm Commands to manage your virtual machine scale set vm.
help: webapp Commands to manage your Azure webapps
help:
help: Options:
help: -h, --help output usage information
help: -v, --version output the application version
help:
help: Current Mode: arm (Azure Resource Management)