Gitops – Romiko Derbynew

In the ever-evolving landscape of DevOps, continuous integration and continuous deployment (CI/CD) practices have become the backbone of modern software development. Two major players in this field are GitOps-based tools like ArgoCD and traditional CI/CD push architectures like Azure DevOps and GitHub Actions. Let’s embark on an exploratory journey to compare these two approaches, highlighting their unique features and determining which might be the best fit for your development workflow.

The Contenders

ArgoCD: A GitOps tool that utilizes a pull-based deployment model, designed for Kubernetes-centric environments. It focuses on maintaining the desired state of applications and infrastructure as defined in Git repositories.

Azure DevOps & GitHub Actions: Traditional CI/CD tools that utilize a push-based model. They are versatile, supporting various deployment environments beyond Kubernetes and integrating well with a wide range of development tools and services.

Round 1: Architecture and Approach

ArgoCD: The GitOps Champion

ArgoCD follows the GitOps paradigm, where the desired state of the system is stored in Git. This approach brings several advantages:

• Consistency: By maintaining the desired state configuration in Git, ArgoCD ensures that the actual state of the cluster matches the desired state, automatically correcting any drift.

• Security: Credentials and sensitive information remain within the Kubernetes cluster, reducing the risk of exposure.

• Versioning: Git’s inherent version control allows for easy rollbacks and audits, enhancing traceability and reliability.

ArgoCD shines in Kubernetes-centric environments where maintaining state consistency and security is paramount.

Azure DevOps & GitHub Actions: The Versatile Veterans

Azure DevOps and GitHub Actions adopt a more traditional push-based model, triggering deployments based on events (e.g., code commits). They offer:

• Flexibility: These tools support a wide range of deployment environments, from cloud-native applications to traditional on-premises systems.

• Simplicity: Familiarity among engineers and widespread documentation make them easier to adopt and implement.

• Structure: Compatibility with existing repository structures allows for seamless integration without significant restructuring.

These tools are ideal for diverse environments where flexibility and ease of use are critical.

Round 2: Deployment Models

Pull-Based Deployment (ArgoCD)

ArgoCD continuously monitors the Git repository for changes. When it detects a difference between the desired state in Git and the actual state in the cluster, it pulls the changes and applies them to the cluster.
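The pull loop described above can be sketched in a few lines of Python. This is a minimal illustration under stated assumptions, not ArgoCD's implementation: plain dictionaries stand in for the Git repository and the cluster, and the names `diff_states` and `reconcile` are hypothetical. Deleting resources that are missing from Git is folded into the same pass for simplicity.

```python
from typing import Dict

# Hypothetical stand-ins: resource name -> spec (for example an image tag).
# In a real setup the desired state lives in Git and the actual state in the cluster.
State = Dict[str, str]


def diff_states(desired: State, actual: State) -> Dict[str, str]:
    """Return the action needed per resource to make actual match desired."""
    actions: Dict[str, str] = {}
    for name, spec in desired.items():
        if name not in actual:
            actions[name] = "create"
        elif actual[name] != spec:
            actions[name] = "update"
    for name in actual:
        if name not in desired:
            actions[name] = "delete"
    return actions


def reconcile(desired: State, actual: State) -> State:
    """Apply one reconciliation pass and return the new actual state."""
    new_actual = dict(actual)
    for name, action in diff_states(desired, actual).items():
        if action == "delete":
            del new_actual[name]
        else:  # "create" or "update": copy the desired spec
            new_actual[name] = desired[name]
    return new_actual
```

Running `reconcile` repeatedly on a timer is what makes the model pull-based: a second pass over an already synced cluster yields an empty diff and changes nothing.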

Pros:

• Enhanced Security: Sensitive information stays within the cluster, because the cluster pulls changes instead of an external pipeline pushing them in.

• Automatic Sync: Ensures that the cluster state is always in sync with the Git repository.

Cons:

• Learning Curve: Requires a deeper understanding of Kubernetes and GitOps practices.

• Initial Setup: Can be more complex to set up compared to push-based models.

Push-Based Deployment (Azure DevOps & GitHub Actions)

In a push-based model, changes are pushed to the deployment environment when triggered by events such as code commits. The CI/CD pipeline executes and deploys the application.

Pros:

• Ease of Use: More intuitive for developers familiar with traditional CI/CD practices.

• Broad Support: Works well with various environments and tools.

Cons:

• Potential Inconsistencies: The actual state might drift from the desired state if not managed properly.

• Security Risks: The pipeline needs credentials for the target environment, so they have to be stored and managed outside the cluster.
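To make the drift risk concrete, here is a small simulation of a push-based pipeline. It is a sketch only: the dictionaries stand in for the committed manifests and the live environment, and `push_deploy` and `drifted` are hypothetical names rather than part of Azure DevOps or GitHub Actions.

```python
from typing import Dict, List

State = Dict[str, str]  # resource name -> spec (for example an image tag)


def push_deploy(cluster: State, manifests: State) -> None:
    """Simulate one pipeline run: apply the committed manifests to the cluster."""
    cluster.update(manifests)


def drifted(cluster: State, committed: State) -> List[str]:
    """Names of resources whose live spec differs from what was committed."""
    return sorted(name for name, spec in committed.items() if cluster.get(name) != spec)


cluster: State = {}
committed: State = {"web": "v1"}

push_deploy(cluster, committed)       # triggered by a commit
cluster["web"] = "manual-hotfix"      # someone changes the live system by hand
still_drifted = drifted(cluster, committed)  # nothing corrects this until the next run
```

In this model the drift on `web` stays in place until another event triggers the pipeline, which is the gap a continuously reconciling pull model is meant to close.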

Round 3: Integration and Ecosystem

ArgoCD

ArgoCD is tightly integrated with Kubernetes and excels in environments where Kubernetes is the primary platform. It integrates well with other cloud-native tools like Prometheus, Grafana, and various service meshes.

Azure DevOps & GitHub Actions

These tools boast a rich ecosystem with extensive integrations across various platforms and services, including cloud providers (Azure, AWS, GCP), container registries, and monitoring tools.

The Hybrid Approach: Best of Both Worlds

As highlighted in the recommendations from the Catalyst team, a hybrid approach leveraging both GitHub Actions for CI and ArgoCD for CD can offer the best of both worlds. This strategy allows teams to:

• Utilize GitHub Actions for building, testing, and initial deployment stages across diverse environments.

• Adopt ArgoCD for Kubernetes-specific deployments, ensuring state consistency and security.
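One common way to wire the two together (an assumption here, not something the recommendations above spell out) is for the CI job to finish by committing a new image tag to the manifests in Git, leaving ArgoCD to pull that change into the cluster. The helper below is a hypothetical sketch of that last CI step in Python; `set_image_tag` and the registry name are invented for illustration.

```python
import re


def set_image_tag(manifest: str, image: str, new_tag: str) -> str:
    """Return the manifest text with `image: <image>:<tag>` pointing at new_tag."""
    pattern = re.compile(r"(image:\s*" + re.escape(image) + r"):[\w.\-]+")
    # A function replacement avoids backslash surprises in new_tag.
    return pattern.sub(lambda match: f"{match.group(1)}:{new_tag}", manifest)


manifest = (
    "containers:\n"
    "  - name: web\n"
    "    image: registry.example.com/web:1.0.0\n"
)
updated = set_image_tag(manifest, "registry.example.com/web", "1.0.1")
```

The CI side never talks to the cluster in this arrangement; it only writes to Git, so the pull model's security and consistency properties are preserved.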

Conclusion

The choice between ArgoCD and traditional CI/CD tools like Azure DevOps and GitHub Actions ultimately depends on your specific needs and environment. If your operations are Kubernetes-centric and you prioritize security and state consistency, ArgoCD is a robust choice. However, for diverse environments requiring flexibility and ease of use, Azure DevOps and GitHub Actions remain strong contenders.

By understanding the strengths and trade-offs of each approach, you can design a CI/CD pipeline that not only meets your operational requirements but also enhances the developer experience. Whether you choose ArgoCD, traditional push architectures, or a hybrid approach, the key is to align your tools with your development goals, ensuring efficient and reliable software delivery.

References

• CI/CD for AKS apps with GitHub Actions and GitFlow – Microsoft Learn

• DevOps Topologies – DevOps Topologies

• ArgoCD – Docs

By following this integrated approach, you can leverage the strengths of both GitHub Actions and ArgoCD, ensuring efficient and secure CI/CD processes tailored to your needs. This strategy promotes scalability, security, and developer productivity while accommodating the diverse requirements of modern software development.

Hope you enjoyed this detailed exploration of ArgoCD vs. traditional CI/CD push architectures. Keep experimenting, stay curious, and happy deploying!

Intro

One of the key pillars of SRE is being able to make quantitative decisions based on key metrics.

The major challenge is deciding which metrics are key, and the plethora of monitoring software out in the wild today is testament to that.

At a foundational level you want to ensure your services are always running; however, 100% availability is not practical.

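interface DevOps { void measureEverything(); }

class SRE implements DevOps
{
  // Quantitative decisions start with measuring everything.
  public void measureEverything() { }
}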

Availability/Error Budget

You then figure out what availability is practical for your application and services.

Your error budget is then the downtime you can afford to have, e.g. a 99% availability target allows 7.2 hours of downtime in a 30-day month.
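The arithmetic behind that figure is short enough to write down. The function below is a minimal sketch; the name `error_budget_hours` and the 30-day default are assumptions made for the example.

```python
def error_budget_hours(availability_target: float, period_days: int = 30) -> float:
    """Hours of downtime allowed in the period for a given availability target."""
    if not 0.0 < availability_target <= 1.0:
        raise ValueError("availability_target must be in (0, 1]")
    total_hours = period_days * 24
    return total_hours * (1.0 - availability_target)


budget_99 = error_budget_hours(0.99)    # about 7.2 hours per 30 days
budget_999 = error_budget_hours(0.999)  # about 0.72 hours, roughly 43 minutes
```

Each extra nine cuts the budget by a factor of ten, which is why the target has to be chosen for what is practical rather than what sounds impressive.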

SLAs, SLOs and SLIs

This will be the starting point on your journey to implementing quantitative analysis of your:

Service Level Agreements
Service Level Objectives
Service Level Indicators

This blog post is all about how you can measure Service Level Objectives without breaking the bank. You do not need to spend millions of dollars on bloated monitoring solutions to observe key metrics that really impact your customers.

Just like baking a cake, these are the ingredients we will use to implement an agile, scalable monitoring platform that is solely dedicated to doing one thing well.

Outcome

This is what we want our cake to deliver:

Measuring your SLA Compliance Level
Measuring your Error Budget Burn Rate
Measuring if you have exhausted your error budget
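These three measurements reduce to simple arithmetic on request counts. The sketch below assumes the counts cover the whole SLO window and uses the common definition of burn rate as the observed error rate divided by the allowed error rate. `evaluate_slo` and `SloReport` are hypothetical names, not part of any tool mentioned in this post.

```python
from dataclasses import dataclass


@dataclass
class SloReport:
    compliance: float       # observed SLI: fraction of good requests
    burn_rate: float        # observed error rate / allowed error rate
    budget_exhausted: bool  # True once errors exceed the allowed count


def evaluate_slo(good: int, total: int, slo_target: float) -> SloReport:
    """Summarise compliance, burn rate and budget state for one SLO window."""
    if total <= 0 or not 0 <= good <= total:
        raise ValueError("need total > 0 and 0 <= good <= total")
    if not 0.0 < slo_target < 1.0:
        raise ValueError("slo_target must be strictly between 0 and 1")
    bad = total - good
    allowed_error_rate = 1.0 - slo_target
    return SloReport(
        compliance=good / total,
        burn_rate=(bad / total) / allowed_error_rate,
        budget_exhausted=bad > total * allowed_error_rate,
    )


healthy = evaluate_slo(good=9950, total=10000, slo_target=0.99)
breached = evaluate_slo(good=9800, total=10000, slo_target=0.99)
```

A burn rate of 1 means the budget is being spent exactly as fast as the target allows; anything well above 1 means it will run out before the window ends.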

Service Level Compliance – SLAs -> SLOs -> SLIs

If you look at the cake above, you can see all your meaningful information in one dashboard.

Around 11am the error budget burn rate went up. (Akin to your kids spending all their pocket money in one day!)
Compliance was breached (99% availability) – The purple line (burn rate) went above the maximum budget (yellow line)

These are metrics you will want to ALERT on at any time of the day. These sorts of metrics matter. They are violating a Service Level Agreement.

What about my other metrics?

Aaaah, like %Disk Queue Length, Processor Time, Kubernetes Nodes/Pods/Pools etc? Well…

I treat these metrics as second-class citizens. Think of a layered onion. Your first question should be: am I violating my SLA? If not, then you can use the other metrics that we have enjoyed over the decades to complement your observability into the systems and as a visual aid for diagnostics and troubleshooting.

Alerting

Another important consideration is the evolution of infrastructure. In 1999 you would have wanted to receive an alert if a server ran out of disk space. In 2020, you are running container orchestration clusters and other high availability systems. A container running out of disk space is not as critical as it was in 1999.

Evaluate every single alert you have and ask yourself: do I really need to wake someone up at 3am for this problem?
Always alert on Service Level Compliance levels ABOUT to breach
Always alert on Error Budget Burn Rates going up sharply
Do not alert someone out of hours because the CPU is at 100% for 5 minutes, unless Service Level Compliance is being affected too
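The rules above can be expressed as a single paging decision. This is a sketch with made-up thresholds: `should_page`, `warning_margin` and `burn_rate_threshold` are hypothetical and would need tuning for a real service. Note that CPU usage is deliberately not an input.

```python
def should_page(
    compliance: float,
    slo_target: float,
    burn_rate: float,
    warning_margin: float = 0.001,     # assumed: how close to the target counts as "about to breach"
    burn_rate_threshold: float = 2.0,  # assumed: what counts as "going up sharply"
) -> bool:
    """Page out of hours only when service level compliance is at risk."""
    about_to_breach = compliance <= slo_target + warning_margin
    burning_fast = burn_rate >= burn_rate_threshold
    return about_to_breach or burning_fast


quiet_night = should_page(compliance=0.9995, slo_target=0.99, burn_rate=0.5)
near_breach = should_page(compliance=0.9905, slo_target=0.99, burn_rate=0.5)
fast_burn = should_page(compliance=0.9995, slo_target=0.99, burn_rate=3.0)
```

A node pegged at 100% CPU only reaches the on-call engineer through this function if it drags compliance down or pushes the burn rate up.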

You will have happier engineers and a more productive team. You will be cool-headed during an incident because you know the difference between a cluster node going down and a Service Level Compliance violation. Always solve the Service Level Compliance problem first and then fix the other problems.

Ingredients

Where are the ingredients you promised? You said it will not break the bank; I am intrigued.

A Kubernetes cluster – Google Kubernetes Engine, Azure Kubernetes Service etc
ArgoCD – Argo CD is a declarative, GitOps continuous delivery tool for Kubernetes.
Prometheus
Service Level Operator

Summary

In this post we have touched on the FOUNDATION of what we want out of monitoring.

We know exactly what we want to measure – Service Level Compliance, Error Budget Burn Rate and Max Budget. All this revolves around deciding on the level of availability we give a service.

We touched on the basic ingredients that we can use to build this solution.

In my next blog post we will discuss how we mix all these ingredients together to provide a platform that is good at doing one thing well.

Measuring Service Level Compliance & Error Budget Burn Rates

When you give your child $30 a month to spend and they need to save $10 of it, you need to be alerted if they are spending too fast (Burn Rate).

Romiko Derbynew

ArgoCD vs. Traditional CI/CD Push Architectures: A Modern DevOps Showdown

Site Reliability Engineering with Gitops & Kubernetes – Part 1