Operable

For this quadrant, we have chosen to present tools and technologies that improve day-to-day operations on Cloud Native infrastructures.

Adopt

12

Datadog

13

Grafana

Grafana is an open-source dashboarding platform for all your cloud environments.


Grafana is an open-source dashboarding tool created by the eponymous company, Grafana Labs. It lets you create dynamic, customizable dashboards to monitor and analyze the metrics of your infrastructure.

 

A wide range of data sources can be connected to Grafana: time-series databases such as InfluxDB and Prometheus, Elasticsearch, and even the native monitoring services of your preferred cloud provider. Grafana's intuitive user interface then lets you group the various data into real-time graphs, gauges and bar charts, to name but a few.

 

Grafana is an essential monitoring tool for your Kubernetes clusters, easy to install and configure thanks to its Helm chart. What's more, you can define your dashboards using ConfigMaps. If you'd rather not manage your visualization tool yourself, there's also the Grafana Cloud SaaS offering.

 

With the rise of microservice architectures and the use of the cloud to create complete, complex environments, Grafana has become an essential tool for both operational and development teams.

14

Infrastructure KPIs (RED Method)

15

Kustomize

Kustomize makes it easy to configure your complex deployments in Kubernetes.


Kustomize lets you manage resource configurations for Kubernetes using YAML files with easy-to-use syntax. It also enables you to manage complex configurations for multi-environment applications by applying configurations in successive layers.


Concretely, Kustomize applies patches and overlays on top of a base configuration. This can make complex configurations simpler to manage than with Helm, where you must redefine a values file every time. And all this while keeping your code as DRY as possible!
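To make the base-plus-overlay idea concrete, here is a minimal Python sketch of a recursive merge. It is only an illustration of the principle, not how Kustomize is implemented: real Kustomize patches are written in YAML and follow richer rules (for lists and deletions, for example). The `apply_overlay` helper, the trimmed-down Deployment, and the environment patches are all hypothetical.

```python
from copy import deepcopy


def apply_overlay(base: dict, patch: dict) -> dict:
    """Return a copy of `base` with `patch` merged on top (hypothetical helper)."""
    result = deepcopy(base)
    for key, value in patch.items():
        if isinstance(value, dict) and isinstance(result.get(key), dict):
            # Nested mappings are merged key by key.
            result[key] = apply_overlay(result[key], value)
        else:
            # Scalars and lists are simply replaced by the patch value.
            result[key] = deepcopy(value)
    return result


# Simplified base configuration shared by every environment.
base = {
    "apiVersion": "apps/v1",
    "kind": "Deployment",
    "metadata": {"name": "api", "labels": {"app": "api"}},
    "spec": {"replicas": 1},
}

# Each environment only declares what differs from the base.
staging_patch = {"metadata": {"labels": {"env": "staging"}}}
production_patch = {
    "metadata": {"labels": {"env": "production"}},
    "spec": {"replicas": 5},
}

staging = apply_overlay(base, staging_patch)
production = apply_overlay(base, production_patch)
```

The base is never modified, and each environment stays a small diff against it, which is what keeps the configuration DRY.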


Moreover, Kustomize's generators automatically create a new Kubernetes Secret, under a new hash-suffixed name, whenever you modify a data field in the configuration file. This feature keeps secrets secure while simplifying your rollbacks.
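A minimal sketch of the idea behind this behaviour, assuming the Secret is declared through a Kustomize generator that appends a hash of the content to the resource name. The `generated_name` function, the SHA-256 digest, and the 10-character suffix are assumptions made for illustration; Kustomize's actual hashing scheme differs.

```python
import hashlib
import json


def generated_name(base_name: str, data: dict) -> str:
    """Derive a name from the content, so new data yields a new name (hypothetical)."""
    serialized = json.dumps(data, sort_keys=True).encode("utf-8")
    digest = hashlib.sha256(serialized).hexdigest()
    return f"{base_name}-{digest[:10]}"


old_name = generated_name("db-credentials", {"password": "old-value"})
new_name = generated_name("db-credentials", {"password": "new-value"})
same_name = generated_name("db-credentials", {"password": "old-value"})
```

Because the previous Secret keeps its own name, a workload rolled back to the previous configuration still points at the data it was deployed with.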


However, Kustomize is more challenging to grasp for people unfamiliar with Kubernetes resources and their declaration in YAML. It's also less flexible than Helm in terms of customization: it's impossible to integrate logic via templating, for example.


We believe that Kustomize and Helm are two complementary tools for managing deployment configurations for Kubernetes. While Kustomize is ideal for managing configurations modularly, Helm offers a convenient way of managing more complex application packages.


By using Kustomize with Helm, teams can benefit from the advantages of both tools. For example, Helm can be used to manage complete application packages, while Kustomize can be used to customize specific configurations for deployment environments.


Kustomize is a powerful tool for managing deployment configurations for Kubernetes, with a simple syntax that perfectly complements Helm.

16

OpenTelemetry

17

Prometheus Operator

Prometheus Operator makes it easy to deploy and manage an entire technical stack around Prometheus to monitor a Kubernetes cluster.


Prometheus is the reference tool for collecting metrics on Kubernetes architectures. Prometheus Operator makes it particularly simple to deploy and manage, and it adds other ancillary, but no less necessary, components that improve the operability of your platform:

  • Alerting with Alertmanager
  • HTTP monitoring with Blackbox Exporter
  • Metrics visualization with Grafana

Installing the Prometheus suite takes a single command, and after just a few minutes, you'll have access to all your cluster's metrics and much more. The operator manages Prometheus resources through Kubernetes CRDs, so you don't need to configure your resources in Prometheus itself: you simply declare them in the Kubernetes API.


We automate monitoring by adding these resources to our charts, so that Prometheus monitors each deployed application (metrics or HTTP monitoring, for example).


However, the operator does not solve the major problem with Prometheus: a consolidated, centralized view in multi-environment, multi-cluster architectures. You'll need to deploy components that bridge the gap between different deployments: a central Grafana and solutions like Thanos to increase data retention.


Although other solutions exist, such as Datadog (a paid product), Prometheus remains the de facto community standard for Kubernetes. It will always be a good choice for operating your clusters.

18

Renovate

Automate patch management of external dependencies (libraries) for infrastructure and applications


Patch Management is a significant challenge for platform security. Our infrastructures, as much as the applications deployed on them, use external components (dependencies, very often open source) that must be updated regularly to correct security flaws and bugs. With the rapid pace of updates and the growing number of dependencies, it can be difficult to keep all our dependencies up to date.


Renovate is an open-source dependency management tool that automates updating packages in your projects. It analyzes your dependency configuration files (such as package.json, pom.xml, or build.gradle) and automatically generates pull requests for necessary package updates.


Renovate is compatible with various package managers, including npm, yarn, pip, Maven, and NuGet, making it easily adaptable to different programming languages. It also allows you to analyze the dependencies of your infrastructure, supporting Helm charts, Docker images, and Terraform modules. You can use it with major Git providers such as GitLab, GitHub, Bitbucket, and Azure DevOps.


Highly configurable, it adapts perfectly to the development workflows of our projects. It can be integrated via CI tasks or, our preference, as a CronJob in a Kubernetes cluster to optimize processing with Redis caching. This deployment mode will enable us to scale more easily by deploying several "instances" of Renovate (Kubernetes CronJobs) to spread the load and adapt its operation.


With daily execution and automatic merging of changes (when the CI passes), we automate part of the correction of security flaws by automatically applying patches.


Renovate also enables us to track the evolution of our dependencies by providing us with an overview of the changes (new minor or major versions) that need to be taken into account to keep our dependencies up to date (via the open Merge Request/Pull Request list). We aim to ensure that we don't fall behind on major releases and always benefit from security updates in the long term.
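As an illustration of the distinction between patch, minor, and major updates mentioned above, here is a small Python sketch that classifies an update from two version strings. It is not Renovate's code: Renovate supports many versioning schemes, whereas this hypothetical `update_type` helper assumes plain `MAJOR.MINOR.PATCH` versions.

```python
def parse_version(version: str) -> tuple:
    """Parse a plain MAJOR.MINOR.PATCH string (assumption: no pre-release tags)."""
    major, minor, patch = (int(part) for part in version.split("."))
    return major, minor, patch


def update_type(current: str, candidate: str) -> str:
    """Classify the jump from `current` to `candidate` (hypothetical helper)."""
    cur, new = parse_version(current), parse_version(candidate)
    if new <= cur:
        return "none"  # Nothing newer to propose.
    if new[0] != cur[0]:
        return "major"
    if new[1] != cur[1]:
        return "minor"
    return "patch"


# Hypothetical dependencies: (current version, latest available version).
dependencies = {
    "web-framework": ("2.4.1", "3.0.0"),
    "http-client": ("1.8.0", "1.9.2"),
    "logger": ("0.5.3", "0.5.4"),
    "parser": ("4.2.0", "4.2.0"),
}

pending = {
    name: update_type(current, latest)
    for name, (current, latest) in dependencies.items()
}
```

A list like `pending` plays the same role as the open Merge Request/Pull Request list: it shows at a glance which major releases are still waiting.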


Renovate allows us to reduce the burden of Patch Management and free up time to work on improvements that will bring more value to our customers' businesses.

19

Terraform

Today, Terraform is the leading Infrastructure as Code (IaC) tool on the market. It enables you to provision and manage resources on all cloud providers.

We've created hundreds of infrastructures on several cloud providers and used Terraform every time. An IaC tool is essential when launching into the cloud, as it facilitates collaboration and the operability of an infrastructure.


Terraform shines through with a wide range of features:

  • The use of modules to define sets of resources that meet a precise need and can be easily reused.
  • Terraform state, which tracks the life cycle of each resource
  • Compatibility with multiple clouds and systems thanks to providers, for example AWS and OVH, but also GitHub

We know that cloud providers offer their own IaC tools, such as AWS CDK, CloudFormation, and ARM for Azure. But their vendor lock-in and lack of interoperability made us lean towards Terraform.


Even though Terraform is a leader in IaC for managing infrastructure, the code base still needs a framework. Padok has converged on a WYSIWYG (What You See Is What You Get) pattern that helps standardize code and collaboration.


The points to remember are:

  • Organize Terraform states according to business needs
  • Create modules that meet a complex or reproducible need
  • Don't hesitate to refactor code as your infrastructure evolves

Tips for use 💡

And don't forget the tools for syntax quality and maintainability: terraform fmt, tflint, tfautomv, or terraform-docs.

 

Terraform is today's reference tool for building and maintaining infrastructure in the cloud. Tools such as Terragrunt further enhance its ability to manage infrastructure at scale by offering features that avoid code redundancy and keep the code DRY (Don't Repeat Yourself).

20

Terragrunt

Terragrunt is a tool offered by Gruntwork to enhance Terraform and boost its ability to manage multi-module deployments.


Terraform is the current community standard for as-code deployment of cloud resources. It includes libraries (called "providers") for almost all the resources of the major Cloud Providers.


However, Terraform has its limitations, penalizing teams who need to manage a multi-module infrastructure or large infrastructures. Indeed, in such cases, it is often necessary to split Terraform deployment into several modules (sometimes also called layers) to simplify them or avoid collisions of Terraform states.


This can quickly become very complicated to manage, as it becomes necessary to:

  • Use remote states to share outputs between states
  • Duplicate or over-template backend configurations that are not natively configurable in Terraform
  • Manage common or environment-specific variables, for example, via files and symbolic links

Terragrunt sits on top of this to create and manage auto-generated Terraform workspaces. Its additional features link layers together more cleanly while still relying on Terraform's proven deployment capabilities.
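To illustrate what linking layers implies, here is a small Python sketch that orders layers so that each one is applied after the layers whose outputs it consumes. It only models the dependency ordering, not Terragrunt itself (whose configuration is written in HCL); the layer names and their dependencies are hypothetical.

```python
from graphlib import TopologicalSorter

# Hypothetical layers: each key maps to the layers whose outputs it reads.
layers = {
    "network": set(),
    "database": {"network"},
    "cluster": {"network"},
    "application": {"cluster", "database"},
}

# A layer only appears once everything it depends on has been listed.
apply_order = list(TopologicalSorter(layers).static_order())
```

With this ordering, `network` always comes first and `application` last, while `database` and `cluster` can come in either order since neither depends on the other.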


What's more, Terragrunt is configured using the same language as Terraform, HashiCorp Configuration Language (HCL), which has been extended to add the necessary functionalities. This facilitates team training and reduces the feeling of having a new tool to master.


Today, other tools try to meet this need, but Terragrunt is our favorite because it achieves the result by adding only a very thin layer around Terraform.

21

Terraspace

Trial

22

Atlantis

Atlantis is an application for automating the use of Terraform via pull requests. It provides a workflow for maintaining the consistency of an infrastructure defined in IaC.


Atlantis is an open-source tool that automates contributions to a Terraform code base, allowing you to execute the plan, apply, and import commands directly in the pull request. As a result, you can see the feedback directly in the comments. The application can be hosted anywhere and uses the webhook system of GitHub, GitLab, or Bitbucket.


This solves collaboration problems on large infrastructures and provides a history of modifications made.


More advanced workflows improve the Developer Experience (DevX) even further:

  • Autoplanning for each new commit or pull request, providing a quick overview of the state of the infrastructure, and validating the impact of changes made
  • Auto-merging to merge the pull request once all plans have been applied successfully

However, there is still room for improvement if this tool is to become a benchmark:

  • The server that manages Atlantis holds the credentials to access your infrastructure. Consequently, you need to instantiate several servers to separate access to several infrastructures, which can quickly become complex.
  • Its architecture severely limits it, and scaling it is not straightforward. Other more robust and mature solutions exist if you have large-scale infrastructure needs.

 

Atlantis is a promising tool, but its complex rights management and limited scalability are why we're putting it on "Trial." Interesting features, such as drift detection, are planned in its roadmap and make it worth keeping on the radar.

23

Burrito

24

Custom Kubernetes Operators

Custom operators allow you to automate tasks in Kubernetes by adding functionality to its API.


Kubernetes is very popular as a container orchestrator. But first and foremost, it's an extensible API. You can add new resources to the Kubernetes API and extend its functionality by creating your own operators.


You'll define Custom Resource Definitions (CRDs) when creating your operator. The operator code relies on the Kubernetes reconciliation pattern to trigger actions in your cluster each time an instance of your custom resource is added, modified, or deleted. This can help automate repetitive tasks (reducing your toil) and add custom application functionality. If you're using Kubernetes, you're probably already using custom operators like cert-manager or ArgoCD daily.


In a SaaS environment, for example, each new customer requires the creation of a new tenant. With an operator, it's possible to automatically create all the necessary resources by declaring a new object in your Kubernetes cluster!
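The tenant example can be sketched as a reconciliation loop: compare what is declared with what exists, then create or delete whatever is needed to converge. This Python sketch uses an in-memory stand-in for the cluster rather than the Kubernetes API; the `Cluster` class, the `reconcile` function, and the resources created per tenant are all hypothetical.

```python
from dataclasses import dataclass, field


@dataclass
class Cluster:
    """In-memory stand-in for the cluster: tenant name -> set of resource names."""
    resources: dict = field(default_factory=dict)


def desired_resources(tenant: str) -> set:
    """Resources every tenant should own (hypothetical list)."""
    return {f"namespace/{tenant}", f"database/{tenant}", f"ingress/{tenant}"}


def reconcile(cluster: Cluster, declared_tenants: set) -> list:
    """Converge the cluster towards the declared tenants; return the actions taken."""
    actions = []
    # Create whatever is missing for each declared tenant.
    for tenant in sorted(declared_tenants):
        existing = cluster.resources.setdefault(tenant, set())
        for resource in sorted(desired_resources(tenant) - existing):
            existing.add(resource)
            actions.append(f"create {resource}")
    # Clean up tenants that are no longer declared.
    for tenant in sorted(set(cluster.resources) - declared_tenants):
        for resource in sorted(cluster.resources.pop(tenant)):
            actions.append(f"delete {resource}")
    return actions


cluster = Cluster()
first_pass = reconcile(cluster, {"acme"})   # Tenant declared: resources are created.
second_pass = reconcile(cluster, {"acme"})  # Nothing changed: nothing to do.
third_pass = reconcile(cluster, set())      # Tenant removed: resources are deleted.
```

The second pass returning no action is the important property: a reconciliation loop can run as often as needed without side effects once the desired state is reached.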


However, creating an operator can be complicated: you need to thoroughly understand how Kubernetes works and the lifecycle and different edge cases of what you want to automate. And testing all edge cases is no mean feat. 


It's important to note that you can write your operators in any programming language: Java, Rust... and even Ansible. We recommend Golang: you'll find a wealth of resources to help you, and Red Hat's Operator SDK allows you to bootstrap your code very efficiently.


Creating an operator can offer many benefits to DevOps teams working with Kubernetes. However, it can also be complex, requiring a certain amount of programming and Kubernetes expertise.

25

Excalidraw+

Excalidraw+ is a SaaS virtual whiteboard solution. Its simplicity makes it possible to draw diagrams with the same ease as on a sheet of paper while retaining the ability to store and share them like a Google Doc.


In just 2 clicks, Excalidraw+ creates an unlimited blank page ("Scene") on which you can draw shapes as if on a board. Scenes are grouped into "Collections," to which you can assign team rights in a dedicated workspace.


Excalidraw+'s great strength lies in its simplicity. Only basic shapes (e.g., rectangles, circles, arrows, text boxes) and limited formatting capabilities (e.g., colors, 4 font sizes) are available in the default view.


The result is better day-to-day collaboration, based on many graphical representations and more up-to-date architecture diagrams, because the effort required to maintain them is minimized.


Excalidraw+ lets you make any kind of diagram and collaborate effectively at a distance with visual support. If you need to make a diagram and don't know where to do it, there's no need to hesitate 😉 You can try the free version at excalidraw.com.


However, the tool has the following limitations:

  • Coarse-grained rights management, e.g.:
    • You must be an administrator to manage rights
    • Only teams can be given rights to files
    • A user can only be a member of 6 teams at a time
    • Subfolders cannot be created
  • No SLAs displayed (even though the application is generally always available)
  • No choice of data storage location

These limitations justify putting it on "Trial" instead of "Adopt."

26

Kubernetes Gateway API

Kubernetes Gateway API lets you manage access to Kubernetes services from outside the cluster with a role-oriented approach between Ops and developers.


When we want to expose Kubernetes services outside our cluster, we tend to use Ingress resources. We therefore deploy Ingress Controllers such as those offered by Nginx, Traefik or Kong, which will have their own annotations to direct traffic and manage the Ingresses attached to them. Generally speaking, the developers and Ops in charge of the cluster will be working on these same resources, which can sometimes cause disruptions.


In order to better separate the role of each in managing the exposure of application services, a new concept has recently emerged: the Kubernetes Gateway API. It enables Ops to set up a global gateway at the cluster level (cross-namespace), with an L4 or L7 load balancer as the entry point.


Developers are then free to create their own HTTPRoutes in their namespaces containing their configurations. It's worth noting that these resources natively provide more features, such as header-based matching and traffic weighting.
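To show what header-based matching and traffic weighting mean in practice, here is a minimal Python sketch of the routing decision. It models the concept only: real HTTPRoutes are declared in YAML and evaluated by the gateway implementation. The rule structure, the backend names, and the `x-canary` header are hypothetical, and header matching is simplified to exact, case-sensitive comparison.

```python
import random


def pick_backend(backends: list, rng: random.Random) -> str:
    """Pick one backend at random, proportionally to its weight."""
    names = [name for name, _ in backends]
    weights = [weight for _, weight in backends]
    return rng.choices(names, weights=weights, k=1)[0]


def route(headers: dict, rules: list, rng: random.Random):
    """Return the backend of the first rule whose header matches are all satisfied."""
    for rule in rules:
        matches = rule["match_headers"].items()
        if all(headers.get(name) == value for name, value in matches):
            return pick_backend(rule["backends"], rng)
    return None  # No rule matched this request.


rules = [
    # Requests flagged as canary always reach the new version.
    {"match_headers": {"x-canary": "true"}, "backends": [("api-v2", 100)]},
    # Everything else is split 90/10 between the two versions.
    {"match_headers": {}, "backends": [("api-v1", 90), ("api-v2", 10)]},
]

rng = random.Random(42)
canary_target = route({"x-canary": "true"}, rules, rng)
regular_targets = [route({}, rules, rng) for _ in range(1000)]
```

Across many requests, `regular_targets` contains mostly `api-v1` with a minority of `api-v2`, which is the behaviour you want for a progressive rollout.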


The Kubernetes Gateway API is still relatively new but is gaining popularity due to its ability to simplify route management in complex Kubernetes environments. It also offers greater visibility and control over gateways, making detecting errors and security issues easier.


At Padok, we believe this technology holds great promise for teams looking to simplify route management in Kubernetes environments. As it continues to mature, it should gain popularity and become the reference, even if it means replacing Ingress.


In fact, GCP has integrated it into its GKE service under the name GKE Gateway Controller, and it's in GA!

Assess

27

Grafana Mimir

28

Crossplane

Crossplane is an infrastructure-as-code tool based on Kubernetes. It lets you create cloud resources using Custom Resource Definitions.


Crossplane is an infrastructure-as-code (IaC) technology developed by Upbound. It enables infrastructure resources to be deployed using Kubernetes as a state manager. It works similarly to GCP's Config Connector or AWS Controllers for Kubernetes.


Crossplane is deployed as an operator in Kubernetes. To use it to manage your infrastructure, you'll need to install a dedicated provider, itself declared as a Kubernetes custom resource. The provider then installs CRDs for each cloud resource (for example, for AWS: an EC2 instance, a VPC, a Lambda...).


Combined with GitOps technologies such as ArgoCD, Crossplane can be transformed into a true cloud self-service platform. Using YAML and Kubernetes attributes to define and link your entire infrastructure is highly intuitive. The minimum knowledge required to start using Crossplane is significantly lower than for Terraform.


However, we note a number of drawbacks that prevent us from being entirely confident in the use of Crossplane in production:


  • Modifying an immutable field in a resource does not result in its replacement.
  • There is no native support for sharing information between resources managed by different providers.
  • Depending on a Kubernetes cluster to manage your infrastructure requires impeccable cluster management.
  • Crossplane has no notion of "plan" or "dry-run," as found in other tools.
  • Importing existing resources is a poorly documented and risky process, even though it's a widespread use case.

Today, we use Crossplane to solve specific problems such as MySQL database configuration. We see it as a tool to watch, as it could become a serious competitor in the IaC field by addressing the problems we've mentioned. 

29

Guacamole

30

OpenTofu

31

Pulumi

Pulumi is an Infrastructure as Code tool that uses languages such as Python, Go, and TypeScript. It offers many possibilities but is not yet the default choice for building your own IaC infrastructure.


Terraform may be the leader in IaC, but Pulumi is a serious contender with an approach using languages such as Python or Go. The main advantage of this approach is that it makes it easier to write conditional code, a complex task in Terraform.



Pulumi offers two features that set it apart from other IaC tools:

  • Native providers are automatically updated according to the official Cloud Providers API. So there's no need to wait for the provider to be manually updated following a new feature released on the official API before using it. This means you can quickly take advantage of the latest Cloud Provider features!
  • Secret management by encryption, which enables sensitive data to be written directly into the code. Data is encrypted with keys from providers such as AWS KMS, HashiCorp Vault, or GCP KMS, and decrypted just in time at runtime.

Using Pulumi is a good compromise for teams made up of developers only who want to stick to a familiar language. This is advantageous, as the maintenance processes and best practices in place guarantee code quality. 


However, Pulumi inherits the limitations of the language you choose. For example, managing dependencies with `node_modules` can become cumbersome when scaling the code base.


In conclusion, if you have a specific need to use languages such as Go or Python in your complete stack, getting started with Pulumi will be simpler. However, Pulumi doesn't solve Terraform's fundamental problems, as it's still as complicated as ever to create, organize and maintain IaC code. If you have an existing infrastructure managed with Terraform, we don't advise you to migrate to Pulumi!

32

Terratest

Since infrastructure is the foundation of any robust application, it should be tested! Terratest is one of the few test libraries available for Terraform.


Terratest is the reference for testing Terraform code. Written in Go, this library lets you write unit tests for Terraform and Terragrunt.

Terratest allows you to deploy an infrastructure and then run tests on:

  • HTTPS calls to check whether load balancers are working properly
  • API requests to check API gateway responses
  • SSH connections to bastion servers 
  • The right status for a resource 
  • Error-free terraform apply

These tests are becoming essential for growing infrastructures, as errors can quickly appear due to Terraform or provider version changes, and above all, to guarantee the non-regression of existing functionalities.


We position it as a "Trial" because it is not yet an industry standard. In particular, it is not used by the open-source modules maintained by Cloud Providers. The main obstacles are:

  • Designing a test that validates the operation of the infrastructure is neither an easy nor a well-documented process
  • The availability of an environment to carry out these tests generates additional costs, even if over a short period of time

As a side note, we've also used it to test our Helm packages, and we're delighted!