Adopt
1
ARM Architecture
2
CaaS
"Container as a Service" services enable you to get closer to "Serverless" without fundamentally impacting the way you organize or develop your applications.
Today, most Cloud Providers offer a CaaS (e.g., Cloud Run on GCP, Lambda or ECS on AWS, Container Apps on Azure). These are ideal for teams that do not require extensive customization. Some of their features (e.g., scale to zero) enable significant maintenance and cost savings.
Using a CaaS service today can be a great first step in modernizing your applications or creating a new one.
CaaS is a real alternative to:
Another advantage of CaaS is that it is, by definition, less vendor-locked than other hosting services. Your application is packaged via a market standard (an OCI image) and can be deployed on any service supporting these OCI images, such as a future Kubernetes cluster.
We've observed two distinct behaviors when these services are used: either they are fully adopted and perfectly suited to our customers' needs, or they lack customization options and our customers are naturally pushed towards Kubernetes. For example, a year ago Cloud Run didn't support keeping the CPU "always allocated," so we couldn't use it for applications with background processes; this is no longer an issue today.
These services have become indispensable, and we recommend them. If using a Cloud Provider isn't an option, it is always possible to build your own CaaS with open-source technologies like Knative.
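To make this concrete, here is a minimal sketch of a self-hosted CaaS workload on Knative; the application name, registry, and port are hypothetical:

```yaml
apiVersion: serving.knative.dev/v1
kind: Service
metadata:
  name: hello-app                                      # hypothetical application name
spec:
  template:
    metadata:
      annotations:
        autoscaling.knative.dev/min-scale: "0"         # allow scale to zero when idle
    spec:
      containers:
        - image: registry.example.com/hello-app:1.0.0  # any OCI image
          ports:
            - containerPort: 8080
```

Because the workload is just an OCI image behind a standard spec, the same image could later move to Cloud Run or a Kubernetes cluster unchanged.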
3
Karpenter
Karpenter is a node autoscaler for Kubernetes. It differentiates itself from its peers by letting you define its autoscaling configuration within the cluster itself.
In the world of node autoscaling for Kubernetes, two significant solutions exist today: the Kubernetes Cluster Autoscaler (KCA) and Karpenter.
AWS is the last major Cloud Provider not to offer managed node autoscaling. Its response to this gap was the development of Karpenter.
Unlike KCA, Karpenter does not rely on the Cloud Provider to create node groups; it manages node provisioning itself through CRDs. Combined with a GitOps tool such as ArgoCD, Karpenter brings a new level of flexibility to node configuration.
As soon as at least one pod is in the Pending state and the Kubernetes scheduler cannot assign it to a node, Karpenter provisions the right capacity to accommodate the workload. This may involve one or more nodes with sufficient characteristics.
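As an illustration, here is a minimal sketch of such a CRD using the v1beta1 NodePool API (the API has evolved across Karpenter versions, and the values below are hypothetical):

```yaml
apiVersion: karpenter.sh/v1beta1
kind: NodePool
metadata:
  name: default
spec:
  template:
    spec:
      requirements:
        - key: karpenter.sh/capacity-type
          operator: In
          values: ["spot", "on-demand"]   # let Karpenter favor cheaper spot capacity
        - key: kubernetes.io/arch
          operator: In
          values: ["amd64", "arm64"]
      nodeClassRef:
        name: default                     # references an EC2NodeClass on AWS
  limits:
    cpu: "100"                            # hard cap on total provisioned CPU
```

Stored in Git and applied by a GitOps tool, this object carries the entire node configuration, with no node groups to manage in the Cloud Provider's console.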
In addition, Karpenter comes with some exciting features:
4
KEDA
KEDA is an open-source component that enables resources deployed in Kubernetes to be scaled based on external events.
One of our main objectives as DevOps engineers is to ensure that infrastructure can absorb load quickly. However, scaling resources in anticipation is difficult when we rely only on CPU and RAM consumption. This is where KEDA (Kubernetes Event-Driven Autoscaling) makes the task much easier.
KEDA is a component that extends Kubernetes' event-based autoscaling capabilities.
It monitors event sources such as Kafka, RabbitMQ, Azure Service Bus, AWS SQS, and GCP Pub/Sub, and triggers application scaling based on their event load. It is therefore possible to start scaling a resource when many messages arrive upstream of a stream, or even to scale to zero when no messages are present.
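For instance, here is a minimal sketch of a KEDA ScaledObject that scales a Deployment on RabbitMQ queue length; the names and the 50-message target are hypothetical:

```yaml
apiVersion: keda.sh/v1alpha1
kind: ScaledObject
metadata:
  name: worker-scaler              # hypothetical name
spec:
  scaleTargetRef:
    name: worker                   # the Deployment to scale
  minReplicaCount: 0               # scale to zero when the queue is empty
  maxReplicaCount: 20
  triggers:
    - type: rabbitmq
      metadata:
        queueName: jobs
        mode: QueueLength
        value: "50"                # aim for ~50 pending messages per replica
      authenticationRef:
        name: rabbitmq-auth        # TriggerAuthentication holding the connection string
```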
KEDA is an excellent choice for event-driven autoscaling. There are alternatives, such as proprietary solutions offered by Cloud Providers like AWS Lambda, Azure Functions, or Google Cloud Functions. However, KEDA stands out for its open-source approach and compatibility with Kubernetes.
KEDA is therefore a powerful and flexible tool for managing event-driven autoscaling in Kubernetes. We recommend it if you're looking for a solution to manage the event-driven scalability of your containerized applications efficiently.
5
Kubernetes
Kubernetes is the current community standard for scalable containerized application deployments.
Deploying an application in production involves solving a number of technological challenges, such as:
Before Kubernetes and the container era, we would have used an army of Bash or Python scripts, Ansible playbooks, or manual setup to ensure application deployment. Today, all of this is elegantly replaced by a CI pipeline that produces container images and Helm charts for deployment.
We would also have used a separate VIP system for load balancing. This is now supported natively by Kubernetes through Service and Ingress resources.
Managing compute capacity and installing external tools is also made easier by Kubernetes' YAML interface and the ecosystem of tools that has emerged around it, such as Kustomize and Helm.
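As a minimal sketch of this declarative model (the application name, image, and ports are hypothetical), a Deployment and a Service together replace both the deployment scripts and the VIP system:

```yaml
# Declares the desired state; Kubernetes reconciles it continuously
apiVersion: apps/v1
kind: Deployment
metadata:
  name: web
spec:
  replicas: 3
  selector:
    matchLabels: { app: web }
  template:
    metadata:
      labels: { app: web }
    spec:
      containers:
        - name: web
          image: registry.example.com/web:1.0.0   # any OCI image
          ports:
            - containerPort: 8080
---
# Load-balances traffic across the pods, replacing a separate VIP system
apiVersion: v1
kind: Service
metadata:
  name: web
spec:
  selector: { app: web }
  ports:
    - port: 80
      targetPort: 8080
```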
Even so, Kubernetes maintenance does entail a certain burden, even when using a Cloud Provider's managed service. A CaaS is lighter to maintain but less extensible: it is often not possible to install additional controllers on a CaaS.
6
BCP
A business continuity plan, or BCP, is the most pragmatic cloud architecture pattern for ensuring high resilience, exploiting the strengths of cloud providers.
A business continuity plan (BCP) guarantees infrastructure availability in the event of a disaster. It is essential for any infrastructure wishing to maintain a high level of availability and deliver an uninterrupted user experience. Before designing an architecture and considering implementing a BCP, you need to estimate a Recovery Time Objective (RTO) and a Recovery Point Objective (RPO) for your application.
Public Cloud Providers offer several ways of guaranteeing a BCP. For example:
Depending on the objectives, the application will be hosted on:
At Padok, we recommend implementing a BCP rather than a DRP (Disaster Recovery Plan). DRPs are costly to test and rarely result in levers that can actually be activated, and we are convinced that it is vital for continuity to be handled at the Cloud Provider level.
The more resilient and partition-tolerant an infrastructure becomes, the more complicated data consistency becomes. This is a consequence of the CAP theorem: a distributed system cannot simultaneously guarantee consistency, availability, and partition tolerance.
7
Synthetic Monitoring
A technique for monitoring your applications that involves simulating a real user with robots to detect malfunctions.
Synthetic Monitoring is a technique for monitoring your applications that involves simulating a real user with robots to detect malfunctions. It contrasts with the classic but now outdated technique of simply checking infrastructure availability, which no longer makes sense in highly distributed cloud architectures where self-healing is present by design.
Generally speaking, priority is given to testing the critical paths of an application: the user journeys that represent the greatest business value. For an online sales site, for example, this would be the checkout funnel:
This test will ensure that your customers can actually buy on your site. It also validates that your backend services, such as search, session storage, etc., are all operational during the test.
From a simple call to a backend API to a multi-step scenario, many tools are available for Synthetic Monitoring. The choice is vast, from paid tools such as Datadog, New Relic, and Dynatrace to open-source tools like Blackbox Exporter (from the Prometheus stack). Note, however, that it's best to run these scenarios from outside the infrastructure, to position yourself as a "real" client of your application and thus detect network malfunctions (outages, latency).
Beware of external dependencies, however: they can generate alerts you can't act on. It is nevertheless important to know when one of your providers is unavailable. In that case, we recommend defining specific procedures for such events and implementing a circuit-breaker system in your applications: the application automatically switches to maintenance mode, and the incident can be handled in a dedicated way during on-call periods. You could, for example, choose to be alerted only if the service has been inaccessible for more than an hour.
Application monitoring should be a standard part of your development cycle. As with security and performance, you need a monitoring phase to ensure your services run smoothly, and it should not be neglected as the application evolves. You should never go into production without probes to warn you of malfunctions: nothing is more frustrating than learning about an outage from your customers.
We recommend an "automated" approach that creates probes for each deployed application. This is what we do on our projects, thanks to the flexibility of the Kubernetes API and tools such as Blackbox Exporter or Datadog via Crossplane. As a result, none of our projects goes into production without monitoring that validates the service is actually being delivered.
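As an illustration of this approach, here is a minimal sketch of a prometheus-operator Probe asking a Blackbox Exporter to check a critical endpoint (the URLs and module name are hypothetical):

```yaml
apiVersion: monitoring.coreos.com/v1
kind: Probe
metadata:
  name: checkout-probe                           # hypothetical name
spec:
  jobName: synthetic-checkout
  prober:
    url: blackbox-exporter.monitoring.svc:9115   # assumed exporter address
  module: http_2xx                               # module defined in the exporter's config
  targets:
    staticConfig:
      static:
        - https://shop.example.com/checkout      # the critical path to test
```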
Synthetic Monitoring is essential to ensure that your application is up and running. You shouldn't consider going into production without this kind of monitoring.
Trial
8
k6
k6 is a Kubernetes-native, extensible load-testing framework.
k6 is an extensible load-testing framework developed by Grafana Labs. It enables you to test the resilience of your infrastructure to peak loads on critical routes.
k6 uses JavaScript as its scripting language, which makes it easy for developers to own the scripts run against their applications. k6 itself is developed in Golang, so there's no need to worry about performance.
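A minimal sketch of a k6 scenario (the endpoint URL is hypothetical):

```javascript
// load.js
import http from 'k6/http';
import { check, sleep } from 'k6';

export const options = {
  vus: 50,          // 50 virtual users
  duration: '2m',   // sustained for two minutes
};

export default function () {
  const res = http.get('https://app.example.com/api/health');
  check(res, { 'status is 200': (r) => r.status === 200 });
  sleep(1);         // think time between iterations
}
```

Run locally with `k6 run load.js`; the same script can be executed by the operator or in CI.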
k6 supports InfluxDB out of the box for storing metrics, and sending them to Prometheus is also possible. This is enabled by xk6, the tool's extension system, which is what won us over at Padok: you can easily add functionality such as an MQTT module for IoT, RabbitMQ, event-driven scenarios, browser tests, and many more!
k6 also comes with a Kubernetes operator, which is still experimental and does not include a master/worker system by default: each k6 replica sends raw metrics to your storage system without any aggregation logic, which caused us problems when sending metrics to Prometheus. It is then up to you to master Grafana to extract the data you want. What's more, to use extensions with the operator, you'll need to build your own runner images that incorporate them.
However, running k6 locally or from a VM is sufficient for sporadic, simple HTTP tests. A GitHub Action is also available to integrate k6 directly into your CI pipelines. Using the operator is possible too, but it will require additional development to integrate.
9
Locust
Locust is a tool for measuring the performance of your web application.
Locust belongs to the family of "load testing" tools: it lets you describe usage scenarios for your web applications and then play them out with many virtual users.
Performing these tests allows you to:
Locust's strength lies in its simplicity of use and its ability to scale: its master/worker model lets you reach almost any load level you want to validate.
The scenarios are very simple to write, and if you're familiar with Python, you'll have very little trouble writing your first tests.
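A minimal sketch of a Locust scenario (the routes and host are hypothetical):

```python
# locustfile.py
from locust import HttpUser, task, between

class ShopUser(HttpUser):
    # simulated think time between two consecutive tasks of a virtual user
    wait_time = between(1, 3)

    @task(3)
    def browse_catalog(self):
        self.client.get("/products")

    @task(1)
    def view_product(self):
        self.client.get("/products/42")
```

Started with `locust -f locustfile.py --host https://shop.example.com`, the UI then lets you pick the number of virtual users and watch the results in real time.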
Locust also provides a UI with real-time performance dashboards and control over test progress (start/stop).
There are many load-testing tools on the market, but Locust is one of the simplest, and we recommend you try it along with k6.
Assess
10
Nomad
Nomad is a task orchestrator.
The task orchestrator has become one of the pillars of modern infrastructure. With Kubernetes being the best known, it is easy to convince ourselves that it's the only viable option and should be adopted by default.
It's a choice many companies make, as container management with Kubernetes is very mature. However, many companies' infrastructure is heterogeneous (containers, VMs, web services...) and would be very costly to containerize entirely. They can therefore turn to Nomad.
Nomad is a general-purpose task orchestrator created by Hashicorp. In addition to managing containers, Nomad features drivers that support tasks in virtual machines, simple scripts, Java applications, and more. Presenting itself as a simple binary, Nomad is lightweight and easy to install.
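As an illustration, here is a minimal sketch of a Nomad job in its native HCL format (the names and image are hypothetical); swapping the driver is how the same workflow covers non-container workloads:

```hcl
# A job using the Docker driver; exec, java, or qemu drivers fit other workloads
job "web" {
  datacenters = ["dc1"]

  group "app" {
    count = 2

    task "server" {
      driver = "docker"

      config {
        image = "registry.example.com/web:1.0.0"
      }

      resources {
        cpu    = 500   # MHz
        memory = 256   # MB
      }
    }
  }
}
```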
Backed by Consul (HashiCorp's service mesh solution), Nomad can fully federate applications, whatever form they take. This gives Nomad a flexibility and an ability to adapt to existing systems that Kubernetes lacks; migrating to an orchestrator is therefore much less costly than containerizing all your applications first.
Our reservations relate mainly to its lack of integration with Cloud Providers and its lower community adoption compared with Kubernetes. However, we consider it worth evaluating if you have a large on-premises infrastructure or limited resources to dedicate to orchestrating your tasks.
Hold
11
CloudNativePG