In a cloud native world, software developers are no longer only responsible for writing code. Today’s developers must write and package code, deploy the resulting services into production, and make sure that the corresponding applications continue to run correctly once released. To borrow a phrase from the Netflix engineering team, full stack developers are becoming full lifecycle developers. In addition to coding, cloud native developers are now also responsible for shipping and running applications.

For years, operations teams have addressed operational complexity in software applications by adopting control planes that provide appropriate abstractions and aggregation of control. These control planes provide automated cluster management (e.g., cloud-managed Kubernetes control planes), L7 traffic management (e.g., service mesh), and more. Now is the time for developers to adopt a control plane.

What is a Developer Control Plane?

A developer control plane enables developers to control and configure the entire cloud development loop in order to ship software faster.

Years before Amazon Elastic Kubernetes Service (EKS) was released, our customers told us they wanted a service that would simplify Kubernetes management. Many of them were running self-managed clusters on Amazon Elastic Compute Cloud (EC2) and were having challenges upgrading, scaling, and maintaining the Kubernetes control plane. When EKS launched in 2018, it aimed to reduce our customers’ operational burden by offering a managed control plane for Kubernetes. This initially included automated upgrades, patching, and backups, which we often refer to as “undifferentiated heavy lifting.” We analyzed volumes of data to create a control plane that would work for the vast majority of our customers. However, as usage of EKS grew, we discovered there were customers who occasionally exceeded the provisioned capacity of the cluster. When this happened, they had to file a ticket with AWS support to have their cluster control plane resized.

Today, the control plane is scaled automatically when certain metrics are exceeded. At first, we used basic metrics such as CPU and memory for scaling. As we learned how the control plane behaved under different conditions, we adjusted our metrics to make scaling more responsive. Now we use a variety of metrics to scale the control plane, including the number of worker nodes and the size of the etcd database.

These enhancements are a great example of the flywheel effect where AWS releases a feature in response to customer feedback, solicits feedback from end users about its impact, and uses that feedback to continue improving the customer experience. Since introducing control plane auto scaling, we’ve been looking at ways to further improve the scaling experience for our customers. The latest enhancement involves reducing the amount of time it takes to scale the control plane.

Previously, control plane scaling could take as long as 50 minutes. This time was felt most acutely by customers whose requests to the kube-apiserver steadily increased (linear growth). Long scaling delays could cause API and etcd latencies to increase or even cause the API server to become temporarily unresponsive. With our latest updates, the control plane can now scale in 10 minutes or less, which represents a 4x improvement. Multiple engineering teams implemented several changes that together increased the speed. These changes include the following:

Concurrent API server and etcd scale ups: In the past, when the control plane needed to be scaled, we would wait for the control plane nodes to be scaled before scaling etcd. Now when we receive a scale up signal, both these components are scaled in parallel.

Change cooldown to 15 mins after creation: We’ve shrunk the window before a cluster becomes eligible for scale up to 15 minutes.

Blue/green style updates for the api-server: To achieve high availability and meet our Service Level Agreement obligations, Amazon EKS requires that a minimum number of control plane nodes be running at all times. In the past, when the control plane had to be scaled, we would increase the size of the auto scaling group (ASG) and perform a rolling deployment of the instances in the ASG, one at a time. With the latest updates, custom logic performs a blue/green style deployment where all of the new larger instances are launched in parallel.
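
To make the effect of the concurrent scale ups and the blue/green style replacement described above more concrete, here is a minimal back-of-the-envelope sketch in Python. It is not how EKS is implemented. The function names, the instance counts, and the per-instance launch times are all hypothetical, chosen only to show why parallel work shortens wall-clock time. The post does not say how etcd nodes are replaced, so the sketch assumes etcd still rolls one node at a time.

```python
def rolling_minutes(instances: int, minutes_each: float) -> float:
    """Rolling deployment: instances are replaced one at a time, so times add up."""
    return instances * minutes_each


def blue_green_minutes(instances: int, minutes_each: float) -> float:
    """Blue/green style: all new instances launch in parallel, so the
    wall-clock time is roughly that of a single launch."""
    return minutes_each if instances > 0 else 0.0


# Hypothetical inputs, for illustration only (not AWS figures).
API_NODES, API_MINUTES_EACH = 3, 5.0
ETCD_NODES, ETCD_MINUTES_EACH = 3, 4.0

# Old behaviour as described in the post: roll the API server nodes first,
# then scale etcd afterwards. Sequential steps add up.
old_total = (
    rolling_minutes(API_NODES, API_MINUTES_EACH)
    + rolling_minutes(ETCD_NODES, ETCD_MINUTES_EACH)
)

# New behaviour as described in the post: both components scale concurrently,
# and the API server uses a blue/green style launch. The slower component
# determines the total. (Assumption: etcd still rolls one node at a time.)
new_total = max(
    blue_green_minutes(API_NODES, API_MINUTES_EACH),
    rolling_minutes(ETCD_NODES, ETCD_MINUTES_EACH),
)

print(f"sequential + rolling:    {old_total:.0f} min")
print(f"concurrent + blue/green: {new_total:.0f} min")
```

With these made-up inputs the sequential path sums every step, while the concurrent path is bounded by the slowest component. Real durations depend on instance launch times, health checks, and etcd data size, none of which this sketch models.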