SecDevOps.com
The Bare Metal Myth: Why VMs Now Win for Containers

The New Stack · Updated today

There are strong opinions about the best way to deploy containers. The purist wants to run containerized applications on pure bare metal servers: without the abstraction layer of virtual machines (VMs) between the application and the CPUs and GPUs, application management may be harder, but that doesn't matter when every millisecond of latency counts. That is the story of bare metal. The pragmatist, by contrast, supports a massive team of developers who are familiar with the public cloud experience. The pragmatist cares most about making the business successful, which means weighing pros and cons case by case rather than by tradition.

Some organizations have kept low-latency applications on bare metal out of concern about VM performance. However, virtualization has been on the market for more than 25 years, and in recent years VM performance has largely caught up with bare metal. That is why organizations have been running mission-critical applications, including large databases, in VMs for the last 15 years. While the performance gap between VMs and bare metal is now negligible, many use cases and features, for manageability, security, isolation and other benefits, still lend themselves more naturally to either VMs or bare metal infrastructure. Ultimately, though, VMs offer operational, security and isolation benefits that outweigh the negligible performance differences for most enterprise applications.

A Brief History of Virtualization: From Bare Metal to VMs

Around 20 or 25 years ago, every application ran on physical hardware: bare metal servers and large mainframes. With the disruptive arrival of virtualization, enterprises initially hesitated over the performance implications of migrating their existing systems.
While it's true that early virtualization technology was not mature enough to deliver the performance required by enterprise applications, this has changed significantly over time. Hardware vendors such as Intel built virtualization assists into their CPUs, providing hardware-assisted virtualization and a near "pass-through" experience. This lets guest instructions execute directly on the CPU rather than being mediated in software, overcoming most of the latency once associated with the layer between the application and the hardware.

From a purist's perspective, a hypervisor will always introduce some latency. However, the customers most likely to be affected, such as those in financial trading or telecommunications, where nanoseconds are critical, are likely still running their applications on bare metal and have never virtualized them. For the vast majority of applications, such as web services, e-commerce and streaming platforms, the operational benefits of virtualization outweigh the performance gains of bare metal. Many of these services were born in the cloud, running on VMs, where they were designed to deliver high-quality service.

The core of the matter is that while virtualization does introduce a layer of latency on paper, the operational efficiencies it provides are substantial. Spinning up a Kubernetes cluster on bare metal takes an organization significantly longer than on a virtualized platform. This is one of the reasons the pragmatist's perspective wins in most cases.

Why VMs Excel for Running Kubernetes at Scale

As organizations seek alternatives to established virtualization platforms, they are increasingly virtualizing containers, which highlights the enduring benefits and foundational strength of traditional enterprise VMs. A major reason is that running Kubernetes at scale is operationally simpler and more performant when it is deployed within VMs.
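As an aside on the hardware assists mentioned earlier: on Linux, Intel VT-x advertises itself through the `vmx` CPU flag and AMD-V through `svm`, both visible in /proc/cpuinfo. A minimal sketch (Linux-specific; the helper function just parses flag lines):

```python
# Minimal sketch: detect hardware virtualization support from CPU flags.
# Intel VT-x reports the "vmx" flag; AMD-V reports "svm".

def has_hw_virtualization(cpuinfo_text: str) -> bool:
    """Return True if any CPU reports the vmx (Intel) or svm (AMD) flag."""
    for line in cpuinfo_text.splitlines():
        if line.startswith("flags"):
            flags = line.split(":", 1)[1].split()
            if "vmx" in flags or "svm" in flags:
                return True
    return False

if __name__ == "__main__":
    try:
        with open("/proc/cpuinfo") as f:
            print("hardware virtualization:", has_hw_virtualization(f.read()))
    except FileNotFoundError:
        print("not a Linux host; /proc/cpuinfo unavailable")
```

If the flag is absent, either the CPU lacks the feature or it is disabled in firmware; modern hypervisors rely on it being present.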
These operational advantages are why the largest hyperscalers (Amazon, Google and Microsoft) operate their Kubernetes services on VMs. A common perception is that a VM's abstraction layer inherently slows down performance. But this overlooks the potential for better performance when running a high volume of containers within VMs, especially as CPUs grow in size and power. Here are some things to consider:

- Kubernetes defaults to a maximum of 110 pods per node, and that number has not changed as core counts per node have gone up. You can run more, but you may hit performance and networking issues that Kubernetes wasn't designed for. Running VMs on larger nodes with enterprise hypervisors (which, not incidentally, have been handling these issues for decades) alleviates this.
- VMs provide far better isolation between tenants. The advantages of isolation are legion: stronger security and data protection, better reliability and performance, improved resource management, greater regulatory compliance and easier troubleshooting.
- VMs give you dynamically sized clusters, a powerful feature in Kubernetes. As your needs change, you can instantly make clusters larger or smaller on the fly to optimize resources.
- Running Kubernetes in VMs lets you run VMs and containers on the same hardware. Maintaining silos between hardware running Kubernetes and hardware running VMs can complicate networking and security and lead to poor hardware utilization.
- As an operating system (OS) manages more parallel processes, contention grows, which hurts performance, especially at higher core counts. A hypervisor splits the CPU across multiple VMs, so each OS handles fewer tasks, reducing contention. When running a large number of containers on a single server, splitting it into VMs may therefore improve overall performance.

Bare Metal vs. Virtualization: Which Is Best for Specific Use Cases?

Running highly available Kubernetes on bare metal typically requires a minimum of seven physical servers, echoing the data center landscape of 25 years ago. In that era, organizations sought to consolidate underutilized servers that consumed excessive energy, power, space and cooling. This drive for consolidation led to the widespread adoption of virtualization, to the point where today many critical databases run on VMs without performance complaints.

The journey of virtualization illustrates a key principle: There is a definite place for bare metal, particularly for organizations that never virtualized certain high-performance applications. For the other 90% to 95% of workloads, however, the benefits of virtualization remain compelling. A flexible platform that can run Kubernetes on both VMs and bare metal, managed from a single layer, offers freedom of choice: workloads run where it makes the most sense, without being forced into a one-size-fits-all solution.

Latency and Volume

The use cases that lend themselves to bare metal are typically low-latency applications, such as real-time stock market transactions, where any delay can have significant financial implications. In contrast, streaming video, which uses a queue and can tolerate a few seconds of delay, does not fall into this category. It's also worth noting that web, e-commerce and streaming applications were born in the cloud running on VMs.

Another critical factor is the volume of consolidation. For a telecommunications company running at massive scale, the possibility of a small percentage of performance loss from virtualization is a more than fair trade against the tremendous operational cost of running on bare metal.

Cost and Utilization

It's common knowledge that major hyperscalers and cloud providers run their containerized infrastructure on VMs. This is driven by a balance of security and cost.
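A back-of-the-envelope sketch makes the utilization argument concrete. All numbers here are hypothetical (server and VM core counts are assumptions, not data from the article): a workload that needs only a few cores pays for a whole server on bare metal, but only for the smallest VM shape that fits on a virtualized platform.

```python
# Back-of-the-envelope utilization sketch (all numbers hypothetical):
# cores paid for on whole bare metal servers vs. right-sized VMs.
import math

SERVER_CORES = 64          # cores per physical server (assumed)
VM_SIZES = [4, 8, 16, 32]  # available VM shapes, in cores (assumed)

def bare_metal_cores(needed_cores: int) -> int:
    """Cores you pay for when the workload gets whole physical servers."""
    servers = math.ceil(needed_cores / SERVER_CORES)
    return servers * SERVER_CORES

def vm_cores(needed_cores: int) -> int:
    """Cores you pay for with the smallest VM shape that fits."""
    for size in VM_SIZES:
        if size >= needed_cores:
            return size
    # larger than the biggest shape: take several of the biggest
    return math.ceil(needed_cores / VM_SIZES[-1]) * VM_SIZES[-1]

for demand in (6, 20, 30):
    print(f"{demand:>3} cores needed -> bare metal pays for "
          f"{bare_metal_cores(demand)}, VMs pay for {vm_cores(demand)}")
```

Under these assumed shapes, a 6-core workload pays for 64 cores on bare metal but only 8 in a VM; the gap is what cloud autoscalers optimize away automatically.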
While these providers do support bare metal, offering it to every customer would require an immense amount of data center space, power and cooling. Bare metal options are therefore typically reserved for customers with specific performance or security needs, such as compliance requirements that are simpler to satisfy on a dedicated physical server.

The arguments that bare metal is easier, involves less licensing and saves money often don't hold up in practice, especially once operations are factored in. Managing virtual infrastructure with containers requires fewer personnel than managing a full bare metal container infrastructure. You also have to install the OS on each bare metal system, which takes time and requires tools like PXE boot that add complexity. With VMs, you can use standard VM life cycle tooling (snapshots, cloning, migration) to manage container hosts more easily. For most customers, paying for an entire physical server that is not fully utilized is not cost-effective, which is why they usually opt for appropriately sized VMs. There are even open source autoscalers, such as Karpenter and Cluster Autoscaler, that automatically provision and scale VMs based on cost efficiency and workload requirements.

Scalability

Virtualization simplifies dynamic scaling, a core value proposition of Kubernetes. If an application experiences a sudden surge in users and needs more containers but lacks compute capacity, a virtualized environment can spin up new VMs on available resources across multiple servers. In a bare metal environment, this would require spare physical servers sitting idle, waiting for such a contingency. With virtualization, multitenancy is simpler, and you can also run container runtimes side by side.
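The pods-per-node and scaling points above can be sketched with simple arithmetic. The 110-pods-per-node figure is Kubernetes' documented kubelet default; the server and VM sizes are hypothetical:

```python
# Sketch of the pods-per-node argument. 110 is the kubelet default cap;
# the 256-core server and the 4-VM split are hypothetical examples.
import math

MAX_PODS_PER_NODE = 110  # kubelet default; tunable, but raising it has caveats

def schedulable_pods(vm_count: int) -> int:
    """Pod capacity when one big server is carved into vm_count VM nodes.

    The cap is per node, not per core, so a huge single-node server is
    limited to 110 pods regardless of how many cores it has.
    """
    return vm_count * MAX_PODS_PER_NODE

def vms_needed_for(pods: int) -> int:
    """VM nodes required to absorb a surge of `pods` pods at the default cap."""
    return math.ceil(pods / MAX_PODS_PER_NODE)

# A 256-core server as a single node schedules at most 110 pods;
# carved into 4 VMs of 64 cores each, the same hardware schedules 440.
print(schedulable_pods(1))   # 110
print(schedulable_pods(4))   # 440
print(vms_needed_for(500))   # 5 nodes to absorb a 500-pod surge
```

This is the hypervisor version of the contention argument: the same silicon yields more schedulable capacity when presented to Kubernetes as several moderately sized nodes instead of one enormous one.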
Also, when building in one location and deploying in another (from dev to test to staging to prod), each VM can run a known, hardened base image for container orchestration, avoiding the hardware and driver mismatches of bare metal. Scalability is a primary reason why many prefer the cloud for container deployments, as it is harder to achieve in an on-premises model. For companies running...

Source: This article was originally published on The New Stack
