July 20, 2020


Thinking about two different models of virtualization hosts

The conventional way to make a significant commitment to on-premises virtualization is to buy a few big host machines that will each run a lot of virtual machines (possibly with some form of shared storage and virtual machine motion from one server to another). In this model you're getting advantages of scale and also of fluctuating usage of your virtual machines over time; probably not all of them are active at once, so you can to some degree over-subscribe your host.
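As a rough illustration of why over-subscription works (with entirely made-up numbers, since real sizing depends on your workloads), here is a sketch of the difference between sizing a host by allocated RAM and sizing it by typical usage:

```python
# Hypothetical numbers: each VM is allocated 8 GB of RAM, but on
# average only uses a fraction of that at any one time.
host_ram_gb = 128
vm_allocated_gb = 8
vm_typical_usage_gb = 3  # assumed average working set per VM

# Without over-subscription, the allocation limits how many VMs fit:
vms_by_allocation = host_ram_gb // vm_allocated_gb

# Sizing for typical usage instead (and accepting the risk that
# everything spikes at once) fits far more VMs on the same host:
vms_by_typical_usage = host_ram_gb // vm_typical_usage_gb

print(vms_by_allocation, vms_by_typical_usage)  # 16 versus 42
```

The gap between those two numbers is the economic argument for big hosts; on a small host with only a few VMs, the law of large numbers works against you and over-subscribing is riskier.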

It has recently struck me that there is another approach, where you have a significant number of (much) smaller host machines, each supporting only a few virtual machines (likely without shared storage). This approach has some drawbacks but also has a number of advantages, especially over not doing virtualization at all. The first advantage is that you can deploy hardware without deciding what it's going to be used for; it is just a generic VM host, and you'll set up actual VMs on it later. The second advantage is that it is much easier to deploy actual 'real' servers, since they're now virtual instead of physical; you get things like KVM over IP and remote power cycling for free. On top of this you may get more use out of even moderate-sized servers, since these days even basic 1U servers can easily be too big for many server jobs.

(You may also be able to make your hardware more uniform without wasting resources on servers that don’t ‘need’ it. If you put 32 GB into every server, you can either run one service that needs 32 GB or several VMs that only need 4 GB or 8 GB. You’re not stuck with wasting 32 GB on a server that only really needs 4 GB.)

Having only a few virtual machines on each host machine reduces the blast radius when a host machine fails or has to be taken down for some sort of maintenance (although a big host machine may be more resilient to failure). It also makes it easier to progressively upgrade host machines; you can buy a few new ones at a time, spreading out the costs. And you spend less money on spares as a ratio of your total spending; one or two spare or un-deployed machines cover your entire fleet.

(This also makes it easier to get into virtualization to start with, since you don’t need to make a big commitment to expensive hardware that is only really useful for virtualization. You just buy more or less regular servers, although perhaps somewhat bigger than you’d otherwise have done.)

However, this alternate approach has a number of drawbacks. Obviously you have more machines and more hardware to manage, which is more work. You will likely spend more time planning out and managing what VM goes on what host, since there is less spare capacity on each machine and you may need to carefully limit the effects of any given host machine going down. Without troublesome or expensive shared storage, you can’t rapidly move VMs from host to host (although you can probably migrate them slowly, by taking them down and copying data over).
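The planning work here is essentially a small bin-packing problem. As a sketch (with hypothetical VM sizes and host capacities, and ignoring real-world constraints like keeping redundant services on different hosts), a simple first-fit-decreasing placement looks like this:

```python
# First-fit-decreasing sketch of placing VMs (by RAM need, in GB)
# onto identical small hosts. All numbers here are made up.
def place_vms(vm_sizes, host_capacity):
    hosts = []  # each host is the list of VM sizes placed on it
    for size in sorted(vm_sizes, reverse=True):
        for host in hosts:
            if sum(host) + size <= host_capacity:
                host.append(size)  # fits on an existing host
                break
        else:
            hosts.append([size])  # need another host machine
    return hosts

# Ten VMs of varying sizes onto 32 GB hosts:
layout = place_vms([4, 8, 16, 4, 8, 4, 12, 8, 4, 6], host_capacity=32)
for i, host in enumerate(layout):
    print(f"host {i}: {host} ({sum(host)} GB used)")
```

In practice you would also want to cap how many VMs share a host (to limit the blast radius) and leave headroom so that a failed host's VMs can be re-homed, both of which make the packing looser than this sketch.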

In a way, you're going to a lot of effort just to get KVM over IP and remote management capabilities, and to buy somewhat fewer machines (or have less unused server capacity). The smaller the host machines and the fewer VMs you can put on them, the more pronounced this is.

(But at the same time I suspect that there is a sweet spot in the cost versus server capacity in CPU, RAM, and disk space that could be exploited here.)


Author: cks

Date: 2020-11-09

URL: https://utcc.utoronto.ca/~cks/space/blog/sysadmin/VirtualizationHostLargeVsSmall

