"Docker Container on a VMWare instance on AWS is our version of Turducken." —@cloud_opinion
Let's imagine my Mac running VMware Fusion. Inside that, I'm running Ubuntu. In there, I've got a Docker container running a CentOS 6.4 base image. Once there, I've used `virtualenv` to create a Python environment with my favorite version of Flask. (Or maybe I chose Java, then Tomcat, running multiple WARs!) (Or maybe AWS, then Docker, then node.)
Look at all the ways we can distribute and run software these days:
Layer | Examples | Build Target Set |
---|---|---|
OS Image | PXE | Hardware Architecture (`x86_64`) |
VM Image | VMware, AMI, OpenStack, Vagrant | Virtualization Platform |
OS Packages | RPM, deb packages | OS Version |
OS-level containers | Docker, LXC, Solaris Zones | Containerization Platform |
Language runtime configuration | `CLASSPATH` for Java, `virtualenv` for Python, `rvm` for Ruby | Language runtime version |
Language VM manipulation | Java Application Servers (e.g., Tomcat) hosting multiple applications; OSGi | Container Platform |
Static Linking | `gcc -static`; Go binaries | OS-architecture pair (e.g., `linux_arm`) |
Browser | Web Apps | Browser Version and Platform |
PaaS | Google App Engine, Heroku | PaaS Platform |
The Build Target Set column suggests what you have to vary as you build. For example, if you're building OS Packages, then you need to build one package per supported OS Version. If you're building VMs, you have to build one VM per supported hypervisor. If you're building web sites, you have to test with every browser version.
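To make the "one artifact per target" idea concrete, here is a minimal sketch in Python. The layer names, target lists, and the `artifacts` helper are all hypothetical and chosen only for illustration; a real build would substitute its own support matrix.

```python
# Hypothetical support matrix: every layer name and target below is
# illustrative, not a list taken from the table above.
BUILD_TARGETS = {
    "os_package": ["centos-6", "ubuntu-14.04"],     # one package per OS version
    "vm_image": ["vmware", "ami"],                  # one image per virtualization platform
    "static_binary": ["linux_amd64", "linux_arm"],  # one binary per OS-architecture pair
}


def artifacts(app, version, targets=BUILD_TARGETS):
    """Return one artifact name per (layer, target) pair."""
    return [
        f"{app}-{version}-{layer}-{target}"
        for layer, layer_targets in targets.items()
        for target in layer_targets
    ]


if __name__ == "__main__":
    for name in artifacts("demo", "1.0"):
        print(name)
```

The number of artifacts is the sum of the target-set sizes across the layers you choose to support, which is why each extra distribution mechanism adds to the build and test burden.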
A few observations:
- The Java application server/OSGi stuff is silliness. I can't imagine a world where you're better served by one JVM running two applications rather than two JVMs running two applications.
- The `virtualenv` stuff is extremely frustrating and fragile. I've got Stockholm syndrome, so I think Java is relatively straightforward with `$CLASSPATH`, but of course it's still nutty with its wildcard handling, order dependencies, and obtuse load ordering.
- It turns out that every runtime needs its own, separate package/dependency/compatibility story. (C is the exception that proves the rule: the OS package manager is the package/dependency story for dynamically linked C binaries.) The impedance mismatch between runtimes is too great to use RPMs for Java applications or Ruby gems or Python.
- And, oh, what's the browser doing there near the bottom? Isn't it interesting that the client side of a web app, including jQuery, Bootstrap, and all those other goodies, gets shipped (modulo caching) to its users every time! And different tabs can use different versions of jQuery.
- Web apps are also interesting for their interaction with the time axis. If you've got a Tamagotchi pet, it comes shipped with software that never updates. If you're using a web browser, the software that you're interacting with updates all the time without your knowing it.
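The order dependency mentioned in the `$CLASSPATH` observation above comes from first-match-wins lookup along a search path. The toy sketch below shows the effect in Python. It is not how the JVM or any real loader is implemented, and the directory names, library names, and the `resolve` helper are made up for the example.

```python
def resolve(name, search_path):
    """Return (location, version) from the first entry that provides `name`."""
    for location, provided in search_path:
        if name in provided:
            return location, provided[name]
    raise LookupError(name)


# Hypothetical library directories, each mapping a library name to a version.
old_libs = ("libs/old", {"json_parser": "1.0"})
new_libs = ("libs/new", {"json_parser": "2.0", "http_client": "3.1"})

if __name__ == "__main__":
    # Same entries, different order, different library version loaded.
    print(resolve("json_parser", [old_libs, new_libs]))
    print(resolve("json_parser", [new_libs, old_libs]))
```

Swapping two entries silently changes which version wins, which is the kind of fragility the search-path style of runtime configuration invites.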
All this, of course, reminds us of static linking. (Static linking may be a third rail of computer programming. See Rob Pike, for example.) Shipping an AMI or a VM with all of your bits baked in is isomorphic to shipping a statically linked binary. (The isomorphism is hammered through in this paper about MirageOS which smooshes the app and the OS into a single thing.)
When we ship software, we depend on some platform (be it x86, the JVM, Python, the browser), some local state (the shared libraries that happen to be installed, which is a source of potential trouble), and the bits we actually package up (be it on a website or a CD). We're now moving the boundaries around aggressively and in different ways. Most of the mechanisms in the table above are trying to squish into nothing the potentially odious middle state, but it's notable how different a JVM-based approach (somewhat OS independent) is from a Docker-based approach (the OS is part of the distribution).
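One rough way to picture the three-part split described above is as set arithmetic: whatever the application needs that is neither guaranteed by the platform nor shipped with the bits has to come from local state. The sketch below assumes a made-up dependency list and a hypothetical `middle_state` helper; the two scenarios are simplified illustrations, not precise descriptions of RPM or Docker.

```python
def middle_state(dependencies, platform, shipped):
    """Dependencies that must already happen to be installed on the host."""
    return set(dependencies) - set(platform) - set(shipped)


# Hypothetical dependency list for a small Flask application.
DEPS = {"kernel", "libc", "libssl", "python", "flask", "app"}

if __name__ == "__main__":
    # Shipping only the application: everything else is local state.
    print(sorted(middle_state(DEPS, platform={"kernel"}, shipped={"app"})))

    # Shipping everything above the kernel, as a container image does.
    print(sorted(middle_state(DEPS, platform={"kernel"}, shipped=DEPS - {"kernel"})))
```

In this toy model, moving the boundary means moving names from the middle set into either `platform` or `shipped`; the mechanisms in the table differ mainly in which of the two they grow.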
Thanks to @henryr for commenting on an earlier draft and pointing me towards MirageOS.