Docker vs. Vagrant
By Yong Fu, OCI Software Engineer
Modern distributed application development often faces two problems:
- Inconsistency of development environments because of multiple versions of operating systems and packages
- Discrepancies in services between development and test/deployment environments
Virtualization technologies, such as virtual machines (VMs) and containers, can handle these problems effectively, but they are often difficult to use in real-world development processes because developers lack appropriate tools. In this article, we briefly introduce two VM and container management tools, Vagrant and Docker, with a focus on their concepts and workflow. By comparing the major features of both technologies, we advocate for combining them, as opposed to using one or the other exclusively. We’ll use two real-world examples to demonstrate how to integrate both technologies to build a better development environment.
Docker is an open source platform used to automate the development cycle of applications inside software containers. It provides a layer of abstraction to automate creating, deploying, and running software containers.
Basically, Docker can provide consistent, reproducible, disposable containers that enable components to run on different machines, while sharing CPU and memory underneath and providing TCP/IP forwarding and file systems shared between containers.
Docker builds on container technology, and its early versions used Linux Containers as the virtualization engine. But Docker is more than a mere wrapper around containers. The Docker engine, which is responsible for running containers, also provides a mechanism to host and distribute container images conveniently. Figures 1 and 2 illustrate the architecture of Docker, as compared to a traditional VM environment.
The Docker project was created by Solomon Hykes at dotCloud (now docker.io) and released as an open source project in March 2013.
The goal of the Docker project, as stated by Solomon Hykes, is “[t]o build the ‘button’ that enables any application to be built and deployed on any server, anywhere.” Docker has been widely adopted since its public release, especially in the DevOps community. It follows a paradigm shift in virtualization technology in recent years. Ten years ago, most virtualized applications were expected to have a long life and to be carefully maintained throughout their life cycle. Accordingly, for the sake of efficiency, whole application stacks were packed into one virtual machine, for example with both the web server and the database server running in a single VM.
Today, modern distributed applications, especially web applications, are typically built from a set of loosely coupled services, with rapid development and deployment cycles. These services may run in multiple virtual machines and are ephemeral, coming and going with client demand. Docker is a great tool for these kinds of applications, due to its easy-to-use interface, fast launch time and small resource overhead.
A significant benefit of Docker in deploying and running applications in containers is its simple, developer-friendly command line interface. For example, to run and stop an application on Docker, a user simply enters the commands:
    docker run -d --name test ubuntu echo "Docker is awesome"
    docker stop test
Docker's short launch time and minimal resource consumption come from its underlying container technology, as depicted in Figure 3. Docker also improves the usability of container technology. For example, an appealing feature introduced by Docker, and not included in Linux Containers, is versioned container images, which enable developers and DevOps engineers to refine their environment step by step. Docker implements this feature with a layered file system, using AUFS or Device Mapper.
Another advantage of Docker is that it facilitates the creation and sharing of customized images by providing a standard, reproducible way to create images via a Dockerfile. For example:
    # A simple example of a Dockerfile to run node.js. Keywords are capitalized.
    # base image (bare OS)
    FROM ubuntu
    MAINTAINER Yong Fu
    # install necessary packages
    RUN apt-get update
    RUN apt-get install -y python-software-properties python
    RUN add-apt-repository ppa:chris-lea/node.js
    RUN echo "deb http://us.archive.ubuntu.com/ubuntu/ precise universe" >> /etc/apt/sources.list
    # refresh the package lists so the newly added sources are visible
    RUN apt-get update
    RUN apt-get install -y nodejs
    RUN mkdir /var/www
    # copy a local file into the image
    ADD app.js /var/www/app.js
    # default application when exec "docker run …"
    CMD ["/usr/bin/node", "/var/www/app.js"]
In effect, a Dockerfile acts as a configuration management tool, written in a simple DSL, for customizing the development/deployment environment. Although a Dockerfile is straightforward and easy to use, its functionality is still more limited than that of the mature configuration management tools heavily used by the DevOps community.
Vagrant is an automation tool for building and deploying virtualized environments. It easily spins up a headless, configured VM on a local machine or cloud platform, circumventing the need to set up a VM manually. Vagrant also supports most configuration management tools, including Chef, Puppet and Ansible. Under the hood, Vagrant is a thin wrapper (written in Ruby) that automates the initialization, launching and configuration of virtual machines.
To understand the workflow of Vagrant, we first need to review some of its important concepts:
- Box: the package format used by Vagrant. Vagrant compresses a VM image and its meta information into a box so that anyone can create an identical working environment from the box. Boxes may be versioned and hosted in private or public hubs, which makes them convenient to revise and distribute. Some special boxes, called baseboxes, are the bare minimum virtual machines required for Vagrant to function. A basebox is often created manually or with a tool like Packer, following the requirements Vagrant sets for interacting with the VM.
- Provider: a VM vendor, e.g., VirtualBox. Vagrant relies on providers to handle the vendor-specific process of launching and configuring virtual machines.
- Provisioner: the component responsible for installing and configuring the software packages required to develop or run applications. It often runs automatically and without any user interaction. Vagrant supports most major configuration management tools as provisioners, including Ansible, Puppet, and Salt (to name a few).
- Vagrantfile: similar to a Dockerfile, the configuration file for the Vagrant setup process. It is written in Ruby and includes information on the basebox, the provider and other necessary actions on the VM.
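To make these concepts concrete, the following is a minimal sketch of a shell script that writes out a small Vagrantfile. The directory name vagrant-demo, the box name, the memory size and the provisioning command are all illustrative assumptions and do not come from the setups described in this article.

```shell
#!/bin/sh
# Sketch: write a minimal Vagrantfile into ./vagrant-demo.
# The box name, memory size, and provisioning command below are
# illustrative assumptions, not values taken from the article.
mkdir -p vagrant-demo

cat > vagrant-demo/Vagrantfile <<'EOF'
# Vagrantfile (Ruby)
Vagrant.configure("2") do |config|
  # Box: the packaged VM image to start from (assumed name).
  config.vm.box = "ubuntu/trusty64"

  # Provider: the VM vendor that actually runs the machine.
  config.vm.provider "virtualbox" do |vb|
    vb.memory = 1024
  end

  # Provisioner: installs software once the VM has booted.
  config.vm.provision "shell", inline: "apt-get update && apt-get install -y nodejs"
end
EOF

echo "Wrote vagrant-demo/Vagrantfile"
```

Each block in the generated file corresponds to one of the concepts above: the box names the image, the provider block selects and tunes the VM vendor, and the provisioner line describes what to install after boot.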
As shown in Figure 4, Vagrant first reads the Vagrantfile to identify the basebox and choose the provider. Vagrant then delegates the task of starting the basebox to the appropriate provider. Once the basebox is running, Vagrant configures it with the provisioner defined by the user in the Vagrantfile. After completing all these steps, a fully configured VM runs on the developer’s machine, and users can work with it through ssh or rdp.
The workflow listed above seems a little bit complicated; in practice, however, only a few simple Vagrant commands are needed to run the workflow.
- Launch a VM based on the Vagrantfile: vagrant up
- Access the VM via ssh or rdp: vagrant ssh (or vagrant rdp)
Similarly, it is also easy to stop the VM and clean up the whole environment:
- Stop a VM: vagrant halt
- Delete the VM and remove all data related to it (the basebox is not removed): vagrant destroy
Comparison between Docker and Vagrant
As you may have noticed, this article identifies some overlapping features of Docker and Vagrant. So, one may ask: "Which one should I use, Docker or Vagrant?" We compare the two technologies in Table 1.
Table 1 is not intended to serve as a comprehensive comparison between the two technologies. Docker performs better (e.g., in start time and size) due to the inherent advantages of the container technology upon which it relies. Vagrant, on the other hand, provides strong resource isolation and supports most OS types through the virtual machines it manages, which allows deployment to a heterogeneous environment (a big plus!).
Integrating Docker and Vagrant
Reviewing the comparison in Table 1, some readers may have already noticed that the strengths of Docker and Vagrant are complementary. So instead of asking which is better, a more suitable question may be: why not let them work together? Caveat: when we suggest that Docker and Vagrant should work together, this is what we mean:
- Use Vagrant as a vessel to provide portability and resource isolation for applications running in Docker containers;
- Incorporate Docker containers with Vagrant VMs to build heterogeneous applications; and/or
- Run Vagrant VMs in Docker Containers.
In this section, we go a bit deeper to discuss how the first two approaches work in real-world examples.
At eBay, developers are faced with the challenge of maintaining consistent development/deployment environments throughout the organization, where each service may need reconfiguration depending on the OS and packages installed on each development machine. Using Vagrant and Docker, eBay DevOps engineers have developed a neat solution.
In this solution, a developer builds a pseudo-distributed system by developing and testing services in Docker containers, as shown in Figure 5. This accelerates the iteration between development and test, compared with developing in independent environments on a single machine while the test environment spans multiple machines. Notably, by wrapping the Docker-based pseudo-distributed system in a Vagrant-managed VM, identical development and test environments may be created for the whole team. Thanks to the developer-friendly command line interfaces of Vagrant and Docker, it is easy to build and tear down services in this solution, which saves development time and improves productivity.
Testing OpenDDS often involves several runtimes on different operating systems, e.g., Linux publisher/subscriber, Windows Info Repo (broker), etc., since we need to test heterogeneous platforms' support of OpenDDS. In this case, we incorporated Docker containers and Vagrant VMs, as shown in Figure 6.
Users should carefully consider how communication occurs between Vagrant-managed VMs and Docker containers. In theory, messages may be transferred at layer 2 or layer 3 of the OSI network model. In practice, however, we found it tricky to connect the two at layer 2 with a bridge, since the VirtualBox VM is not compatible with the Linux bridge, while Docker uses the Linux bridge by default to connect containers. That is why we implemented only a layer 3 network, which links the two bridges connected to Docker and Vagrant, respectively. Building this kind of connection is fairly straightforward: configure the Linux kernel to forward messages between the bridges. An alternative method is to set up a Docker container as a dedicated router that forwards messages between VMs and containers. While building this test platform, we also found it difficult to configure Windows VMs without learning more complicated CM tools, like Puppet and Chef.
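The kernel-forwarding approach described above could look roughly like the following sketch, which only prints the commands so they can be reviewed before anything is changed on the host. The interface names are assumptions: docker0 is Docker's default bridge, while vboxnet0 stands in for whatever interface the VMs are attached to on a given host. The exact rules used in our test platform are not reproduced here.

```shell
#!/bin/sh
# Sketch only: print the commands that would enable layer 3 forwarding
# between the Docker bridge and the VM-facing interface on a Linux host.
# DOCKER_IF and VM_IF are assumed names; check `ip link` on the real host.
DOCKER_IF="${DOCKER_IF:-docker0}"
VM_IF="${VM_IF:-vboxnet0}"

forwarding_commands() {
    # Let the kernel route IPv4 packets between interfaces.
    echo "sysctl -w net.ipv4.ip_forward=1"
    # Accept forwarded traffic in both directions.
    echo "iptables -A FORWARD -i $DOCKER_IF -o $VM_IF -j ACCEPT"
    echo "iptables -A FORWARD -i $VM_IF -o $DOCKER_IF -j ACCEPT"
}

forwarding_commands
```

After reviewing the output, the commands can be applied by piping them to a root shell, for example with sudo sh. Guests on either side may still need a route to the other subnet before traffic flows end to end.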
Both Docker and Vagrant are great tools for accelerating the adoption of virtualization technology in the development, test and deployment cycles. While Docker seems poised to change the landscape of virtualization and cloud technology, Vagrant is useful to developers who want to use traditional VMs for resource isolation and cross-platform support. Integrating both technologies, as compared to using either exclusively, can help produce consistent, reproducible and portable platforms.
Software Engineering Tech Trends (SETT) is a regular publication featuring emerging trends in software engineering.