December 2009 Archives

Using Big Files As Hard Disks

| 4 Comments | No TrackBacks
The XEN hypervisor uses big files (a couple of gigabytes) as filesystem images for virtual machines. Unlike other virtualisation solutions XEN does not impose its own internal structure on the image file. The big file simply has to contain an ordinary ext3 filesystem and, optionally, a partition table just as if it were a real hard disk. The ability to use big files as hard disks comes in handy if you are running short of space on your main hard disk. With an external hard disk you should be well prepared to run a number of virtual machines as big files. However, having the filesystem of a virtual machine in a big file raises the question of how to boot the virtual machine. Essentially there are two options to do that:

  1. Provide the VM's kernel and the init-ramdisk, which are usually stored inside the filesystem (in the /boot directory), as separate files together with the big file, and modify the VM's configuration to use them.

  2. leave the kernel and the init-ramdisk in the big file and provide a working boot sector that accesses the kernel inside the big file, using the native XEN pygrub bootloader to start the virtual machine.
Both options require that the big file must be associated with a real, special device file (i.e /dev/loop0) in order to create a filesystem on the big file. While for the first option it is sufficient to simply connect the big file with the loop device, using the "losetup /dev/loop0 bigfile" command, the second option is much more complex, as the big file has to be partitioned like an ordinary hard disk before the filesystem can be created.

For the rest of this article we will focus on the second option which is much more appealing as everything is kept inside the big file. I will show you how exactly the big file is turned into a virtual hard disk and how you can access and modify the information stored in the virtual machine's own filesystem.

Getting Partitions And Filesystem Sizes Sorted

Our journey through the big file's internal structure naturally begins with the creation of the big file.

dd if=/dev/zero of=bigfile bs=1M count=3950

As a second step we use this chunk of 4141875200 bytes to act as a hard disk and try to partition the bigfile as usual:


losetup /dev/loop0 bigfile
fdisk /dev/loop0

Device contains neither a valid DOS partition table, nor Sun, SGI or OSF disklabel
Building a new DOS disklabel. Changes will remain in memory only,
until you decide to write them. After that, of course, the previous
content won't be recoverable.

Warning: invalid flag 0x0000 of partition table 4 will be corrected by w(rite)

Command (m for help):

As expected, the fdisk program throws a number of error messages at us, because we have given a big file instead of a real hard disk to the program. But let's see how the fdisk program recognizes our new hard disk in detail
.
Disk /dev/loop0: 4141 MB, 4141875200 bytes
255 heads, 63 sectors/track, 503 cylinders
Units = cylinders of 16065 * 512 = 8225280 bytes

Device Boot Start End Blocks Id System

Obviously there is no partition table yet, but the program assumes that the big file represents a hard disk with 255 heads and 63 sectors of 512 bytes data each. Every cylinder of our virtual hard disk is made of 255 x 63 x 512 bytes = 8225280 bytes which represents the units in which we can chop the hard disk space into partitions now. All in all there are 503 cylinders in our virtual hard disk which makes a total of 503 x 8225280 bytes = 4137315840 bytes to spend on partitions.

But wait, didn't we create 4141875200 bytes in the first place? That's 4559360 bytes less than what we had originally. Well, this loss is due to the fact that for the 504th cylinder we'd need 8225280 bytes which we don't have, so this loss is inevitable. But the important consequence of this reduction of space is that we cannot create a filesystem on the whole bunch of data we supplied. At the moment the size of our filesystem is not determined at all. The next step is to create a new primary partition inside our big file using all the space we have:

Disk /dev/loop0: 4141 MB, 4141875200 bytes
255 heads, 63 sectors/track, 503 cylinders
Units = cylinders of 16065 * 512 = 8225280 bytes
Device Boot Start End Blocks Id System /dev/loop0p1 1 503 4040316 83 Linux

After having written the partition table to the big file, have you checked for the new device file /dev/loop0p1? Don't worry, it does not exist! Adding p1 to the disk label is fdisk's way to denote partitions, that does not mean that you'll find such a thing in the /dev directory.

Poking Inside The Big File

From the partition table you can see that 4040316 blocks have been allocated for the new partition. With each block storing 1024 bytes we now know our first partition size, it's 4040316 x 1024 bytes = 4137283584 bytes. This is another number we never saw before! After having written off some 4.5 megabytes because we cannot use half a cylinder, we now face another loss of exactly 4137315840 - 4137283584 = 32256 bytes.

Of course these 32256 bytes at the beginning of the big file are there for a purpose, which is to store the partition table. Our first partition begins right after this amount of data, at an offset of 32256 inside the big file. The amount of 32256 bytes results from the fact that one track (63 sectors of 512 bytes for one head) are put away for the partition table. Now it's time to use a second loop device (/dev/loop1) to poke inside the big file at exactly the point where our first partition begins and create a new filesystem there:

losetup -o 32256 /dev/loop1 bigfile
mkfs -t ext3 -c /dev/loop1 4040316

It's essential that we supply the number of blocks as a parameter to the mkfs command to ensure, that our new filesystem on the first partition fits exactly in the space we have allocated. Without this parameter our filesystem would become too big, as the 4.5 megabytes after the first partition would be used for the filesystem too, and when the virtual machine is going to use the filesystem its actual size would conflict with the numbers in the partition table. Either the partition table or the filesystem's superblock is lying, which will cause distress for the virtual machine that expects a consistent filesystem to operate.

Writing The Master Boot Record

You can fill up the filesystem with whatever carefully selected quality open source software you can find on the planet, but in the end we need to write the new virtual disk's master boot record to boot the jewel. There is one step of preparation to be done before we can use the grub shell to write the MBR. We have to make a symbolic link named /dev/loop to the device that points to the master boot record, that is to the beginning of the big file, /dev/loop0 in the example above.

grub> device (hd0) /dev/loop
grub> root (hd0,0)
grub> setup (hd0)
grub> quit

Now your spick-and-span virtual hard disk is ready to boot.

Recent Comments

  • Luigie Fulc: Gerade habe ich eine interessante Seite für Tricks auf Linux read more
  • Ralph: There is a page called "Copyright Policy and Terms of read more
  • Windows Icons: Hello! I do not see a condition of use of read more
  • Ezine: A thoughtful insight and ideas I will use on my read more
  • Ralph: Elaborating upon your thought experiment a little bit more and read more

OpenID accepted here Learn more about OpenID

Small Business Blogs - BlogCatalog Blog Directory