Distracting adventures in ZFS upgrades

Sep 04 2015 Published by under linux

Last week I wanted to play around with some software packages for logging and charting of environmental measurements and events (specifically, two packages: OpenHAB and EmonCMS).
Wanting to save time (sweet irony!), rather than building up a VM and manually configuring the tools, I figured I’d use docker. Except that the workstation I wanted to use was running Debian Squeeze and was still on kernel 3.2, which doesn’t support docker. Oh, and it had a ZfsOnLinux (ZoL) raidz for the root filesystem.
So the steps to get to docker involved upgrading the kernel, ZFS and, by the way, the nvidia drivers.
Mistake #1. I should have just built a Xubuntu 14.04 VM and run docker inside that!

Before upgrading the kernel from Debian backports, I decided to ensure ZfsOnLinux was updated. I anticipated (correctly, as it turned out) that ZoL would cause the most problems. Knowing that upgrading ZoL would be fraught with danger, I read all the documentation and upgrade advice, and took all the recommended precautions.

But of course, after going through two cycles of apt-get and dpkg-reconfigure, rebuilding the initramfs and so on, and then rebooting, BAM! A variant of the dreaded “failed to mount the root filesystem” error. Reported nearby was a missing kernel module error for something called zcommon.

After a bit of digging, and breaking the virtual glass on an emergency boot partition, I worked out that I had missed upgrading one of the packages required for ZoL. Why it was not an automatic dependency I don’t know, but after installing something called “libnvpair” the system booted further. And then stopped again.

This one would take rather a bit more work to track down. Semi-helpfully, the entire error message was:

Manually import the root pool at the command prompt and then exit.
Hint: Try: zpool import -R / -N ${ZFS_RPOOL}

At this point, the initramfs was dropping my system to a rescue shell and, via the above message, advising me to import the ZFS pool containing the root filesystem. So I tried its helpful suggestion to execute the ‘zpool import’ command, which actually succeeded, and after some more fiddling to manually mount various file systems, the system proceeded to boot. However, this manual process only got me out of trouble once; the underlying problem still needed to be resolved.

To get further I had to instrument the initramfs file scripts/zfs with a bunch of echo statements and rebuild the initramfs. (The script files bundled in when rebuilding the initramfs on Debian are located under /usr/share/initramfs-tools/scripts.) This let me reboot and work out where the zpool import was failing (or not even being called at all).

As it turns out, zpool was not being called at all in a way that would work for my partitioning scheme. The logic in scripts/zfs runs through a whole bunch of permutations trying to locate the pool, but if a variable called ROOT is empty it skips the zpool import altogether. The solution was to add ‘root=zfs:AUTO’ to the kernel command line in my grub configuration. Previously my setup did not require this kernel argument, but now, having upgraded ZoL from 0.6.2 to 0.6.4, it did.
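A quick way to check whether a kernel command line carries that argument is sketched below in Python. This is not the logic from scripts/zfs: the function names are invented for illustration, and it only looks for the root=zfs:... form used in this post.

```python
import shlex


def parse_cmdline(cmdline):
    """Split a kernel command line into a dict of key -> value (None for bare flags)."""
    args = {}
    for token in shlex.split(cmdline):
        key, sep, value = token.partition("=")
        args[key] = value if sep else None
    return args


def zfs_root_requested(cmdline):
    """Return True if a root= argument is present and starts with 'zfs:'."""
    # Hypothetical helper: it mirrors the behaviour described in the post,
    # where a missing or empty root= means no pool import is attempted.
    root = parse_cmdline(cmdline).get("root") or ""
    return root.startswith("zfs:")
```

On a running system the input would be the contents of /proc/cmdline, which holds the arguments the current kernel was booted with.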

So, what caused this? There were a lot of threads, a year or so old, discussing upgrade errors related to ZfsOnLinux, but none of them quite matched my specific scenario.

One possibility is this:
* I run a separate boot filesystem from the usual /boot, containing a hand-crafted grub, which can launch various tools such as GParted, various minimal Linux installs for rescue purposes, memcheckx86 and other tools.
* Whenever I upgrade the kernel on this system I need to copy the vmlinuz and initramfs files from /boot (which is never used by my grub) over to this originating boot filesystem.
* I wonder if ZoL may have added the root=zfs:AUTO option to the Debian grub update facility, but I neglected to check for changes to the generated /boot/grub/grub.cfg and apply them to my real grub.cfg. And wham!
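The check I skipped in that last point could be automated along these lines. This is a minimal Python sketch, assuming both configs use plain ‘linux’ lines (variants such as linux16 are ignored); the function names are made up for illustration.

```python
def kernel_args(grub_cfg_text):
    """Collect the arguments from every 'linux' line in a grub.cfg."""
    args = set()
    for line in grub_cfg_text.splitlines():
        words = line.split()
        # A kernel line looks like: linux /vmlinuz-... root=... ro quiet
        if len(words) >= 2 and words[0] == "linux":
            args.update(words[2:])  # skip 'linux' and the kernel image path
    return args


def missing_args(generated_text, custom_text):
    """Arguments present in the generated config but absent from the custom one."""
    return sorted(kernel_args(generated_text) - kernel_args(custom_text))
```

Feeding it the text of the generated /boot/grub/grub.cfg and of the hand-maintained grub.cfg (wherever that lives on your boot filesystem) lists any kernel arguments the package scripts added that the real config lacks.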

However, I couldn’t find any references to zfs in /etc/grub.d, so this hypothesis may well be wrong. By Occam’s razor, perhaps it’s just that my setup on this particular workstation is more complex or unusual than that of most ZfsOnLinux users. Anyway, onwards and upwards.

I’m shortly to decide which of OpenHAB or EmonCMS I’ll be using for my Hackaday Prize finals entry. Stay tuned!


Booting a Windows7/Vista Recovery partition when Windows is broken

Apr 22 2014 Published by under howto, windows

Even though I am “pretty much” an open source advocate, I still have to use Windows professionally when required, and of course am the IT support for extended family :-) In this case, I needed to rebuild a laptop for my mum from scratch.

The laptop in question, a BenQ Joybook A52, had previously been my dad’s and had been through incarnations of Windows Vista, downgraded to XP and then back up to Vista. I decided it would be safer to start with a clean slate: perform a factory restore, then apply the service packs and all the recent security updates anew.

This laptop had previously been cleansed of the usual ‘crapware’ and other default programs, including the factory recovery icon. Early on I had installed the very useful tool EasyBCD from http://neosmart.net/EasyBCD/ to dual boot Vista and XP. (Aside: EasyBCD used to be free; it seems you can still get it free for non-commercial use but you have to dig a bit.) Using a bootable Linux USB I was able to detect that the recovery partition, a FAT32 primary partition labelled ‘PQSERVICE’ but set to type 0xde (Dell Utility), was (luckily) still present. However, regardless of which settings I tried, I was unable to immediately get the factory recovery partition to start, either from the rescue USB or via EasyBCD. Being at mum & dad’s for dinner, I didn’t have a lot of time to get into nuts and bolts.

Surprisingly, a quick search for PQSERVICE, booting benq recovery partition or various other combinations didn’t really bring up anything that useful. So for the moment /dev/sda4 remained stubbornly inaccessible to the Windows boot machinery.

A couple of weeks later, now having the laptop in my possession to deal with this, I took an image so that I could experiment. Luckily the drive was only 80GB, such an expansive size from circa 2006!

First thing I did was create an image to play with in qemu. Figuring that the important parts were simply the recovery partition and the boot sector, I managed this as follows:

(Aside – these instructions may or may not also work on other flavours of laptop of this vintage!)

Firstly, upon mounting the recovery partition, you can see what appears to be a bog standard cut down Windows filesystem:

Examine the partition layout:


The recovery partition is #4. Note that ‘diag’ is actually type 0xde when checked using fdisk.
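The type byte can also be read straight out of the image’s first sector. The following Python sketch decodes the four primary entries of a classic MBR partition table (the table starts at byte 446, each entry is 16 bytes, and the sector ends with the 0x55 0xAA signature). The function name is my own, and it ignores extended and GPT layouts.

```python
import struct

MBR_SIZE = 512
TABLE_OFFSET = 446  # the four primary partition entries start here
ENTRY_SIZE = 16


def read_partition_table(sector):
    """Decode the four primary entries of an MBR sector (at least 512 bytes)."""
    if len(sector) < MBR_SIZE or sector[510:512] != b"\x55\xaa":
        raise ValueError("not a valid MBR boot sector")
    entries = []
    for i in range(4):
        start = TABLE_OFFSET + i * ENTRY_SIZE
        raw = sector[start:start + ENTRY_SIZE]
        # Entry layout: status(1) CHS start(3) type(1) CHS end(3)
        #               LBA start(4, little endian) sector count(4, little endian)
        status, ptype, lba_start, sectors = struct.unpack("<B3xB3xII", raw)
        entries.append({
            "number": i + 1,
            "bootable": status == 0x80,
            "type": ptype,
            "lba_start": lba_start,
            "sectors": sectors,
        })
    return entries
```

Reading the first 512 bytes of laptop.img and passing them in should show entry 4 with type 0xde, and later makes it easy to confirm that entry 1 is flagged bootable with the NTFS type 0x07.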

Second, assemble a fresh experimental disk:

As a check, the sizes of test.bin and laptop.img should be identical.
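That comparison is a one-liner in most languages; here is a small Python sketch (the helper name is made up). It compares the apparent file length, which is what matters here even if one of the files is stored sparsely.

```python
import os


def same_size(path_a, path_b):
    """Return True when both files have exactly the same length in bytes."""
    # os.path.getsize reports the apparent size (st_size), not blocks on disk.
    return os.path.getsize(path_a) == os.path.getsize(path_b)
```

For the files in this post the call would be same_size("test.bin", "laptop.img").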

Now, attempt to boot the image in QEMU.

This is achievable using Grub2, and in this case I chose Super Grub2 Disk, http://www.supergrubdisk.org.

After the boot screen starts, choose ‘c’ for a command line, and use the following:

For once, this worked first time, starting the PowerQuest recovery software.

So now I simply had to repeat this process, using a USB key with Grub2 on the laptop itself.


For some reason the BIOS in this laptop did not understand my favourite bootable USB with Grub2, so I ended up burning a CD-R for the first time in a little while.

I also had to wipe the other partitions out of the partition table; it seems the recovery program just unpacks a partition to C: rather than rebuilding the partition table! Things to note: remember to mark what will become C: (/dev/sda1) as bootable and set its partition type to NTFS, and for good measure zero the first sectors of that partition.


AspireOne and encrypted SD card /home

Jan 28 2013 Published by under linux

Tip for the week (well, I finally sorted this out on Sunday, just in time for LCA2013).

If you run an encrypted /home on an SD card, it doesn’t come back properly after resuming from suspend, unless you add the following to your kernel boot command line:


I think this means hot removal of the SD card doesn’t work quite so dynamically, but in my case it means I can suspend my machine, which I need to do a lot at LCA.

(With thanks to https://bbs.archlinux.org/viewtopic.php?id=91807 )

I found this out after rebuilding my trusty Aspire One with Crunchbang Waldorf, when suddenly suspend appeared to cause all manner of problems.


Windows shenanigans – with a little help from our (Linux) friends – part 1

Nov 11 2012 Published by under realworld, tech, windows

Although I use Linux as my primary O/S, I am required to also use Windows at work and most family / friends / neighbours etc. use it. So I need to stay up to date with the Microsoft world to retain my computer geek “cred”, as I am often called upon to fix problems or provide tuition…

Quick tip if you have to use a Microsoft O/S: you may be able to resolve Windows Vista / Windows 7 boot problems using EasyBCD, which is free for non-commercial use. Something similar can be accomplished using Grub2 and GParted; however, EasyBCD can manipulate the native Windows boot manager, and I need to experiment further with my wife’s Win7 laptop when she is not around ;-)

A while ago a close relative had a run of bad luck with his system. Amongst other things this involved migrating from an old “slow” Vista Premium to a fresh install on a clean hard drive. The fresh install ran much faster, without the years of crud build-up and with recent drivers etc., but he was unable to make it work without the original “slow Vista” hard disk in the machine, which was the system (BIOS) boot disk. The computer involved had several internal SATA and external drives, a situation which had previously led to disaster as my relative attempted to sort it all out, but more on that another time!

The problem was that the “new” Vista had been added to the Windows boot menu on the “old” drive, so with the “old” drive removed the system was rendered unbootable; i.e. the clean drive had no boot manager installed.

The solution I employed was to use a tool called EasyBCD. The procedure essentially involved first installing EasyBCD onto the old Vista and using it to make the new drive the default. This at least made his system slightly more usable, at which point we went out to lunch.
Having confirmed the new Vista was automatically entered on reboot, I installed EasyBCD into the “new” Vista and used it to install the boot manager onto the new drive, and finally removed the old drive. One key step involved using the “Select BCD store” option to edit the menu on the alternative disk.
In all cases, it is prudent to take a BCD backup! (And of course back up anything else important.)

This was not completed without some recourse to Linux. At the start of proceedings, in spite of a lot of to-ing and fro-ing of drives and cables, neither Vista system would recognise a new 2TB drive he wished to use for data. I booted Xubuntu 12.04, and this could not properly see the drive either! As a last resort we swapped it into a USB caddy and, using my Acer Aspire One running Squeeze with a 3.2.9 kernel, confirmed the drive was OK. Running a manual Windows update on Vista then actually allowed the system to recognise the drive. Perhaps I should have done this first, but I think it can be useful to experiment a bit longer and it was more comfortable inside on this day anyway…

It seems therefore that both older unpatched Microsoft systems and older Linux kernels can’t see some larger hard disks.

This all happened a little while ago so I don’t have exact model numbers or software versions.
