It does not take much effort to lock yourself out of a remote virtual machine: as soon as you lose ssh access, most servers are as good as lost. Some VPS providers, including DigitalOcean, make it possible to access the console of the server even when the server is offline, which makes recovering a failed droplet possible.

Given the terrible DigitalOcean kernel handling, messing up a droplet is very easy. Changing the kernel a droplet boots into is a matter of 5 clicks. You will encounter no warnings along the way, and a single power cycle later the droplet will be gone from the Internet. Upgrading the kernel package inside the droplet leads to the same outcome. Although I have known this since getting burnt trying to upgrade the kernel on my droplet more than a year ago, I have made the same mistake twice since then. Yesterday, when the init daemon segfaulted after 11 months of uptime and the machine became unreachable, something inside me finally snapped and I set off looking for the reason eth0 goes missing after a kernel upgrade.

Droplet ker­nel versus boot ker­nel

Droplets boot with a kernel other than the one installed inside the droplet, which sometimes causes compatibility issues between the two. The distinction between the boot and the droplet kernels is therefore important in this article:

  • Droplet kernel is the kernel residing inside the droplet and usually managed by the OS package manager;
  • Boot ker­nel is the ker­nel a droplet boots with.

The boot kernel is changed through the DigitalOcean droplet management interface:

Con­trol panel sec­tion for chan­ging the ker­nel

If you only changed the boot ker­nel and did not up­grade the droplet ker­nel, chan­ging the boot ker­nel back should fix any prob­lems you have.

If you upgraded the droplet kernel, fixing it is more complicated and is the focus of this article. First of all, check whether you can change to a boot kernel with exactly the same version as your droplet kernel, and power cycle the droplet after changing. If there is no matching version, pick the closest version available and follow the instructions below.
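To see which versions you have to choose from, you can list the module trees installed inside the droplet and compare them with the release the boot kernel reports. The sketch below assumes modules live under /usr/lib/modules, as they do on my system (some distributions use /lib/modules instead). The function name list_kernels is made up for this example.

```shell
# list_kernels MODULES_DIR BOOT_RELEASE
# Hypothetical helper: prints every module tree found in MODULES_DIR
# and marks the one matching the release of the running (boot) kernel.
list_kernels() {
    for dir in "$1"/*/; do
        # Skip the unexpanded glob when the directory is empty or missing.
        [ -d "$dir" ] || continue
        version=$(basename "$dir")
        if [ "$version" = "$2" ]; then
            echo "$version (matches boot kernel)"
        else
            echo "$version"
        fi
    done
}

# Usage from the droplet console:
#   list_kernels /usr/lib/modules "$(uname -r)"
```

If no line is marked as matching, the boot kernel has no module tree of its own and will not find any drivers to load.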

Ac­cess­ing the droplet

DigitalOcean provides a simple way to access your droplet through the VNC console. In fact, it is probably the only way to access a droplet yourself once it loses access to the Internet. You can open it by clicking on the “Console Access” button available in the droplet control panel:

Con­trol panel sec­tion with Con­sole Ac­cess but­ton vis­ible

This con­sole provides the same amount of con­trol over a droplet one would get through ssh, al­though the ex­per­i­ence is not as smooth.

Un­der­stand­ing the un­der­ly­ing cause

Kernels interact with hardware through a well-specified API via drivers. Linux drivers are usually implemented as kernel modules which are loaded at runtime. Usually, when the kernel is unable to load some module, it does not fail and boots just fine, but some features, such as Ethernet, might be unavailable.

In my case the droplet kernel upgrade changed the location of the modules in the file system, which left the boot kernel unable to find the modules necessary for normal operation.

Quick and dan­ger­ous

Kernel modules reside at a well-defined path, and in case of a version mismatch between the boot and the droplet kernel this location almost certainly does not exist. For a 3.14.1-1 kernel release, modules are located at /usr/lib/modules/3.14.1-1, and you can find the path where your boot kernel expects to find them with sh -c 'echo /usr/lib/modules/$(uname -r)'. As modules don’t change much between versions, you can attempt using the modules of your droplet kernel with the boot kernel even if the kernel versions do not match:

cp -R /usr/lib/modules/$DROPLET_KERNEL_VERSION /usr/lib/modules/$(uname -r)
reboot

If everything is all right, after a re­boot the boot ker­nel should load all the ne­ces­sary mod­ules and eth0 along with the In­ter­net con­nec­tion should be back.
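One thing to watch with the copy: if the target directory already exists, cp -R places the source inside it instead of creating it, and the boot kernel still finds nothing where it looks. A slightly more careful variant could look like the sketch below. The function name copy_modules is hypothetical, and refusing to touch an existing target is my own choice of safeguard.

```shell
# copy_modules SOURCE_TREE TARGET_TREE
# Hypothetical wrapper around cp -R that refuses to run when the copy
# would not produce the layout the boot kernel expects.
copy_modules() {
    if [ ! -d "$1" ]; then
        echo "source $1 does not exist" >&2
        return 1
    fi
    if [ -e "$2" ]; then
        # cp -R would nest the source inside an existing directory.
        echo "target $2 already exists, not touching it" >&2
        return 1
    fi
    cp -R "$1" "$2"
}

# Usage from the droplet console:
#   copy_modules "/usr/lib/modules/$DROPLET_KERNEL_VERSION" \
#                "/usr/lib/modules/$(uname -r)" && reboot
```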

Slower and safer

You can also load the modules manually. This is slower, but an accident along the way will not make your droplet unbootable:

cd /usr/lib/modules/$DROPLET_KERNEL_VERSION/kernel/drivers
insmod net/mii.ko*
insmod net/ethernet/realtek/8139cp.ko*
insmod virtio/virtio.ko*
insmod virtio/virtio_ring.ko*
insmod net/virtio_net.ko*
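Before configuring anything, it may be worth confirming that the interface actually appeared. On Linux every network interface shows up as an entry under /sys/class/net, so a check can be as small as the sketch below. The function name iface_exists is made up, and the optional second argument exists only so the check can be pointed at another directory.

```shell
# iface_exists IFACE [SYSFS_NET_DIR]
# Hypothetical helper: succeeds when the kernel has registered IFACE.
iface_exists() {
    [ -e "${2:-/sys/class/net}/$1" ]
}

# Usage from the droplet console:
#   iface_exists eth0 && echo "eth0 is back" || echo "eth0 still missing"
```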

Once the mod­ules are loaded without any er­rors, you can try con­nect­ing to the In­ter­net. $YOUR_IP and $GATEWAY_IP are con­veni­ently provided be­low the con­sole win­dow.

ifconfig eth0 $YOUR_IP netmask 255.255.255.0 up
route add default gw $GATEWAY_IP
ping $GATEWAY_IP -c 1
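Some systems no longer ship ifconfig and route (they come from net-tools). In that case the ip tool from iproute2 should do the same job. The sketch below only prints the equivalent commands so you can review them on the console first. The function name print_net_setup is hypothetical, and the /24 prefix is the same netmask as 255.255.255.0 above.

```shell
# print_net_setup IP GATEWAY [IFACE]
# Hypothetical helper: prints the iproute2 commands that correspond to
# the ifconfig/route calls above. Nothing is executed here.
print_net_setup() {
    iface=${3:-eth0}
    echo "ip addr add $1/24 dev $iface"
    echo "ip link set $iface up"
    echo "ip route add default via $2"
}

# Usage from the droplet console, once the output looks right:
#   print_net_setup "$YOUR_IP" "$GATEWAY_IP" | sh
```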

At this point your droplet will have a full Internet connection, which makes data recovery possible, but it will not have DNS, so domain names will not resolve.
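If you do need name resolution for the recovery, pointing the resolver at a public DNS server is usually enough. A minimal sketch, assuming /etc/resolv.conf is a plain file that nothing else on the system manages. The function name set_nameserver is made up, and note that it overwrites the file.

```shell
# set_nameserver ADDRESS [RESOLV_CONF]
# Hypothetical helper: replaces the resolver configuration with a single
# nameserver entry. Any previous contents of the file are lost.
set_nameserver() {
    echo "nameserver $1" > "${2:-/etc/resolv.conf}"
}

# Usage from the droplet console (8.8.8.8 is Google Public DNS):
#   set_nameserver 8.8.8.8
```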

Signed mod­ules

Depending on the distribution setup, insmod might also fail with an error about a module having a bad signature when you try to insert it (or after a reboot if you used the dangerous method). Removing the signature is an effective way to circumvent the check:

objcopy -R .note.module.sig module.ko module.ko

In case the fail­ing mod­ule is com­pressed (.ko.gz), you’ll need to de­com­press it with gzip -d first.
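Put together, the decompression step could be wrapped roughly like this. It is only a sketch: the function name prepare_module is made up, and it handles nothing but the .gz case mentioned above.

```shell
# prepare_module PATH
# Hypothetical helper: decompresses PATH in place when it ends in .gz and
# prints the path of the plain .ko file that is ready for objcopy.
prepare_module() {
    case "$1" in
        *.gz) gzip -d "$1" && echo "${1%.gz}" ;;
        *)    echo "$1" ;;
    esac
}

# Usage from the droplet console:
#   module=$(prepare_module net/virtio_net.ko.gz)
#   objcopy -R .note.module.sig "$module" "$module"
#   insmod "$module"
```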

Fin

Using the system in production after patching it like this is discouraged. This article is only meant to help people recover data locked inside the droplet. Once the Internet connection is restored, you should by all means move everything from the failed droplet to a pristine one.