Summary: | btrfs raid on plain dmcrypt fails to boot randomly | ||
---|---|---|---|
Product: | systemd | Reporter: | Paolo <palmaway> |
Component: | general | Assignee: | systemd-bugs |
Status: | NEW --- | QA Contact: | systemd-bugs |
Severity: | major | ||
Priority: | medium | CC: | 2bluesc, dutch109, freedesktop, liststuff, mail, palmaway, radek |
Version: | unspecified | ||
Hardware: | All | ||
OS: | Linux (All) | ||
Description
Paolo
2015-01-16 07:01:04 UTC
What precisely do your fstab and crypttab look like?

Note that for btrfs RAID only the backing device that makes the RAID set complete is considered active by systemd. That means that systemd will only pick up the last device that is discovered. This is intended that way. For this to properly work you need to reference the btrfs file system by its UUID in fstab, so that it doesn't matter which one is the last one to be picked up.

Hi, and thanks for your reply. However, putting the UUID in fstab doesn't work on my system, as mentioned in the original bug report above. Here is some additional information.

The relevant blkid output:

    /dev/mapper/cryptb: LABEL="data" UUID="ee0726c7-f7d1-4031-8a53-32d384334196" UUID_SUB="8eb20b58-f736-404d-92dd-bb467da8a275" TYPE="btrfs"
    /dev/mapper/cryptc: LABEL="data" UUID="ee0726c7-f7d1-4031-8a53-32d384334196" UUID_SUB="0072ff4d-25e6-44a1-9fae-8cb9b38299d9" TYPE="btrfs"
    /dev/mapper/crypta: LABEL="data" UUID="ee0726c7-f7d1-4031-8a53-32d384334196" UUID_SUB="d0a0ff7f-76d0-4572-9e0f-db9f20ea6fa0" TYPE="btrfs"

My /etc/crypttab:

    crypta /dev/sda /root/key.crypta cipher=aes-xts-plain64,size=512,hash=plain
    cryptb /dev/sdb /root/key.cryptb cipher=aes-xts-plain64,size=512,hash=plain
    cryptc /dev/sdc /root/key.cryptc cipher=aes-xts-plain64,size=512,hash=plain

The relevant part of my /etc/fstab:

    /dev/mapper/cryptc /data btrfs defaults,device=/dev/mapper/crypta,device=/dev/mapper/cryptb,device=/dev/mapper/cryptc,compress-force=lzo 0 0

Again, if I put in the fstab the UUID shared by the btrfs raid filesystem indicated by blkid (ee0726c7-f7d1-4031-8a53-32d384334196) then the problem is not solved, it actually gets worse, as indicated in the original report above! I get 3 "start job" messages, one related to the /dev/disk-by-uuid and the two relative to the remaining devices /dev/mapper/cryptX.

(In reply to Paolo from comment #2)
> Hi, and thanks for your reply.
> However, putting the UUID in fstab doesn't work on my system, as mentioned
> in the original bug report above.

Not following. Why does that not work? Can you elaborate? Comment #1 is not very clear about that.

> /dev/mapper/cryptc /data btrfs defaults,device=/dev/mapper/crypta,device=/dev/mapper/cryptb,device=/dev/mapper/cryptc,compress-force=lzo 0 0
>
> Again, if I put in the fstab the UUID shared by the btrfs raid filesystem
> indicated by blkid (ee0726c7-f7d1-4031-8a53-32d384334196) then the problem
> is not solved, it actually gets worse, as indicated in the original report
> above! I get 3 "start job" messages, one related to the /dev/disk-by-uuid
> and the two relative to the remaining devices /dev/mapper/cryptX.

Hmm, why do you get the latter three? I mean, the idea is to always use the UUID, and nothing else, not a mixture...

I changed /etc/fstab to contain the following line:

    UUID=ee0726c7-f7d1-4031-8a53-32d384334196 /data btrfs defaults,compress-force=lzo 0 0

As you can see, I also eliminated the "device" mount options, in case those were the ones creating the problem. Unfortunately, as stated in my first post, now I get these messages at boot:

- "A start job is running for" /dev/mapper/cryptb and /dev/mapper/cryptc with no timeout;
- "A start job is running for dev-disk-by\x2duuid-..." with a timeout of 1min and 30 secs.

The three messages alternate on screen, and when the timeout for the last one ends, I am able to enter the root password for maintenance.

Please don't ask me why it doesn't work, as that is exactly why I posted a bug report: there is an expected behavior, but it doesn't happen.

Can you please clarify what you mean by "the idea is to always use the UUID"? Do you mean in the crypttab as well? How can I use the UUIDs in the crypttab, since the disks share the same UUID? Can you provide an example?
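The blkid output quoted earlier in this report shows the point behind that last question: all three mapper devices report the same filesystem UUID and differ only in UUID_SUB. A minimal shell sketch, assuming blkid-style lines on stdin in the format quoted in this report and no spaces inside the quoted values (the function name list_btrfs_members is hypothetical), prints each btrfs member with both identifiers side by side:

```shell
#!/bin/bash
# Hypothetical helper: read blkid-style lines on stdin and print
# "<device> <UUID> <UUID_SUB>" for every line tagged TYPE="btrfs".
# Assumes space-separated KEY="value" fields with no spaces in the values.
list_btrfs_members() {
    awk '/TYPE="btrfs"/ {
        dev = $1
        sub(/:$/, "", dev)                 # strip the trailing colon after the device path
        uuid = ""; sub_uuid = ""
        for (i = 2; i <= NF; i++) {
            if ($i ~ /^UUID="/)     { uuid = $i;     gsub(/^UUID="|"$/, "", uuid) }
            if ($i ~ /^UUID_SUB="/) { sub_uuid = $i; gsub(/^UUID_SUB="|"$/, "", sub_uuid) }
        }
        print dev, uuid, sub_uuid
    }'
}
```

It could be fed from blkid directly, for example `blkid /dev/mapper/crypta /dev/mapper/cryptb /dev/mapper/cryptc | list_btrfs_members`, where the mapper names are the ones from the crypttab quoted in this report.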
I changed both the fstab (for the devices not related to the btrfs, which were previously using LABEL= instead of UUID=) and the crypttab (using the UUID_SUB indicated in the blkid output as UUID=). Now both files only reference UUIDs. The situation is *much* worse: now I get 7 "A start job" messages (2 for the "/dev/mapper/..." and 5 for "dev-disk-by\x2duuid-..."). The system is virtually unbootable: 20 reboots and none succeeding.

I have reverted to the previous configuration, and now the system sometimes boots (as explained before, when I am lucky enough that systemd finds /dev/mapper/crypta first). When I can boot, I cannot run systemd-analyze (it reports "Bootup is not yet finished.", exactly as described in the first link of comment #1). However, if I run "systemctl list-units" I get the following as the first lines:

    UNIT                              LOAD   ACTIVE   SUB     JOB   DESCRIPTION
    proc-sys-fs-binfmt_misc.automount loaded active   waiting       Arbitrary Executable File Formats File System Automount Point
    dev-mapper-cryptb.device          loaded inactive dead    start dev-mapper-cryptb.device
    dev-mapper-cryptc.device          loaded inactive dead    start dev-mapper-cryptc.device

I hope this helps. Thanks for looking into this.

I'm also affected by this bug with a similar setup running up-to-date Archlinux (x64). My configuration consists of 2 disks encrypted with plain dm-crypt with a btrfs RAID1 on top. The disks are also encrypted via a keyfile. However, I use UUIDs in the crypttab provided by blkid /dev/sdX, and in fstab I use /dev/mapper/DEVICE1 and also provide device=/dev/mapper/DEVICE2 in the mount options. With this configuration I need 3-5 attempts each boot to successfully mount the RAID. Please tell me if I can provide any further information.

I'm also affected: Arch Linux, systemd 221-2 + device-mapper 2.02.122-1 + linux 4.0.7-2.

I have 4 physical disks using LUKS + keyfile and my crypttab uses the partition's UUID.
The fstab file uses the btrfs RAID filesystem's UUID as explained by Lennart (although using LABEL= appears to behave the same). The system boots and mounts the filesystem as expected; however, there are 3 pending jobs and the system remains in the "starting" state.

    $ systemctl list-jobs
    JOB UNIT                     TYPE  STATE
     72 dev-mapper-crypt3.device start running
     58 dev-mapper-crypt0.device start running
     69 dev-mapper-crypt1.device start running

    3 jobs listed.

The missing device crypt2 seems to have worked as suggested to be the device actually mounted by btrfs magic:

    $ systemctl status dev-mapper-crypt2.device
    dev-mapper-crypt2.device - /dev/mapper/crypt2
       Follow: unit currently follows state of sys-devices-virtual-block-dm\x2d6.device
       Loaded: loaded
      Drop-In: /run/systemd/generator/dev-mapper-crypt2.device.d
               └─90-device-timeout.conf
       Active: active (plugged) since Sat 2015-07-04 14:54:08 PDT; 1h 3min ago
       Device: /sys/devices/virtual/block/dm-6

    Jul 04 14:54:08 puppies systemd[1]: Found device /dev/mapper/crypt2.

I do confirm this bug as well on my Arch Linux system with a VM that decrypts 3 disks at boot as btrfs RAID1. See https://bugs.archlinux.org/task/42884?project=1. I'm quite unhappy as it has got worse recently. I don't want to blame systemd, but so far none of the workarounds reported here has solved this issue. At the moment I need more than 15 boots to get my system up and running. I really hope that somebody will step in to solve this bug.

Forgot to say that I use linux 4.2.3 and systemd 227.

I am hit by this bug too (Arch Linux with Linux 4.4 & systemd 231).
None of the workarounds I found here and elsewhere worked. I tried:

* explicitly requiring all devices in /etc/fstab with device=xxx,device=yyy
* setting the filesystem to noauto,x-systemd.automount in /etc/fstab and noauto in /etc/crypttab (it did work until a recent update)
* adding btrfs to the MODULES array in /etc/mkinitcpio.conf

I finally "fixed" it by setting the filesystems to noauto in /etc/fstab (so they are NOT mounted by systemd at boot), and creating a simple service that mounts the missing partitions later.

/etc/systemd/system/late-mount.service:

    [Unit]
    Description=Mount directories that systemd fail to mount

    [Service]
    ExecStart=/etc/systemd/system/late-mount

    [Install]
    WantedBy=multi-user.target

/etc/systemd/system/late-mount:

    #!/bin/bash -e
    mount /xxx
    mount /xxx/yyy

This is a really nasty bug, I wasted hours trying to get my configuration working.

I'm on Ubuntu 16.04 with all the latest updates, and this bug is still there. It'd be really appreciated if systemd devs could look into this. For me the workaround from dutch109 worked, with some modification.

systemd/late-mount.service:

    [Unit]
    Description=Mount encrypted multi-device Btrfs filesystems that systemd fails to mount due to https://bugs.freedesktop.org/show_bug.cgi?id=88483
    Before=display-manager.service getty@tty1.service getty@tty2.service getty@tty3.service getty@tty4.service getty@tty5.service getty@tty6.service

    [Service]
    Type=oneshot
    ExecStart=/etc/systemd/system/late-mount

    [Install]
    WantedBy=multi-user.target

systemd/late-mount:

    #!/bin/bash -e
    setfont Uni3-TerminusBold32x16.psf.gz
    cryptdisks_start sda1_crypt
    cryptdisks_start sdb1_crypt
    mount /home
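Both late-mount scripts above call mount straight away and rely on every member device already being present. One possible hardening, offered only as a sketch and not something any commenter here tested, is to wait until all device nodes exist before mounting. The helper name wait_for_devices and its timeout argument are hypothetical:

```shell
#!/bin/bash
# Hypothetical helper for a late-mount script: return 0 once every listed
# device node exists, or 1 if they are not all present after <timeout> seconds.
wait_for_devices() {
    local timeout=$1; shift
    local waited=0 dev missing
    while :; do
        missing=0
        for dev in "$@"; do
            [ -e "$dev" ] || missing=1     # any absent node means we keep waiting
        done
        [ "$missing" -eq 0 ] && return 0
        [ "$waited" -ge "$timeout" ] && return 1
        sleep 1
        waited=$((waited + 1))
    done
}
```

In the Ubuntu variant above this might be used as `wait_for_devices 30 /dev/mapper/sda1_crypt /dev/mapper/sdb1_crypt && mount /home`, assuming that `cryptdisks_start sda1_crypt` creates a node named /dev/mapper/sda1_crypt. The 30-second timeout is an arbitrary illustrative value.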