mason | ty all | 00:03 |
---|---|---|
mason | fsmithred: I got around to trying the timeout patch, and it does cut the timeout back, but I suspect we'll want a different solution. I'm going to look at how shutdownramfs works. | 01:03 |
mason | Looking at how systemd accomplishes this smoothly is frustrating, as it's utterly opaque. | 01:04 |
mason | (transient crypt services appear to be generated on the fly) | 01:05 |
golinux | Duh . . . what did you expect. This is why we need a para on the free software page. | 01:05 |
* golinux | apologizes for being a bot of a nag . . . | 01:06 |
mason | hah | 01:06 |
golinux | bot > bit | 01:06 |
golinux | Well, maybe a bot too!! | 01:06 |
mason | No, it's useful. I'll write something up. I've just been distracted by this native ZFS set-up stuff, and now I want shutdown to work cleanly. | 01:06 |
golinux | I know. That's why I have held off mentioning it. | 01:08 |
mason | I don't mind pestering. It has a purpose. The guilt builds up until I act. | 01:09 |
golinux | Guilt is not a good model for contributors | 01:09 |
mason | hah | 01:10 |
mason | systemd's cryptsetup.c annoys me immediately. It doesn't wrap at 80 columns. | 01:10 |
golinux | In fact guilt is not much of a good model for anything constructive | 01:11 |
mason | https://bpaste.net/show/WNG- | 01:13 |
mason | This code is gross, as I read further. I'm going to head off to dinner, and I'll write up a paragraph about shrinking opportunities for free software authors as a result of systemd. You'll have it this evening. | 01:15 |
golinux | I've been cooking too. Now time to enjoy the fruits of my effort. Later . . . | 01:16 |
mason | o/ | 01:17 |
g4570n | golinux: rrq: Centurion_Dan: Thanks for your work, d-i is working fine again. Javier de EterTICs also sends his thanks | 02:41 |
mason | So, the shutdownramfs/haltramfs concept appears to have arisen with systemd. I need to research how other systems do it, because the notion of a final pivot and clean-up seems reasonable. Also worth understanding the goals... If a user can punch the power they can freeze the keys in RAM, which might mean an explicit clean of the memory at shutdown might be more or less futile. Maybe something like what | 03:02 |
mason | OpenSSH does against Spectre might be something to consider. | 03:02 |
mason | Or maybe that wouldn't be useful if they catch the RAM fresh. Dunno. | 03:03 |
koollman_ | once you can read ram and have physical access ... protection is rather tricky | 03:03 |
mason | koollman_: Right. | 03:04 |
koollman_ | need some live hardware-supported ram encryption. which no init system will solve | 03:04 |
mason | koollman_: So, I guess the question is, what's possible? Do we want to just make a cosmetic improvement in tearing down root on whatever on LUKS? Dunno as yet. | 03:04 |
mason | There's more reading to do. | 03:04 |
koollman_ | I don't think adding complexity for the sake of it helps. there's enough troubles in various cases when shutting down services in specific order already :) | 03:05 |
koollman_ | if there's a known attack to protect against, then it can be tested against various ideas | 03:05 |
koollman_ | as for alternative ideas: I like the rather clean concept of phases in s6. I suppose it would make implementing something similar to shutdown-ramfs easier. maybe. I would have to try :) | 03:09 |
koollman_ | (hm. not phases. 'stages') | 03:10 |
mason | Well. Very simply, "hold off on unlocking things we can't unlock until the end. If what's left is root, pivot, unmount, and then spin down the LUKS device." | 03:10 |
mason | However we get there, that's where we want to be I think. | 03:10 |
koollman_ | it's supposed to be simple. there are always weird corner cases :) | 03:11 |
mason | s/unlock/stop/ | 03:11 |
mason | Yeah. | 03:11 |
koollman_ | the luks device may be on something other than a raw device. which may have its own requirements. and block the pivot somehow (or must be stopped at just the right moment) | 03:13 |
mason | Right. | 03:13 |
koollman_ | (real world example: luks device on some userspace-exported nbd device. you kill the process ... can't umount or even sync :) ) | 03:14 |
mason | koollman_: So, one of my stray thoughts is that we let the user specify some of this. | 03:15 |
plasma41 | I'm getting ready to format and mail this week's notes. If anyone wants to make any final edits to the notes, please do so now. | 03:19 |
mason | plasma41: If you want to strike my part, we didn't really cover it and we can just do it again in a future meeting. | 03:21 |
mason | Or the first time. Whichever. | 03:21 |
plasma41 | mason: Is that the PPA question? | 03:23 |
mason | plasma41: That and alternative bootloaders. | 03:23 |
plasma41 | Ok, I removed them | 03:24 |
mason | ty | 03:24 |
plasma41 | Alright copying down the pad now | 03:49 |
fsmithred | mason, a good place to put a setting like that might be in /etc/default/ | 04:12 |
mason | fsmithred: Sounds reasonable, yeah. Or possibly as an option in crypttab, if we can limit it to just crypted devices. | 04:19 |
agris | I don't do drugs | 04:39 |
agris | tablets either | 04:39 |
plasma41 | Meeting notes posted | 04:45 |
mason | plasma41: ty | 04:46 |
mason | fsmithred: So, we should enumerate scenarios where this matters. | 17:31 |
mason | fsmithred: The other thing I read is that certain hardware may be happier itself having a clean shutdown, but I haven't found examples yet. | 17:32 |
fsmithred | I am no security expert | 17:32 |
mason | fsmithred: But, at the core of it, we're either setting a proper order so we can close the LUKS devices to clear the key, or we're working on a cosmetic fix. | 17:32 |
mason | Ordering transitory/hotplug storage shouldn't be a concern as we're talking explicitly about the root filesystem. | 17:33 |
fsmithred | right | 17:33 |
mason | fsmithred: Also no expert here. I'm just casting about, trying to figure out what we really want to do. | 17:33 |
mason | fsmithred: It could well be that given the nature of the problem, a cosmetic fix really is acceptable. | 17:34 |
fsmithred | my understanding is that reading ram requires physical access a very short time after shutdown or on a reboot | 17:35 |
mason | fsmithred: Right. | 17:35 |
fsmithred | short = seconds, I think | 17:35 |
mason | fsmithred: And if you have physical access, you can force the machine off by yanking power or doing something local. | 17:35 |
mason | The most interesting attack scenario, though, might be forcing a reset but then booting into a prepared environment that harvests memory. Consider a crashdump kernel, for instance. | 17:36 |
mason | But we still can't guarantee that we'd get a chance to clear the key from RAM even then. We can't intercept a sysrq-trigger event, for instance. | 17:37 |
fsmithred | can we run a script from the bios? | 17:38 |
mason | fsmithred: I'm thinking we just address the cosmetic issue, which you've largely done with the patch you pointed me to. That said, it might be good to actually have a local version of that code (local to Devuan) so that it doesn't get overwritten periodically. Also, if we do that we can offer a cleaner shutdown in the uninterrupted case. | 17:38 |
mason | fsmithred: A script from the BIOS to what end, and at what point? | 17:38 |
fsmithred | mx/antix solved that with a package called cryptsetup-functions-modified or something like that | 17:39 |
fsmithred | uses dpkg-divert to replace the cryptdisk.functions file | 17:39 |
mason | fsmithred: I'm thinking more, capture LUKS device names that matter for this - maybe in /etc/default - and have the cryptsetup shutdown scripting simply ignore those. | 17:39 |
golinux | (This discussion is interesting but will come to naught if we can't get the point release and Beowulf out the door. Carry on . . .) | 17:39 |
fsmithred | wipe the ram during post? | 17:39 |
* golinux | goes off to b'fast | 17:40 |
mason | fsmithred: Yeah, when I read about that it seemed interesting. I need to learn about dpkg-divert as I know nothing at present. | 17:40 |
mason | golinux: Enjoy! | 17:40 |
mason | fsmithred: Wiping the RAM during POST would be interesting, but we can't guarantee that we control the execution environment at that point. | 17:40 |
mason | We'd have to control the UEFI firmware or the BIOS to actually do that. Too far outside of our scope. | 17:41 |
fsmithred | oh, right. Every bios is different. | 17:41 |
fsmithred | and uefi implementations are like designer drugs | 17:42 |
mason | hah | 17:42 |
fsmithred | https://github.com/MX-Linux/cryptsetup-modified-functions | 17:47 |
mason | fsmithred: That's not quite right, though. | 17:50 |
fsmithred | what's wrong? | 17:50 |
mason | fsmithred: If you're going to set it to 1, why sleep at all? We're not going to try again and an arbitrary one-second sleep per device is at best only somewhat less annoying than a long delay per device. | 17:50 |
fsmithred | I have mine set to 1 2 and it's not annoying at all | 17:51 |
mason | Looking at https://github.com/MX-Linux/cryptsetup-modified-functions/blob/master/cryptdisks-functions it should simply remove the loop, not just set it to 1. | 17:51 |
fsmithred | ok, I don't know if anyone has suggested that or tried it before | 17:52 |
mason | fsmithred: So, on my laptop, 1 2 would mean three seconds for the first device, and three for the second, always, so six seconds of staring at predicted error messages. | 17:52 |
mason | fsmithred: line 198 says "sleep $i" which for you is sleep 1, then sleep 2, and it's per-device. | 17:52 |
fsmithred | maybe I don't have enough devices to make a difference | 17:53 |
fsmithred | I haven't noticed a 3-second pause. Maybe 1. | 17:53 |
mason | fsmithred: It's right there in the code. | 17:53 |
fsmithred | I know, I put it there several times after upgrades | 17:53 |
mason | fsmithred: Ought to look like this: https://bpaste.net/show/B-tA | 17:53 |
mason | That tries once, and will emit an error if it fails. It's the equivalent of the posted version, only without extraneous sleep and without the now-misleading loop scaffolding. | 17:54 |
mason | ah, dpkg-divert is simple enough | 17:59 |
fsmithred | yeah, they have both cryptdisks-functions and cryptdisks.functions which are for two different versions of cryptsetup | 18:01 |
fsmithred | that's a little weird | 18:01 |
mason | Was just noticing I had the dot and not the dash locally. | 18:02 |
fsmithred | the two files are very different | 18:03 |
fsmithred | but the fix is the same | 18:03 |
mason | Where's the file with the dash come from? I only see the one, owned by the cryptsetup package, with a dot. | 18:04 |
fsmithred | I have cryptdisks-functions in beowulf and cryptdisks.functions in ascii | 18:05 |
mason | Ah, churn for the sake of churn. :P Awesome. | 18:05 |
mason | fsmithred: Here's my patch for ASCII: https://bpaste.net/show/Wm_t | 18:06 |
mason | Tested minimally and working without the cosmetic annoyances and delay. | 18:06 |
fsmithred | so, just remove those two lines - for... ...done | 18:08 |
mason | and the sleep | 18:08 |
mason | and then fix the indentation | 18:08 |
mason | Remember that the sleep does nothing without the for loop, as the sleep is to give an increasing back-off to let the device settle. | 18:09 |
mason | But if we're only trying once, there's no need for the sleep at all. Plus, it uses the loop variable $i. | 18:09 |
mason | Oh, hell, something else does too. Half a sec and I'll have a corrected patch. | 18:10 |
fsmithred | if [ $ret -ne 0 ] ? | 18:11 |
mason | Just tracing through. So, the line that checks return value in the bit we're patching... It's an odd choice by the original author. | 18:13 |
mason | It checks for return code 1 from cryptsetup(8), meaning "wrong parameters", or return code 2 (no permission / bad passphrase) combined with the timeout having reached 16 seconds. | 18:13 |
mason | That seems... nonsensical. | 18:13 |
fsmithred | so... | 18:15 |
mason | So, we only log those two errors, except that we only log the "no permission / bad passphrase" error *if the thing has tried longer than 16 seconds*. | 18:15 |
fsmithred | if you don't have permission, you should try more times | 18:15 |
mason | I'd tend to think we want to log any error. | 18:15 |
mason | Right. That doesn't make sense to me. | 18:15 |
fsmithred | yeah, that's what I suggested - if it's nonzero, log it | 18:15 |
fsmithred | I'm about to try that | 18:16 |
mason | Right. | 18:16 |
mason | Oh, haha, that is what you suggested. | 18:16 |
mason | Yes, that. | 18:16 |
mason | The trick there is that we are still left with extraneous code, because we don't get to that test if ret = 0 | 18:17 |
mason | fsmithred: So, https://bpaste.net/show/ZT_x but I want to run it a couple times, because some error text just flashed by | 18:20 |
fsmithred | ok, I have to reboot to test this. It'll take a few minutes to shut stuff down and then start up again. | 18:21 |
mason | oh, I think we do want the break - testing | 18:22 |
fsmithred | you removed the test for nonzero | 18:22 |
fsmithred | but not the action | 18:22 |
mason | right, the action being, we want to log, no? | 18:22 |
mason | I might be misreading it. | 18:22 |
fsmithred | you'll log regardless of the exit code | 18:23 |
fsmithred | with "dst busy" | 18:24 |
mason | yeah, but we don't log the return code in that line | 18:24 |
fsmithred | which might be ok for testing | 18:24 |
mason | Yeah, we want the break too. Sigh, messy. | 18:25 |
mason | argh, no, we don't want the break. The break is for the for loop we've removed. Correct as it stood. Sigh. More coffee will fix this. | 18:27 |
fsmithred | I've got the break and the test for nonzero. Gonna reboot now. Back in a few minutes. (yeah, really) | 18:27 |
fsmithred | oh | 18:27 |
mason | fsmithred: If you removed the for loop you don't want the break. | 18:27 |
fsmithred | ok | 18:27 |
fsmithred | will remove it | 18:27 |
mason | I was just curious about some of the error text I saw, and thought I'd missed something. | 18:27 |
mason | kk | 18:27 |
mason | This patch should be correct: https://bpaste.net/show/ZT_x | 18:28 |
mason | Oh, except, I should find the source for log_action_foo_msg to make sure there's no extra behaviour there. Not sure why there are the two. We can probably just log $ret and the message in the end_msg call. | 18:28 |
fsmithred | brb | 18:29 |
mason | fsmithred: https://refspecs.linuxbase.org/LSB_4.1.0/LSB-Core-generic/LSB-Core-generic/iniscrptfunc.html | 18:34 |
fsmithred | seems to work. I rebooted twice. The shutdown was fast, and I didn't see any red flash by. | 18:34 |
mason | I think for maximum correctness we want one log message, the end one. I'll do one final change and test. | 18:35 |
mason | If there are better docs for the lsb/init-functions Debian ships I'd love to see them. | 18:35 |
fsmithred | I don't know. | 18:36 |
fsmithred | anyway, I need to take a break from the screen. Probably have to go move some boxes or something. | 18:36 |
fsmithred | bbl | 18:37 |
mason | fsmithred: FWIW, the init-functions matter. We need the log_action_cont_msg one, not log_action_end_msg or it gets ugly. | 19:08 |
mason | fsmithred: Here's where I ended up: https://bpaste.net/show/C8qM | 19:18 |
fsmithred | mason, I'm not seeing where you test for nonzero $ret | 20:39 |
mason | fsmithred: handle_crypttab_line_stop "$dst" "$src" "$key" "$opts" <&3 && break || ret=$? | 20:40 |
fsmithred | that sets ret but doesn't test it | 20:41 |
mason | fsmithred: If we don't error, we break. So if we get past that line, we've got an error. The test is implicit, unless I'm badly misunderstanding. | 20:41 |
mason | So, ret only gets set on fail, and we can trust that it's got a valid value from $? | 20:41 |
fsmithred | ah, I didn't see the break | 20:41 |
mason | Right. The test for ret=0 is implicit. | 20:41 |
mason | fsmithred: That's the trouble with the "cute" style they've adopted. It's better to waste a little whitespace and have control flow be more obvious. | 20:42 |
* fsmithred | likes whitespace | 20:44 |
fsmithred | I will test this later. | 20:44 |
fsmithred | need to get back to work | 20:45 |
mason | fsmithred: The really interesting bit is how badly it went using the wrong LSB init function. Those ought to be documented somewhere, and they seem not to be. I should probably write docs rather than waiting for someone else to do it. | 20:45 |
mason | kk | 20:45 |
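
The single-attempt stop logic discussed above (no retry loop, no sleep, log any nonzero status) can be sketched in plain POSIX shell. This is only an illustration of the control flow with the test on `ret` written out explicitly; it is not the content of any of the pasted patches. `fake_stop` and `stop_once` are hypothetical names, `fake_stop` merely stands in for `handle_crypttab_line_stop`, and the `echo` stands in for whichever LSB log function is appropriate.

```shell
#!/bin/sh
# Hypothetical stand-in for handle_crypttab_line_stop: succeeds for any
# device name except "busy", for which it returns status 2.
fake_stop() {
    [ "$1" != "busy" ] && return 0
    return 2
}

# Try to stop the device exactly once. There is no for loop, so there is
# nothing to break out of and no back-off sleep to perform.
stop_once() {
    dst=$1
    ret=0

    # ret keeps its initial 0 on success and captures $? on failure.
    fake_stop "$dst" || ret=$?

    # Explicit test: report any nonzero status, not just selected codes.
    if [ "$ret" -ne 0 ]; then
        echo "$dst: stop failed (status $ret)"
    fi

    return "$ret"
}
```

Calling `stop_once busy` prints one failure line and returns 2, while any other name returns 0 silently. Spelling out the `if` costs a few lines compared with the `&& break || ret=$?` form quoted in the log, but the check that fsmithred was looking for becomes visible at a glance.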