SSH no more
Ok, panic! I was trying to SSH into the RPi 3 of my drone (as usual), but a worrying
Connection closed by remote host
was the only message I was getting. Why panic? Because the HDMI port of the Pi is buried deep inside the drone, covered by cables and there is no chance to reach it without opening up the frame and moving a bunch of cables…
Ok, no panic, mount the USB pen on the desktop PC (it mounts), and take a look at auth.log and syslog.
Ah-ah! auth.log is filled with these weird messages!
Jul 18 14:34:38 localhost sshd[1948]: PAM unable to dlopen(pam_unix.so): /lib/security/pam_unix.so: cannot open shared object file: No such file or directory
Jul 18 14:34:38 localhost sshd[1948]: PAM adding faulty module: pam_unix.so
Jul 18 14:34:38 localhost sshd[1948]: fatal: Access denied for user pelan by PAM account configuration [preauth]
Strange, on the Pi there is no /lib/security dir at all, and I don’t even have /lib/security/pam_unix.so on my amd64 desktop PC…
#> ll /lib/security/
total 36K
-rw-r--r-- 1 root root 19K Jul 13 2016 pam_ecryptfs.so
-rw-r--r-- 1 root root 15K Nov 12 2014 pam_freerdp.so
Ok, a guy on Unix & Linux on StackExchange says that editing /etc/ssh/sshd_config and setting UsePAM no will let me log in. And it was true, but it seems that PAM rules over a lot of stuff, sudo for example. 😢
The message this time was even more cryptic:
sudo: policy plugin failed session initialization
I tried to google it for some time, but then gave up and decided to reconnect the brain. And that was the winning choice this time, because… I totally forgot to check syslog!
And in fact it was flooded with:
Jul 18 15:32:44 localhost kernel: [ 949.206136] BTRFS warning (device sda2): csum failed root 5 ino 322973 off 36864 csum 0x46e98bcc expected csum 0xa9f61a90 mirror 1
Jul 18 15:32:57 localhost kernel: [ 962.173692] BTRFS warning (device sda2): csum failed root 5 ino 322973 off 36864 csum 0x46e98bcc expected csum 0xa9f61a90 mirror 1
Jul 18 15:34:23 localhost rsyslogd-2007: action 'action 10' suspended, next retry is Wed Jul 18 15:35:23 2018 [v8.16.0 try http://www.rsyslog.com/e/2007 ]
Jul 18 15:34:23 localhost kernel: [ 1048.132095] BTRFS warning (device sda2): csum failed root 5 ino 322973 off 36864 csum 0x46e98bcc expected csum 0xa9f61a90 mirror 1
Jul 18 15:34:44 localhost kernel: [ 1068.852398] BTRFS warning (device sda2): csum failed root 5 ino 322973 off 36864 csum 0x46e98bcc expected csum 0xa9f61a90 mirror 1
Jul 18 15:34:46 localhost kernel: [ 1071.078087] BTRFS warning (device sda2): csum failed root 5 ino 322973 off 36864 csum 0x46e98bcc expected csum 0xa9f61a90 mirror 1
Jul 18 15:34:52 localhost kernel: [ 1076.894146] BTRFS warning (device sda2): csum failed root 5 ino 322973 off 36864 csum 0x46e98bcc expected csum 0xa9f61a90 mirror 1
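With that many repeated lines it helps to boil the log down to what is actually failing. A minimal sketch, assuming the kernel lines keep the exact "csum failed root N ino N" wording shown above; the function name btrfs_bad_inodes is mine, not something from btrfs-progs:

```shell
# Print each distinct "device inode" pair found in BTRFS "csum failed"
# kernel lines. Reads log text on stdin.
# Assumption: the message format matches the syslog lines quoted above.
btrfs_bad_inodes() {
    sed -n 's/.*BTRFS warning (device \([^)]*\)): csum failed root [0-9]* ino \([0-9]*\) .*/\1 \2/p' \
        | sort -u
}
```

Something like `btrfs_bad_inodes < /mnt/test2/var/log/syslog` (the log path is a guess at where the pen's syslog sits once mounted) would collapse the lines above into a single "sda2 322973".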
So move the USB pen back to the PC and start a BTRFS scrub with sudo btrfs scrub start /mnt/test2/. Then with:
> sudo btrfs scrub status /mnt/test2 -R
scrub status for 956245f3-b968-4a69-a1b0-54665461792e
scrub started at Wed Jul 18 17:48:46 2018 and finished after 00:03:51
data_extents_scrubbed: 132946
tree_extents_scrubbed: 20132
data_bytes_scrubbed: 4227706880
tree_bytes_scrubbed: 329842688
read_errors: 0
csum_errors: 2
verify_errors: 33
no_csum: 272
csum_discards: 0
super_errors: 0
malloc_errors: 0
uncorrectable_errors: 2
unverified_errors: 0
corrected_errors: 33
last_physical: 7927234560
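If you only care about one number from that wall of counters, a tiny filter does the job. This is just a sketch, assuming the "name: value" layout of the raw (-R) output above; scrub_counter is a made-up helper name:

```shell
# Read `btrfs scrub status -R` output on stdin and print the value of the
# counter named in $1 (for example uncorrectable_errors).
# Assumption: each counter sits on its own line as "name: value".
scrub_counter() {
    awk -v key="$1" '{ sub(/:$/, "", $1) } $1 == key { print $2 }'
}
```

Usage would look like `sudo btrfs scrub status /mnt/test2 -R | scrub_counter uncorrectable_errors`, which for the output above prints 2.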
I realized that I had a couple of uncorrectable errors. Is there a way to see which file has been corrupted (e.g. like with debugfs -R "ncheck $inode" $partition + ecryptfs-find)?
Fortunately it’s even easier!
dmesg | grep "checksum error at" | tail -4 | cut -d' ' -f24- | sed 's/.$//'
Found, guess where? On Unix & Linux!
Also guess who’s the culprit?
offset 32768, length 4096, links 1 (path: lib/arm-linux-gnueabihf/security/pam_unix.so
offset 36864, length 1876, links 1 (path: lib/arm-linux-gnueabihf/security/pam_unix.so
Yes, exactly him!
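Counting fields with cut works, but it breaks as soon as the kernel message grows or loses a word. A slightly sturdier sketch, assuming each "checksum error at" line ends with "(path: <file>)" as the output above suggests; btrfs_bad_paths is a hypothetical helper name:

```shell
# Print the distinct file paths named in BTRFS "checksum error at" kernel
# lines. Reads log text on stdin.
# Assumption: every such line ends with "(path: <file>)".
btrfs_bad_paths() {
    sed -n '/checksum error at/s/.*(path: \(.*\))$/\1/p' | sort -u
}
```

Run it as `dmesg | btrfs_bad_paths`. If the scrub messages have already scrolled out of dmesg, `sudo btrfs inspect-internal inode-resolve 322973 /mnt/test2` should map the inode number from syslog to a path as well.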
So now? Well, now that the problem has shown itself, it should be easy to fix. First: check which package this dynamic library belongs to (is this correct English, yes? 😆)
#> dlocate /lib/x86_64-linux-gnu/security/pam_unix.so
libpam-modules:amd64: /lib/x86_64-linux-gnu/security/pam_unix.so
Fine. Second: which version?
#> dpkg -l '*libpam-modules*'
Desired=Unknown/Install/Remove/Purge/Hold
| Status=Not/Inst/Conf-files/Unpacked/halF-conf/Half-inst/trig-aWait/Trig-pend
|/ Err?=(none)/Reinst-required (Status,Err: uppercase=bad)
||/ Name Version Architecture Description
+++-===============================================-============================-============================-===================================================================================================
ii libpam-modules:amd64 1.1.8-3.2ubuntu2.1 amd64 Pluggable Authentication Modules for PAM
ii libpam-modules-bin 1.1.8-3.2ubuntu2.1 amd64 Pluggable Authentication Modules for PAM - helper binaries
Excellent. Third: we need it for armhf, let’s try to find it… here it is! Actually, Google showed me the i386 version, but it’s enough to replace i386 with armhf in the URL.
Ok, copy link location, then
#> mkdir -p /tmp/libpam
#> cd /tmp/libpam
#> wget 'http://launchpadlibrarian.net/364006221/libpam-modules_1.1.8-3.2ubuntu2.1_armhf.deb'
#> dpkg-deb --extract libpam-modules_1.1.8-3.2ubuntu2.1_armhf.deb .
#> cd lib/arm-linux-gnueabihf/security
#> for i in *.so ; do echo "$i" ; md5sum "$i" "/mnt/test2/lib/arm-linux-gnueabihf/security/$i" ; done
Let’s take a look at the results. Ok, everything is identical, except… him.
9ff6356da6e8be0b0a4ea76a67dc9068 pam_umask.so
9ff6356da6e8be0b0a4ea76a67dc9068 /mnt/test2/lib/arm-linux-gnueabihf/security/pam_umask.so
pam_unix.so
6ccaaa1c7abf863e7df915cc7b31362d pam_unix.so
md5sum: /mnt/test2/lib/arm-linux-gnueabihf/security/pam_unix.so: Input/output error
pam_userdb.so
bf55dfb6db4adebfe77ea60fa11c5711 pam_userdb.so
bf55dfb6db4adebfe77ea60fa11c5711 /mnt/test2/lib/arm-linux-gnueabihf/security/pam_userdb.so
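Eyeballing pairs of md5 sums gets old quickly. Here is a sketch of the same check that prints only the troublemakers; note that it swaps md5sum for cmp (a byte-by-byte comparison), and compare_modules is a name I made up:

```shell
# For every *.so in the reference directory ($1), print its name if the file
# with the same name in the target directory ($2) differs, is missing, or
# cannot be read. Identical files produce no output.
compare_modules() {
    ref_dir=$1
    target_dir=$2
    for ref in "$ref_dir"/*.so; do
        [ -e "$ref" ] || continue      # the glob matched nothing
        name=${ref##*/}                # strip the directory part
        # cmp -s is silent; any non-zero status (difference or I/O error)
        # marks the file as suspect.
        cmp -s "$ref" "$target_dir/$name" || echo "$name"
    done
}
```

From /tmp/libpam/lib/arm-linux-gnueabihf/security this would be called as `compare_modules . /mnt/test2/lib/arm-linux-gnueabihf/security`. A file that fails with an Input/output error, like the one above, ends up in the list just like a file whose content differs.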
Let’s go on and replace it with a fresh copy.
#> sudo rm /mnt/test2/lib/arm-linux-gnueabihf/security/pam_unix.so
#> sudo cp pam_unix.so /mnt/test2/lib/arm-linux-gnueabihf/security/
#> sync
#> sudo umount /mnt/test2
D’oh, forgot to uncomment UsePAM yes in /mnt/test2/etc/ssh/sshd_config.
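To avoid this kind of round trip, the setting can be checked before unmounting the pen. A rough sketch with a hypothetical helper name, usepam_enabled; it only looks for an uncommented "UsePAM yes" line and does not try to reproduce how sshd resolves duplicate or Match-scoped settings:

```shell
# Succeed (exit status 0) if the sshd_config given as $1 contains an active,
# uncommented "UsePAM yes" line; fail otherwise.
# Assumption: a plain line-based check is good enough for this file.
usepam_enabled() {
    grep -Eiq '^[[:space:]]*UsePAM[[:space:]]+yes[[:space:]]*$' "$1"
}
```

For example: `usepam_enabled /mnt/test2/etc/ssh/sshd_config || echo "UsePAM is still off"`.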
But, eventually…
#> pelan-drone
Welcome to Ubuntu 16.04.4 LTS (GNU/Linux 4.14.18-v7+ armv7l)
* Documentation: https://help.ubuntu.com
* Management: https://landscape.canonical.com
* Support: https://ubuntu.com/advantage
Get cloud support with Ubuntu Advantage Cloud Guest:
http://www.ubuntu.com/business/services/cloud
77 packages can be updated.
35 updates are security updates.
Last login: Wed Jul 18 15:46:31 2018 from 10.17.0.155
The hooman wins again! 😆
But it’s also true that I need to find a slower but safer way to shut down the distro than simply unplugging the batteries.