NFS doesn't hate you

sudo nfstrace

This is the most useful tool ever to debug what’s happening between server and client. And believe me, I’ve spent whole precious days of my life debugging what the hell was wrong between a whole range of NFS servers and clients.

You can install it easily with just one command:

sudo apt install nfstrace

In my most recent case, I was trying to boot a RPI3 via zstd-compressed BTRFS filesystem (on external HDD, inside image, lol) mounted in a shared directory over NFSv4. Hearing from this guy here and then looking at the official NFSroot documentation it’s a matter of half an hour, right? What could go wrong? :D

A lot went wrong and NFS showed its best off, pouring out obscure and alien messages and opening up an endless pit of pain and unanswered Stackoverflow questions.

Begin: Retrying nfs mount ... short read: 24 < 28
NFS over TCP not available from [Server IP address]
pmap_getmaps.c: rpc problem: RPC: Authentication error; why = Client credential too weak
mount: RPC: Remote system error - Connection refused

Uh, please note that before trying this NFS boot thing (already working since many years on the RPI2 in my house), I diligently configured the NFS share, tested it on two machines (including the same RPI3 booted via BTRFS) and so on. I did my homework, but it wasn’t enough.

Ok, let’s start with the first two errors1: Its obviously a problem of client-side NFS version, right? After all we’re talking about a puny initramfs on the puniest of the Raspberry PIs, obviously some module will be missing, mount chose NFSv3 instead of v4, something like that, right?

Wrong. It seems that everything the kernel needs to mount a NFS share is mount.nfs (or mount.nfs4) and nfs.ko, full stop. The whole world of module you usually see doing:

#> lsmod | grep nfs
nfsv3                  49152  0
nfsv4                 659456  2
nfs                   299008  5 nfsv4,nfsv3
fscache                69632  2 nfsv4,nfs
nfsd                  401408  13
auth_rpcgss            73728  2 nfsd,rpcsec_gss_krb5
nfs_acl                16384  2 nfsd,nfsv3
lockd                 110592  3 nfsd,nfs,nfsv3
grace                  16384  2 nfsd,lockd
sunrpc                397312  30 auth_rpcgss,nfsd,rpcsec_gss_krb5,nfsv4,nfs_acl,lockd,nfs,nfsv3

is server-side and is not required (and not present) for a freshly installed client, let alone inside the puny initramfs on the puniest of the Raspberry PIs. As always happens, Kconfigs come in hand here.

Luckily enough, I discovered the existence of nfstrace just before throwing everything out of the window2.

This is a successful mount operation issued by the RPI3 when booted from USB instead of NFS.

Detect session 10.17.220.126:703 --> 10.17.0.155:2049 [TCP]
10.17.220.126:703 --> 10.17.0.155:2049 [TCP]NULL
        CALL  []
        REPLY []
10.17.220.126:703 --> 10.17.0.155:2049 [TCP]COMPOUND
        CALL  [ operations: 1 tag: void minor version: 0
                [ SETCLIENTID(35) [ verifier: 0x00000000398f13c4 client id: Linux NFSv4.0 10.17.220.126/10.17.0.155 tcp callback: program: 40000000 location: netid: tcp addr: 10.17.220.126.157.69 callback ident: 1 ]  ]
        REPLY [  operations: 1
                [ SETCLIENTID(35) [ status: OK ]  ]
10.17.220.126:703 --> 10.17.0.155:2049 [TCP]COMPOUND
        CALL  [ operations: 1 tag: void minor version: 0
                [ SETCLIENTID_CONFIRM(36) [  client id: cc578d5a4603cbaf verifier: 0x29638d5ae33dd073 ]  ]
        REPLY [  operations: 1
                [ SETCLIENTID_CONFIRM(36) [ status: OK ]  ]
10.17.220.126:703 --> 10.17.0.155:2049 [TCP]COMPOUND
        CALL  [ operations: 3 tag: void minor version: 0
                [ PUTROOTFH(24) [  ] ] 
                [ GETFH(10) [  ] ] 
                [ GETATTR(9) [ mask: 0x10011ab0a23a TYPE CHANGE SIZE FSID FILEID MODE NUMLINKS OWNER OWNER_GROUP RAWDEV SPACE_USED TIME_ACCESS TIME_METADATA TIME_MODIFY MOUNTED_ON_FILEID ]  ]
        REPLY [  operations: 3
                [ PUTROOTFH(24) [ status: OK ] 
                [ GETFH(10) [ status: OK ] 
                [ GETATTR(9) [ status: OK ]  ]
10.17.220.126:703 --> 10.17.0.155:2049 [TCP]COMPOUND
        CALL  [ operations: 2 tag: void minor version: 0
                [ PUTFH(22) [ object: 0100010000000000 ] 
                [ GETATTR(9) [ mask: 0x2065 SUPPORTED_ATTRS FH_EXPIRE_TYPE LINK_SUPPORT SYMLINK_SUPPORT ACLSUPPORT ]  ]
        REPLY [  operations: 2
                [ PUTFH(22) [ status: OK ] 
                [ GETATTR(9) [ status: OK ]  ]
10.17.220.126:703 --> 10.17.0.155:2049 [TCP]COMPOUND
        CALL  [ operations: 2 tag: void minor version: 0
                [ PUTFH(22) [ object: 0100010000000000 ] 
                [ GETATTR(9) [ mask: 0xc800040080000 LEASE_TIME MAXFILESIZE MAXREAD MAXWRITE TIME_DELTA ]  ]
        REPLY [  operations: 2
                [ PUTFH(22) [ status: OK ] 
                [ GETATTR(9) [ status: OK ]  ]
10.17.220.126:703 --> 10.17.0.155:2049 [TCP]COMPOUND
        CALL  [ operations: 2 tag: void minor version: 0
                [ PUTFH(22) [ object: 0100010000000000 ] 
                [ GETATTR(9) [ mask: 0x2065 SUPPORTED_ATTRS FH_EXPIRE_TYPE LINK_SUPPORT SYMLINK_SUPPORT ACLSUPPORT ]  ]
        REPLY [  operations: 2
                [ PUTFH(22) [ status: OK ] 
                [ GETATTR(9) [ status: OK ]  ]
10.17.220.126:703 --> 10.17.0.155:2049 [TCP]COMPOUND
        CALL  [ operations: 2 tag: void minor version: 0
                [ PUTFH(22) [ object: 0100010000000000 ] 
                [ GETATTR(9) [ mask: 0xc800040080000 LEASE_TIME MAXFILESIZE MAXREAD MAXWRITE TIME_DELTA ]  ]
        REPLY [  operations: 2
                [ PUTFH(22) [ status: OK ] 
                [ GETATTR(9) [ status: OK ]  ]
10.17.220.126:703 --> 10.17.0.155:2049 [TCP]COMPOUND
        CALL  [ operations: 2 tag: void minor version: 0
                [ PUTFH(22) [ object: 0100010000000000 ] 
                [ GETATTR(9) [ mask: 0x30000000 MAXLINK MAXNAME ]  ]
        REPLY [  operations: 2
                [ PUTFH(22) [ status: OK ] 
                [ GETATTR(9) [ status: OK ]  ]
10.17.220.126:703 --> 10.17.0.155:2049 [TCP]COMPOUND
        CALL  [ operations: 2 tag: void minor version: 0
                [ PUTFH(22) [ object: 0100010000000000 ] 
                [ GETATTR(9) [ mask: 0x2065 SUPPORTED_ATTRS FH_EXPIRE_TYPE LINK_SUPPORT SYMLINK_SUPPORT ACLSUPPORT ]  ]
        REPLY [  operations: 2
                [ PUTFH(22) [ status: OK ] 
                [ GETATTR(9) [ status: OK ]  ]
10.17.220.126:703 --> 10.17.0.155:2049 [TCP]COMPOUND
        CALL  [ operations: 2 tag: void minor version: 0
                [ PUTFH(22) [ object: 0100010000000000 ] 
                [ GETATTR(9) [ mask: 0x10011ab0a23a TYPE CHANGE SIZE FSID FILEID MODE NUMLINKS OWNER OWNER_GROUP RAWDEV SPACE_USED TIME_ACCESS TIME_METADATA TIME_MODIFY MOUNTED_ON_FILEID ]  ]
        REPLY [  operations: 2
                [ PUTFH(22) [ status: OK ] 
                [ GETATTR(9) [ status: OK ]  ]
10.17.220.126:703 --> 10.17.0.155:2049 [TCP]COMPOUND
        CALL  [ operations: 3 tag: void minor version: 0
                [ PUTFH(22) [ object: 0100010000000000 ] 
                [ ACCESS(3) [ READ LOOKUP MODIFY EXTEND DELETE  ] 
                [ GETATTR(9) [ mask: 0x18300000 CHANGE SIZE TIME_METADATA TIME_MODIFY ]  ]
        REPLY [  operations: 3
                [ PUTFH(22) [ status: OK ] 
                [ ACCESS(3) [ status: OK ] 
                [ GETATTR(9) [ status: OK ]  ]
10.17.220.126:703 --> 10.17.0.155:2049 [TCP]COMPOUND
        CALL  [ operations: 4 tag: void minor version: 0
                [ PUTFH(22) [ object: 0100010000000000 ] 
                [ LOOKUP(15) [ object name: tmp ] 
                [ GETFH(10) [  ] ] 
                [ GETATTR(9) [ mask: 0x10011ab0a23a TYPE CHANGE SIZE FSID FILEID MODE NUMLINKS OWNER OWNER_GROUP RAWDEV SPACE_USED TIME_ACCESS TIME_METADATA TIME_MODIFY MOUNTED_ON_FILEID ]  ]
        REPLY [  operations: 4
                [ PUTFH(22) [ status: OK ] 
                [ LOOKUP(15) [ status: OK ] 
                [ GETFH(10) [ status: OK ] 
                [ GETATTR(9) [ status: OK ]  ]
10.17.220.126:703 --> 10.17.0.155:2049 [TCP]COMPOUND
        CALL  [ operations: 4 tag: void minor version: 0
                [ PUTFH(22) [ object: 0100010000000000 ] 
                [ LOOKUP(15) [ object name: tmp ] 
                [ GETFH(10) [  ] ] 
                [ GETATTR(9) [ mask: 0x10011ab0a23a TYPE CHANGE SIZE FSID FILEID MODE NUMLINKS OWNER OWNER_GROUP RAWDEV SPACE_USED TIME_ACCESS TIME_METADATA TIME_MODIFY MOUNTED_ON_FILEID ]  ]
        REPLY [  operations: 4
                [ PUTFH(22) [ status: OK ] 
                [ LOOKUP(15) [ status: OK ] 
                [ GETFH(10) [ status: OK ] 
                [ GETATTR(9) [ status: OK ]  ]
10.17.220.126:703 --> 10.17.0.155:2049 [TCP]COMPOUND
        CALL  [ operations: 2 tag: void minor version: 0
                [ PUTFH(22) [ object: 01000700...741f0b48 ] 
                [ GETATTR(9) [ mask: 0x2065 SUPPORTED_ATTRS FH_EXPIRE_TYPE LINK_SUPPORT SYMLINK_SUPPORT ACLSUPPORT ]  ]
        REPLY [  operations: 2
                [ PUTFH(22) [ status: OK ] 
                [ GETATTR(9) [ status: OK ]  ]
10.17.220.126:703 --> 10.17.0.155:2049 [TCP]COMPOUND
        CALL  [ operations: 2 tag: void minor version: 0
                [ PUTFH(22) [ object: 01000700...741f0b48 ] 
                [ GETATTR(9) [ mask: 0xc800040080000 LEASE_TIME MAXFILESIZE MAXREAD MAXWRITE TIME_DELTA ]  ]
        REPLY [  operations: 2
                [ PUTFH(22) [ status: OK ] 
                [ GETATTR(9) [ status: OK ]  ]
10.17.220.126:703 --> 10.17.0.155:2049 [TCP]COMPOUND
        CALL  [ operations: 2 tag: void minor version: 0
                [ PUTFH(22) [ object: 01000700...741f0b48 ] 
                [ GETATTR(9) [ mask: 0x30000000 MAXLINK MAXNAME ]  ]
        REPLY [  operations: 2
                [ PUTFH(22) [ status: OK ] 
                [ GETATTR(9) [ status: OK ]  ]
10.17.220.126:703 --> 10.17.0.155:2049 [TCP]COMPOUND
        CALL  [ operations: 2 tag: void minor version: 0
                [ PUTFH(22) [ object: 01000700...741f0b48 ] 
                [ GETATTR(9) [ mask: 0x2065 SUPPORTED_ATTRS FH_EXPIRE_TYPE LINK_SUPPORT SYMLINK_SUPPORT ACLSUPPORT ]  ]
        REPLY [  operations: 2
                [ PUTFH(22) [ status: OK ] 
                [ GETATTR(9) [ status: OK ]  ]
10.17.220.126:703 --> 10.17.0.155:2049 [TCP]COMPOUND
        CALL  [ operations: 2 tag: void minor version: 0
                [ PUTFH(22) [ object: 01000700...741f0b48 ] 
                [ GETATTR(9) [ mask: 0x10011ab0a23a TYPE CHANGE SIZE FSID FILEID MODE NUMLINKS OWNER OWNER_GROUP RAWDEV SPACE_USED TIME_ACCESS TIME_METADATA TIME_MODIFY MOUNTED_ON_FILEID ]  ]
        REPLY [  operations: 2
                [ PUTFH(22) [ status: OK ] 
                [ GETATTR(9) [ status: OK ]  ]
10.17.220.126:703 --> 10.17.0.155:2049 [TCP]COMPOUND
        CALL  [ operations: 3 tag: void minor version: 0
                [ PUTFH(22) [ object: 01000700...741f0b48 ] 
                [ ACCESS(3) [ READ LOOKUP MODIFY EXTEND DELETE  ] 
                [ GETATTR(9) [ mask: 0x18300000 CHANGE SIZE TIME_METADATA TIME_MODIFY ]  ]
        REPLY [  operations: 3
                [ PUTFH(22) [ status: OK ] 
                [ ACCESS(3) [ status: OK ] 
                [ GETATTR(9) [ status: OK ]  ]
10.17.220.126:703 --> 10.17.0.155:2049 [TCP]COMPOUND
        CALL  [ operations: 4 tag: void minor version: 0
                [ PUTFH(22) [ object: 01000700...741f0b48 ] 
                [ LOOKUP(15) [ object name: usbl-data ] 
                [ GETFH(10) [  ] ] 
                [ GETATTR(9) [ mask: 0x10011ab0a23a TYPE CHANGE SIZE FSID FILEID MODE NUMLINKS OWNER OWNER_GROUP RAWDEV SPACE_USED TIME_ACCESS TIME_METADATA TIME_MODIFY MOUNTED_ON_FILEID ]  ]
        REPLY [  operations: 4
                [ PUTFH(22) [ status: OK ] 
                [ LOOKUP(15) [ status: OK ] 
                [ GETFH(10) [ status: OK ] 
                [ GETATTR(9) [ status: OK ]  ]



10.17.220.126:703 --> 10.17.0.155:2049 [TCP]COMPOUND
        CALL  [ operations: 1 tag: void minor version: 0
                [ RENEW(30) [ client id: cc578d5a4603cbaf ]  ]
        REPLY [  operations: 1
                [ RENEW(30) [ status: OK ]  ]

This is an unsuccessful mount operation issued by the initramfs of the RPI3 when trying to boot with nfsroot. Can you spot the seven small differences? :)

Detect session 10.17.220.126:834 --> 10.17.0.155:2049 [TCP]
10.17.220.126:834 --> 10.17.0.155:2049 [TCP]NULL
	CALL  []
	REPLY []
Detect session 10.17.220.126:957 --> 10.17.0.155:2049 [TCP]
10.17.220.126:957 --> 10.17.0.155:2049 [TCP]NULL
	CALL  []
	REPLY []
10.17.220.126:957 --> 10.17.0.155:2049 [TCP]FSINFO
	CALL  [ fsroot: 01000700...741f0b48 ]
	REPLY [ status: OK ]
10.17.220.126:957 --> 10.17.0.155:2049 [TCP]GETATTR
	CALL  [ object: 01000700...741f0b48 ]
	REPLY [ status: OK ]
10.17.220.126:957 --> 10.17.0.155:2049 [TCP]FSINFO
	CALL  [ fsroot: 01000700...741f0b48 ]
	REPLY [ status: OK ]
10.17.220.126:957 --> 10.17.0.155:2049 [TCP]GETATTR
	CALL  [ object: 01000700...741f0b48 ]
	REPLY [ status: OK ]
10.17.220.126:957 --> 10.17.0.155:2049 [TCP]ACCESS
	CALL  [ object: 01000700...741f0b48 access: READ LOOKUP MODIFY EXTEND DELETE  ]
	REPLY [ status: OK ]
10.17.220.126:957 --> 10.17.0.155:2049 [TCP]LOOKUP
	CALL  [ what:  dir: 01000700...741f0b48 name: etc ]
	REPLY [ status: ERR_NOENT ]
10.17.220.126:957 --> 10.17.0.155:2049 [TCP]LOOKUP
	CALL  [ what:  dir: 01000700...741f0b48 name: dev ]
	REPLY [ status: ERR_NOENT ]
10.17.220.126:957 --> 10.17.0.155:2049 [TCP]LOOKUP
	CALL  [ what:  dir: 01000700...741f0b48 name: run ]
	REPLY [ status: ERR_NOENT ]

Fast forward to the end of the story… I found the final solution to the problem here.

Unbelievably, my private network used for testing purposes (10.17.0.0/16) was not in /etc/hosts.allow, but this was not a problem mounting from a fully booted Ubuntu, it was only a problem when mounting from inside an initramfs.

#> cat /etc/hosts.allow

# /etc/hosts.allow: list of hosts that are allowed to access the system.
#                   See the manual pages hosts_access(5) and hosts_options(5).
#
# Example:    ALL: LOCAL @some_netgroup
#             ALL: .foobar.edu EXCEPT terminalserver.foobar.edu
#
# If you're going to protect the portmapper use the name "portmap" for the
# daemon name. Remember that you can only use the keyword "ALL" and IP
# addresses (NOT host or domain names) for the portmapper, as well as for
# rpc.mountd (the NFS mount daemon). See portmap(8) and rpc.mountd(8)
# for further information.
#
rpcbind mountd nfsd statd lockd rquotad : 127.0.0.1 192.168.1.0/24

So, /etc/hosts.deny was acting and denying all the RPC requests coming from that network.

#> cat /etc/hosts.deny
# /etc/hosts.deny: list of hosts that are _not_ allowed to access the system.
#                  See the manual pages hosts_access(5) and hosts_options(5).
#
# Example:    ALL: some.host.name, .some.domain
#             ALL EXCEPT in.fingerd: other.host.name, .other.domain
#
# If you're going to protect the portmapper use the name "portmap" for the
# daemon name. Remember that you can only use the keyword "ALL" and IP
# addresses (NOT host or domain names) for the portmapper, as well as for
# rpc.mountd (the NFS mount daemon). See portmap(8) and rpc.mountd(8)
# for further information.
#
# The PARANOID wildcard matches any host whose name does not match its
# address.
#
# You may wish to enable this to ensure any programs that don't
# validate looked up hostnames still leave understandable logs. In past
# versions of Debian this has been the default.
# ALL: PARANOID
rpcbind mountd nfsd statd lockd rquotad : ALL

The solution was at hand also here, at the fifth result on Google searching for

pmap_getmaps rpc problem rpc authentication error why = client credential too weak

But, hey, would you even trust a website called ca.com?

And finally, this is the result! Maybe you can’t see it but this baby here now runs a BTRFS filesystem via NFSv4 inside a loopback-mounted image on my desktop SSD!

And that one small there on the left is a Teensy 3.2! Say hello Teensy!

  1. Ehm, also note that looking for “short read: 24 < 28” on Google returns just a couple of results, but the first one contains the exact solution to this problem! So it’s just me that is blind! 

  2. Yes, I already tried to read NFS logs at the maximum debug level, and I found them… say… a bit obscure…