Monday, February 23, 2009

BART on large file systems

I wanted to use bart(1M) to quickly compare the contents of two file systems. But it didn't work...

# bart create /some/filesystem/
! Version 1.0
! Monday, February 23, 2009 (11:53:57)
# Format:
#fname D size mode acl dirmtime uid gid
#fname P size mode acl mtime uid gid
#fname S size mode acl mtime uid gid
#fname F size mode acl mtime uid gid contents
#fname L size mode acl lnmtime uid gid dest
#fname B size mode acl mtime uid gid devnode
#fname C size mode acl mtime uid gid devnode
#

And it simply exits.

# truss bart create /some/filesystem/
[...]
statvfs("/some/filesystem", 0x08086E48) Err#79 EOVERFLOW
[...]
#

It should probably be statvfs64(), but let's check what's going on.

# cat statvfs.d
#!/usr/sbin/dtrace -Fs

syscall::statvfs:entry
/execname == "bart"/
{
        self->on = 1;
}

syscall::statvfs:return
/self->on/
{
        self->on = 0;
}

fbt:::entry
/self->on/
{
}

fbt:::return
/self->on/
{
        trace(arg1);
}

# ./statvfs.d >/tmp/a &
# bart create /some/filesystem/
# cat /tmp/a
CPU FUNCTION
2 -> statvfs32
[...]
2 -> cstatvfs32
2 -> fsop_statfs
[...]
2 <- fsop_statfs 0
2 <- cstatvfs32 79
[...]
2 <- statvfs32 79
#

cstatvfs32 returns 79, which is the EOVERFLOW that truss reported. It should have used statvfs64(), which in turn should have used cstat64_32().

Seems like /usr/bin/bart is a 32-bit binary compiled without largefile(5) awareness. A bug has already been opened for exactly this case: 6436517.
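To make the overflow more concrete, here is a rough sketch of the arithmetic involved. It is not the kernel's actual code. It assumes that the 32-bit statvfs structure keeps its block counts in 32-bit unsigned fields and that the file system reports its size in 512-byte units; the 512 is an assumption, so check f_frsize on your own system. The helper name fits_statvfs32 is made up for illustration.

```shell
#!/bin/sh
# Sketch: would a file system's block count fit in a 32-bit statvfs field?
# fits_statvfs32 is a hypothetical helper, not part of any real tool.
# Usage: fits_statvfs32 SIZE_BYTES [FRSIZE_BYTES]
fits_statvfs32() (
    size_bytes=$1
    frsize=${2:-512}        # ASSUMPTION: size is reported in 512-byte units
    max32=4294967295        # largest value a 32-bit unsigned field can hold
    blocks=$(( size_bytes / frsize ))
    if [ "$blocks" -le "$max32" ]; then
        echo "fits"
    else
        echo "overflow"
    fi
)

one_tb=$(( 1024 * 1024 * 1024 * 1024 ))
fits_statvfs32 "$one_tb"              # prints "fits"
fits_statvfs32 $(( 10 * one_tb ))     # prints "overflow"
```

Under these assumptions, anything of 2TB or more no longer fits, which would explain why a call that is not largefile-aware gives up with EOVERFLOW on a big file system.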

While DTrace wasn't really necessary here, it helped me see very quickly what was actually going on in the kernel and why it fails, especially combined with a glance at the source thanks to OS OpenGrok. It helped to find the bug.

PS. The workaround in my case is to temporarily set a zfs/fs quota on /some/filesystem to 1TB, and then it works.
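A minimal sketch of that workaround as a script, under a few assumptions: the dataset name pool/some/filesystem is a placeholder, the dataset had no quota before (so it is reset to none afterwards), and the helper names run and bart_with_quota are made up for illustration. By default it only prints what it would do; set DRY_RUN=0 to actually run the commands.

```shell
#!/bin/sh
# Sketch of the temporary-quota workaround; names below are placeholders.
# With DRY_RUN=1 (the default) commands are printed instead of executed.
DRY_RUN=${DRY_RUN:-1}

run() {
    if [ "$DRY_RUN" = "1" ]; then
        echo "would run: $*"
    else
        "$@"
    fi
}

# Usage: bart_with_quota DATASET MOUNTPOINT
bart_with_quota() {
    dataset=$1      # hypothetical, e.g. pool/some/filesystem
    mnt=$2          # e.g. /some/filesystem
    run zfs set quota=1T "$dataset"      # shrink the size the fs reports
    run bart create "$mnt"               # manifest is written to stdout
    run zfs set quota=none "$dataset"    # ASSUMPTION: no quota was set before
}

bart_with_quota pool/some/filesystem /some/filesystem
```

If the dataset already has a quota, note its value first (for example with zfs get quota) and restore that instead of none.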
