An overview of bcachefs debugging facilities
Everything about the internal operation of the system should be easily visible at runtime, via sysfs, debugfs or tracepoints. If you notice something that isn't sufficiently visible, please file a bug.
If something goes wonky or is behaving unexpectedly, there should be enough information readily and easily available at runtime to understand what bcachefs is doing and why.
Also, when an error occurs, the error message should print out all the relevant information we have: enough for the issue to be debugged without hunting for more.
And if something goes really wrong and fsck isn't able to recover, there should be tooling for working with the developers to get that fixed, too.
Runtime facilities
For inspection of a running bcachefs filesystem, including questions like "what is my filesystem doing and why?", we have:
sysfs:
/sys/fs/bcachefs/<uuid>/
Here we've got basic information about the filesystem and member devices. There's also an options directory, which allows filesystem options to be set and queried at runtime; time_stats, with statistics on various events we track latency for; and an internal directory with additional debug info.
debugfs:
/sys/kernel/debug/bcachefs/<uuid>/
Debugfs also shows the full contents of every btree - all metadata is a key in a btree, so this means all filesystem metadata is inspectable here. There are additional per-btree files that show other useful btree information (how full btree nodes are, bkey packing statistics, etc.).
tracepoints and counters
In addition to the usual tracepoints, we keep persistent counters for every tracepoint event, so that it's possible to see whether slowpath events have been occurring even if tracing was not previously enabled.
/sys/fs/bcachefs/<uuid>/counters shows, for every event, the number of events since filesystem creation and since mount.
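As a rough illustration of reading these counters from a script, here is a minimal Python sketch. It rests on assumptions: the text above does not say whether counters is a single file or a directory of per-event files, so the hypothetical helper read_counters handles both, and it returns the raw text unparsed because the exact output format is not specified here.

```python
from pathlib import Path


def read_counters(fs_dir):
    """Return {name: raw text} for the counters entry of one filesystem.

    fs_dir is the per-filesystem directory, /sys/fs/bcachefs/<uuid>.
    Assumption: 'counters' may be either a single file or a directory
    holding one file per event; both layouts are handled.
    The text is returned unparsed because its format is not assumed here.
    """
    entry = Path(fs_dir) / "counters"
    if entry.is_dir():
        return {p.name: p.read_text()
                for p in sorted(entry.iterdir()) if p.is_file()}
    if entry.is_file():
        return {"counters": entry.read_text()}
    # Nothing found: wrong path, or the filesystem is not mounted.
    return {}
```

Calling read_counters("/sys/fs/bcachefs/<uuid>") twice, some time apart, and comparing the results is one way to spot which events fired in between.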
Hints on where to get started
Is something spinning? Does the system appear to be trying to get work done, without getting anything done?
Check top: this shows CPU usage by thread - is something spinning?
Check perf top: this shows CPU usage, broken out by function/module - what code is spinning?
Check perf top -e 'bcachefs:*': this shows counters for all bcachefs events - are we hitting a rare or slowpath event?
Is everything stuck?
Check btree_transactions in debugfs - /sys/kernel/debug/bcachefs/<uuid>/btree_transactions; other files there may also be relevant.
Is something stuck?
Check sysfs dev-0/alloc_debug: this shows various internal allocator state - perhaps the allocator is stuck?
Something funny with rebalance/background data tasks?
Check sysfs internal/rebalance_work and internal/moving_ctxts.
All of this stuff could use reorganizing and expanding, of course.
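The files named in these hints can be gathered in one pass, for example to attach to a bug report. The following is only a sketch: collect_report and its parameters are hypothetical names, the file list is taken from the hints above, and dev-0 is used simply because it is the device named there.

```python
from pathlib import Path

# Files named in the hints, relative to the per-filesystem directories.
# dev-0 is the device named in the text; adjust for other member devices.
SYSFS_FILES = ["dev-0/alloc_debug",
               "internal/rebalance_work",
               "internal/moving_ctxts"]
DEBUGFS_FILES = ["btree_transactions"]


def collect_report(uuid,
                   sysfs_root="/sys/fs/bcachefs",
                   debugfs_root="/sys/kernel/debug/bcachefs"):
    """Concatenate the hinted files for one filesystem into a single string.

    Missing or unreadable files are noted instead of raising, on the
    assumption that the available files can differ between kernel versions.
    """
    sections = []
    targets = [(Path(sysfs_root) / uuid, SYSFS_FILES),
               (Path(debugfs_root) / uuid, DEBUGFS_FILES)]
    for base, names in targets:
        for name in names:
            path = base / name
            try:
                body = path.read_text()
            except OSError as err:
                body = f"<unavailable: {err.strerror}>\n"
            sections.append(f"==> {path} <==\n{body}")
    return "\n".join(sections)
```

Reading debugfs normally requires root, so a script like this would typically be run with elevated privileges.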
Offline filesystem inspection
The bcachefs list subcommand lists the contents of the btrees - extents, inodes, dirents, and more.
The bcachefs list_journal subcommand lists the contents of the journal. This can be used to discover what operation caused an error (e.g. one reported by fsck) by searching for the transaction that last updated the affected key(s).
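One way to do that search is to filter the journal listing for the key in question and keep the last hit. The sketch below is illustrative only: grep_with_context and search_journal are hypothetical helpers, the device is assumed to be passed as a positional argument, and nothing is assumed about the output layout beyond it being plain text, so the amount of leading context is a guess you may need to tune.

```python
import subprocess


def grep_with_context(lines, needle, before=20):
    """Yield (line_number, chunk) for each line containing needle.

    chunk is the matching line plus up to `before` preceding lines, so
    that the enclosing journal entry is likely to be visible.
    """
    for i, line in enumerate(lines):
        if needle in line:
            yield i + 1, lines[max(0, i - before): i + 1]


def search_journal(device, needle, before=20):
    """Run `bcachefs list_journal <device>` and return the last match.

    Assumption: the device is a positional argument and the subcommand
    prints plain text to stdout. Returns None if nothing matches.
    """
    out = subprocess.run(["bcachefs", "list_journal", device],
                         capture_output=True, text=True, check=True).stdout
    matches = list(grep_with_context(out.splitlines(), needle, before))
    # The last match corresponds to the most recent update of the key.
    return matches[-1] if matches else None
```

Here needle would be whatever identifies the key in the error message, copied as printed.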
Unrepairable filesystem debugging
If there's an issue that fsck can't fix, use the bcachefs dump subcommand, and then magic wormhole, to send your filesystem metadata to the developers.
For the developer
Internally, bcachefs uses printbufs for formatting text in a generic and structured way, and we try to write to_text() functions for as many types as possible.
Having to_text() functions for all the relevant types makes it much easier to write good error messages and to add new debug tools to sysfs/debugfs.
Try to keep up with and extend this approach when working with the code.