Kent Overstreet [Tue, 9 Jun 2020 19:59:03 +0000 (15:59 -0400)]
bcachefs: btree_update_nodes_written() requires alloc reserve
Also, in the btree_update_start() path, if we already have a journal
pre-reservation we don't want to take another - that's a deadlock.
Signed-off-by: Kent Overstreet <kent.overstreet@gmail.com>
Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>
Kent Overstreet [Fri, 5 Jun 2020 13:01:23 +0000 (09:01 -0400)]
bcachefs: Check gfp_flags correctly in bch2_btree_cache_scan()
bch2_btree_node_mem_alloc() uses memalloc_nofs_save()/GFP_NOFS, but
GFP_NOFS does include __GFP_IO - oops. We used to use GFP_NOIO, but as
we're a filesystem now GFP_NOFS makes more sense now and is looser.
Signed-off-by: Kent Overstreet <kent.overstreet@gmail.com>
Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>
Kent Overstreet [Mon, 8 Jun 2020 18:28:16 +0000 (14:28 -0400)]
bcachefs: Call bch2_btree_iter_traverse() if necessary in commit path
Signed-off-by: Kent Overstreet <kent.overstreet@gmail.com>
Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>
Kent Overstreet [Mon, 8 Jun 2020 17:26:48 +0000 (13:26 -0400)]
bcachefs: bch2_trans_downgrade()
bch2_btree_iter_downgrade() was looping over all iterators in a
transaction; bch2_trans_downgrade() should be doing that.
Signed-off-by: Kent Overstreet <kent.overstreet@gmail.com>
Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>
Kent Overstreet [Thu, 4 Jun 2020 03:47:50 +0000 (23:47 -0400)]
bcachefs: Improve warning for copygc failing to move data
This will help narrow down which code is at fault when this happens.
Signed-off-by: Kent Overstreet <kent.overstreet@gmail.com>
Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>
Kent Overstreet [Thu, 4 Jun 2020 03:46:15 +0000 (23:46 -0400)]
bcachefs: Always increment bucket gen on bucket reuse
Not doing so confuses copygc
Signed-off-by: Kent Overstreet <kent.overstreet@gmail.com>
Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>
Kent Overstreet [Thu, 4 Jun 2020 02:11:10 +0000 (22:11 -0400)]
bcachefs: Kill old allocator startup code
It's not needed anymore since we can now write to buckets before
updating the alloc btree.
Signed-off-by: Kent Overstreet <kent.overstreet@gmail.com>
Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>
Kent Overstreet [Wed, 3 Jun 2020 22:27:07 +0000 (18:27 -0400)]
bcachefs: Improve assorted error messages
This also consolidates the various checks in bch2_mark_pointer() and
bch2_trans_mark_pointer().
Signed-off-by: Kent Overstreet <kent.overstreet@gmail.com>
Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>
Kent Overstreet [Tue, 2 Jun 2020 23:41:47 +0000 (19:41 -0400)]
bcachefs: Fix a deadlock in bch2_btree_node_get_sibling()
There was a bad interaction with bch2_btree_iter_set_pos_same_leaf(),
which can leave a btree node locked that is just outside iter->pos,
breaking the lock ordering checks in __bch2_btree_node_lock(). Ideally
we should get rid of this corner case, but for now fix it locally with
verbose comments.
Signed-off-by: Kent Overstreet <kent.overstreet@gmail.com>
Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>
Kent Overstreet [Tue, 2 Jun 2020 20:36:11 +0000 (16:36 -0400)]
bcachefs: Add debug code to print btree transactions
Intented to help debug deadlocks, since we can't use lockdep to check
btree node lock ordering.
Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>
Kent Overstreet [Wed, 3 Jun 2020 20:20:22 +0000 (16:20 -0400)]
bcachefs: Set filesystem features earlier in fs init path
Before we were setting features after allocating btree nodes, which
meant we were using the old btree pointer format.
Signed-off-by: Kent Overstreet <kent.overstreet@gmail.com>
Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>
Kent Overstreet [Tue, 2 Jun 2020 20:30:54 +0000 (16:30 -0400)]
bcachefs: Add an option to disable reflink support
Reflink might be buggy, so we're adding an option so users can help
bisect what's going on.
Signed-off-by: Kent Overstreet <kent.overstreet@gmail.com>
Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>
Kent Overstreet [Thu, 28 May 2020 20:06:13 +0000 (16:06 -0400)]
bcachefs: Fixes for going RO
Now that interior btree updates are fully transactional, we don't need
to write out alloc info in a loop. However, interior btree updates do
put more things in the journal, so we still need a loop in the RO
sequence.
Signed-off-by: Kent Overstreet <kent.overstreet@gmail.com>
Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>
Kent Overstreet [Thu, 28 May 2020 19:51:50 +0000 (15:51 -0400)]
bcachefs: Don't require alloc btree to be updated before buckets are used
This is to break a circular dependency in the shutdown path.
Signed-off-by: Kent Overstreet <kent.overstreet@gmail.com>
Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>
Kent Overstreet [Thu, 28 May 2020 21:15:41 +0000 (17:15 -0400)]
bcachefs: fsck_error_lock requires GFP_NOFS
this fixes a lockdep splat
Signed-off-by: Kent Overstreet <kent.overstreet@gmail.com>
Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>
Kent Overstreet [Mon, 25 May 2020 18:57:06 +0000 (14:57 -0400)]
bcachefs: Interior btree updates are now fully transactional
We now update the alloc info (bucket sector counts) atomically with
journalling the update to the interior btree nodes, and we also set new
btree roots atomically with the journalled part of the btree update.
Signed-off-by: Kent Overstreet <kent.overstreet@gmail.com>
Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>
Kent Overstreet [Tue, 26 May 2020 00:35:53 +0000 (20:35 -0400)]
bcachefs: Factor out bch2_fs_btree_interior_update_init()
Signed-off-by: Kent Overstreet <kent.overstreet@gmail.com>
Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>
Kent Overstreet [Mon, 25 May 2020 23:29:48 +0000 (19:29 -0400)]
bcachefs: Add a mechanism for passing extra journal entries to bch2_trans_commit()
Signed-off-by: Kent Overstreet <kent.overstreet@gmail.com>
Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>
Kent Overstreet [Sun, 24 May 2020 18:06:10 +0000 (14:06 -0400)]
bcachefs: Fix reading of alloc info after unclean shutdown
When updates to interior nodes started being journalled, that meant that
after an unclean shutdown, until journal replay is done we can't walk
the btree without overlaying the updates from the journal.
The initial btree gc was changed to walk the btree overlaying keys from
the journal - but bch2_alloc_read() and bch2_stripes_read() were missed.
Major whoops...
Signed-off-by: Kent Overstreet <kent.overstreet@gmail.com>
Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>
Kent Overstreet [Wed, 27 May 2020 18:10:27 +0000 (14:10 -0400)]
bcachefs: fix memalloc_nofs_restore() usage
Signed-off-by: Kent Overstreet <kent.overstreet@gmail.com>
Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>
Kent Overstreet [Sun, 24 May 2020 18:20:00 +0000 (14:20 -0400)]
bcachefs: Better error messages on bucket sector count overflows
Signed-off-by: Kent Overstreet <kent.overstreet@gmail.com>
Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>
Kent Overstreet [Sun, 24 May 2020 17:37:44 +0000 (13:37 -0400)]
bcachefs: Be more rigorous about marking the filesystem clean
Previously, there was at least one error path where we could mark the
filesystem clean when we hadn't sucessfully written out alloc info.
Signed-off-by: Kent Overstreet <kent.overstreet@gmail.com>
Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>
Kent Overstreet [Tue, 26 May 2020 01:25:31 +0000 (21:25 -0400)]
bcachefs: Handle printing of null bkeys
This fixes a null ptr deref.
Signed-off-by: Kent Overstreet <kent.overstreet@gmail.com>
Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>
Kent Overstreet [Mon, 25 May 2020 22:47:21 +0000 (18:47 -0400)]
bcachefs: Add vmalloc fallback for decompress workspace
Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>
Kent Overstreet [Sat, 23 May 2020 15:44:12 +0000 (11:44 -0400)]
bcachefs: Print out d_type in dirent_to_text()
Signed-off-by: Kent Overstreet <kent.overstreet@gmail.com>
Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>
Yuxuan Shui [Fri, 22 May 2020 14:50:05 +0000 (15:50 +0100)]
bcachefs: fix stack corruption
When a bkey_on_stack is passed to bch_read_indirect_extent, there is no
guarantee that it will be big enough to hold the bkey. And
bch_read_indirect_extent is not aware of bkey_on_stack to call realloc
on it. This cause a stack corruption.
This commit makes bch_read_indirect_extent aware of bkey_on_stack so it
can call realloc when appropriate.
Tested-by: Yuxuan Shui <yshuiv7@gmail.com>
Signed-off-by: Yuxuan Shui <yshuiv7@gmail.com>
Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>
Kent Overstreet [Thu, 21 May 2020 21:23:40 +0000 (17:23 -0400)]
bcachefs: Wrap vmap() in memalloc_nofs_save()/restore()
vmalloc() and vmap() don't take GFP_NOFS - this should be pushed further
up the IO path, but for now just doing the simple fix.
Signed-off-by: Kent Overstreet <kent.overstreet@gmail.com>
Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>
Kent Overstreet [Fri, 15 May 2020 01:45:08 +0000 (21:45 -0400)]
bcachefs: Fix another iterator counting bug
We were marking the end of where we could insert incorrectly for
indirect extents.
Signed-off-by: Kent Overstreet <kent.overstreet@gmail.com>
Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>
Kent Overstreet [Wed, 13 May 2020 21:53:33 +0000 (17:53 -0400)]
bcachefs: Fix setquota
We were returning -EINTR because we were failing to retry the btree
transaction.
Signed-off-by: Kent Overstreet <kent.overstreet@gmail.com>
Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>
Kent Overstreet [Wed, 13 May 2020 04:15:28 +0000 (00:15 -0400)]
bcachefs: Fix a workqueue deadlock
writes running out of a workqueue (via dio path) could block and prevent
other writes from calling bch2_write_index() and completing.
Signed-off-by: Kent Overstreet <kent.overstreet@gmail.com>
Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>
Kent Overstreet [Tue, 12 May 2020 22:34:16 +0000 (18:34 -0400)]
bcachefs: Validate that we read the correct btree node
Signed-off-by: Kent Overstreet <kent.overstreet@gmail.com>
Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>
Kent Overstreet [Tue, 12 May 2020 00:01:07 +0000 (20:01 -0400)]
bcachefs: Fixes for startup on very full filesystems
- Always pass BTREE_INSERT_USE_RESERVE when writing alloc btree keys
- Don't strand buckest on the copygc freelist until after recovery is
done and we're starting copygc.
Signed-off-by: Kent Overstreet <kent.overstreet@gmail.com>
Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>
Kent Overstreet [Sat, 9 May 2020 03:15:42 +0000 (23:15 -0400)]
bcachefs: Fix initialization of bounce mempools
When they were converted to kvpmalloc pools they weren't converted to
pass the actual size of the allocation. Oops.
Also, validate the real length in the zstd decompression path.
Signed-off-by: Kent Overstreet <kent.overstreet@gmail.com>
Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>
Kent Overstreet [Wed, 6 May 2020 19:37:04 +0000 (15:37 -0400)]
bcachefs: Some compression improvements
In __bio_map_or_bounce(), the check for if the bio is physically
contiguous is improved; it's now more readable and handles multi page
but contiguous bios.
Also when decompressing, we were doing a redundant memcpy in the case
where we were able to use vmap to map a bio contigiously.
Signed-off-by: Kent Overstreet <kent.overstreet@gmail.com>
Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>
Kent Overstreet [Sat, 2 May 2020 20:21:35 +0000 (16:21 -0400)]
bcachefs: Fix two more deadlocks
Deadlock on shutdown:
btree_update_nodes_written() unblocks btree nodes from being written;
after doing so, it has to check if they were marked as needing to be
written and if so kick off those writes - if that doesn't happen, we'll
never release journal pins and shutdown will get stuck when flushing the
journal.
There was an error path where this didn't happen, because in the error
path we don't actually want those btree nodes write to happen; however,
we still have to kick off the write path so the journal pins get
released. The btree write path checks if we're in a journal error state
and doesn't do the actual write if we are.
Also - there was another deadlock because btree_update_nodes_written()
was taking the btree update off of the unwritten_list too soon - before
getting a journal reservation, which could fail and have to be retried.
Signed-off-by: Kent Overstreet <kent.overstreet@gmail.com>
Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>
Kent Overstreet [Fri, 1 May 2020 23:56:31 +0000 (19:56 -0400)]
bcachefs: Fix another deadlock in btree_update_nodes_written()
We also can't be blocking on btree node write locks while holding
btree_interior_update_lock.
Signed-off-by: Kent Overstreet <kent.overstreet@gmail.com>
Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>
Kent Overstreet [Wed, 29 Apr 2020 16:57:04 +0000 (12:57 -0400)]
bcachefs: Add some printks for error paths
Signed-off-by: Kent Overstreet <kent.overstreet@gmail.com>
Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>
Kent Overstreet [Wed, 29 Apr 2020 19:28:25 +0000 (15:28 -0400)]
bcachefs: Don't issue writes that are more than 1 MB
the bcachefs io path in io.c can't bounce writes larger than that.
Signed-off-by: Kent Overstreet <kent.overstreet@gmail.com>
Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>
Kent Overstreet [Fri, 24 Apr 2020 21:57:59 +0000 (17:57 -0400)]
bcachefs: More fixes for counting extent update iterators
This is unfortunately really fragile - hopefully we'll be able to think
of a new approach at some point.
Signed-off-by: Kent Overstreet <kent.overstreet@gmail.com>
Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>
Kent Overstreet [Fri, 24 Apr 2020 22:25:11 +0000 (18:25 -0400)]
bcachefs: Fix a deadlock
btree_node_lock_increment() was incorrectly skipping over the current
iter when checking if we should increment a node we already have locked.
Signed-off-by: Kent Overstreet <kent.overstreet@gmail.com>
Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>
Kent Overstreet [Fri, 24 Apr 2020 18:08:56 +0000 (14:08 -0400)]
bcachefs: Handle -EINTR bch2_migrate_index_update()
peek_slot() shouldn't return -EINTR when there's only a single live
iterator, but that's tricky to guarantee - we seem to be returning
-EINTR when we shouldn't, but it's easy enough to handle in the caller.
Signed-off-by: Kent Overstreet <kent.overstreet@gmail.com>
Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>
Kent Overstreet [Fri, 24 Apr 2020 18:08:18 +0000 (14:08 -0400)]
bcachefs: Fix for the bkey compat path
In the write path, we were calling bch2_bkey_ops.compat() in the wrong
place.
Signed-off-by: Kent Overstreet <kent.overstreet@gmail.com>
Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>
Kent Overstreet [Sat, 11 Apr 2020 16:32:27 +0000 (12:32 -0400)]
bcachefs: Add a few tracepoints
Transaction restart tracing should probably be overhaulled at some
point.
Signed-off-by: Kent Overstreet <kent.overstreet@gmail.com>
Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>
Kent Overstreet [Sat, 11 Apr 2020 16:31:16 +0000 (12:31 -0400)]
bcachefs: Slightly reduce btree split threshold
2/3rds performs a lot better than 3/4ths on the tested workloda, leading
to significanly fewer btree node compactions.
Signed-off-by: Kent Overstreet <kent.overstreet@gmail.com>
Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>
Kent Overstreet [Sat, 11 Apr 2020 16:30:30 +0000 (12:30 -0400)]
bcachefs: Improve lockdep annotation in journalling code
bch2_journal_res_get() in nonblocking mode is equivalent to a trylock.
Signed-off-by: Kent Overstreet <kent.overstreet@gmail.com>
Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>
Kent Overstreet [Sat, 11 Apr 2020 16:29:32 +0000 (12:29 -0400)]
bcachefs: Fix a locking bug in bch2_journal_pin_copy()
There was a race where the src pin would be flushed - releasing the last
pin on that sequence number - before adding the new journal pin. Oops.
Signed-off-by: Kent Overstreet <kent.overstreet@gmail.com>
Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>
Kent Overstreet [Tue, 7 Apr 2020 21:27:12 +0000 (17:27 -0400)]
bcachefs: Fix another deadlock in the btree interior update path
Can't take read locks on btree nodes while holding
btree_interior_update_lock. Also, fix a bug where we were leaking
journal prereservations.
Signed-off-by: Kent Overstreet <kent.overstreet@gmail.com>
Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>
Kent Overstreet [Tue, 7 Apr 2020 21:31:38 +0000 (17:31 -0400)]
bcachefs: Fix a locking bug in bch2_btree_ptr_debugcheck()
Signed-off-by: Kent Overstreet <kent.overstreet@gmail.com>
Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>
Kent Overstreet [Tue, 7 Apr 2020 17:49:14 +0000 (13:49 -0400)]
bcachefs: Account for ioclock slop when throttling rebalance thread
This should fix an issue where the rebalance thread was spinning
Signed-off-by: Kent Overstreet <kent.overstreet@gmail.com>
Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>
Kent Overstreet [Mon, 6 Apr 2020 01:49:17 +0000 (21:49 -0400)]
bcachefs: Fix a deadlock on starting an interior btree update
Not legal to block on a journal prereservation with btree locks held.
Signed-off-by: Kent Overstreet <kent.overstreet@gmail.com>
Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>
Kent Overstreet [Sat, 4 Apr 2020 20:47:59 +0000 (16:47 -0400)]
bcachefs: Fix a debug mode assertion
Signed-off-by: Kent Overstreet <kent.overstreet@gmail.com>
Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>
Kent Overstreet [Sat, 4 Apr 2020 19:49:42 +0000 (15:49 -0400)]
bcachefs: Fix a debug assertion
This assertion was passing the wrong btree node type when inserting into
interior nodes.
Signed-off-by: Kent Overstreet <kent.overstreet@gmail.com>
Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>
Kent Overstreet [Sat, 4 Apr 2020 19:45:06 +0000 (15:45 -0400)]
bcachefs: Fix another error path locking bug
btree_update_nodes_written() was leaking a btree node lock on failure to
get a journal reservation.
Signed-off-by: Kent Overstreet <kent.overstreet@gmail.com>
Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>
Kent Overstreet [Sat, 4 Apr 2020 17:54:19 +0000 (13:54 -0400)]
bcachefs: Fix a null ptr deref during journal replay
We were calling bch2_extent_can_insert() incorrectly; it should only be
called when the extents-to-keys pass is running because that's when we
could be splitting a compressed extent. Calling bch2_extent_can_insert()
without passing in a disk reservation was causing a null ptr deref.
Signed-off-by: Kent Overstreet <kent.overstreet@gmail.com>
Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>
Kent Overstreet [Wed, 1 Apr 2020 21:28:39 +0000 (17:28 -0400)]
bcachefs: Add another mssing bch2_trans_iter_put() call
Signed-off-by: Kent Overstreet <kent.overstreet@gmail.com>
Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>
Kent Overstreet [Wed, 1 Apr 2020 21:14:14 +0000 (17:14 -0400)]
bcachefs: Trace where btree iterators are allocated
This will help with iterator overflow bugs.
Signed-off-by: Kent Overstreet <kent.overstreet@gmail.com>
Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>
Kent Overstreet [Wed, 1 Apr 2020 20:07:57 +0000 (16:07 -0400)]
bcachefs: Fix fallocate FL_INSERT_RANGE
This was another bug because of bch2_btree_iter_set_pos() invalidating
iterators.
Signed-off-by: Kent Overstreet <kent.overstreet@gmail.com>
Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>
Kent Overstreet [Tue, 31 Mar 2020 20:25:30 +0000 (16:25 -0400)]
bcachefs: Add print method for bch2_btree_ptr_v2
Signed-off-by: Kent Overstreet <kent.overstreet@gmail.com>
Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>
Kent Overstreet [Tue, 31 Mar 2020 20:23:43 +0000 (16:23 -0400)]
bcachefs: Fix journalling of interior node updates
We weren't journalling updates done while splitting/compacting nodes -
oops.
Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>
Kent Overstreet [Mon, 30 Mar 2020 22:11:13 +0000 (18:11 -0400)]
bcachefs: Fix iterating of journal keys within a btree node
Extent btrees no longer have weird special behaviour for min_key.
Signed-off-by: Kent Overstreet <kent.overstreet@gmail.com>
Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>
Kent Overstreet [Mon, 30 Mar 2020 21:43:21 +0000 (17:43 -0400)]
bcachefs: Fix a locking bug
Dropping the wrong kind of lock can't lead to anything good...
Signed-off-by: Kent Overstreet <kent.overstreet@gmail.com>
Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>
Kent Overstreet [Mon, 30 Mar 2020 18:29:06 +0000 (14:29 -0400)]
bcachefs: Fix inodes pass in fsck
It wasn't updated for the patch that switched inodes to using the offset
field of struct bkey.
Signed-off-by: Kent Overstreet <kent.overstreet@gmail.com>
Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>
Kent Overstreet [Mon, 30 Mar 2020 18:05:05 +0000 (14:05 -0400)]
bcachefs: Fix ec_stripe_update_ptrs()
bch2_btree_iter_set_pos() invalidates the key returned by peek().
Signed-off-by: Kent Overstreet <kent.overstreet@gmail.com>
Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>
Kent Overstreet [Sun, 29 Mar 2020 20:48:53 +0000 (16:48 -0400)]
bcachefs: Check btree topology at startup
When initial btree gc was changed to overlay journal keys as it walks
the btree, it also stopped checking btree topology.
Previously, checking btree topology was a fairly complicated affair -
but it's much easier now that btree_ptr_v2 has min_key in the pointer.
This rewrites the old range_checks code and uses it in both runtime and
initial gc.
Signed-off-by: Kent Overstreet <kent.overstreet@gmail.com>
Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>
Kent Overstreet [Mon, 30 Mar 2020 16:33:30 +0000 (12:33 -0400)]
bcachefs: Don't allocate memory while holding journal reservation
This fixes a lockdep splat - allocating memory can call
bch2_clear_page_bits() which takes mark_lock.
Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>
Kent Overstreet [Sun, 29 Mar 2020 21:01:05 +0000 (17:01 -0400)]
bcachefs: Reduce max nr of btree iters when lockdep is on
This is so we don't overflow MAX_LOCK_DEPTH.
Signed-off-by: Kent Overstreet <kent.overstreet@gmail.com>
Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>
Kent Overstreet [Tue, 7 Jan 2020 18:29:32 +0000 (13:29 -0500)]
bcachefs: Kill bkey_type_successor
Previously, BTREE_ID_INODES was special - inodes were indexed by the
inode field, which meant the offset field of struct bpos wasn't used,
which led to special cases in e.g. the btree iterator code.
Now, inodes in the inodes btree are indexed by the offset field.
Also: prevously min_key was special for extents btrees, min_key for
extents would equal max_key for the previous node. Now, min_key =
bkey_successor() of the previous node, same as non extent btrees.
This means we can completely get rid of
btree_type_sucessor/predecessor.
Also make some improvements to the metadata IO validate/compat code.
Signed-off-by: Kent Overstreet <kent.overstreet@gmail.com>
Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>
Kent Overstreet [Sun, 29 Mar 2020 18:21:44 +0000 (14:21 -0400)]
bcachefs: Switch a BUG_ON() to a warning
This has popped and thus needs to be debugged, but the assertion firing
isn't necessarily fatal so switch it to a warning.
Signed-off-by: Kent Overstreet <kent.overstreet@gmail.com>
Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>
Kent Overstreet [Sun, 29 Mar 2020 16:33:41 +0000 (12:33 -0400)]
bcachefs: Use kvpmalloc mempools for compression bounce
This fixes an issue where mounting would fail because of memory
fragmentation - previously the compression bounce buffers were using
get_free_pages().
Signed-off-by: Kent Overstreet <kent.overstreet@gmail.com>
Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>
Kent Overstreet [Sat, 28 Mar 2020 22:26:01 +0000 (18:26 -0400)]
bcachefs: Read journal when keep_journal on
Signed-off-by: Kent Overstreet <kent.overstreet@gmail.com>
Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>
Kent Overstreet [Sat, 28 Mar 2020 23:17:23 +0000 (19:17 -0400)]
bcachefs: Various fixes for interior update path
The locking was wrong, and we could get a use after free in the error
path where we weren't taking the entrie being freed off the unwritten
list.
Signed-off-by: Kent Overstreet <kent.overstreet@gmail.com>
Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>
Kent Overstreet [Fri, 27 Mar 2020 21:38:51 +0000 (17:38 -0400)]
bcachefs: Use memalloc_nofs_save()
vmalloc allocations don't always obey GFP_NOFS - memalloc_nofs_save() is
the prefered approach for the future.
Signed-off-by: Kent Overstreet <kent.overstreet@gmail.com>
Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>
Kent Overstreet [Wed, 25 Mar 2020 20:13:00 +0000 (16:13 -0400)]
bcachefs: Improve error message in fsck
Seeing the extents that were overlapping is highly useful for figuring
out what went wrong.
Signed-off-by: Kent Overstreet <kent.overstreet@gmail.com>
Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>
Kent Overstreet [Wed, 25 Mar 2020 20:12:33 +0000 (16:12 -0400)]
bcachefs: Add an option for keeping journal entries after startup
This will be used by the userspace debug tools.
Signed-off-by: Kent Overstreet <kent.overstreet@gmail.com>
Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>
Kent Overstreet [Wed, 25 Mar 2020 21:57:29 +0000 (17:57 -0400)]
bcachefs: Fix an assertion when nothing to replay
Signed-off-by: Kent Overstreet <kent.overstreet@gmail.com>
Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>
Kent Overstreet [Sun, 9 Feb 2020 00:06:31 +0000 (19:06 -0500)]
bcachefs: Journal updates to interior nodes
Previously, the btree has always been self contained and internally
consistent on disk without anything from the journal - the journal just
contained pointers to the btree roots.
However, this meant that btree node split or compact operations - i.e.
anything that changes btree node topology and involves updates to
interior nodes - would require that interior btree node to be written
immediately, which means emitting a btree node write that's mostly empty
(using 4k of space on disk if the filesystemm blocksize is 4k to only
write perhaps ~100 bytes of new keys).
More importantly, this meant most btree node writes had to be FUA, and
consumer drives have a history of slow and/or buggy FUA support - other
filesystes have been bit by this.
This patch changes the interior btree update path to journal updates to
interior nodes, after the writes for the new btree nodes have completed.
Best of all, it turns out to simplify the interior node update path
somewhat.
Signed-off-by: Kent Overstreet <kent.overstreet@gmail.com>
Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>
Kent Overstreet [Mon, 16 Mar 2020 02:32:03 +0000 (22:32 -0400)]
bcachefs: Replay interior node keys
This slightly modifies the journal replay code so that it can replay
updates to interior nodes.
Signed-off-by: Kent Overstreet <kent.overstreet@gmail.com>
Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>
Kent Overstreet [Mon, 16 Mar 2020 03:29:43 +0000 (23:29 -0400)]
bcachefs: trans_commit() path can now insert to interior nodes
This will be needed for the upcoming patches to journal updates to
interior btree nodes.
Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>
Kent Overstreet [Tue, 24 Mar 2020 21:00:48 +0000 (17:00 -0400)]
bcachefs: Disable extent merging
Extent merging is currently broken, and will be reimplemented
differently soon - right now it only happens when btree nodes are being
compacted, which makes it difficult to test.
Signed-off-by: Kent Overstreet <kent.overstreet@gmail.com>
Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>
Kent Overstreet [Sat, 21 Mar 2020 18:47:00 +0000 (14:47 -0400)]
bcachefs: Fix a locking bug in fsck
This works around a btree locking issue - we can't be holding read locks
while taking write locks, which currently means we can't have live
iterators holding read locks at commit time.
Signed-off-by: Kent Overstreet <kent.overstreet@gmail.com>
Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>
Kent Overstreet [Sat, 21 Mar 2020 18:08:01 +0000 (14:08 -0400)]
bcachefs: Fix count_iters_for_insert()
This fixes a transaction iterator overflow.
Signed-off-by: Kent Overstreet <kent.overstreet@gmail.com>
Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>
Kent Overstreet [Wed, 18 Mar 2020 17:40:28 +0000 (13:40 -0400)]
bcachefs: Fix an iterator bug
We were incorrectly not restarting the transaction when re-traversing
iterators.
Signed-off-by: Kent Overstreet <kent.overstreet@gmail.com>
Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>
Kent Overstreet [Wed, 18 Mar 2020 15:46:46 +0000 (11:46 -0400)]
bcachefs: Shut down quicker
Internal writes (i.e. copygc/rebalance operations) shouldn't be blocking
on the allocator when we're going RO.
Signed-off-by: Kent Overstreet <kent.overstreet@gmail.com>
Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>
Kent Overstreet [Wed, 18 Mar 2020 15:40:07 +0000 (11:40 -0400)]
bcachefs: BCH_FEATURE_new_extent_overwrite is now required
The patch "bcachefs: Move extent overwrite handling out of core btree
code" should have been flipping on this feature bit; extent btree nodes
in the old format have to be rewritten before we can insert into them
with the new extent update path. Not turning on this feature bit was
causing us to go into an infinite loop where we keep rewriting btree
nodes over and over.
Signed-off-by: Kent Overstreet <kent.overstreet@gmail.com>
Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>
Kent Overstreet [Mon, 16 Mar 2020 21:23:37 +0000 (17:23 -0400)]
bcachefs: Clear BCH_FEATURE_extents_above_btree_updates on clean shutdown
This is needed so that users can roll back to before "
d9bb516b2d
bcachefs: Move extent overwrite handling out of core btree code", which
it appears may still be buggy.
Signed-off-by: Kent Overstreet <kent.overstreet@gmail.com>
Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>
Kent Overstreet [Mon, 16 Mar 2020 19:48:58 +0000 (15:48 -0400)]
bcachefs: Fix another iterator leak
This updates bch2_rbio_narrow_crcs() to the current style for
transactional btree code, and fixes a rare panic on iterator overflow.
Signed-off-by: Kent Overstreet <kent.overstreet@gmail.com>
Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>
Kent Overstreet [Mon, 16 Mar 2020 19:49:23 +0000 (15:49 -0400)]
bcachefs: Don't use peek_filter() unnecessarily
Signed-off-by: Kent Overstreet <kent.overstreet@gmail.com>
Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>
Kent Overstreet [Mon, 16 Mar 2020 18:49:52 +0000 (14:49 -0400)]
bcachefs: Fix a use after free in dio write path
Signed-off-by: Kent Overstreet <kent.overstreet@gmail.com>
Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>
Kent Overstreet [Mon, 16 Mar 2020 02:41:10 +0000 (22:41 -0400)]
bcachefs: Drop unused export
Signed-off-by: Kent Overstreet <kent.overstreet@gmail.com>
Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>
Kent Overstreet [Mon, 30 Dec 2019 19:37:25 +0000 (14:37 -0500)]
bcachefs: Move extent overwrite handling out of core btree code
Ever since the btree code was first written, handling of overwriting
existing extents - including partially overwriting and splittin existing
extents - was handled as part of the core btree insert path. The modern
transaction and iterator infrastructure didn't exist then, so that was
the only way for it to be done.
This patch moves that outside of the core btree code to a pass that runs
at transaction commit time.
This is a significant simplification to the btree code and overall
reduction in code size, but more importantly it gets us much closer to
the core btree code being completely independent of extents and is
important prep work for snapshots.
This introduces a new feature bit; the old and new extent update models
are incompatible when the filesystem needs journal replay.
Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>
Kent Overstreet [Thu, 5 Mar 2020 23:44:59 +0000 (18:44 -0500)]
bcachefs: btree_iter_peek_with_updates()
Introduce a new iterator method that provides a consistent view of the
btree plus uncommitted updates.
Signed-off-by: Kent Overstreet <kent.overstreet@gmail.com>
Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>
Kent Overstreet [Sun, 15 Mar 2020 20:15:08 +0000 (16:15 -0400)]
bcachefs: Fix build when CONFIG_BCACHEFS_DEBUG=n
Signed-off-by: Kent Overstreet <kent.overstreet@gmail.com>
Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>
Kent Overstreet [Tue, 18 Feb 2020 21:17:55 +0000 (16:17 -0500)]
bcachefs: More btree iter invariants
Ensure that iter->pos always lies between the start and end of iter->k
(the last key returned). Also, bch2_btree_iter_set_pos() now invalidates
the key that peek() or next() returned.
Signed-off-by: Kent Overstreet <kent.overstreet@gmail.com>
Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>
Kent Overstreet [Sat, 14 Mar 2020 01:41:22 +0000 (21:41 -0400)]
bcachefs: Simplify bch2_btree_iter_peek_slot()
Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>
Kent Overstreet [Tue, 18 Feb 2020 21:17:55 +0000 (16:17 -0500)]
bcachefs: Iterator debug code improvements
More aggressively checking iterator invariants, and fixing the resulting
bugs. Also greatly simplifying iter_next() and iter_next_slot() - they
were hyper optimized before, but the optimizations were getting too
brittle.
Signed-off-by: Kent Overstreet <kent.overstreet@gmail.com>
Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>
Kent Overstreet [Thu, 5 Mar 2020 23:43:31 +0000 (18:43 -0500)]
bcachefs: Skip 0 size deleted extents in journal replay
These are created by the new extent update path, but not used yet by the
recovery code and they break the existing recovery code, so we can just
skip them.
Signed-off-by: Kent Overstreet <kent.overstreet@gmail.com>
Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>
Kent Overstreet [Mon, 9 Mar 2020 20:15:54 +0000 (16:15 -0400)]
bcachefs: Traverse iterator in journal replay
This fixes a bug where we end up spinning in journal replay - in theory
this shouldn't be necessary though, transaction reset should be
re-traversing all iterators.
Signed-off-by: Kent Overstreet <kent.overstreet@gmail.com>
Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>
Kent Overstreet [Mon, 9 Mar 2020 18:19:58 +0000 (14:19 -0400)]
bcachefs: Don't log errors that are expected during shutdown
Signed-off-by: Kent Overstreet <kent.overstreet@gmail.com>
Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>
Kent Overstreet [Sat, 7 Mar 2020 22:20:39 +0000 (17:20 -0500)]
bcachefs: Fix bch2_dump_bset()
It's used in the write path when the bset isn't in the btree node
buffer.
Signed-off-by: Kent Overstreet <kent.overstreet@gmail.com>
Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>
Kent Overstreet [Sat, 7 Mar 2020 18:30:55 +0000 (13:30 -0500)]
bcachefs: Fix another iterator leak
Signed-off-by: Kent Overstreet <kent.overstreet@gmail.com>
Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>