Darrick J. Wong [Tue, 8 Jun 2021 16:38:24 +0000 (09:38 -0700)]
Merge tag 'fix-inode-health-reports-5.14_2021-06-08' of https://git./linux/kernel/git/djwong/xfs-linux into xfs-5.14-merge2
xfs: preserve inode health reports for longer
This is a quick series to make sure that inode sickness reports stick
around in memory for some amount of time.
v2: rebase to 5.13-rc4
v3: require explicit request to reclaim sick inodes, drop weird icache
miss interaction with DONTCACHE
* tag 'fix-inode-health-reports-5.14_2021-06-08' of https://git.kernel.org/pub/scm/linux/kernel/git/djwong/xfs-linux:
xfs: selectively keep sick inodes in memory
xfs: drop IDONTCACHE on inodes when we mark them sick
xfs: only reset incore inode health state flags when reclaiming an inode
Darrick J. Wong [Mon, 7 Jun 2021 16:34:50 +0000 (09:34 -0700)]
xfs: selectively keep sick inodes in memory
It's important that the filesystem retain its memory of sick inodes for
a little while after problems are found so that reports can be collected
about what was wrong. Don't let inode reclamation free sick inodes
unless we're unmounting or the fs already went down.
Signed-off-by: Darrick J. Wong <djwong@kernel.org>
Reviewed-by: Dave Chinner <dchinner@redhat.com>
Reviewed-by: Carlos Maiolino <cmaiolino@redhat.com>
Darrick J. Wong [Mon, 7 Jun 2021 16:34:50 +0000 (09:34 -0700)]
xfs: drop IDONTCACHE on inodes when we mark them sick
When we decide to mark an inode sick, clear the DONTCACHE flag so that
the incore inode will be kept around until memory pressure forces it out
of memory. This increases the chances that the sick status will be
caught by someone compiling a health report later on.
Signed-off-by: Darrick J. Wong <djwong@kernel.org>
Reviewed-by: Dave Chinner <dchinner@redhat.com>
Reviewed-by: Carlos Maiolino <cmaiolino@redhat.com>
Darrick J. Wong [Mon, 7 Jun 2021 16:34:49 +0000 (09:34 -0700)]
xfs: only reset incore inode health state flags when reclaiming an inode
While running some fuzz tests on inode metadata, I noticed that the
filesystem health report (as provided by xfs_spaceman) failed to report
the file corruption even when spaceman was run immediately after running
xfs_scrub to detect the corruption. That isn't the intended behavior;
one ought to be able to run scrub to detect errors in the ondisk
metadata and be able to access to those reports for some time after the
scrub.
After running the same sequence through an instrumented kernel, I
discovered the reason why -- scrub igets the file, scans it, marks it
sick, and ireleases the inode. When the VFS lets go of the incore
inode, it moves to RECLAIMABLE state. If spaceman igets the incore
inode before it moves to RECLAIM state, iget reinitializes the VFS
state, clears the sick and checked masks, and hands back the inode. At
this point, the caller has the exact same incore inode, but with all the
health state erased.
In other words, we're erasing the incore inode's health state flags when
we've decided NOT to sever the link between the incore inode and the
ondisk inode. This is wrong, so we need to remove the lines that zero
the fields from xfs_iget_cache_hit.
As a precaution, we add the same lines into xfs_reclaim_inode just after
we sever the link between incore and ondisk inode. Strictly speaking
this isn't necessary because once an inode has gone through reclaim it
must go through xfs_inode_alloc (which also zeroes the state) and
xfs_iget is careful to check for mismatches between the inode it pulls
out of the radix tree and the one it wants.
Fixes: 6772c1f11206 ("xfs: track metadata health status")
Signed-off-by: Darrick J. Wong <djwong@kernel.org>
Reviewed-by: Brian Foster <bfoster@redhat.com>
Reviewed-by: Dave Chinner <dchinner@redhat.com>
Reviewed-by: Carlos Maiolino <cmaiolino@redhat.com>
Darrick J. Wong [Tue, 8 Jun 2021 16:26:44 +0000 (09:26 -0700)]
Merge tag 'inode-walk-cleanups-5.14_2021-06-03' of https://git./linux/kernel/git/djwong/xfs-linux into xfs-5.14-merge2
xfs: clean up incore inode walk functions
This ambitious series aims to cleans up redundant inode walk code in
xfs_icache.c, hide implementation details of the quotaoff dquot release
code, and eliminates indirect function calls from incore inode walks.
The first thing it does is to move all the code that quotaoff calls to
release dquots from all incore inodes into xfs_icache.c. Next, it
separates the goal of an inode walk from the actual radix tree tags that
may or may not be involved and drops the kludgy XFS_ICI_NO_TAG thing.
Finally, we split the speculative preallocation (blockgc) and quotaoff
dquot release code paths into separate functions so that we can keep the
implementations cohesive.
Christoph suggested last cycle that we 'simply' change quotaoff not to
allow deactivating quota entirely, but as these cleanups are to enable
one major change in behavior (deferred inode inactivation) I do not want
to add a second behavior change (quotaoff) as a dependency.
To be blunt: Additional cleanups are not in scope for this series.
Next, I made two observations about incore inode radix tree walks --
since there's a 1:1 mapping between the walk goal and the per-inode
processing function passed in, we can use the goal to make a direct call
to the processing function. Furthermore, the only caller to supply a
nonzero iter_flags argument is quotaoff, and there's only one INEW flag.
From that observation, I concluded that it's quite possible to remove
two parameters from the xfs_inode_walk* function signatures -- the
iter_flags, and the execute function pointer. The middle of the series
moves the INEW functionality into the one piece (quotaoff) that wants
it, and removes the indirect calls.
The final observation is that the inode reclaim walk loop is now almost
the same as xfs_inode_walk, so it's silly to maintain two copies. Merge
the reclaim loop code into xfs_inode_walk.
Lastly, refactor the per-ag radix tagging functions since there's
duplicated code that can be consolidated.
This series is a prerequisite for the next two patchsets, since deferred
inode inactivation will add another inode radix tree tag and iterator
function to xfs_inode_walk.
v2: walk the vfs inode list when running quotaoff instead of the radix
tree, then rework the (now completely internal) inode walk function
to take the tag as the main parameter.
v3: merge the reclaim loop into xfs_inode_walk, then consolidate the
radix tree tagging functions
v4: rebase to 5.13-rc4
v5: combine with the quotaoff patchset, reorder functions to minimize
forward declarations, split inode walk goals from radix tree tags
to reduce conceptual confusion
v6: start moving the inode cache code towards the xfs_icwalk prefix
* tag 'inode-walk-cleanups-5.14_2021-06-03' of https://git.kernel.org/pub/scm/linux/kernel/git/djwong/xfs-linux:
xfs: refactor per-AG inode tagging functions
xfs: merge xfs_reclaim_inodes_ag into xfs_inode_walk_ag
xfs: pass struct xfs_eofblocks to the inode scan callback
xfs: fix radix tree tag signs
xfs: make the icwalk processing functions clean up the grab state
xfs: clean up inode state flag tests in xfs_blockgc_igrab
xfs: remove indirect calls from xfs_inode_walk{,_ag}
xfs: remove iter_flags parameter from xfs_inode_walk_*
xfs: move xfs_inew_wait call into xfs_dqrele_inode
xfs: separate the dqrele_all inode grab logic from xfs_inode_walk_ag_grab
xfs: pass the goal of the incore inode walk to xfs_inode_walk()
xfs: rename xfs_inode_walk functions to xfs_icwalk
xfs: move the inode walk functions further down
xfs: detach inode dquots at the end of inactivation
xfs: move the quotaoff dqrele inode walk into xfs_icache.c
[djwong: added variable names to function declarations while fixing
merge conflicts]
Signed-off-by: Darrick J. Wong <djwong@kernel.org>
Darrick J. Wong [Tue, 8 Jun 2021 16:22:34 +0000 (09:22 -0700)]
Merge tag 'assorted-fixes-5.14-1_2021-06-03' of https://git./linux/kernel/git/djwong/xfs-linux into xfs-5.14-merge2
xfs: assorted fixes for 5.14, part 1
This branch contains the first round of various small fixes for 5.14.
* tag 'assorted-fixes-5.14-1_2021-06-03' of https://git.kernel.org/pub/scm/linux/kernel/git/djwong/xfs-linux:
xfs: don't take a spinlock unconditionally in the DIO fastpath
xfs: mark xfs_bmap_set_attrforkoff static
xfs: Remove redundant assignment to busy
xfs: sort variable alphabetically to avoid repeated declaration
Darrick J. Wong [Tue, 8 Jun 2021 16:21:24 +0000 (09:21 -0700)]
Merge tag 'unit-conversion-cleanups-5.14_2021-06-03' of https://git./linux/kernel/git/djwong/xfs-linux into xfs-5.14-merge2
xfs: various unit conversions
Crafting the realtime file extent size hint fixes revealed various
opportunities to clean up unit conversions, so now that gets its own
series.
* tag 'unit-conversion-cleanups-5.14_2021-06-03' of https://git.kernel.org/pub/scm/linux/kernel/git/djwong/xfs-linux:
xfs: remove unnecessary shifts
xfs: clean up open-coded fs block unit conversions
Dave Chinner [Tue, 8 Jun 2021 16:19:22 +0000 (09:19 -0700)]
xfs: drop the AGI being passed to xfs_check_agi_freecount
From: Dave Chinner <dchinner@redhat.com>
Stephen Rothwell reported this compiler warning from linux-next:
fs/xfs/libxfs/xfs_ialloc.c: In function 'xfs_difree_finobt':
fs/xfs/libxfs/xfs_ialloc.c:2032:20: warning: unused variable 'agi' [-Wunused-variable]
2032 | struct xfs_agi *agi = agbp->b_addr;
Which is fallout from agno -> perag conversions that were done in
this function. xfs_check_agi_freecount() is the only user of "agi"
in xfs_difree_finobt() now, and it only uses the agi to get the
current free inode count. We hold that in the perag structure, so
there's not need to directly reference the raw AGI to get this
information.
The btree cursor being passed to xfs_check_agi_freecount() has a
reference to the perag being operated on, so use that directly in
xfs_check_agi_freecount() rather than passing an AGI.
Fixes: 7b13c5155182 ("xfs: use perag for ialloc btree cursors")
Reported-by: Stephen Rothwell <sfr@canb.auug.org.au>
Signed-off-by: Dave Chinner <dchinner@redhat.com>
Reviewed-by: Carlos Maiolino <cmaiolino@redhat.com>
Reviewed-by: Darrick J. Wong <djwong@kernel.org>
Signed-off-by: Darrick J. Wong <djwong@kernel.org>
Darrick J. Wong [Tue, 8 Jun 2021 16:13:13 +0000 (09:13 -0700)]
Merge tag 'xfs-perag-conv-tag' of git://git./linux/kernel/git/dgc/linux-xfs into xfs-5.14-merge2
xfs: initial agnumber -> perag conversions for shrink
If we want to use active references to the perag to be able to gate
shrink removing AGs and hence perags safely, we've got a fair bit of
work to do actually use perags in all the places we need to.
There's a lot of code that iterates ag numbers and then
looks up perags from that, often multiple times for the same perag
in the one operation. If we want to use reference counted perags for
access control, then we need to convert all these uses to perag
iterators, not agno iterators.
[Patches 1-4]
The first step of this is consolidating all the perag management -
init, free, get, put, etc into a common location. THis is spread all
over the place right now, so move it all into libxfs/xfs_ag.[ch].
This does expose kernel only bits of the perag to libxfs and hence
userspace, so the structures and code is rearranged to minimise the
number of ifdefs that need to be added to the userspace codebase.
The perag iterator in xfs_icache.c is promoted to a first class API
and expanded to the needs of the code as required.
[Patches 5-10]
These are the first basic perag iterator conversions and changes to
pass the perag down the stack from those iterators where
appropriate. A lot of this is obvious, simple changes, though in
some places we stop passing the perag down the stack because the
code enters into an as yet unconverted subsystem that still uses raw
AGs.
[Patches 11-16]
These replace the agno passed in the btree cursor for per-ag btree
operations with a perag that is passed to the cursor init function.
The cursor takes it's own reference to the perag, and the reference
is dropped when the cursor is deleted. Hence we get reference
coverage for the entire time the cursor is active, even if the code
that initialised the cursor drops it's reference before the cursor
or any of it's children (duplicates) have been deleted.
The first patch adds the perag infrastructure for the cursor, the
next four patches convert a btree cursor at a time, and the last
removes the agno from the cursor once it is unused.
[Patches 17-21]
These patches are a demonstration of the simplifications and
cleanups that come from plumbing the perag through interfaces that
select and then operate on a specific AG. In this case the inode
allocation algorithm does up to three walks across all AGs before it
either allocates an inode or fails. Two of these walks are purely
just to select the AG, and even then it doesn't guarantee inode
allocation success so there's a third walk if the selected AG
allocation fails.
These patches collapse the selection and allocation into a single
loop, simplifies the error handling because xfs_dir_ialloc() always
returns ENOSPC if no AG was selected for inode allocation or we fail
to allocate an inode in any AG, gets rid of xfs_dir_ialloc()
wrapper, converts inode allocation to run entirely from a single
perag instance, and then factors xfs_dialloc() into a much, much
simpler loop which is easy to understand.
Hence we end up with the same inode allocation logic, but it only
needs two complete iterations at worst, makes AG selection and
allocation atomic w.r.t. shrink and chops out out over 100 lines of
code from this hot code path.
[Patch 22]
Converts the unlink path to pass perags through it.
There's more conversion work to be done, but this patchset gets
through a large chunk of it in one hit. Most of the iterators are
converted, so once this is solidified we can move on to converting
these to active references for being able to free perags while the
fs is still active.
* tag 'xfs-perag-conv-tag' of git://git.kernel.org/pub/scm/linux/kernel/git/dgc/linux-xfs: (23 commits)
xfs: remove xfs_perag_t
xfs: use perag through unlink processing
xfs: clean up and simplify xfs_dialloc()
xfs: inode allocation can use a single perag instance
xfs: get rid of xfs_dir_ialloc()
xfs: collapse AG selection for inode allocation
xfs: simplify xfs_dialloc_select_ag() return values
xfs: remove agno from btree cursor
xfs: use perag for ialloc btree cursors
xfs: convert allocbt cursors to use perags
xfs: convert refcount btree cursor to use perags
xfs: convert rmap btree cursor to using a perag
xfs: add a perag to the btree cursor
xfs: pass perags around in fsmap data dev functions
xfs: push perags through the ag reservation callouts
xfs: pass perags through to the busy extent code
xfs: convert secondary superblock walk to use perags
xfs: convert xfs_iwalk to use perag references
xfs: convert raw ag walks to use for_each_perag
xfs: make for_each_perag... a first class citizen
...
Darrick J. Wong [Tue, 8 Jun 2021 16:10:01 +0000 (09:10 -0700)]
Merge tag 'xfs-buf-bulk-alloc-tag' of git://git./linux/kernel/git/dgc/linux-xfs into xfs-5.14-merge2
xfs: buffer cache bulk page allocation
This patchset makes use of the new bulk page allocation interface to
reduce the overhead of allocating large numbers of pages in a
loop.
The first two patches are refactoring buffer memory allocation and
converting the uncached buffer path to use the same page allocation
path, followed by converting the page allocation path to use bulk
allocation.
The rest of the patches are then consolidation of the page
allocation and freeing code to simplify the code and remove a chunk
of unnecessary abstraction. This is largely based on a series of
changes made by Christoph Hellwig.
* tag 'xfs-buf-bulk-alloc-tag' of git://git.kernel.org/pub/scm/linux/kernel/git/dgc/linux-xfs:
xfs: merge xfs_buf_allocate_memory
xfs: cleanup error handling in xfs_buf_get_map
xfs: get rid of xb_to_gfp()
xfs: simplify the b_page_count calculation
xfs: remove ->b_offset handling for page backed buffers
xfs: move page freeing into _xfs_buf_free_pages()
xfs: merge _xfs_buf_get_pages()
xfs: use alloc_pages_bulk_array() for buffers
xfs: use xfs_buf_alloc_pages for uncached buffers
xfs: split up xfs_buf_allocate_memory
Dave Chinner [Mon, 7 Jun 2021 01:50:48 +0000 (11:50 +1000)]
xfs: merge xfs_buf_allocate_memory
It only has one caller and is now a simple function, so merge it
into the caller.
Signed-off-by: Dave Chinner <dchinner@redhat.com>
Reviewed-by: Darrick J. Wong <djwong@kernel.org>
Christoph Hellwig [Mon, 7 Jun 2021 01:50:47 +0000 (11:50 +1000)]
xfs: cleanup error handling in xfs_buf_get_map
Use a single goto label for freeing the buffer and returning an
error.
Signed-off-by: Christoph Hellwig <hch@lst.de>
Reviewed-by: Darrick J. Wong <djwong@kernel.org>
Signed-off-by: Dave Chinner <dchinner@redhat.com>
Dave Chinner [Mon, 7 Jun 2021 01:50:17 +0000 (11:50 +1000)]
xfs: get rid of xb_to_gfp()
Only used in one place, so just open code the logic in the macro.
Based on a patch from Christoph Hellwig.
Signed-off-by: Dave Chinner <dchinner@redhat.com>
Reviewed-by: Darrick J. Wong <djwong@kernel.org>
Christoph Hellwig [Mon, 7 Jun 2021 01:50:00 +0000 (11:50 +1000)]
xfs: simplify the b_page_count calculation
Ever since we stopped using the Linux page cache to back XFS buffers
there is no need to take the start sector into account for
calculating the number of pages in a buffer, as the data always
start from the beginning of the buffer.
Signed-off-by: Christoph Hellwig <hch@lst.de>
[dgc: modified to suit this series]
Signed-off-by: Dave Chinner <dchinner@redhat.com>
Reviewed-by: Darrick J. Wong <djwong@kernel.org>
Christoph Hellwig [Mon, 7 Jun 2021 01:49:50 +0000 (11:49 +1000)]
xfs: remove ->b_offset handling for page backed buffers
->b_offset can only be non-zero for _XBF_KMEM backed buffers, so
remove all code dealing with it for page backed buffers.
Signed-off-by: Christoph Hellwig <hch@lst.de>
[dgc: modified to fit this patchset]
Signed-off-by: Dave Chinner <dchinner@redhat.com>
Reviewed-by: Darrick J. Wong <djwong@kernel.org>
Darrick J. Wong [Mon, 31 May 2021 18:32:02 +0000 (11:32 -0700)]
xfs: refactor per-AG inode tagging functions
In preparation for adding another incore inode tree tag, refactor the
code that sets and clears tags from the per-AG inode tree and the tree
of per-AG structures, and remove the open-coded versions used by the
blockgc code.
Note: For reclaim, we now rely on the radix tree tags instead of the
reclaimable inode count more heavily than we used to. The conversion
should be fine, but the logic isn't 100% identical.
Signed-off-by: Darrick J. Wong <djwong@kernel.org>
Reviewed-by: Dave Chinner <dchinner@redhat.com>
Darrick J. Wong [Mon, 31 May 2021 18:32:02 +0000 (11:32 -0700)]
xfs: merge xfs_reclaim_inodes_ag into xfs_inode_walk_ag
Merge these two inode walk loops together, since they're pretty similar
now.
Signed-off-by: Darrick J. Wong <djwong@kernel.org>
Reviewed-by: Dave Chinner <dchinner@redhat.com>
Darrick J. Wong [Mon, 31 May 2021 18:32:01 +0000 (11:32 -0700)]
xfs: pass struct xfs_eofblocks to the inode scan callback
Pass a pointer to the actual eofb structure around the inode scanner
functions instead of a void pointer, now that none of the functions is
used as a callback.
Signed-off-by: Darrick J. Wong <djwong@kernel.org>
Reviewed-by: Christoph Hellwig <hch@lst.de>
Reviewed-by: Dave Chinner <dchinner@redhat.com>
Darrick J. Wong [Mon, 31 May 2021 18:32:01 +0000 (11:32 -0700)]
xfs: fix radix tree tag signs
Radix tree tags are supposed to be unsigned ints, so fix the callers.
Signed-off-by: Darrick J. Wong <djwong@kernel.org>
Reviewed-by: Dave Chinner <dchinner@redhat.com>
Darrick J. Wong [Mon, 31 May 2021 18:32:00 +0000 (11:32 -0700)]
xfs: make the icwalk processing functions clean up the grab state
Soon we're going to be adding two new callers to the incore inode walk
code: reclaim of incore inodes, and (later) inactivation of inodes.
Both states operate on inodes that no longer have any VFS state, so we
need to move the xfs_irele calls into the processing functions.
In other words, icwalk processing functions are responsible for cleaning
up whatever state changes are made by the corresponding icwalk igrab
function that picked the inode for processing.
Signed-off-by: Darrick J. Wong <djwong@kernel.org>
Reviewed-by: Dave Chinner <dchinner@redhat.com>
Darrick J. Wong [Wed, 2 Jun 2021 06:01:44 +0000 (23:01 -0700)]
xfs: clean up inode state flag tests in xfs_blockgc_igrab
Clean up the definition of which inode states are not eligible for
speculative preallocation garbage collecting by creating a private
#define. The deferred inactivation patchset will add two new entries to
the set of flags-to-ignore, so we want the definition not to end up a
cluttered mess.
Signed-off-by: Darrick J. Wong <djwong@kernel.org>
Reviewed-by: Dave Chinner <dchinner@redhat.com>
Darrick J. Wong [Mon, 31 May 2021 18:32:00 +0000 (11:32 -0700)]
xfs: remove indirect calls from xfs_inode_walk{,_ag}
It turns out that there is a 1:1 mapping between the execute and goal
parameters that are passed to xfs_inode_walk_ag:
xfs_blockgc_scan_inode <=> XFS_ICWALK_BLOCKGC
xfs_dqrele_inode <=> XFS_ICWALK_DQRELE
Because of this exact correspondence, we don't need the execute function
pointer and can replace it with a direct call.
For the price of a forward static declaration, we can eliminate the
indirect function call. This likely has a negligible impact on
performance (since the execute function runs transactions), but it also
simplifies the function signature.
Signed-off-by: Darrick J. Wong <djwong@kernel.org>
Reviewed-by: Dave Chinner <dchinner@redhat.com>
Darrick J. Wong [Mon, 31 May 2021 18:31:59 +0000 (11:31 -0700)]
xfs: remove iter_flags parameter from xfs_inode_walk_*
The sole iter_flags is XFS_INODE_WALK_INEW_WAIT, and there are no users.
Remove the flag, and the parameter, and all the code that used it.
Signed-off-by: Darrick J. Wong <djwong@kernel.org>
Reviewed-by: Dave Chinner <dchinner@redhat.com>
Darrick J. Wong [Mon, 31 May 2021 18:31:59 +0000 (11:31 -0700)]
xfs: move xfs_inew_wait call into xfs_dqrele_inode
Move the INEW wait into xfs_dqrele_inode so that we can drop the
iter_flags parameter in the next patch.
Signed-off-by: Darrick J. Wong <djwong@kernel.org>
Reviewed-by: Dave Chinner <dchinner@redhat.com>
Darrick J. Wong [Mon, 31 May 2021 18:31:58 +0000 (11:31 -0700)]
xfs: separate the dqrele_all inode grab logic from xfs_inode_walk_ag_grab
Disentangle the dqrele_all inode grab code from the "generic" inode walk
grabbing code, and and use the opportunity to document why the dqrele
grab function does what it does. Since xfs_inode_walk_ag_grab is now
only used for blockgc, rename it to reflect that.
Ultimately, there will be four reasons to perform a walk of incore
inodes: quotaoff dquote releasing (dqrele), garbage collection of
speculative preallocations (blockgc), reclamation of incore inodes
(reclaim), and deferred inactivation (inodegc). Each of these four have
their own slightly different criteria for deciding if they want to
handle an inode, so it makes more sense to have four cohesive igrab
functions than one confusing parameteric grab function like we do now.
Signed-off-by: Darrick J. Wong <djwong@kernel.org>
Reviewed-by: Dave Chinner <dchinner@redhat.com>
Darrick J. Wong [Tue, 1 Jun 2021 20:49:52 +0000 (13:49 -0700)]
xfs: pass the goal of the incore inode walk to xfs_inode_walk()
As part of removing the indirect calls and radix tag implementation
details from the incore inode walk loop, create an enum to represent the
goal of the inode iteration. More immediately, this separate removes
the need for the "ICI_NOTAG" define which makes little sense.
Signed-off-by: Darrick J. Wong <djwong@kernel.org>
Reviewed-by: Dave Chinner <dchinner@redhat.com>
Darrick J. Wong [Wed, 2 Jun 2021 05:41:25 +0000 (22:41 -0700)]
xfs: rename xfs_inode_walk functions to xfs_icwalk
Shorten the prefix so that all the incore inode cache walk code has
"xfs_icwalk" in the name somewhere.
Signed-off-by: Darrick J. Wong <djwong@kernel.org>
Reviewed-by: Dave Chinner <dchinner@redhat.com>
Darrick J. Wong [Tue, 1 Jun 2021 20:29:41 +0000 (13:29 -0700)]
xfs: move the inode walk functions further down
Move the inode walk functions further down in the file to limit the
forward declarations to the two walk functions as we add new code that
uses the inode walks. We'll clean them out later (i.e. after the
deferred inode inactivation series).
Signed-off-by: Darrick J. Wong <djwong@kernel.org>
Reviewed-by: Dave Chinner <dchinner@redhat.com>
Darrick J. Wong [Mon, 31 May 2021 18:31:57 +0000 (11:31 -0700)]
xfs: detach inode dquots at the end of inactivation
Once we're done with inactivating an inode, we're finished updating
metadata for that inode. This means that we can detach the dquots at
the end and not have to wait for reclaim to do it for us.
Signed-off-by: Darrick J. Wong <djwong@kernel.org>
Reviewed-by: Dave Chinner <dchinner@redhat.com>
Darrick J. Wong [Mon, 31 May 2021 18:31:57 +0000 (11:31 -0700)]
xfs: move the quotaoff dqrele inode walk into xfs_icache.c
The only external caller of xfs_inode_walk* happens in quotaoff, when we
want to walk all the incore inodes to detach the dquots. Move this code
to xfs_icache.c so that we can hide xfs_inode_walk as the starting step
in more cleanups of inode walks.
Signed-off-by: Darrick J. Wong <djwong@kernel.org>
Reviewed-by: Dave Chinner <dchinner@redhat.com>
Dave Chinner [Wed, 2 Jun 2021 22:00:38 +0000 (15:00 -0700)]
xfs: don't take a spinlock unconditionally in the DIO fastpath
Because this happens at high thread counts on high IOPS devices
doing mixed read/write AIO-DIO to a single file at about a million
iops:
64.09% 0.21% [kernel] [k] io_submit_one
- 63.87% io_submit_one
- 44.33% aio_write
- 42.70% xfs_file_write_iter
- 41.32% xfs_file_dio_write_aligned
- 25.51% xfs_file_write_checks
- 21.60% _raw_spin_lock
- 21.59% do_raw_spin_lock
- 19.70% __pv_queued_spin_lock_slowpath
This also happens of the IO completion IO path:
22.89% 0.69% [kernel] [k] xfs_dio_write_end_io
- 22.49% xfs_dio_write_end_io
- 21.79% _raw_spin_lock
- 20.97% do_raw_spin_lock
- 20.10% __pv_queued_spin_lock_slowpath
IOWs, fio is burning ~14 whole CPUs on this spin lock.
So, do an unlocked check against inode size first, then if we are
at/beyond EOF, take the spinlock and recheck. This makes the
spinlock disappear from the overwrite fastpath.
I'd like to report that fixing this makes things go faster. It
doesn't - it just exposes the the XFS_ILOCK as the next severe
contention point doing extent mapping lookups, and that now burns
all the 14 CPUs this spinlock was burning.
Signed-off-by: Dave Chinner <dchinner@redhat.com>
Reviewed-by: Carlos Maiolino <cmaiolino@redhat.com>
Reviewed-by: Darrick J. Wong <djwong@kernel.org>
Signed-off-by: Darrick J. Wong <djwong@kernel.org>
Christoph Hellwig [Wed, 2 Jun 2021 21:58:59 +0000 (14:58 -0700)]
xfs: mark xfs_bmap_set_attrforkoff static
xfs_bmap_set_attrforkoff is only used inside of xfs_bmap.c, so mark it
static.
Signed-off-by: Christoph Hellwig <hch@lst.de>
Reviewed-by: Darrick J. Wong <djwong@kernel.org>
Signed-off-by: Darrick J. Wong <djwong@kernel.org>
Jiapeng Chong [Wed, 2 Jun 2021 21:56:29 +0000 (14:56 -0700)]
xfs: Remove redundant assignment to busy
Variable busy is set to false, but this value is never read as it is
overwritten or not used later on, hence it is a redundant assignment
and can be removed.
Clean up the following clang-analyzer warning:
fs/xfs/libxfs/xfs_alloc.c:1679:2: warning: Value stored to 'busy' is
never read [clang-analyzer-deadcode.DeadStores].
Reported-by: Abaci Robot <abaci@linux.alibaba.com>
Signed-off-by: Jiapeng Chong <jiapeng.chong@linux.alibaba.com>
Reviewed-by: Brian Foster <bfoster@redhat.com>
Reviewed-by: Darrick J. Wong <djwong@kernel.org>
Signed-off-by: Darrick J. Wong <djwong@kernel.org>
Shaokun Zhang [Wed, 2 Jun 2021 21:54:09 +0000 (14:54 -0700)]
xfs: sort variable alphabetically to avoid repeated declaration
Variable 'xfs_agf_buf_ops', 'xfs_agi_buf_ops', 'xfs_dquot_buf_ops' and
'xfs_symlink_buf_ops' are declared twice, so sort these variables
alphabetically and remove the repeated declaration.
Cc: "Darrick J. Wong" <djwong@kernel.org>
Signed-off-by: Shaokun Zhang <zhangshaokun@hisilicon.com>
Reviewed-by: Carlos Maiolino <cmaiolino@redhat.com
Reviewed-by: Darrick J. Wong <djwong@kernel.org>
Signed-off-by: Darrick J. Wong <djwong@kernel.org>
Dave Chinner [Wed, 2 Jun 2021 00:48:51 +0000 (10:48 +1000)]
xfs: remove xfs_perag_t
Almost unused, gets rid of another typedef.
Signed-off-by: Dave Chinner <dchinner@redhat.com>
Reviewed-by: Brian Foster <bfoster@redhat.com>
Reviewed-by: Darrick J. Wong <djwong@kernel.org>
Dave Chinner [Wed, 2 Jun 2021 00:48:51 +0000 (10:48 +1000)]
xfs: use perag through unlink processing
Unlinked lists are held in the perag, and freeing of inodes needs to
be passed a perag, too, so look up the perag early in the unlink
processing and use it throughout.
Signed-off-by: Dave Chinner <dchinner@redhat.com>
Reviewed-by: Darrick J. Wong <djwong@kernel.org>
Reviewed-by: Brian Foster <bfoster@redhat.com>
Dave Chinner [Wed, 2 Jun 2021 00:48:24 +0000 (10:48 +1000)]
xfs: clean up and simplify xfs_dialloc()
Because it's a mess.
Signed-off-by: Dave Chinner <dchinner@redhat.com>
Reviewed-by: Brian Foster <bfoster@redhat.com>
Reviewed-by: Darrick J. Wong <djwong@kernel.org>
Dave Chinner [Wed, 2 Jun 2021 00:48:24 +0000 (10:48 +1000)]
xfs: inode allocation can use a single perag instance
Now that we've internalised the two-phase inode allocation, we can
now easily make the AG selection and allocation atomic from the
perspective of a single perag context. This will ensure AGs going
offline/away cannot occur between the selection and allocation
steps.
Signed-off-by: Dave Chinner <dchinner@redhat.com>
Reviewed-by: Brian Foster <bfoster@redhat.com>
Reviewed-by: Darrick J. Wong <djwong@kernel.org>
Dave Chinner [Wed, 2 Jun 2021 00:48:24 +0000 (10:48 +1000)]
xfs: get rid of xfs_dir_ialloc()
This is just a simple wrapper around the per-ag inode allocation
that doesn't need to exist. The internal mechanism to select and
allocate within an AG does not need to be exposed outside
xfs_ialloc.c, and it being exposed simply makes it harder to follow
the code and simplify it.
This is simplified by internalising xf_dialloc_select_ag() and
xfs_dialloc_ag() into a single xfs_dialloc() function and then
xfs_dir_ialloc() can go away.
Signed-off-by: Dave Chinner <dchinner@redhat.com>
Reviewed-by: Brian Foster <bfoster@redhat.com>
Reviewed-by: Darrick J. Wong <djwong@kernel.org>
Dave Chinner [Wed, 2 Jun 2021 00:48:24 +0000 (10:48 +1000)]
xfs: collapse AG selection for inode allocation
xfs_dialloc_select_ag() does a lot of repetitive work. It first
calls xfs_ialloc_ag_select() to select the AG to start allocation
attempts in, which can do up to two entire loops across the perags
that inodes can be allocated in. This is simply checking if there is
spce available to allocate inodes in an AG, and it returns when it
finds the first candidate AG.
xfs_dialloc_select_ag() then does it's own iterative walk across
all the perags locking the AGIs and trying to allocate inodes from
the locked AG. It also doesn't limit the search to mp->m_maxagi,
so it will walk all AGs whether they can allocate inodes or not.
Hence if we are really low on inodes, we could do almost 3 entire
walks across the whole perag range before we find an allocation
group we can allocate inodes in or report ENOSPC.
Because xfs_ialloc_ag_select() returns on the first candidate AG it
finds, we can simply do these checks directly in
xfs_dialloc_select_ag() before we lock and try to allocate inodes.
This reduces the inode allocation pass down to 2 perag sweeps at
most - one for aligned inode cluster allocation and if we can't
allocate full, aligned inode clusters anywhere we'll do another pass
trying to do sparse inode cluster allocation.
This also removes a big chunk of duplicate code.
Signed-off-by: Dave Chinner <dchinner@redhat.com>
Reviewed-by: Darrick J. Wong <djwong@kernel.org>
Dave Chinner [Wed, 2 Jun 2021 00:48:24 +0000 (10:48 +1000)]
xfs: simplify xfs_dialloc_select_ag() return values
The only caller of xfs_dialloc_select_ag() will always return
-ENOSPC to it's caller if the agbp returned from
xfs_dialloc_select_ag() is NULL. IOWs, failure to find a candidate
AGI we can allocate inodes from is always an ENOSPC condition, so
move this logic up into xfs_dialloc_select_ag() so we can simplify
the return logic in this function.
xfs_dialloc_select_ag() now only ever returns 0 with a locked
agbp, or an error with no agbp.
Signed-off-by: Dave Chinner <dchinner@redhat.com>
Reviewed-by: Brian Foster <bfoster@redhat.com>
Reviewed-by: Darrick J. Wong <djwong@kernel.org>
Dave Chinner [Wed, 2 Jun 2021 00:48:24 +0000 (10:48 +1000)]
xfs: remove agno from btree cursor
Now that everything passes a perag, the agno is not needed anymore.
Convert all the users to use pag->pag_agno instead and remove the
agno from the cursor. This was largely done as an automated search
and replace.
Signed-off-by: Dave Chinner <dchinner@redhat.com>
Reviewed-by: Brian Foster <bfoster@redhat.com>
Reviewed-by: Darrick J. Wong <djwong@kernel.org>
Dave Chinner [Wed, 2 Jun 2021 00:48:24 +0000 (10:48 +1000)]
xfs: use perag for ialloc btree cursors
Signed-off-by: Dave Chinner <dchinner@redhat.com>
Reviewed-by: Brian Foster <bfoster@redhat.com>
Reviewed-by: Darrick J. Wong <djwong@kernel.org>
Dave Chinner [Wed, 2 Jun 2021 00:48:24 +0000 (10:48 +1000)]
xfs: convert allocbt cursors to use perags
Signed-off-by: Dave Chinner <dchinner@redhat.com>
Reviewed-by: Brian Foster <bfoster@redhat.com>
Reviewed-by: Darrick J. Wong <djwong@kernel.org>
Dave Chinner [Wed, 2 Jun 2021 00:48:24 +0000 (10:48 +1000)]
xfs: convert refcount btree cursor to use perags
Signed-off-by: Dave Chinner <dchinner@redhat.com>
Reviewed-by: Brian Foster <bfoster@redhat.com>
Reviewed-by: Darrick J. Wong <djwong@kernel.org>
Dave Chinner [Wed, 2 Jun 2021 00:48:24 +0000 (10:48 +1000)]
xfs: convert rmap btree cursor to using a perag
Signed-off-by: Dave Chinner <dchinner@redhat.com>
Reviewed-by: Brian Foster <bfoster@redhat.com>
Reviewed-by: Darrick J. Wong <djwong@kernel.org>
Dave Chinner [Wed, 2 Jun 2021 00:48:24 +0000 (10:48 +1000)]
xfs: add a perag to the btree cursor
Which will eventually completely replace the agno in it.
Signed-off-by: Dave Chinner <dchinner@redhat.com>
Reviewed-by: Darrick J. Wong <djwong@kernel.org>
Reviewed-by: Brian Foster <bfoster@redhat.com>
Dave Chinner [Wed, 2 Jun 2021 00:48:24 +0000 (10:48 +1000)]
xfs: pass perags around in fsmap data dev functions
Needs a [from, to] ranged AG walk, and the perag to be stuffed into
the info structure for callouts to use.
Signed-off-by: Dave Chinner <dchinner@redhat.com>
Reviewed-by: Brian Foster <bfoster@redhat.com>
Reviewed-by: Darrick J. Wong <djwong@kernel.org>
Dave Chinner [Wed, 2 Jun 2021 00:48:24 +0000 (10:48 +1000)]
xfs: push perags through the ag reservation callouts
We currently pass an agno from the AG reservation functions to the
individual feature accounting functions, which in future may have to
do perag lookups to access per-AG state. Instead, pre-emptively
plumb the perag through from the highest AG reservation layer to the
feature callouts so they won't have to look it up again.
Signed-off-by: Dave Chinner <dchinner@redhat.com>
Reviewed-by: Darrick J. Wong <djwong@kernel.org>
Reviewed-by: Brian Foster <bfoster@redhat.com>
Dave Chinner [Wed, 2 Jun 2021 00:48:24 +0000 (10:48 +1000)]
xfs: pass perags through to the busy extent code
All of the callers of the busy extent API either have perag
references available to use so we can pass a perag to the busy
extent functions rather than having them have to do unnecessary
lookups.
Signed-off-by: Dave Chinner <dchinner@redhat.com>
Reviewed-by: Brian Foster <bfoster@redhat.com>
Reviewed-by: Darrick J. Wong <djwong@kernel.org>
Dave Chinner [Wed, 2 Jun 2021 00:48:24 +0000 (10:48 +1000)]
xfs: convert secondary superblock walk to use perags
Clean up the last external manual AG walk.
Signed-off-by: Dave Chinner <dchinner@redhat.com>
Reviewed-by: Brian Foster <bfoster@redhat.com>
Reviewed-by: Darrick J. Wong <djwong@kernel.org>
Dave Chinner [Wed, 2 Jun 2021 00:48:24 +0000 (10:48 +1000)]
xfs: convert xfs_iwalk to use perag references
Rather than manually walking the ags and passing agnunbers around,
pass the perag for the AG we are currently working on around in the
iwalk structure.
Signed-off-by: Dave Chinner <dchinner@redhat.com>
Reviewed-by: Brian Foster <bfoster@redhat.com>
Reviewed-by: Darrick J. Wong <djwong@kernel.org>
Dave Chinner [Wed, 2 Jun 2021 00:48:24 +0000 (10:48 +1000)]
xfs: convert raw ag walks to use for_each_perag
Convert the raw walks to an iterator, pulling the current AG out of
pag->pag_agno instead of the loop iterator variable.
Signed-off-by: Dave Chinner <dchinner@redhat.com>
Reviewed-by: Brian Foster <bfoster@redhat.com>
Reviewed-by: Darrick J. Wong <djwong@kernel.org>
Dave Chinner [Wed, 2 Jun 2021 00:48:24 +0000 (10:48 +1000)]
xfs: make for_each_perag... a first class citizen
for_each_perag_tag() is defined in xfs_icache.c for local use.
Promote this to xfs_ag.h and define equivalent iteration functions
so that we can use them to iterate AGs instead to replace open coded
perag walks and perag lookups.
We also convert as many of the straight forward open coded AG walks
to use these iterators as possible. Anything that is not a direct
conversion to an iterator is ignored and will be updated in future
commits.
Signed-off-by: Dave Chinner <dchinner@redhat.com>
Reviewed-by: Brian Foster <bfoster@redhat.com>
Reviewed-by: Darrick J. Wong <djwong@kernel.org>
Dave Chinner [Wed, 2 Jun 2021 00:48:24 +0000 (10:48 +1000)]
xfs: move perag structure and setup to libxfs/xfs_ag.[ch]
Move the xfs_perag infrastructure to the libxfs files that contain
all the per AG infrastructure. This helps set up for passing perags
around all the code instead of bare agnos with minimal extra
includes for existing files.
Signed-off-by: Dave Chinner <dchinner@redhat.com>
Reviewed-by: Brian Foster <bfoster@redhat.com>
Reviewed-by: Darrick J. Wong <djwong@kernel.org>
Dave Chinner [Wed, 2 Jun 2021 00:48:24 +0000 (10:48 +1000)]
xfs: prepare for moving perag definitions and support to libxfs
The perag structures really need to be defined with the rest of the
AG support infrastructure. The struct xfs_perag and init/teardown
has been placed in xfs_mount.[ch] because there are differences in
the structure between kernel and userspace. Mainly that userspace
doesn't have a lot of the internal stuff that the kernel has for
caches and discard and other such structures.
However, it makes more sense to move this to libxfs than to keep
this separation because we are now moving to use struct perags
everywhere in the code instead of passing raw agnumber_t values
about. Hence we shoudl really move the support infrastructure to
libxfs/xfs_ag.[ch].
To do this without breaking userspace, first we need to rearrange
the structures and code so that all the kernel specific code is
located together. This makes it simple for userspace to ifdef out
the all the parts it does not need, minimising the code differences
between kernel and userspace. The next commit will do the move...
Signed-off-by: Dave Chinner <dchinner@redhat.com>
Reviewed-by: Brian Foster <bfoster@redhat.com>
Reviewed-by: Darrick J. Wong <djwong@kernel.org>
Dave Chinner [Wed, 2 Jun 2021 00:48:24 +0000 (10:48 +1000)]
xfs: move xfs_perag_get/put to xfs_ag.[ch]
They are AG functions, not superblock functions, so move them to the
appropriate location.
Signed-off-by: Dave Chinner <dchinner@redhat.com>
Reviewed-by: Brian Foster <bfoster@redhat.com>
Reviewed-by: Darrick J. Wong <djwong@kernel.org>
Darrick J. Wong [Mon, 31 May 2021 18:31:56 +0000 (11:31 -0700)]
xfs: remove unnecessary shifts
The superblock verifier already validates that (1 << blocklog) ==
blocksize, so use the value directly instead of doing math.
Signed-off-by: Darrick J. Wong <djwong@kernel.org>
Reviewed-by: Dave Chinner <dchinner@redhat.com>
Reviewed-by: Carlos Maiolino <cmaiolino@redhat.com>
Darrick J. Wong [Mon, 31 May 2021 18:31:56 +0000 (11:31 -0700)]
xfs: clean up open-coded fs block unit conversions
Replace some open-coded fs block unit conversions with the standard
conversion macro.
Signed-off-by: Darrick J. Wong <djwong@kernel.org>
Reviewed-by: Dave Chinner <dchinner@redhat.com>
Reviewed-by: Carlos Maiolino <cmaiolino@redhat.com>
Dave Chinner [Tue, 1 Jun 2021 03:40:36 +0000 (13:40 +1000)]
xfs: move page freeing into _xfs_buf_free_pages()
Rather than open coding it just before we call
_xfs_buf_free_pages(). Also, rename the function to
xfs_buf_free_pages() as the leading underscore has no useful
meaning.
Signed-off-by: Dave Chinner <dchinner@redhat.com>
Reviewed-by: Darrick J. Wong <djwong@kernel.org>
Dave Chinner [Tue, 1 Jun 2021 03:40:36 +0000 (13:40 +1000)]
xfs: merge _xfs_buf_get_pages()
Only called from one place now, so merge it into
xfs_buf_alloc_pages(). Because page array allocation is dependent on
bp->b_pages being null, always ensure that when the pages array is
freed we always set bp->b_pages to null.
Also convert the page array to use kmalloc() rather than
kmem_alloc() so we can use the gfp flags we've already calculated
for the allocation context instead of hard coding KM_NOFS semantics.
Signed-off-by: Dave Chinner <dchinner@redhat.com>
Reviewed-by: Darrick J. Wong <djwong@kernel.org>
Dave Chinner [Tue, 1 Jun 2021 03:40:36 +0000 (13:40 +1000)]
xfs: use alloc_pages_bulk_array() for buffers
Because it's more efficient than allocating pages one at a time in a
loop.
Signed-off-by: Dave Chinner <dchinner@redhat.com>
Reviewed-by: Darrick J. Wong <djwong@kernel.org>
Dave Chinner [Tue, 1 Jun 2021 03:40:35 +0000 (13:40 +1000)]
xfs: use xfs_buf_alloc_pages for uncached buffers
Use the newly factored out page allocation code. This adds
automatic buffer zeroing for non-read uncached buffers.
This also allows us to greatly simply the error handling in
xfs_buf_get_uncached(). Because xfs_buf_alloc_pages() cleans up
partial allocation failure, we can just call xfs_buf_free() in all
error cases now to clean up after failures.
Signed-off-by: Dave Chinner <dchinner@redhat.com>
Reviewed-by: Darrick J. Wong <djwong@kernel.org>
Dave Chinner [Tue, 1 Jun 2021 03:40:02 +0000 (13:40 +1000)]
xfs: split up xfs_buf_allocate_memory
Based on a patch from Christoph Hellwig.
This splits out the heap allocation and page allocation portions of
the buffer memory allocation into two separate helper functions.
Signed-off-by: Dave Chinner <dchinner@redhat.com>
Reviewed-by: Darrick J. Wong <djwong@kernel.org>
Linus Torvalds [Sun, 30 May 2021 21:58:25 +0000 (11:58 -1000)]
Linux 5.13-rc4
Linus Torvalds [Sun, 30 May 2021 04:24:00 +0000 (18:24 -1000)]
Merge branch 'i2c/for-current' of git://git./linux/kernel/git/wsa/linux
Pull i2c fixes from Wolfram Sang:
"This is a bit larger than usual at rc4 time. The reason is due to
Lee's work of fixing newly reported build warnings.
The rest is fixes as usual"
* 'i2c/for-current' of git://git.kernel.org/pub/scm/linux/kernel/git/wsa/linux: (22 commits)
MAINTAINERS: adjust to removing i2c designware platform data
i2c: s3c2410: fix possible NULL pointer deref on read message after write
i2c: mediatek: Disable i2c start_en and clear intr_stat brfore reset
i2c: i801: Don't generate an interrupt on bus reset
i2c: mpc: implement erratum A-004447 workaround
powerpc/fsl: set fsl,i2c-erratum-
a004447 flag for P1010 i2c controllers
powerpc/fsl: set fsl,i2c-erratum-
a004447 flag for P2041 i2c controllers
dt-bindings: i2c: mpc: Add fsl,i2c-erratum-
a004447 flag
i2c: busses: i2c-stm32f4: Remove incorrectly placed ' ' from function name
i2c: busses: i2c-st: Fix copy/paste function misnaming issues
i2c: busses: i2c-pnx: Provide descriptions for 'alg_data' data structure
i2c: busses: i2c-ocores: Place the expected function names into the documentation headers
i2c: busses: i2c-eg20t: Fix 'bad line' issue and provide description for 'msgs' param
i2c: busses: i2c-designware-master: Fix misnaming of 'i2c_dw_init_master()'
i2c: busses: i2c-cadence: Fix incorrectly documented 'enum cdns_i2c_slave_mode'
i2c: busses: i2c-ali1563: File headers are not good candidates for kernel-doc
i2c: muxes: i2c-arb-gpio-challenge: Demote non-conformant kernel-doc headers
i2c: busses: i2c-nomadik: Fix formatting issue pertaining to 'timeout'
i2c: sh_mobile: Use new clock calculation formulas for RZ/G2E
i2c: I2C_HISI should depend on ACPI
...
Linus Torvalds [Sun, 30 May 2021 04:16:09 +0000 (18:16 -1000)]
Merge tag 'seccomp-fixes-v5.13-rc4' of git://git./linux/kernel/git/kees/linux
Pull seccomp fixes from Kees Cook:
"This fixes a hard-to-hit race condition in the addfd user_notif
feature of seccomp, visible since v5.9.
And a small documentation fix"
* tag 'seccomp-fixes-v5.13-rc4' of git://git.kernel.org/pub/scm/linux/kernel/git/kees/linux:
seccomp: Refactor notification handler to prepare for new semantics
Documentation: seccomp: Fix user notification documentation
Linus Torvalds [Sun, 30 May 2021 04:10:10 +0000 (18:10 -1000)]
Merge tag 'riscv-for-linus-5.13-rc4' of git://git./linux/kernel/git/riscv/linux
Pull RISC-V fixes from Palmer Dabbelt:
"A handful of RISC-V related fixes:
- avoid errors when the stack tracing code is tracing itself.
- resurrect the memtest= kernel command line argument on RISC-V,
which was briefly enabled during the merge window before a
refactoring disabled it.
- build fix and some warning cleanups"
* tag 'riscv-for-linus-5.13-rc4' of git://git.kernel.org/pub/scm/linux/kernel/git/riscv/linux:
riscv: kexec: Fix W=1 build warnings
riscv: kprobes: Fix build error when MMU=n
riscv: Select ARCH_USE_MEMTEST
riscv: stacktrace: fix the riscv stacktrace when CONFIG_FRAME_POINTER enabled
Linus Torvalds [Sun, 30 May 2021 03:47:19 +0000 (17:47 -1000)]
Merge tag 'xfs-5.13-fixes-3' of git://git./fs/xfs/xfs-linux
Pull xfs fixes from Darrick Wong:
"This week's pile mitigates some decades-old problems in how extent
size hints interact with realtime volumes, fixes some failures in
online shrink, and fixes a problem where directory and symlink
shrinking on extremely fragmented filesystems could fail.
The most user-notable change here is to point users at our (new) IRC
channel on OFTC. Freedom isn't free, it costs folks like you and me;
and if you don't kowtow, they'll expel everyone and take over your
channel. (Ok, ok, that didn't fit the song lyrics...)
Summary:
- Fix a bug where unmapping operations end earlier than expected,
which can cause chaos on multi-block directory and symlink shrink
operations.
- Fix an erroneous assert that can trigger if we try to transition a
bmap structure from btree format to extents format with zero
extents. This was exposed by xfs/538"
* tag 'xfs-5.13-fixes-3' of git://git.kernel.org/pub/scm/fs/xfs/xfs-linux:
xfs: bunmapi has unnecessary AG lock ordering issues
xfs: btree format inode forks can have zero extents
xfs: add new IRC channel to MAINTAINERS
xfs: validate extsz hints against rt extent size when rtinherit is set
xfs: standardize extent size hint validation
xfs: check free AG space when making per-AG reservations
Sargun Dhillon [Mon, 17 May 2021 19:39:06 +0000 (12:39 -0700)]
seccomp: Refactor notification handler to prepare for new semantics
This refactors the user notification code to have a do / while loop around
the completion condition. This has a small change in semantic, in that
previously we ignored addfd calls upon wakeup if the notification had been
responded to, but instead with the new change we check for an outstanding
addfd calls prior to returning to userspace.
Rodrigo Campos also identified a bug that can result in addfd causing
an early return, when the supervisor didn't actually handle the
syscall [1].
[1]: https://lore.kernel.org/lkml/
20210413160151.3301-1-rodrigo@kinvolk.io/
Fixes: 7cf97b125455 ("seccomp: Introduce addfd ioctl to seccomp user notifier")
Signed-off-by: Sargun Dhillon <sargun@sargun.me>
Acked-by: Tycho Andersen <tycho@tycho.pizza>
Acked-by: Christian Brauner <christian.brauner@ubuntu.com>
Signed-off-by: Kees Cook <keescook@chromium.org>
Tested-by: Rodrigo Campos <rodrigo@kinvolk.io>
Cc: stable@vger.kernel.org
Link: https://lore.kernel.org/r/20210517193908.3113-3-sargun@sargun.me
Linus Torvalds [Sat, 29 May 2021 16:55:55 +0000 (06:55 -1000)]
Merge tag 'thermal-v5.13-rc4' of git://git./linux/kernel/git/thermal/linux
Pull thermal fixes from Daniel Lezcano:
- Fix uninitialized error code value for the SPMI adc driver (Yang
Yingliang)
- Fix kernel doc warning (Yang Li)
- Fix wrong read-write thermal trip point initialization (Srinivas
Pandruvada)
* tag 'thermal-v5.13-rc4' of git://git.kernel.org/pub/scm/linux/kernel/git/thermal/linux:
thermal/drivers/qcom: Fix error code in adc_tm5_get_dt_channel_data()
thermal/ti-soc-thermal: Fix kernel-doc
thermal/drivers/intel: Initialize RW trip to THERMAL_TEMP_INVALID
Linus Torvalds [Sat, 29 May 2021 16:41:50 +0000 (06:41 -1000)]
Merge tag 'char-misc-5.13-rc4' of git://git./linux/kernel/git/gregkh/char-misc
Pull char/misc driver fixes from Greg KH:
"Here are some tiny char/misc driver fixes for 5.13-rc4.
Nothing huge here, just some tiny fixes for reported issues:
- two interconnect driver fixes
- kgdb build warning fix for gcc-11
- hgafb regression fix
- soundwire driver fix
- mei driver fix
All have been in linux-next with no reported issues"
* tag 'char-misc-5.13-rc4' of git://git.kernel.org/pub/scm/linux/kernel/git/gregkh/char-misc:
mei: request autosuspend after sending rx flow control
kgdb: fix gcc-11 warnings harder
video: hgafb: correctly handle card detect failure during probe
soundwire: qcom: fix handling of qcom,ports-block-pack-mode
interconnect: qcom: Add missing MODULE_DEVICE_TABLE
interconnect: qcom: bcm-voter: add a missing of_node_put()
Linus Torvalds [Sat, 29 May 2021 16:33:28 +0000 (06:33 -1000)]
Merge tag 'driver-core-5.13-rc4' of git://git./linux/kernel/git/gregkh/driver-core
Pull driver core fixes from Greg KH:
"Here are three small driver core / debugfs fixes for 5.13-rc4:
- debugfs fix for incorrect "lockdown" mode for selinux accesses
- two device link changes, one bugfix and one cleanup
All of these have been in linux-next for over a week with no reported
problems"
* tag 'driver-core-5.13-rc4' of git://git.kernel.org/pub/scm/linux/kernel/git/gregkh/driver-core:
drivers: base: Reduce device link removal code duplication
drivers: base: Fix device link removal
debugfs: fix security_locked_down() call for SELinux
Linus Torvalds [Sat, 29 May 2021 16:29:13 +0000 (06:29 -1000)]
Merge tag 'staging-5.13-rc4' of git://git./linux/kernel/git/gregkh/staging
Pull staging and IIO driver fixes from Greg KH:
"Here are some small IIO and staging driver fixes for reported issues
for 5.13-rc4.
Nothing major here, tiny changes for reported problems, full details
are in the shortlog if people are curious.
All have been in linux-next for a while with no reported problems"
* tag 'staging-5.13-rc4' of git://git.kernel.org/pub/scm/linux/kernel/git/gregkh/staging:
iio: adc: ad7793: Add missing error code in ad7793_setup()
iio: adc: ad7923: Fix undersized rx buffer.
iio: adc: ad7768-1: Fix too small buffer passed to iio_push_to_buffers_with_timestamp()
iio: dac: ad5770r: Put fwnode in error case during ->probe()
iio: gyro: fxas21002c: balance runtime power in error path
staging: emxx_udc: fix loop in _nbu2ss_nuke()
staging: iio: cdc: ad7746: avoid overwrite of num_channels
iio: adc: ad7192: handle regulator voltage error first
iio: adc: ad7192: Avoid disabling a clock that was never enabled.
iio: adc: ad7124: Fix potential overflow due to non sequential channel numbers
iio: adc: ad7124: Fix missbalanced regulator enable / disable on error.
Linus Torvalds [Sat, 29 May 2021 16:25:16 +0000 (06:25 -1000)]
Merge tag 'tty-5.13-rc4' of git://git./linux/kernel/git/gregkh/tty
Pull tty / serial driver fixes from Greg KH:
"Here are some small fixes for reported problems for tty and serial
drivers for 5.13-rc4.
They consist of:
- 8250 bugfixes and new device support
- lockdown security mode fixup
- syzbot found problems fixed
- 8250_omap fix for interrupt storm
- revert of 8250_omap driver fix as it caused worse problem than the
original issue
All but the last patch have been in linux-next for a while, the last
one is a revert of a problem found in linux-next with the 8250_omap
driver change"
* tag 'tty-5.13-rc4' of git://git.kernel.org/pub/scm/linux/kernel/git/gregkh/tty:
Revert "serial: 8250: 8250_omap: Fix possible interrupt storm"
serial: 8250_pci: handle FL_NOIRQ board flag
serial: rp2: use 'request_firmware' instead of 'request_firmware_nowait'
serial: 8250_pci: Add support for new HPE serial device
serial: 8250: 8250_omap: Fix possible interrupt storm
serial: 8250: Use BIT(x) for UART_{CAP,BUG}_*
serial: 8250: Add UART_BUG_TXRACE workaround for Aspeed VUART
serial: 8250_dw: Add device HID for new AMD UART controller
serial: sh-sci: Fix off-by-one error in FIFO threshold register setting
serial: core: fix suspicious security_locked_down() call
serial: tegra: Fix a mask operation that is always true
Linus Torvalds [Sat, 29 May 2021 16:11:21 +0000 (06:11 -1000)]
Merge tag 'usb-5.13-rc4' of git://git./linux/kernel/git/gregkh/usb
Pull USB / Thunderbolt fixes from Greg KH:
"Here are a number of tiny USB and Thunderbolt driver fixes for
5.13-rc4.
They consist of:
- thunderbolt fixes for some NVM bound issues
- xhci fixes for reported problems
- control-request fixups
- documentation build warning fixes
- new usb-serial driver device ids
- typec bugfixes for reported issues
- usbfs warning fixups (could be triggered from userspace)
- other tiny fixes for reported problems.
All of these have been in linux-next with no reported issues"
* tag 'usb-5.13-rc4' of git://git.kernel.org/pub/scm/linux/kernel/git/gregkh/usb: (22 commits)
xhci: Fix 5.12 regression of missing xHC cache clearing command after a Stall
xhci: fix giving back URB with incorrect status regression in 5.12
usb: gadget: udc: renesas_usb3: Fix a race in usb3_start_pipen()
usb: typec: tcpm: Respond Not_Supported if no snk_vdo
usb: typec: tcpm: Properly interrupt VDM AMS
USB: trancevibrator: fix control-request direction
usb: Restore the usb_header label
usb: typec: tcpm: Use LE to CPU conversion when accessing msg->header
usb: typec: ucsi: Clear pending after acking connector change
usb: typec: mux: Fix matching with typec_altmode_desc
misc/uss720: fix memory leak in uss720_probe
usb: dwc3: gadget: Properly track pending and queued SG
USB: usbfs: Don't WARN about excessively large memory allocations
thunderbolt: usb4: Fix NVM read buffer bounds and offset issue
thunderbolt: dma_port: Fix NVM read buffer bounds and offset issue
usb: chipidea: udc: assign interrupt number to USB gadget structure
usb: cdnsp: Fix lack of removing request from pending list.
usb: cdns3: Fix runtime PM imbalance on error
USB: serial: pl2303: add device id for ADLINK ND-6530 GC
USB: serial: ti_usb_3410_5052: add startech.com device id
...
Linus Torvalds [Sat, 29 May 2021 16:02:25 +0000 (06:02 -1000)]
Merge tag 'for-linus' of git://git./virt/kvm/kvm
Pull KVM fixes from Paolo Bonzini:
"ARM fixes:
- Another state update on exit to userspace fix
- Prevent the creation of mixed 32/64 VMs
- Fix regression with irqbypass not restarting the guest on failed
connect
- Fix regression with debug register decoding resulting in
overlapping access
- Commit exception state on exit to usrspace
- Fix the MMU notifier return values
- Add missing 'static' qualifiers in the new host stage-2 code
x86 fixes:
- fix guest missed wakeup with assigned devices
- fix WARN reported by syzkaller
- do not use BIT() in UAPI headers
- make the kvm_amd.avic parameter bool
PPC fixes:
- make halt polling heuristics consistent with other architectures
selftests:
- various fixes
- new performance selftest memslot_perf_test
- test UFFD minor faults in demand_paging_test"
* tag 'for-linus' of git://git.kernel.org/pub/scm/virt/kvm/kvm: (44 commits)
selftests: kvm: fix overlapping addresses in memslot_perf_test
KVM: X86: Kill off ctxt->ud
KVM: X86: Fix warning caused by stale emulation context
KVM: X86: Use kvm_get_linear_rip() in single-step and #DB/#BP interception
KVM: x86/mmu: Fix comment mentioning skip_4k
KVM: VMX: update vcpu posted-interrupt descriptor when assigning device
KVM: rename KVM_REQ_PENDING_TIMER to KVM_REQ_UNBLOCK
KVM: x86: add start_assignment hook to kvm_x86_ops
KVM: LAPIC: Narrow the timer latency between wait_lapic_expire and world switch
selftests: kvm: do only 1 memslot_perf_test run by default
KVM: X86: Use _BITUL() macro in UAPI headers
KVM: selftests: add shared hugetlbfs backing source type
KVM: selftests: allow using UFFD minor faults for demand paging
KVM: selftests: create alias mappings when using shared memory
KVM: selftests: add shmem backing source type
KVM: selftests: refactor vm_mem_backing_src_type flags
KVM: selftests: allow different backing source types
KVM: selftests: compute correct demand paging size
KVM: selftests: simplify setup_demand_paging error handling
KVM: selftests: Print a message if /dev/kvm is missing
...
Linus Torvalds [Sat, 29 May 2021 15:51:53 +0000 (05:51 -1000)]
Merge tag 's390-5.13-3' of git://git./linux/kernel/git/s390/linux
Pull s390 fixes from Vasily Gorbik:
"Fix races in vfio-ccw request handling"
* tag 's390-5.13-3' of git://git.kernel.org/pub/scm/linux/kernel/git/s390/linux:
vfio-ccw: Serialize FSM IDLE state with I/O completion
vfio-ccw: Reset FSM state to IDLE inside FSM
vfio-ccw: Check initialized flag in cp_init()
Paolo Bonzini [Fri, 28 May 2021 19:10:58 +0000 (15:10 -0400)]
selftests: kvm: fix overlapping addresses in memslot_perf_test
vm_create allocates memory and maps it close to GPA. This memory
is separate from what is allocated in subsequent calls to
vm_userspace_mem_region_add, so it is incorrect to pass the
test memory size to vm_create_default. Just pass a small
fixed amount of memory which can be used later for page table,
otherwise GPAs are already allocated at MEM_GPA and the
test aborts.
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
Linus Torvalds [Sat, 29 May 2021 00:47:48 +0000 (14:47 -1000)]
Merge tag 'scsi-fixes' of git://git./linux/kernel/git/jejb/scsi
Pull SCSI fixes from James Bottomley:
"Ten small fixes, all in drivers"
* tag 'scsi-fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/jejb/scsi:
scsi: target: qla2xxx: Wait for stop_phase1 at WWN removal
scsi: hisi_sas: Drop free_irq() of devm_request_irq() allocated irq
scsi: vmw_pvscsi: Set correct residual data length
scsi: bnx2fc: Return failure if io_req is already in ABTS processing
scsi: aic7xxx: Remove multiple definition of globals
scsi: aic7xxx: Restore several defines for aic7xxx firmware build
scsi: target: iblock: Fix smp_processor_id() BUG messages
scsi: libsas: Use _safe() loop in sas_resume_port()
scsi: target: tcmu: Fix xarray RCU warning
scsi: target: core: Avoid smp_processor_id() in preemptible code
Linus Torvalds [Sat, 29 May 2021 00:42:37 +0000 (14:42 -1000)]
Merge tag 'block-5.13-2021-05-28' of git://git.kernel.dk/linux-block
Pull block fixes from Jens Axboe:
- NVMe pull request (Christoph):
- fix a memory leak in nvme_cdev_add (Guoqing Jiang)
- fix inline data size comparison in nvmet_tcp_queue_response (Hou
Pu)
- fix false keep-alive timeout when a controller is torn down
(Sagi Grimberg)
- fix a nvme-tcp Kconfig dependency (Sagi Grimberg)
- short-circuit reconnect retries for FC (Hannes Reinecke)
- decode host pathing error for connect (Hannes Reinecke)
- MD pull request (Song):
- Fix incorrect chunk boundary assert (Christoph)
- Fix s390/dasd verification panic (Stefan)
* tag 'block-5.13-2021-05-28' of git://git.kernel.dk/linux-block:
nvmet: fix false keep-alive timeout when a controller is torn down
nvmet-tcp: fix inline data size comparison in nvmet_tcp_queue_response
nvme-tcp: remove incorrect Kconfig dep in BLK_DEV_NVME
md/raid5: remove an incorrect assert in in_chunk_boundary
s390/dasd: add missing discipline function
nvme-fabrics: decode host pathing error for connect
nvme-fc: short-circuit reconnect retries
nvme: fix potential memory leaks in nvme_cdev_add
Linus Torvalds [Sat, 29 May 2021 00:35:55 +0000 (14:35 -1000)]
Merge tag 'io_uring-5.13-2021-05-28' of git://git.kernel.dk/linux-block
Pull io_uring fixes from Jens Axboe:
"A few minor fixes:
- Fix an issue with hashed wait removal on exit (Zqiang, Pavel)
- Fix a recent data race introduced in this series (Marco)"
* tag 'io_uring-5.13-2021-05-28' of git://git.kernel.dk/linux-block:
io_uring: fix data race to avoid potential NULL-deref
io-wq: Fix UAF when wakeup wqe in hash waitqueue
io_uring/io-wq: close io-wq full-stop gap
Linus Torvalds [Sat, 29 May 2021 00:28:58 +0000 (14:28 -1000)]
Merge tag 'drm-fixes-2021-05-29' of git://anongit.freedesktop.org/drm/drm
Pull drm fixes from Dave Airlie:
"Pretty quiet this week, couple of amdgpu, one i915, and a few misc otherwise.
ttm:
- prevent irrelevant swapout
amdgpu:
- MultiGPU fan fix
- VCN powergating fixes
amdkfd:
- Fix SDMA register offset error
meson:
- fix shutdown crash
i915:
- Re-enable LTTPR non-transparent LT mode for DPCD_REV < 1.4"
* tag 'drm-fixes-2021-05-29' of git://anongit.freedesktop.org/drm/drm:
drm/ttm: Skip swapout if ttm object is not populated
drm/i915: Reenable LTTPR non-transparent LT mode for DPCD_REV<1.4
drm/meson: fix shutdown crash when component not probed
drm/amdgpu/jpeg3: add cancel_delayed_work_sync before power gate
drm/amdgpu/jpeg2.5: add cancel_delayed_work_sync before power gate
drm/amdgpu/jpeg2.0: add cancel_delayed_work_sync before power gate
drm/amdgpu/vcn3: add cancel_delayed_work_sync before power gate
drm/amdgpu/vcn2.5: add cancel_delayed_work_sync before power gate
drm/amdgpu/vcn2.0: add cancel_delayed_work_sync before power gate
drm/amdgpu/vcn1: add cancel_delayed_work_sync before power gate
drm/amdkfd: correct sienna_cichlid SDMA RLC register offset error
drm/amd/pm: correct MGpuFanBoost setting
Linus Torvalds [Sat, 29 May 2021 00:23:05 +0000 (14:23 -1000)]
Merge tag 'perf-tools-fixes-for-v5.13-2021-05-28' of git://git./linux/kernel/git/acme/linux
Pull perf tools fixes from Arnaldo Carvalho de Melo:
- Fix error checking of BPF prog attachment in 'perf stat'.
- Fix getting maximum number of fds in the vendor events JSON parser.
- Move debug initialization earlier, fixing a segfault in some cases.
- Fix eventcode of power10 JSON events.
* tag 'perf-tools-fixes-for-v5.13-2021-05-28' of git://git.kernel.org/pub/scm/linux/kernel/git/acme/linux:
perf vendor events powerpc: Fix eventcode of power10 JSON events
perf stat: Fix error check for bpf_program__attach
perf debug: Move debug initialization earlier
perf jevents: Fix getting maximum number of fds
Linus Torvalds [Sat, 29 May 2021 00:15:47 +0000 (14:15 -1000)]
Merge tag '5.13-rc4-smb3' of git://git.samba.org/sfrench/cifs-2.6
Pull cifs fixes from Steve French:
"Three SMB3 fixes.
Two for stable, and the other fixes a problem pointed out with a
recently added ioctl"
* tag '5.13-rc4-smb3' of git://git.samba.org/sfrench/cifs-2.6:
cifs: change format of CIFS_FULL_KEY_DUMP ioctl
cifs: fix string declarations and assignments in tracepoints
cifs: set server->cipher_type to AES-128-CCM for SMB3.0
Linus Torvalds [Fri, 28 May 2021 18:53:19 +0000 (08:53 -1000)]
Merge tag 'nfs-for-5.13-2' of git://git.linux-nfs.org/projects/trondmy/linux-nfs
Pull NFS client bugfixes from Trond Myklebust:
"Stable fixes:
- Fix v4.0/v4.1 SEEK_DATA return -ENOTSUPP when set NFS_V4_2 config
- Fix Oops in xs_tcp_send_request() when transport is disconnected
- Fix a NULL pointer dereference in pnfs_mark_matching_lsegs_return()
Bugfixes:
- Fix instances where signal_pending() should be fatal_signal_pending()
- fix an incorrect limit in filelayout_decode_layout()
- Fixes for the SUNRPC backlogged RPC queue
- Don't corrupt the value of pg_bytes_written in nfs_do_recoalesce()
- Revert commit
586a0787ce35 ("Clean up rpcrdma_prepare_readch()")"
* tag 'nfs-for-5.13-2' of git://git.linux-nfs.org/projects/trondmy/linux-nfs:
nfs: Remove trailing semicolon in macros
xprtrdma: Revert
586a0787ce35
NFSv4: Fix v4.0/v4.1 SEEK_DATA return -ENOTSUPP when set NFS_V4_2 config
NFS: Clean up reset of the mirror accounting variables
NFS: Don't corrupt the value of pg_bytes_written in nfs_do_recoalesce()
NFS: Fix an Oopsable condition in __nfs_pageio_add_request()
SUNRPC: More fixes for backlog congestion
SUNRPC: Fix Oops in xs_tcp_send_request() when transport is disconnected
NFSv4: Fix a NULL pointer dereference in pnfs_mark_matching_lsegs_return()
SUNRPC in case of backlog, hand free slots directly to waiting task
pNFS/NFSv4: Remove redundant initialization of 'rd_size'
NFS: fix an incorrect limit in filelayout_decode_layout()
fs/nfs: Use fatal_signal_pending instead of signal_pending
Linus Torvalds [Fri, 28 May 2021 18:47:50 +0000 (08:47 -1000)]
Merge tag 'sound-5.13-rc4' of git://git./linux/kernel/git/tiwai/sound
Pull sound fixes from Takashi Iwai:
"A slightly high volume at this time due to pending ASoC fixes.
While there are a few generic simple-card fixes for regressions, most
of the changes are device-specific fixes: ASoC Intel SOF, codec
clocks, other codec / platform fixes as well as usual HD-audio and
USB-audio"
* tag 'sound-5.13-rc4' of git://git.kernel.org/pub/scm/linux/kernel/git/tiwai/sound: (37 commits)
ALSA: hda/realtek: fix mute/micmute LEDs and speaker for HP Zbook Fury 17 G8
ALSA: hda/realtek: fix mute/micmute LEDs and speaker for HP Zbook Fury 15 G8
ALSA: hda/realtek: fix mute/micmute LEDs and speaker for HP Zbook G8
ALSA: hda/realtek: fix mute/micmute LEDs for HP 855 G8
ALSA: hda/realtek: Chain in pop reduction fixup for ThinkStation P340
ALSA: usb-audio: scarlett2: snd_scarlett_gen2_controls_create() can be static
ALSA: hda/realtek: the bass speaker can't output sound on Yoga 9i
ALSA: hda/realtek: Headphone volume is controlled by Front mixer
ALSA: usb-audio: scarlett2: Improve driver startup messages
ALSA: usb-audio: scarlett2: Fix device hang with ehci-pci
ALSA: usb-audio: fix control-request direction
ASoC: qcom: lpass-cpu: Use optional clk APIs
ASoC: cs35l33: fix an error code in probe()
ASoC: SOF: Intel: hda: don't send DAI_CONFIG IPC for older firmware
ASoC: fsl: fix SND_SOC_IMX_RPMSG dependency
ASoC: cs42l52: Minor tidy up of error paths
ASoC: cs35l32: Add missing regmap use_single config
ASoC: cs35l34: Add missing regmap use_single config
ASoC: cs42l73: Add missing regmap use_single config
ASoC: cs53l30: Add missing regmap use_single config
...
Linus Torvalds [Fri, 28 May 2021 18:31:48 +0000 (08:31 -1000)]
Merge tag 'clang-features-v5.13-rc4' of git://git./linux/kernel/git/kees/linux
Pull clang feature fixes from Kees Cook:
- Correctly pass stack frame size checking under LTO (Nick Desaulniers)
- Avoid CFI mismatches by checking initcall_t types (Marco Elver)
* tag 'clang-features-v5.13-rc4' of git://git.kernel.org/pub/scm/linux/kernel/git/kees/linux:
Makefile: LTO: have linker check -Wframe-larger-than
init: verify that function is initcall_t at compile-time
Linus Torvalds [Fri, 28 May 2021 18:24:13 +0000 (08:24 -1000)]
Merge tag 'mips-fixes_5.13_1' of git://git./linux/kernel/git/mips/linux
Pull MIPS fixes from Thomas Bogendoerfer:
- fix function/preempt trace hangs
- a few build fixes
* tag 'mips-fixes_5.13_1' of git://git.kernel.org/pub/scm/linux/kernel/git/mips/linux:
MIPS: Fix kernel hang under FUNCTION_GRAPH_TRACER and PREEMPT_TRACER
MIPS: ralink: export rt_sysc_membase for rt2880_wdt.c
MIPS: launch.h: add include guard to prevent build errors
MIPS: alchemy: xxs1500: add gpio-au1000.h header file
Paolo Bonzini [Fri, 28 May 2021 17:02:03 +0000 (13:02 -0400)]
Merge tag 'kvmarm-fixes-5.13-2' of git://git./linux/kernel/git/kvmarm/kvmarm into HEAD
KVM/arm64 fixes for 5.13, take #2
- Another state update on exit to userspace fix
- Prevent the creation of mixed 32/64 VMs
Wanpeng Li [Fri, 28 May 2021 00:01:37 +0000 (17:01 -0700)]
KVM: X86: Kill off ctxt->ud
ctxt->ud is consumed only by x86_decode_insn(), we can kill it off by
passing emulation_type to x86_decode_insn() and dropping ctxt->ud
altogether. Tracking that info in ctxt for literally one call is silly.
Suggested-by: Sean Christopherson <seanjc@google.com>
Signed-off-by: Wanpeng Li <wanpengli@tencent.com>
Reviewed-by: Sean Christopherson <seanjc@google.com>
Message-Id: <
1622160097-37633-2-git-send-email-wanpengli@tencent.com>
Wanpeng Li [Fri, 28 May 2021 00:01:36 +0000 (17:01 -0700)]
KVM: X86: Fix warning caused by stale emulation context
Reported by syzkaller:
WARNING: CPU: 7 PID: 10526 at linux/arch/x86/kvm//x86.c:7621 x86_emulate_instruction+0x41b/0x510 [kvm]
RIP: 0010:x86_emulate_instruction+0x41b/0x510 [kvm]
Call Trace:
kvm_mmu_page_fault+0x126/0x8f0 [kvm]
vmx_handle_exit+0x11e/0x680 [kvm_intel]
vcpu_enter_guest+0xd95/0x1b40 [kvm]
kvm_arch_vcpu_ioctl_run+0x377/0x6a0 [kvm]
kvm_vcpu_ioctl+0x389/0x630 [kvm]
__x64_sys_ioctl+0x8e/0xd0
do_syscall_64+0x3c/0xb0
entry_SYSCALL_64_after_hwframe+0x44/0xae
Commit
4a1e10d5b5d8 ("KVM: x86: handle hardware breakpoints during emulation())
adds hardware breakpoints check before emulation the instruction and parts of
emulation context initialization, actually we don't have the EMULTYPE_NO_DECODE flag
here and the emulation context will not be reused. Commit
c8848cee74ff ("KVM: x86:
set ctxt->have_exception in x86_decode_insn()) triggers the warning because it
catches the stale emulation context has #UD, however, it is not during instruction
decoding which should result in EMULATION_FAILED. This patch fixes it by moving
the second part emulation context initialization into init_emulate_ctxt() and
before hardware breakpoints check. The ctxt->ud will be dropped by a follow-up
patch.
syzkaller source: https://syzkaller.appspot.com/x/repro.c?x=
134683fdd00000
Reported-by: syzbot+71271244f206d17f6441@syzkaller.appspotmail.com
Fixes: 4a1e10d5b5d8 (KVM: x86: handle hardware breakpoints during emulation)
Signed-off-by: Wanpeng Li <wanpengli@tencent.com>
Reviewed-by: Sean Christopherson <seanjc@google.com>
Message-Id: <
1622160097-37633-1-git-send-email-wanpengli@tencent.com>
Yuan Yao [Wed, 26 May 2021 06:38:28 +0000 (14:38 +0800)]
KVM: X86: Use kvm_get_linear_rip() in single-step and #DB/#BP interception
The kvm_get_linear_rip() handles x86/long mode cases well and has
better readability, __kvm_set_rflags() also use the paired
function kvm_is_linear_rip() to check the vcpu->arch.singlestep_rip
set in kvm_arch_vcpu_ioctl_set_guest_debug(), so change the
"CS.BASE + RIP" code in kvm_arch_vcpu_ioctl_set_guest_debug() and
handle_exception_nmi() to this one.
Signed-off-by: Yuan Yao <yuan.yao@intel.com>
Message-Id: <
20210526063828.1173-1-yuan.yao@linux.intel.com>
Reviewed-by: Sean Christopherson <seanjc@google.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
Sargun Dhillon [Mon, 17 May 2021 19:39:05 +0000 (12:39 -0700)]
Documentation: seccomp: Fix user notification documentation
The documentation had some previously incorrect information about how
userspace notifications (and responses) were handled due to a change
from a previously proposed patchset.
Signed-off-by: Sargun Dhillon <sargun@sargun.me>
Acked-by: Tycho Andersen <tycho@tycho.pizza>
Acked-by: Christian Brauner <christian.brauner@ubuntu.com>
Signed-off-by: Kees Cook <keescook@chromium.org>
Fixes: 6a21cc50f0c7 ("seccomp: add a return code to trap to userspace")
Cc: stable@vger.kernel.org
Link: https://lore.kernel.org/r/20210517193908.3113-2-sargun@sargun.me
Lukas Bulwahn [Mon, 19 Apr 2021 06:18:09 +0000 (08:18 +0200)]
MAINTAINERS: adjust to removing i2c designware platform data
Commit
5a517b5bf687 ("i2c: designware: Get rid of legacy platform data")
removes ./include/linux/platform_data/i2c-designware.h, but misses to
adjust the SYNOPSYS DESIGNWARE I2C DRIVER section in MAINTAINERS.
Hence, ./scripts/get_maintainer.pl --self-test=patterns complains:
warning: no file matches F: include/linux/platform_data/i2c-designware.h
Remove the file entry to this removed file as well.
Signed-off-by: Lukas Bulwahn <lukas.bulwahn@gmail.com>
Reviewed-by: Andy Shevchenko <andy.shevchenko@gmail.com>
Signed-off-by: Wolfram Sang <wsa@kernel.org>
Kajol Jain [Tue, 25 May 2021 06:37:23 +0000 (12:07 +0530)]
perf vendor events powerpc: Fix eventcode of power10 JSON events
Fixed the eventcode values in the power10 JSON event files to prepend
"0x" since these are hexadecimal values.
The patch also changes the event description of the PM_EXEC_STALL_LOAD_FINISH
and PM_EXEC_STALL_NTC_FLUSH event and move some events to correct files.
Fixes: 32daa5d7899e ("perf vendor events: Initial JSON/events list for power10 platform")
Signed-off-by: Kajol Jain <kjain@linux.ibm.com>
Reviewed-by: Paul A. Clarke <pc@us.ibm.com>
Tested-by: Nageswara R Sastry <rnsastry@linux.ibm.com>
Cc: Athira Jajeev <atrajeev@linux.vnet.ibm.com>
Cc: Jiri Olsa <jolsa@redhat.com>
Cc: Madhavan Srinivasan <maddy@linux.vnet.ibm.com>
Cc: Michael Ellerman <mpe@ellerman.id.au>
Cc: Ravi Bangoria <ravi.bangoria@linux.ibm.com>
Cc: linuxppc-dev@lists.ozlabs.org
Link: http://lore.kernel.org/lkml/20210525063723.1191514-1-kjain@linux.ibm.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Greg Kroah-Hartman [Fri, 28 May 2021 08:58:49 +0000 (10:58 +0200)]
Revert "serial: 8250: 8250_omap: Fix possible interrupt storm"
This reverts commit
31fae7c8b18c3f8029a2a5dce97a3182c1a167a0.
Tony writes:
I just noticed this causes the following regression in Linux
next when pressing a key on uart console after boot at least on
omap3. This seems to happen on serial_port_in(port, UART_RX) in
the quirk handling.
So let's drop this.
Cc: stable@vger.kernel.org
Link: https://lore.kernel.org/r/YLCCJzkkB4N7LTQS@atomide.com
Fixes: 31fae7c8b18c ("serial: 8250: 8250_omap: Fix possible interrupt storm")
Reported-by: Tony Lindgren <tony@atomide.com>
Cc: Jan Kiszka <jan.kiszka@siemens.com>
Cc: Vignesh Raghavendra <vigneshr@ti.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Krzysztof Kozlowski [Wed, 26 May 2021 12:39:37 +0000 (08:39 -0400)]
i2c: s3c2410: fix possible NULL pointer deref on read message after write
Interrupt handler processes multiple message write requests one after
another, till the driver message queue is drained. However if driver
encounters a read message without preceding START, it stops the I2C
transfer as it is an invalid condition for the controller. At least the
comment describes a requirement "the controller forces us to send a new
START when we change direction". This stop results in clearing the
message queue (i2c->msg = NULL).
The code however immediately jumped back to label "retry_write" which
dereferenced the "i2c->msg" making it a possible NULL pointer
dereference.
The Coverity analysis:
1. Condition !is_msgend(i2c), taking false branch.
if (!is_msgend(i2c)) {
2. Condition !is_lastmsg(i2c), taking true branch.
} else if (!is_lastmsg(i2c)) {
3. Condition i2c->msg->flags & 1, taking true branch.
if (i2c->msg->flags & I2C_M_RD) {
4. write_zero_model: Passing i2c to s3c24xx_i2c_stop, which sets i2c->msg to NULL.
s3c24xx_i2c_stop(i2c, -EINVAL);
5. Jumping to label retry_write.
goto retry_write;
6. var_deref_model: Passing i2c to is_msgend, which dereferences null i2c->msg.
if (!is_msgend(i2c)) {"
All previous calls to s3c24xx_i2c_stop() in this interrupt service
routine are followed by jumping to end of function (acknowledging
the interrupt and returning). This seems a reasonable choice also here
since message buffer was entirely emptied.
Addresses-Coverity: Explicit null dereferenced
Fixes: 1da177e4c3f4 ("Linux-2.6.12-rc2")
Signed-off-by: Krzysztof Kozlowski <krzysztof.kozlowski@canonical.com>
Signed-off-by: Wolfram Sang <wsa@kernel.org>
Qii Wang [Thu, 27 May 2021 12:04:04 +0000 (20:04 +0800)]
i2c: mediatek: Disable i2c start_en and clear intr_stat brfore reset
The i2c controller driver do dma reset after transfer timeout,
but sometimes dma reset will trigger an unexpected DMA_ERR irq.
It will cause the i2c controller to continuously send interrupts
to the system and cause soft lock-up. So we need to disable i2c
start_en and clear intr_stat to stop i2c controller before dma
reset when transfer timeout.
Fixes: aafced673c06("i2c: mediatek: move dma reset before i2c reset")
Signed-off-by: Qii Wang <qii.wang@mediatek.com>
Signed-off-by: Wolfram Sang <wsa@kernel.org>
Dave Airlie [Fri, 28 May 2021 03:28:18 +0000 (13:28 +1000)]
Merge tag 'drm-intel-fixes-2021-05-27' of ssh://git.freedesktop.org/git/drm/drm-intel into drm-fixes
drm/i915 fixes for v5.13-rc4:
- Re-enable LTTPR non-transparent LT mode for DPCD_REV<1.4
Signed-off-by: Dave Airlie <airlied@redhat.com>
From: Jani Nikula <jani.nikula@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/875yz4bnmj.fsf@intel.com