From: Linus Torvalds
Date: Thu, 26 May 2022 19:32:41 +0000 (-0700)
Subject: Merge tag 'mm-stable-2022-05-25' of git://git.kernel.org/pub/scm/linux/kernel/git...
X-Git-Url: http://git.maquefel.me/?a=commitdiff_plain;h=98931dd95fd489fcbfa97da563505a6f071d7c77;p=linux.git

Merge tag 'mm-stable-2022-05-25' of git://git./linux/kernel/git/akpm/mm

Pull MM updates from Andrew Morton:
 "Almost all of MM here. A few things are still getting finished off,
  reviewed, etc.

  - Yang Shi has improved the behaviour of khugepaged collapsing of
    readonly file-backed transparent hugepages.

  - Johannes Weiner has arranged for zswap memory use to be tracked and
    managed on a per-cgroup basis.

  - Muchun Song adds a /proc knob ("hugetlb_optimize_vmemmap") for
    runtime enablement of the recent huge page vmemmap optimization
    feature.

  - Baolin Wang contributes a series to fix some issues around hugetlb
    pagetable invalidation.

  - Zhenwei Pi has fixed some interactions between hwpoisoned pages and
    virtualization.

  - Tong Tiangen has enabled the use of the presently x86-only
    page_table_check debugging feature on arm64 and riscv.

  - David Vernet has done some fixup work on the memcg selftests.

  - Peter Xu has taught userfaultfd to handle write protection faults
    against shmem- and hugetlbfs-backed files.

  - More DAMON development from SeongJae Park - adding online tuning of
    the feature and support for monitoring of fixed virtual address
    ranges. Also easier discovery of which monitoring operations are
    available.

  - Nadav Amit has done some optimization of TLB flushing during
    mprotect().

  - Neil Brown continues to labor away at improving our swap-over-NFS
    support.

  - David Hildenbrand has some fixes to anon page COWing versus
    get_user_pages().

  - Peng Liu fixed some errors in the core hugetlb code.

  - Joao Martins has reduced the amount of memory consumed by
    device-dax's compound devmaps.

  - Some cleanups of the arch-specific pagemap code from Anshuman
    Khandual.

  - Muchun Song has found and fixed some errors in the TLB flushing of
    transparent hugepages.

  - Roman Gushchin has done more work on the memcg selftests.

  ... and, of course, many smaller fixes and cleanups. Notably, the
  customary million cleanup series from Miaohe Lin"

* tag 'mm-stable-2022-05-25' of git://git.kernel.org/pub/scm/linux/kernel/git/akpm/mm: (381 commits)
  mm: kfence: use PAGE_ALIGNED helper
  selftests: vm: add the "settings" file with timeout variable
  selftests: vm: add "test_hmm.sh" to TEST_FILES
  selftests: vm: check numa_available() before operating "merge_across_nodes" in ksm_tests
  selftests: vm: add migration to the .gitignore
  selftests/vm/pkeys: fix typo in comment
  ksm: fix typo in comment
  selftests: vm: add process_mrelease tests
  Revert "mm/vmscan: never demote for memcg reclaim"
  mm/kfence: print disabling or re-enabling message
  include/trace/events/percpu.h: cleanup for "percpu: improve percpu_alloc_percpu event trace"
  include/trace/events/mmflags.h: cleanup for "tracing: incorrect gfp_t conversion"
  mm: fix a potential infinite loop in start_isolate_page_range()
  MAINTAINERS: add Muchun as co-maintainer for HugeTLB
  zram: fix Kconfig dependency warning
  mm/shmem: fix shmem folio swapoff hang
  cgroup: fix an error handling path in alloc_pagecache_max_30M()
  mm: damon: use HPAGE_PMD_SIZE
  tracing: incorrect isolate_mote_t cast in mm_vmscan_lru_isolate
  nodemask.h: fix compilation error with GCC12
  ...
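One item from the list above, as a hedged sketch: Peter Xu's userfaultfd work
means a write-protect ioctl like the one below now also covers shmem- and
hugetlbfs-backed ranges. This is illustrative code against the existing uffd
uapi, not code from the series; wp_range() is a made-up helper, and the range
is assumed to have already been registered with UFFDIO_REGISTER_MODE_WP.

  #include <linux/userfaultfd.h>
  #include <sys/ioctl.h>

  /*
   * Hypothetical helper: write-protect a registered range via its uffd.
   * After this merge, 'start' may sit in a shmem or hugetlbfs mapping,
   * not just anonymous memory; writes then fault and are reported on uffd.
   */
  static int wp_range(int uffd, unsigned long start, unsigned long len)
  {
  	struct uffdio_writeprotect wp = {
  		.range = { .start = start, .len = len },
  		.mode  = UFFDIO_WRITEPROTECT_MODE_WP,
  	};

  	return ioctl(uffd, UFFDIO_WRITEPROTECT, &wp);
  }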
---

98931dd95fd489fcbfa97da563505a6f071d7c77
diff --cc Documentation/admin-guide/kernel-parameters.txt
index a9066cfb85a04,7e079b6994fe6..32073f873662f
--- a/Documentation/admin-guide/kernel-parameters.txt
+++ b/Documentation/admin-guide/kernel-parameters.txt
@@@ -1694,18 -1659,8 +1694,18 @@@
  			Documentation/admin-guide/mm/hugetlbpage.rst.
  			Format: size[KMG]
  
 +	hugetlb_cma=	[HW,CMA] The size of a CMA area used for allocation
 +			of gigantic hugepages. Or using node format, the size
 +			of a CMA area per node can be specified.
 +			Format: nn[KMGTPE] or (node format)
 +				<node>:nn[KMGTPE][,<node>:nn[KMGTPE]]
 +
 +			Reserve a CMA area of given size and allocate gigantic
 +			hugepages using the CMA allocator. If enabled, the
 +			boot-time allocation of gigantic hugepages is skipped.
 +
  	hugetlb_free_vmemmap=
- 			[KNL] Requires CONFIG_HUGETLB_PAGE_FREE_VMEMMAP
+ 			[KNL] Requires CONFIG_HUGETLB_PAGE_OPTIMIZE_VMEMMAP
  			enabled.
  			Allows heavy hugetlb users to free up some more
  			memory (7 * PAGE_SIZE for each 2MB hugetlb page).
diff --cc Documentation/filesystems/locking.rst
index 515bc48ab58b4,ef2efa1cb4ec5..d1bf77ef3bc16
--- a/Documentation/filesystems/locking.rst
+++ b/Documentation/filesystems/locking.rst
@@@ -258,19 -258,20 +258,20 @@@ prototypes
  	int (*launder_folio)(struct folio *);
  	bool (*is_partially_uptodate)(struct folio *, size_t from, size_t count);
  	int (*error_remove_page)(struct address_space *, struct page *);
- 	int (*swap_activate)(struct file *);
+ 	int (*swap_activate)(struct swap_info_struct *sis, struct file *f, sector_t *span)
  	int (*swap_deactivate)(struct file *);
+ 	int (*swap_rw)(struct kiocb *iocb, struct iov_iter *iter);
  
  locking rules:
- 	All except dirty_folio and freepage may block
+ 	All except dirty_folio and free_folio may block
  
  ====================== ======================== =========	===============
 -ops			PageLocked(page)	 i_rwsem	invalidate_lock
 +ops			folio locked		 i_rwsem	invalidate_lock
  ====================== ======================== =========	===============
  writepage:		yes, unlocks (see below)
 -readpage:		yes, unlocks				shared
 +read_folio:		yes, unlocks				shared
  writepages:
 -dirty_folio		maybe
 +dirty_folio:		maybe
  readahead:		yes, unlocks				shared
  write_begin:		locks the page		 exclusive
  write_end:		yes, unlocks		 exclusive
@@@ -287,15 -288,16 +288,16 @@@ is_partially_uptodate: ye
  error_remove_page:	yes
  swap_activate:		no
  swap_deactivate:	no
+ swap_rw:		yes, unlocks
  ====================== ======================== =========	===============
  
 -->write_begin(), ->write_end() and ->readpage() may be called from
 +->write_begin(), ->write_end() and ->read_folio() may be called from
  the request handler (/dev/loop).
  
 -->readpage() unlocks the page, either synchronously or via I/O
 +->read_folio() unlocks the folio, either synchronously or via I/O
  completion.
  
 -->readahead() unlocks the pages that I/O is attempted on like ->readpage().
 +->readahead() unlocks the folios that I/O is attempted on like ->read_folio().
  
  ->writepage() is used for two purposes: for "memory cleansing" and for
  "sync". These are quite different operations and the behaviour may differ
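For orientation, a hedged sketch of how a swap-capable filesystem wires up
the reworked swap entry points whose prototypes appear in the table above.
The foo_* functions are hypothetical placeholders (NFS, further down in this
diff, is the real in-tree user); only the member names and signatures come
from the prototypes themselves.

  static int foo_swap_activate(struct swap_info_struct *sis, struct file *f,
  			     sector_t *span);
  static void foo_swap_deactivate(struct file *f);
  static int foo_swap_rw(struct kiocb *iocb, struct iov_iter *iter);
  static int foo_read_folio(struct file *f, struct folio *folio);

  static const struct address_space_operations foo_aops = {
  	.read_folio	 = foo_read_folio,
  	.swap_activate	 = foo_swap_activate,	/* validate file, record extents */
  	.swap_deactivate = foo_swap_deactivate,
  	.swap_rw	 = foo_swap_rw,		/* swap I/O without ->writepage() */
  };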
diff --cc Documentation/filesystems/vfs.rst
index 12a011d2cbc6b,d4bd213d215b4..08069ecd49a6e
--- a/Documentation/filesystems/vfs.rst
+++ b/Documentation/filesystems/vfs.rst
@@@ -747,10 -747,11 +747,11 @@@ cache in your filesystem.  The followin
  	bool (*is_partially_uptodate) (struct folio *, size_t from, size_t count);
 -	void (*is_dirty_writeback) (struct page *, bool *, bool *);
 +	void (*is_dirty_writeback)(struct folio *, bool *, bool *);
  	int (*error_remove_page) (struct mapping *mapping, struct page *page);
- 	int (*swap_activate)(struct file *);
+ 	int (*swap_activate)(struct swap_info_struct *sis, struct file *f, sector_t *span)
  	int (*swap_deactivate)(struct file *);
+ 	int (*swap_rw)(struct kiocb *iocb, struct iov_iter *iter);
  };
  
  ``writepage``
diff --cc arch/riscv/Kconfig
index 52a4b0bdb5024,a2285067f2fcc..c0853f1474a70
--- a/arch/riscv/Kconfig
+++ b/arch/riscv/Kconfig
@@@ -38,8 -38,8 +38,9 @@@ config RISC
  	select ARCH_SUPPORTS_ATOMIC_RMW
  	select ARCH_SUPPORTS_DEBUG_PAGEALLOC if MMU
  	select ARCH_SUPPORTS_HUGETLBFS if MMU
+ 	select ARCH_SUPPORTS_PAGE_TABLE_CHECK
  	select ARCH_USE_MEMTEST
 +	select ARCH_USE_QUEUED_RWLOCKS
  	select ARCH_WANT_DEFAULT_TOPDOWN_MMAP_LAYOUT if MMU
  	select ARCH_WANT_FRAME_POINTERS
  	select ARCH_WANT_GENERAL_HUGETLB
diff --cc arch/x86/mm/Makefile
index d957dc15b3712,fb6b41a48ae55..f8220fd2c169a
--- a/arch/x86/mm/Makefile
+++ b/arch/x86/mm/Makefile
@@@ -20,7 -20,7 +20,7 @@@ CFLAGS_REMOVE_mem_encrypt_identity.o = 
  endif
  
  obj-y	:=	init.o init_$(BITS).o fault.o ioremap.o extable.o mmap.o \
- 		pgtable.o physaddr.o tlb.o cpu_entry_area.o maccess.o
 -		pgtable.o physaddr.o setup_nx.o tlb.o cpu_entry_area.o maccess.o pgprot.o
++		pgtable.o physaddr.o tlb.o cpu_entry_area.o maccess.o pgprot.o
  
  obj-y	+=	pat/
diff --cc fs/nfs/file.c
index d764b3ce79050,bfb4b707b07ea..6f5425e89ca63
--- a/fs/nfs/file.c
+++ b/fs/nfs/file.c
@@@ -522,8 -537,7 +534,7 @@@ const struct address_space_operations n
  	.write_begin = nfs_write_begin,
  	.write_end = nfs_write_end,
  	.invalidate_folio = nfs_invalidate_folio,
- 	.releasepage = nfs_release_page,
+ 	.release_folio = nfs_release_folio,
 -	.direct_IO = nfs_direct_IO,
  #ifdef CONFIG_MIGRATION
  	.migratepage = nfs_migrate_page,
  #endif
diff --cc include/linux/highmem-internal.h
index 337bd9f329218,a694ca95c4ed5..cddb42ff04730
--- a/include/linux/highmem-internal.h
+++ b/include/linux/highmem-internal.h
@@@ -239,16 -234,23 +239,28 @@@ static inline void __kunmap_atomic(voi
  static inline unsigned int nr_free_highpages(void) { return 0; }
  static inline unsigned long totalhigh_pages(void) { return 0UL; }
  
 +static inline bool is_kmap_addr(const void *x)
 +{
 +	return false;
 +}
 +
  #endif /* CONFIG_HIGHMEM */
  
- /*
-  * Prevent people trying to call kunmap_atomic() as if it were kunmap()
-  * kunmap_atomic() should get the return value of kmap_atomic, not the page.
+ /**
+  * kunmap_atomic - Unmap the virtual address mapped by kmap_atomic() - deprecated!
+  * @__addr:	Virtual address to be unmapped
+  *
+  * Unmaps an address previously mapped by kmap_atomic() and re-enables
+  * pagefaults. Depending on PREEMPT_RT configuration, re-enables also
+  * migration and preemption. Users should not count on these side effects.
+  *
+  * Mappings should be unmapped in the reverse order that they were mapped.
+  * See kmap_local_page() for details on nesting.
+  *
+  * @__addr can be any address within the mapped page, so there is no need
+  * to subtract any offset that has been added. In contrast to kunmap(),
+  * this function takes the address returned from kmap_atomic(), not the
+  * page passed to it. The compiler will warn you if you pass the page.
   */
  #define kunmap_atomic(__addr)					\
  do {								\
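The new kunmap_atomic() kernel-doc above is easiest to see with a small
sketch (illustrative only; copy_out() is a hypothetical helper, not from
this diff): the unmap call may be handed the offset address, with no need
to mask it back to the page start.

  #include <linux/highmem.h>
  #include <linux/string.h>

  /* Copy 'len' bytes out of a (possibly highmem) page. */
  static void copy_out(void *dst, struct page *page, size_t off, size_t len)
  {
  	char *vaddr = kmap_atomic(page);	/* disables pagefaults */

  	memcpy(dst, vaddr + off, len);
  	kunmap_atomic(vaddr + off);		/* any address in the page is fine */
  }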
diff --cc include/linux/slab.h
index 58bb9392775d5,3d2f2a3ca17e5..0fefdf528e0d2
--- a/include/linux/slab.h
+++ b/include/linux/slab.h
@@@ -216,10 -209,22 +216,22 @@@ void kmem_dump_obj(void *object)
  #define ARCH_SLAB_MINALIGN __alignof__(unsigned long long)
  #endif
  
+ /*
+  * Arches can define this function if they want to decide the minimum slab
+  * alignment at runtime. The value returned by the function must be a power
+  * of two and >= ARCH_SLAB_MINALIGN.
+  */
+ #ifndef arch_slab_minalign
+ static inline unsigned int arch_slab_minalign(void)
+ {
+ 	return ARCH_SLAB_MINALIGN;
+ }
+ #endif
+ 
  /*
-  * kmalloc and friends return ARCH_KMALLOC_MINALIGN aligned
-  * pointers. kmem_cache_alloc and friends return ARCH_SLAB_MINALIGN
-  * aligned pointers.
+  * kmem_cache_alloc and friends return pointers aligned to ARCH_SLAB_MINALIGN.
+  * kmalloc and friends return pointers aligned to both ARCH_KMALLOC_MINALIGN
+  * and ARCH_SLAB_MINALIGN, but here we only assume the former alignment.
   */
  #define __assume_kmalloc_alignment __assume_aligned(ARCH_KMALLOC_MINALIGN)
  #define __assume_slab_alignment __assume_aligned(ARCH_SLAB_MINALIGN)
diff --cc tools/testing/selftests/cgroup/cgroup_util.c
index 4297d580e3f82,b4d7027a44c3e..4c52cc6f2f9cc
--- a/tools/testing/selftests/cgroup/cgroup_util.c
+++ b/tools/testing/selftests/cgroup/cgroup_util.c
@@@ -180,28 -176,13 +176,25 @@@ long cg_read_lc(const char *cgroup, con
  int cg_write(const char *cgroup, const char *control, char *buf)
  {
  	char path[PATH_MAX];
- 	ssize_t len = strlen(buf);
+ 	ssize_t len = strlen(buf), ret;
  
  	snprintf(path, sizeof(path), "%s/%s", cgroup, control);
- 
- 	if (write_text(path, buf, len) == len)
- 		return 0;
- 
- 	return -1;
+ 	ret = write_text(path, buf, len);
+ 	return ret == len ? 0 : ret;
  }
  
 +int cg_write_numeric(const char *cgroup, const char *control, long value)
 +{
 +	char buf[64];
 +	int ret;
 +
 +	ret = sprintf(buf, "%lu", value);
 +	if (ret < 0)
 +		return ret;
 +
 +	return cg_write(cgroup, control, buf);
 +}
 +
  int cg_find_unified_root(char *root, size_t len)
  {
  	char buf[10 * PAGE_SIZE];
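Finally, a hedged usage sketch for the cg_write_numeric() helper added
above; the control file and the caller are illustrative, not taken from a
test in this series. Note the helper formats its signed long with "%lu",
so it is intended for non-negative values.

  /* Hypothetical selftest snippet: cap a cgroup's memory.max. */
  static int set_memory_max(const char *cgroup, long bytes)
  {
  	/* Writes the decimal value to <cgroup>/memory.max; 0 on success. */
  	return cg_write_numeric(cgroup, "memory.max", bytes);
  }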