mm/slub: directly load freelist from cpu partial slab in the likely case
authorChengming Zhou <zhouchengming@bytedance.com>
Tue, 23 Jan 2024 09:33:29 +0000 (09:33 +0000)
committerVlastimil Babka <vbabka@suse.cz>
Tue, 23 Jan 2024 10:43:40 +0000 (11:43 +0100)
The likely case is that we get a usable slab from the cpu partial list,
we can directly load freelist from it and return back, instead of going
the other way that need more work, like reenable interrupt and recheck.

But we need to remove the "VM_BUG_ON(!new.frozen)" in get_freelist()
for reusing it, since cpu partial slab is not frozen. It seems
acceptable since it's only for debug purpose.

And get_freelist() also assumes it can return NULL if the freelist is
empty, which is not possible for the cpu partial slab case, so we
add "VM_BUG_ON(!freelist)" after get_freelist() to make it explicit.

There is some small performance improvement too, which shows by:
perf bench sched messaging -g 5 -t -l 100000

            mm-stable   slub-optimize
Total time      7.473    7.209

Signed-off-by: Chengming Zhou <zhouchengming@bytedance.com>
Reviewed-by: Vlastimil Babka <vbabka@suse.cz>
Signed-off-by: Vlastimil Babka <vbabka@suse.cz>
mm/slub.c

index 2ef88bbf56a37f6e8266a8627a9aef2a71e5bf12..fda402b2d649e797404589fc4419820ee6b8a561 100644 (file)
--- a/mm/slub.c
+++ b/mm/slub.c
@@ -3326,7 +3326,6 @@ static inline void *get_freelist(struct kmem_cache *s, struct slab *slab)
                counters = slab->counters;
 
                new.counters = counters;
-               VM_BUG_ON(!new.frozen);
 
                new.inuse = slab->objects;
                new.frozen = freelist != NULL;
@@ -3498,18 +3497,20 @@ new_slab:
 
                slab = slub_percpu_partial(c);
                slub_set_percpu_partial(c, slab);
-               local_unlock_irqrestore(&s->cpu_slab->lock, flags);
-               stat(s, CPU_PARTIAL_ALLOC);
 
-               if (unlikely(!node_match(slab, node) ||
-                            !pfmemalloc_match(slab, gfpflags))) {
-                       slab->next = NULL;
-                       __put_partials(s, slab);
-                       continue;
+               if (likely(node_match(slab, node) &&
+                          pfmemalloc_match(slab, gfpflags))) {
+                       c->slab = slab;
+                       freelist = get_freelist(s, slab);
+                       VM_BUG_ON(!freelist);
+                       stat(s, CPU_PARTIAL_ALLOC);
+                       goto load_freelist;
                }
 
-               freelist = freeze_slab(s, slab);
-               goto retry_load_slab;
+               local_unlock_irqrestore(&s->cpu_slab->lock, flags);
+
+               slab->next = NULL;
+               __put_partials(s, slab);
        }
 #endif