sock: Doc behaviors of pressure heuristics
author Abel Wu <wuyun.abel@bytedance.com>
Thu, 19 Oct 2023 12:00:25 +0000 (20:00 +0800)
committer Paolo Abeni <pabeni@redhat.com>
Tue, 24 Oct 2023 08:38:30 +0000 (10:38 +0200)
There are now two accounting infrastructures for skmem, while the
heuristics in __sk_mem_raise_allocated() were actually introduced
before memcg was born.

Add some comments to clarify whether they can be applied to both
infrastructures or not.

Suggested-by: Shakeel Butt <shakeelb@google.com>
Signed-off-by: Abel Wu <wuyun.abel@bytedance.com>
Acked-by: Shakeel Butt <shakeelb@google.com>
Reviewed-by: Simon Horman <horms@kernel.org>
Link: https://lore.kernel.org/r/20231019120026.42215-2-wuyun.abel@bytedance.com
Signed-off-by: Paolo Abeni <pabeni@redhat.com>
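
As background, a minimal userspace sketch of the split between the two accounting layers follows. Every name in it (proto_counter, memcg_counter, try_charge) is a hypothetical stand-in rather than the kernel API; it only models the idea that a charge has to be admitted by both the global per-protocol counter and, when the socket belongs to one, its memcg.

#include <stdbool.h>
#include <stdio.h>

struct proto_counter {                  /* global, per-protocol accounting */
	long allocated;                 /* pages charged so far */
	long limits[3];                 /* min, pressure, max (cf. sysctl *_mem) */
};

struct memcg_counter {                  /* per-memcg accounting, may be absent */
	long usage;
	long max;
};

/* A charge only succeeds if both layers accept it. */
static bool try_charge(struct proto_counter *proto,
		       struct memcg_counter *memcg, long pages)
{
	if (memcg && memcg->usage + pages > memcg->max)
		return false;           /* memcg hard limit */
	if (proto->allocated + pages > proto->limits[2])
		return false;           /* global hard limit */
	if (memcg)
		memcg->usage += pages;
	proto->allocated += pages;
	return true;
}

int main(void)
{
	struct proto_counter tcp = { .allocated = 90, .limits = { 50, 80, 100 } };
	struct memcg_counter cg  = { .usage = 10, .max = 12 };

	printf("charge 5 pages: %s\n", try_charge(&tcp, &cg, 5) ? "ok" : "rejected");
	return 0;
}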
net/core/sock.c

index 43842520db865d6baf6f578c5145480707ad65a6..9f969e3c2ddfa3f8c2e98e6b2b3dbfc451408246 100644 (file)
@@ -3067,7 +3067,14 @@ int __sk_mem_raise_allocated(struct sock *sk, int size, int amt, int kind)
        if (allocated > sk_prot_mem_limits(sk, 2))
                goto suppress_allocation;
 
-       /* guarantee minimum buffer size under pressure */
+       /* Guarantee minimum buffer size under pressure (either global
+        * or memcg) to make sure features described in RFC 7323 (TCP
+        * Extensions for High Performance) work properly.
+        *
+        * This rule does NOT apply when the usage exceeds the global
+        * or memcg hard limit, or else a DoS attack could be mounted
+        * by spawning lots of sockets whose usage stays under the
+        * minimum buffer size.
+        */
        if (kind == SK_MEM_RECV) {
                if (atomic_read(&sk->sk_rmem_alloc) < sk_get_rmem0(sk, prot))
                        return 1;
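
As an illustration of the rule documented in the comment above, here is a hedged standalone sketch; fake_sock, min_rcvbuf and hard_limit are made-up stand-ins for struct sock, sk_get_rmem0() and sk_prot_mem_limits(sk, 2), not kernel code.

#include <stdbool.h>

struct fake_sock {
	long rmem_alloc;        /* bytes already queued for receive */
	long min_rcvbuf;        /* guaranteed floor, cf. tcp_rmem[0] */
};

bool may_raise_rcv(const struct fake_sock *sk, long allocated,
		   long hard_limit, bool under_pressure)
{
	if (allocated > hard_limit)
		return false;   /* beyond the hard limit: never, or the
				 * per-socket floor could be abused for DoS */
	if (!under_pressure)
		return true;    /* no pressure at all: always allowed */
	/* Under pressure, still honour the minimum receive buffer so that
	 * RFC 7323 features (window scaling, timestamps) keep working.
	 */
	return sk->rmem_alloc < sk->min_rcvbuf;
}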
@@ -3088,6 +3095,11 @@ int __sk_mem_raise_allocated(struct sock *sk, int size, int amt, int kind)
 
                if (!sk_under_memory_pressure(sk))
                        return 1;
+
+               /* Try to be fair among all the sockets under global
+               /* Try to be fair among all the sockets under global
+                * pressure by allowing the ones with below-average
+                * usage to raise.
+                */
                alloc = sk_sockets_allocated_read_positive(sk);
                if (sk_prot_mem_limits(sk, 2) > alloc *
                    sk_mem_pages(sk->sk_wmem_queued +
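
The fairness check above allows the raise only while this socket's page usage stays below the average share, i.e. sk_prot_mem_limits(sk, 2) divided by the number of allocated sockets. A hedged standalone sketch of just that comparison, with illustrative stand-ins for the kernel helpers:

#include <stdbool.h>

/* Illustrative stand-ins: limit_pages for sk_prot_mem_limits(sk, 2),
 * sockets for sk_sockets_allocated_read_positive(), usage_pages for
 * sk_mem_pages(wmem_queued + rmem_alloc + forward_alloc).
 */
bool below_average_share(long limit_pages, long sockets, long usage_pages)
{
	/* Allowed while this socket uses less than its fair share, i.e.
	 * usage_pages < limit_pages / sockets, written multiplication-first
	 * as in the kernel to avoid the division.
	 */
	return limit_pages > sockets * usage_pages;
}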