From: Peter Maydell Date: Mon, 14 Oct 2024 16:05:54 +0000 (+0100) Subject: docs/devel/rcu: Convert to rST format X-Git-Url: http://git.maquefel.me/?a=commitdiff_plain;h=90655d815a202104721e342009c1357101a04985;p=qemu.git docs/devel/rcu: Convert to rST format Convert docs/devel/rcu.txt to rST format. Signed-off-by: Peter Maydell Message-id: 20240816132212.3602106-6-peter.maydell@linaro.org --- diff --git a/MAINTAINERS b/MAINTAINERS index c622433983..bf62247872 100644 --- a/MAINTAINERS +++ b/MAINTAINERS @@ -3059,7 +3059,7 @@ Read, Copy, Update (RCU) M: Paolo Bonzini S: Maintained F: docs/devel/lockcnt.rst -F: docs/devel/rcu.txt +F: docs/devel/rcu.rst F: include/qemu/rcu*.h F: tests/unit/rcutorture.c F: tests/unit/test-rcu-*.c diff --git a/docs/devel/index-internals.rst b/docs/devel/index-internals.rst index 88fa0e9450..ab9fbc4482 100644 --- a/docs/devel/index-internals.rst +++ b/docs/devel/index-internals.rst @@ -8,6 +8,7 @@ Details about QEMU's various subsystems including how to add features to them. qom atomics + rcu block-coroutine-wrapper clocks ebpf_rss diff --git a/docs/devel/rcu.rst b/docs/devel/rcu.rst new file mode 100644 index 0000000000..dd07c1d919 --- /dev/null +++ b/docs/devel/rcu.rst @@ -0,0 +1,394 @@ +Using RCU (Read-Copy-Update) for synchronization +================================================ + +Read-copy update (RCU) is a synchronization mechanism that is used to +protect read-mostly data structures. RCU is very efficient and scalable +on the read side (it is wait-free), and thus can make the read paths +extremely fast. + +RCU supports concurrency between a single writer and multiple readers, +thus it is not used alone. Typically, the write-side will use a lock to +serialize multiple updates, but other approaches are possible (e.g., +restricting updates to a single task). In QEMU, when a lock is used, +this will often be the "iothread mutex", also known as the "big QEMU +lock" (BQL). Also, restricting updates to a single task is done in +QEMU using the "bottom half" API. + +RCU is fundamentally a "wait-to-finish" mechanism. The read side marks +sections of code with "critical sections", and the update side will wait +for the execution of all *currently running* critical sections before +proceeding, or before asynchronously executing a callback. + +The key point here is that only the currently running critical sections +are waited for; critical sections that are started **after** the beginning +of the wait do not extend the wait, despite running concurrently with +the updater. This is the reason why RCU is more scalable than, +for example, reader-writer locks. It is so much more scalable that +the system will have a single instance of the RCU mechanism; a single +mechanism can be used for an arbitrary number of "things", without +having to worry about things such as contention or deadlocks. + +How is this possible? The basic idea is to split updates in two phases, +"removal" and "reclamation". During removal, we ensure that subsequent +readers will not be able to get a reference to the old data. After +removal has completed, a critical section will not be able to access +the old data. Therefore, critical sections that begin after removal +do not matter; as soon as all previous critical sections have finished, +there cannot be any readers who hold references to the data structure, +and these can now be safely reclaimed (e.g., freed or unref'ed). + +Here is a picture:: + + thread 1 thread 2 thread 3 + ------------------- ------------------------ ------------------- + enter RCU crit.sec. + | finish removal phase + | begin wait + | | enter RCU crit.sec. + exit RCU crit.sec | | + complete wait | + begin reclamation phase | + exit RCU crit.sec. + + +Note how thread 3 is still executing its critical section when thread 2 +starts reclaiming data. This is possible, because the old version of the +data structure was not accessible at the time thread 3 began executing +that critical section. + + +RCU API +------- + +The core RCU API is small: + +``void rcu_read_lock(void);`` + Used by a reader to inform the reclaimer that the reader is + entering an RCU read-side critical section. + +``void rcu_read_unlock(void);`` + Used by a reader to inform the reclaimer that the reader is + exiting an RCU read-side critical section. Note that RCU + read-side critical sections may be nested and/or overlapping. + +``void synchronize_rcu(void);`` + Blocks until all pre-existing RCU read-side critical sections + on all threads have completed. This marks the end of the removal + phase and the beginning of reclamation phase. + + Note that it would be valid for another update to come while + ``synchronize_rcu`` is running. Because of this, it is better that + the updater releases any locks it may hold before calling + ``synchronize_rcu``. If this is not possible (for example, because + the updater is protected by the BQL), you can use ``call_rcu``. + +``void call_rcu1(struct rcu_head * head, void (*func)(struct rcu_head *head));`` + This function invokes ``func(head)`` after all pre-existing RCU + read-side critical sections on all threads have completed. This + marks the end of the removal phase, with func taking care + asynchronously of the reclamation phase. + + The ``foo`` struct needs to have an ``rcu_head`` structure added, + perhaps as follows:: + + struct foo { + struct rcu_head rcu; + int a; + char b; + long c; + }; + + so that the reclaimer function can fetch the ``struct foo`` address + and free it:: + + call_rcu1(&foo.rcu, foo_reclaim); + + void foo_reclaim(struct rcu_head *rp) + { + struct foo *fp = container_of(rp, struct foo, rcu); + g_free(fp); + } + + ``call_rcu1`` is typically used via either the ``call_rcu`` or + ``g_free_rcu`` macros, which handle the common case where the + ``rcu_head`` member is the first of the struct. + +``void call_rcu(T *p, void (*func)(T *p), field-name);`` + If the ``struct rcu_head`` is the first field in the struct, you can + use this macro instead of ``call_rcu1``. + +``void g_free_rcu(T *p, field-name);`` + This is a special-case version of ``call_rcu`` where the callback + function is ``g_free``. + In the example given in ``call_rcu1``, one could have written simply:: + + g_free_rcu(&foo, rcu); + +``typeof(*p) qatomic_rcu_read(p);`` + ``qatomic_rcu_read()`` is similar to ``qatomic_load_acquire()``, but + it makes some assumptions on the code that calls it. This allows a + more optimized implementation. + + ``qatomic_rcu_read`` assumes that whenever a single RCU critical + section reads multiple shared data, these reads are either + data-dependent or need no ordering. This is almost always the + case when using RCU, because read-side critical sections typically + navigate one or more pointers (the pointers that are changed on + every update) until reaching a data structure of interest, + and then read from there. + + RCU read-side critical sections must use ``qatomic_rcu_read()`` to + read data, unless concurrent writes are prevented by another + synchronization mechanism. + + Furthermore, RCU read-side critical sections should traverse the + data structure in a single direction, opposite to the direction + in which the updater initializes it. + +``void qatomic_rcu_set(p, typeof(*p) v);`` + ``qatomic_rcu_set()`` is similar to ``qatomic_store_release()``, + though it also makes assumptions on the code that calls it in + order to allow a more optimized implementation. + + In particular, ``qatomic_rcu_set()`` suffices for synchronization + with readers, if the updater never mutates a field within a + data item that is already accessible to readers. This is the + case when initializing a new copy of the RCU-protected data + structure; just ensure that initialization of ``*p`` is carried out + before ``qatomic_rcu_set()`` makes the data item visible to readers. + If this rule is observed, writes will happen in the opposite + order as reads in the RCU read-side critical sections (or if + there is just one update), and there will be no need for other + synchronization mechanism to coordinate the accesses. + +The following APIs must be used before RCU is used in a thread: + +``void rcu_register_thread(void);`` + Mark a thread as taking part in the RCU mechanism. Such a thread + will have to report quiescent points regularly, either manually + or through the ``QemuCond``/``QemuSemaphore``/``QemuEvent`` APIs. + +``void rcu_unregister_thread(void);`` + Mark a thread as not taking part anymore in the RCU mechanism. + It is not a problem if such a thread reports quiescent points, + either manually or by using the + ``QemuCond``/``QemuSemaphore``/``QemuEvent`` APIs. + +Note that these APIs are relatively heavyweight, and should **not** be +nested. + +Convenience macros +------------------ + +Two macros are provided that automatically release the read lock at the +end of the scope. + +``RCU_READ_LOCK_GUARD()`` + Takes the lock and will release it at the end of the block it's + used in. + +``WITH_RCU_READ_LOCK_GUARD() { code }`` + Is used at the head of a block to protect the code within the block. + +Note that a ``goto`` out of the guarded block will also drop the lock. + +Differences with Linux +---------------------- + +- Waiting on a mutex is possible, though discouraged, within an RCU critical + section. This is because spinlocks are rarely (if ever) used in userspace + programming; not allowing this would prevent upgrading an RCU read-side + critical section to become an updater. + +- ``qatomic_rcu_read`` and ``qatomic_rcu_set`` replace ``rcu_dereference`` and + ``rcu_assign_pointer``. They take a **pointer** to the variable being accessed. + +- ``call_rcu`` is a macro that has an extra argument (the name of the first + field in the struct, which must be a struct ``rcu_head``), and expects the + type of the callback's argument to be the type of the first argument. + ``call_rcu1`` is the same as Linux's ``call_rcu``. + + +RCU Patterns +------------ + +Many patterns using read-writer locks translate directly to RCU, with +the advantages of higher scalability and deadlock immunity. + +In general, RCU can be used whenever it is possible to create a new +"version" of a data structure every time the updater runs. This may +sound like a very strict restriction, however: + +- the updater does not mean "everything that writes to a data structure", + but rather "everything that involves a reclamation step". See the + array example below + +- in some cases, creating a new version of a data structure may actually + be very cheap. For example, modifying the "next" pointer of a singly + linked list is effectively creating a new version of the list. + +Here are some frequently-used RCU idioms that are worth noting. + + +RCU list processing +^^^^^^^^^^^^^^^^^^^ + +TBD (not yet used in QEMU) + + +RCU reference counting +^^^^^^^^^^^^^^^^^^^^^^ + +Because grace periods are not allowed to complete while there is an RCU +read-side critical section in progress, the RCU read-side primitives +may be used as a restricted reference-counting mechanism. For example, +consider the following code fragment:: + + rcu_read_lock(); + p = qatomic_rcu_read(&foo); + /* do something with p. */ + rcu_read_unlock(); + +The RCU read-side critical section ensures that the value of ``p`` remains +valid until after the ``rcu_read_unlock()``. In some sense, it is acquiring +a reference to ``p`` that is later released when the critical section ends. +The write side looks simply like this (with appropriate locking):: + + qemu_mutex_lock(&foo_mutex); + old = foo; + qatomic_rcu_set(&foo, new); + qemu_mutex_unlock(&foo_mutex); + synchronize_rcu(); + free(old); + +If the processing cannot be done purely within the critical section, it +is possible to combine this idiom with a "real" reference count:: + + rcu_read_lock(); + p = qatomic_rcu_read(&foo); + foo_ref(p); + rcu_read_unlock(); + /* do something with p. */ + foo_unref(p); + +The write side can be like this:: + + qemu_mutex_lock(&foo_mutex); + old = foo; + qatomic_rcu_set(&foo, new); + qemu_mutex_unlock(&foo_mutex); + synchronize_rcu(); + foo_unref(old); + +or with ``call_rcu``:: + + qemu_mutex_lock(&foo_mutex); + old = foo; + qatomic_rcu_set(&foo, new); + qemu_mutex_unlock(&foo_mutex); + call_rcu(foo_unref, old, rcu); + +In both cases, the write side only performs removal. Reclamation +happens when the last reference to a ``foo`` object is dropped. +Using ``synchronize_rcu()`` is undesirably expensive, because the +last reference may be dropped on the read side. Hence you can +use ``call_rcu()`` instead:: + + foo_unref(struct foo *p) { + if (qatomic_fetch_dec(&p->refcount) == 1) { + call_rcu(foo_destroy, p, rcu); + } + } + + +Note that the same idioms would be possible with reader/writer +locks:: + + read_lock(&foo_rwlock); write_mutex_lock(&foo_rwlock); + p = foo; p = foo; + /* do something with p. */ foo = new; + read_unlock(&foo_rwlock); free(p); + write_mutex_unlock(&foo_rwlock); + free(p); + + ------------------------------------------------------------------ + + read_lock(&foo_rwlock); write_mutex_lock(&foo_rwlock); + p = foo; old = foo; + foo_ref(p); foo = new; + read_unlock(&foo_rwlock); foo_unref(old); + /* do something with p. */ write_mutex_unlock(&foo_rwlock); + read_lock(&foo_rwlock); + foo_unref(p); + read_unlock(&foo_rwlock); + +``foo_unref`` could use a mechanism such as bottom halves to move deallocation +out of the write-side critical section. + + +RCU resizable arrays +^^^^^^^^^^^^^^^^^^^^ + +Resizable arrays can be used with RCU. The expensive RCU synchronization +(or ``call_rcu``) only needs to take place when the array is resized. +The two items to take care of are: + +- ensuring that the old version of the array is available between removal + and reclamation; + +- avoiding mismatches in the read side between the array data and the + array size. + +The first problem is avoided simply by not using ``realloc``. Instead, +each resize will allocate a new array and copy the old data into it. +The second problem would arise if the size and the data pointers were +two members of a larger struct:: + + struct mystuff { + ... + int data_size; + int data_alloc; + T *data; + ... + }; + +Instead, we store the size of the array with the array itself:: + + struct arr { + int size; + int alloc; + T data[]; + }; + struct arr *global_array; + + read side: + rcu_read_lock(); + struct arr *array = qatomic_rcu_read(&global_array); + x = i < array->size ? array->data[i] : -1; + rcu_read_unlock(); + return x; + + write side (running under a lock): + if (global_array->size == global_array->alloc) { + /* Creating a new version. */ + new_array = g_malloc(sizeof(struct arr) + + global_array->alloc * 2 * sizeof(T)); + new_array->size = global_array->size; + new_array->alloc = global_array->alloc * 2; + memcpy(new_array->data, global_array->data, + global_array->alloc * sizeof(T)); + + /* Removal phase. */ + old_array = global_array; + qatomic_rcu_set(&global_array, new_array); + synchronize_rcu(); + + /* Reclamation phase. */ + free(old_array); + } + + +References +---------- + +* The `Linux kernel RCU documentation `__ diff --git a/docs/devel/rcu.txt b/docs/devel/rcu.txt deleted file mode 100644 index 2e6cc607a1..0000000000 --- a/docs/devel/rcu.txt +++ /dev/null @@ -1,406 +0,0 @@ -Using RCU (Read-Copy-Update) for synchronization -================================================ - -Read-copy update (RCU) is a synchronization mechanism that is used to -protect read-mostly data structures. RCU is very efficient and scalable -on the read side (it is wait-free), and thus can make the read paths -extremely fast. - -RCU supports concurrency between a single writer and multiple readers, -thus it is not used alone. Typically, the write-side will use a lock to -serialize multiple updates, but other approaches are possible (e.g., -restricting updates to a single task). In QEMU, when a lock is used, -this will often be the "iothread mutex", also known as the "big QEMU -lock" (BQL). Also, restricting updates to a single task is done in -QEMU using the "bottom half" API. - -RCU is fundamentally a "wait-to-finish" mechanism. The read side marks -sections of code with "critical sections", and the update side will wait -for the execution of all *currently running* critical sections before -proceeding, or before asynchronously executing a callback. - -The key point here is that only the currently running critical sections -are waited for; critical sections that are started _after_ the beginning -of the wait do not extend the wait, despite running concurrently with -the updater. This is the reason why RCU is more scalable than, -for example, reader-writer locks. It is so much more scalable that -the system will have a single instance of the RCU mechanism; a single -mechanism can be used for an arbitrary number of "things", without -having to worry about things such as contention or deadlocks. - -How is this possible? The basic idea is to split updates in two phases, -"removal" and "reclamation". During removal, we ensure that subsequent -readers will not be able to get a reference to the old data. After -removal has completed, a critical section will not be able to access -the old data. Therefore, critical sections that begin after removal -do not matter; as soon as all previous critical sections have finished, -there cannot be any readers who hold references to the data structure, -and these can now be safely reclaimed (e.g., freed or unref'ed). - -Here is a picture: - - thread 1 thread 2 thread 3 - ------------------- ------------------------ ------------------- - enter RCU crit.sec. - | finish removal phase - | begin wait - | | enter RCU crit.sec. - exit RCU crit.sec | | - complete wait | - begin reclamation phase | - exit RCU crit.sec. - - -Note how thread 3 is still executing its critical section when thread 2 -starts reclaiming data. This is possible, because the old version of the -data structure was not accessible at the time thread 3 began executing -that critical section. - - -RCU API -======= - -The core RCU API is small: - - void rcu_read_lock(void); - - Used by a reader to inform the reclaimer that the reader is - entering an RCU read-side critical section. - - void rcu_read_unlock(void); - - Used by a reader to inform the reclaimer that the reader is - exiting an RCU read-side critical section. Note that RCU - read-side critical sections may be nested and/or overlapping. - - void synchronize_rcu(void); - - Blocks until all pre-existing RCU read-side critical sections - on all threads have completed. This marks the end of the removal - phase and the beginning of reclamation phase. - - Note that it would be valid for another update to come while - synchronize_rcu is running. Because of this, it is better that - the updater releases any locks it may hold before calling - synchronize_rcu. If this is not possible (for example, because - the updater is protected by the BQL), you can use call_rcu. - - void call_rcu1(struct rcu_head * head, - void (*func)(struct rcu_head *head)); - - This function invokes func(head) after all pre-existing RCU - read-side critical sections on all threads have completed. This - marks the end of the removal phase, with func taking care - asynchronously of the reclamation phase. - - The foo struct needs to have an rcu_head structure added, - perhaps as follows: - - struct foo { - struct rcu_head rcu; - int a; - char b; - long c; - }; - - so that the reclaimer function can fetch the struct foo address - and free it: - - call_rcu1(&foo.rcu, foo_reclaim); - - void foo_reclaim(struct rcu_head *rp) - { - struct foo *fp = container_of(rp, struct foo, rcu); - g_free(fp); - } - - For the common case where the rcu_head member is the first of the - struct, you can use the following macro. - - void call_rcu(T *p, - void (*func)(T *p), - field-name); - void g_free_rcu(T *p, - field-name); - - call_rcu1 is typically used through these macro, in the common case - where the "struct rcu_head" is the first field in the struct. If - the callback function is g_free, in particular, g_free_rcu can be - used. In the above case, one could have written simply: - - g_free_rcu(&foo, rcu); - - typeof(*p) qatomic_rcu_read(p); - - qatomic_rcu_read() is similar to qatomic_load_acquire(), but it makes - some assumptions on the code that calls it. This allows a more - optimized implementation. - - qatomic_rcu_read assumes that whenever a single RCU critical - section reads multiple shared data, these reads are either - data-dependent or need no ordering. This is almost always the - case when using RCU, because read-side critical sections typically - navigate one or more pointers (the pointers that are changed on - every update) until reaching a data structure of interest, - and then read from there. - - RCU read-side critical sections must use qatomic_rcu_read() to - read data, unless concurrent writes are prevented by another - synchronization mechanism. - - Furthermore, RCU read-side critical sections should traverse the - data structure in a single direction, opposite to the direction - in which the updater initializes it. - - void qatomic_rcu_set(p, typeof(*p) v); - - qatomic_rcu_set() is similar to qatomic_store_release(), though it also - makes assumptions on the code that calls it in order to allow a more - optimized implementation. - - In particular, qatomic_rcu_set() suffices for synchronization - with readers, if the updater never mutates a field within a - data item that is already accessible to readers. This is the - case when initializing a new copy of the RCU-protected data - structure; just ensure that initialization of *p is carried out - before qatomic_rcu_set() makes the data item visible to readers. - If this rule is observed, writes will happen in the opposite - order as reads in the RCU read-side critical sections (or if - there is just one update), and there will be no need for other - synchronization mechanism to coordinate the accesses. - -The following APIs must be used before RCU is used in a thread: - - void rcu_register_thread(void); - - Mark a thread as taking part in the RCU mechanism. Such a thread - will have to report quiescent points regularly, either manually - or through the QemuCond/QemuSemaphore/QemuEvent APIs. - - void rcu_unregister_thread(void); - - Mark a thread as not taking part anymore in the RCU mechanism. - It is not a problem if such a thread reports quiescent points, - either manually or by using the QemuCond/QemuSemaphore/QemuEvent - APIs. - -Note that these APIs are relatively heavyweight, and should _not_ be -nested. - -Convenience macros -================== - -Two macros are provided that automatically release the read lock at the -end of the scope. - - RCU_READ_LOCK_GUARD() - - Takes the lock and will release it at the end of the block it's - used in. - - WITH_RCU_READ_LOCK_GUARD() { code } - - Is used at the head of a block to protect the code within the block. - -Note that 'goto'ing out of the guarded block will also drop the lock. - -DIFFERENCES WITH LINUX -====================== - -- Waiting on a mutex is possible, though discouraged, within an RCU critical - section. This is because spinlocks are rarely (if ever) used in userspace - programming; not allowing this would prevent upgrading an RCU read-side - critical section to become an updater. - -- qatomic_rcu_read and qatomic_rcu_set replace rcu_dereference and - rcu_assign_pointer. They take a _pointer_ to the variable being accessed. - -- call_rcu is a macro that has an extra argument (the name of the first - field in the struct, which must be a struct rcu_head), and expects the - type of the callback's argument to be the type of the first argument. - call_rcu1 is the same as Linux's call_rcu. - - -RCU PATTERNS -============ - -Many patterns using read-writer locks translate directly to RCU, with -the advantages of higher scalability and deadlock immunity. - -In general, RCU can be used whenever it is possible to create a new -"version" of a data structure every time the updater runs. This may -sound like a very strict restriction, however: - -- the updater does not mean "everything that writes to a data structure", - but rather "everything that involves a reclamation step". See the - array example below - -- in some cases, creating a new version of a data structure may actually - be very cheap. For example, modifying the "next" pointer of a singly - linked list is effectively creating a new version of the list. - -Here are some frequently-used RCU idioms that are worth noting. - - -RCU list processing -------------------- - -TBD (not yet used in QEMU) - - -RCU reference counting ----------------------- - -Because grace periods are not allowed to complete while there is an RCU -read-side critical section in progress, the RCU read-side primitives -may be used as a restricted reference-counting mechanism. For example, -consider the following code fragment: - - rcu_read_lock(); - p = qatomic_rcu_read(&foo); - /* do something with p. */ - rcu_read_unlock(); - -The RCU read-side critical section ensures that the value of "p" remains -valid until after the rcu_read_unlock(). In some sense, it is acquiring -a reference to p that is later released when the critical section ends. -The write side looks simply like this (with appropriate locking): - - qemu_mutex_lock(&foo_mutex); - old = foo; - qatomic_rcu_set(&foo, new); - qemu_mutex_unlock(&foo_mutex); - synchronize_rcu(); - free(old); - -If the processing cannot be done purely within the critical section, it -is possible to combine this idiom with a "real" reference count: - - rcu_read_lock(); - p = qatomic_rcu_read(&foo); - foo_ref(p); - rcu_read_unlock(); - /* do something with p. */ - foo_unref(p); - -The write side can be like this: - - qemu_mutex_lock(&foo_mutex); - old = foo; - qatomic_rcu_set(&foo, new); - qemu_mutex_unlock(&foo_mutex); - synchronize_rcu(); - foo_unref(old); - -or with call_rcu: - - qemu_mutex_lock(&foo_mutex); - old = foo; - qatomic_rcu_set(&foo, new); - qemu_mutex_unlock(&foo_mutex); - call_rcu(foo_unref, old, rcu); - -In both cases, the write side only performs removal. Reclamation -happens when the last reference to a "foo" object is dropped. -Using synchronize_rcu() is undesirably expensive, because the -last reference may be dropped on the read side. Hence you can -use call_rcu() instead: - - foo_unref(struct foo *p) { - if (qatomic_fetch_dec(&p->refcount) == 1) { - call_rcu(foo_destroy, p, rcu); - } - } - - -Note that the same idioms would be possible with reader/writer -locks: - - read_lock(&foo_rwlock); write_mutex_lock(&foo_rwlock); - p = foo; p = foo; - /* do something with p. */ foo = new; - read_unlock(&foo_rwlock); free(p); - write_mutex_unlock(&foo_rwlock); - free(p); - - ------------------------------------------------------------------ - - read_lock(&foo_rwlock); write_mutex_lock(&foo_rwlock); - p = foo; old = foo; - foo_ref(p); foo = new; - read_unlock(&foo_rwlock); foo_unref(old); - /* do something with p. */ write_mutex_unlock(&foo_rwlock); - read_lock(&foo_rwlock); - foo_unref(p); - read_unlock(&foo_rwlock); - -foo_unref could use a mechanism such as bottom halves to move deallocation -out of the write-side critical section. - - -RCU resizable arrays --------------------- - -Resizable arrays can be used with RCU. The expensive RCU synchronization -(or call_rcu) only needs to take place when the array is resized. -The two items to take care of are: - -- ensuring that the old version of the array is available between removal - and reclamation; - -- avoiding mismatches in the read side between the array data and the - array size. - -The first problem is avoided simply by not using realloc. Instead, -each resize will allocate a new array and copy the old data into it. -The second problem would arise if the size and the data pointers were -two members of a larger struct: - - struct mystuff { - ... - int data_size; - int data_alloc; - T *data; - ... - }; - -Instead, we store the size of the array with the array itself: - - struct arr { - int size; - int alloc; - T data[]; - }; - struct arr *global_array; - - read side: - rcu_read_lock(); - struct arr *array = qatomic_rcu_read(&global_array); - x = i < array->size ? array->data[i] : -1; - rcu_read_unlock(); - return x; - - write side (running under a lock): - if (global_array->size == global_array->alloc) { - /* Creating a new version. */ - new_array = g_malloc(sizeof(struct arr) + - global_array->alloc * 2 * sizeof(T)); - new_array->size = global_array->size; - new_array->alloc = global_array->alloc * 2; - memcpy(new_array->data, global_array->data, - global_array->alloc * sizeof(T)); - - /* Removal phase. */ - old_array = global_array; - qatomic_rcu_set(&global_array, new_array); - synchronize_rcu(); - - /* Reclamation phase. */ - free(old_array); - } - - -SOURCES -======= - -* Documentation/RCU/ from the Linux kernel