From: Jonathan Neuschäfer Date: Sun, 8 Mar 2020 21:14:43 +0000 (+0100) Subject: docs: Move Intel Many Integrated Core documentation (mic) under misc-devices X-Git-Url: http://git.maquefel.me/?a=commitdiff_plain;h=19796c348ab62287fc6434f296ae279d7b97e39f;p=linux.git docs: Move Intel Many Integrated Core documentation (mic) under misc-devices It doesn't need to be a top-level chapter. This patch also updates MAINTAINERS and makes sure the F: lines are properly sorted. Signed-off-by: Jonathan Neuschäfer Reviewed-by: Andy Shevchenko Link: https://lore.kernel.org/r/20200308211519.8414-1-j.neuschaefer@gmx.net Signed-off-by: Jonathan Corbet --- diff --git a/Documentation/index.rst b/Documentation/index.rst index e99d0bd2589d8..6fdad61ee443c 100644 --- a/Documentation/index.rst +++ b/Documentation/index.rst @@ -131,7 +131,6 @@ needed). usb/index PCI/index misc-devices/index - mic/index scheduler/index Architecture-agnostic documentation diff --git a/Documentation/mic/index.rst b/Documentation/mic/index.rst deleted file mode 100644 index 3a8d06367ef12..0000000000000 --- a/Documentation/mic/index.rst +++ /dev/null @@ -1,16 +0,0 @@ -============================================= -Intel Many Integrated Core (MIC) architecture -============================================= - -.. toctree:: - :maxdepth: 1 - - mic_overview - scif_overview - -.. only:: subproject and html - - Indices - ======= - - * :ref:`genindex` diff --git a/Documentation/mic/mic_overview.rst b/Documentation/mic/mic_overview.rst deleted file mode 100644 index 17d956bdaf7c0..0000000000000 --- a/Documentation/mic/mic_overview.rst +++ /dev/null @@ -1,85 +0,0 @@ -====================================================== -Intel Many Integrated Core (MIC) architecture overview -====================================================== - -An Intel MIC X100 device is a PCIe form factor add-in coprocessor -card based on the Intel Many Integrated Core (MIC) architecture -that runs a Linux OS. It is a PCIe endpoint in a platform and therefore -implements the three required standard address spaces i.e. configuration, -memory and I/O. The host OS loads a device driver as is typical for -PCIe devices. The card itself runs a bootstrap after reset that -transfers control to the card OS downloaded from the host driver. The -host driver supports OSPM suspend and resume operations. It shuts down -the card during suspend and reboots the card OS during resume. -The card OS as shipped by Intel is a Linux kernel with modifications -for the X100 devices. - -Since it is a PCIe card, it does not have the ability to host hardware -devices for networking, storage and console. We provide these devices -on X100 coprocessors thus enabling a self-bootable equivalent -environment for applications. A key benefit of our solution is that it -leverages the standard virtio framework for network, disk and console -devices, though in our case the virtio framework is used across a PCIe -bus. A Virtio Over PCIe (VOP) driver allows creating user space -backends or devices on the host which are used to probe virtio drivers -for these devices on the MIC card. The existing VRINGH infrastructure -in the kernel is used to access virtio rings from the host. The card -VOP driver allows card virtio drivers to communicate with their user -space backends on the host via a device page. Ring 3 apps on the host -can add, remove and configure virtio devices. A thin MIC specific -virtio_config_ops is implemented which is borrowed heavily from -previous similar implementations in lguest and s390. - -MIC PCIe card has a dma controller with 8 channels. These channels are -shared between the host s/w and the card s/w. 0 to 3 are used by host -and 4 to 7 by card. As the dma device doesn't show up as PCIe device, -a virtual bus called mic bus is created and virtual dma devices are -created on it by the host/card drivers. On host the channels are private -and used only by the host driver to transfer data for the virtio devices. - -The Symmetric Communication Interface (SCIF (pronounced as skiff)) is a -low level communications API across PCIe currently implemented for MIC. -More details are available at scif_overview.txt. - -The Coprocessor State Management (COSM) driver on the host allows for -boot, shutdown and reset of Intel MIC devices. It communicates with a COSM -"client" driver on the MIC cards over SCIF to perform these functions. - -Here is a block diagram of the various components described above. The -virtio backends are situated on the host rather than the card given better -single threaded performance for the host compared to MIC, the ability of -the host to initiate DMA's to/from the card using the MIC DMA engine and -the fact that the virtio block storage backend can only be on the host:: - - +----------+ | +----------+ - | Card OS | | | Host OS | - +----------+ | +----------+ - | - +-------+ +--------+ +------+ | +---------+ +--------+ +--------+ - | Virtio| |Virtio | |Virtio| | |Virtio | |Virtio | |Virtio | - | Net | |Console | |Block | | |Net | |Console | |Block | - | Driver| |Driver | |Driver| | |backend | |backend | |backend | - +---+---+ +---+----+ +--+---+ | +---------+ +----+---+ +--------+ - | | | | | | | - | | | |User | | | - | | | |------|------------|--+------|------- - +---------+---------+ |Kernel | - | | | - +---------+ +---+----+ +------+ | +------+ +------+ +--+---+ +-------+ - |MIC DMA | | VOP | | SCIF | | | SCIF | | COSM | | VOP | |MIC DMA| - +---+-----+ +---+----+ +--+---+ | +--+---+ +--+---+ +------+ +----+--+ - | | | | | | | - +---+-----+ +---+----+ +--+---+ | +--+---+ +--+---+ +------+ +----+--+ - |MIC | | VOP | |SCIF | | |SCIF | | COSM | | VOP | | MIC | - |HW Bus | | HW Bus| |HW Bus| | |HW Bus| | Bus | |HW Bus| |HW Bus | - +---------+ +--------+ +--+---+ | +--+---+ +------+ +------+ +-------+ - | | | | | | | - | +-----------+--+ | | | +---------------+ | - | |Intel MIC | | | | |Intel MIC | | - | |Card Driver | | | | |Host Driver | | - +---+--------------+------+ | +----+---------------+-----+ - | | | - +-------------------------------------------------------------+ - | | - | PCIe Bus | - +-------------------------------------------------------------+ diff --git a/Documentation/mic/scif_overview.rst b/Documentation/mic/scif_overview.rst deleted file mode 100644 index 4c8ad9e437064..0000000000000 --- a/Documentation/mic/scif_overview.rst +++ /dev/null @@ -1,108 +0,0 @@ -======================================== -Symmetric Communication Interface (SCIF) -======================================== - -The Symmetric Communication Interface (SCIF (pronounced as skiff)) is a low -level communications API across PCIe currently implemented for MIC. Currently -SCIF provides inter-node communication within a single host platform, where a -node is a MIC Coprocessor or Xeon based host. SCIF abstracts the details of -communicating over the PCIe bus while providing an API that is symmetric -across all the nodes in the PCIe network. An important design objective for SCIF -is to deliver the maximum possible performance given the communication -abilities of the hardware. SCIF has been used to implement an offload compiler -runtime and OFED support for MPI implementations for MIC coprocessors. - -SCIF API Components -=================== - -The SCIF API has the following parts: - -1. Connection establishment using a client server model -2. Byte stream messaging intended for short messages -3. Node enumeration to determine online nodes -4. Poll semantics for detection of incoming connections and messages -5. Memory registration to pin down pages -6. Remote memory mapping for low latency CPU accesses via mmap -7. Remote DMA (RDMA) for high bandwidth DMA transfers -8. Fence APIs for RDMA synchronization - -SCIF exposes the notion of a connection which can be used by peer processes on -nodes in a SCIF PCIe "network" to share memory "windows" and to communicate. A -process in a SCIF node initiates a SCIF connection to a peer process on a -different node via a SCIF "endpoint". SCIF endpoints support messaging APIs -which are similar to connection oriented socket APIs. Connected SCIF endpoints -can also register local memory which is followed by data transfer using either -DMA, CPU copies or remote memory mapping via mmap. SCIF supports both user and -kernel mode clients which are functionally equivalent. - -SCIF Performance for MIC -======================== - -DMA bandwidth comparison between the TCP (over ethernet over PCIe) stack versus -SCIF shows the performance advantages of SCIF for HPC applications and -runtimes:: - - Comparison of TCP and SCIF based BW - - Throughput (GB/sec) - 8 + PCIe Bandwidth ****** - + TCP ###### - 7 + ************************************** SCIF %%%%%% - | %%%%%%%%%%%%%%%%%%% - 6 + %%%% - | %% - | %%% - 5 + %% - | %% - 4 + %% - | %% - 3 + %% - | % - 2 + %% - | %% - | % - 1 + - + ###################################### - 0 +++---+++--+--+-+--+--+-++-+--+-++-+--+-++-+- - 1 10 100 1000 10000 100000 - Transfer Size (KBytes) - -SCIF allows memory sharing via mmap(..) between processes on different PCIe -nodes and thus provides bare-metal PCIe latency. The round trip SCIF mmap -latency from the host to an x100 MIC for an 8 byte message is 0.44 usecs. - -SCIF has a user space library which is a thin IOCTL wrapper providing a user -space API similar to the kernel API in scif.h. The SCIF user space library -is distributed @ https://software.intel.com/en-us/mic-developer - -Here is some pseudo code for an example of how two applications on two PCIe -nodes would typically use the SCIF API:: - - Process A (on node A) Process B (on node B) - - /* get online node information */ - scif_get_node_ids(..) scif_get_node_ids(..) - scif_open(..) scif_open(..) - scif_bind(..) scif_bind(..) - scif_listen(..) - scif_accept(..) scif_connect(..) - /* SCIF connection established */ - - /* Send and receive short messages */ - scif_send(..)/scif_recv(..) scif_send(..)/scif_recv(..) - - /* Register memory */ - scif_register(..) scif_register(..) - - /* RDMA */ - scif_readfrom(..)/scif_writeto(..) scif_readfrom(..)/scif_writeto(..) - - /* Fence DMAs */ - scif_fence_signal(..) scif_fence_signal(..) - - mmap(..) mmap(..) - - /* Access remote registered memory */ - - /* Close the endpoints */ - scif_close(..) scif_close(..) diff --git a/Documentation/misc-devices/index.rst b/Documentation/misc-devices/index.rst index f11c5daeada54..c1dcd26289118 100644 --- a/Documentation/misc-devices/index.rst +++ b/Documentation/misc-devices/index.rst @@ -20,4 +20,5 @@ fit into other categories. isl29003 lis3lv02d max6875 + mic/index xilinx_sdfec diff --git a/Documentation/misc-devices/mic/index.rst b/Documentation/misc-devices/mic/index.rst new file mode 100644 index 0000000000000..3a8d06367ef12 --- /dev/null +++ b/Documentation/misc-devices/mic/index.rst @@ -0,0 +1,16 @@ +============================================= +Intel Many Integrated Core (MIC) architecture +============================================= + +.. toctree:: + :maxdepth: 1 + + mic_overview + scif_overview + +.. only:: subproject and html + + Indices + ======= + + * :ref:`genindex` diff --git a/Documentation/misc-devices/mic/mic_overview.rst b/Documentation/misc-devices/mic/mic_overview.rst new file mode 100644 index 0000000000000..17d956bdaf7c0 --- /dev/null +++ b/Documentation/misc-devices/mic/mic_overview.rst @@ -0,0 +1,85 @@ +====================================================== +Intel Many Integrated Core (MIC) architecture overview +====================================================== + +An Intel MIC X100 device is a PCIe form factor add-in coprocessor +card based on the Intel Many Integrated Core (MIC) architecture +that runs a Linux OS. It is a PCIe endpoint in a platform and therefore +implements the three required standard address spaces i.e. configuration, +memory and I/O. The host OS loads a device driver as is typical for +PCIe devices. The card itself runs a bootstrap after reset that +transfers control to the card OS downloaded from the host driver. The +host driver supports OSPM suspend and resume operations. It shuts down +the card during suspend and reboots the card OS during resume. +The card OS as shipped by Intel is a Linux kernel with modifications +for the X100 devices. + +Since it is a PCIe card, it does not have the ability to host hardware +devices for networking, storage and console. We provide these devices +on X100 coprocessors thus enabling a self-bootable equivalent +environment for applications. A key benefit of our solution is that it +leverages the standard virtio framework for network, disk and console +devices, though in our case the virtio framework is used across a PCIe +bus. A Virtio Over PCIe (VOP) driver allows creating user space +backends or devices on the host which are used to probe virtio drivers +for these devices on the MIC card. The existing VRINGH infrastructure +in the kernel is used to access virtio rings from the host. The card +VOP driver allows card virtio drivers to communicate with their user +space backends on the host via a device page. Ring 3 apps on the host +can add, remove and configure virtio devices. A thin MIC specific +virtio_config_ops is implemented which is borrowed heavily from +previous similar implementations in lguest and s390. + +MIC PCIe card has a dma controller with 8 channels. These channels are +shared between the host s/w and the card s/w. 0 to 3 are used by host +and 4 to 7 by card. As the dma device doesn't show up as PCIe device, +a virtual bus called mic bus is created and virtual dma devices are +created on it by the host/card drivers. On host the channels are private +and used only by the host driver to transfer data for the virtio devices. + +The Symmetric Communication Interface (SCIF (pronounced as skiff)) is a +low level communications API across PCIe currently implemented for MIC. +More details are available at scif_overview.txt. + +The Coprocessor State Management (COSM) driver on the host allows for +boot, shutdown and reset of Intel MIC devices. It communicates with a COSM +"client" driver on the MIC cards over SCIF to perform these functions. + +Here is a block diagram of the various components described above. The +virtio backends are situated on the host rather than the card given better +single threaded performance for the host compared to MIC, the ability of +the host to initiate DMA's to/from the card using the MIC DMA engine and +the fact that the virtio block storage backend can only be on the host:: + + +----------+ | +----------+ + | Card OS | | | Host OS | + +----------+ | +----------+ + | + +-------+ +--------+ +------+ | +---------+ +--------+ +--------+ + | Virtio| |Virtio | |Virtio| | |Virtio | |Virtio | |Virtio | + | Net | |Console | |Block | | |Net | |Console | |Block | + | Driver| |Driver | |Driver| | |backend | |backend | |backend | + +---+---+ +---+----+ +--+---+ | +---------+ +----+---+ +--------+ + | | | | | | | + | | | |User | | | + | | | |------|------------|--+------|------- + +---------+---------+ |Kernel | + | | | + +---------+ +---+----+ +------+ | +------+ +------+ +--+---+ +-------+ + |MIC DMA | | VOP | | SCIF | | | SCIF | | COSM | | VOP | |MIC DMA| + +---+-----+ +---+----+ +--+---+ | +--+---+ +--+---+ +------+ +----+--+ + | | | | | | | + +---+-----+ +---+----+ +--+---+ | +--+---+ +--+---+ +------+ +----+--+ + |MIC | | VOP | |SCIF | | |SCIF | | COSM | | VOP | | MIC | + |HW Bus | | HW Bus| |HW Bus| | |HW Bus| | Bus | |HW Bus| |HW Bus | + +---------+ +--------+ +--+---+ | +--+---+ +------+ +------+ +-------+ + | | | | | | | + | +-----------+--+ | | | +---------------+ | + | |Intel MIC | | | | |Intel MIC | | + | |Card Driver | | | | |Host Driver | | + +---+--------------+------+ | +----+---------------+-----+ + | | | + +-------------------------------------------------------------+ + | | + | PCIe Bus | + +-------------------------------------------------------------+ diff --git a/Documentation/misc-devices/mic/scif_overview.rst b/Documentation/misc-devices/mic/scif_overview.rst new file mode 100644 index 0000000000000..4c8ad9e437064 --- /dev/null +++ b/Documentation/misc-devices/mic/scif_overview.rst @@ -0,0 +1,108 @@ +======================================== +Symmetric Communication Interface (SCIF) +======================================== + +The Symmetric Communication Interface (SCIF (pronounced as skiff)) is a low +level communications API across PCIe currently implemented for MIC. Currently +SCIF provides inter-node communication within a single host platform, where a +node is a MIC Coprocessor or Xeon based host. SCIF abstracts the details of +communicating over the PCIe bus while providing an API that is symmetric +across all the nodes in the PCIe network. An important design objective for SCIF +is to deliver the maximum possible performance given the communication +abilities of the hardware. SCIF has been used to implement an offload compiler +runtime and OFED support for MPI implementations for MIC coprocessors. + +SCIF API Components +=================== + +The SCIF API has the following parts: + +1. Connection establishment using a client server model +2. Byte stream messaging intended for short messages +3. Node enumeration to determine online nodes +4. Poll semantics for detection of incoming connections and messages +5. Memory registration to pin down pages +6. Remote memory mapping for low latency CPU accesses via mmap +7. Remote DMA (RDMA) for high bandwidth DMA transfers +8. Fence APIs for RDMA synchronization + +SCIF exposes the notion of a connection which can be used by peer processes on +nodes in a SCIF PCIe "network" to share memory "windows" and to communicate. A +process in a SCIF node initiates a SCIF connection to a peer process on a +different node via a SCIF "endpoint". SCIF endpoints support messaging APIs +which are similar to connection oriented socket APIs. Connected SCIF endpoints +can also register local memory which is followed by data transfer using either +DMA, CPU copies or remote memory mapping via mmap. SCIF supports both user and +kernel mode clients which are functionally equivalent. + +SCIF Performance for MIC +======================== + +DMA bandwidth comparison between the TCP (over ethernet over PCIe) stack versus +SCIF shows the performance advantages of SCIF for HPC applications and +runtimes:: + + Comparison of TCP and SCIF based BW + + Throughput (GB/sec) + 8 + PCIe Bandwidth ****** + + TCP ###### + 7 + ************************************** SCIF %%%%%% + | %%%%%%%%%%%%%%%%%%% + 6 + %%%% + | %% + | %%% + 5 + %% + | %% + 4 + %% + | %% + 3 + %% + | % + 2 + %% + | %% + | % + 1 + + + ###################################### + 0 +++---+++--+--+-+--+--+-++-+--+-++-+--+-++-+- + 1 10 100 1000 10000 100000 + Transfer Size (KBytes) + +SCIF allows memory sharing via mmap(..) between processes on different PCIe +nodes and thus provides bare-metal PCIe latency. The round trip SCIF mmap +latency from the host to an x100 MIC for an 8 byte message is 0.44 usecs. + +SCIF has a user space library which is a thin IOCTL wrapper providing a user +space API similar to the kernel API in scif.h. The SCIF user space library +is distributed @ https://software.intel.com/en-us/mic-developer + +Here is some pseudo code for an example of how two applications on two PCIe +nodes would typically use the SCIF API:: + + Process A (on node A) Process B (on node B) + + /* get online node information */ + scif_get_node_ids(..) scif_get_node_ids(..) + scif_open(..) scif_open(..) + scif_bind(..) scif_bind(..) + scif_listen(..) + scif_accept(..) scif_connect(..) + /* SCIF connection established */ + + /* Send and receive short messages */ + scif_send(..)/scif_recv(..) scif_send(..)/scif_recv(..) + + /* Register memory */ + scif_register(..) scif_register(..) + + /* RDMA */ + scif_readfrom(..)/scif_writeto(..) scif_readfrom(..)/scif_writeto(..) + + /* Fence DMAs */ + scif_fence_signal(..) scif_fence_signal(..) + + mmap(..) mmap(..) + + /* Access remote registered memory */ + + /* Close the endpoints */ + scif_close(..) scif_close(..) diff --git a/MAINTAINERS b/MAINTAINERS index 38fe2f3f7b6f2..083fcf1a151c2 100644 --- a/MAINTAINERS +++ b/MAINTAINERS @@ -8569,15 +8569,15 @@ M: Ashutosh Dixit S: Supported W: https://github.com/sudeepdutt/mic W: http://software.intel.com/en-us/mic-developer +F: Documentation/misc-devices/mic/ +F: drivers/dma/mic_x100_dma.c +F: drivers/dma/mic_x100_dma.h +F: drivers/misc/mic/ F: include/linux/mic_bus.h F: include/linux/scif.h F: include/uapi/linux/mic_common.h F: include/uapi/linux/mic_ioctl.h F: include/uapi/linux/scif_ioctl.h -F: drivers/misc/mic/ -F: drivers/dma/mic_x100_dma.c -F: drivers/dma/mic_x100_dma.h -F: Documentation/mic/ INTEL PMC CORE DRIVER M: Rajneesh Bhardwaj