Mauro Carvalho Chehab [Thu, 5 Aug 2021 14:28:43 +0000 (16:28 +0200)]
Merge tag 'v5.14-rc4' into media_tree
Linux 5.14-rc4
* tag 'v5.14-rc4': (948 commits)
Linux 5.14-rc4
pipe: make pipe writes always wake up readers
Revert "perf map: Fix dso->nsinfo refcounting"
mm/memcg: fix NULL pointer dereference in memcg_slab_free_hook()
slub: fix unreclaimable slab stat for bulk free
mm/migrate: fix NR_ISOLATED corruption on 64-bit
mm: memcontrol: fix blocking rstat function called from atomic cgroup1 thresholding code
ocfs2: issue zeroout to EOF blocks
ocfs2: fix zero out valid data
lib/test_string.c: move string selftest in the Runtime Testing menu
gve: Update MAINTAINERS list
arch: Kconfig: clean up obsolete use of HAVE_IDE
can: esd_usb2: fix memory leak
can: ems_usb: fix memory leak
can: usb_8dev: fix memory leak
can: mcba_usb_start(): add missing urb->transfer_dma initialization
can: hi311x: fix a signedness bug in hi3110_cmd()
MAINTAINERS: add Yasushi SHOJI as reviewer for the Microchip CAN BUS Analyzer Tool driver
scsi: fas216: Fix fall-through warning for Clang
scsi: acornscsi: Fix fall-through warning for clang
...
Mansur Alisha Shaik [Thu, 29 Jul 2021 11:30:40 +0000 (13:30 +0200)]
media: venus: venc: add support for V4L2_CID_MPEG_VIDEO_H264_8X8_TRANSFORM control
Add support for V4L2_CID_MPEG_VIDEO_H264_8X8_TRANSFORM control for
H264 high profile and constrained high profile.
Signed-off-by: Mansur Alisha Shaik <mansur@codeaurora.org>
Signed-off-by: Stanimir Varbanov <stanimir.varbanov@linaro.org>
Signed-off-by: Mauro Carvalho Chehab <mchehab+huawei@kernel.org>
Stanimir Varbanov [Tue, 22 Jun 2021 11:39:58 +0000 (13:39 +0200)]
media: venus: venc: Add support for intra-refresh period
Add support for intra-refresh period v4l2 control and drop
cyclic intra-refresh macroblock control in the same time.
Signed-off-by: Stanimir Varbanov <stanimir.varbanov@linaro.org>
Signed-off-by: Mauro Carvalho Chehab <mchehab+huawei@kernel.org>
Stanimir Varbanov [Tue, 22 Jun 2021 11:39:57 +0000 (13:39 +0200)]
media: v4l2-ctrls: Add intra-refresh period control
Add a control to set intra-refresh period.
Acked-by: Hans Verkuil <hverkuil-cisco@xs4all.nl>
Signed-off-by: Stanimir Varbanov <stanimir.varbanov@linaro.org>
Signed-off-by: Mauro Carvalho Chehab <mchehab+huawei@kernel.org>
Stanimir Varbanov [Tue, 22 Jun 2021 11:39:56 +0000 (13:39 +0200)]
media: docs: ext-ctrls-codec: Document cyclic intra-refresh zero control value
In all drivers _CYCLIC_INTRA_REFRESH_MB default control value is zero
which means that the macroblocks will not be intra-refreshed. Document
this _CYCLIC_INTRA_REFRESH_MB control behaviour in control description.
Acked-by: Hans Verkuil <hverkuil-cisco@xs4all.nl>
Signed-off-by: Stanimir Varbanov <stanimir.varbanov@linaro.org>
Signed-off-by: Mauro Carvalho Chehab <mchehab+huawei@kernel.org>
Mansur Alisha Shaik [Tue, 6 Jul 2021 16:21:22 +0000 (18:21 +0200)]
media: venus: helper: do not set constrained parameters for UBWC
Plane constraints firmware interface is to override the default
alignment for a given color format. By default venus hardware has
alignments as 128x32, but NV12 was defined differently to meet
various usecases. Compressed NV12 has always been aligned as 128x32,
hence not needed to override the default alignment.
Fixes: bc28936bbba9 ("media: venus: helpers, hfi, vdec: Set actual plane constraints to FW")
Signed-off-by: Mansur Alisha Shaik <mansur@codeaurora.org>
Reviewed-by: Bryan O'Donoghue <bryan.odonoghue@linaro.org>
Signed-off-by: Stanimir Varbanov <stanimir.varbanov@linaro.org>
Signed-off-by: Mauro Carvalho Chehab <mchehab+huawei@kernel.org>
Colin Ian King [Fri, 9 Jul 2021 12:30:25 +0000 (14:30 +0200)]
media: venus: venc: Fix potential null pointer dereference on pointer fmt
Currently the call to find_format can potentially return a NULL to
fmt and the nullpointer is later dereferenced on the assignment of
pixmp->num_planes = fmt->num_planes. Fix this by adding a NULL pointer
check and returning NULL for the failure case.
Addresses-Coverity: ("Dereference null return")
Fixes: aaaa93eda64b ("[media] media: venus: venc: add video encoder files")
Signed-off-by: Colin Ian King <colin.king@canonical.com>
Signed-off-by: Stanimir Varbanov <stanimir.varbanov@linaro.org>
Signed-off-by: Mauro Carvalho Chehab <mchehab+huawei@kernel.org>
Zhen Lei [Wed, 7 Jul 2021 07:45:17 +0000 (09:45 +0200)]
media: venus: hfi: fix return value check in sys_get_prop_image_version()
In case of error, the function qcom_smem_get() returns ERR_PTR()
and never returns NULL. The NULL test in the return value check
should be replaced with IS_ERR().
Fixes: d566e78dd6af ("media: venus : hfi: add venus image info into smem")
Reported-by: Hulk Robot <hulkci@huawei.com>
Signed-off-by: Zhen Lei <thunder.leizhen@huawei.com>
Signed-off-by: Stanimir Varbanov <stanimir.varbanov@linaro.org>
Signed-off-by: Mauro Carvalho Chehab <mchehab+huawei@kernel.org>
Evgeny Novikov [Wed, 28 Jul 2021 14:44:32 +0000 (16:44 +0200)]
media: tegra-cec: Handle errors of clk_prepare_enable()
tegra_cec_probe() and tegra_cec_resume() ignored possible errors of
clk_prepare_enable(). The patch fixes this.
Found by Linux Driver Verification project (linuxtesting.org).
Signed-off-by: Evgeny Novikov <novikov@ispras.ru>
Signed-off-by: Hans Verkuil <hverkuil-cisco@xs4all.nl>
Signed-off-by: Mauro Carvalho Chehab <mchehab+huawei@kernel.org>
Deborah Brouwer [Wed, 28 Jul 2021 05:56:16 +0000 (07:56 +0200)]
media: cec-pin: rename timer overrun variables
The cec pin timer overruns are measured in microseconds, but the variable
names include the millisecond symbol "ms". To avoid confusion, replace
"ms" with "us" in the variable names and printed status message.
Signed-off-by: Deborah Brouwer <deborahbrouwer3563@gmail.com>
Signed-off-by: Hans Verkuil <hverkuil-cisco@xs4all.nl>
Signed-off-by: Mauro Carvalho Chehab <mchehab+huawei@kernel.org>
Krzysztof Hałasa [Mon, 26 Jul 2021 10:49:30 +0000 (12:49 +0200)]
media: TDA1997x: report -ENOLINK after disconnecting HDMI source
The TD1997x chip retains vper, hper and hsper register values when the
HDMI source is disconnected. Use a different means of checking if the
link is still valid.
Signed-off-by: Krzysztof Hałasa <khalasa@piap.pl>
Signed-off-by: Hans Verkuil <hverkuil-cisco@xs4all.nl>
Signed-off-by: Mauro Carvalho Chehab <mchehab+huawei@kernel.org>
Krzysztof Hałasa [Mon, 26 Jul 2021 10:46:28 +0000 (12:46 +0200)]
media: TDA1997x: fix tda1997x_query_dv_timings() return value
Correctly propagate the tda1997x_detect_std error value.
Signed-off-by: Krzysztof Hałasa <khalasa@piap.pl>
Signed-off-by: Hans Verkuil <hverkuil-cisco@xs4all.nl>
Signed-off-by: Mauro Carvalho Chehab <mchehab+huawei@kernel.org>
Krzysztof Hałasa [Mon, 26 Jul 2021 10:42:49 +0000 (12:42 +0200)]
media: Fix cosmetic error in TDA1997x driver
The colon isn't followed by anything here.
Signed-off-by: Krzysztof Hałasa <khalasa@piap.pl>
Signed-off-by: Hans Verkuil <hverkuil-cisco@xs4all.nl>
Signed-off-by: Mauro Carvalho Chehab <mchehab+huawei@kernel.org>
Hans Verkuil [Fri, 23 Jul 2021 08:22:59 +0000 (10:22 +0200)]
media: v4l2-dv-timings.c: fix wrong condition in two for-loops
These for-loops should test against v4l2_dv_timings_presets[i].bt.width,
not if i < v4l2_dv_timings_presets[i].bt.width. Luckily nothing ever broke,
since the smallest width is still a lot higher than the total number of
presets, but it is wrong.
The last item in the presets array is all 0, so the for-loop must stop
when it reaches that sentinel.
Signed-off-by: Hans Verkuil <hverkuil-cisco@xs4all.nl>
Reported-by: Krzysztof Hałasa <khalasa@piap.pl>
Signed-off-by: Mauro Carvalho Chehab <mchehab+huawei@kernel.org>
Martin Kepplinger [Wed, 28 Jul 2021 09:12:44 +0000 (11:12 +0200)]
media: imx: add a driver for i.MX8MQ mipi csi rx phy and controller
Add a driver to support the i.MX8MQ MIPI CSI receiver. The hardware side
is based on
https://source.codeaurora.org/external/imx/linux-imx/tree/drivers/media/platform/imx8/mxc-mipi-csi2_yav.c?h=imx_5.4.70_2.3.0
It's built as part of VIDEO_IMX7_CSI because that's documented to support
i.MX8M platforms. This driver adds i.MX8MQ support where currently only the
i.MX8MM platform has been supported.
Signed-off-by: Martin Kepplinger <martin.kepplinger@puri.sm>
Signed-off-by: Laurent Pinchart <laurent.pinchart@ideasonboard.com>
Signed-off-by: Mauro Carvalho Chehab <mchehab+huawei@kernel.org>
Martin Kepplinger [Wed, 28 Jul 2021 09:12:43 +0000 (11:12 +0200)]
media: dt-bindings: media: document the nxp,imx8mq-mipi-csi2 receiver phy and controller
The i.MX8MQ SoC integrates a different MIPI CSI receiver as the i.MX8MM so
describe the DT bindings for it.
Signed-off-by: Martin Kepplinger <martin.kepplinger@puri.sm>
Reviewed-by: Rob Herring <robh@kernel.org>
Signed-off-by: Laurent Pinchart <laurent.pinchart@ideasonboard.com>
Signed-off-by: Mauro Carvalho Chehab <mchehab+huawei@kernel.org>
Tom Rix [Mon, 31 May 2021 17:43:00 +0000 (19:43 +0200)]
media: imx: imx7_mipi_csis: convert some switch cases to the default
Static analysis reports this false positive
imx7-mipi-csis.c:1027:2: warning: 4th function call argument is
an uninitialized value
The variable 'align' is falsely reported as uninitialized.
Even though all the cases are covered in the
switch (csis_fmt->width % 8) {
Because there is no default case, it is reported as uninialized.
Improve the switch by converting the most numerous set of cases
to the default and silence the false positive.
Signed-off-by: Tom Rix <trix@redhat.com>
Reviewed-by: Rui Miguel Silva <rmfrfs@gmail.com>
Signed-off-by: Laurent Pinchart <laurent.pinchart@ideasonboard.com>
Signed-off-by: Mauro Carvalho Chehab <mchehab+huawei@kernel.org>
Laurent Pinchart [Tue, 11 May 2021 23:44:40 +0000 (01:44 +0200)]
media: imx: imx7-media-csi: Fix buffer return upon stream start failure
When the stream fails to start, the first two buffers in the queue have
been moved to the active_vb2_buf array and are returned to vb2 by
imx7_csi_dma_unsetup_vb2_buf(). The function is called with the buffer
state set to VB2_BUF_STATE_ERROR unconditionally, which is correct when
stopping the stream, but not when the start operation fails. In that
case, the state should be set to VB2_BUF_STATE_QUEUED. Fix it.
Signed-off-by: Laurent Pinchart <laurent.pinchart@ideasonboard.com>
Tested-by: Martin Kepplinger <martin.kepplinger@puri.sm>
Reviewed-by: Rui Miguel Silva <rmfrfs@gmail.com>
Signed-off-by: Mauro Carvalho Chehab <mchehab+huawei@kernel.org>
Laurent Pinchart [Wed, 3 Feb 2021 02:57:18 +0000 (03:57 +0100)]
media: imx: imx7-media-csi: Don't set PIXEL_BIT in CSICR1
The PIXEL_BIT field of the CSICR1 register is documented as setting the
Bayer data width to 10 bits, and is set by the driver for all non-YUV
pixel formats. Test code from NXP showed that the bit shouldn't be set
for Bayer formats, and this was confirmed by experimentation with RAW8
capture (which doesn't work when setting the field) and RAW10 capture
(for which setting the field doesn't seem to make a difference) on
i.MX8MM with an OV5640 sensor connected over CSI-2. Don't set it.
Signed-off-by: Laurent Pinchart <laurent.pinchart@ideasonboard.com>
Reviewed-by: Rui Miguel Silva <rmfrfs@gmail.com>
Tested-by: Martin Kepplinger <martin.kepplinger@puri.sm>
Signed-off-by: Mauro Carvalho Chehab <mchehab+huawei@kernel.org>
Laurent Pinchart [Wed, 3 Feb 2021 03:08:19 +0000 (04:08 +0100)]
media: imx: imx7-media-csi: Set TWO_8BIT_SENSOR for >= 10-bit formats
Sample code from NXP, as well as experiments on i.MX8MM with RAW10
capture with an OV5640 sensor connected over CSI-2, showed that the
TWO_8BIT_SENSOR field of the CSICR3 register needs to be set for formats
larger than 8 bits. Do so, even if the reference manual doesn't clearly
describe the effect of the field.
Signed-off-by: Laurent Pinchart <laurent.pinchart@ideasonboard.com>
Tested-by: Martin Kepplinger <martin.kepplinger@puri.sm>
Reviewed-by: Rui Miguel Silva <rmfrfs@gmail.com>
Signed-off-by: Mauro Carvalho Chehab <mchehab+huawei@kernel.org>
Laurent Pinchart [Mon, 12 Apr 2021 02:32:48 +0000 (04:32 +0200)]
media: dt-bindings: media: nxp,imx7-csi: Add i.MX8MM support
The i.MX8MM integrates a CSI bridge IP core, as the i.MX7. There seems
to be no difference between the two SoCs according to the reference
manual, but as documentation may not be accurate, add a compatible
string for the i.MX8MM, with a fallback on the compatible i.MX7.
Signed-off-by: Laurent Pinchart <laurent.pinchart@ideasonboard.com>
Reviewed-by: Rob Herring <robh@kernel.org>
Signed-off-by: Mauro Carvalho Chehab <mchehab+huawei@kernel.org>
Umang Jain [Fri, 23 Jul 2021 11:22:33 +0000 (13:22 +0200)]
media: imx258: Limit the max analogue gain to 480
The range for analog gain mentioned in the datasheet is [0, 480].
The real gain formula mentioned in the datasheet is:
Gain = 512 / (512 – X)
Hence, values larger than 511 clearly makes no sense. The gain
register field is also documented to be of 9-bits in the datasheet.
Certainly, it is enough to infer that, the kernel driver currently
advertises an arbitrary analog gain max. Fix it by rectifying the
value as per the data sheet i.e. 480.
Signed-off-by: Umang Jain <umang.jain@ideasonboard.com>
Reviewed-by: Laurent Pinchart <laurent.pinchart@ideasonboard.com>
Reviewed-by: Dave Stevenson <dave.stevenson@raspberrypi.com>
Signed-off-by: Sakari Ailus <sakari.ailus@linux.intel.com>
Signed-off-by: Mauro Carvalho Chehab <mchehab+huawei@kernel.org>
Laurent Pinchart [Fri, 23 Jul 2021 11:22:32 +0000 (13:22 +0200)]
media: imx258: Rectify mismatch of VTS value
The frame_length_lines (0x0340) registers are hard-coded as follows:
- 4208x3118
frame_length_lines = 0x0c50
- 2104x1560
frame_length_lines = 0x0638
- 1048x780
frame_length_lines = 0x034c
The driver exposes the V4L2_CID_VBLANK control in read-only mode and
sets its value to vts_def - height, where vts_def is a mode-dependent
value coming from the supported_modes array. It is set using one of
the following macros defined in the driver:
#define IMX258_VTS_30FPS 0x0c98
#define IMX258_VTS_30FPS_2K 0x0638
#define IMX258_VTS_30FPS_VGA 0x034c
There's a clear mismatch in the value for the full resolution mode i.e.
IMX258_VTS_30FPS. Fix it by rectifying the macro with the value set for
the frame_length_lines register as stated above.
Signed-off-by: Laurent Pinchart <laurent.pinchart@ideasonboard.com>
Signed-off-by: Umang Jain <umang.jain@ideasonboard.com>
Reviewed-by: Bingbu Cao <bingbu.cao@intel.com>
Signed-off-by: Sakari Ailus <sakari.ailus@linux.intel.com>
Signed-off-by: Mauro Carvalho Chehab <mchehab+huawei@kernel.org>
Bingbu Cao [Tue, 6 Jul 2021 09:18:49 +0000 (11:18 +0200)]
media: ov8856: ignore gpio and regulator for ov8856 with ACPI
For ov8856 working with ACPI, it does not depend on the reset gpio
and regulator to do reset and power control, so should get the gpio
and regulator for non-ACPI cases only, otherwise it will break ov8856
with ACPI.
[Sakari Ailus: Wrap a line over 80 chars.]
Signed-off-by: Bingbu Cao <bingbu.cao@intel.com>
Signed-off-by: Tianshu Qiu <tian.shu.qiu@intel.com>
Cc: Robert Foss <robert.foss@linaro.org>
Signed-off-by: Sakari Ailus <sakari.ailus@linux.intel.com>
Signed-off-by: Mauro Carvalho Chehab <mchehab+huawei@kernel.org>
Bingbu Cao [Mon, 5 Jul 2021 07:14:15 +0000 (09:14 +0200)]
media: ov9734: use group write for digital gain
As the RGB digital gains of ov9734 were not applied as group, some
artifacts were observed in low light environment, use group write for
digital gain can make the RGB digital can be guaranteed to applied
together at frame boundary.
Signed-off-by: Bingbu Cao <bingbu.cao@intel.com>
Signed-off-by: Sakari Ailus <sakari.ailus@linux.intel.com>
Signed-off-by: Mauro Carvalho Chehab <mchehab+huawei@kernel.org>
Bingbu Cao [Mon, 5 Jul 2021 06:50:59 +0000 (08:50 +0200)]
media: ov2740: use group write for digital gain
As the RGB digital gains of ov2740 were not applied as group, some
artifacts were observed in low light environment, use group write for
digital gain can make the RGB digital can be guaranteed to applied
together at frame boundary.
Signed-off-by: Bingbu Cao <bingbu.cao@intel.com>
Signed-off-by: Sakari Ailus <sakari.ailus@linux.intel.com>
Signed-off-by: Mauro Carvalho Chehab <mchehab+huawei@kernel.org>
Sakari Ailus [Thu, 24 Jun 2021 12:32:02 +0000 (14:32 +0200)]
media: v4l2-flash: Check whether setting LED brightness succeeded
Setting LED brightness may return an error but the return value was never
checked by the V4L2 flash LED class. Do it now.
Signed-off-by: Sakari Ailus <sakari.ailus@linux.intel.com>
Signed-off-by: Mauro Carvalho Chehab <mchehab+huawei@kernel.org>
Sakari Ailus [Thu, 24 Jun 2021 12:05:24 +0000 (14:05 +0200)]
media: v4l2-flash: Add sanity checks for flash and indicator controls
The V4L2 flash API supports combinations of indicator and flash LEDs. Due
to this, there's a fair amount of code that deals with all the possible
options and just reading one part of the file doesn't really tell which
combinations are really possible.
Make the checks more explicit to keep static analysers happy and to make
the code more resilient to future mishaps.
Signed-off-by: Sakari Ailus <sakari.ailus@linux.intel.com>
Signed-off-by: Mauro Carvalho Chehab <mchehab+huawei@kernel.org>
Sakari Ailus [Wed, 23 Jun 2021 14:20:50 +0000 (16:20 +0200)]
media: ccs: Implement support for manual LP control
Use the pre_streamon callback to transition the transmitter to either
LP-11 or LP-111 mode if supported by the sensor.
Signed-off-by: Sakari Ailus <sakari.ailus@linux.intel.com>
Signed-off-by: Mauro Carvalho Chehab <mchehab+huawei@kernel.org>
Sakari Ailus [Wed, 23 Jun 2021 12:44:28 +0000 (14:44 +0200)]
media: v4l: subdev: Add pre_streamon and post_streamoff callbacks
Add pre_streamon and post_streamoff callbacks that can be used to set a
CSI-2 transmitter to LP-11 or LP-111 mode. This can be used by receiver
drivers to reliably initialise the receiver when its initialisation
requires software involvement.
Signed-off-by: Sakari Ailus <sakari.ailus@linux.intel.com>
Signed-off-by: Mauro Carvalho Chehab <mchehab+huawei@kernel.org>
Sakari Ailus [Wed, 23 Jun 2021 12:38:31 +0000 (14:38 +0200)]
media: Documentation: v4l: Rework LP-11 documentation, add callbacks
Rework LP-11 and LP-111 mode documentation to make it more understandable
and useful. This involves adding pre_streamon and post_streamon callbacks
that make it possible to explicitly transition the transmitter to either
mode.
Signed-off-by: Sakari Ailus <sakari.ailus@linux.intel.com>
Signed-off-by: Mauro Carvalho Chehab <mchehab+huawei@kernel.org>
Sakari Ailus [Tue, 22 Jun 2021 11:17:20 +0000 (13:17 +0200)]
media: Documentation: v4l: Improve frame rate configuration documentation
Improve the documentation of the frame rate configuration so that it can
be understood by a regular human being.
Signed-off-by: Sakari Ailus <sakari.ailus@linux.intel.com>
Signed-off-by: Mauro Carvalho Chehab <mchehab+huawei@kernel.org>
Sakari Ailus [Tue, 22 Jun 2021 10:52:03 +0000 (12:52 +0200)]
media: Documentation: v4l: Fix V4L2_CID_PIXEL_RATE documentation
The V4L2_CID_PIXEL_RATE is nowadays used to tell pixel sampling rate in
the sub-device's pixel array, not the pixel rate over a link (for which it
also becomes unfit with the addition of multiplexed streams later on). Fix
this.
Signed-off-by: Sakari Ailus <sakari.ailus@linux.intel.com>
Reviewed-by: Jacopo Mondi <jacopo@jmondi.org>
Reviewed-by: Andrey Konovalov <andrey.konovalov@linaro.org>
Signed-off-by: Mauro Carvalho Chehab <mchehab+huawei@kernel.org>
Sakari Ailus [Mon, 8 Feb 2021 10:33:14 +0000 (11:33 +0100)]
media: Documentation: media: Fix v4l2-async kerneldoc syntax
Fix kerneldoc syntax in v4l2-async. The references were not produced
correctly.
Signed-off-by: Sakari Ailus <sakari.ailus@linux.intel.com>
Reviewed-by: Jacopo Mondi <jacopo@jmondi.org>
Signed-off-by: Mauro Carvalho Chehab <mchehab+huawei@kernel.org>
Sakari Ailus [Mon, 1 Feb 2021 09:16:03 +0000 (10:16 +0100)]
media: Documentation: media: Improve camera sensor documentation
Modernise the documentation to make it more precise and update the use of
pixel rate control and various other changes. In particular:
- Use non-proportional font for file names, properties as well as
controls.
- The unit of the HBLANK control is pixels, not lines.
- The unit of PIXEL_RATE control is pixels per second, not Hz.
- Merge common requirements for CSI-2 and parallel busses.
- Include all DT properties needed for assigned clocks.
- Fix referencing the link rate control.
- SMIA driver's new name is CCS driver.
- The PIXEL_RATE control denotes pixel rate on the pixel array on camera
sensors. Do not suggest it is used to tell the maximum pixel rate on the
bus anymore.
- Improve ReST syntax (plain struct and function names).
- Remove the suggestion to use s_power() in receiver drivers.
- Make MIPI website URL use HTTPS, add Wikipedia links to BT.601 and
BT.656.
Fixes: e4cf8c58af75 ("media: Documentation: media: Document how to write camera sensor drivers")
Signed-off-by: Sakari Ailus <sakari.ailus@linux.intel.com>
Reviewed-by: Jacopo Mondi <jacopo@jmondi.org>
Reviewed-by: Andrey Konovalov <andrey.konovalov@linaro.org>
Signed-off-by: Mauro Carvalho Chehab <mchehab+huawei@kernel.org>
Wei Yongjun [Wed, 7 Apr 2021 14:37:33 +0000 (16:37 +0200)]
media: omap3isp: Fix missing unlock in isp_subdev_notifier_complete()
Add the missing unlock before return from function
isp_subdev_notifier_complete() in the init error
handling case.
Fixes: ba689d933361 ("media: omap3isp: Acquire graph mutex for graph traversal")
Reported-by: Hulk Robot <hulkci@huawei.com>
Signed-off-by: Wei Yongjun <weiyongjun1@huawei.com>
Signed-off-by: Sakari Ailus <sakari.ailus@linux.intel.com>
Signed-off-by: Mauro Carvalho Chehab <mchehab+huawei@kernel.org>
Zhen Lei [Thu, 3 Jun 2021 07:06:13 +0000 (09:06 +0200)]
media: exynos4-is: use DEVICE_ATTR_RW() helper macro
Use DEVICE_ATTR_RW() helper macro instead of DEVICE_ATTR(), which is
simpler and more readable.
Due to the names of the read and write functions of the sysfs attribute is
normalized, there is a natural association.
Signed-off-by: Zhen Lei <thunder.leizhen@huawei.com>
Signed-off-by: Sakari Ailus <sakari.ailus@linux.intel.com>
Signed-off-by: Mauro Carvalho Chehab <mchehab+huawei@kernel.org>
Zhen Lei [Thu, 3 Jun 2021 07:17:50 +0000 (09:17 +0200)]
media: i2c: use DEVICE_ATTR_RO() helper macro
Use DEVICE_ATTR_RO() helper macro instead of DEVICE_ATTR(), which is
simpler and more readable.
Due to the name of the read function of the sysfs attribute is normalized,
there is a natural association.
Signed-off-by: Zhen Lei <thunder.leizhen@huawei.com>
Signed-off-by: Sakari Ailus <sakari.ailus@linux.intel.com>
Signed-off-by: Mauro Carvalho Chehab <mchehab+huawei@kernel.org>
Zhen Lei [Thu, 3 Jun 2021 07:15:29 +0000 (09:15 +0200)]
media: i2c: et8ek8: use DEVICE_ATTR_RO() helper macro
Use DEVICE_ATTR_RO() helper macro instead of DEVICE_ATTR(), which is
simpler and more readable.
Due to the name of the read function of the sysfs attribute is normalized,
there is a natural association.
Signed-off-by: Zhen Lei <thunder.leizhen@huawei.com>
Acked-by: Pavel Machek <pavel@ucw.cz>
Signed-off-by: Sakari Ailus <sakari.ailus@linux.intel.com>
Signed-off-by: Mauro Carvalho Chehab <mchehab+huawei@kernel.org>
Zhen Lei [Thu, 3 Jun 2021 07:12:49 +0000 (09:12 +0200)]
media: mc-device.c: use DEVICE_ATTR_RO() helper macro
Use DEVICE_ATTR_RO() helper macro instead of DEVICE_ATTR(), which is
simpler and more readable.
Due to the name of the read function of the sysfs attribute is normalized,
there is a natural association.
Signed-off-by: Zhen Lei <thunder.leizhen@huawei.com>
Signed-off-by: Sakari Ailus <sakari.ailus@linux.intel.com>
Signed-off-by: Mauro Carvalho Chehab <mchehab+huawei@kernel.org>
Xavier Roumegue [Fri, 18 Jun 2021 07:59:33 +0000 (09:59 +0200)]
media: ov5640: Complement yuv mbus formats with their 1X16 versions
According to media bus pixel codes definition, data formats on serial
busses should be described with one bus sample per pixel.
Documentation/userspace-api/media/v4l/subdev-formats.rst states:
"The media bus pixel codes document parallel formats. Should the pixel
data be transported over a serial bus, the media bus pixel code that
describes a parallel format that transfers a sample on a single clock
cycle is used. For instance, both MEDIA_BUS_FMT_BGR888_1X24 and
MEDIA_BUS_FMT_BGR888_3X8 are used on parallel busses for transferring an
8 bits per sample BGR data, whereas on serial busses the data in this
format is only referred to using MEDIA_BUS_FMT_BGR888_1X24. This is
because there is effectively only a single way to transport that
format on the serial busses."
Some MIPI CSI receivers strictly obey this definition and declare
support for only *1X_* formats.
Hence, complement the supported media bus formats with their 1X16 versions
(currently applicable to yuyv, uyvy) to enhance interoperability with CSI
receivers.
Signed-off-by: Xavier Roumegue <xavier.roumegue@oss.nxp.com>
Signed-off-by: Sakari Ailus <sakari.ailus@linux.intel.com>
Signed-off-by: Mauro Carvalho Chehab <mchehab+huawei@kernel.org>
Martina Krasteva [Thu, 27 May 2021 14:21:45 +0000 (16:21 +0200)]
media: i2c: Add ov9282 camera sensor driver
Add a v4l2 sub-device driver for the OmniVisison ov9282
black&white image sensor.
The camera sensor uses the i2c bus for control and the
csi-2 bus for data.
The following features are supported:
- manual exposure and analog gain control support
- vblank/hblank/pixel rate/link freq control support
- supported resolution:
- 1280x720 @ 30fps
[Sakari Ailus: Rebase on commit
c802a4174beeb25cb539c806c9d0d3c0f61dfa53.]
Signed-off-by: Martina Krasteva <martinax.krasteva@intel.com>
Acked-by: Daniele Alessandrelli <daniele.alessandrelli@intel.com>
Acked-by: Paul J. Murphy <paul.j.murphy@intel.com>
Signed-off-by: Sakari Ailus <sakari.ailus@linux.intel.com>
Signed-off-by: Mauro Carvalho Chehab <mchehab+huawei@kernel.org>
Martina Krasteva [Thu, 27 May 2021 14:21:44 +0000 (16:21 +0200)]
media: dt-bindings: media: Add bindings for ov9282
- Add dt-bindings documentation for OmniVision ov9282 sensor driver
- Add MAINTAINERS entry for OmniVision ov9282 binding documentation
Signed-off-by: Martina Krasteva <martinax.krasteva@intel.com>
Acked-by: Daniele Alessandrelli <daniele.alessandrelli@intel.com>
Acked-by: Paul J. Murphy <paul.j.murphy@intel.com>
Reviewed-by: Rob Herring <robh@kernel.org>
Signed-off-by: Sakari Ailus <sakari.ailus@linux.intel.com>
Signed-off-by: Mauro Carvalho Chehab <mchehab+huawei@kernel.org>
Martina Krasteva [Thu, 27 May 2021 14:21:43 +0000 (16:21 +0200)]
media: i2c: Add imx412 camera sensor driver
Add a v4l2 sub-device driver for the Sony imx412 image sensor.
This is a camera sensor using the i2c bus for control and the
csi-2 bus for data.
The following features are supported:
- manual exposure and analog gain control support
- vblank/hblank/pixel rate/link freq control support
- supported resolution:
- 4056x3040 @ 30fps
- supported bayer order output:
- SRGGB10
[Sakari Ailus: Rebase on commit
c802a4174beeb25cb539c806c9d0d3c0f61dfa53.]
Signed-off-by: Martina Krasteva <martinax.krasteva@intel.com>
Acked-by: Daniele Alessandrelli <daniele.alessandrelli@intel.com>
Acked-by: Paul J. Murphy <paul.j.murphy@intel.com>
Signed-off-by: Sakari Ailus <sakari.ailus@linux.intel.com>
Signed-off-by: Mauro Carvalho Chehab <mchehab+huawei@kernel.org>
Martina Krasteva [Thu, 27 May 2021 14:21:42 +0000 (16:21 +0200)]
media: dt-bindings: media: Add bindings for imx412
- Add dt-bindings documentation for Sony imx412 sensor driver
- Add MAINTAINERS entry for Sony imx412 binding documentation
Signed-off-by: Martina Krasteva <martinax.krasteva@intel.com>
Acked-by: Daniele Alessandrelli <daniele.alessandrelli@intel.com>
Acked-by: Paul J. Murphy <paul.j.murphy@intel.com>
Reviewed-by: Rob Herring <robh@kernel.org>
Signed-off-by: Sakari Ailus <sakari.ailus@linux.intel.com>
Signed-off-by: Mauro Carvalho Chehab <mchehab+huawei@kernel.org>
Martina Krasteva [Thu, 27 May 2021 14:21:41 +0000 (16:21 +0200)]
media: i2c: Add imx335 camera sensor driver
Add a v4l2 sub-device driver for the Sony imx335 image sensor.
ThE camera sensor uses the i2c bus for control and the csi-2
bus for data.
The following features are supported:
- manual exposure and analog gain control support
- vblank/hblank/pixel rate/link freq control support
- supported resolution:
- 2592x1940 @ 30fps
- supported bayer order output:
- SRGGB12
[Sakari Ailus: Rebase on commit
c802a4174beeb25cb539c806c9d0d3c0f61dfa53.]
Signed-off-by: Martina Krasteva <martinax.krasteva@intel.com>
Acked-by: Daniele Alessandrelli <daniele.alessandrelli@intel.com>
Acked-by: Paul J. Murphy <paul.j.murphy@intel.com>
Signed-off-by: Sakari Ailus <sakari.ailus@linux.intel.com>
Signed-off-by: Mauro Carvalho Chehab <mchehab+huawei@kernel.org>
Martina Krasteva [Thu, 27 May 2021 14:21:40 +0000 (16:21 +0200)]
media: dt-bindings: media: Add bindings for imx335
- Add dt-bindings documentation for Sony imx335 sensor driver
- Add MAINTAINERS entry for Sony imx335 binding documentation
Signed-off-by: Martina Krasteva <martinax.krasteva@intel.com>
Acked-by: Daniele Alessandrelli <daniele.alessandrelli@intel.com>
Acked-by: Paul J. Murphy <paul.j.murphy@intel.com>
Reviewed-by: Rob Herring <robh@kernel.org>
Signed-off-by: Sakari Ailus <sakari.ailus@linux.intel.com>
Signed-off-by: Mauro Carvalho Chehab <mchehab+huawei@kernel.org>
Paul Kocialkowski [Wed, 9 Jun 2021 11:54:56 +0000 (13:54 +0200)]
media: v4l2-subdev: Fix documentation of the subdev_notifier member
Fix the name of the function that registers the subdev_notifier member
of the v4l2_subdev structure.
[Sakari Ailus: Drop _sensor from the function name.]
Signed-off-by: Paul Kocialkowski <paul.kocialkowski@bootlin.com>
Signed-off-by: Sakari Ailus <sakari.ailus@linux.intel.com>
Signed-off-by: Mauro Carvalho Chehab <mchehab+huawei@kernel.org>
Dongliang Mu [Wed, 7 Jul 2021 09:34:09 +0000 (11:34 +0200)]
media: em28xx-input: fix refcount bug in em28xx_usb_disconnect
If em28xx_ir_init fails, it would decrease the refcount of dev. However,
in the em28xx_ir_fini, when ir is NULL, it goes to ref_put and decrease
the refcount of dev. This will lead to a refcount bug.
Fix this bug by removing the kref_put in the error handling code
of em28xx_ir_init.
refcount_t: underflow; use-after-free.
WARNING: CPU: 0 PID: 7 at lib/refcount.c:28 refcount_warn_saturate+0x18e/0x1a0 lib/refcount.c:28
Modules linked in:
CPU: 0 PID: 7 Comm: kworker/0:1 Not tainted 5.13.0 #3
Workqueue: usb_hub_wq hub_event
RIP: 0010:refcount_warn_saturate+0x18e/0x1a0 lib/refcount.c:28
Call Trace:
kref_put.constprop.0+0x60/0x85 include/linux/kref.h:69
em28xx_usb_disconnect.cold+0xd7/0xdc drivers/media/usb/em28xx/em28xx-cards.c:4150
usb_unbind_interface+0xbf/0x3a0 drivers/usb/core/driver.c:458
__device_release_driver drivers/base/dd.c:1201 [inline]
device_release_driver_internal+0x22a/0x230 drivers/base/dd.c:1232
bus_remove_device+0x108/0x160 drivers/base/bus.c:529
device_del+0x1fe/0x510 drivers/base/core.c:3540
usb_disable_device+0xd1/0x1d0 drivers/usb/core/message.c:1419
usb_disconnect+0x109/0x330 drivers/usb/core/hub.c:2221
hub_port_connect drivers/usb/core/hub.c:5151 [inline]
hub_port_connect_change drivers/usb/core/hub.c:5440 [inline]
port_event drivers/usb/core/hub.c:5586 [inline]
hub_event+0xf81/0x1d40 drivers/usb/core/hub.c:5668
process_one_work+0x2c9/0x610 kernel/workqueue.c:2276
process_scheduled_works kernel/workqueue.c:2338 [inline]
worker_thread+0x333/0x5b0 kernel/workqueue.c:2424
kthread+0x188/0x1d0 kernel/kthread.c:319
ret_from_fork+0x1f/0x30 arch/x86/entry/entry_64.S:295
Reported-by: Dongliang Mu <mudongliangabcd@gmail.com>
Fixes: ac5688637144 ("media: em28xx: Fix possible memory leak of em28xx struct")
Signed-off-by: Dongliang Mu <mudongliangabcd@gmail.com>
Signed-off-by: Sean Young <sean@mess.org>
Signed-off-by: Mauro Carvalho Chehab <mchehab+huawei@kernel.org>
Viktor Prutyanov [Mon, 19 Jul 2021 17:05:06 +0000 (19:05 +0200)]
media: rc: introduce Meson IR TX driver
This patch adds the driver for Amlogic Meson IR transmitter.
Some Amlogic SoCs such as A311D and T950D4 have IR transmitter
(also called blaster) controller onboard. It is capable of sending
IR signals with arbitrary carrier frequency and duty cycle.
The driver supports 2 modulation clock sources:
- xtal3 clock (xtal divided by 3)
- 1us clock
Signed-off-by: Viktor Prutyanov <viktor.prutyanov@phystech.edu>
Signed-off-by: Sean Young <sean@mess.org>
Signed-off-by: Mauro Carvalho Chehab <mchehab+huawei@kernel.org>
Viktor Prutyanov [Mon, 19 Jul 2021 17:05:05 +0000 (19:05 +0200)]
media: rc: meson-ir-tx: document device tree bindings
This patch adds binding documentation for the IR transmitter
available in Amlogic Meson SoCs.
Signed-off-by: Viktor Prutyanov <viktor.prutyanov@phystech.edu>
Acked-by: Martin Blumenstingl <martin.blumenstingl@googlemail.com>
Reviewed-by: Rob Herring <robh@kernel.org>
Signed-off-by: Sean Young <sean@mess.org>
Signed-off-by: Mauro Carvalho Chehab <mchehab+huawei@kernel.org>
Mauro Carvalho Chehab [Tue, 3 Aug 2021 13:36:58 +0000 (15:36 +0200)]
Merge commit '
c3cdc019a6bf' into media_tree
* commit '
c3cdc019a6bf':
media: atomisp: pci: reposition braces as per coding style
media: atomisp: i2c: Remove a superfluous else clause in atomisp-mt9m114.c
media: atomisp: Move MIPI_PORT_LANES to the only user
media: atomisp: Perform a single memset() for union
media: atomisp: pci: fix error return code in atomisp_pci_probe()
media: atomisp: pci: Remove unnecessary (void *) cast
media: atomisp: pci: Remove checks before kfree/kvfree
media: atomisp: Remove unused port_enabled variable
media: atomisp: Annotate a couple of definitions with __maybe_unused
media: atomisp: Remove unused declarations
media: atomisp: remove the repeated declaration
media: atomisp: improve error handling in gc2235_detect()
media: atomisp: Fix whitespace at the beginning of line
media: atomisp: Align block comments
media: atomisp: Use sysfs_emit() instead of sprintf() where appropriate
media: atomisp: Fix line continuation style issue in sh_css.c
media: atomisp: Use kcalloc instead of kzalloc with multiply in sh_css.c
media: atomisp: Remove unnecessary parens in sh_css.c
media: atomisp: Resolve goto style issue in sh_css.c
media: atomisp: fix the uninitialized use and rename "retvalue"
Linus Torvalds [Mon, 2 Aug 2021 00:04:17 +0000 (17:04 -0700)]
Linux 5.14-rc4
Linus Torvalds [Sun, 1 Aug 2021 19:25:30 +0000 (12:25 -0700)]
Merge tag 'perf-tools-fixes-for-v5.14-2021-08-01' of git://git./linux/kernel/git/acme/linux
Pull perf tools fixes from Arnaldo Carvalho de Melo:
- Revert "perf map: Fix dso->nsinfo refcounting", this makes 'perf top'
abort, uncovering a design flaw on how namespace information is kept.
The fix for that is more than we can do right now, leave it for the
next merge window.
- Split --dump-raw-trace by AUX records for ARM's CoreSight, fixing up
the decoding of some records.
- Fix PMU alias matching.
Thanks to James Clark and John Garry for these fixes.
* tag 'perf-tools-fixes-for-v5.14-2021-08-01' of git://git.kernel.org/pub/scm/linux/kernel/git/acme/linux:
Revert "perf map: Fix dso->nsinfo refcounting"
perf pmu: Fix alias matching
perf cs-etm: Split --dump-raw-trace by AUX records
Linus Torvalds [Sun, 1 Aug 2021 19:18:44 +0000 (12:18 -0700)]
Merge tag 'powerpc-5.14-4' of git://git./linux/kernel/git/powerpc/linux
Pull powerpc fixes from Michael Ellerman:
- Don't use r30 in VDSO code, to avoid breaking existing Go lang
programs.
- Change an export symbol to allow non-GPL modules to use spinlocks
again.
Thanks to Paul Menzel, and Srikar Dronamraju.
* tag 'powerpc-5.14-4' of git://git.kernel.org/pub/scm/linux/kernel/git/powerpc/linux:
powerpc/vdso: Don't use r30 to avoid breaking Go lang
powerpc/pseries: Fix regression while building external modules
Linus Torvalds [Sun, 1 Aug 2021 19:07:23 +0000 (12:07 -0700)]
Merge tag 'xfs-5.14-fixes-2' of git://git./fs/xfs/xfs-linux
Pull xfs fixes from Darrick Wong:
"This contains a bunch of bug fixes in XFS.
Dave and I have been busy the last couple of weeks to find and fix as
many log recovery bugs as we can find; here are the results so far. Go
fstests -g recoveryloop! ;)
- Fix a number of coordination bugs relating to cache flushes for
metadata writeback, cache flushes for multi-buffer log writes, and
FUA writes for single-buffer log writes
- Fix a bug with incorrect replay of attr3 blocks
- Fix unnecessary stalls when flushing logs to disk
- Fix spoofing problems when recovering realtime bitmap blocks"
* tag 'xfs-5.14-fixes-2' of git://git.kernel.org/pub/scm/fs/xfs/xfs-linux:
xfs: prevent spoofing of rtbitmap blocks when recovering buffers
xfs: limit iclog tail updates
xfs: need to see iclog flags in tracing
xfs: Enforce attr3 buffer recovery order
xfs: logging the on disk inode LSN can make it go backwards
xfs: avoid unnecessary waits in xfs_log_force_lsn()
xfs: log forces imply data device cache flushes
xfs: factor out forced iclog flushes
xfs: fix ordering violation between cache flushes and tail updates
xfs: fold __xlog_state_release_iclog into xlog_state_release_iclog
xfs: external logs need to flush data device
xfs: flush data dev on external log write
Linus Torvalds [Sat, 31 Jul 2021 16:25:12 +0000 (09:25 -0700)]
Merge tag '5.14-rc3-smb3-fixes' of git://git.samba.org/sfrench/cifs-2.6
Pull cifs fixes from Steve French:
"Three cifs/smb3 fixes, including two for stable, and a fix for an
fallocate problem noticed by Clang"
* tag '5.14-rc3-smb3-fixes' of git://git.samba.org/sfrench/cifs-2.6:
cifs: add missing parsing of backupuid
smb3: rc uninitialized in one fallocate path
SMB3: fix readpage for large swap cache
Linus Torvalds [Fri, 30 Jul 2021 23:01:36 +0000 (16:01 -0700)]
Merge tag 'net-5.14-rc4' of git://git./linux/kernel/git/netdev/net
Pull networking fixes from Jakub Kicinski:
"Networking fixes for 5.14-rc4, including fixes from bpf, can, WiFi
(mac80211) and netfilter trees.
Current release - regressions:
- mac80211: fix starting aggregation sessions on mesh interfaces
Current release - new code bugs:
- sctp: send pmtu probe only if packet loss in Search Complete state
- bnxt_en: add missing periodic PHC overflow check
- devlink: fix phys_port_name of virtual port and merge error
- hns3: change the method of obtaining default ptp cycle
- can: mcba_usb_start(): add missing urb->transfer_dma initialization
Previous releases - regressions:
- set true network header for ECN decapsulation
- mlx5e: RX, avoid possible data corruption w/ relaxed ordering and
LRO
- phy: re-add check for PHY_BRCM_DIS_TXCRXC_NOENRGY on the BCM54811
PHY
- sctp: fix return value check in __sctp_rcv_asconf_lookup
Previous releases - always broken:
- bpf:
- more spectre corner case fixes, introduce a BPF nospec
instruction for mitigating Spectre v4
- fix OOB read when printing XDP link fdinfo
- sockmap: fix cleanup related races
- mac80211: fix enabling 4-address mode on a sta vif after assoc
- can:
- raw: raw_setsockopt(): fix raw_rcv panic for sock UAF
- j1939: j1939_session_deactivate(): clarify lifetime of session
object, avoid UAF
- fix number of identical memory leaks in USB drivers
- tipc:
- do not blindly write skb_shinfo frags when doing decryption
- fix sleeping in tipc accept routine"
* tag 'net-5.14-rc4' of git://git.kernel.org/pub/scm/linux/kernel/git/netdev/net: (91 commits)
gve: Update MAINTAINERS list
can: esd_usb2: fix memory leak
can: ems_usb: fix memory leak
can: usb_8dev: fix memory leak
can: mcba_usb_start(): add missing urb->transfer_dma initialization
can: hi311x: fix a signedness bug in hi3110_cmd()
MAINTAINERS: add Yasushi SHOJI as reviewer for the Microchip CAN BUS Analyzer Tool driver
bpf: Fix leakage due to insufficient speculative store bypass mitigation
bpf: Introduce BPF nospec instruction for mitigating Spectre v4
sis900: Fix missing pci_disable_device() in probe and remove
net: let flow have same hash in two directions
nfc: nfcsim: fix use after free during module unload
tulip: windbond-840: Fix missing pci_disable_device() in probe and remove
sctp: fix return value check in __sctp_rcv_asconf_lookup
nfc: s3fwrn5: fix undefined parameter values in dev_err()
net/mlx5: Fix mlx5_vport_tbl_attr chain from u16 to u32
net/mlx5e: Fix nullptr in mlx5e_hairpin_get_mdev()
net/mlx5: Unload device upon firmware fatal error
net/mlx5e: Fix page allocation failure for ptp-RQ over SF
net/mlx5e: Fix page allocation failure for trap-RQ over SF
...
Linus Torvalds [Fri, 30 Jul 2021 22:56:24 +0000 (15:56 -0700)]
Merge tag 'acpi-5.14-rc4' of git://git./linux/kernel/git/rafael/linux-pm
Pull ACPI fixes from Rafael Wysocki:
"These revert a recent IRQ resources handling modification that turned
out to be problematic, fix suspend-to-idle handling on AMD platforms
to take upcoming systems into account properly and fix the retrieval
of the DPTF attributes of the PCH FIVR.
Specifics:
- Revert recent change of the ACPI IRQ resources handling that
attempted to improve the ACPI IRQ override selection logic, but
introduced serious regressions on some systems (Hui Wang).
- Fix up quirks for AMD platforms in the suspend-to-idle support code
so as to take upcoming systems using uPEP HID AMDI007 into account
as appropriate (Mario Limonciello).
- Fix the code retrieving DPTF attributes of the PCH FIVR so that it
agrees on the return data type with the ACPI control method
evaluated for this purpose (Srinivas Pandruvada)"
* tag 'acpi-5.14-rc4' of git://git.kernel.org/pub/scm/linux/kernel/git/rafael/linux-pm:
ACPI: DPTF: Fix reading of attributes
Revert "ACPI: resources: Add checks for ACPI IRQ override"
ACPI: PM: Add support for upcoming AMD uPEP HID AMDI007
Linus Torvalds [Fri, 30 Jul 2021 22:42:34 +0000 (15:42 -0700)]
pipe: make pipe writes always wake up readers
Since commit
1b6b26ae7053 ("pipe: fix and clarify pipe write wakeup
logic") we have sanitized the pipe write logic, and would only try to
wake up readers if they needed it.
In particular, if the pipe already had data in it before the write,
there was no point in trying to wake up a reader, since any existing
readers must have been aware of the pre-existing data already. Doing
extraneous wakeups will only cause potential thundering herd problems.
However, it turns out that some Android libraries have misused the EPOLL
interface, and expected "edge triggered" be to "any new write will
trigger it". Even if there was no edge in sight.
Quoting Sandeep Patil:
"The commit
1b6b26ae7053 ('pipe: fix and clarify pipe write wakeup
logic') changed pipe write logic to wakeup readers only if the pipe
was empty at the time of write. However, there are libraries that
relied upon the older behavior for notification scheme similar to
what's described in [1]
One such library 'realm-core'[2] is used by numerous Android
applications. The library uses a similar notification mechanism as GNU
Make but it never drains the pipe until it is full. When Android moved
to v5.10 kernel, all applications using this library stopped working.
The library has since been fixed[3] but it will be a while before all
applications incorporate the updated library"
Our regression rule for the kernel is that if applications break from
new behavior, it's a regression, even if it was because the application
did something patently wrong. Also note the original report [4] by
Michal Kerrisk about a test for this epoll behavior - but at that point
we didn't know of any actual broken use case.
So add the extraneous wakeup, to approximate the old behavior.
[ I say "approximate", because the exact old behavior was to do a wakeup
not for each write(), but for each pipe buffer chunk that was filled
in. The behavior introduced by this change is not that - this is just
"every write will cause a wakeup, whether necessary or not", which
seems to be sufficient for the broken library use. ]
It's worth noting that this adds the extraneous wakeup only for the
write side, while the read side still considers the "edge" to be purely
about reading enough from the pipe to allow further writes.
See commit
f467a6a66419 ("pipe: fix and clarify pipe read wakeup logic")
for the pipe read case, which remains that "only wake up if the pipe was
full, and we read something from it".
Link: https://lore.kernel.org/lkml/CAHk-=wjeG0q1vgzu4iJhW5juPkTsjTYmiqiMUYAebWW+0bam6w@mail.gmail.com/
Link: https://github.com/realm/realm-core
Link: https://github.com/realm/realm-core/issues/4666
Link: https://lore.kernel.org/lkml/CAKgNAkjMBGeAwF=2MKK758BhxvW58wYTgYKB2V-gY1PwXxrH+Q@mail.gmail.com/
Link: https://lore.kernel.org/lkml/20210729222635.2937453-1-sspatil@android.com/
Reported-by: Sandeep Patil <sspatil@android.com>
Cc: Michael Kerrisk <mtk.manpages@gmail.com>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Arnaldo Carvalho de Melo [Fri, 30 Jul 2021 21:26:22 +0000 (18:26 -0300)]
Revert "perf map: Fix dso->nsinfo refcounting"
This makes 'perf top' abort in some cases, and the right fix will
involve surgery that is too much to do at this stage, so revert for now
and fix it in the next merge window.
This reverts commit
2d6b74baa7147251c30a46c4996e8cc224aa2dc5.
Cc: Riccardo Mancini <rickyman7@gmail.com>
Cc: Ian Rogers <irogers@google.com>
Cc: Jiri Olsa <jolsa@redhat.com>
Cc: Krister Johansen <kjlx@templeofstupid.com>
Cc: Mark Rutland <mark.rutland@arm.com>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Rafael J. Wysocki [Fri, 30 Jul 2021 18:26:38 +0000 (20:26 +0200)]
Merge branches 'acpi-resources' and 'acpi-dptf'
* acpi-resources:
Revert "ACPI: resources: Add checks for ACPI IRQ override"
* acpi-dptf:
ACPI: DPTF: Fix reading of attributes
Linus Torvalds [Fri, 30 Jul 2021 18:08:12 +0000 (11:08 -0700)]
Merge tag 'block-5.14-2021-07-30' of git://git.kernel.dk/linux-block
Pull block fixes from Jens Axboe:
- gendisk freeing fix (Christoph)
- blk-iocost wake ordering fix (Tejun)
- tag allocation error handling fix (John)
- loop locking fix. While this isn't the prettiest fix in the world,
nobody has any good alternatives for 5.14. Something to likely
revisit for 5.15. (Tetsuo)
* tag 'block-5.14-2021-07-30' of git://git.kernel.dk/linux-block:
block: delay freeing the gendisk
blk-iocost: fix operation ordering in iocg_wake_fn()
blk-mq-sched: Fix blk_mq_sched_alloc_tags() error handling
loop: reintroduce global lock for safe loop_validate_file() traversal
Linus Torvalds [Fri, 30 Jul 2021 18:01:47 +0000 (11:01 -0700)]
Merge tag 'io_uring-5.14-2021-07-30' of git://git.kernel.dk/linux-block
Pull io_uring fixes from Jens Axboe:
- A fix for block backed reissue (me)
- Reissue context hardening (me)
- Async link locking fix (Pavel)
* tag 'io_uring-5.14-2021-07-30' of git://git.kernel.dk/linux-block:
io_uring: fix poll requests leaking second poll entries
io_uring: don't block level reissue off completion path
io_uring: always reissue from task_work context
io_uring: fix race in unified task_work running
io_uring: fix io_prep_async_link locking
Linus Torvalds [Fri, 30 Jul 2021 17:56:47 +0000 (10:56 -0700)]
Merge tag 'libata-5.14-2021-07-30' of git://git.kernel.dk/linux-block
Pull libata fixlets from Jens Axboe:
- A fix for PIO highmem (Christoph)
- Kill HAVE_IDE as it's now unused (Lukas)
* tag 'libata-5.14-2021-07-30' of git://git.kernel.dk/linux-block:
arch: Kconfig: clean up obsolete use of HAVE_IDE
libata: fix ata_pio_sector for CONFIG_HIGHMEM
Linus Torvalds [Fri, 30 Jul 2021 17:50:09 +0000 (10:50 -0700)]
Merge tag 'for-5.14-rc3-tag' of git://git./linux/kernel/git/kdave/linux
Pull btrfs fixes from David Sterba:
- fix -Warray-bounds warning, to help external patchset to make it
default treewide
- fix writeable device accounting (syzbot report)
- fix fsync and log replay after a rename and inode eviction
- fix potentially lost error code when submitting multiple bios for
compressed range
* tag 'for-5.14-rc3-tag' of git://git.kernel.org/pub/scm/linux/kernel/git/kdave/linux:
btrfs: calculate number of eb pages properly in csum_tree_block
btrfs: fix rw device counting in __btrfs_free_extra_devids
btrfs: fix lost inode on log replay after mix of fsync, rename and inode eviction
btrfs: mark compressed range uptodate only if all bio succeed
Linus Torvalds [Fri, 30 Jul 2021 17:36:36 +0000 (10:36 -0700)]
Merge branch 'for-linus' of git://git./linux/kernel/git/hid/hid
Pull HID fixes from Jiri Kosina:
- resume timing fix for intel-ish driver (Ye Xiang)
- fix for using incorrect MMIO register in amd_sfh driver (Dylan
MacKenzie)
- Cintiq 24HDT / 27QHDT regression fix and touch processing fix for
Wacom driver (Jason Gerecke)
- device removal bugfix for ft260 driver (Michael Zaidman)
- other small assorted fixes
* 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/hid/hid:
HID: ft260: fix device removal due to USB disconnect
HID: wacom: Skip processing of touches with negative slot values
HID: wacom: Re-enable touch by default for Cintiq 24HDT / 27QHDT
HID: Kconfig: Fix spelling mistake "Uninterruptable" -> "Uninterruptible"
HID: apple: Add support for Keychron K1 wireless keyboard
HID: fix typo in Kconfig
HID: ft260: fix format type warning in ft260_word_show()
HID: amd_sfh: Use correct MMIO register for DMA address
HID: asus: Remove check for same LED brightness on set
HID: intel-ish-hid: use async resume function
Linus Torvalds [Fri, 30 Jul 2021 17:29:58 +0000 (10:29 -0700)]
Merge branch 'akpm' (patches from Andrew)
Merge misc fixes from Andrew Morton:
"7 patches.
Subsystems affected by this patch series: lib, ocfs2, and mm (slub,
migration, and memcg)"
* emailed patches from Andrew Morton <akpm@linux-foundation.org>:
mm/memcg: fix NULL pointer dereference in memcg_slab_free_hook()
slub: fix unreclaimable slab stat for bulk free
mm/migrate: fix NR_ISOLATED corruption on 64-bit
mm: memcontrol: fix blocking rstat function called from atomic cgroup1 thresholding code
ocfs2: issue zeroout to EOF blocks
ocfs2: fix zero out valid data
lib/test_string.c: move string selftest in the Runtime Testing menu
Jakub Kicinski [Fri, 30 Jul 2021 17:29:52 +0000 (19:29 +0200)]
Merge tag 'linux-can-fixes-for-5.14-
20210730' of git://git./linux/kernel/git/mkl/linux-can
Marc Kleine-Budde says:
====================
pull-request: can 2021-07-30
The first patch is by me and adds Yasushi SHOJI as a reviewer for the
Microchip CAN BUS Analyzer Tool driver.
Dan Carpenter's patch fixes a signedness bug in the hi311x driver.
Pavel Skripkin provides 4 patches, the first targets the mcba_usb
driver by adding the missing urb->transfer_dma initialization, which
was broken in a previous commit. The last 3 patches fix a memory leak
in the usb_8dev, ems_usb and esd_usb2 driver.
* tag 'linux-can-fixes-for-5.14-
20210730' of git://git.kernel.org/pub/scm/linux/kernel/git/mkl/linux-can:
can: esd_usb2: fix memory leak
can: ems_usb: fix memory leak
can: usb_8dev: fix memory leak
can: mcba_usb_start(): add missing urb->transfer_dma initialization
can: hi311x: fix a signedness bug in hi3110_cmd()
MAINTAINERS: add Yasushi SHOJI as reviewer for the Microchip CAN BUS Analyzer Tool driver
====================
Link: https://lore.kernel.org/r/20210730070526.1699867-1-mkl@pengutronix.de
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
Wang Hai [Thu, 29 Jul 2021 21:53:54 +0000 (14:53 -0700)]
mm/memcg: fix NULL pointer dereference in memcg_slab_free_hook()
When I use kfree_rcu() to free a large memory allocated by kmalloc_node(),
the following dump occurs.
BUG: kernel NULL pointer dereference, address:
0000000000000020
[...]
Oops: 0000 [#1] SMP
[...]
Workqueue: events kfree_rcu_work
RIP: 0010:__obj_to_index include/linux/slub_def.h:182 [inline]
RIP: 0010:obj_to_index include/linux/slub_def.h:191 [inline]
RIP: 0010:memcg_slab_free_hook+0x120/0x260 mm/slab.h:363
[...]
Call Trace:
kmem_cache_free_bulk+0x58/0x630 mm/slub.c:3293
kfree_bulk include/linux/slab.h:413 [inline]
kfree_rcu_work+0x1ab/0x200 kernel/rcu/tree.c:3300
process_one_work+0x207/0x530 kernel/workqueue.c:2276
worker_thread+0x320/0x610 kernel/workqueue.c:2422
kthread+0x13d/0x160 kernel/kthread.c:313
ret_from_fork+0x1f/0x30 arch/x86/entry/entry_64.S:294
When kmalloc_node() a large memory, page is allocated, not slab, so when
freeing memory via kfree_rcu(), this large memory should not be used by
memcg_slab_free_hook(), because memcg_slab_free_hook() is is used for
slab.
Using page_objcgs_check() instead of page_objcgs() in
memcg_slab_free_hook() to fix this bug.
Link: https://lkml.kernel.org/r/20210728145655.274476-1-wanghai38@huawei.com
Fixes: 270c6a71460e ("mm: memcontrol/slab: Use helpers to access slab page's memcg_data")
Signed-off-by: Wang Hai <wanghai38@huawei.com>
Reviewed-by: Shakeel Butt <shakeelb@google.com>
Acked-by: Michal Hocko <mhocko@suse.com>
Acked-by: Roman Gushchin <guro@fb.com>
Reviewed-by: Kefeng Wang <wangkefeng.wang@huawei.com>
Reviewed-by: Muchun Song <songmuchun@bytedance.com>
Cc: Christoph Lameter <cl@linux.com>
Cc: Pekka Enberg <penberg@kernel.org>
Cc: David Rientjes <rientjes@google.com>
Cc: Joonsoo Kim <iamjoonsoo.kim@lge.com>
Cc: Vlastimil Babka <vbabka@suse.cz>
Cc: Johannes Weiner <hannes@cmpxchg.org>
Cc: Alexei Starovoitov <ast@kernel.org>
Cc: <stable@vger.kernel.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Shakeel Butt [Thu, 29 Jul 2021 21:53:50 +0000 (14:53 -0700)]
slub: fix unreclaimable slab stat for bulk free
SLUB uses page allocator for higher order allocations and update
unreclaimable slab stat for such allocations. At the moment, the bulk
free for SLUB does not share code with normal free code path for these
type of allocations and have missed the stat update. So, fix the stat
update by common code. The user visible impact of the bug is the
potential of inconsistent unreclaimable slab stat visible through
meminfo and vmstat.
Link: https://lkml.kernel.org/r/20210728155354.3440560-1-shakeelb@google.com
Fixes: 6a486c0ad4dc ("mm, sl[ou]b: improve memory accounting")
Signed-off-by: Shakeel Butt <shakeelb@google.com>
Acked-by: Michal Hocko <mhocko@suse.com>
Acked-by: Roman Gushchin <guro@fb.com>
Reviewed-by: Muchun Song <songmuchun@bytedance.com>
Cc: Christoph Lameter <cl@linux.com>
Cc: Pekka Enberg <penberg@kernel.org>
Cc: David Rientjes <rientjes@google.com>
Cc: Joonsoo Kim <iamjoonsoo.kim@lge.com>
Cc: Vlastimil Babka <vbabka@suse.cz>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Aneesh Kumar K.V [Thu, 29 Jul 2021 21:53:47 +0000 (14:53 -0700)]
mm/migrate: fix NR_ISOLATED corruption on 64-bit
Similar to commit
2da9f6305f30 ("mm/vmscan: fix NR_ISOLATED_FILE
corruption on 64-bit") avoid using unsigned int for nr_pages. With
unsigned int type the large unsigned int converts to a large positive
signed long.
Symptoms include CMA allocations hanging forever due to
alloc_contig_range->...->isolate_migratepages_block waiting forever in
"while (unlikely(too_many_isolated(pgdat)))".
Link: https://lkml.kernel.org/r/20210728042531.359409-1-aneesh.kumar@linux.ibm.com
Fixes: c5fc5c3ae0c8 ("mm: migrate: account THP NUMA migration counters correctly")
Signed-off-by: Aneesh Kumar K.V <aneesh.kumar@linux.ibm.com>
Reported-by: Michael Ellerman <mpe@ellerman.id.au>
Reported-by: Alexey Kardashevskiy <aik@ozlabs.ru>
Reviewed-by: Yang Shi <shy828301@gmail.com>
Cc: Mel Gorman <mgorman@suse.de>
Cc: Nicholas Piggin <npiggin@gmail.com>
Cc: David Hildenbrand <david@redhat.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Johannes Weiner [Thu, 29 Jul 2021 21:53:44 +0000 (14:53 -0700)]
mm: memcontrol: fix blocking rstat function called from atomic cgroup1 thresholding code
Dan Carpenter reports:
The patch
2d146aa3aa84: "mm: memcontrol: switch to rstat" from Apr
29, 2021, leads to the following static checker warning:
kernel/cgroup/rstat.c:200 cgroup_rstat_flush()
warn: sleeping in atomic context
mm/memcontrol.c
3572 static unsigned long mem_cgroup_usage(struct mem_cgroup *memcg, bool swap)
3573 {
3574 unsigned long val;
3575
3576 if (mem_cgroup_is_root(memcg)) {
3577 cgroup_rstat_flush(memcg->css.cgroup);
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
This is from static analysis and potentially a false positive. The
problem is that mem_cgroup_usage() is called from __mem_cgroup_threshold()
which holds an rcu_read_lock(). And the cgroup_rstat_flush() function
can sleep.
3578 val = memcg_page_state(memcg, NR_FILE_PAGES) +
3579 memcg_page_state(memcg, NR_ANON_MAPPED);
3580 if (swap)
3581 val += memcg_page_state(memcg, MEMCG_SWAP);
3582 } else {
3583 if (!swap)
3584 val = page_counter_read(&memcg->memory);
3585 else
3586 val = page_counter_read(&memcg->memsw);
3587 }
3588 return val;
3589 }
__mem_cgroup_threshold() indeed holds the rcu lock. In addition, the
thresholding code is invoked during stat changes, and those contexts
have irqs disabled as well. If the lock breaking occurs inside the
flush function, it will result in a sleep from an atomic context.
Use the irqsafe flushing variant in mem_cgroup_usage() to fix this.
Link: https://lkml.kernel.org/r/20210726150019.251820-1-hannes@cmpxchg.org
Fixes: 2d146aa3aa84 ("mm: memcontrol: switch to rstat")
Signed-off-by: Johannes Weiner <hannes@cmpxchg.org>
Reported-by: Dan Carpenter <dan.carpenter@oracle.com>
Acked-by: Chris Down <chris@chrisdown.name>
Reviewed-by: Rik van Riel <riel@surriel.com>
Acked-by: Michal Hocko <mhocko@suse.com>
Reviewed-by: Shakeel Butt <shakeelb@google.com>
Cc: <stable@vger.kernel.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Junxiao Bi [Thu, 29 Jul 2021 21:53:41 +0000 (14:53 -0700)]
ocfs2: issue zeroout to EOF blocks
For punch holes in EOF blocks, fallocate used buffer write to zero the
EOF blocks in last cluster. But since ->writepage will ignore EOF
pages, those zeros will not be flushed.
This "looks" ok as commit
6bba4471f0cc ("ocfs2: fix data corruption by
fallocate") will zero the EOF blocks when extend the file size, but it
isn't. The problem happened on those EOF pages, before writeback, those
pages had DIRTY flag set and all buffer_head in them also had DIRTY flag
set, when writeback run by write_cache_pages(), DIRTY flag on the page
was cleared, but DIRTY flag on the buffer_head not.
When next write happened to those EOF pages, since buffer_head already
had DIRTY flag set, it would not mark page DIRTY again. That made
writeback ignore them forever. That will cause data corruption. Even
directio write can't work because it will fail when trying to drop pages
caches before direct io, as it found the buffer_head for those pages
still had DIRTY flag set, then it will fall back to buffer io mode.
To make a summary of the issue, as writeback ingores EOF pages, once any
EOF page is generated, any write to it will only go to the page cache,
it will never be flushed to disk even file size extends and that page is
not EOF page any more. The fix is to avoid zero EOF blocks with buffer
write.
The following code snippet from qemu-img could trigger the corruption.
656 open("
6b3711ae-3306-4bdd-823c-
cf1c0060a095.conv.2", O_RDWR|O_DIRECT|O_CLOEXEC) = 11
...
660 fallocate(11, FALLOC_FL_KEEP_SIZE|FALLOC_FL_PUNCH_HOLE,
2275868672, 327680 <unfinished ...>
660 fallocate(11, 0,
2275868672, 327680) = 0
658 pwrite64(11, "
Link: https://lkml.kernel.org/r/20210722054923.24389-2-junxiao.bi@oracle.com
Signed-off-by: Junxiao Bi <junxiao.bi@oracle.com>
Reviewed-by: Joseph Qi <joseph.qi@linux.alibaba.com>
Cc: Mark Fasheh <mark@fasheh.com>
Cc: Joel Becker <jlbec@evilplan.org>
Cc: Changwei Ge <gechangwei@live.cn>
Cc: Gang He <ghe@suse.com>
Cc: Jun Piao <piaojun@huawei.com>
Cc: <stable@vger.kernel.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Junxiao Bi [Thu, 29 Jul 2021 21:53:38 +0000 (14:53 -0700)]
ocfs2: fix zero out valid data
If append-dio feature is enabled, direct-io write and fallocate could
run in parallel to extend file size, fallocate used "orig_isize" to
record i_size before taking "ip_alloc_sem", when
ocfs2_zeroout_partial_cluster() zeroout EOF blocks, i_size maybe already
extended by ocfs2_dio_end_io_write(), that will cause valid data zeroed
out.
Link: https://lkml.kernel.org/r/20210722054923.24389-1-junxiao.bi@oracle.com
Fixes: 6bba4471f0cc ("ocfs2: fix data corruption by fallocate")
Signed-off-by: Junxiao Bi <junxiao.bi@oracle.com>
Reviewed-by: Joseph Qi <joseph.qi@linux.alibaba.com>
Cc: Changwei Ge <gechangwei@live.cn>
Cc: Gang He <ghe@suse.com>
Cc: Joel Becker <jlbec@evilplan.org>
Cc: Jun Piao <piaojun@huawei.com>
Cc: Mark Fasheh <mark@fasheh.com>
Cc: <stable@vger.kernel.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Matteo Croce [Thu, 29 Jul 2021 21:53:35 +0000 (14:53 -0700)]
lib/test_string.c: move string selftest in the Runtime Testing menu
STRING_SELFTEST is presented in the "Library routines" menu. Move it in
Kernel hacking > Kernel Testing and Coverage > Runtime Testing together
with other similar tests found in lib/
--- Runtime Testing
<*> Test functions located in the hexdump module at runtime
<*> Test string functions (NEW)
<*> Test functions located in the string_helpers module at runtime
<*> Test strscpy*() family of functions at runtime
<*> Test kstrto*() family of functions at runtime
<*> Test printf() family of functions at runtime
<*> Test scanf() family of functions at runtime
Link: https://lkml.kernel.org/r/20210719185158.190371-1-mcroce@linux.microsoft.com
Signed-off-by: Matteo Croce <mcroce@microsoft.com>
Cc: Peter Rosin <peda@axentia.se>
Cc: Geert Uytterhoeven <geert@linux-m68k.org>
Cc: Randy Dunlap <rdunlap@infradead.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Catherine Sullivan [Thu, 29 Jul 2021 15:52:58 +0000 (08:52 -0700)]
gve: Update MAINTAINERS list
The team maintaining the gve driver has undergone some changes,
this updates the MAINTAINERS file accordingly.
Signed-off-by: Catherine Sullivan <csully@google.com>
Signed-off-by: Jon Olson <jonolson@google.com>
Signed-off-by: David Awogbemila <awogbemila@google.com>
Signed-off-by: Jeroen de Borst <jeroendb@google.com>
Link: https://lore.kernel.org/r/20210729155258.442650-1-csully@google.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
Lukas Bulwahn [Wed, 28 Jul 2021 18:21:15 +0000 (20:21 +0200)]
arch: Kconfig: clean up obsolete use of HAVE_IDE
The arch-specific Kconfig files use HAVE_IDE to indicate if IDE is
supported.
As IDE support and the HAVE_IDE config vanishes with commit
b7fb14d3ac63
("ide: remove the legacy ide driver"), there is no need to mention
HAVE_IDE in all those arch-specific Kconfig files.
The issue was identified with ./scripts/checkkconfigsymbols.py.
Fixes: b7fb14d3ac63 ("ide: remove the legacy ide driver")
Suggested-by: Randy Dunlap <rdunlap@infradead.org>
Signed-off-by: Lukas Bulwahn <lukas.bulwahn@gmail.com>
Acked-by: Randy Dunlap <rdunlap@infradead.org>
Link: https://lore.kernel.org/r/20210728182115.4401-1-lukas.bulwahn@gmail.com
Reviewed-by: Christoph Hellwig <hch@lst.de>
Acked-by: Geert Uytterhoeven <geert@linux-m68k.org>
Signed-off-by: Jens Axboe <axboe@kernel.dk>
Pavel Skripkin [Tue, 27 Jul 2021 17:00:46 +0000 (20:00 +0300)]
can: esd_usb2: fix memory leak
In esd_usb2_setup_rx_urbs() MAX_RX_URBS coherent buffers are allocated
and there is nothing, that frees them:
1) In callback function the urb is resubmitted and that's all
2) In disconnect function urbs are simply killed, but URB_FREE_BUFFER
is not set (see esd_usb2_setup_rx_urbs) and this flag cannot be used
with coherent buffers.
So, all allocated buffers should be freed with usb_free_coherent()
explicitly.
Side note: This code looks like a copy-paste of other can drivers. The
same patch was applied to mcba_usb driver and it works nice with real
hardware. There is no change in functionality, only clean-up code for
coherent buffers.
Fixes: 96d8e90382dc ("can: Add driver for esd CAN-USB/2 device")
Link: https://lore.kernel.org/r/b31b096926dcb35998ad0271aac4b51770ca7cc8.1627404470.git.paskripkin@gmail.com
Cc: linux-stable <stable@vger.kernel.org>
Signed-off-by: Pavel Skripkin <paskripkin@gmail.com>
Signed-off-by: Marc Kleine-Budde <mkl@pengutronix.de>
Pavel Skripkin [Tue, 27 Jul 2021 17:00:33 +0000 (20:00 +0300)]
can: ems_usb: fix memory leak
In ems_usb_start() MAX_RX_URBS coherent buffers are allocated and
there is nothing, that frees them:
1) In callback function the urb is resubmitted and that's all
2) In disconnect function urbs are simply killed, but URB_FREE_BUFFER
is not set (see ems_usb_start) and this flag cannot be used with
coherent buffers.
So, all allocated buffers should be freed with usb_free_coherent()
explicitly.
Side note: This code looks like a copy-paste of other can drivers. The
same patch was applied to mcba_usb driver and it works nice with real
hardware. There is no change in functionality, only clean-up code for
coherent buffers.
Fixes: 702171adeed3 ("ems_usb: Added support for EMS CPC-USB/ARM7 CAN/USB interface")
Link: https://lore.kernel.org/r/59aa9fbc9a8cbf9af2bbd2f61a659c480b415800.1627404470.git.paskripkin@gmail.com
Cc: linux-stable <stable@vger.kernel.org>
Signed-off-by: Pavel Skripkin <paskripkin@gmail.com>
Signed-off-by: Marc Kleine-Budde <mkl@pengutronix.de>
Pavel Skripkin [Tue, 27 Jul 2021 16:59:57 +0000 (19:59 +0300)]
can: usb_8dev: fix memory leak
In usb_8dev_start() MAX_RX_URBS coherent buffers are allocated and
there is nothing, that frees them:
1) In callback function the urb is resubmitted and that's all
2) In disconnect function urbs are simply killed, but URB_FREE_BUFFER
is not set (see usb_8dev_start) and this flag cannot be used with
coherent buffers.
So, all allocated buffers should be freed with usb_free_coherent()
explicitly.
Side note: This code looks like a copy-paste of other can drivers. The
same patch was applied to mcba_usb driver and it works nice with real
hardware. There is no change in functionality, only clean-up code for
coherent buffers.
Fixes: 0024d8ad1639 ("can: usb_8dev: Add support for USB2CAN interface from 8 devices")
Link: https://lore.kernel.org/r/d39b458cd425a1cf7f512f340224e6e9563b07bd.1627404470.git.paskripkin@gmail.com
Cc: linux-stable <stable@vger.kernel.org>
Signed-off-by: Pavel Skripkin <paskripkin@gmail.com>
Signed-off-by: Marc Kleine-Budde <mkl@pengutronix.de>
Pavel Skripkin [Sun, 25 Jul 2021 10:36:30 +0000 (13:36 +0300)]
can: mcba_usb_start(): add missing urb->transfer_dma initialization
Yasushi reported, that his Microchip CAN Analyzer stopped working
since commit
91c02557174b ("can: mcba_usb: fix memory leak in
mcba_usb"). The problem was in missing urb->transfer_dma
initialization.
In my previous patch to this driver I refactored mcba_usb_start() code
to avoid leaking usb coherent buffers. To archive it, I passed local
stack variable to usb_alloc_coherent() and then saved it to private
array to correctly free all coherent buffers on ->close() call. But I
forgot to initialize urb->transfer_dma with variable passed to
usb_alloc_coherent().
All of this was causing device to not work, since dma addr 0 is not
valid and following log can be found on bug report page, which points
exactly to problem described above.
| DMAR: [DMA Write] Request device [00:14.0] PASID
ffffffff fault addr 0 [fault reason 05] PTE Write access is not set
Fixes: 91c02557174b ("can: mcba_usb: fix memory leak in mcba_usb")
Link: https://bugs.debian.org/cgi-bin/bugreport.cgi?bug=990850
Link: https://lore.kernel.org/r/20210725103630.23864-1-paskripkin@gmail.com
Cc: linux-stable <stable@vger.kernel.org>
Reported-by: Yasushi SHOJI <yasushi.shoji@gmail.com>
Signed-off-by: Pavel Skripkin <paskripkin@gmail.com>
Tested-by: Yasushi SHOJI <yashi@spacecubics.com>
[mkl: fixed typos in commit message - thanks Yasushi SHOJI]
Signed-off-by: Marc Kleine-Budde <mkl@pengutronix.de>
Dan Carpenter [Thu, 29 Jul 2021 14:12:46 +0000 (17:12 +0300)]
can: hi311x: fix a signedness bug in hi3110_cmd()
The hi3110_cmd() is supposed to return zero on success and negative
error codes on failure, but it was accidentally declared as a u8 when
it needs to be an int type.
Fixes: 57e83fb9b746 ("can: hi311x: Add Holt HI-311x CAN driver")
Link: https://lore.kernel.org/r/20210729141246.GA1267@kili
Signed-off-by: Dan Carpenter <dan.carpenter@oracle.com>
Signed-off-by: Marc Kleine-Budde <mkl@pengutronix.de>
Marc Kleine-Budde [Mon, 26 Jul 2021 09:26:44 +0000 (11:26 +0200)]
MAINTAINERS: add Yasushi SHOJI as reviewer for the Microchip CAN BUS Analyzer Tool driver
This patch adds Yasushi SHOJI as a reviewer for the Microchip CAN BUS
Analyzer Tool driver.
Link: https://lore.kernel.org/r/20210726111619.1023991-1-mkl@pengutronix.de
Acked-by: Yasushi SHOJI <yashi@spacecubics.com>
Signed-off-by: Marc Kleine-Budde <mkl@pengutronix.de>
Linus Torvalds [Fri, 30 Jul 2021 05:10:05 +0000 (22:10 -0700)]
Merge tag 'drm-fixes-2021-07-30' of git://anongit.freedesktop.org/drm/drm
Pull drm fixes from Dave Airlie:
"Regular drm fixes pull, seems about the right size, lots of small
fixes across the board, mostly amdgpu, but msm and i915 are in there
along with panel and ttm.
amdgpu:
- Fix resource leak in an error path
- Avoid stack contents exposure in error path
- pmops check fix for S0ix vs S3
- DCN 2.1 display fixes
- DCN 2.0 display fix
- Backlight control fix for laptops with HDR panels
- Maintainers updates
i915:
- Fix vbt port mask
- Fix around reading the right DSC disable fuse in display_ver 10
- Split display version 9 and 10 in intel_setup_outputs
msm:
- iommu fault display fix
- misc dp compliance fixes
- dpu reg sizing fix
panel:
- Fix bpc for ytc700tlag_05_201c
ttm:
- debugfs init fixes"
* tag 'drm-fixes-2021-07-30' of git://anongit.freedesktop.org/drm/drm:
maintainers: add bugs and chat URLs for amdgpu
drm/amdgpu/display: only enable aux backlight control for OLED panels
drm/amd/display: ensure dentist display clock update finished in DCN20
drm/amd/display: Add missing DCN21 IP parameter
drm/amd/display: Guard DST_Y_PREFETCH register overflow in DCN21
drm/amdgpu: Check pmops for desired suspend state
drm/msm/dp: Initialize dp->aux->drm_dev before registration
drm/msm/dp: signal audio plugged change at dp_pm_resume
drm/msm/dp: Initialize the INTF_CONFIG register
drm/msm/dp: use dp_ctrl_off_link_stream during PHY compliance test run
drm/msm: Fix display fault handling
drm/msm/dpu: Fix sm8250_mdp register length
drm/amdgpu: Avoid printing of stack contents on firmware load error
drm/amdgpu: Fix resource leak on probe error path
drm/i915/display: split DISPLAY_VER 9 and 10 in intel_setup_outputs()
drm/i915: fix not reading DSC disable fuse in GLK
drm/i915/bios: Fix ports mask
drm/panel: panel-simple: Fix proper bpc for ytc700tlag_05_201c
drm/ttm: Initialize debugfs from ttm_global_init()
Linus Torvalds [Fri, 30 Jul 2021 04:03:47 +0000 (21:03 -0700)]
Merge tag 'fallthrough-fixes-clang-5.14-rc4' of git://git./linux/kernel/git/gustavoars/linux
Pull fallthrough fixes from Gustavo Silva:
"Fix some fall-through warnings when building with Clang and
'-Wimplicit-fallthrough' on ARM"
* tag 'fallthrough-fixes-clang-5.14-rc4' of git://git.kernel.org/pub/scm/linux/kernel/git/gustavoars/linux:
scsi: fas216: Fix fall-through warning for Clang
scsi: acornscsi: Fix fall-through warning for clang
ARM: riscpc: Fix fall-through warning for Clang
Linus Torvalds [Fri, 30 Jul 2021 03:57:56 +0000 (20:57 -0700)]
Merge branch 'for-linus' of git://git./linux/kernel/git/mattst88/alpha
Pull alpha updates from Matt Turner:
"They're mostly small janitorial fixes but there's also more important
ones:
- drop the alpha-specific x86 binary loader (David Hildenbrand)
- regression fix for at least Marvel platforms (Mike Rapoport)
- fix for a scary-looking typo (Zheng Yongjun)"
* 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/mattst88/alpha:
alpha: register early reserved memory in memblock
alpha: fix spelling mistakes
alpha: Remove space between * and parameter name
alpha: fp_emul: avoid init/cleanup_module names
alpha: Add syscall_get_return_value()
binfmt: remove support for em86 (alpha only)
alpha: fix typos in a comment
alpha: defconfig: add necessary configs for boot testing
alpha: Send stop IPI to send to online CPUs
alpha: convert comma to semicolon
alpha: remove undef inline in compiler.h
alpha: Kconfig: Replace HTTP links with HTTPS ones
alpha: __udiv_qrnnd should be exported
Gustavo A. R. Silva [Mon, 26 Jul 2021 20:46:47 +0000 (15:46 -0500)]
scsi: fas216: Fix fall-through warning for Clang
Fix the following fallthrough warning (on ARM):
drivers/scsi/arm/fas216.c:1379:2: warning: unannotated fall-through between switch labels [-Wimplicit-fallthrough]
default:
^
drivers/scsi/arm/fas216.c:1379:2: note: insert 'break;' to avoid fall-through
default:
^
break;
Reported-by: kernel test robot <lkp@intel.com>
Link: https://lore.kernel.org/lkml/202107260355.bF00i5bi-lkp@intel.com/
Signed-off-by: Gustavo A. R. Silva <gustavoars@kernel.org>
Gustavo A. R. Silva [Mon, 26 Jul 2021 20:33:53 +0000 (15:33 -0500)]
scsi: acornscsi: Fix fall-through warning for clang
Fix the following fallthrough warning (on ARM):
drivers/scsi/arm/acornscsi.c:2651:2: warning: unannotated fall-through between switch labels [-Wimplicit-fallthrough]
case res_success:
^
drivers/scsi/arm/acornscsi.c:2651:2: note: insert '__attribute__((fallthrough));' to silence this warning
case res_success:
^
__attribute__((fallthrough));
drivers/scsi/arm/acornscsi.c:2651:2: note: insert 'break;' to avoid fall-through
case res_success:
^
break;
Reported-by: kernel test robot <lkp@intel.com>
Link: https://lore.kernel.org/lkml/202107260355.bF00i5bi-lkp@intel.com/
Signed-off-by: Gustavo A. R. Silva <gustavoars@kernel.org>
Gustavo A. R. Silva [Mon, 26 Jul 2021 20:06:22 +0000 (15:06 -0500)]
ARM: riscpc: Fix fall-through warning for Clang
Fix the following fallthrough warning:
arch/arm/mach-rpc/riscpc.c:52:2: warning: unannotated fall-through between switch labels [-Wimplicit-fallthrough]
default:
^
arch/arm/mach-rpc/riscpc.c:52:2: note: insert 'break;' to avoid fall-through
default:
^
break;
Reported-by: kernel test robot <lkp@intel.com>
Link: https://lore.kernel.org/lkml/202107260355.bF00i5bi-lkp@intel.com/
Signed-off-by: Gustavo A. R. Silva <gustavoars@kernel.org>
Linus Torvalds [Thu, 29 Jul 2021 16:42:09 +0000 (09:42 -0700)]
Merge tag 'for-linus' of git://git./virt/kvm/kvm
Pull kvm fixes from Paolo Bonzini:
"ARM:
- Fix MTE shared page detection
- Enable selftest's use of PMU registers when asked to
s390:
- restore 5.13 debugfs names
x86:
- fix sizes for vcpu-id indexed arrays
- fixes for AMD virtualized LAPIC (AVIC)
- other small bugfixes
Generic:
- access tracking performance test
- dirty_log_perf_test command line parsing fix
- Fix selftest use of obsolete pthread_yield() in favour of
sched_yield()
- use cpu_relax when halt polling
- fixed missing KVM_CLEAR_DIRTY_LOG compat ioctl"
* tag 'for-linus' of git://git.kernel.org/pub/scm/virt/kvm/kvm:
KVM: add missing compat KVM_CLEAR_DIRTY_LOG
KVM: use cpu_relax when halt polling
KVM: SVM: use vmcb01 in svm_refresh_apicv_exec_ctrl
KVM: SVM: tweak warning about enabled AVIC on nested entry
KVM: SVM: svm_set_vintr don't warn if AVIC is active but is about to be deactivated
KVM: s390: restore old debugfs names
KVM: SVM: delay svm_vcpu_init_msrpm after svm->vmcb is initialized
KVM: selftests: Introduce access_tracking_perf_test
KVM: selftests: Fix missing break in dirty_log_perf_test arg parsing
x86/kvm: fix vcpu-id indexed array sizes
KVM: x86: Check the right feature bit for MSR_KVM_ASYNC_PF_ACK access
docs: virt: kvm: api.rst: replace some characters
KVM: Documentation: Fix KVM_CAP_ENFORCE_PV_FEATURE_CPUID name
KVM: nSVM: Swap the parameter order for svm_copy_vmrun_state()/svm_copy_vmloadsave_state()
KVM: nSVM: Rename nested_svm_vmloadsave() to svm_copy_vmloadsave_state()
KVM: arm64: selftests: get-reg-list: actually enable pmu regs in pmu sublist
KVM: selftests: change pthread_yield to sched_yield
KVM: arm64: Fix detection of shared VMAs on guest fault
Linus Torvalds [Thu, 29 Jul 2021 16:28:24 +0000 (09:28 -0700)]
Merge branch 'for-linus' of git://git./linux/kernel/git/gerg/m68knommu
Pull m68knommu fix from Greg Ungerer:
"A single compile time fix"
* 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/gerg/m68knommu:
m68k/coldfire: change pll var. to clk_pll
Darrick J. Wong [Mon, 26 Jul 2021 23:43:17 +0000 (16:43 -0700)]
xfs: prevent spoofing of rtbitmap blocks when recovering buffers
While reviewing the buffer item recovery code, the thought occurred to
me: in V5 filesystems we use log sequence number (LSN) tracking to avoid
replaying older metadata updates against newer log items. However, we
use the magic number of the ondisk buffer to find the LSN of the ondisk
metadata, which means that if an attacker can control the layout of the
realtime device precisely enough that the start of an rt bitmap block
matches the magic and UUID of some other kind of block, they can control
the purported LSN of that spoofed block and thereby break log replay.
Since realtime bitmap and summary blocks don't have headers at all, we
have no way to tell if a block really should be replayed. The best we
can do is replay unconditionally and hope for the best.
Signed-off-by: Darrick J. Wong <djwong@kernel.org>
Reviewed-by: Dave Chinner <dchinner@redhat.com>
Reviewed-by: Carlos Maiolino <cmaiolino@redhat.com>
Dave Chinner [Thu, 29 Jul 2021 00:14:11 +0000 (17:14 -0700)]
xfs: limit iclog tail updates
From the department of "generic/482 keeps on giving", we bring you
another tail update race condition:
iclog:
S1 C1
+-----------------------+-----------------------+
S2 EOIC
Two checkpoints in a single iclog. One is complete, the other just
contains the start record and overruns into a new iclog.
Timeline:
Before S1: Cache flush, log tail = X
At S1: Metadata stable, write start record and checkpoint
At C1: Write commit record, set NEED_FUA
Single iclog checkpoint, so no need for NEED_FLUSH
Log tail still = X, so no need for NEED_FLUSH
After C1,
Before S2: Cache flush, log tail = X
At S2: Metadata stable, write start record and checkpoint
After S2: Log tail moves to X+1
At EOIC: End of iclog, more journal data to write
Releases iclog
Not a commit iclog, so no need for NEED_FLUSH
Writes log tail X+1 into iclog.
At this point, the iclog has tail X+1 and NEED_FUA set. There has
been no cache flush for the metadata between X and X+1, and the
iclog writes the new tail permanently to the log. THis is sufficient
to violate on disk metadata/journal ordering.
We have two options here. The first is to detect this case in some
manner and ensure that the partial checkpoint write sets NEED_FLUSH
when the iclog is already marked NEED_FUA and the log tail changes.
This seems somewhat fragile and quite complex to get right, and it
doesn't actually make it obvious what underlying problem it is
actually addressing from reading the code.
The second option seems much cleaner to me, because it is derived
directly from the requirements of the C1 commit record in the iclog.
That is, when we write this commit record to the iclog, we've
guaranteed that the metadata/data ordering is correct for tail
update purposes. Hence if we only write the log tail into the iclog
for the *first* commit record rather than the log tail at the last
release, we guarantee that the log tail does not move past where the
the first commit record in the log expects it to be.
IOWs, taking the first option means that replay of C1 becomes
dependent on future operations doing the right thing, not just the
C1 checkpoint itself doing the right thing. This makes log recovery
almost impossible to reason about because now we have to take into
account what might or might not have happened in the future when
looking at checkpoints in the log rather than just having to
reconstruct the past...
Signed-off-by: Dave Chinner <dchinner@redhat.com>
Reviewed-by: Darrick J. Wong <djwong@kernel.org>
Signed-off-by: Darrick J. Wong <djwong@kernel.org>
Dave Chinner [Tue, 27 Jul 2021 23:23:50 +0000 (16:23 -0700)]
xfs: need to see iclog flags in tracing
Because I cannot tell if the NEED_FLUSH flag is being set correctly
by the log force and CIL push machinery without it.
Signed-off-by: Dave Chinner <dchinner@redhat.com>
Reviewed-by: Christoph Hellwig <hch@lst.de>
Reviewed-by: Darrick J. Wong <djwong@kernel.org>
Signed-off-by: Darrick J. Wong <djwong@kernel.org>
Dave Chinner [Tue, 27 Jul 2021 23:23:50 +0000 (16:23 -0700)]
xfs: Enforce attr3 buffer recovery order
From the department of "WTAF? How did we miss that!?"...
When we are recovering a buffer, the first thing we do is check the
buffer magic number and extract the LSN from the buffer. If the LSN
is older than the current LSN, we replay the modification to it. If
the metadata on disk is newer than the transaction in the log, we
skip it. This is a fundamental v5 filesystem metadata recovery
behaviour.
generic/482 failed with an attribute writeback failure during log
recovery. The write verifier caught the corruption before it got
written to disk, and the attr buffer dump looked like:
XFS (dm-3): Metadata corruption detected at xfs_attr3_leaf_verify+0x275/0x2e0, xfs_attr3_leaf block 0x19be8
XFS (dm-3): Unmount and run xfs_repair
XFS (dm-3): First 128 bytes of corrupted metadata buffer:
00000000: 00 00 00 00 00 00 00 00 3b ee 00 00 4d 2a 01 e1 ........;...M*..
00000010: 00 00 00 00 00 01 9b e8 00 00 00 01 00 00 05 38 ...............8
^^^^^^^^^^^^^^^^^^^^^^^
00000020: df 39 5e 51 58 ac 44 b6 8d c5 e7 10 44 09 bc 17 .9^QX.D.....D...
00000030: 00 00 00 00 00 02 00 83 00 03 00 cc 0f 24 01 00 .............$..
00000040: 00 68 0e bc 0f c8 00 10 00 00 00 00 00 00 00 00 .h..............
00000050: 00 00 3c 31 0f 24 01 00 00 00 3c 32 0f 88 01 00 ..<1.$....<2....
00000060: 00 00 3c 33 0f d8 01 00 00 00 00 00 00 00 00 00 ..<3............
00000070: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 ................
.....
The highlighted bytes are the LSN that was replayed into the
buffer: 0x100000538. This is cycle 1, block 0x538. Prior to replay,
that block on disk looks like this:
$ sudo xfs_db -c "fsb 0x417d" -c "type attr3" -c p /dev/mapper/thin-vol
hdr.info.hdr.forw = 0
hdr.info.hdr.back = 0
hdr.info.hdr.magic = 0x3bee
hdr.info.crc = 0xb5af0bc6 (correct)
hdr.info.bno = 105448
hdr.info.lsn = 0x100000900
^^^^^^^^^^^
hdr.info.uuid =
df395e51-58ac-44b6-8dc5-
e7104409bc17
hdr.info.owner = 131203
hdr.count = 2
hdr.usedbytes = 120
hdr.firstused = 3796
hdr.holes = 1
hdr.freemap[0-2] = [base,size]
Note the LSN stamped into the buffer on disk: 1/0x900. The version
on disk is much newer than the log transaction that was being
replayed. That's a bug, and should -never- happen.
So I immediately went to look at xlog_recover_get_buf_lsn() to check
that we handled the LSN correctly. I was wondering if there was a
similar "two commits with the same start LSN skips the second
replay" problem with buffers. I didn't get that far, because I found
a much more basic, rudimentary bug: xlog_recover_get_buf_lsn()
doesn't recognise buffers with XFS_ATTR3_LEAF_MAGIC set in them!!!
IOWs, attr3 leaf buffers fall through the magic number checks
unrecognised, so trigger the "recover immediately" behaviour instead
of undergoing an LSN check. IOWs, we incorrectly replay ATTR3 leaf
buffers and that causes silent on disk corruption of inode attribute
forks and potentially other things....
Git history shows this is *another* zero day bug, this time
introduced in commit
50d5c8d8e938 ("xfs: check LSN ordering for v5
superblocks during recovery") which failed to handle the attr3 leaf
buffers in recovery. And we've failed to handle them ever since...
Signed-off-by: Dave Chinner <dchinner@redhat.com>
Reviewed-by: Christoph Hellwig <hch@lst.de>
Reviewed-by: Darrick J. Wong <djwong@kernel.org>
Signed-off-by: Darrick J. Wong <djwong@kernel.org>
Dave Chinner [Tue, 27 Jul 2021 23:23:49 +0000 (16:23 -0700)]
xfs: logging the on disk inode LSN can make it go backwards
When we log an inode, we format the "log inode" core and set an LSN
in that inode core. We do that via xfs_inode_item_format_core(),
which calls:
xfs_inode_to_log_dinode(ip, dic, ip->i_itemp->ili_item.li_lsn);
to format the log inode. It writes the LSN from the inode item into
the log inode, and if recovery decides the inode item needs to be
replayed, it recovers the log inode LSN field and writes it into the
on disk inode LSN field.
Now this might seem like a reasonable thing to do, but it is wrong
on multiple levels. Firstly, if the item is not yet in the AIL,
item->li_lsn is zero. i.e. the first time the inode it is logged and
formatted, the LSN we write into the log inode will be zero. If we
only log it once, recovery will run and can write this zero LSN into
the inode.
This means that the next time the inode is logged and log recovery
runs, it will *always* replay changes to the inode regardless of
whether the inode is newer on disk than the version in the log and
that violates the entire purpose of recording the LSN in the inode
at writeback time (i.e. to stop it going backwards in time on disk
during recovery).
Secondly, if we commit the CIL to the journal so the inode item
moves to the AIL, and then relog the inode, the LSN that gets
stamped into the log inode will be the LSN of the inode's current
location in the AIL, not it's age on disk. And it's not the LSN that
will be associated with the current change. That means when log
recovery replays this inode item, the LSN that ends up on disk is
the LSN for the previous changes in the log, not the current
changes being replayed. IOWs, after recovery the LSN on disk is not
in sync with the LSN of the modifications that were replayed into
the inode. This, again, violates the recovery ordering semantics
that on-disk writeback LSNs provide.
Hence the inode LSN in the log dinode is -always- invalid.
Thirdly, recovery actually has the LSN of the log transaction it is
replaying right at hand - it uses it to determine if it should
replay the inode by comparing it to the on-disk inode's LSN. But it
doesn't use that LSN to stamp the LSN into the inode which will be
written back when the transaction is fully replayed. It uses the one
in the log dinode, which we know is always going to be incorrect.
Looking back at the change history, the inode logging was broken by
commit
93f958f9c41f ("xfs: cull unnecessary icdinode fields") way
back in 2016 by a stupid idiot who thought he knew how this code
worked. i.e. me. That commit replaced an in memory di_lsn field that
was updated only at inode writeback time from the inode item.li_lsn
value - and hence always contained the same LSN that appeared in the
on-disk inode - with a read of the inode item LSN at inode format
time. CLearly these are not the same thing.
Before
93f958f9c41f, the log recovery behaviour was irrelevant,
because the LSN in the log inode always matched the on-disk LSN at
the time the inode was logged, hence recovery of the transaction
would never make the on-disk LSN in the inode go backwards or get
out of sync.
A symptom of the problem is this, caught from a failure of
generic/482. Before log recovery, the inode has been allocated but
never used:
xfs_db> inode 393388
xfs_db> p
core.magic = 0x494e
core.mode = 0
....
v3.crc = 0x99126961 (correct)
v3.change_count = 0
v3.lsn = 0
v3.flags2 = 0
v3.cowextsize = 0
v3.crtime.sec = Thu Jan 1 10:00:00 1970
v3.crtime.nsec = 0
After log recovery:
xfs_db> p
core.magic = 0x494e
core.mode = 020444
....
v3.crc = 0x23e68f23 (correct)
v3.change_count = 2
v3.lsn = 0
v3.flags2 = 0
v3.cowextsize = 0
v3.crtime.sec = Thu Jul 22 17:03:03 2021
v3.crtime.nsec =
751000000
...
You can see that the LSN of the on-disk inode is 0, even though it
clearly has been written to disk. I point out this inode, because
the generic/482 failure occurred because several adjacent inodes in
this specific inode cluster were not replayed correctly and still
appeared to be zero on disk when all the other metadata (inobt,
finobt, directories, etc) indicated they should be allocated and
written back.
The fix for this is two-fold. The first is that we need to either
revert the LSN changes in
93f958f9c41f or stop logging the inode LSN
altogether. If we do the former, log recovery does not need to
change but we add 8 bytes of memory per inode to store what is
largely a write-only inode field. If we do the latter, log recovery
needs to stamp the on-disk inode in the same manner that inode
writeback does.
I prefer the latter, because we shouldn't really be trying to log
and replay changes to the on disk LSN as the on-disk value is the
canonical source of the on-disk version of the inode. It also
matches the way we recover buffer items - we create a buf_log_item
that carries the current recovery transaction LSN that gets stamped
into the buffer by the write verifier when it gets written back
when the transaction is fully recovered.
However, this might break log recovery on older kernels even more,
so I'm going to simply ignore the logged value in recovery and stamp
the on-disk inode with the LSN of the transaction being recovered
that will trigger writeback on transaction recovery completion. This
will ensure that the on-disk inode LSN always reflects the LSN of
the last change that was written to disk, regardless of whether it
comes from log recovery or runtime writeback.
Fixes: 93f958f9c41f ("xfs: cull unnecessary icdinode fields")
Signed-off-by: Dave Chinner <dchinner@redhat.com>
Reviewed-by: Darrick J. Wong <djwong@kernel.org>
Signed-off-by: Darrick J. Wong <djwong@kernel.org>
Dave Chinner [Tue, 27 Jul 2021 23:23:49 +0000 (16:23 -0700)]
xfs: avoid unnecessary waits in xfs_log_force_lsn()
Before waiting on a iclog in xfs_log_force_lsn(), we don't check to
see if the iclog has already been completed and the contents on
stable storage. We check for completed iclogs in xfs_log_force(), so
we should do the same thing for xfs_log_force_lsn().
This fixed some random up-to-30s pauses seen in unmounting
filesystems in some tests. A log force ends up waiting on completed
iclog, and that doesn't then get flushed (and hence the log force
get completed) until the background log worker issues a log force
that flushes the iclog in question. Then the unmount unblocks and
continues.
Signed-off-by: Dave Chinner <dchinner@redhat.com>
Reviewed-by: Christoph Hellwig <hch@lst.de>
Reviewed-by: Darrick J. Wong <djwong@kernel.org>
Signed-off-by: Darrick J. Wong <djwong@kernel.org>
Dave Chinner [Tue, 27 Jul 2021 23:23:49 +0000 (16:23 -0700)]
xfs: log forces imply data device cache flushes
After fixing the tail_lsn vs cache flush race, generic/482 continued
to fail in a similar way where cache flushes were missing before
iclog FUA writes. Tracing of iclog state changes during the fsstress
workload portion of the test (via xlog_iclog* events) indicated that
iclog writes were coming from two sources - CIL pushes and log
forces (due to fsync/O_SYNC operations). All of the cases where a
recovery problem was triggered indicated that the log force was the
source of the iclog write that was not preceeded by a cache flush.
This was an oversight in the modifications made in commit
eef983ffeae7 ("xfs: journal IO cache flush reductions"). Log forces
for fsync imply a data device cache flush has been issued if an
iclog was flushed to disk and is indicated to the caller via the
log_flushed parameter so they can elide the device cache flush if
the journal issued one.
The change in
eef983ffeae7 results in iclogs only issuing a cache
flush if XLOG_ICL_NEED_FLUSH is set on the iclog, but this was not
added to the iclogs that the log force code flushes to disk. Hence
log forces are no longer guaranteeing that a cache flush is issued,
hence opening up a potential on-disk ordering failure.
Log forces should also set XLOG_ICL_NEED_FUA as well to ensure that
the actual iclogs it forces to the journal are also on stable
storage before it returns to the caller.
This patch introduces the xlog_force_iclog() helper function to
encapsulate the process of taking a reference to an iclog, switching
its state if WANT_SYNC and flushing it to stable storage correctly.
Both xfs_log_force() and xfs_log_force_lsn() are converted to use
it, as is xlog_unmount_write() which has an elaborate method of
doing exactly the same "write this iclog to stable storage"
operation.
Further, if the log force code needs to wait on a iclog in the
WANT_SYNC state, it needs to ensure that iclog also results in a
cache flush being issued. This covers the case where the iclog
contains the commit record of the CIL flush that the log force
triggered, but it hasn't been written yet because there is still an
active reference to the iclog.
Note: this whole cache flush whack-a-mole patch is a result of log
forces still being iclog state centric rather than being CIL
sequence centric. Most of this nasty code will go away in future
when log forces are converted to wait on CIL sequence push
completion rather than iclog completion. With the CIL push algorithm
guaranteeing that the CIL checkpoint is fully on stable storage when
it completes, we no longer need to iterate iclogs and push them to
ensure a CIL sequence push has completed and so all this nasty iclog
iteration and flushing code will go away.
Fixes: eef983ffeae7 ("xfs: journal IO cache flush reductions")
Signed-off-by: Dave Chinner <dchinner@redhat.com>
Reviewed-by: Christoph Hellwig <hch@lst.de>
Reviewed-by: Darrick J. Wong <djwong@kernel.org>
Signed-off-by: Darrick J. Wong <djwong@kernel.org>
Dave Chinner [Tue, 27 Jul 2021 23:23:48 +0000 (16:23 -0700)]
xfs: factor out forced iclog flushes
We force iclogs in several places - we need them all to have the
same cache flush semantics, so start by factoring out the iclog
force into a common helper.
Signed-off-by: Dave Chinner <dchinner@redhat.com>
Reviewed-by: Christoph Hellwig <hch@lst.de>
Reviewed-by: Darrick J. Wong <djwong@kernel.org>
Signed-off-by: Darrick J. Wong <djwong@kernel.org>