Krzysztof Kozlowski [Fri, 8 Dec 2023 10:51:55 +0000 (11:51 +0100)]
 
arm64: dts: qcom: sm8150: add necessary ref clock to PCIe
The PCIe nodes should get the ref clock, according to information from
Qualcomm.
Link: https://lore.kernel.org/all/20231121065440.GB3315@thinkpad/
Signed-off-by: Krzysztof Kozlowski <krzysztof.kozlowski@linaro.org>
Link: https://lore.kernel.org/r/20231208105155.36097-4-krzysztof.kozlowski@linaro.org
Signed-off-by: Bjorn Andersson <andersson@kernel.org>
Konrad Dybcio [Tue, 2 Jan 2024 13:34:16 +0000 (14:34 +0100)]
 
arm64: dts: qcom: sdm630: Hook up GPU cooling device
In order to allow for throttling the GPU, hook up the cooling device
to the respective thermal zones.
Signed-off-by: Konrad Dybcio <konrad.dybcio@linaro.org>
Reviewed-by: Dmitry Baryshkov <dmitry.baryshkov@linaro.org>
Link: https://lore.kernel.org/r/20240102-topic-gpu_cooling-v1-12-fda30c57e353@linaro.org
Signed-off-by: Bjorn Andersson <andersson@kernel.org>
Konrad Dybcio [Tue, 2 Jan 2024 13:34:15 +0000 (14:34 +0100)]
 
arm64: dts: qcom: sm8550: Hook up GPU cooling device
In order to allow for throttling the GPU, hook up the cooling device
to the respective thermal zones. Also, unify the naming scheme of the
thermal zones across the tree while at it.
Signed-off-by: Konrad Dybcio <konrad.dybcio@linaro.org>
Reviewed-by: Dmitry Baryshkov <dmitry.baryshkov@linaro.org>
Link: https://lore.kernel.org/r/20240102-topic-gpu_cooling-v1-11-fda30c57e353@linaro.org
Signed-off-by: Bjorn Andersson <andersson@kernel.org>
Konrad Dybcio [Tue, 2 Jan 2024 13:34:14 +0000 (14:34 +0100)]
 
arm64: dts: qcom: sm8450: Hook up GPU cooling device
In order to allow for throttling the GPU, hook up the cooling device
to the respective thermal zones. Also, update the trip point label
to be more telling, while at it.
Signed-off-by: Konrad Dybcio <konrad.dybcio@linaro.org>
Reviewed-by: Dmitry Baryshkov <dmitry.baryshkov@linaro.org>
Link: https://lore.kernel.org/r/20240102-topic-gpu_cooling-v1-10-fda30c57e353@linaro.org
Signed-off-by: Bjorn Andersson <andersson@kernel.org>
Konrad Dybcio [Tue, 2 Jan 2024 13:34:13 +0000 (14:34 +0100)]
 
arm64: dts: qcom: sm8350: Hook up GPU cooling device
In order to allow for throttling the GPU, hook up the cooling device
to the respective thermal zones. Also, update the trip point label
to be more telling, while at it.
Signed-off-by: Konrad Dybcio <konrad.dybcio@linaro.org>
Reviewed-by: Dmitry Baryshkov <dmitry.baryshkov@linaro.org>
Link: https://lore.kernel.org/r/20240102-topic-gpu_cooling-v1-9-fda30c57e353@linaro.org
Signed-off-by: Bjorn Andersson <andersson@kernel.org>
Konrad Dybcio [Tue, 2 Jan 2024 13:34:12 +0000 (14:34 +0100)]
 
arm64: dts: qcom: sm8250: Hook up GPU cooling device
In order to allow for throttling the GPU, hook up the cooling device
to the respective thermal zones. Also, update the trip point label
to be more telling, while at it.
Signed-off-by: Konrad Dybcio <konrad.dybcio@linaro.org>
Reviewed-by: Dmitry Baryshkov <dmitry.baryshkov@linaro.org>
Link: https://lore.kernel.org/r/20240102-topic-gpu_cooling-v1-8-fda30c57e353@linaro.org
Signed-off-by: Bjorn Andersson <andersson@kernel.org>
Konrad Dybcio [Tue, 2 Jan 2024 13:34:11 +0000 (14:34 +0100)]
 
arm64: dts: qcom: sm8150: Hook up GPU cooling device
In order to allow for throttling the GPU, hook up the cooling device
to the respective thermal zones. Also, update the trip point label
to be more telling, while at it.
Signed-off-by: Konrad Dybcio <konrad.dybcio@linaro.org>
Reviewed-by: Dmitry Baryshkov <dmitry.baryshkov@linaro.org>
Link: https://lore.kernel.org/r/20240102-topic-gpu_cooling-v1-7-fda30c57e353@linaro.org
Signed-off-by: Bjorn Andersson <andersson@kernel.org>
Konrad Dybcio [Tue, 2 Jan 2024 13:34:10 +0000 (14:34 +0100)]
 
arm64: dts: qcom: sm6115: Mark GPU @ 125C critical
If the GPU ever reaches this temperature, the "critical" signal shuold
definitely be propagated. Fix the wrong type.
Signed-off-by: Konrad Dybcio <konrad.dybcio@linaro.org>
Reviewed-by: Dmitry Baryshkov <dmitry.baryshkov@linaro.org>
Link: https://lore.kernel.org/r/20240102-topic-gpu_cooling-v1-6-fda30c57e353@linaro.org
Signed-off-by: Bjorn Andersson <andersson@kernel.org>
Konrad Dybcio [Tue, 2 Jan 2024 13:34:09 +0000 (14:34 +0100)]
 
arm64: dts: qcom: sm6115: Hook up GPU cooling device
In order to allow for throttling the GPU, hook up the cooling device
to the respective thermal zones.
Signed-off-by: Konrad Dybcio <konrad.dybcio@linaro.org>
Reviewed-by: Dmitry Baryshkov <dmitry.baryshkov@linaro.org>
Link: https://lore.kernel.org/r/20240102-topic-gpu_cooling-v1-5-fda30c57e353@linaro.org
Signed-off-by: Bjorn Andersson <andersson@kernel.org>
Konrad Dybcio [Tue, 2 Jan 2024 13:34:08 +0000 (14:34 +0100)]
 
arm64: dts: qcom: sdm845: Hook up GPU cooling device
In order to allow for throttling the GPU, hook up the cooling device
to the respective thermal zones.
Signed-off-by: Konrad Dybcio <konrad.dybcio@linaro.org>
Reviewed-by: Dmitry Baryshkov <dmitry.baryshkov@linaro.org>
Link: https://lore.kernel.org/r/20240102-topic-gpu_cooling-v1-4-fda30c57e353@linaro.org
Signed-off-by: Bjorn Andersson <andersson@kernel.org>
Konrad Dybcio [Tue, 2 Jan 2024 13:34:07 +0000 (14:34 +0100)]
 
arm64: dts: qcom: sc8180x: Hook up GPU cooling device
In order to allow for throttling the GPU, hook up the cooling device
to the respective thermal zones.
Signed-off-by: Konrad Dybcio <konrad.dybcio@linaro.org>
Reviewed-by: Dmitry Baryshkov <dmitry.baryshkov@linaro.org>
Link: https://lore.kernel.org/r/20240102-topic-gpu_cooling-v1-3-fda30c57e353@linaro.org
Signed-off-by: Bjorn Andersson <andersson@kernel.org>
Konrad Dybcio [Tue, 2 Jan 2024 13:34:06 +0000 (14:34 +0100)]
 
arm64: dts: qcom: msm8939: Hook up GPU cooling device
In order to allow for throttling the GPU, hook up the cooling device
to the respective thermal zones.
Signed-off-by: Konrad Dybcio <konrad.dybcio@linaro.org>
Reviewed-by: Dmitry Baryshkov <dmitry.baryshkov@linaro.org>
Link: https://lore.kernel.org/r/20240102-topic-gpu_cooling-v1-2-fda30c57e353@linaro.org
Signed-off-by: Bjorn Andersson <andersson@kernel.org>
Konrad Dybcio [Tue, 2 Jan 2024 13:34:05 +0000 (14:34 +0100)]
 
arm64: dts: qcom: msm8916: Hook up GPU cooling device
In order to allow for throttling the GPU, hook up the cooling device
to the respective thermal zones.
Signed-off-by: Konrad Dybcio <konrad.dybcio@linaro.org>
Reviewed-by: Dmitry Baryshkov <dmitry.baryshkov@linaro.org>
Link: https://lore.kernel.org/r/20240102-topic-gpu_cooling-v1-1-fda30c57e353@linaro.org
Signed-off-by: Bjorn Andersson <andersson@kernel.org>
Konrad Dybcio [Tue, 2 Jan 2024 18:29:50 +0000 (19:29 +0100)]
 
arm64: dts: qcom: x1e80100: Flush RSC sleep & wake votes
The RPMh driver will cache sleep and wake votes until the cluster
power-domain is about to enter idle, to avoid unnecessary writes. So
associate the apps_rsc with the cluster pd, so that it can be notified
about this event.
Without this, only AMC votes are being committed.
Fixes: af16b00578a7 ("arm64: dts: qcom: Add base X1E80100 dtsi and the QCP dts")
Signed-off-by: Konrad Dybcio <konrad.dybcio@linaro.org>
Reviewed-by: Abel Vesa <abel.vesa@linaro.org>
Link: https://lore.kernel.org/r/20240102-topic-x1e_fixes-v1-4-70723e08d5f6@linaro.org
Signed-off-by: Bjorn Andersson <andersson@kernel.org>
Konrad Dybcio [Tue, 2 Jan 2024 18:29:49 +0000 (19:29 +0100)]
 
arm64: dts: qcom: x1e80100: Add missing system-wide PSCI power domain
Previous Qualcomm SoCs over the past couple years have used the Arm DSU
architecture, which basically unified the meaning of the "cluster" and
"system". This is however clearly not the case on X1E, as can be seen
by three separate cluster power domains.
Add the lacking system-level power domain. For now it's going to be
always-on, as no system-wide idle states are defined at the moment.
Signed-off-by: Konrad Dybcio <konrad.dybcio@linaro.org>
Reviewed-by: Abel Vesa <abel.vesa@linaro.org>
Link: https://lore.kernel.org/r/20240102-topic-x1e_fixes-v1-3-70723e08d5f6@linaro.org
Signed-off-by: Bjorn Andersson <andersson@kernel.org>
Krishna Kurapati [Thu, 25 Jan 2024 18:59:21 +0000 (00:29 +0530)]
 
arm64: dts: qcom: Add missing interrupts for qcs404/ipq5332
For qcs404 and ipq5332, certain interrupts are missing in DT.
Add them to ensure they are in accordance to bindings.
The interrupts added enable remote wakeup functionality for these SoCs.
Signed-off-by: Krishna Kurapati <quic_kriskura@quicinc.com>
Link: https://lore.kernel.org/r/20240125185921.5062-5-quic_kriskura@quicinc.com
Signed-off-by: Bjorn Andersson <andersson@kernel.org>
Krishna Kurapati [Thu, 25 Jan 2024 18:59:20 +0000 (00:29 +0530)]
 
arm64: dts: qcom: Fix hs_phy_irq for SDM670/SDM845/SM6350
For sm6350/sdm670/sdm845, although they are qusb2 phy targets, dp/dm
interrupts are used for wakeup instead of qusb2_phy irq. These targets
were part of a generation that were the last ones to implement QUSB2 PHY
and the design incorporated dedicated DP/DM interrupts which eventually
carried forward to the newer femto based targets.
Add the missing pwr_event irq for these targets. Also modify order of
interrupts in accordance to bindings update. Modifying the order of these
interrupts is harmless as the driver tries to get these interrupts from DT
by name and not by index.
Signed-off-by: Krishna Kurapati <quic_kriskura@quicinc.com>
Link: https://lore.kernel.org/r/20240125185921.5062-4-quic_kriskura@quicinc.com
Signed-off-by: Bjorn Andersson <andersson@kernel.org>
Krishna Kurapati [Thu, 25 Jan 2024 18:59:19 +0000 (00:29 +0530)]
 
arm64: dts: qcom: Fix hs_phy_irq for non-QUSB2 targets
On non-QUSB2 targets (like the ones that use femto phys, M31 phy, eusb2
phy), many of the QCOM DTs are missing the IRQ for either hs_phy_irq or
pwr_event. In one case, the hs_phy_irq was incorrectly defined with the
latter's IRQ number. Since the DT must describe the hw whether or not
the driver uses these interrupts, fix and add the missing entries in order
to describe the HW completely and accurately.
Also modify order of interrupts in accordance to bindings update.
Signed-off-by: Krishna Kurapati <quic_kriskura@quicinc.com>
Link: https://lore.kernel.org/r/20240125185921.5062-3-quic_kriskura@quicinc.com
Signed-off-by: Bjorn Andersson <andersson@kernel.org>
Krishna Kurapati [Thu, 25 Jan 2024 18:59:18 +0000 (00:29 +0530)]
 
arm64: dts: qcom: Fix hs_phy_irq for QUSB2 targets
On several QUSB2 Targets, the hs_phy_irq mentioned is actually
qusb2_phy interrupt specific to QUSB2 PHY's. Rename hs_phy_irq
to qusb2_phy for such targets.
In actuality, the hs_phy_irq is also present in these targets, but
kept in for debug purposes in hw test environments. This is not
triggered by default and its functionality is mutually exclusive
to that of qusb2_phy interrupt.
Add missing hs_phy_irq's, pwr_event irq's for QUSB2 PHY targets.
Add missing ss_phy_irq on some targets which allows for remote
wakeup to work on a Super Speed link.
Also modify order of interrupts in accordance to bindings update.
Since driver looks up for interrupts by name and not by index, it
is safe to modify order of these interrupts in the DT.
Signed-off-by: Krishna Kurapati <quic_kriskura@quicinc.com>
Link: https://lore.kernel.org/r/20240125185921.5062-2-quic_kriskura@quicinc.com
Signed-off-by: Bjorn Andersson <andersson@kernel.org>
Neil Armstrong [Thu, 25 Jan 2024 08:13:35 +0000 (09:13 +0100)]
 
arm64: dts: qcom: sm8550: add support for the SM8550-HDK board
The SM8550-HDK is an embedded development platforms for the
Snapdragon 8 Gen 2 SoC aka SM8550, with the following features:
- Qualcomm SM8550 SoC
- 16GiB On-board LPDDR5
- On-board WiFi 7 + Bluetooth 5.3/BLE
- On-board UFS4.0
- M.2 Key B+M Gen3x2 PCIe Slot
- HDMI Output
- USB-C Connector with DP Almode & Audio Accessory mode
- Micro-SDCard Slot
- Audio Jack with Playback and Microphone
- 2 On-board Analog microphones
- 2 On-board Speakers
- 96Boards Compatible Low-Speed and High-Speed connectors [1]
  - For Camera, Sensors and external Display cards
  - Compatible with the Linaro Debug board [2]
- SIM Slot for Modem
- Debug connectors
- 6x On-Board LEDs
Product Page: [3]
[1] https://www.96boards.org/specifications/
[2] https://git.codelinaro.org/linaro/qcomlt/debugboard
[3] https://www.lantronix.com/products/snapdragon-8-gen-2-mobile-hardware-development-kit/
Reviewed-by: Krzysztof Kozlowski <krzysztof.kozlowski@linaro.org>
Signed-off-by: Neil Armstrong <neil.armstrong@linaro.org>
Link: https://lore.kernel.org/r/20240125-topic-sm8550-upstream-hdk8550-v3-2-73bb5ef11cf8@linaro.org
Signed-off-by: Bjorn Andersson <andersson@kernel.org>
Neil Armstrong [Thu, 25 Jan 2024 08:13:34 +0000 (09:13 +0100)]
 
dt-bindings: arm: qcom: Document the HDK8550 board
Document the Qualcomm SM8550 based HDK (Hardware Development Kit)
embedded development platform designed by Qualcomm and sold by Lantronix.
[1] https://www.lantronix.com/products/snapdragon-8-gen-2-mobile-hardware-development-kit/
Reviewed-by: Krzysztof Kozlowski <krzysztof.kozlowski@linaro.org>
Signed-off-by: Neil Armstrong <neil.armstrong@linaro.org>
Link: https://lore.kernel.org/r/20240125-topic-sm8550-upstream-hdk8550-v3-1-73bb5ef11cf8@linaro.org
Signed-off-by: Bjorn Andersson <andersson@kernel.org>
Konrad Dybcio [Sat, 30 Dec 2023 00:05:11 +0000 (01:05 +0100)]
 
arm64: dts: qcom: sc8180x: Add RPMh sleep stats
Add the sleep stats node to enable peeking at the power collapse reports
coming from the AOSS.
Signed-off-by: Konrad Dybcio <konrad.dybcio@linaro.org>
Link: https://lore.kernel.org/r/20231230-topic-8180_more_fixes-v1-10-93b5c107ed43@linaro.org
Signed-off-by: Bjorn Andersson <andersson@kernel.org>
Konrad Dybcio [Sat, 30 Dec 2023 00:05:10 +0000 (01:05 +0100)]
 
arm64: dts: qcom: sc8180x: Shrink aoss_qmp register space size
The AOSS_QMP region is overallocated, bleeding into space that's supposed
to be used by other peripherals. Fix it.
Fixes: 8575f197b077 ("arm64: dts: qcom: Introduce the SC8180x platform")
Signed-off-by: Konrad Dybcio <konrad.dybcio@linaro.org>
Link: https://lore.kernel.org/r/20231230-topic-8180_more_fixes-v1-9-93b5c107ed43@linaro.org
Signed-off-by: Bjorn Andersson <andersson@kernel.org>
Konrad Dybcio [Sat, 30 Dec 2023 00:05:09 +0000 (01:05 +0100)]
 
arm64: dts: qcom: sc8180x: Add missing CPU<->MDP_CFG path
To guarantee the required resources are enabled, describe the
interconnect path between the MDSS and the CPU.
Fixes: 494dec9b6f54 ("arm64: dts: qcom: sc8180x: Add display and gpu nodes")
Signed-off-by: Konrad Dybcio <konrad.dybcio@linaro.org>
Link: https://lore.kernel.org/r/20231230-topic-8180_more_fixes-v1-8-93b5c107ed43@linaro.org
Signed-off-by: Bjorn Andersson <andersson@kernel.org>
Konrad Dybcio [Sat, 30 Dec 2023 00:05:08 +0000 (01:05 +0100)]
 
arm64: dts: qcom: sc8180x: Require LOW_SVS vote for MMCX if DISPCC is on
To ensure the PLLs are getting enough power, cast a vote with DISPCC so
that MMCX is at least at LOW_SVS.
Fixes: 494dec9b6f54 ("arm64: dts: qcom: sc8180x: Add display and gpu nodes")
Signed-off-by: Konrad Dybcio <konrad.dybcio@linaro.org>
Link: https://lore.kernel.org/r/20231230-topic-8180_more_fixes-v1-7-93b5c107ed43@linaro.org
Signed-off-by: Bjorn Andersson <andersson@kernel.org>
Konrad Dybcio [Sat, 30 Dec 2023 00:05:07 +0000 (01:05 +0100)]
 
arm64: dts: qcom: sc8180x: Don't hold MDP core clock at FMAX
There's an OPP table to handle this, drop the permanent vote.
Fixes: 494dec9b6f54 ("arm64: dts: qcom: sc8180x: Add display and gpu nodes")
Signed-off-by: Konrad Dybcio <konrad.dybcio@linaro.org>
Link: https://lore.kernel.org/r/20231230-topic-8180_more_fixes-v1-6-93b5c107ed43@linaro.org
Signed-off-by: Bjorn Andersson <andersson@kernel.org>
Konrad Dybcio [Sat, 30 Dec 2023 00:05:06 +0000 (01:05 +0100)]
 
arm64: dts: qcom: sc8180x: Fix eDP PHY power-domains
The (e)DP PHYs are powered by the MX line, not through the MDSS GDSC.
Fix that up.
Fixes: 494dec9b6f54 ("arm64: dts: qcom: sc8180x: Add display and gpu nodes")
Signed-off-by: Konrad Dybcio <konrad.dybcio@linaro.org>
Link: https://lore.kernel.org/r/20231230-topic-8180_more_fixes-v1-5-93b5c107ed43@linaro.org
Signed-off-by: Bjorn Andersson <andersson@kernel.org>
Konrad Dybcio [Sat, 30 Dec 2023 00:05:05 +0000 (01:05 +0100)]
 
arm64: dts: qcom: sc8180x: Add missing CPU off state
The CPUs can be powered off without pulling the plug from the rest of
the system. Describe the idle state responsible for this.
Fixes: 8575f197b077 ("arm64: dts: qcom: Introduce the SC8180x platform")
Signed-off-by: Konrad Dybcio <konrad.dybcio@linaro.org>
Link: https://lore.kernel.org/r/20231230-topic-8180_more_fixes-v1-4-93b5c107ed43@linaro.org
Signed-off-by: Bjorn Andersson <andersson@kernel.org>
Konrad Dybcio [Sat, 30 Dec 2023 00:05:04 +0000 (01:05 +0100)]
 
arm64: dts: qcom: sc8180x: Fix up big CPU idle state entry latency
The entry latency was oddly low.. Turns out somebody forgot about a
second '1'! Fix it.
Fixes: 8575f197b077 ("arm64: dts: qcom: Introduce the SC8180x platform")
Signed-off-by: Konrad Dybcio <konrad.dybcio@linaro.org>
Link: https://lore.kernel.org/r/20231230-topic-8180_more_fixes-v1-3-93b5c107ed43@linaro.org
Signed-off-by: Bjorn Andersson <andersson@kernel.org>
Konrad Dybcio [Sat, 30 Dec 2023 00:05:03 +0000 (01:05 +0100)]
 
arm64: dts: qcom: sc8180x: Hook up VDD_CX as GCC parent domain
Most of GCC is powered by the CX rail. Describe that relationship to
let the performance state requests trickle up the chain.
Fixes: 8575f197b077 ("arm64: dts: qcom: Introduce the SC8180x platform")
Signed-off-by: Konrad Dybcio <konrad.dybcio@linaro.org>
Link: https://lore.kernel.org/r/20231230-topic-8180_more_fixes-v1-2-93b5c107ed43@linaro.org
Signed-off-by: Bjorn Andersson <andersson@kernel.org>
Konrad Dybcio [Sat, 30 Dec 2023 00:05:02 +0000 (01:05 +0100)]
 
dt-bindings: clock: gcc-sc8180x: Add the missing CX power domain
The GCC block is (mostly) powered by the VDD_CX rail. Allow specifying
it in power-domains.
Signed-off-by: Konrad Dybcio <konrad.dybcio@linaro.org>
Acked-by: Krzysztof Kozlowski <krzysztof.kozlowski@linaro.org>
Link: https://lore.kernel.org/r/20231230-topic-8180_more_fixes-v1-1-93b5c107ed43@linaro.org
Signed-off-by: Bjorn Andersson <andersson@kernel.org>
Luca Weiss [Fri, 1 Dec 2023 09:33:20 +0000 (10:33 +0100)]
 
arm64: dts: qcom: qcm6490-fairphone-fp5: Enable venus node
Enable the venus node so that the video encoder/decoder will start
working.
Reviewed-by: Konrad Dybcio <konrad.dybcio@linaro.org>
Reviewed-by: Bryan O'Donoghue <bryan.odonoghue@linaro.org>
Reviewed-by: Vikash Garodia <quic_vgarodia@quicinc.com>
Signed-off-by: Luca Weiss <luca.weiss@fairphone.com>
Link: https://lore.kernel.org/r/20231201-sc7280-venus-pas-v3-3-bc132dc5fc30@fairphone.com
Signed-off-by: Bjorn Andersson <andersson@kernel.org>
Luca Weiss [Fri, 1 Dec 2023 09:33:19 +0000 (10:33 +0100)]
 
arm64: dts: qcom: sc7280: Move video-firmware to chrome-common
If the video-firmware node is present, the venus driver assumes we're on
a system that doesn't use TZ for starting venus, like on ChromeOS
devices.
Move the video-firmware node to chrome-common.dtsi so we can use venus
on a non-ChromeOS devices. We also need to move the secure SID 0x2184
for iommu since (on some boards) we cannot touch that.
At the same time also disable the venus node by default in the dtsi,
like it's done on other SoCs.
Reviewed-by: Bryan O'Donoghue <bryan.odonoghue@linaro.org>
Signed-off-by: Luca Weiss <luca.weiss@fairphone.com>
Reviewed-by: Vikash Garodia <quic_vgarodia@quicinc.com>
Link: https://lore.kernel.org/r/20231201-sc7280-venus-pas-v3-2-bc132dc5fc30@fairphone.com
Signed-off-by: Bjorn Andersson <andersson@kernel.org>
Krzysztof Kozlowski [Mon, 18 Dec 2023 14:50:50 +0000 (15:50 +0100)]
 
arm64: dts: qcom: x1e80100: drop qcom,drv-count
Property qcom,drv-count in the RSC node is not allowed and not used:
  x1e80100-crd.dtb: rsc@
17500000: 'qcom,drv-count' does not match any of the regexes: '^regulators(-[0-9])?$', 'pinctrl-[0-9]+'
Fixes: af16b00578a7 ("arm64: dts: qcom: Add base X1E80100 dtsi and the QCP dts")
Signed-off-by: Krzysztof Kozlowski <krzysztof.kozlowski@linaro.org>
Link: https://lore.kernel.org/r/20231218145050.66394-1-krzysztof.kozlowski@linaro.org
Signed-off-by: Bjorn Andersson <andersson@kernel.org>
Krishna chaitanya chundru [Mon, 18 Dec 2023 14:02:36 +0000 (19:32 +0530)]
 
arm64: dts: qcom: sc7280: Add additional MSI interrupts
Current MSI's mapping doesn't have all the vectors. This platform
supports 8 vectors each vector supports 32 MSI's, so total MSI's
supported is 256.
Add all the MSI groups supported for this PCIe instance in this platform.
Fixes: 92e0ee9f83b3 ("arm64: dts: qcom: sc7280: Add PCIe and PHY related nodes")
cc: stable@vger.kernel.org
Signed-off-by: Krishna chaitanya chundru <quic_krichai@quicinc.com>
Link: https://lore.kernel.org/r/20231218-additional_msi-v1-1-de6917392684@quicinc.com
Signed-off-by: Bjorn Andersson <andersson@kernel.org>
Linus Torvalds [Sun, 21 Jan 2024 22:11:32 +0000 (14:11 -0800)]
 
Linux 6.8-rc1
Linus Torvalds [Sun, 21 Jan 2024 22:01:12 +0000 (14:01 -0800)]
 
Merge tag 'bcachefs-2024-01-21' of https://evilpiepirate.org/git/bcachefs
Pull more bcachefs updates from Kent Overstreet:
 "Some fixes, Some refactoring, some minor features:
   - Assorted prep work for disk space accounting rewrite
   - BTREE_TRIGGER_ATOMIC: after combining our trigger callbacks, this
     makes our trigger context more explicit
   - A few fixes to avoid excessive transaction restarts on
     multithreaded workloads: fstests (in addition to ktest tests) are
     now checking slowpath counters, and that's shaking out a few bugs
   - Assorted tracepoint improvements
   - Starting to break up bcachefs_format.h and move on disk types so
     they're with the code they belong to; this will make room to start
     documenting the on disk format better.
   - A few minor fixes"
* tag 'bcachefs-2024-01-21' of https://evilpiepirate.org/git/bcachefs: (46 commits)
  bcachefs: Improve inode_to_text()
  bcachefs: logged_ops_format.h
  bcachefs: reflink_format.h
  bcachefs; extents_format.h
  bcachefs: ec_format.h
  bcachefs: subvolume_format.h
  bcachefs: snapshot_format.h
  bcachefs: alloc_background_format.h
  bcachefs: xattr_format.h
  bcachefs: dirent_format.h
  bcachefs: inode_format.h
  bcachefs; quota_format.h
  bcachefs: sb-counters_format.h
  bcachefs: counters.c -> sb-counters.c
  bcachefs: comment bch_subvolume
  bcachefs: bch_snapshot::btime
  bcachefs: add missing __GFP_NOWARN
  bcachefs: opts->compression can now also be applied in the background
  bcachefs: Prep work for variable size btree node buffers
  bcachefs: grab s_umount only if snapshotting
  ...
Linus Torvalds [Sun, 21 Jan 2024 19:14:40 +0000 (11:14 -0800)]
 
Merge tag 'timers-core-2024-01-21' of git://git./linux/kernel/git/tip/tip
Pull timer updates from Thomas Gleixner:
 "Updates for time and clocksources:
   - A fix for the idle and iowait time accounting vs CPU hotplug.
     The time is reset on CPU hotplug which makes the accumulated
     systemwide time jump backwards.
   - Assorted fixes and improvements for clocksource/event drivers"
* tag 'timers-core-2024-01-21' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip:
  tick-sched: Fix idle and iowait sleeptime accounting vs CPU hotplug
  clocksource/drivers/ep93xx: Fix error handling during probe
  clocksource/drivers/cadence-ttc: Fix some kernel-doc warnings
  clocksource/drivers/timer-ti-dm: Fix make W=n kerneldoc warnings
  clocksource/timer-riscv: Add riscv_clock_shutdown callback
  dt-bindings: timer: Add StarFive JH8100 clint
  dt-bindings: timer: thead,c900-aclint-mtimer: separate mtime and mtimecmp regs
Linus Torvalds [Sun, 21 Jan 2024 19:04:29 +0000 (11:04 -0800)]
 
Merge tag 'powerpc-6.8-2' of git://git./linux/kernel/git/powerpc/linux
Pull powerpc fixes from Aneesh Kumar:
 - Increase default stack size to 32KB for Book3S
Thanks to Michael Ellerman.
* tag 'powerpc-6.8-2' of git://git.kernel.org/pub/scm/linux/kernel/git/powerpc/linux:
  powerpc/64s: Increase default stack size to 32KB
Kent Overstreet [Sun, 21 Jan 2024 17:19:01 +0000 (12:19 -0500)]
 
bcachefs: Improve inode_to_text()
Add line breaks - inode_to_text() is now much easier to read.
Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>
Kent Overstreet [Sun, 21 Jan 2024 07:57:45 +0000 (02:57 -0500)]
 
bcachefs: logged_ops_format.h
Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>
Kent Overstreet [Sun, 21 Jan 2024 07:54:47 +0000 (02:54 -0500)]
 
bcachefs: reflink_format.h
Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>
Kent Overstreet [Sun, 21 Jan 2024 07:51:56 +0000 (02:51 -0500)]
 
bcachefs; extents_format.h
Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>
Kent Overstreet [Sun, 21 Jan 2024 07:47:14 +0000 (02:47 -0500)]
 
bcachefs: ec_format.h
Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>
Kent Overstreet [Sun, 21 Jan 2024 07:42:53 +0000 (02:42 -0500)]
 
bcachefs: subvolume_format.h
Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>
Kent Overstreet [Sun, 21 Jan 2024 07:41:06 +0000 (02:41 -0500)]
 
bcachefs: snapshot_format.h
Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>
Kent Overstreet [Sun, 21 Jan 2024 05:01:52 +0000 (00:01 -0500)]
 
bcachefs: alloc_background_format.h
Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>
Kent Overstreet [Sun, 21 Jan 2024 04:59:15 +0000 (23:59 -0500)]
 
bcachefs: xattr_format.h
Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>
Kent Overstreet [Sun, 21 Jan 2024 04:57:10 +0000 (23:57 -0500)]
 
bcachefs: dirent_format.h
Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>
Kent Overstreet [Sun, 21 Jan 2024 04:55:39 +0000 (23:55 -0500)]
 
bcachefs: inode_format.h
Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>
Kent Overstreet [Sun, 21 Jan 2024 04:53:52 +0000 (23:53 -0500)]
 
bcachefs; quota_format.h
Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>
Kent Overstreet [Sun, 21 Jan 2024 04:50:56 +0000 (23:50 -0500)]
 
bcachefs: sb-counters_format.h
bcachefs_format.h has gotten too big; let's do some organizing.
Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>
Kent Overstreet [Sun, 21 Jan 2024 04:46:35 +0000 (23:46 -0500)]
 
bcachefs: counters.c -> sb-counters.c
Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>
Kent Overstreet [Sun, 21 Jan 2024 04:44:17 +0000 (23:44 -0500)]
 
bcachefs: comment bch_subvolume
Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>
Kent Overstreet [Sun, 21 Jan 2024 04:35:41 +0000 (23:35 -0500)]
 
bcachefs: bch_snapshot::btime
Add a field to bch_snapshot for creation time; this will be important
when we start exposing the snapshot tree to userspace.
Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>
Kent Overstreet [Wed, 17 Jan 2024 22:16:07 +0000 (17:16 -0500)]
 
bcachefs: add missing __GFP_NOWARN
Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>
Kent Overstreet [Tue, 16 Jan 2024 21:20:21 +0000 (16:20 -0500)]
 
bcachefs: opts->compression can now also be applied in the background
The "apply this compression method in the background" paths now use the
compression option if background_compression is not set; this means that
setting or changing the compression option will cause existing data to
be compressed accordingly in the background.
Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>
Kent Overstreet [Tue, 16 Jan 2024 18:29:59 +0000 (13:29 -0500)]
 
bcachefs: Prep work for variable size btree node buffers
bcachefs btree nodes are big - typically 256k - and btree roots are
pinned in memory. As we're now up to 18 btrees, we now have significant
memory overhead in mostly empty btree roots.
And in the future we're going to start enforcing that certain btree node
boundaries exist, to solve lock contention issues - analagous to XFS's
AGIs.
Thus, we need to start allocating smaller btree node buffers when we
can. This patch changes code that refers to the filesystem constant
c->opts.btree_node_size to refer to the btree node buffer size -
btree_buf_bytes() - where appropriate.
Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>
Su Yue [Mon, 15 Jan 2024 02:21:25 +0000 (10:21 +0800)]
 
bcachefs: grab s_umount only if snapshotting
When I was testing mongodb over bcachefs with compression,
there is a lockdep warning when snapshotting mongodb data volume.
$ cat test.sh
prog=bcachefs
$prog subvolume create /mnt/data
$prog subvolume create /mnt/data/snapshots
while true;do
    $prog subvolume snapshot /mnt/data /mnt/data/snapshots/$(date +%s)
    sleep 1s
done
$ cat /etc/mongodb.conf
systemLog:
  destination: file
  logAppend: true
  path: /mnt/data/mongod.log
storage:
  dbPath: /mnt/data/
lockdep reports:
[ 3437.452330] ======================================================
[ 3437.452750] WARNING: possible circular locking dependency detected
[ 3437.453168] 6.7.0-rc7-custom+ #85 Tainted: G            E
[ 3437.453562] ------------------------------------------------------
[ 3437.453981] bcachefs/35533 is trying to acquire lock:
[ 3437.454325] 
ffffa0a02b2b1418 (sb_writers#10){.+.+}-{0:0}, at: filename_create+0x62/0x190
[ 3437.454875]
               but task is already holding lock:
[ 3437.455268] 
ffffa0a02b2b10e0 (&type->s_umount_key#48){.+.+}-{3:3}, at: bch2_fs_file_ioctl+0x232/0xc90 [bcachefs]
[ 3437.456009]
               which lock already depends on the new lock.
[ 3437.456553]
               the existing dependency chain (in reverse order) is:
[ 3437.457054]
               -> #3 (&type->s_umount_key#48){.+.+}-{3:3}:
[ 3437.457507]        down_read+0x3e/0x170
[ 3437.457772]        bch2_fs_file_ioctl+0x232/0xc90 [bcachefs]
[ 3437.458206]        __x64_sys_ioctl+0x93/0xd0
[ 3437.458498]        do_syscall_64+0x42/0xf0
[ 3437.458779]        entry_SYSCALL_64_after_hwframe+0x6e/0x76
[ 3437.459155]
               -> #2 (&c->snapshot_create_lock){++++}-{3:3}:
[ 3437.459615]        down_read+0x3e/0x170
[ 3437.459878]        bch2_truncate+0x82/0x110 [bcachefs]
[ 3437.460276]        bchfs_truncate+0x254/0x3c0 [bcachefs]
[ 3437.460686]        notify_change+0x1f1/0x4a0
[ 3437.461283]        do_truncate+0x7f/0xd0
[ 3437.461555]        path_openat+0xa57/0xce0
[ 3437.461836]        do_filp_open+0xb4/0x160
[ 3437.462116]        do_sys_openat2+0x91/0xc0
[ 3437.462402]        __x64_sys_openat+0x53/0xa0
[ 3437.462701]        do_syscall_64+0x42/0xf0
[ 3437.462982]        entry_SYSCALL_64_after_hwframe+0x6e/0x76
[ 3437.463359]
               -> #1 (&sb->s_type->i_mutex_key#15){+.+.}-{3:3}:
[ 3437.463843]        down_write+0x3b/0xc0
[ 3437.464223]        bch2_write_iter+0x5b/0xcc0 [bcachefs]
[ 3437.464493]        vfs_write+0x21b/0x4c0
[ 3437.464653]        ksys_write+0x69/0xf0
[ 3437.464839]        do_syscall_64+0x42/0xf0
[ 3437.465009]        entry_SYSCALL_64_after_hwframe+0x6e/0x76
[ 3437.465231]
               -> #0 (sb_writers#10){.+.+}-{0:0}:
[ 3437.465471]        __lock_acquire+0x1455/0x21b0
[ 3437.465656]        lock_acquire+0xc6/0x2b0
[ 3437.465822]        mnt_want_write+0x46/0x1a0
[ 3437.465996]        filename_create+0x62/0x190
[ 3437.466175]        user_path_create+0x2d/0x50
[ 3437.466352]        bch2_fs_file_ioctl+0x2ec/0xc90 [bcachefs]
[ 3437.466617]        __x64_sys_ioctl+0x93/0xd0
[ 3437.466791]        do_syscall_64+0x42/0xf0
[ 3437.466957]        entry_SYSCALL_64_after_hwframe+0x6e/0x76
[ 3437.467180]
               other info that might help us debug this:
[ 3437.469670] 2 locks held by bcachefs/35533:
               other info that might help us debug this:
[ 3437.467507] Chain exists of:
                 sb_writers#10 --> &c->snapshot_create_lock --> &type->s_umount_key#48
[ 3437.467979]  Possible unsafe locking scenario:
[ 3437.468223]        CPU0                    CPU1
[ 3437.468405]        ----                    ----
[ 3437.468585]   rlock(&type->s_umount_key#48);
[ 3437.468758]                                lock(&c->snapshot_create_lock);
[ 3437.469030]                                lock(&type->s_umount_key#48);
[ 3437.469291]   rlock(sb_writers#10);
[ 3437.469434]
                *** DEADLOCK ***
[ 3437.469670] 2 locks held by bcachefs/35533:
[ 3437.469838]  #0: 
ffffa0a02ce00a88 (&c->snapshot_create_lock){++++}-{3:3}, at: bch2_fs_file_ioctl+0x1e3/0xc90 [bcachefs]
[ 3437.470294]  #1: 
ffffa0a02b2b10e0 (&type->s_umount_key#48){.+.+}-{3:3}, at: bch2_fs_file_ioctl+0x232/0xc90 [bcachefs]
[ 3437.470744]
               stack backtrace:
[ 3437.470922] CPU: 7 PID: 35533 Comm: bcachefs Kdump: loaded Tainted: G            E      6.7.0-rc7-custom+ #85
[ 3437.471313] Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS Arch Linux 1.16.3-1-1 04/01/2014
[ 3437.471694] Call Trace:
[ 3437.471795]  <TASK>
[ 3437.471884]  dump_stack_lvl+0x57/0x90
[ 3437.472035]  check_noncircular+0x132/0x150
[ 3437.472202]  __lock_acquire+0x1455/0x21b0
[ 3437.472369]  lock_acquire+0xc6/0x2b0
[ 3437.472518]  ? filename_create+0x62/0x190
[ 3437.472683]  ? lock_is_held_type+0x97/0x110
[ 3437.472856]  mnt_want_write+0x46/0x1a0
[ 3437.473025]  ? filename_create+0x62/0x190
[ 3437.473204]  filename_create+0x62/0x190
[ 3437.473380]  user_path_create+0x2d/0x50
[ 3437.473555]  bch2_fs_file_ioctl+0x2ec/0xc90 [bcachefs]
[ 3437.473819]  ? lock_acquire+0xc6/0x2b0
[ 3437.474002]  ? __fget_files+0x2a/0x190
[ 3437.474195]  ? __fget_files+0xbc/0x190
[ 3437.474380]  ? lock_release+0xc5/0x270
[ 3437.474567]  ? __x64_sys_ioctl+0x93/0xd0
[ 3437.474764]  ? __pfx_bch2_fs_file_ioctl+0x10/0x10 [bcachefs]
[ 3437.475090]  __x64_sys_ioctl+0x93/0xd0
[ 3437.475277]  do_syscall_64+0x42/0xf0
[ 3437.475454]  entry_SYSCALL_64_after_hwframe+0x6e/0x76
[ 3437.475691] RIP: 0033:0x7f2743c313af
======================================================
In __bch2_ioctl_subvolume_create(), we grab s_umount unconditionally
and unlock it at the end of the function. There is a comment
"why do we need this lock?" about the lock coming from
commit 
42d237320e98 ("bcachefs: Snapshot creation, deletion")
The reason is that __bch2_ioctl_subvolume_create() calls
sync_inodes_sb() which enforce locked s_umount to writeback all dirty
nodes before doing snapshot works.
Fix it by read locking s_umount for snapshotting only and unlocking
s_umount after sync_inodes_sb().
Signed-off-by: Su Yue <glass.su@suse.com>
Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>
Su Yue [Tue, 16 Jan 2024 11:05:37 +0000 (19:05 +0800)]
 
bcachefs: kvfree bch_fs::snapshots in bch2_fs_snapshots_exit
bch_fs::snapshots is allocated by kvzalloc in __snapshot_t_mut.
It should be freed by kvfree not kfree.
Or umount will triger:
[  406.829178 ] BUG: unable to handle page fault for address: 
ffffe7b487148008
[  406.830676 ] #PF: supervisor read access in kernel mode
[  406.831643 ] #PF: error_code(0x0000) - not-present page
[  406.832487 ] PGD 0 P4D 0
[  406.832898 ] Oops: 0000 [#1] PREEMPT SMP PTI
[  406.833512 ] CPU: 2 PID: 1754 Comm: umount Kdump: loaded Tainted: G           OE      6.7.0-rc7-custom+ #90
[  406.834746 ] Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS Arch Linux 1.16.3-1-1 04/01/2014
[  406.835796 ] RIP: 0010:kfree+0x62/0x140
[  406.836197 ] Code: 80 48 01 d8 0f 82 e9 00 00 00 48 c7 c2 00 00 00 80 48 2b 15 78 9f 1f 01 48 01 d0 48 c1 e8 0c 48 c1 e0 06 48 03 05 56 9f 1f 01 <48> 8b 50 08 48 89 c7 f6 c2 01 0f 85 b0 00 00 00 66 90 48 8b 07 f6
[  406.837810 ] RSP: 0018:
ffffb9d641607e48 EFLAGS: 
00010286
[  406.838213 ] RAX: 
ffffe7b487148000 RBX: 
ffffb9d645200000 RCX: 
ffffb9d641607dc4
[  406.838738 ] RDX: 
000065bb00000000 RSI: 
ffffffffc0d88b84 RDI: 
ffffb9d645200000
[  406.839217 ] RBP: 
ffff9a4625d00068 R08: 
0000000000000001 R09: 
0000000000000001
[  406.839650 ] R10: 
0000000000000001 R11: 
000000000000001f R12: 
ffff9a4625d4da80
[  406.840055 ] R13: 
ffff9a4625d00000 R14: 
ffffffffc0e2eb20 R15: 
0000000000000000
[  406.840451 ] FS:  
00007f0a264ffb80(0000) GS:
ffff9a4e2d500000(0000) knlGS:
0000000000000000
[  406.840851 ] CS:  0010 DS: 0000 ES: 0000 CR0: 
0000000080050033
[  406.841125 ] CR2: 
ffffe7b487148008 CR3: 
000000018c4d2000 CR4: 
00000000000006f0
[  406.841464 ] Call Trace:
[  406.841583 ]  <TASK>
[  406.841682 ]  ? __die+0x1f/0x70
[  406.841828 ]  ? page_fault_oops+0x159/0x470
[  406.842014 ]  ? fixup_exception+0x22/0x310
[  406.842198 ]  ? exc_page_fault+0x1ed/0x200
[  406.842382 ]  ? asm_exc_page_fault+0x22/0x30
[  406.842574 ]  ? bch2_fs_release+0x54/0x280 [bcachefs]
[  406.842842 ]  ? kfree+0x62/0x140
[  406.842988 ]  ? kfree+0x104/0x140
[  406.843138 ]  bch2_fs_release+0x54/0x280 [bcachefs]
[  406.843390 ]  kobject_put+0xb7/0x170
[  406.843552 ]  deactivate_locked_super+0x2f/0xa0
[  406.843756 ]  cleanup_mnt+0xba/0x150
[  406.843917 ]  task_work_run+0x59/0xa0
[  406.844083 ]  exit_to_user_mode_prepare+0x197/0x1a0
[  406.844302 ]  syscall_exit_to_user_mode+0x16/0x40
[  406.844510 ]  do_syscall_64+0x4e/0xf0
[  406.844675 ]  entry_SYSCALL_64_after_hwframe+0x6e/0x76
[  406.844907 ] RIP: 0033:0x7f0a2664e4fb
Signed-off-by: Su Yue <glass.su@suse.com>
Reviewed-by: Brian Foster <bfoster@redhat.com>
Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>
Kent Overstreet [Tue, 16 Jan 2024 16:38:04 +0000 (11:38 -0500)]
 
bcachefs: bios must be 512 byte algined
Fixes: 023f9ac9f70f bcachefs: Delete dio read alignment check
Reported-by: Brian Foster <bfoster@redhat.com>
Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>
Colin Ian King [Tue, 16 Jan 2024 11:07:23 +0000 (11:07 +0000)]
 
bcachefs: remove redundant variable tmp
The variable tmp is being assigned a value but it isn't being
read afterwards. The assignment is redundant and so tmp can be
removed.
Cleans up clang scan build warning:
warning: Although the value stored to 'ret' is used in the enclosing
expression, the value is never actually read from 'ret'
[deadcode.DeadStores]
Signed-off-by: Colin Ian King <colin.i.king@gmail.com>
Reviewed-by: Brian Foster <bfoster@redhat.com>
Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>
Kent Overstreet [Tue, 16 Jan 2024 01:40:06 +0000 (20:40 -0500)]
 
bcachefs: Improve trace_trans_restart_relock
Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>
Kent Overstreet [Tue, 16 Jan 2024 01:37:23 +0000 (20:37 -0500)]
 
bcachefs: Fix excess transaction restarts in __bchfs_fallocate()
drop_locks_do() should not be used in a fastpath without first trying
the do in nonblocking mode - the unlock and relock will cause excessive
transaction restarts and potentially livelocking with other threads that
are contending for the same locks.
Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>
Kent Overstreet [Mon, 15 Jan 2024 23:19:52 +0000 (18:19 -0500)]
 
bcachefs: extents_to_bp_state
Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>
Kent Overstreet [Mon, 15 Jan 2024 23:08:32 +0000 (18:08 -0500)]
 
bcachefs: bkey_and_val_eq()
Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>
Kent Overstreet [Mon, 15 Jan 2024 22:59:51 +0000 (17:59 -0500)]
 
bcachefs: Better journal tracepoints
Factor out bch2_journal_bufs_to_text(), and use it in the
journal_entry_full() tracepoint; when we can't get a journal reservation
we need to know the outstanding journal entry sizes to know if the
problem is due to excessive flushing.
Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>
Kent Overstreet [Mon, 15 Jan 2024 22:57:44 +0000 (17:57 -0500)]
 
bcachefs: Print size of superblock with space allocated
Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>
Kent Overstreet [Mon, 15 Jan 2024 22:56:22 +0000 (17:56 -0500)]
 
bcachefs: Avoid flushing the journal in the discard path
When issuing discards, we may need to flush the journal if there's too
many buckets that can't be discarded until a journal flush.
But the heuristic was bad; we should be comparing the number of buckets
that need to flushes against the number of free buckets, not the number
of buckets we saw.
Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>
Kent Overstreet [Mon, 15 Jan 2024 20:33:39 +0000 (15:33 -0500)]
 
bcachefs: Improve move_extent tracepoint
Also print out the data_opts, so that we can see what specifically is
being done to an extent.
Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>
Kent Overstreet [Mon, 15 Jan 2024 20:06:43 +0000 (15:06 -0500)]
 
bcachefs: Add missing bch2_moving_ctxt_flush_all()
This fixes a bug with rebalance IOs getting stuck with reads completed,
but writes never being issued.
Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>
Kent Overstreet [Mon, 15 Jan 2024 20:04:40 +0000 (15:04 -0500)]
 
bcachefs: Re-add move_extent_write tracepoint
It appears this was accidentally deleted at some point - also, do a bit
of cleanup.
Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>
Kent Overstreet [Mon, 15 Jan 2024 19:15:26 +0000 (14:15 -0500)]
 
bcachefs: bch2_kthread_io_clock_wait() no longer sleeps until full amount
Drop t he loop in bch2_kthread_io_clock_wait(): this allows the code
that uses it to be woken up for other reasons, and fixes a bug where
rebalance wouldn't wake up when a scan was requested.
This raises the possibility of spurious wakeups, but callers should
always be able to handle that reasonably well.
Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>
Kent Overstreet [Mon, 15 Jan 2024 19:15:03 +0000 (14:15 -0500)]
 
bcachefs: Add .val_to_text() for KEY_TYPE_cookie
Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>
Kent Overstreet [Mon, 15 Jan 2024 19:12:43 +0000 (14:12 -0500)]
 
bcachefs: Don't pass memcmp() as a pointer
Some (buggy!) compilers have issues with this.
Fixes: https://github.com/koverstreet/bcachefs/issues/625
Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>
Linus Torvalds [Sun, 21 Jan 2024 18:21:43 +0000 (10:21 -0800)]
 
Merge tag 'header_cleanup-2024-01-20' of https://evilpiepirate.org/git/bcachefs
Pull header fix from Kent Overstreet:
 "Just one small fixup for the RT build"
* tag 'header_cleanup-2024-01-20' of https://evilpiepirate.org/git/bcachefs:
  spinlock: Fix failing build for PREEMPT_RT
Kent Overstreet [Thu, 11 Jan 2024 04:47:04 +0000 (23:47 -0500)]
 
bcachefs: Reduce would_deadlock restarts
We don't have to take locks in any particular ordering - we'll make
forward progress just fine - but if we try to stick to an ordering, it
can help to avoid excessive would_deadlock transaction restarts.
This tweaks the reflink path to take extents btree locks in the right
order.
Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>
Kent Overstreet [Sat, 11 Nov 2023 20:08:36 +0000 (15:08 -0500)]
 
bcachefs: bch2_trans_account_disk_usage_change()
The disk space accounting rewrite is splitting out accounting for each
replicas set - those are moving to btree keys, instead of percpu
counters.
This breaks bch2_trans_fs_usage_apply() up, splitting out the part we
will still need.
Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>
Kent Overstreet [Fri, 17 Nov 2023 05:03:45 +0000 (00:03 -0500)]
 
bcachefs: bch_fs_usage_base
Split out base filesystem usage into its own type; prep work for
breaking up bch2_trans_fs_usage_apply().
Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>
Kent Overstreet [Sun, 7 Jan 2024 02:01:47 +0000 (21:01 -0500)]
 
bcachefs: bch2_prt_compression_type()
bounds checking helper, since compression types are extensible
Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>
Kent Overstreet [Sun, 7 Jan 2024 01:57:43 +0000 (20:57 -0500)]
 
bcachefs: helpers for printing data types
We need bounds checking since new versions may introduce new data types.
Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>
Kent Overstreet [Sun, 7 Jan 2024 22:14:46 +0000 (17:14 -0500)]
 
bcachefs: BTREE_TRIGGER_ATOMIC
Add a new flag to be explicit about when we're running atomic triggers.
Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>
Kent Overstreet [Sun, 7 Jan 2024 00:47:09 +0000 (19:47 -0500)]
 
bcachefs: drop to_text code for obsolete bps in alloc keys
Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>
Kent Overstreet [Sun, 7 Jan 2024 00:29:14 +0000 (19:29 -0500)]
 
bcachefs: eytzinger_for_each() declares loop iter
Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>
Kent Overstreet [Thu, 11 Jan 2024 04:08:30 +0000 (23:08 -0500)]
 
bcachefs: Don't log errors if BCH_WRITE_ALLOC_NOWAIT
Previously, we added logging in the write path to ensure that any
unexpected errors getting reported to userspace have a log message; but
BCH_WRITE_ALLOC_NOWAIT is a special case, it's used for promotes where
errors are expected and not reported out to userspace - so we need to
silence those.
Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>
Su Yue [Mon, 8 Jan 2024 15:11:08 +0000 (23:11 +0800)]
 
bcachefs: fix memleak in bch2_split_devs
The pointer dev_name can be modified by strseq(),
then causes the memleak:
unreferenced object 0xffff9d08a2916c80 (size 32):
  comm "mount.bcachefs", pid 9090, jiffies 
4295856224 (age 17.564s)
  hex dump (first 32 bytes):
    2f 64 65 76 2f 6d 61 70 70 65 72 2f 74 65 73 74  /dev/mapper/test
    2d 30 00 00 00 00 00 00 00 00 00 00 00 00 00 00  -0..............
  backtrace:
    [<
00000000c5d3be7d>] __kmem_cache_alloc_node+0x1f3/0x2c0
    [<
0000000052215d26>] __kmalloc_node_track_caller+0x51/0x150
    [<
0000000069fea956>] kstrdup+0x32/0x60
    [<
000000000877fcf1>] bch2_split_devs+0x3f/0x150 [bcachefs]
    [<
000000007ee93204>] bch2_mount+0xcb/0x640 [bcachefs]
    [<
000000002dd1e04b>] legacy_get_tree+0x30/0x60
    [<
000000006afc31d3>] vfs_get_tree+0x28/0xf0
    [<
000000007b0c538e>] path_mount+0x475/0xb60
    [<
0000000092de5882>] __x64_sys_mount+0x105/0x140
    [<
0000000054fc05d8>] do_syscall_64+0x42/0xf0
    [<
00000000df584910>] entry_SYSCALL_64_after_hwframe+0x6e/0x76
Fix it by copy pointer dev_name at beginning and free the copied
pointer at end.
Signed-off-by: Su Yue <glass.su@suse.com>
Reviewed-by: Brian Foster <bfoster@redhat.com>
Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>
Linus Torvalds [Sun, 21 Jan 2024 00:48:07 +0000 (16:48 -0800)]
 
Merge tag 'v6.8-rc-part2-smb-client' of git://git.samba.org/sfrench/cifs-2.6
Pull smb client updates from Steve French:
 "Various smb client fixes, including multichannel and for SMB3.1.1
  POSIX extensions:
   - debugging improvement (display start time for stats)
   - two reparse point handling fixes
   - various multichannel improvements and fixes
   - SMB3.1.1 POSIX extensions open/create parsing fix
   - retry (reconnect) improvement including new retrans mount parm, and
     handling of two additional return codes that need to be retried on
   - two minor cleanup patches and another to remove duplicate query
     info code
   - two documentation cleanup, and one reviewer email correction"
* tag 'v6.8-rc-part2-smb-client' of git://git.samba.org/sfrench/cifs-2.6:
  cifs: update iface_last_update on each query-and-update
  cifs: handle servers that still advertise multichannel after disabling
  cifs: new mount option called retrans
  cifs: reschedule periodic query for server interfaces
  smb: client: don't clobber ->i_rdev from cached reparse points
  smb: client: get rid of smb311_posix_query_path_info()
  smb: client: parse owner/group when creating reparse points
  smb: client: fix parsing of SMB3.1.1 POSIX create context
  cifs: update known bugs mentioned in kernel docs for cifs
  cifs: new nt status codes from MS-SMB2
  cifs: pick channel for tcon and tdis
  cifs: open_cached_dir should not rely on primary channel
  smb3: minor documentation updates
  Update MAINTAINERS email address
  cifs: minor comment cleanup
  smb3: show beginning time for per share stats
  cifs: remove redundant variable tcon_exist
Linus Torvalds [Sat, 20 Jan 2024 23:03:25 +0000 (15:03 -0800)]
 
Merge tag 'dmaengine-fix-6.8-rc1' of git://git./linux/kernel/git/vkoul/dmaengine
Pull dmaengine updates from Vinod Koul:
 "New support:
   - Loongson LS2X APB DMA controller
   - sf-pdma: mpfs-pdma support
   - Qualcomm X1E80100 GPI dma controller support
  Updates:
   - Xilinx XDMA updates to support interleaved DMA transfers
   - TI PSIL threads for AM62P and J722S and cfg register regions
     description
   - axi-dmac Improving the cyclic DMA transfers
   - Tegra Support dma-channel-mask property
   - Remaining platform remove callback returning void conversions
 Driver fixes for:
   - Xilinx xdma driver operator precedence and initialization fix
   - Excess kernel-doc warning fix in imx-sdma xilinx xdma drivers
   - format-overflow warning fix for rz-dmac, sh usb dmac drivers
   - 'output may be truncated' fix for shdma, fsl-qdma and dw-edma
     drivers"
* tag 'dmaengine-fix-6.8-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/vkoul/dmaengine: (58 commits)
  dmaengine: dw-edma: increase size of 'name' in debugfs code
  dmaengine: fsl-qdma: increase size of 'irq_name'
  dmaengine: shdma: increase size of 'dev_id'
  dmaengine: xilinx: xdma: Fix kernel-doc warnings
  dmaengine: usb-dmac: Avoid format-overflow warning
  dmaengine: sh: rz-dmac: Avoid format-overflow warning
  dmaengine: imx-sdma: fix Excess kernel-doc warnings
  dmaengine: xilinx: xdma: Fix initialization location of desc in xdma_channel_isr()
  dmaengine: xilinx: xdma: Fix operator precedence in xdma_prep_interleaved_dma()
  dmaengine: xilinx: xdma: statify xdma_prep_interleaved_dma
  dmaengine: xilinx: xdma: Workaround truncation compilation error
  dmaengine: pl330: issue_pending waits until WFP state
  dmaengine: xilinx: xdma: Implement interleaved DMA transfers
  dmaengine: xilinx: xdma: Prepare the introduction of interleaved DMA transfers
  dmaengine: xilinx: xdma: Add transfer error reporting
  dmaengine: xilinx: xdma: Add error checking in xdma_channel_isr()
  dmaengine: xilinx: xdma: Rework xdma_terminate_all()
  dmaengine: xilinx: xdma: Ease dma_pool alignment requirements
  dmaengine: xilinx: xdma: Add necessary macro definitions
  dmaengine: xilinx: xdma: Get rid of unused code
  ...
Linus Torvalds [Sat, 20 Jan 2024 22:20:34 +0000 (14:20 -0800)]
 
Merge tag 'coccinelle-for-6.8' of git://git./linux/kernel/git/jlawall/linux
Pull coccinelle updates from Julia Lawall:
 "Updates to the device_attr_show semantic patch to reflect the new
  guidelines of the Linux kernel documentation.
  The problem was identified by Li Zhijian <lizhijian@fujitsu.com>, who
  proposed an initial fix"
* tag 'coccinelle-for-6.8' of git://git.kernel.org/pub/scm/linux/kernel/git/jlawall/linux:
  coccinelle: device_attr_show: simplify patch case
  coccinelle: device_attr_show: Adapt to the latest Documentation/filesystems/sysfs.rst
Aurelien Jarno [Sat, 13 Jan 2024 18:33:31 +0000 (19:33 +0100)]
 
media: solo6x10: replace max(a, min(b, c)) by clamp(b, a, c)
This patch replaces max(a, min(b, c)) by clamp(b, a, c) in the solo6x10
driver.  This improves the readability and more importantly, for the
solo6x10-p2m.c file, this reduces on my system (x86-64, gcc 13):
 - the preprocessed size from 121 MiB to 4.5 MiB;
 - the build CPU time from 46.8 s to 1.6 s;
 - the build memory from 2786 MiB to 98MiB.
In fine, this allows this relatively simple C file to be built on a
32-bit system.
Reported-by: Jiri Slaby <jirislaby@gmail.com>
Closes: https://lore.kernel.org/lkml/18c6df0d-45ed-450c-9eda-95160a2bbb8e@gmail.com/
Cc:  <stable@vger.kernel.org> # v6.7+
Suggested-by: David Laight <David.Laight@ACULAB.COM>
Signed-off-by: Aurelien Jarno <aurelien@aurel32.net>
Reviewed-by: David Laight <David.Laight@ACULAB.COM>
Reviewed-by: Hans Verkuil <hverkuil-cisco@xs4all.nl>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Julia Lawall [Sat, 20 Jan 2024 20:56:11 +0000 (21:56 +0100)]
 
coccinelle: device_attr_show: simplify patch case
Replacing the final expression argument by ... allows the format
string to have multiple arguments.
It also has the advantage of allowing the change to be recognized as
a change in a single statement, thus avoiding adding unneeded braces.
Signed-off-by: Julia Lawall <Julia.Lawall@inria.fr>
Linus Torvalds [Tue, 9 Jan 2024 00:43:04 +0000 (16:43 -0800)]
 
execve: open the executable file before doing anything else
No point in allocating a new mm, counting arguments and environment
variables etc if we're just going to return ENOENT.
This patch does expose the fact that 'do_filp_open()' that execve() uses
is still unnecessarily expensive in the failure case, because it
allocates the 'struct file *' early, even if the path lookup (which is
heavily optimized) fails.
So that remains an unnecessary cost in the "no such executable" case,
but it's a separate issue.  Regardless, I do not want to do _both_ a
filename_lookup() and a later do_filp_open() like the origin patch by
Josh Triplett did in [1].
Reported-by: Josh Triplett <josh@joshtriplett.org>
Cc: Kees Cook <keescook@chromium.org>
Cc: Mateusz Guzik <mjguzik@gmail.com>
Cc: Al Viro <viro@zeniv.linux.org.uk>
Link: https://lore.kernel.org/lkml/5c7333ea4bec2fad1b47a8fa2db7c31e4ffc4f14.1663334978.git.josh@joshtriplett.org/
Link: https://lore.kernel.org/lkml/202209161637.9EDAF6B18@keescook/
Link: https://lore.kernel.org/lkml/CAHk-=wgznerM-xs+x+krDfE7eVBiy_HOam35rbsFMMOwvYuEKQ@mail.gmail.com/
Link: https://lore.kernel.org/lkml/CAHk-=whf9qLO8ipps4QhmS0BkM8mtWJhvnuDSdtw5gFjhzvKNA@mail.gmail.com/
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Linus Torvalds [Sat, 20 Jan 2024 19:06:04 +0000 (11:06 -0800)]
 
Merge tag 'riscv-for-linus-6.8-mw4' of git://git./linux/kernel/git/riscv/linux
Pull more RISC-V updates from Palmer Dabbelt:
 - Support for tuning for systems with fast misaligned accesses.
 - Support for SBI-based suspend.
 - Support for the new SBI debug console extension.
 - The T-Head CMOs now use PA-based flushes.
 - Support for enabling the V extension in kernel code.
 - Optimized IP checksum routines.
 - Various ftrace improvements.
 - Support for archrandom, which depends on the Zkr extension.
 - The build is no longer broken under NET=n, KUNIT=y for ports that
   don't define their own ipv6 checksum.
* tag 'riscv-for-linus-6.8-mw4' of git://git.kernel.org/pub/scm/linux/kernel/git/riscv/linux: (56 commits)
  lib: checksum: Fix build with CONFIG_NET=n
  riscv: lib: Check if output in asm goto supported
  riscv: Fix build error on rv32 + XIP
  riscv: optimize ELF relocation function in riscv
  RISC-V: Implement archrandom when Zkr is available
  riscv: Optimize hweight API with Zbb extension
  riscv: add dependency among Image(.gz), loader(.bin), and vmlinuz.efi
  samples: ftrace: Add RISC-V support for SAMPLE_FTRACE_DIRECT[_MULTI]
  riscv: ftrace: Add DYNAMIC_FTRACE_WITH_DIRECT_CALLS support
  riscv: ftrace: Make function graph use ftrace directly
  riscv: select FTRACE_MCOUNT_USE_PATCHABLE_FUNCTION_ENTRY
  lib/Kconfig.debug: Update AS_HAS_NON_CONST_LEB128 comment and name
  riscv: Restrict DWARF5 when building with LLVM to known working versions
  riscv: Hoist linker relaxation disabling logic into Kconfig
  kunit: Add tests for csum_ipv6_magic and ip_fast_csum
  riscv: Add checksum library
  riscv: Add checksum header
  riscv: Add static key for misaligned accesses
  asm-generic: Improve csum_fold
  RISC-V: selftests: cbo: Ensure asm operands match constraints
  ...
Linus Torvalds [Sat, 20 Jan 2024 17:42:32 +0000 (09:42 -0800)]
 
Merge tag 'scsi-misc' of git://git./linux/kernel/git/jejb/scsi
Pull SCSI updates from James Bottomley:
 "Final round of fixes that came in too late to send in the first
  request.
  It's nine bug fixes and one version update (because of a bug fix) and
  one set of PCI ID additions. There's one bug fix in the core which is
  really a one liner (except that an additional sdev pointer was added
  for convenience) and the rest are in drivers"
* tag 'scsi-misc' of git://git.kernel.org/pub/scm/linux/kernel/git/jejb/scsi:
  scsi: target: core: Add TMF to tmr_list handling
  scsi: core: Kick the requeue list after inserting when flushing
  scsi: fnic: unlock on error path in fnic_queuecommand()
  scsi: fcoe: Fix unsigned comparison with zero in store_ctlr_mode()
  scsi: mpi3mr: Fix mpi3mr_fw.c kernel-doc warnings
  scsi: smartpqi: Bump driver version to 2.1.26-030
  scsi: smartpqi: Fix logical volume rescan race condition
  scsi: smartpqi: Add new controller PCI IDs
  scsi: ufs: qcom: Remove unnecessary goto statement from ufs_qcom_config_esi()
  scsi: ufs: core: Remove the ufshcd_hba_exit() call from ufshcd_async_scan()
  scsi: ufs: core: Simplify power management during async scan
Linus Torvalds [Sat, 20 Jan 2024 17:24:06 +0000 (09:24 -0800)]
 
Merge tag 'sh-for-v6.8-tag1' of git://git./linux/kernel/git/glaubitz/sh-linux
Pull sh updates from John Paul Adrian Glaubitz:
 "Since the large patch series to convert arch/sh to device tree support
  has not been finalized yet due to various maintainers still asking for
  changes to the series, this ended up being rather small consisting of
  just two fixes.
  The first patch by Geert Uytterhoeven addresses a build failure in the
  EcoVec platform code. And the second patch by Masahiro Yamada removes
  an unnecessary $(foreach ...) found in a Makefile of the vsyscall
  code.
   - Rename missed backlight field from fbdev to dev
   - Remove unnecessary $(foreach ...)"
* tag 'sh-for-v6.8-tag1' of git://git.kernel.org/pub/scm/linux/kernel/git/glaubitz/sh-linux:
  sh: vsyscall: Remove unnecessary $(foreach ...)
  sh: ecovec24: Rename missed backlight field from fbdev to dev
Linus Torvalds [Sat, 20 Jan 2024 17:14:04 +0000 (09:14 -0800)]
 
Merge tag 'fbdev-for-6.8-rc1-2' of git://git./linux/kernel/git/deller/linux-fbdev
Pull fbdev fix from Helge Deller:
 "There were various reports from people without any graphics output on
  the screen and it turns out one commit triggers the problem.
   - Revert 'firmware/sysfb: Clear screen_info state after consuming it'"
* tag 'fbdev-for-6.8-rc1-2' of git://git.kernel.org/pub/scm/linux/kernel/git/deller/linux-fbdev:
  Revert "firmware/sysfb: Clear screen_info state after consuming it"
Linus Torvalds [Fri, 19 Jan 2024 22:25:23 +0000 (14:25 -0800)]
 
Merge tag 'perf-tools-for-v6.8-1-2024-01-09' of git://git./linux/kernel/git/perf/perf-tools
Pull perf tools updates from Arnaldo Carvalho de Melo:
 "Add Namhyung Kim as tools/perf/ co-maintainer, we're taking turns
  processing patches, switching roles from perf-tools to perf-tools-next
  at each Linux release.
  Data profiling:
   - Associate samples that identify loads and stores with data
     structures. This uses events available on Intel, AMD and others and
     DWARF info:
       # To get memory access samples in kernel for 1 second (on Intel)
       $ perf mem record -a -K --ldlat=4 -- sleep 1
       # Similar for the AMD (but it requires 6.3+ kernel for BPF filters)
       $ perf mem record -a --filter 'mem_op == load || mem_op == store, ip > 0x8000000000000000' -- sleep 1
     Then, amongst several modes of post processing, one can do things like:
       $ perf report -s type,typeoff --hierarchy --group --stdio
       ...
       #
       # Samples: 10K of events 'cpu/mem-loads,ldlat=4/P, cpu/mem-stores/P, dummy:u'
       # Event count (approx.): 
602758064
       #
       #                    Overhead  Data Type / Data Type Offset
       # ...........................  ............................
       #
           26.09%   3.28%   0.00%     long unsigned int
              26.09%   3.28%   0.00%     long unsigned int +0 (no field)
           18.48%   0.73%   0.00%     struct page
              10.83%   0.02%   0.00%     struct page +8 (lru.next)
               3.90%   0.28%   0.00%     struct page +0 (flags)
               3.45%   0.06%   0.00%     struct page +24 (mapping)
               0.25%   0.28%   0.00%     struct page +48 (_mapcount.counter)
               0.02%   0.06%   0.00%     struct page +32 (index)
               0.02%   0.00%   0.00%     struct page +52 (_refcount.counter)
               0.02%   0.01%   0.00%     struct page +56 (memcg_data)
               0.00%   0.01%   0.00%     struct page +16 (lru.prev)
           15.37%  17.54%   0.00%     (stack operation)
              15.37%  17.54%   0.00%     (stack operation) +0 (no field)
           11.71%  50.27%   0.00%     (unknown)
              11.71%  50.27%   0.00%     (unknown) +0 (no field)
       $ perf annotate --data-type
       ...
       Annotate type: 'struct cfs_rq' in [kernel.kallsyms] (13 samples):
       ============================================================================
           samples     offset       size  field
                13          0        640  struct cfs_rq         {
                 2          0         16      struct load_weight       load {
                 2          0          8          unsigned long        weight;
                 0          8          4          u32  inv_weight;
                                              };
                 0         16          8      unsigned long    runnable_weight;
                 0         24          4      unsigned int     nr_running;
                 1         28          4      unsigned int     h_nr_running;
       ...
       $ perf annotate --data-type=page --group
       Annotate type: 'struct page' in [kernel.kallsyms] (480 samples):
        event[0] = cpu/mem-loads,ldlat=4/P
        event[1] = cpu/mem-stores/P
        event[2] = dummy:u
       ===================================================================================
                samples  offset  size  field
       447  33        0       0    64  struct page     {
       108   8        0       0     8	 long unsigned int  flags;
       319  13        0       8    40	 union       {
       319  13        0       8    40          struct          {
       236   2        0       8    16              union       {
       236   2        0       8    16                  struct list_head       lru {
       236   1        0       8     8                      struct list_head*  next;
         0   1        0      16     8                      struct list_head*  prev;
                                                       };
       236   2        0       8    16                  struct          {
       236   1        0       8     8                      void*      __filler;
         0   1        0      16     4                      unsigned int       mlock_count;
                                                       };
       236   2        0       8    16                  struct list_head       buddy_list {
       236   1        0       8     8                      struct list_head*  next;
         0   1        0      16     8                      struct list_head*  prev;
                                                       };
       236   2        0       8    16                  struct list_head       pcp_list {
       236   1        0       8     8                      struct list_head*  next;
         0   1        0      16     8                      struct list_head*  prev;
                                                       };
                                                   };
        82   4        0      24     8              struct address_space*      mapping;
         1   7        0      32     8              union       {
         1   7        0      32     8                  long unsigned int      index;
         1   7        0      32     8                  long unsigned int      share;
                                                   };
         0   0        0      40     8              long unsigned int  private;
                                                                 };
     This uses the existing annotate code, calling objdump to do the
     disassembly, with improvements to avoid having this take too long,
     but longer term a switch to a disassembler library, possibly
     reusing code in the kernel will be pursued.
     This is the initial implementation, please use it and report
     impressions and bugs. Make sure the kernel-debuginfo packages match
     the running kernel. The 'perf report' phase for non short perf.data
     files may take a while.
     There is a great article about it on LWN:
       https://lwn.net/Articles/955709/ - "Data-type profiling for perf"
     One last test I did while writing this text, on a AMD Ryzen 5950X,
     using a distro kernel, while doing a simple 'find /' on an
     otherwise idle system resulted in:
     # uname -r
     6.6.9-100.fc38.x86_64
     # perf -vv | grep BPF_
                      bpf: [ on  ]  # HAVE_LIBBPF_SUPPORT
            bpf_skeletons: [ on  ]  # HAVE_BPF_SKEL
     # rpm -qa | grep kernel-debuginfo
     kernel-debuginfo-common-x86_64-6.6.9-100.fc38.x86_64
     kernel-debuginfo-6.6.9-100.fc38.x86_64
     #
     # perf mem record -a --filter 'mem_op == load || mem_op == store, ip > 0x8000000000000000'
     ^C[ perf record: Woken up 1 times to write data ]
     [ perf record: Captured and wrote 2.199 MB perf.data (2913 samples) ]
     #
     # ls -la perf.data
     -rw-------. 1 root root 
2346486 Jan  9 18:36 perf.data
     # perf evlist
     ibs_op//
     dummy:u
     # perf evlist -v
     ibs_op//: type: 11, size: 136, config: 0, { sample_period, sample_freq }: 4000, sample_type: IP|TID|TIME|ADDR|CPU|PERIOD|IDENTIFIER|DATA_SRC|WEIGHT, read_format: ID, disabled: 1, inherit: 1, freq: 1, sample_id_all: 1
     dummy:u: type: 1 (PERF_TYPE_SOFTWARE), size: 136, config: 0x9 (PERF_COUNT_SW_DUMMY), { sample_period, sample_freq }: 1, sample_type: IP|TID|TIME|ADDR|CPU|IDENTIFIER|DATA_SRC|WEIGHT, read_format: ID, inherit: 1, exclude_kernel: 1, exclude_hv: 1, mmap: 1, comm: 1, task: 1, mmap_data: 1, sample_id_all: 1, exclude_guest: 1, mmap2: 1, comm_exec: 1, ksymbol: 1, bpf_event: 1
     #
     # perf report -s type,typeoff --hierarchy --group --stdio
     # Total Lost Samples: 0
     #
     # Samples: 2K of events 'ibs_op//, dummy:u'
     # Event count (approx.): 
1904553038
     #
     #            Overhead  Data Type / Data Type Offset
     # ...................  ............................
     #
         73.70%   0.00%     (unknown)
            73.70%   0.00%     (unknown) +0 (no field)
          3.01%   0.00%     long unsigned int
             3.00%   0.00%     long unsigned int +0 (no field)
             0.01%   0.00%     long unsigned int +2 (no field)
          2.73%   0.00%     struct task_struct
             1.71%   0.00%     struct task_struct +52 (on_cpu)
             0.38%   0.00%     struct task_struct +2104 (rcu_read_unlock_special.b.blocked)
             0.23%   0.00%     struct task_struct +2100 (rcu_read_lock_nesting)
             0.14%   0.00%     struct task_struct +2384 ()
             0.06%   0.00%     struct task_struct +3096 (signal)
             0.05%   0.00%     struct task_struct +3616 (cgroups)
             0.05%   0.00%     struct task_struct +2344 (active_mm)
             0.02%   0.00%     struct task_struct +46 (flags)
             0.02%   0.00%     struct task_struct +2096 (migration_disabled)
             0.01%   0.00%     struct task_struct +24 (__state)
             0.01%   0.00%     struct task_struct +3956 (mm_cid_active)
             0.01%   0.00%     struct task_struct +1048 (cpus_ptr)
             0.01%   0.00%     struct task_struct +184 (se.group_node.next)
             0.01%   0.00%     struct task_struct +20 (thread_info.cpu)
             0.00%   0.00%     struct task_struct +104 (on_rq)
             0.00%   0.00%     struct task_struct +2456 (pid)
          1.36%   0.00%     struct module
             0.59%   0.00%     struct module +952 (kallsyms)
             0.42%   0.00%     struct module +0 (state)
             0.23%   0.00%     struct module +8 (list.next)
             0.12%   0.00%     struct module +216 (syms)
          0.95%   0.00%     struct inode
             0.41%   0.00%     struct inode +40 (i_sb)
             0.22%   0.00%     struct inode +0 (i_mode)
             0.06%   0.00%     struct inode +76 (i_rdev)
             0.06%   0.00%     struct inode +56 (i_security)
     <SNIP>
  perf top/report:
   - Don't ignore job control, allowing control+Z + bg to work.
   - Add s390 raw data interpretation for PAI (Processor Activity
     Instrumentation) counters.
  perf archive:
   - Add new option '--all' to pack perf.data with DSOs.
   - Add new option '--unpack' to expand tarballs.
  Initialization speedups:
   - Lazily initialize zstd streams to save memory when not using it.
   - Lazily allocate/size mmap event copy.
   - Lazy load kernel symbols in 'perf record'.
   - Be lazier in allocating lost samples buffer in 'perf record'.
   - Don't synthesize BPF events when disabled via the command line
     (perf record --no-bpf-event).
  Assorted improvements:
   - Show note on AMD systems that the :p, :pp, :ppp and :P are all the
     same, as IBS (Instruction Based Sampling) is used and it is
     inherentely precise, not having levels of precision like in Intel
     systems.
   - When 'cycles' isn't available, fall back to the "task-clock" event
     when not system wide, not to 'cpu-clock'.
   - Add --debug-file option to redirect debug output, e.g.:
       $ perf --debug-file /tmp/perf.log record -v true
   - Shrink 'struct map' to under one cacheline by avoiding function
     pointers for selecting if addresses are identity or DSO relative,
     and using just a byte for some boolean struct members.
   - Resolve the arch specific strerrno just once to use in
     perf_env__arch_strerrno().
   - Reduce memory for recording PERF_RECORD_LOST_SAMPLES event.
  Assorted fixes:
   - Fix the default 'perf top' usage on Intel hybrid systems, now it
     starts with a browser showing the number of samples for Efficiency
     (cpu_atom/cycles/P) and Performance (cpu_core/cycles/P). This
     behaviour is similar on ARM64, with its respective set of
     big.LITTLE processors.
   - Fix segfault on build_mem_topology() error path.
   - Fix 'perf mem' error on hybrid related to availability of mem event
     in a PMU.
   - Fix missing reference count gets (map, maps) in the db-export code.
   - Avoid recursively taking env->bpf_progs.lock in the 'perf_env'
     code.
   - Use the newly introduced maps__for_each_map() to add missing
     locking around iteration of 'struct map' entries.
   - Parse NOTE segments until the build id is found, don't stop on the
     first one, ELF files may have several such NOTE segments.
   - Remove 'egrep' usage, its deprecated, use 'grep -E' instead.
   - Warn first about missing libelf, not libbpf, that depends on
     libelf.
   - Use alternative to 'find ... -printf' as this isn't supported in
     busybox.
   - Address python 3.6 DeprecationWarning for string scapes.
   - Fix memory leak in uniq() in libsubcmd.
   - Fix man page formatting for 'perf lock'
   - Fix some spelling mistakes.
  perf tests:
   - Fail shell tests that needs some symbol in perf itself if it is
     stripped. These tests check if a symbol is resolved, if some hot
     function is indeed detected by profiling, etc.
   - The 'perf test sigtrap' test is currently failing on PREEMPT_RT,
     skip it if sleeping spinlocks are detected (using BTF) and point to
     the mailing list discussion about it. This test is also being
     skipped on several architectures (powerpc, s390x, arm and aarch64)
     due to other pending issues with intruction breakpoints.
   - Adjust test case perf record offcpu profiling tests for s390.
   - Fix 'Setup struct perf_event_attr' fails on s390 on z/VM guest,
     addressing issues caused by the fallback from cycles to task-clock
     done in this release.
   - Fix mask for VG register in the user-regs test.
   - Use shellcheck on 'perf test' shell scripts automatically to make
     sure changes don't introduce things it flags as problematic.
   - Add option to change objdump binary and allow it to be set via
     'perf config'.
   - Add basic 'perf script', 'perf list --json" and 'perf diff' tests.
   - Basic branch counter support.
   - Make DSO tests a suite rather than individual.
   - Remove atomics from test_loop to avoid test failures.
   - Fix call chain match on powerpc for the record+probe_libc_inet_pton
     test.
   - Improve Intel hybrid tests.
  Vendor event files (JSON):
  powerpc:
   - Update datasource event name to fix duplicate events on IBM's
     Power10.
   - Add PVN for HX-C2000 CPU with Power8 Architecture.
  Intel:
   - Alderlake/rocketlake metric fixes.
   - Update emeraldrapids events to v1.02.
   - Update icelakex events to v1.23.
   - Update sapphirerapids events to v1.17.
   - Add skx, clx, icx and spr upi bandwidth metric.
  AMD:
   - Add Zen 4 memory controller events.
  RISC-V:
   - Add StarFive Dubhe-80 and Dubhe-90 JSON files.
       https://www.starfivetech.com/en/site/cpu-u
   - Add T-HEAD C9xx JSON file.
       https://github.com/riscv-software-src/opensbi/blob/master/docs/platform/thead-c9xx.md
  ARM64:
   - Remove UTF-8 characters from cmn.json, that were causing build
     failure in some distros.
   - Add core PMU events and metrics for Ampere One X.
   - Rename Ampere One's BPU_FLUSH_MEM_FAULT to GPC_FLUSH_MEM_FAULT
  libperf:
   - Rename several perf_cpu_map constructor names to clarify what they
     really do.
   - Ditto for some other methods, coping with some issues in their
     semantics, like perf_cpu_map__empty() ->
     perf_cpu_map__has_any_cpu_or_is_empty().
   - Document perf_cpu_map__nr()'s behavior
  perf stat:
   - Exit if parse groups fails.
   - Combine the -A/--no-aggr and --no-merge options.
   - Fix help message for --metric-no-threshold option.
  Hardware tracing:
  ARM64 CoreSight:
   - Bump minimum OpenCSD version to ensure a bugfix is present.
   - Add 'T' itrace option for timestamp trace
   - Set start vm addr of exectable file to 0 and don't ignore first
     sample on the arm-cs-trace-disasm.py 'perf script'"
* tag 'perf-tools-for-v6.8-1-2024-01-09' of git://git.kernel.org/pub/scm/linux/kernel/git/perf/perf-tools: (179 commits)
  MAINTAINERS: Add Namhyung as tools/perf/ co-maintainer
  perf test: test case 'Setup struct perf_event_attr' fails on s390 on z/vm
  perf db-export: Fix missing reference count get in call_path_from_sample()
  perf tests: Add perf script test
  libsubcmd: Fix memory leak in uniq()
  perf TUI: Don't ignore job control
  perf vendor events intel: Update sapphirerapids events to v1.17
  perf vendor events intel: Update icelakex events to v1.23
  perf vendor events intel: Update emeraldrapids events to v1.02
  perf vendor events intel: Alderlake/rocketlake metric fixes
  perf x86 test: Add hybrid test for conflicting legacy/sysfs event
  perf x86 test: Update hybrid expectations
  perf vendor events amd: Add Zen 4 memory controller events
  perf stat: Fix hard coded LL miss units
  perf record: Reduce memory for recording PERF_RECORD_LOST_SAMPLES event
  perf env: Avoid recursively taking env->bpf_progs.lock
  perf annotate: Add --insn-stat option for debugging
  perf annotate: Add --type-stat option for debugging
  perf annotate: Support event group display
  perf annotate: Add --data-type option
  ...
Linus Torvalds [Fri, 19 Jan 2024 21:49:16 +0000 (13:49 -0800)]
 
Merge tag 'strlcpy-removal-v6.8-rc1' of git://git./linux/kernel/git/kees/linux
Pull strlcpy removal from Kees Cook:
 "As promised, this is 'part 2' of the hardening tree, late in -rc1 now
  that all the other trees with strlcpy() removals have landed. One new
  user appeared (in bcachefs) but was a trivial refactor. The kernel is
  now free of the strlcpy() API!
   - Remove of the final (very recent) user of strlcpy() (in bcachefs)
   - Remove the strlcpy() API. Long live strscpy()"
* tag 'strlcpy-removal-v6.8-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/kees/linux:
  string: Remove strlcpy()
  bcachefs: Replace strlcpy() with strscpy()
Linus Torvalds [Fri, 19 Jan 2024 21:36:15 +0000 (13:36 -0800)]
 
Merge tag 'arm64-fixes' of git://git./linux/kernel/git/arm64/linux
Pull arm64 fixes from Will Deacon:
 "I think the main one is fixing the dynamic SCS patching when full LTO
  is enabled (clang was silently getting this horribly wrong), but it's
  all good stuff.
  Rob just pointed out that the fix to the workaround for erratum
  #
2966298 might not be necessary, but in the worst case it's harmless
  and since the official description leaves a little to be desired here,
  I've left it in.
  Summary:
   - Fix shadow call stack patching with LTO=full
   - Fix voluntary preemption of the FPSIMD registers from assembly code
   - Fix workaround for A520 CPU erratum #
2966298 and extend to A510
   - Fix SME issues that resulted in corruption of the register state
   - Minor fixes (missing includes, formatting)"
* tag 'arm64-fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/arm64/linux:
  arm64: Fix silcon-errata.rst formatting
  arm64/sme: Always exit sme_alloc() early with existing storage
  arm64/fpsimd: Remove spurious check for SVE support
  arm64/ptrace: Don't flush ZA/ZT storage when writing ZA via ptrace
  arm64: entry: simplify kernel_exit logic
  arm64: entry: fix ARM64_WORKAROUND_SPECULATIVE_UNPRIV_LOAD
  arm64: errata: Add Cortex-A510 speculative unprivileged load workaround
  arm64: Rename ARM64_WORKAROUND_2966298
  arm64: fpsimd: Bring cond_yield asm macro in line with new rules
  arm64: scs: Work around full LTO issue with dynamic SCS
  arm64: irq: include <linux/cpumask.h>
Linus Torvalds [Fri, 19 Jan 2024 21:30:49 +0000 (13:30 -0800)]
 
Merge tag 'loongarch-6.8' of git://git./linux/kernel/git/chenhuacai/linux-loongson
Pull LoongArch updates from Huacai Chen:
 - Raise minimum clang version to 18.0.0
 - Enable initial Rust support for LoongArch
 - Add built-in dtb support for LoongArch
 - Use generic interface to support crashkernel=X,[high,low]
 - Some bug fixes and other small changes
 - Update the default config file.
* tag 'loongarch-6.8' of git://git.kernel.org/pub/scm/linux/kernel/git/chenhuacai/linux-loongson: (22 commits)
  MAINTAINERS: Add BPF JIT for LOONGARCH entry
  LoongArch: Update Loongson-3 default config file
  LoongArch: BPF: Prevent out-of-bounds memory access
  LoongArch: BPF: Support 64-bit pointers to kfuncs
  LoongArch: Fix definition of ftrace_regs_set_instruction_pointer()
  LoongArch: Use generic interface to support crashkernel=X,[high,low]
  LoongArch: Fix and simplify fcsr initialization on execve()
  LoongArch: Let cores_io_master cover the largest NR_CPUS
  LoongArch: Change SHMLBA from SZ_64K to PAGE_SIZE
  LoongArch: Add a missing call to efi_esrt_init()
  LoongArch: Parsing CPU-related information from DTS
  LoongArch: dts: DeviceTree for Loongson-2K2000
  LoongArch: dts: DeviceTree for Loongson-2K1000
  LoongArch: dts: DeviceTree for Loongson-2K0500
  LoongArch: Allow device trees be built into the kernel
  dt-bindings: interrupt-controller: loongson,liointc: Fix dtbs_check warning for interrupt-names
  dt-bindings: interrupt-controller: loongson,liointc: Fix dtbs_check warning for reg-names
  dt-bindings: loongarch: Add Loongson SoC boards compatibles
  dt-bindings: loongarch: Add CPU bindings for LoongArch
  LoongArch: Enable initial Rust support
  ...