git.maquefel.me Git - linux.git/log

Merge patch series "scsi: target: Allow userspace to config cmd submission"

Mike Christie <michael.christie@oracle.com> says:

The following patches were made over Linus's tree but apply over
Martin's branches. They allow userspace to configure how fabric
drivers submit cmds to backend drivers.

Right now loop and vhost use a worker thread, and the other drivers
submit from the contexts they receive/process the cmd from. For
multiple LUN cases where the target can queue more cmds than the
backend can handle then deferring to a worker thread is safest because
the backend driver can block when doing things like waiting for a free
request/tag. Deferring also helps when the target has to handle
transport level requests from the recv context.

For cases where the backend devices can queue everything the target
sends, then there is no need to defer to a workqueue and you can see a
perf boost of up to 26% for small IO workloads. For a nvme device and
vhost-scsi I can see with 4K IOs:

fio jobs        1       2       4       8       10
--------------------------------------------------
workqueue
submit        94K     190K    394K    770K    890K

direct
submit       128K     252K    488K    950K    -

Link: https://lore.kernel.org/r/1b1f7a5c-0988-45f9-b103-dfed2c0405b1@oracle.com
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>

scsi: target: Export fabric driver direct submit settings

This exports the fabric driver's direct submit settings, so users know what
the driver supports. It will be helpful when they are exporting a device
through different targets and one doesn't support direct submission.

The new files allow the fabric to report what submission types they default
to and if they support direct submission:

default_submit_type:

1 - TARGET_DIRECT_SUBMIT - If the user has not requested a specific value
then the fabric requests direct submission.

2 - TARGET_QUEUE_SUBMIT - If the user has not requested a specific value
then the fabric requests queued submission.

Note that these fabric values are based on what the fabric driver currently
defaults to for compat with exiting setups.

direct_submit_supported:

0 - The fabric does not support direct submission.

1 - The fabric supports direct submission.

Signed-off-by: Mike Christie <michael.christie@oracle.com>
Link: https://lore.kernel.org/r/20230928020907.5730-8-michael.christie@oracle.com
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>

scsi: target: core: Unexport target_queue_submission()

target_queue_submission() is not called by drivers anymore so unexport it.

Signed-off-by: Mike Christie <michael.christie@oracle.com>
Link: https://lore.kernel.org/r/20230928020907.5730-7-michael.christie@oracle.com
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>

scsi: target: Allow userspace to request direct submissions

This allows userspace to request the fabric drivers do direct submissions
if they support it. With the new device file, submit_type, users can
write 0 - 2 to control how commands are submitted to the backend:

0 - TARGET_FABRIC_DEFAULT_SUBMIT - LIO will use the fabric's default
     submission type. This is the default for compat.

1 - TARGET_DIRECT_SUBMIT - LIO will submit the cmd to the backend from the
     calling context if the fabric the cmd was received on supports it,
     else it will use the fabric's default type.

2 - TARGET_QUEUE_SUBMIT - LIO will queue the cmd to the LIO submission
     workqueue which will pass it to the backend.

When using an NVMe drive and vhost-scsi with direct submission we see
around a 20% improvement in 4K I/Os:

fio jobs        1       2       4       8       10
--------------------------------------------------
defer           94K     190K    394K    770K    890K
direct          128K    252K    488K    950K    -

And when using the queueing mode, we now no longer see issues like where
the iSCSI tx thread is blocked in the block layer waiting on a tag so it
can't respond to a nop or perform I/Os for other LUs.

Signed-off-by: Mike Christie <michael.christie@oracle.com>
Link: https://lore.kernel.org/r/20230928020907.5730-6-michael.christie@oracle.com
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>

scsi: target: core: Kill transport_handle_cdb_direct()

Move the code from transport_handle_cdb_direct() to target_submit() and
have iSCSI call target_submit().

Signed-off-by: Mike Christie <michael.christie@oracle.com>
Link: https://lore.kernel.org/r/20230928020907.5730-5-michael.christie@oracle.com
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>

scsi: target: core: Move buffer clearing hack

Move the hack to clear some buffers to transport_handle_cdb_direct() so we
can eventually merge transport_handle_cdb_direct() and target_submit().

This also fixes up the comment so it's clear it was only for udev and
reflects that the referenced function does not exist and we now allow more
than 1 page for control CDBs.

Signed-off-by: Mike Christie <michael.christie@oracle.com>
Link: https://lore.kernel.org/r/20230928020907.5730-4-michael.christie@oracle.com
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>

scsi: target: core: Move core_alua_check_nonop_delay() call

Move core_alua_check_nonop_delay() to transport_handle_cdb_direct() so the
iSCSI target driver doesn't have to call as many core functions
directly. We will eventually merge transport_handle_cdb_direct and
target_submit so iSCSI and the other drivers call a common function.

It will also be helpful as preparation for future changes which allow the
iSCSI target to defer command submission to the LIO submission workqueue,
because we will have a common submission function for that which will be
based on transport_handle_cdb_direct()/target_submit().

Signed-off-by: Mike Christie <michael.christie@oracle.com>
Link: https://lore.kernel.org/r/20230928020907.5730-3-michael.christie@oracle.com
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>

scsi: target: Have drivers report if they support direct submissions

In some cases, like with multiple LUN targets or where the target has to
respond to transport level requests from the receiving context it can be
better to defer cmd submission to a helper thread. If the backend driver
blocks on something like request/tag allocation it can block the entire
target submission path and other LUs and transport IO on that session.

In other cases like single LUN targets with storage that can support all
the commands that the target can queue, then it's best to submit the cmd
to the backend from the target's cmd receiving context.

Subsequent commits will allow the user to config what they prefer, but
drivers like loop can't directly submit because they can be called from a
context that can't sleep. And, drivers like vhost-scsi can support direct
submission, but need to keep their default behavior of deferring execution
to avoid possible regressions where the backend can block.

Make the drivers tell LIO core if they support direct submissions and their
current default, so we can prevent users from misconfiguring the system and
initialize devices correctly.

Signed-off-by: Mike Christie <michael.christie@oracle.com>
Link: https://lore.kernel.org/r/20230928020907.5730-2-michael.christie@oracle.com
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>

scsi: target: iscs: Make write_pending_must_be_called a bit field

Subsequent commits add more on/off type of settings to the
target_core_fabric_ops struct so this makes write_pending_must_be_called a
bit field instead of a bool to better organize the settings.

Signed-off-by: Mike Christie <michael.christie@oracle.com>
Link: https://lore.kernel.org/r/20230928020907.5730-1-michael.christie@oracle.com
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>

Merge patch series "scsi: EH rework prep patches, part 1"

Hannes Reinecke <hare@suse.de> says:

Hi all,

(taking up an old thread:) here's the first batch of patches for my EH
rework. It modifies the reset callbacks for SCSI drivers such that
the final conversion to drop the 'struct scsi_cmnd' argument and use
the entity in question (host, bus, target, device) as the argument to
the SCSI EH callbacks becomes possible. The first part covers drivers
which just requires minor tweaks.

Link: https://lore.kernel.org/r/20231002154328.43718-1-hare@suse.de
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>

scsi: mpi3mr: Split off bus_reset function from host_reset

SCSI EH host reset is the final callback in the escalation chain; once we
reach this we need to reset the controller. As such it defeats the purpose
to skip controller reset if no I/Os are pending and the RAID device is to
be reset; especially after kexec there might be stale commands pending in
firmware for which we have no reference whatsoever. So this patch splits
off the check for pending I/O into a 'bus_reset' function, and leaves the
actual controller reset to the host reset.

Signed-off-by: Hannes Reinecke <hare@suse.de>
Link: https://lore.kernel.org/r/20231002154328.43718-19-hare@suse.de
Cc: Kashyap Desai <kashyap.desai@broadcom.com>
Cc: Sathya Prakash Veerichetty <sathya.prakash@broadcom.com>
Cc: Sumit Saxena <sumit.saxena@broadcom.com>
Cc: Sreekanth Reddy <sreekanth.reddy@broadcom.com>
Reviewed-by: Christoph Hellwig <hch@lst.de>
Reviewed-by: Johannes Thumshirn <johannes.thumshirn@wdc.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>

scsi: pmcraid: Select device in pmcraid_eh_target_reset_handler()

The reset code requires a device to be selected, but we shouldn't rely on
the command to provide a device for us. So select the first device on the
target when sending down a target reset.

Signed-off-by: Hannes Reinecke <hare@suse.de>
Link: https://lore.kernel.org/r/20231002154328.43718-18-hare@suse.de
Reviewed-by: Christoph Hellwig <hch@lst.de>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>

scsi: pmcraid: Select device in pmcraid_eh_bus_reset_handler()

The reset code requires a device to be selected, but we shouldn't rely on
the command to provide a device for us. So select the first device on the
bus when sending down a bus reset.

Signed-off-by: Hannes Reinecke <hare@suse.de>
Link: https://lore.kernel.org/r/20231002154328.43718-17-hare@suse.de
Reviewed-by: Christoph Hellwig <hch@lst.de>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>

scsi: qla1280: Separate out host reset function from qla1280_error_action()

There's not much in common between host reset and all other error handlers,
so use a separate function here.

Signed-off-by: Hannes Reinecke <hare@suse.de>
Link: https://lore.kernel.org/r/20231002154328.43718-16-hare@suse.de
Reviewed-by: Christoph Hellwig <hch@lst.de>
Reviewed-by: Johannes Thumshirn <johannes.thumshirn@wdc.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>

scsi: sym53c8xx_2: Rework reset handling

Split off the combined abort and device reset handling into distinct
functions. And rename the current device reset handler into a target reset
handler, seeing that it really is a target reset.

Signed-off-by: Hannes Reinecke <hare@suse.de>
Link: https://lore.kernel.org/r/20231002154328.43718-15-hare@suse.de
Reviewed-by: Christoph Hellwig <hch@lst.de>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>

scsi: sym53c8xx_2: Split off bus reset from host reset

The current handler does both, bus reset and host reset. So split them off
into two distinct functions.

Signed-off-by: Hannes Reinecke <hare@suse.de>
Link: https://lore.kernel.org/r/20231002154328.43718-14-hare@suse.de
Cc: Matthew Wilcox <willy@infradead.org>
Reviewed-by: Christoph Hellwig <hch@lst.de>
Reviewed-by: Johannes Thumshirn <johannes.thumshirn@wdc.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>

scsi: ips: Do not try to abort command from host reset

The code for aborting an outstanding command is a copy of the functionality
from command abort. As we already have called this function once we reach
host reset there's no point in trying to do so again.

Signed-off-by: Hannes Reinecke <hare@suse.de>
Link: https://lore.kernel.org/r/20231002154328.43718-13-hare@suse.de
Cc: Adaptec OEM Raid Solutions <aacraid@microsemi.com>
Reviewed-by: Christoph Hellwig <hch@lst.de>
Reviewed-by: Johannes Thumshirn <johannes.thumshirn@wdc.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>

scsi: megaraid: Pass in NULL scb for host reset

When calling a host reset we shouldn't rely on the command triggering the
reset, so allow megaraid_abort_and_reset() to be called with a NULL scb.
And drop the pointless 'bus_reset' and 'target_reset' handlers, which just
call the same function as host_reset.

Signed-off-by: Hannes Reinecke <hare@suse.de>
Link: https://lore.kernel.org/r/20231002154328.43718-12-hare@suse.de
Reviewed-by: Christoph Hellwig <hch@lst.de>
Reviewed-by: Johannes Thumshirn <johannes.thumshirn@wdc.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>

scsi: ibmvfc: Open-code reset loop for target reset

For target reset we need a device to send the target reset to, so open-code
the loop in target reset to send the target reset TMF to the correct
device.

Signed-off-by: Hannes Reinecke <hare@suse.de>
Link: https://lore.kernel.org/r/20231002154328.43718-11-hare@suse.de
Cc: Tyrel Datwyler <tyreld@linux.ibm.com>
Reviewed-by: Christoph Hellwig <hch@lst.de>
Reviewed-by: Johannes Thumshirn <johannes.thumshirn@wdc.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>

scsi: aic79xx: Do not reference SCSI command when resetting device

When sending a device reset we should not take a reference to the SCSI
command.

Signed-off-by: Hannes Reinecke <hare@suse.de>
Link: https://lore.kernel.org/r/20231002154328.43718-10-hare@suse.de
Reviewed-by: Christoph Hellwig <hch@lst.de>
Reviewed-by: Johannes Thumshirn <johannes.thumshirn@wdc.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>

scsi: aic79xx: Make BUILD_SCSIID() a function

Convert BUILD_SCSIID() into a function and add a scsi_device argument.

Signed-off-by: Hannes Reinecke <hare@suse.de>
Link: https://lore.kernel.org/r/20231002154328.43718-9-hare@suse.de
Reviewed-by: Johannes Thumshirn <johannes.thumshirn@wdc.com>
Reviewed-by: Christoph Hellwig <hch@lst.de>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>

scsi: aic7xxx: Do not reference SCSI command when resetting device

When sending a device reset we should not take a reference to the SCSI
command.

Signed-off-by: Hannes Reinecke <hare@suse.de>
Link: https://lore.kernel.org/r/20231002154328.43718-8-hare@suse.de
Reviewed-by: Christoph Hellwig <hch@lst.de>
Reviewed-by: Johannes Thumshirn <johannes.thumshirn@wdc.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>

scsi: aic7xxx: Make BUILD_SCSIID() a function

Convert BUILD_SCSIID() into a function and add a scsi_device argument.

Reported-by: kernel test robot <lkp@intel.com>
Signed-off-by: Hannes Reinecke <hare@suse.de>
Link: https://lore.kernel.org/r/20231002154328.43718-7-hare@suse.de
Reviewed-by: Johannes Thumshirn <johannes.thumshirn@wdc.com>
Reviewed-by: Christoph Hellwig <hch@lst.de>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>

scsi: bnx2fc: Do not rely on a SCSI command for LUN or target reset

When a LUN or target reset is issued, we should not rely on a SCSI command
to be present; we'll have to reset the entire device or target anyway.

Signed-off-by: Hannes Reinecke <hare@suse.de>
Link: https://lore.kernel.org/r/20231002154328.43718-6-hare@suse.de
Cc: Saurav Kashyap <skashyap@marvell.com>
Reviewed-by: Christoph Hellwig <hch@lst.de>
Reviewed-by: Johannes Thumshirn <johannes.thumshirn@wdc.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>

scsi: qedf: Use FC rport as argument for qedf_initiate_tmf()

When sending a TMF we're only concerned with the rport and the LUN ID, so
use struct fc_rport as argument for qedf_initiate_tmf().

Signed-off-by: Hannes Reinecke <hare@suse.de>
Link: https://lore.kernel.org/r/20231002154328.43718-5-hare@suse.de
Cc: Saurav Kashyap <skashyap@marvell.com>
Reviewed-by: Christoph Hellwig <hch@lst.de>
Reviewed-by: Johannes Thumshirn <johannes.thumshirn@wdc.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>

scsi: message: fusion: Open-code mptfc_block_error_handler() for bus reset

When calling bus_reset we have potentially several ports to be reset, so
this patch open-codes the existing mptfc_block_error_handler() to wait for
all ports attached to this bus.

Signed-off-by: Hannes Reinecke <hare@suse.de>
Link: https://lore.kernel.org/r/20231002154328.43718-4-hare@suse.de
Reviewed-by: Christoph Hellwig <hch@lst.de>
Reviewed-by: Johannes Thumshirn <johannes.thumshirn@wdc.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>

scsi: message: fusion: Correct definitions for mptscsih_dev_reset()

mptscsih_dev_reset() is _not_ a device reset, but rather a target
reset. Nevertheless it's being used for either purpose. This patch adds a
correct implementation for mptscsih_dev_reset(), and renames the original
function to mptscsih_target_reset().

Signed-off-by: Hannes Reinecke <hare@suse.de>
Link: https://lore.kernel.org/r/20231002154328.43718-3-hare@suse.de
Reviewed-by: Christoph Hellwig <hch@lst.de>
Reviewed-by: Johannes Thumshirn <johannes.thumshirn@wdc.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>

scsi: message: fusion: Simplify mptfc_block_error_handler()

Instead of passing in a function to mptfc_block_error_handler() we can as
well return a status and call the function afterwards.

Signed-off-by: Hannes Reinecke <hare@suse.de>
Link: https://lore.kernel.org/r/20231002154328.43718-2-hare@suse.de
Reviewed-by: Christoph Hellwig <hch@lst.de>
Reviewed-by: Johannes Thumshirn <johannes.thumshirn@wdc.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>

scsi: ibmvfc: Use 'unsigned int' for single-bit bitfields in 'struct ibmvfc_host'

Clang warns (or errors with CONFIG_WERROR=y) several times along the
lines of:

  drivers/scsi/ibmvscsi/ibmvfc.c:650:17: warning: implicit truncation from 'int' to a one-bit wide bit-field changes value from 1 to -1 [-Wsingle-bit-bitfield-constant-conversion]
    650 |                 vhost->reinit = 1;
        |                               ^ ~

A single-bit signed integer bitfield only has possible values of -1 and
0, not 0 and 1 like an unsigned one would. No context appears to check
the actual value of these bitfields, just whether or not it is zero.
However, it is easy enough to change the type of the fields to 'unsigned
int', which keeps the same size in memory and resolves the warning.

Fixes: 5144905884e2 ("scsi: ibmvfc: Use a bitfield for boolean flags")
Signed-off-by: Nathan Chancellor <nathan@kernel.org>
Link: https://lore.kernel.org/r/20231010-ibmvfc-fix-bitfields-type-v1-1-37e95b5a60e5@kernel.org
Reviewed-by: Nick Desaulniers <ndesaulniers@google.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>

scsi: libfc: Fix potential NULL pointer dereference in fc_lport_ptp_setup()

fc_lport_ptp_setup() did not check the return value of fc_rport_create()
which can return NULL and would cause a NULL pointer dereference. Address
this issue by checking return value of fc_rport_create() and log error
message on fc_rport_create() failed.

Signed-off-by: Wenchao Hao <haowenchao2@huawei.com>
Link: https://lore.kernel.org/r/20231011130350.819571-1-haowenchao2@huawei.com
Reviewed-by: Simon Horman <horms@kernel.org>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>

scsi: cxgbi: Fix 'generated' typo

Fix 'generated' typo.

Signed-off-by: Muhammad Muzammil <m.muzzammilashraf@gmail.com>
Link: https://lore.kernel.org/r/20231013055121.12310-1-m.muzzammilashraf@gmail.com
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>

scsi: ufs: core: Fix abnormal scale up after scale down

When no active_reqs, devfreq_monitor (thread A) will suspend clock scaling.
But it may have racing with clk_scaling.suspend_work (thread B) and
actually not suspend clock scaling (requeue after suspend). Next time
after polling_ms, devfreq_monitor read clk_scaling.window_start_t = 0 then
scale up clock abnormal.

Below is racing step:
devfreq->work (Thread A)
devfreq_monitor
update_devfreq
.....
ufshcd_devfreq_target
queue_work(hba->clk_scaling.workq,
1 &hba->clk_scaling.suspend_work)
.....
5 queue_delayed_work(devfreq_wq, &devfreq->work,
msecs_to_jiffies(devfreq->profile->polling_ms));

2 hba->clk_scaling.suspend_work (Thread B)
ufshcd_clk_scaling_suspend_work
__ufshcd_suspend_clkscaling
devfreq_suspend_device(hba->devfreq);
3 cancel_delayed_work_sync(&devfreq->work);
4 hba->clk_scaling.window_start_t = 0;
.....

Signed-off-by: Peter Wang <peter.wang@mediatek.com>
Link: https://lore.kernel.org/r/20230831130826.5592-4-peter.wang@mediatek.com
Reviewed-by: Bart Van Assche <bvanassche@acm.org>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>

scsi: ufs: core: Fix abnormal scale up after last cmd finish

When ufshcd_clk_scaling_suspend_work (thread A) running and new command
coming, ufshcd_clk_scaling_start_busy (thread B) may get host_lock after
thread A first time release host_lock. Then thread A second time get
host_lock will set clk_scaling.window_start_t = 0 which scale up clock
abnormal next polling_ms time. Also inlines another
__ufshcd_suspend_clkscaling calls.

Below is racing step:
1 hba->clk_scaling.suspend_work (Thread A)
ufshcd_clk_scaling_suspend_work
2 spin_lock_irqsave(hba->host->host_lock, irq_flags);
3 hba->clk_scaling.is_suspended = true;
4 spin_unlock_irqrestore(hba->host->host_lock, irq_flags);
__ufshcd_suspend_clkscaling
7 spin_lock_irqsave(hba->host->host_lock, flags);
8 hba->clk_scaling.window_start_t = 0;
9 spin_unlock_irqrestore(hba->host->host_lock, flags);

ufshcd_send_command (Thread B)
ufshcd_clk_scaling_start_busy
5 spin_lock_irqsave(hba->host->host_lock, flags);
....
6 spin_unlock_irqrestore(hba->host->host_lock, flags);

Signed-off-by: Peter Wang <peter.wang@mediatek.com>
Link: https://lore.kernel.org/r/20230831130826.5592-3-peter.wang@mediatek.com
Reviewed-by: Bart Van Assche <bvanassche@acm.org>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>

scsi: ufs: core: Only suspend clock scaling if scaling down

If clock scale up and suspend clock scaling, ufs will keep high
performance/power mode but no read/write requests on going. It is logic
wrong and have power concern.

Signed-off-by: Peter Wang <peter.wang@mediatek.com>
Link: https://lore.kernel.org/r/20230831130826.5592-2-peter.wang@mediatek.com
Reviewed-by: Bart Van Assche <bvanassche@acm.org>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>

scsi: ufs: qcom: Remove unnecessary check

The "attr" pointer points to an offset into the "host" struct so it can't
be NULL. Delete the if statement and pull the code in a tab.

Signed-off-by: Dan Carpenter <dan.carpenter@linaro.org>
Link: https://lore.kernel.org/r/fe3b8fcd-64a7-4887-bddd-32239a88a6a3@moroto.mountain
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>

scsi: ufs: ufs-pci: Switch to use acpi_evaluate_dsm_typed()

The acpi_evaluate_dsm_typed() provides a way to check the type of the
object evaluated by _DSM call. Use it instead of open coded variant.

Signed-off-by: Andy Shevchenko <andriy.shevchenko@linux.intel.com>
Link: https://lore.kernel.org/r/20231002135125.2602895-1-andriy.shevchenko@linux.intel.com
Reviewed-by: Avri Altman <avri.altman@wdc.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>

scsi: ufs: core: Remove dev cmd clock scaling busy

If a dev command times out, clk_scaling.active_reqs is not decreased which
causes abnormal clock scaling.

It is complicated to handle different dev command timeout cases in both
legacy mode and MCQ mode. Besides, dev cmds are rarely used and the busy
time is short.

Remove clock scaling busy window for dev cmds like we do for UIC or TM cmds
which don't update busy window either.

Signed-off-by: Peter Wang <peter.wang@mediatek.com>
Link: https://lore.kernel.org/r/20231004062454.29165-1-peter.wang@mediatek.com
Reviewed-by: Stanley Chu <stanley.chu@mediatek.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>

scsi: message: fusion: Replace deprecated strncpy() with strscpy()

strncpy() is deprecated for use on NUL-terminated destination strings [1]
and as such we should prefer more robust and less ambiguous string
interfaces.

The only caller of mptsas_exp_repmanufacture_info() is
mptsas_probe_one_phy() which can allocate rphy in either
sas_end_device_alloc() or sas_expander_alloc(). Both of which
zero-allocate:

| rdev = kzalloc(sizeof(*rdev), GFP_KERNEL);

... this is supplied to mptsas_exp_repmanufacture_info() as edev meaning
that no future NUL-padding of edev members is needed.

Considering the above, a suitable replacement is strscpy() [2] due to the
fact that it guarantees NUL-termination on the destination buffer without
unnecessarily NUL-padding.

Also use the more idiomatic strscpy() pattern of (dest, src, sizeof(dest)).

Link: https://www.kernel.org/doc/html/latest/process/deprecated.html#strncpy-on-nul-terminated-strings
Link: https://manpages.debian.org/testing/linux-manual-4.8/strscpy.9.en.html
Link: https://github.com/KSPP/linux/issues/90
Cc: linux-hardening@vger.kernel.org
Cc: Kees Cook <keescook@chromium.org>
Signed-off-by: Justin Stitt <justinstitt@google.com>
Link: https://lore.kernel.org/r/20231003-strncpy-drivers-message-fusion-mptsas-c-v2-1-5ce07e60bd21@google.com
Reviewed-by: Kees Cook <keescook@chromium.org>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>

scsi: message: fusion: Replace deprecated strncpy() with strscpy_pad()

strncpy() is deprecated for use on NUL-terminated destination strings [1]
and as such we should prefer more robust and less ambiguous string
interfaces.

Since all these structs are copied out to userspace let's keep them
NUL-padded by using strscpy_pad() which guarantees NUL-termination of the
destination buffer while also providing the NUL-padding behavior that
strncpy() has.

Let's also opt to use the more idiomatic strscpy() usage of: 'dest, src,
sizeof(dest)' in cases where the compiler can determine the size of the
destination buffer. Do this for all cases of strscpy...() in this file.

To be abundantly sure we don't leak stack data out to user space let's also
change a strscpy() to strscpy_pad(). This strscpy() was introduced in
commit dbe37c71d124 ("scsi: message: fusion: Replace all non-returning
strlcpy() with strscpy()")

Note that since we are creating these structs with a copy_from_user() and
modifying fields and then copying back out to the user it is probably OK
not to explicitly NUL-pad everything as any data leak is probably just data
from the user themselves. If this is too eager, let's opt for strscpy()
which is still in the spirit of removing deprecated strncpy() usage
treewide.

Link: https://www.kernel.org/doc/html/latest/process/deprecated.html#strncpy-on-nul-terminated-strings
Link: https://github.com/KSPP/linux/issues/90
Cc: linux-hardening@vger.kernel.org
Cc: Kees Cook <keescook@chromium.org>
Signed-off-by: Justin Stitt <justinstitt@google.com>
Link: https://lore.kernel.org/r/20230927-strncpy-drivers-message-fusion-mptctl-c-v1-1-bb2eddc1743c@google.com
Reviewed-by: Kees Cook <keescook@chromium.org>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>

scsi: ufs: core: WLUN send SSU timeout recovery

When runtime PM send SSU times out, the SCSI core invokes
eh_host_reset_handler, in this case ufshcd_eh_host_reset_handler(), which
is then stuck waiting for flush_work(&hba->eh_work). However,
ufshcd_err_handler hangs in wait RPM resume. Do link recovery only in this
case. The following IO hang stack dump was observed:

kworker/4:0     D
<ffffffd7d31f6fb4> __switch_to+0x180/0x344
<ffffffd7d31f779c> __schedule+0x5ec/0xa14
<ffffffd7d31f7c3c> schedule+0x78/0xe0
<ffffffd7d31fefbc> schedule_timeout+0xb0/0x15c
<ffffffd7d31f8120> io_schedule_timeout+0x48/0x70
<ffffffd7d31f8e40> do_wait_for_common+0x108/0x19c
<ffffffd7d31f837c> wait_for_completion_io_timeout+0x50/0x78
<ffffffd7d2876bc0> blk_execute_rq+0x1b8/0x218
<ffffffd7d2b4297c> scsi_execute_cmd+0x148/0x238
<ffffffd7d2da7358> ufshcd_set_dev_pwr_mode+0xe8/0x244
<ffffffd7d2da7e40> __ufshcd_wl_resume+0x1e0/0x45c
<ffffffd7d2da7b28> ufshcd_wl_runtime_resume+0x3c/0x174
<ffffffd7d2b4f290> scsi_runtime_resume+0x7c/0xc8
<ffffffd7d2ae1d48> __rpm_callback+0xa0/0x410
<ffffffd7d2ae0128> rpm_resume+0x43c/0x67c
<ffffffd7d2ae1e98> __rpm_callback+0x1f0/0x410
<ffffffd7d2ae014c> rpm_resume+0x460/0x67c
<ffffffd7d2ae1450> pm_runtime_work+0xa4/0xac
<ffffffd7d22e39ac> process_one_work+0x208/0x598
<ffffffd7d22e3fc0> worker_thread+0x228/0x438
<ffffffd7d22eb038> kthread+0x104/0x1d4
<ffffffd7d22171a0> ret_from_fork+0x10/0x20

scsi_eh_0       D
<ffffffd7d31f6fb4> __switch_to+0x180/0x344
<ffffffd7d31f779c> __schedule+0x5ec/0xa14
<ffffffd7d31f7c3c> schedule+0x78/0xe0
<ffffffd7d31fef50> schedule_timeout+0x44/0x15c
<ffffffd7d31f8e40> do_wait_for_common+0x108/0x19c
<ffffffd7d31f8234> wait_for_completion+0x48/0x64
<ffffffd7d22deb88> __flush_work+0x260/0x2d0
<ffffffd7d22de918> flush_work+0x10/0x20
<ffffffd7d2da4728> ufshcd_eh_host_reset_handler+0x88/0xcc
<ffffffd7d2b41da4> scsi_try_host_reset+0x48/0xe0
<ffffffd7d2b410fc> scsi_eh_ready_devs+0x934/0xa40
<ffffffd7d2b41618> scsi_error_handler+0x168/0x374
<ffffffd7d22eb038> kthread+0x104/0x1d4
<ffffffd7d22171a0> ret_from_fork+0x10/0x20

kworker/u16:5   D
<ffffffd7d31f6fb4> __switch_to+0x180/0x344
<ffffffd7d31f779c> __schedule+0x5ec/0xa14
<ffffffd7d31f7c3c> schedule+0x78/0xe0
<ffffffd7d2adfe00> rpm_resume+0x114/0x67c
<ffffffd7d2adfca8> __pm_runtime_resume+0x70/0xb4
<ffffffd7d2d9cf48> ufshcd_err_handler+0x1a0/0xe68
<ffffffd7d22e39ac> process_one_work+0x208/0x598
<ffffffd7d22e3fc0> worker_thread+0x228/0x438
<ffffffd7d22eb038> kthread+0x104/0x1d4
<ffffffd7d22171a0> ret_from_fork+0x10/0x20

Signed-off-by: Peter Wang <peter.wang@mediatek.com>
Link: https://lore.kernel.org/r/20230927033557.13801-1-peter.wang@mediatek.com
Reviewed-by: Bart Van Assche <bvanassche@acm.org>
Reviewed-by: Stanley Chu <stanley.chu@mediatek.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>

Merge patch series "ibmvfc: fixes and generic prep work for NVMeoF support"

Tyrel Datwyler <tyreld@linux.ibm.com> says:

This series includes a couple minor fixes, generalization of some code
that is not protocol specific, and a reworking of the way event pool
buffers are accounted for by the driver. This is a precursor to a
series to follow that introduces support for NVMeoF protocol with
ibmvfc.

Link: https://lore.kernel.org/r/20230921225435.3537728-1-tyreld@linux.ibm.com
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>

scsi: fnic: Clean up some inconsistent indenting

No functional modification involved.

drivers/scsi/fnic/fnic_fcs.c:152 fnic_handle_link() warn: inconsistent indenting.

Reported-by: Abaci Robot <abaci@linux.alibaba.com>
Closes: https://bugzilla.openanolis.cn/show_bug.cgi?id=6678
Signed-off-by: Jiapeng Chong <jiapeng.chong@linux.alibaba.com>
Link: https://lore.kernel.org/r/20230922100657.14566-1-jiapeng.chong@linux.alibaba.com
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>

scsi: target: tcmu: Annotate struct tcmu_tmr with __counted_by

Prepare for the coming implementation by GCC and Clang of the __counted_by
attribute. Flexible array members annotated with __counted_by can have
their accesses bounds-checked at run-time checking via CONFIG_UBSAN_BOUNDS
(for array indexing) and CONFIG_FORTIFY_SOURCE (for strcpy/memcpy-family
functions).

As found with Coccinelle[1], add __counted_by for struct tcmu_tmr.

[1] https://github.com/kees/kernel-tools/blob/trunk/coccinelle/examples/counted_by.cocci

Cc: Bodo Stroesser <bostroesser@gmail.com>
Cc: "Martin K. Petersen" <martin.petersen@oracle.com>
Cc: linux-scsi@vger.kernel.org
Cc: target-devel@vger.kernel.org
Signed-off-by: Kees Cook <keescook@chromium.org>
Link: https://lore.kernel.org/r/20230922175300.work.148-kees@kernel.org
Reviewed-by: "Gustavo A. R. Silva" <gustavoars@kernel.org>
Reviewed-by: Bodo Stroesser <bostroesser@gmail.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>

Merge patch series "UFS core patches"

Bart Van Assche <bvanassche@acm.org> says:

Hi Martin,

Please consider these UFS core patches for the next merge window.

Thanks,

Bart.

Link: https://lore.kernel.org/r/20230921192335.676924-1-bvanassche@acm.org
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>

scsi: ufs: core: Set the Command Priority (CP) flag for RT requests

Make the UFS device execute realtime (RT) requests before other requests.
This will be used in Android to reduce the I/O latency of the foreground
app.

Note: UFS devices do not support CDL so using CDL is not a viable
alternative.

Signed-off-by: Bart Van Assche <bvanassche@acm.org>
Link: https://lore.kernel.org/r/20230921192335.676924-5-bvanassche@acm.org
Reviewed-by: Avri Altman <avri.altman@wdc.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>

scsi: ufs: core: Simplify ufshcd_comp_scsi_upiu()

ufshcd_comp_scsi_upiu() has one caller and that caller ensures that
lrbp->cmd != NULL. Hence leave out the lrbp->cmd check from
ufshcd_comp_scsi_upiu().

Signed-off-by: Bart Van Assche <bvanassche@acm.org>
Link: https://lore.kernel.org/r/20230921192335.676924-4-bvanassche@acm.org
Reviewed-by: Avri Altman <avri.altman@wdc.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>

scsi: ufs: core: Move the 4K alignment code into the Exynos driver

The DMA alignment for the Exynos controller follows directly from the PRDT
segment size configured in ufs-exynos.c. Hence, move the DMA alignment code
into the Exynos driver source code.

Cc: Alim Akhtar <alim.akhtar@samsung.com>
Signed-off-by: Bart Van Assche <bvanassche@acm.org>
Link: https://lore.kernel.org/r/20230921192335.676924-3-bvanassche@acm.org
Reviewed-by: Alim Akhtar <alim.akhtar@samsung.com>
Tested-by: Alim Akhtar <alim.akhtar@samsung.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>

scsi: ufs: core: Remove request tag range checks

The block layer core guarantees that tag numbers are in the expected
range. Hence remove the statements that check this. This patch suppresses
Coverity warnings about left shifts with a negative right hand operand.
The following commit originally introduced request tag range checks:
14497328b6a6 ("scsi: ufs: verify command tag validity").

Cc: daejun7.park@samsung.com
Cc: John Garry <john.g.garry@oracle.com>
Signed-off-by: Bart Van Assche <bvanassche@acm.org>
Link: https://lore.kernel.org/r/20230921192335.676924-2-bvanassche@acm.org
Reviewed-by: John Garry <john.g.garry@oracle.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>

scsi: target: Remove the references to http://www.linux-iscsi.org/

The website http://www.linux-iscsi.org/ disappeared more than a year ago.
DNS records have been removed for linux-iscsi.org. The company that
sponsored this website (Datera; formerly called Rising Tide) has been
liquidated in early 2021 according to
https://blocksandfiles.com/2021/03/19/datera-is-being-liquidated/. Since
it is unlikely that the website http://www.linux-iscsi.org/ will be
restored, remove the references to that website.

Reviewed-by: Hannes Reinecke <hare@suse.de>
Signed-off-by: Bart Van Assche <bvanassche@acm.org>
Link: https://lore.kernel.org/r/20230920200232.3721784-2-bvanassche@acm.org
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>

scsi: ibmvfc: Add protocol field to target structure

Add a per target protocol field so target code can determine correct
protocol specific actions as well as identify the correct channel group
target list.

Signed-off-by: Tyrel Datwyler <tyreld@linux.ibm.com>
Link: https://lore.kernel.org/r/20230921225435.3537728-12-tyreld@linux.ibm.com
Reviewed-by: Brian King <brking@linux.vnet.ibm.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>

scsi: ibmvfc: Make discovery buffer per protocol channel group

The target discovery buffer that the VIOS populates with targets is
currently a host adapter field. To facilitate the discovery of NVMe targets
as well as SCSI another discovery buffer is required. Move the discovery
buffer out of the host struct and into the ibmvfc_channels struct so that
each channels instance for a given protocol has its own discovery buffer.

Signed-off-by: Tyrel Datwyler <tyreld@linux.ibm.com>
Link: https://lore.kernel.org/r/20230921225435.3537728-11-tyreld@linux.ibm.com
Reviewed-by: Brian King <brking@linux.vnet.ibm.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>

scsi: ibmvfc: Add protocol field to ibmvfc_channels

There are cases in the generic code where protocol specific configuration
or actions may need to be taken. Add a protocol field to struct
ibmvfc_channels and initial IBMVFC_PROTO_[SCSI/NVME] definitions.

Signed-off-by: Tyrel Datwyler <tyreld@linux.ibm.com>
Link: https://lore.kernel.org/r/20230921225435.3537728-10-tyreld@linux.ibm.com
Reviewed-by: Brian King <brking@linux.vnet.ibm.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>

scsi: ibmvfc: Make channel allocation generic

With the coming of NVMeoF support the driver will need to also allocate
channels for NVMe. Implement generic channel allocation wrappers that can
be used for both SCSI and NVMeoF protocol setup.

Signed-off-by: Tyrel Datwyler <tyreld@linux.ibm.com>
Link: https://lore.kernel.org/r/20230921225435.3537728-9-tyreld@linux.ibm.com
Reviewed-by: Brian King <brking@linux.vnet.ibm.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>

scsi: ibmvfc: Track max and desired queue size in ibmvfc_channels

Add fields for desired and max number of queues to ibmvfc_channels. With
support for NVMeoF protocol coming these sorts of values should be tracked
in the protocol specific channel struct instead of the overarching host
adapter.

Signed-off-by: Tyrel Datwyler <tyreld@linux.ibm.com>
Link: https://lore.kernel.org/r/20230921225435.3537728-8-tyreld@linux.ibm.com
Reviewed-by: Brian King <brking@linux.vnet.ibm.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>

scsi: ibmvfc: Rename ibmvfc_scsi_channels to ibmvfc_channels

There is nothing scsi specific about the ibmvfc_scsi_channels struct. It is
meant to encapsulate a set of channels regardless of protocol.

Remove _scsi from the struct name to reflect this genric nature.

Signed-off-by: Tyrel Datwyler <tyreld@linux.ibm.com>
Link: https://lore.kernel.org/r/20230921225435.3537728-7-tyreld@linux.ibm.com
Reviewed-by: Brian King <brking@linux.vnet.ibm.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>

scsi: ibmvfc: Use a bitfield for boolean flags

There are currently 9 binary flag fields in the ibmvfc host
structure. Converting each of these to a single bitfield reduces the foot
print of the structure by 32 bytes.

Signed-off-by: Tyrel Datwyler <tyreld@linux.ibm.com>
Link: https://lore.kernel.org/r/20230921225435.3537728-6-tyreld@linux.ibm.com
Reviewed-by: Brian King <brking@linux.vnet.ibm.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>

scsi: ibmvfc: Fix erroneous use of rtas_busy_delay with hcall return code

Commit 0217a272fe13 ("scsi: ibmvfc: Store return code of H_FREE_SUB_CRQ
during cleanup") wrongly changed the busy loop check to use
rtas_busy_delay() instead of H_BUSY and H_IS_LONG_BUSY(). The busy return
codes for RTAS and hypercalls are not the same.

Fix this issue by restoring the use of H_BUSY and H_IS_LONG_BUSY().

Fixes: 0217a272fe13 ("scsi: ibmvfc: Store return code of H_FREE_SUB_CRQ during cleanup")
Signed-off-by: Tyrel Datwyler <tyreld@linux.ibm.com>
Link: https://lore.kernel.org/r/20230921225435.3537728-5-tyreld@linux.ibm.com
Reviewed-by: Brian King <brking@linux.vnet.ibm.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>

scsi: ibmvfc: Limit max hw queues by num_online_cpus()

An LPAR could potentially be configured with a small logical cpu count that
is less then the default hardware queue max. Ensure that we don't allocate
more hw queues than available cpus.

Signed-off-by: Tyrel Datwyler <tyreld@linux.ibm.com>
Link: https://lore.kernel.org/r/20230921225435.3537728-4-tyreld@linux.ibm.com
Reviewed-by: Brian King <brking@linux.vnet.ibm.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>

scsi: ibmvfc: Implement channel queue depth and event buffer accounting

Extend ibmvfc_queue, ibmvfc_event, and ibmvfc_event_pool to provide queue
depths for general I/O commands and reserved commands as well as proper
accounting of the free events of each type from the general event
pool. Further, calculate the negotiated max command limit with the VIOS at
NPIV login time as a function of the number of queues times their total
queue depth (general and reserved depths combined).

This does away with the legacy max_request value, and allows the driver to
better manage and track it resources.

Signed-off-by: Tyrel Datwyler <tyreld@linux.ibm.com>
Link: https://lore.kernel.org/r/20230921225435.3537728-3-tyreld@linux.ibm.com
Reviewed-by: Brian King <brking@linux.vnet.ibm.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>

scsi: ibmvfc: Remove BUG_ON in the case of an empty event pool

In practice the driver should never send more commands than are allocated
to a queue's event pool. In the unlikely event that this happens, the code
asserts a BUG_ON, and in the case that the kernel is not configured to
crash on panic returns a junk event pointer from the empty event list
causing things to spiral from there. This BUG_ON is a historical artifact
of the ibmvfc driver first being upstreamed, and it is well known now that
the use of BUG_ON is bad practice except in the most unrecoverable
scenario. There is nothing about this scenario that prevents the driver
from recovering and carrying on.

Remove the BUG_ON in question from ibmvfc_get_event() and return a NULL
pointer in the case of an empty event pool. Update all call sites to
ibmvfc_get_event() to check for a NULL pointer and perfrom the appropriate
failure or recovery action.

Signed-off-by: Tyrel Datwyler <tyreld@linux.ibm.com>
Link: https://lore.kernel.org/r/20230921225435.3537728-2-tyreld@linux.ibm.com
Reviewed-by: Brian King <brking@linux.vnet.ibm.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>

scsi: hisi_sas: Allocate DFX memory during dump trigger

Currently, if CONFIG_SCSI_HISI_SAS_DEBUGFS_DEFAULT_ENABLE is enabled, the
memory space used by DFX is allocated during device initialization, which
occupies a large number of memory resources. The memory usage before and
after the driver is loaded is as follows:

Memory usage before the driver is loaded:
$ free -m
         total        used        free      shared  buff/cache   available
Mem:     867352        2578      864037          11         735  861681
Swap:    4095           0        4095

Memory usage after the driver which include 4 HBAs is loaded:
$ insmod hisi_sas_v3_hw.ko
$ free -m
         total        used        free      shared  buff/cache available
Mem:     867352        4760      861848          11   743   859495
Swap:    4095           0        4095

The driver with 4 HBAs connected will allocate about 110 MB of memory
without enabling debugfs.

Therefore, to avoid wasting memory resources, DFX memory is allocated
during dump triggering. The dump may fail due to memory allocation
failure. After this change, each dump costs about 10 MB of memory, and each
dump lasts about 100 ms.

Signed-off-by: Yihang Li <liyihang9@huawei.com>
Signed-off-by: Xiang Chen <chenxiang66@hisilicon.com>
Link: https://lore.kernel.org/r/1694571327-78697-4-git-send-email-chenxiang66@hisilicon.com
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>

scsi: hisi_sas: Directly call register snapshot instead of using workqueue

Currently, register information dump is performed via workqueue, regardless
of the trigger mode (automatic or manual). There is a delay in dumping
register through workqueue, the exact register information at trigger time
cannot be obtained.

Call register snapshot directly instead of through a workqueue.

Signed-off-by: Yihang Li <liyihang9@huawei.com>
Signed-off-by: Xiang Chen <chenxiang66@hisilicon.com>
Link: https://lore.kernel.org/r/1694571327-78697-3-git-send-email-chenxiang66@hisilicon.com
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>

scsi: hisi_sas: Set debugfs_dir pointer to NULL after removing debugfs

If init debugfs failed during device registration due to memory allocation
failure, debugfs_remove_recursive() is called, after which debugfs_dir is
not set to NULL. debugfs_remove_recursive() will be called again during
device removal. As a result, illegal pointer is accessed.

[ 1665.467244] hisi_sas_v3_hw 0000:b4:02.0: failed to init debugfs!
...
[ 1669.836708] Unable to handle kernel NULL pointer dereference at virtual address 00000000000000a0
[ 1669.872669] pc : down_write+0x24/0x70
[ 1669.876315] lr : down_write+0x1c/0x70
[ 1669.879961] sp : ffff000036f53a30
[ 1669.883260] x29: ffff000036f53a30 x28: ffffa027c31549f8
[ 1669.888547] x27: ffffa027c3140000 x26: 0000000000000000
[ 1669.893834] x25: ffffa027bf37c270 x24: ffffa027bf37c270
[ 1669.899122] x23: ffff0000095406b8 x22: ffff0000095406a8
[ 1669.904408] x21: 0000000000000000 x20: ffffa027bf37c310
[ 1669.909695] x19: 00000000000000a0 x18: ffff8027dcd86f10
[ 1669.914982] x17: 0000000000000000 x16: 0000000000000000
[ 1669.920268] x15: 0000000000000000 x14: ffffa0274014f870
[ 1669.925555] x13: 0000000000000040 x12: 0000000000000228
[ 1669.930842] x11: 0000000000000020 x10: 0000000000000bb0
[ 1669.936129] x9 : ffff000036f537f0 x8 : ffff80273088ca10
[ 1669.941416] x7 : 000000000000001d x6 : 00000000ffffffff
[ 1669.946702] x5 : ffff000008a36310 x4 : ffff80273088be00
[ 1669.951989] x3 : ffff000009513e90 x2 : 0000000000000000
[ 1669.957276] x1 : 00000000000000a0 x0 : ffffffff00000001
[ 1669.962563] Call trace:
[ 1669.965000]  down_write+0x24/0x70
[ 1669.968301]  debugfs_remove_recursive+0x5c/0x1b0
[ 1669.972905]  hisi_sas_debugfs_exit+0x24/0x30 [hisi_sas_main]
[ 1669.978541]  hisi_sas_v3_remove+0x130/0x150 [hisi_sas_v3_hw]
[ 1669.984175]  pci_device_remove+0x48/0xd8
[ 1669.988082]  device_release_driver_internal+0x1b4/0x250
[ 1669.993282]  device_release_driver+0x28/0x38
[ 1669.997534]  pci_stop_bus_device+0x84/0xb8
[ 1670.001611]  pci_stop_and_remove_bus_device_locked+0x24/0x40
[ 1670.007244]  remove_store+0xfc/0x140
[ 1670.010802]  dev_attr_store+0x44/0x60
[ 1670.014448]  sysfs_kf_write+0x58/0x80
[ 1670.018095]  kernfs_fop_write+0xe8/0x1f0
[ 1670.022000]  __vfs_write+0x60/0x190
[ 1670.025472]  vfs_write+0xac/0x1c0
[ 1670.028771]  ksys_write+0x6c/0xd8
[ 1670.032071]  __arm64_sys_write+0x24/0x30
[ 1670.035977]  el0_svc_common+0x78/0x130
[ 1670.039710]  el0_svc_handler+0x38/0x78
[ 1670.043442]  el0_svc+0x8/0xc

To fix this, set debugfs_dir to NULL after debugfs_remove_recursive().

Signed-off-by: Yihang Li <liyihang9@huawei.com>
Signed-off-by: Xingui Yang <yangxingui@huawei.com>
Signed-off-by: Xiang Chen <chenxiang66@hisilicon.com>
Link: https://lore.kernel.org/r/1694571327-78697-2-git-send-email-chenxiang66@hisilicon.com
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>

scsi: ufs: Convert all platform drivers to return void

The .remove() callback for a platform driver returns an int which makes
many driver authors wrongly assume it's possible to do error handling by
returning an error code. However the value returned is ignored (apart from
emitting a warning) and this typically results in resource leaks. To
improve here there is a quest to make the remove callback return void. In
the first step of this quest all drivers are converted to .remove_new()
which already returns void. Eventually after all drivers are converted,
.remove_new() is renamed to .remove().

All platform drivers below drivers/ufs/ unconditionally return zero in
their remove callback and so can be converted trivially to the variant
returning void.

Signed-off-by: Uwe Kleine-König <u.kleine-koenig@pengutronix.de>
Link: https://lore.kernel.org/r/20230917145722.1131557-1-u.kleine-koenig@pengutronix.de
Reviewed-by: Bean Huo <beanhuo@micron.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>

Merge patch series "scsi: pm8001: Bug fix and cleanup"

Damien Le Moal <dlemoal@kernel.org> says:

The first patch of this series fixes an issue with IRQ setup which
prevents the controller from resuming after a system suspend. The
following patches are code cleanup without any functional changes.

[mkp: The first patch went into v6.6-rc2 and thus this merge
constitutes the remaining patches of the series]

Link: https://lore.kernel.org/r/20230911232745.325149-1-dlemoal@kernel.org
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>

scsi: pm8001: Remove PM8001_READ_VPD

Remove the macro PM8001_READ_VPD used to define if a controller WWN should
be retrieved from the device. Instead, define the better named boolean
module parameter "read_wwn" to control this.

The code to set a fixed address for a phy device address when read_wwn is
set to false is simplified and fixed to avoid sparse warnings.

Signed-off-by: Damien Le Moal <dlemoal@kernel.org>
Link: https://lore.kernel.org/r/20230911232745.325149-11-dlemoal@kernel.org
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>

scsi: pm8001: Remove PM8001_USE_TASKLET

Remove the macro PM8001_USE_TASKLET used to conditionally use tasklets for
MSI-X interrupts handling and replace it with the boolean module parameter
pm8001_use_tasklet. This parameter defaults to true and can be true only if
pm8001_use_msix is also true.

Code conditionnaly defined with PM8001_USE_TASKLET is modified to instead
use the parameter pm8001_use_tasklet.

Signed-off-by: Damien Le Moal <dlemoal@kernel.org>
Link: https://lore.kernel.org/r/20230911232745.325149-10-dlemoal@kernel.org
Acked-by: Jack Wang <jinpu.wang@ionos.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>

scsi: pm8001: Remove PM8001_USE_MSIX

The pm8001 driver does not compile if PM8001_USE_MSIX is not defined in
pm8001_sas.h because various fields and functions conditionally defined are
used unconditionally without a "#ifdef PM8001_USE_MSIX" protection. This
macro is rather useless anyway and not convenient as diabling MSI-X use
requires recompiling the driver.

Remove this macro and replace it with the bool module parameter "use_msix"
which defaults to true. The use of MSI-X interrupts for an adapter is gated
by this module parameter for adapters that actually support MSI-X. The
"use_msix" boolean field is added to struct pm8001_hba_info and all code
defined depending on PM8001_USE_MSIX is modified to rely on
pm8001_hba_info->use_msix instead.

Signed-off-by: Damien Le Moal <dlemoal@kernel.org>
Link: https://lore.kernel.org/r/20230911232745.325149-9-dlemoal@kernel.org
Acked-by: Jack Wang <jinpu.wang@ionos.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>

scsi: pm8001: Remove pm80xx_chip_intx_interrupt_enable/disable()

Remove the functions pm80xx_chip_intx_interrupt_enable() and
pm80xx_chip_intx_interrupt_disable() and open code them respectively in
pm80xx_chip_interrupt_enable() and pm80xx_chip_interrupt_disable().

Signed-off-by: Damien Le Moal <dlemoal@kernel.org>
Link: https://lore.kernel.org/r/20230911232745.325149-8-dlemoal@kernel.org
Acked-by: Jack Wang <jinpu.wang@ionos.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>

scsi: pm8001: Simplify pm8001_chip_interrupt_enable/disable()

pm8001_chip_msix_interrupt_enable() and
pm8001_chip_msix_interrupt_disable() are always cold with the vector
argument equal to 0. This allows simplifying the code for these
functions. With this change, the functions are simple enough and can be
removed by open coding them directly in pm8001_chip_interrupt_enable() and
pm8001_chip_interrupt_disable(). Also do the same for the functions
pm8001_chip_intx_interrupt_enable() and
pm8001_chip_intx_interrupt_disable() and remove these functions.

Signed-off-by: Damien Le Moal <dlemoal@kernel.org>
Link: https://lore.kernel.org/r/20230911232745.325149-7-dlemoal@kernel.org
Acked-by: Jack Wang <jinpu.wang@ionos.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>

scsi: pm8001: Introduce pm8001_handle_irq()

Factor out the common code of pm8001_interrupt_handler_msix and of
pm8001_interrupt_handler_intx() into the new function pm8001_handle_irq()
and use this new helper in these two functions to simplify the code.

Signed-off-by: Damien Le Moal <dlemoal@kernel.org>
Link: https://lore.kernel.org/r/20230911232745.325149-6-dlemoal@kernel.org
Acked-by: Jack Wang <jinpu.wang@ionos.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>

scsi: pm8001: Introduce pm8001_kill_tasklet()

Factor out the identical code for killing tasklets in pm8001_pci_remove()
and pm8001_pci_suspend() and instead use the new function
pm8001_kill_tasklet().

Signed-off-by: Damien Le Moal <dlemoal@kernel.org>
Link: https://lore.kernel.org/r/20230911232745.325149-5-dlemoal@kernel.org
Acked-by: Jack Wang <jinpu.wang@ionos.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>

scsi: pm8001: Introduce pm8001_init_tasklet()

Factor out the identical code for initializing tasklets in
pm8001_pci_alloc() and pm8001_pci_resume() and instead use the new function
pm8001_init_tasklet().

Signed-off-by: Damien Le Moal <dlemoal@kernel.org>
Link: https://lore.kernel.org/r/20230911232745.325149-4-dlemoal@kernel.org
Acked-by: Jack Wang <jinpu.wang@ionos.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>

scsi: pm8001: Introduce pm8001_free_irq()

Instead of repeating the same code twice in pm8001_pci_remove() and
pm8001_pci_suspend() to free IRQs, introduuce the function
pm8001_free_irq() to do that.

Signed-off-by: Damien Le Moal <dlemoal@kernel.org>
Link: https://lore.kernel.org/r/20230911232745.325149-3-dlemoal@kernel.org
Acked-by: Jack Wang <jinpu.wang@ionos.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>

Merge patch series "scsi: ufs: qcom: Align programming sequence as per HW spec"

Nitin Rawat <quic_nitirawa@quicinc.com> says:

This patch series adds programming support for Qualcomm UFS V4 and
above to align avoid with Hardware Specification. This patch series
will address stability and performance issues.

In this patch series below changes are taken care.

1) Register layout for DME_VS_CORE_CLK_CTRL has changed for v4 and above.
2) Adds Support to configure PA_VS_CORE_CLK_40NS_CYCLES attibute for UFS V4
   and above.
3) Adds Support to configure multiple unipro frequencies like 403MHz,
   300MHz, 202MHz, 150 MHz, 75Mhz, 37.5 MHz for Qualcomm UFS Controller V4
   and above.
4) Allow configuration of SYS1CLK_1US_REG for UFS V4 and above.

Link: https://lore.kernel.org/r/20230905052400.13935-1-quic_nitirawa@quicinc.com
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>

scsi: ufs: qcom: Rename "hs_gear" to "phy_gear"

The "hs_gear" variable is used to cache the gear setting for the PHY that
will be used during ufs_qcom_power_up_sequence(). But it creates ambiguity
with the gear setting used by the ufshcd driver.

So let's rename it to "phy_gear" to make it explicit that this variable
caches the gear setting for the PHY.

Signed-off-by: Manivannan Sadhasivam <manivannan.sadhasivam@linaro.org>
Link: https://lore.kernel.org/r/20230908145329.154024-2-manivannan.sadhasivam@linaro.org
Reviewed-by: Can Guo <quic_cang@quicinc.com>
Tested-by: Can Guo <quic_cang@quicinc.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>

scsi: ufs: qcom: Update PHY settings only when scaling to higher gears

The "hs_gear" variable is used to program the PHY settings (submode) during
ufs_qcom_power_up_sequence(). Currently, it is being updated every time the
agreed gear changes. Due to this, if the gear got downscaled before suspend
(runtime/system), then while resuming, the PHY settings for the lower gear
will be applied first and later when scaling to max gear with REINIT, the
PHY settings for the max gear will be applied.

This adds a latency while resuming and also really not needed as the PHY
gear settings are backwards compatible i.e., we can continue using the PHY
settings for max gear with lower gear speed.

So let's update the "hs_gear" variable _only_ when the agreed gear is
greater than the current one. This guarantees that the PHY settings will be
changed only during probe time and fatal error condition.

Due to this, UFSHCD_QUIRK_REINIT_AFTER_MAX_GEAR_SWITCH can now be skipped
when the PM operation is in progress.

Cc: stable@vger.kernel.org
Fixes: 96a7141da332 ("scsi: ufs: core: Add support for reinitializing the UFS device")
Reported-by: Can Guo <quic_cang@quicinc.com>
Signed-off-by: Manivannan Sadhasivam <manivannan.sadhasivam@linaro.org>
Link: https://lore.kernel.org/r/20230908145329.154024-1-manivannan.sadhasivam@linaro.org
Reviewed-by: Can Guo <quic_cang@quicinc.com>
Tested-by: Can Guo <quic_cang@quicinc.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>

scsi: ufs: qcom: Configure SYS1CLK_1US_REG for UFS V4 and above

SYS1CLK_1US represents the required number of system 1-clock cycles for one
microsecond. UFS Host Controller V4.0 and above mandates to write
SYS1CLK_1US_REG register and also these timer configuration needs to be
called from clk scaling pre ops as per HPG.

Refactor ufs_qcom_cfg_timers and add the below code support to align
with HPG.

a)Configure SYS1CLK_1US_REG for UFS V4 and above.
b)Introduce a new argument is_pre_scale_up for ufs_qcom_cfg_timers
to configure SYS1CLK_1US for max freq during prescale and link startup
condition.
c)Move ufs_qcom_cfg_timers from clk scaling post change ops
to clk scaling pre change ops.

Co-developed-by: Naveen Kumar Goud Arepalli <quic_narepall@quicinc.com>
Signed-off-by: Naveen Kumar Goud Arepalli <quic_narepall@quicinc.com>
Signed-off-by: Nitin Rawat <quic_nitirawa@quicinc.com>
Link: https://lore.kernel.org/r/20230905052400.13935-6-quic_nitirawa@quicinc.com
Reviewed-by: Can Guo <quic_cang@quicinc.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>

scsi: ufs: qcom: Align programing of unipro clk attributes

Currently CORE_CLK_1US_CYCLES, PA_VS_CORE_CLK_40NS_CYCLES are configured
in clk scaling post change ops. This is not aligning to HPG.

Move this to clk scaling pre change ops to align completely with hardware
specification.

Co-developed-by: Naveen Kumar Goud Arepalli <quic_narepall@quicinc.com>
Signed-off-by: Naveen Kumar Goud Arepalli <quic_narepall@quicinc.com>
Signed-off-by: Nitin Rawat <quic_nitirawa@quicinc.com>
Link: https://lore.kernel.org/r/20230905052400.13935-5-quic_nitirawa@quicinc.com
Reviewed-by: Can Guo <quic_cang@quicinc.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>

scsi: ufs: qcom: Add support to configure PA_VS_CORE_CLK_40NS_CYCLES

PA_VS_CORE_CLK_40NS_CYCLES attribute represents the required number of
Qunipro core clock for 40 nanoseconds. For UFS host controller V4 and above
PA_VS_CORE_CLK_40NS_CYCLES needs to be programmed as per frequency of
unipro core clk of UFS host controller.

Add Support to configure PA_VS_CORE_CLK_40NS_CYCLES for Controller V4 and
above to align with the hardware specification and to avoid functionality
issues like h8 enter/exit failure, command timeout.

Co-developed-by: Naveen Kumar Goud Arepalli <quic_narepall@quicinc.com>
Signed-off-by: Naveen Kumar Goud Arepalli <quic_narepall@quicinc.com>
Signed-off-by: Nitin Rawat <quic_nitirawa@quicinc.com>
Link: https://lore.kernel.org/r/20230905052400.13935-4-quic_nitirawa@quicinc.com
Reviewed-by: Can Guo <quic_cang@quicinc.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>

scsi: ufs: qcom: Add multiple frequency support for MAX_CORE_CLK_1US_CYCLES

Qualcomm UFS Controller V4 and above supports multiple unipro frequencies
like 403MHz, 300MHz, 202MHz, 150 MHz, 75Mhz, 37.5 MHz. Current code
supports only 150MHz and 75MHz which have performance impact due to low UFS
controller frequencies.

For targets which supports frequencies other than 150 MHz and 75 Mhz, needs
an update of MAX_CORE_CLK_1US_CYCLES to match the configured frequency to
avoid functionality issues. Add multiple frequency support for
MAX_CORE_CLK_1US_CYCLES based on the frequency configured.

Co-developed-by: Naveen Kumar Goud Arepalli <quic_narepall@quicinc.com>
Signed-off-by: Naveen Kumar Goud Arepalli <quic_narepall@quicinc.com>
Signed-off-by: Nitin Rawat <quic_nitirawa@quicinc.com>
Link: https://lore.kernel.org/r/20230905052400.13935-3-quic_nitirawa@quicinc.com
Reviewed-by: Can Guo <quic_cang@quicinc.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>

scsi: ufs: qcom: Update MAX_CORE_CLK_1US_CYCLES for UFS V4 and above

UFS Controller V4 and above, the register layout for DME_VS_CORE_CLK_CTRL
register has changed. MAX_CORE_CLK_1US_CYCLES offset has changed from 0 to
0x10 and length of attrbute is changed from 8bit to 12bit.

Add support to configure MAX_CORE_CLK_1US_CYCLES for UFS V4 and above as
per new register layout.

Co-developed-by: Naveen Kumar Goud Arepalli <quic_narepall@quicinc.com>
Signed-off-by: Naveen Kumar Goud Arepalli <quic_narepall@quicinc.com>
Signed-off-by: Nitin Rawat <quic_nitirawa@quicinc.com>
Link: https://lore.kernel.org/r/20230905052400.13935-2-quic_nitirawa@quicinc.com
Reviewed-by: Can Guo <quic_cang@quicinc.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>

scsi: imm: Add a module parameter for the transfer mode

Fix in the imm driver the same problem that was fixed in the ppa driver by
commit 68a4f84a17c1 ("scsi: ppa: Add a module parameter for the transfer
mode").

Tested and confirmed working with an Iomega Z250P zip drive and a StarTech
PEX1P2 AX99100 PCIe parallel port.

Signed-off-by: Alex Henrie <alexhenrie24@gmail.com>
Link: https://lore.kernel.org/r/20230831054620.515611-1-alexhenrie24@gmail.com
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>

scsi: libsas: Declare sas_discover_end_dev() static

sas_discover_end_dev() is defined and used only in sas_discover.c. Define
this function as static.

Signed-off-by: Damien Le Moal <dlemoal@kernel.org>
Link: https://lore.kernel.org/r/20230912230551.454357-4-dlemoal@kernel.org
Reviewed-by: John Garry <john.g.garry@oracle.com>
Reviewed-by: Johannes Thumshirn <johannes.thumshirn@wdc.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>

scsi: libsas: Declare sas_set_phy_speed() static

sas_set_phy_speed() is used only within sas_init.c. Declare this function
as static.

Signed-off-by: Damien Le Moal <dlemoal@kernel.org>
Link: https://lore.kernel.org/r/20230912230551.454357-3-dlemoal@kernel.org
Reviewed-by: John Garry <john.g.garry@oracle.com>
Reviewed-by: Johannes Thumshirn <johannes.thumshirn@wdc.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>

scsi: libsas: Move local functions declarations to sas_internal.h

Move the declarations of functions used only within libsas from
include/scsi/libsas.h to drivers/scsi/libsas/sas_internal.h

Signed-off-by: Damien Le Moal <dlemoal@kernel.org>
Link: https://lore.kernel.org/r/20230912230551.454357-2-dlemoal@kernel.org
Reviewed-by: John Garry <john.g.garry@oracle.com>
Reviewed-by: Johannes Thumshirn <johannes.thumshirn@wdc.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>

scsi: ufs: core: Do not look for unsupported vdd-hba-max-microamp

Bindings do not allow vdd-hba-max-microamp property and the driver does not
use it (does not control load of vdd-hba supply). Skip looking for this
property to avoid misleading dmesg messages:

ufshcd-qcom 1d84000.ufs: ufshcd_populate_vreg: unable to find vdd-hba-max-microamp

Signed-off-by: Krzysztof Kozlowski <krzysztof.kozlowski@linaro.org>
Link: https://lore.kernel.org/r/20230906113302.201888-1-krzysztof.kozlowski@linaro.org
Reviewed-by: AngeloGioacchino Del Regno <angelogioacchino.delregno@collabora.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>

scsi: qla2xxx: Use FIELD_GET() to extract PCIe capability fields

Use FIELD_GET() to extract PCIe capability registers field instead of
custom masking and shifting.

Signed-off-by: Ilpo Järvinen <ilpo.jarvinen@linux.intel.com>
Link: https://lore.kernel.org/r/20230913122748.29530-9-ilpo.jarvinen@linux.intel.com
Reviewed-by: Jonathan Cameron <Jonathan.Cameron@huawei.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>

scsi: esas2r: Use FIELD_GET() to extract PCIe capability fields

Use FIELD_GET() to extract PCIe capability register fields instead of
custom masking and shifting. Also remove the unnecessary cast to u8, the
value in those fields always fits to u8.

Signed-off-by: Ilpo Järvinen <ilpo.jarvinen@linux.intel.com>
Link: https://lore.kernel.org/r/20230913122748.29530-8-ilpo.jarvinen@linux.intel.com
Reviewed-by: Jonathan Cameron <Jonathan.Cameron@huawei.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>

scsi: lpfc: Prevent use-after-free during rmmod with mapped NVMe rports

During rmmod, when dev_loss_tmo callback is called, an ndlp kref count is
decremented twice.  Once for SCSI transport registration and second to
remove the initial node allocation kref.  If there is also an NVMe
transport registration, another reference count decrement is expected in
lpfc_nvme_unregister_port().

Race conditions between the NVMe transport remoteport_delete and
dev_loss_tmo callbacks sometimes results in premature ndlp object release
resulting in use-after-free issues.

Fix by not dropping the ndlp object in dev_loss_tmo callback with an
outstanding NVMe transport registration.  Inversely, mark the final
NLP_DROPPED flag in lpfc_nvme_unregister_port when rmmod flag is set.

Signed-off-by: Justin Tee <justin.tee@broadcom.com>
Link: https://lore.kernel.org/r/20230908211923.37603-1-justintee8345@gmail.com
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>

scsi: lpfc: Early return after marking final NLP_DROPPED flag in dev_loss_tmo

When a dev_loss_tmo event occurs, an ndlp lock is taken before checking
nlp_flag for NLP_DROPPED. There is an attempt to restore the ndlp lock
when exiting the if statement, but the nlp_put kref could be the final
decrement causing a use-after-free memory access on a released ndlp object.

Instead of trying to reacquire the ndlp lock after checking nlp_flag, just
return after calling nlp_put.

Signed-off-by: Justin Tee <justin.tee@broadcom.com>
Link: https://lore.kernel.org/r/20230908211852.37576-1-justintee8345@gmail.com
Reviewed-by: "Ewan D. Milne" <emilne@redhat.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>

scsi: lpfc: Fix the NULL vs IS_ERR() bug for debugfs_create_file()

Since debugfs_create_file() returns ERR_PTR and never NULL, use IS_ERR() to
check the return value.

Fixes: 2fcbc569b9f5 ("scsi: lpfc: Make debugfs ktime stats generic for NVME and SCSI")
Fixes: 4c47efc140fa ("scsi: lpfc: Move SCSI and NVME Stats to hardware queue structures")
Fixes: 6a828b0f6192 ("scsi: lpfc: Support non-uniform allocation of MSIX vectors to hardware queues")
Fixes: 95bfc6d8ad86 ("scsi: lpfc: Make FW logging dynamically configurable")
Fixes: 9f77870870d8 ("scsi: lpfc: Add debugfs support for cm framework buffers")
Fixes: c490850a0947 ("scsi: lpfc: Adapt partitioned XRI lists to efficient sharing")
Signed-off-by: Jinjie Ruan <ruanjinjie@huawei.com>
Link: https://lore.kernel.org/r/20230906030809.2847970-1-ruanjinjie@huawei.com
Reviewed-by: Justin Tee <justin.tee@broadcom.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>

scsi: ufs: core: Include the SCSI ID in UFS command tracing output

The logical unit information is missing from the UFS command tracing
output. Although the device name is logged, e.g. 13200000.ufs, this name
does not include logical unit information. Hence this patch that replaces
the device name with the SCSI ID in the tracing output. An example of
tracing output with this patch applied:

kworker/8:0H-80 [008] ..... 89.106063: ufshcd_command: send_req: 0:0:0:4: tag: 10, DB: 0x7ffffbff, size: 524288, IS: 0, LBA: 1085538, opcode: 0x8a (WRITE_16), group_id: 0x0
dd-4225 [000] d.h.. 89.106219: ufshcd_command: complete_rsp: 0:0:0:4: tag: 11, DB: 0x7ffff7ff, size: 524288, IS: 0, LBA: 1081728, opcode: 0x8a (WRITE_16), group_id: 0x0

Cc: Christoph Hellwig <hch@infradead.org>
Cc: Steven Rostedt <rostedt@goodmis.org>
Signed-off-by: Bart Van Assche <bvanassche@acm.org>
Link: https://lore.kernel.org/r/20230907183739.905938-1-bvanassche@acm.org
Reviewed-by: Can Guo <quic_cang@quicinc.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>

scsi: target: core: Fix target_cmd_counter leak

The target_cmd_counter struct allocated via target_alloc_cmd_counter() is
never freed, resulting in leaks across various transport types, e.g.:

unreferenced object 0xffff88801f920120 (size 96):
  comm "sh", pid 102, jiffies 4294892535 (age 713.412s)
  hex dump (first 32 bytes):
    07 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00  ................
    00 00 00 00 00 00 00 00 38 01 92 1f 80 88 ff ff  ........8.......
  backtrace:
    [<00000000e58a6252>] kmalloc_trace+0x11/0x20
    [<0000000043af4b2f>] target_alloc_cmd_counter+0x17/0x90 [target_core_mod]
    [<000000007da2dfa7>] target_setup_session+0x2d/0x140 [target_core_mod]
    [<0000000068feef86>] tcm_loop_tpg_nexus_store+0x19b/0x350 [tcm_loop]
    [<000000006a80e021>] configfs_write_iter+0xb1/0x120
    [<00000000e9f4d860>] vfs_write+0x2e4/0x3c0
    [<000000008143433b>] ksys_write+0x80/0xb0
    [<00000000a7df29b2>] do_syscall_64+0x42/0x90
    [<0000000053f45fb8>] entry_SYSCALL_64_after_hwframe+0x6e/0xd8

Free the structure alongside the corresponding iscsit_conn / se_sess
parent.

Signed-off-by: David Disseldorp <ddiss@suse.de>
Link: https://lore.kernel.org/r/20230831183459.6938-1-ddiss@suse.de
Fixes: becd9be6069e ("scsi: target: Move sess cmd counter to new struct")
Reviewed-by: Mike Christie <michael.christie@oracle.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>

scsi: pm8001: Setup IRQs on resume

The function pm8001_pci_resume() only calls pm8001_request_irq() without
calling pm8001_setup_irq(). This causes the IRQ allocation to fail, which
leads all drives being removed from the system.

Fix this issue by integrating the code for pm8001_setup_irq() directly
inside pm8001_request_irq() so that MSI-X setup is performed both during
normal initialization and resume operations.

Fixes: dbf9bfe61571 ("[SCSI] pm8001: add SAS/SATA HBA driver")
Cc: stable@vger.kernel.org
Signed-off-by: Damien Le Moal <dlemoal@kernel.org>
Link: https://lore.kernel.org/r/20230911232745.325149-2-dlemoal@kernel.org
Acked-by: Jack Wang <jinpu.wang@ionos.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>

scsi: pm80xx: Avoid leaking tags when processing OPC_INB_SET_CONTROLLER_CONFIG command

Tags allocated for OPC_INB_SET_CONTROLLER_CONFIG command need to be freed
when we receive the response.

Signed-off-by: Michal Grzedzicki <mge@meta.com>
Link: https://lore.kernel.org/r/20230911170340.699533-2-mge@meta.com
Acked-by: Jack Wang <jinpu.wang@ionos.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>

scsi: pm80xx: Use phy-specific SAS address when sending PHY_START command

Some cards have more than one SAS address. Using an incorrect address
causes communication issues with some devices like expanders.

Closes: https://lore.kernel.org/linux-kernel/A57AEA84-5CA0-403E-8053-106033C73C70@fb.com/
Signed-off-by: Michal Grzedzicki <mge@meta.com>
Link: https://lore.kernel.org/r/20230913155611.3183612-1-mge@meta.com
Acked-by: Jack Wang <jinpu.wang@ionos.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>

Merge branch '6.6/scsi-staging' into 6.6/scsi-fixes

Pull in staged fixes for 6.6.

Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>

Merge tag 'topic/drm-ci-2023-08-31-1' of git://anongit.freedesktop.org/drm/drm

Pull drm ci scripts from Dave Airlie:
"This is a bunch of ci integration for the freedesktop gitlab instance
  where we currently do upstream userspace testing on diverse sets of
  GPU hardware. From my perspective I think it's an experiment worth
  going with and seeing how the benefits/noise playout keeping these
  files useful.

  Ideally I'd like to get this so we can do pre-merge testing on PRs
  eventually.

  Below is some info from danvet on why we've ended up making the
  decision and how we can roll it back if we decide it was a bad plan.

  Why in upstream?

   - like documentation, testcases, tools CI integration is one of these
     things where you can waste endless amounts of time if you
     accidentally have a version that doesn't match your source code

   - but also like the above, there's a balance, this is the initial cut
     of what we think makes sense to keep in sync vs out-of-tree,
     probably needs adjustment

   - gitlab supports out-of-repo gitlab integration and that's what's
     been used for the kernel in drm, but it results in per-driver
     fragmentation and lots of duplicated effort. the simple act of
     smashing an arbitrary winner into a topic branch already started
     surfacing patches on dri-devel and sparking good cross driver team
     discussions

  Why gitlab?

   - it's not any more shit than any of the other CI

   - drm userspace uses it extensively for everything in userspace, we
     have a lot of people and experience with this, including
     integration of hw testing labs

   - media userspace like gstreamer is also on gitlab.fd.o, and there's
     discussion to extend this to the media subsystem in some fashion

  Can this be shared?

   - there's definitely a pile of code that could move to scripts/ if
     other subsystem adopt ci integration in upstream kernel git. other
     bits are more drm/gpu specific like the igt-gpu-tests/tools
     integration

   - docker images can be run locally or in other CI runners

  Will we regret this?

   - it's all in one directory, intentionally, for easy deletion

   - probably 1-2 years in upstream to see whether this is worth it or a
     Big Mistake. that's roughly what it took to _really_ roll out solid
     CI in the bigger userspace projects we have on gitlab.fd.o like
     mesa3d"

* tag 'topic/drm-ci-2023-08-31-1' of git://anongit.freedesktop.org/drm/drm:
  drm: ci: docs: fix build warning - add missing escape
  drm: Add initial ci/ subdirectory