scsi: ufs: ufshpb: Make host mode parameters configurable
author    Avri Altman <avri.altman@wdc.com>
          Mon, 12 Jul 2021 09:50:39 +0000 (12:50 +0300)
committer Martin K. Petersen <martin.petersen@oracle.com>
          Sun, 1 Aug 2021 20:05:15 +0000 (16:05 -0400)
Elaborate some more on the host control mode logic parameters, explaining
what they do and how to configure them.
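
The constraints that the new store handlers enforce between the parameters
can be summarized with a minimal userspace sketch.  This is not driver code:
the struct hcm_params type and the hcm_*() helpers are hypothetical names used
only for illustration, and only the parameters whose defaults are visible in
this patch are modeled.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical userspace mirror of a subset of the host control mode tunables. */
struct hcm_params {
	unsigned int activation_thld;
	unsigned int normalization_factor;
	unsigned int eviction_thld_enter;
	unsigned int eviction_thld_exit;
	unsigned int read_timeout_ms;
	unsigned int timeout_polling_interval_ms;
};

/* Defaults as assigned by ufshpb_hcm_param_init() in this patch. */
static struct hcm_params hcm_defaults(void)
{
	struct hcm_params p = {
		.activation_thld = 8,
		.normalization_factor = 1,
		.eviction_thld_enter = 8 << 5,	/* 256 reads */
		.eviction_thld_exit = 8 << 4,	/* 128 reads */
		.read_timeout_ms = 1000,
		.timeout_polling_interval_ms = 200,
	};

	return p;
}

/* eviction_thld_enter must be strictly above eviction_thld_exit. */
static bool hcm_enter_ok(const struct hcm_params *p, unsigned int val)
{
	return val > p->eviction_thld_exit;
}

/* eviction_thld_exit must be strictly above activation_thld. */
static bool hcm_exit_ok(const struct hcm_params *p, unsigned int val)
{
	return val > p->activation_thld;
}

/* read_timeout_ms must be at least twice the polling interval. */
static bool hcm_read_timeout_ok(const struct hcm_params *p, unsigned int val)
{
	return val >= p->timeout_polling_interval_ms * 2;
}

/* The polling interval must be non-zero and at most half of read_timeout_ms. */
static bool hcm_poll_ok(const struct hcm_params *p, unsigned int val)
{
	return val > 0 && val <= p->read_timeout_ms / 2;
}

/* Normalization is a right shift of the read counter by the factor. */
static unsigned int hcm_normalize(unsigned int reads, unsigned int factor)
{
	return reads >> factor;
}

/* The defaults themselves satisfy every constraint above. */
static void hcm_defaults_selfcheck(void)
{
	struct hcm_params p = hcm_defaults();

	assert(hcm_enter_ok(&p, p.eviction_thld_enter));
	assert(hcm_exit_ok(&p, p.eviction_thld_exit));
	assert(hcm_read_timeout_ok(&p, p.read_timeout_ms));
	assert(hcm_poll_ok(&p, p.timeout_polling_interval_ms));
}
```

Because each check reads the current value of a neighbouring parameter, the
order of writes matters when moving several of them at once: for example,
raising eviction_thld_exit above the current eviction_thld_enter is accepted
by the store handler, so the caller is expected to raise eviction_thld_enter
first.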

Link: https://lore.kernel.org/r/20210712095039.8093-13-avri.altman@wdc.com
Reviewed-by: Daejun Park <daejun7.park@samsung.com>
Signed-off-by: Avri Altman <avri.altman@wdc.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
Documentation/ABI/testing/sysfs-driver-ufs
drivers/scsi/ufs/ufshpb.c
drivers/scsi/ufs/ufshpb.h

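diff --git a/Documentation/ABI/testing/sysfs-driver-ufs b/Documentation/ABI/testing/sysfs-driver-ufs
index 929460738651f2d6a461cedaea450cdc07e4fa18..ec3a7149ced59dd53853a89dd2ab5f80d8db4b03 100644
--- a/Documentation/ABI/testing/sysfs-driver-ufs
+++ b/Documentation/ABI/testing/sysfs-driver-ufs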
@@ -1449,7 +1449,7 @@ Description:      This entry shows the maximum HPB data size for using a single HPB
 
                The file is read only.
 
-What:          /sys/bus/platform/drivers/ufshcd/*/flags/wb_enable
+What:          /sys/bus/platform/drivers/ufshcd/*/flags/hpb_enable
 Date:          June 2021
 Contact:       Daejun Park <daejun7.park@samsung.com>
 Description:   This entry shows the status of HPB.
@@ -1460,3 +1460,77 @@ Description:     This entry shows the status of HPB.
                == ============================
 
                The file is read only.
+
+What:          /sys/class/scsi_device/*/device/hpb_param_sysfs/activation_thld
+Date:          February 2021
+Contact:       Avri Altman <avri.altman@wdc.com>
+Description:   In host control mode, reads are the major source of activation
+               trials.  Once this threshold is met, the region is added to the
+               "to-be-activated" list.  Since we reset the read counter upon
+               write, this includes sending an rb command updating the region
+               ppn as well.
+
+What:          /sys/class/scsi_device/*/device/hpb_param_sysfs/normalization_factor
+Date:          February 2021
+Contact:       Avri Altman <avri.altman@wdc.com>
+Description:   In host control mode, we think of the regions as "buckets".
+               Those buckets are being filled with reads, and emptied on write.
+               We use entries_per_srgn - the number of blocks in a subregion as
+               our bucket size.  This applies because HPB1.0 only handles
+               single-block reads.  Once the bucket size is crossed, we trigger
+               a normalization work - not only to avoid overflow, but mainly
+               because we want to keep those counters normalized, as we are
+               using those reads as a comparative score, to make various decisions.
+               Normalization shifts the read counter right by
+               normalization_factor bits.  If during consecutive normalizations
+               an active region has exhausted its reads - inactivate it.
+
+What:          /sys/class/scsi_device/*/device/hpb_param_sysfs/eviction_thld_enter
+Date:          February 2021
+Contact:       Avri Altman <avri.altman@wdc.com>
+Description:   Region deactivation is often due to the fact that eviction took
+               place: A region becomes active at the expense of another. This
+               happens when the max-active-regions limit has been crossed.
+               In host mode, eviction is considered an extreme measure. We
+               want to verify that the entering region has enough reads, and
+               the exiting region has far fewer reads.  eviction_thld_enter is
+               the minimum number of reads that a region must have in order to
+               be considered a candidate for evicting another region.
+
+What:          /sys/class/scsi_device/*/device/hpb_param_sysfs/eviction_thld_exit
+Date:          February 2021
+Contact:       Avri Altman <avri.altman@wdc.com>
+Description:   Same as above for the exiting region. A region is considered to
+               be a candidate for eviction only if its read count does not
+               exceed eviction_thld_exit.
+
+What:          /sys/class/scsi_device/*/device/hpb_param_sysfs/read_timeout_ms
+Date:          February 2021
+Contact:       Avri Altman <avri.altman@wdc.com>
+Description:   In order not to hang on to "cold" regions, we inactivate
+               a region that has no READ access for a predefined amount of
+               time - read_timeout_ms. If read_timeout_ms has expired, and the
+               region is dirty, it is less likely that we can make any use of
+               HPB reading it so we inactivate it.  Still, deactivation has
+               its overhead, and we may still benefit from HPB reading this
+               region if it is clean - see read_timeout_expiries.
+
+What:          /sys/class/scsi_device/*/device/hpb_param_sysfs/read_timeout_expiries
+Date:          February 2021
+Contact:       Avri Altman <avri.altman@wdc.com>
+Description:   If the region read timeout has expired, but the region is clean,
+               just rewind its timer for another spin.  Do that as long as it
+               is clean and has not exhausted its read_timeout_expiries.
+
+What:          /sys/class/scsi_device/*/device/hpb_param_sysfs/timeout_polling_interval_ms
+Date:          February 2021
+Contact:       Avri Altman <avri.altman@wdc.com>
+Description:   The interval, in milliseconds, at which the delayed worker that
+               checks the read_timeouts is awakened.
+
+What:          /sys/class/scsi_device/*/device/hpb_param_sysfs/inflight_map_req
+Date:          February 2021
+Contact:       Avri Altman <avri.altman@wdc.com>
+Description:   In host control mode the host is the originator of map requests.
+               To avoid flooding the device with map requests, use a simple throttling
+               mechanism that limits the number of inflight map requests.
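diff --git a/drivers/scsi/ufs/ufshpb.c b/drivers/scsi/ufs/ufshpb.c
index 49f58598dba76e8b590affa1c4a51ca57393251c..54e8e019bdbe77dcd540c8b5b825b863c20985cd 100644
--- a/drivers/scsi/ufs/ufshpb.c
+++ b/drivers/scsi/ufs/ufshpb.c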
@@ -17,7 +17,6 @@
 #include "../sd.h"
 
 #define ACTIVATION_THRESHOLD 8 /* 8 IOs */
-#define EVICTION_THRESHOLD (ACTIVATION_THRESHOLD << 5) /* 256 IOs */
 #define READ_TO_MS 1000
 #define READ_TO_EXPIRIES 100
 #define POLLING_INTERVAL_MS 200
@@ -195,7 +194,7 @@ next_srgn:
                } else {
                        srgn->reads++;
                        rgn->reads++;
-                       if (srgn->reads == ACTIVATION_THRESHOLD)
+                       if (srgn->reads == hpb->params.activation_thld)
                                activate = true;
                }
                spin_unlock(&rgn->rgn_lock);
@@ -744,10 +743,11 @@ static struct ufshpb_req *ufshpb_get_map_req(struct ufshpb_lu *hpb,
        struct bio *bio;
 
        if (hpb->is_hcm &&
-           hpb->num_inflight_map_req >= THROTTLE_MAP_REQ_DEFAULT) {
+           hpb->num_inflight_map_req >= hpb->params.inflight_map_req) {
                dev_info(&hpb->sdev_ufs_lu->sdev_dev,
                         "map_req throttle. inflight %d throttle %d",
-                        hpb->num_inflight_map_req, THROTTLE_MAP_REQ_DEFAULT);
+                        hpb->num_inflight_map_req,
+                        hpb->params.inflight_map_req);
                return NULL;
        }
 
@@ -1053,6 +1053,7 @@ static void ufshpb_read_to_handler(struct work_struct *work)
        struct victim_select_info *lru_info = &hpb->lru_info;
        struct ufshpb_region *rgn, *next_rgn;
        unsigned long flags;
+       unsigned int poll;
        LIST_HEAD(expired_list);
 
        if (test_and_set_bit(TIMEOUT_WORK_RUNNING, &hpb->work_data_bits))
@@ -1071,7 +1072,7 @@ static void ufshpb_read_to_handler(struct work_struct *work)
                                list_add(&rgn->list_expired_rgn, &expired_list);
                        else
                                rgn->read_timeout = ktime_add_ms(ktime_get(),
-                                                        READ_TO_MS);
+                                               hpb->params.read_timeout_ms);
                }
        }
 
@@ -1089,8 +1090,9 @@ static void ufshpb_read_to_handler(struct work_struct *work)
 
        clear_bit(TIMEOUT_WORK_RUNNING, &hpb->work_data_bits);
 
+       poll = hpb->params.timeout_polling_interval_ms;
        schedule_delayed_work(&hpb->ufshpb_read_to_work,
-                             msecs_to_jiffies(POLLING_INTERVAL_MS));
+                             msecs_to_jiffies(poll));
 }
 
 static void ufshpb_add_lru_info(struct victim_select_info *lru_info,
@@ -1100,8 +1102,11 @@ static void ufshpb_add_lru_info(struct victim_select_info *lru_info,
        list_add_tail(&rgn->list_lru_rgn, &lru_info->lh_lru_rgn);
        atomic_inc(&lru_info->active_cnt);
        if (rgn->hpb->is_hcm) {
-               rgn->read_timeout = ktime_add_ms(ktime_get(), READ_TO_MS);
-               rgn->read_timeout_expiries = READ_TO_EXPIRIES;
+               rgn->read_timeout =
+                       ktime_add_ms(ktime_get(),
+                                    rgn->hpb->params.read_timeout_ms);
+               rgn->read_timeout_expiries =
+                       rgn->hpb->params.read_timeout_expiries;
        }
 }
 
@@ -1130,7 +1135,8 @@ static struct ufshpb_region *ufshpb_victim_lru_info(struct ufshpb_lu *hpb)
                 * in host control mode, verify that the exiting region
                 * has fewer reads
                 */
-               if (hpb->is_hcm && rgn->reads > (EVICTION_THRESHOLD >> 1))
+               if (hpb->is_hcm &&
+                   rgn->reads > hpb->params.eviction_thld_exit)
                        continue;
 
                victim_rgn = rgn;
@@ -1346,7 +1352,8 @@ static int ufshpb_add_region(struct ufshpb_lu *hpb, struct ufshpb_region *rgn)
                         * in host control mode, verify that the entering
                         * region has enough reads
                         */
-                       if (hpb->is_hcm && rgn->reads < EVICTION_THRESHOLD) {
+                       if (hpb->is_hcm &&
+                           rgn->reads < hpb->params.eviction_thld_enter) {
                                ret = -EACCES;
                                goto out;
                        }
@@ -1697,6 +1704,7 @@ static void ufshpb_normalization_work_handler(struct work_struct *work)
        struct ufshpb_lu *hpb = container_of(work, struct ufshpb_lu,
                                             ufshpb_normalization_work);
        int rgn_idx;
+       u8 factor = hpb->params.normalization_factor;
 
        for (rgn_idx = 0; rgn_idx < hpb->rgns_per_lu; rgn_idx++) {
                struct ufshpb_region *rgn = hpb->rgn_tbl + rgn_idx;
@@ -1707,7 +1715,7 @@ static void ufshpb_normalization_work_handler(struct work_struct *work)
                for (srgn_idx = 0; srgn_idx < hpb->srgns_per_rgn; srgn_idx++) {
                        struct ufshpb_subregion *srgn = rgn->srgn_tbl + srgn_idx;
 
-                       srgn->reads >>= 1;
+                       srgn->reads >>= factor;
                        rgn->reads += srgn->reads;
                }
                spin_unlock(&rgn->rgn_lock);
@@ -2030,8 +2038,247 @@ requeue_timeout_ms_store(struct device *dev, struct device_attribute *attr,
 }
 static DEVICE_ATTR_RW(requeue_timeout_ms);
 
+ufshpb_sysfs_param_show_func(activation_thld);
+static ssize_t
+activation_thld_store(struct device *dev, struct device_attribute *attr,
+                     const char *buf, size_t count)
+{
+       struct scsi_device *sdev = to_scsi_device(dev);
+       struct ufshpb_lu *hpb = ufshpb_get_hpb_data(sdev);
+       unsigned int val;
+
+       if (!hpb)
+               return -ENODEV;
+
+       if (!hpb->is_hcm)
+               return -EOPNOTSUPP;
+
+       if (kstrtouint(buf, 0, &val))
+               return -EINVAL;
+
+       if (val <= 0)
+               return -EINVAL;
+
+       hpb->params.activation_thld = val;
+
+       return count;
+}
+static DEVICE_ATTR_RW(activation_thld);
+
+ufshpb_sysfs_param_show_func(normalization_factor);
+static ssize_t
+normalization_factor_store(struct device *dev, struct device_attribute *attr,
+                          const char *buf, size_t count)
+{
+       struct scsi_device *sdev = to_scsi_device(dev);
+       struct ufshpb_lu *hpb = ufshpb_get_hpb_data(sdev);
+       unsigned int val;
+
+       if (!hpb)
+               return -ENODEV;
+
+       if (!hpb->is_hcm)
+               return -EOPNOTSUPP;
+
+       if (kstrtouint(buf, 0, &val))
+               return -EINVAL;
+
+       if (val <= 0 || val > ilog2(hpb->entries_per_srgn))
+               return -EINVAL;
+
+       hpb->params.normalization_factor = val;
+
+       return count;
+}
+static DEVICE_ATTR_RW(normalization_factor);
+
+ufshpb_sysfs_param_show_func(eviction_thld_enter);
+static ssize_t
+eviction_thld_enter_store(struct device *dev, struct device_attribute *attr,
+                         const char *buf, size_t count)
+{
+       struct scsi_device *sdev = to_scsi_device(dev);
+       struct ufshpb_lu *hpb = ufshpb_get_hpb_data(sdev);
+       unsigned int val;
+
+       if (!hpb)
+               return -ENODEV;
+
+       if (!hpb->is_hcm)
+               return -EOPNOTSUPP;
+
+       if (kstrtouint(buf, 0, &val))
+               return -EINVAL;
+
+       if (val <= hpb->params.eviction_thld_exit)
+               return -EINVAL;
+
+       hpb->params.eviction_thld_enter = val;
+
+       return count;
+}
+static DEVICE_ATTR_RW(eviction_thld_enter);
+
+ufshpb_sysfs_param_show_func(eviction_thld_exit);
+static ssize_t
+eviction_thld_exit_store(struct device *dev, struct device_attribute *attr,
+                        const char *buf, size_t count)
+{
+       struct scsi_device *sdev = to_scsi_device(dev);
+       struct ufshpb_lu *hpb = ufshpb_get_hpb_data(sdev);
+       unsigned int val;
+
+       if (!hpb)
+               return -ENODEV;
+
+       if (!hpb->is_hcm)
+               return -EOPNOTSUPP;
+
+       if (kstrtouint(buf, 0, &val))
+               return -EINVAL;
+
+       if (val <= hpb->params.activation_thld)
+               return -EINVAL;
+
+       hpb->params.eviction_thld_exit = val;
+
+       return count;
+}
+static DEVICE_ATTR_RW(eviction_thld_exit);
+
+ufshpb_sysfs_param_show_func(read_timeout_ms);
+static ssize_t
+read_timeout_ms_store(struct device *dev, struct device_attribute *attr,
+                     const char *buf, size_t count)
+{
+       struct scsi_device *sdev = to_scsi_device(dev);
+       struct ufshpb_lu *hpb = ufshpb_get_hpb_data(sdev);
+       unsigned int val;
+
+       if (!hpb)
+               return -ENODEV;
+
+       if (!hpb->is_hcm)
+               return -EOPNOTSUPP;
+
+       if (kstrtouint(buf, 0, &val))
+               return -EINVAL;
+
+       /* read_timeout must be at least twice timeout_polling_interval */
+       if (val < hpb->params.timeout_polling_interval_ms * 2)
+               return -EINVAL;
+
+       hpb->params.read_timeout_ms = val;
+
+       return count;
+}
+static DEVICE_ATTR_RW(read_timeout_ms);
+
+ufshpb_sysfs_param_show_func(read_timeout_expiries);
+static ssize_t
+read_timeout_expiries_store(struct device *dev, struct device_attribute *attr,
+                           const char *buf, size_t count)
+{
+       struct scsi_device *sdev = to_scsi_device(dev);
+       struct ufshpb_lu *hpb = ufshpb_get_hpb_data(sdev);
+       int val;
+
+       if (!hpb)
+               return -ENODEV;
+
+       if (!hpb->is_hcm)
+               return -EOPNOTSUPP;
+
+       if (kstrtouint(buf, 0, &val))
+               return -EINVAL;
+
+       if (val <= 0)
+               return -EINVAL;
+
+       hpb->params.read_timeout_expiries = val;
+
+       return count;
+}
+static DEVICE_ATTR_RW(read_timeout_expiries);
+
+ufshpb_sysfs_param_show_func(timeout_polling_interval_ms);
+static ssize_t
+timeout_polling_interval_ms_store(struct device *dev,
+                                 struct device_attribute *attr,
+                                 const char *buf, size_t count)
+{
+       struct scsi_device *sdev = to_scsi_device(dev);
+       struct ufshpb_lu *hpb = ufshpb_get_hpb_data(sdev);
+       int val;
+
+       if (!hpb)
+               return -ENODEV;
+
+       if (!hpb->is_hcm)
+               return -EOPNOTSUPP;
+
+       if (kstrtouint(buf, 0, &val))
+               return -EINVAL;
+
+       /* timeout_polling_interval must be at most half of read_timeout */
+       if (val <= 0 || val > hpb->params.read_timeout_ms / 2)
+               return -EINVAL;
+
+       hpb->params.timeout_polling_interval_ms = val;
+
+       return count;
+}
+static DEVICE_ATTR_RW(timeout_polling_interval_ms);
+
+ufshpb_sysfs_param_show_func(inflight_map_req);
+static ssize_t inflight_map_req_store(struct device *dev,
+                                     struct device_attribute *attr,
+                                     const char *buf, size_t count)
+{
+       struct scsi_device *sdev = to_scsi_device(dev);
+       struct ufshpb_lu *hpb = ufshpb_get_hpb_data(sdev);
+       unsigned int val;
+
+       if (!hpb)
+               return -ENODEV;
+
+       if (!hpb->is_hcm)
+               return -EOPNOTSUPP;
+
+       if (kstrtouint(buf, 0, &val))
+               return -EINVAL;
+
+       if (val <= 0 || val > hpb->sdev_ufs_lu->queue_depth - 1)
+               return -EINVAL;
+
+       hpb->params.inflight_map_req = val;
+
+       return count;
+}
+static DEVICE_ATTR_RW(inflight_map_req);
+
+static void ufshpb_hcm_param_init(struct ufshpb_lu *hpb)
+{
+       hpb->params.activation_thld = ACTIVATION_THRESHOLD;
+       hpb->params.normalization_factor = 1;
+       hpb->params.eviction_thld_enter = (ACTIVATION_THRESHOLD << 5);
+       hpb->params.eviction_thld_exit = (ACTIVATION_THRESHOLD << 4);
+       hpb->params.read_timeout_ms = READ_TO_MS;
+       hpb->params.read_timeout_expiries = READ_TO_EXPIRIES;
+       hpb->params.timeout_polling_interval_ms = POLLING_INTERVAL_MS;
+       hpb->params.inflight_map_req = THROTTLE_MAP_REQ_DEFAULT;
+}
+
 static struct attribute *hpb_dev_param_attrs[] = {
        &dev_attr_requeue_timeout_ms.attr,
+       &dev_attr_activation_thld.attr,
+       &dev_attr_normalization_factor.attr,
+       &dev_attr_eviction_thld_enter.attr,
+       &dev_attr_eviction_thld_exit.attr,
+       &dev_attr_read_timeout_ms.attr,
+       &dev_attr_read_timeout_expiries.attr,
+       &dev_attr_timeout_polling_interval_ms.attr,
+       &dev_attr_inflight_map_req.attr,
        NULL,
 };
 
@@ -2115,6 +2362,8 @@ static void ufshpb_stat_init(struct ufshpb_lu *hpb)
 static void ufshpb_param_init(struct ufshpb_lu *hpb)
 {
        hpb->params.requeue_timeout_ms = HPB_REQUEUE_TIME_MS;
+       if (hpb->is_hcm)
+               ufshpb_hcm_param_init(hpb);
 }
 
 static int ufshpb_lu_hpb_init(struct ufs_hba *hba, struct ufshpb_lu *hpb)
@@ -2169,9 +2418,13 @@ static int ufshpb_lu_hpb_init(struct ufs_hba *hba, struct ufshpb_lu *hpb)
        ufshpb_stat_init(hpb);
        ufshpb_param_init(hpb);
 
-       if (hpb->is_hcm)
+       if (hpb->is_hcm) {
+               unsigned int poll;
+
+               poll = hpb->params.timeout_polling_interval_ms;
                schedule_delayed_work(&hpb->ufshpb_read_to_work,
-                                     msecs_to_jiffies(POLLING_INTERVAL_MS));
+                                     msecs_to_jiffies(poll));
+       }
 
        return 0;
 
@@ -2350,10 +2603,13 @@ void ufshpb_resume(struct ufs_hba *hba)
                        continue;
                ufshpb_set_state(hpb, HPB_PRESENT);
                ufshpb_kick_map_work(hpb);
-               if (hpb->is_hcm)
-                       schedule_delayed_work(&hpb->ufshpb_read_to_work,
-                               msecs_to_jiffies(POLLING_INTERVAL_MS));
+               if (hpb->is_hcm) {
+                       unsigned int poll =
+                               hpb->params.timeout_polling_interval_ms;
 
+                       schedule_delayed_work(&hpb->ufshpb_read_to_work,
+                               msecs_to_jiffies(poll));
+               }
        }
 }
 
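diff --git a/drivers/scsi/ufs/ufshpb.h b/drivers/scsi/ufs/ufshpb.h
index edf565e9036f7543ac9e2a6d8da7e129b8ee65f6..c74a6c35a446064e8bff55be96875b7083b4f0cb 100644
--- a/drivers/scsi/ufs/ufshpb.h
+++ b/drivers/scsi/ufs/ufshpb.h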
@@ -185,8 +185,28 @@ struct victim_select_info {
        atomic_t active_cnt;
 };
 
+/**
+ * struct ufshpb_params - ufs hpb parameters
+ * @requeue_timeout_ms: requeue threshold of wb command (0x2)
+ * @activation_thld: min reads [IOs] to activate/update a region
+ * @normalization_factor: shift right the region's reads
+ * @eviction_thld_enter: min reads [IOs] for the entering region in eviction
+ * @eviction_thld_exit: max reads [IOs] for the exiting region in eviction
+ * @read_timeout_ms: timeout [ms] from the last read IO to the region
+ * @read_timeout_expiries: number of allowable timeout expiries
+ * @timeout_polling_interval_ms: interval [ms] at which timeouts are checked
+ * @inflight_map_req: max number of inflight map requests
+ */
 struct ufshpb_params {
        unsigned int requeue_timeout_ms;
+       unsigned int activation_thld;
+       unsigned int normalization_factor;
+       unsigned int eviction_thld_enter;
+       unsigned int eviction_thld_exit;
+       unsigned int read_timeout_ms;
+       unsigned int read_timeout_expiries;
+       unsigned int timeout_polling_interval_ms;
+       unsigned int inflight_map_req;
 };
 
 struct ufshpb_stats {