python目标检测非极大抑制NMS与Soft-NMS

2023-02-21 原文

注意事项

Soft-NMS对于大多数数据集而言，作用比较小，提升效果非常不明显，它起作用的地方是大量密集的同类重叠场景，大量密集的不同类重叠场景其实也没什么作用，同学们可以借助Soft-NMS理解非极大抑制的含义，但是实现的必要性确实不强，在提升网络性能上，不建议死磕Soft-NMS。

已对该博文中的代码进行了重置，视频中实现的代码是numpy形式，而且库比较久远。这里改成pytorch的形式，且适应当前的库。

非极大抑制是目标检测中非常非常非常非常非常重要的一部分，了解一下原理，撕一下代码是必要的

什么是非极大抑制NMS

非极大抑制的概念只需要看这两幅图就知道了：

下图是经过非极大抑制的。

下图是未经过非极大抑制的。

可以很明显的看出来，未经过非极大抑制的图片有许多重复的框，这些框都指向了同一个物体！

可以用一句话概括非极大抑制的功能就是：

筛选出一定区域内属于同一种类得分最大的框。

1、非极大抑制NMS的实现过程

本博文实现的是多分类的非极大抑制，该非极大抑制使用在我的pytorch-yolov3例子中：输入shape为[ batch_size, all_anchors, 5 num_classes ]

第一个维度是图片的数量。

第二个维度是所有的预测框。

第三个维度是所有的预测框的预测结果。

非极大抑制的执行过程如下所示：

1、对所有图片进行循环。

2、找出该图片中得分大于门限函数的框。在进行重合框筛选前就进行得分的筛选可以大幅度减少框的数量。

3、判断第2步中获得的框的种类与得分。取出预测结果中框的位置与之进行堆叠。此时最后一维度里面的内容由5 num_classes变成了4 1 2，四个参数代表框的位置，一个参数代表预测框是否包含物体，两个参数分别代表种类的置信度与种类。

4、对种类进行循环，非极大抑制的作用是筛选出一定区域内属于同一种类得分最大的框，对种类进行循环可以帮助我们对每一个类分别进行非极大抑制。

5、根据得分对该种类进行从大到小排序。

6、每次取出得分最大的框，计算其与其它所有预测框的重合程度，重合程度过大的则剔除。

视频中实现的代码是numpy形式，而且库比较久远。这里改成pytorch的形式，且适应当前的库。

实现代码如下：

def bbox_iou(self, box1, box2, x1y1x2y2=True):
    """
        计算IOU
    """
    if not x1y1x2y2:
        b1_x1, b1_x2 = box1[:, 0] - box1[:, 2] / 2, box1[:, 0]   box1[:, 2] / 2
        b1_y1, b1_y2 = box1[:, 1] - box1[:, 3] / 2, box1[:, 1]   box1[:, 3] / 2
        b2_x1, b2_x2 = box2[:, 0] - box2[:, 2] / 2, box2[:, 0]   box2[:, 2] / 2
        b2_y1, b2_y2 = box2[:, 1] - box2[:, 3] / 2, box2[:, 1]   box2[:, 3] / 2
    else:
        b1_x1, b1_y1, b1_x2, b1_y2 = box1[:, 0], box1[:, 1], box1[:, 2], box1[:, 3]
        b2_x1, b2_y1, b2_x2, b2_y2 = box2[:, 0], box2[:, 1], box2[:, 2], box2[:, 3]
    inter_rect_x1 = torch.max(b1_x1, b2_x1)
    inter_rect_y1 = torch.max(b1_y1, b2_y1)
    inter_rect_x2 = torch.min(b1_x2, b2_x2)
    inter_rect_y2 = torch.min(b1_y2, b2_y2)
    inter_area = torch.clamp(inter_rect_x2 - inter_rect_x1, min=0) * \
                torch.clamp(inter_rect_y2 - inter_rect_y1, min=0)
    b1_area = (b1_x2 - b1_x1) * (b1_y2 - b1_y1)
    b2_area = (b2_x2 - b2_x1) * (b2_y2 - b2_y1)
    iou = inter_area / torch.clamp(b1_area   b2_area - inter_area, min = 1e-6)
    return iou
def non_max_suppression(self, prediction, num_classes, input_shape, image_shape, letterbox_image, conf_thres=0.5, nms_thres=0.4):
    #----------------------------------------------------------#
    #   将预测结果的格式转换成左上角右下角的格式。
    #   prediction  [batch_size, num_anchors, 85]
    #----------------------------------------------------------#
    box_corner          = prediction.new(prediction.shape)
    box_corner[:, :, 0] = prediction[:, :, 0] - prediction[:, :, 2] / 2
    box_corner[:, :, 1] = prediction[:, :, 1] - prediction[:, :, 3] / 2
    box_corner[:, :, 2] = prediction[:, :, 0]   prediction[:, :, 2] / 2
    box_corner[:, :, 3] = prediction[:, :, 1]   prediction[:, :, 3] / 2
    prediction[:, :, :4] = box_corner[:, :, :4]
    output = [None for _ in range(len(prediction))]
    for i, image_pred in enumerate(prediction):
        #----------------------------------------------------------#
        #   对种类预测部分取max。
        #   class_conf  [num_anchors, 1]    种类置信度
        #   class_pred  [num_anchors, 1]    种类
        #----------------------------------------------------------#
        class_conf, class_pred = torch.max(image_pred[:, 5:5   num_classes], 1, keepdim=True)
        #----------------------------------------------------------#
        #   利用置信度进行第一轮筛选
        #----------------------------------------------------------#
        conf_mask = (image_pred[:, 4] * class_conf[:, 0] >= conf_thres).squeeze()
        #----------------------------------------------------------#
        #   根据置信度进行预测结果的筛选
        #----------------------------------------------------------#
        image_pred = image_pred[conf_mask]
        class_conf = class_conf[conf_mask]
        class_pred = class_pred[conf_mask]
        if not image_pred.size(0):
            continue
        #-------------------------------------------------------------------------#
        #   detections  [num_anchors, 7]
        #   7的内容为：x1, y1, x2, y2, obj_conf, class_conf, class_pred
        #-------------------------------------------------------------------------#
        detections = torch.cat((image_pred[:, :5], class_conf.float(), class_pred.float()), 1)
        #------------------------------------------#
        #   获得预测结果中包含的所有种类
        #------------------------------------------#
        unique_labels = detections[:, -1].cpu().unique()
        if prediction.is_cuda:
            unique_labels = unique_labels.cuda()
            detections = detections.cuda()
        for c in unique_labels:
            #------------------------------------------#
            #   获得某一类得分筛选后全部的预测结果
            #------------------------------------------#
            detections_class = detections[detections[:, -1] == c]
            # #------------------------------------------#
            # #   使用官方自带的非极大抑制会速度更快一些！
            # #------------------------------------------#
            # keep = nms(
            #     detections_class[:, :4],
            #     detections_class[:, 4] * detections_class[:, 5],
            #     nms_thres
            # )
            # max_detections = detections_class[keep]
            # 按照存在物体的置信度排序
            _, conf_sort_index = torch.sort(detections_class[:, 4]*detections_class[:, 5], descending=True)
            detections_class = detections_class[conf_sort_index]
            # 进行非极大抑制
            max_detections = []
            while detections_class.size(0):
                # 取出这一类置信度最高的，一步一步往下判断，判断重合程度是否大于nms_thres，如果是则去除掉
                max_detections.append(detections_class[0].unsqueeze(0))
                if len(detections_class) == 1:
                    break
                ious = self.bbox_iou(max_detections[-1], detections_class[1:])
                detections_class = detections_class[1:][ious < nms_thres]
            # 堆叠
            max_detections = torch.cat(max_detections).data
            # Add max detections to outputs
            output[i] = max_detections if output[i] is None else torch.cat((output[i], max_detections))
        if output[i] is not None:
            output[i]           = output[i].cpu().numpy()
            box_xy, box_wh      = (output[i][:, 0:2]   output[i][:, 2:4])/2, output[i][:, 2:4] - output[i][:, 0:2]
            output[i][:, :4]    = self.yolo_correct_boxes(box_xy, box_wh, input_shape, image_shape, letterbox_image)
    return output

2、柔性非极大抑制Soft-NMS的实现过程

柔性非极大抑制和普通的非极大抑制相差不大，只差了几行代码。

柔性非极大抑制认为不应该直接只通过重合程度进行筛选，如图所示，很明显图片中存在两匹马，但是此时两匹马的重合程度较高，此时我们如果使用普通nms，后面那匹得分比较低的马会直接被剔除。

Soft-NMS认为在进行非极大抑制的时候要同时考虑得分和重合程度。

我们直接看NMS和Soft-NMS的代码差别：

视频中实现的代码是numpy形式，而且库比较久远。这里改成pytorch的形式，且适应当前的库。

如下为NMS：

while detections_class.size(0):
    # 取出这一类置信度最高的，一步一步往下判断，判断重合程度是否大于nms_thres，如果是则去除掉
    max_detections.append(detections_class[0].unsqueeze(0))
    if len(detections_class) == 1:
        break
    ious = self.bbox_iou(max_detections[-1], detections_class[1:])
    detections_class = detections_class[1:][ious < nms_thres]

如下为Soft-NMS：

while detections_class.size(0):
    # 取出这一类置信度最高的，一步一步往下判断，判断重合程度是否大于nms_thres，如果是则去除掉
    max_detections.append(detections_class[0].unsqueeze(0))
    if len(detections_class) == 1:
        break
    ious                    = self.bbox_iou(max_detections[-1], detections_class[1:])
    detections_class[1:, 4] = torch.exp(-(ious * ious) / sigma) * detections_class[1:, 4]
    detections_class        = detections_class[1:]
    detections_class        = detections_class[detections_class[:, 4] >= conf_thres]
    arg_sort                = torch.argsort(detections_class[:, 4], descending = True)
    detections_class        = detections_class[arg_sort]

我们可以看到，对于NMS而言，其直接将与得分最大的框重合程度较高的其它预测剔除。而Soft-NMS则以一个权重的形式，将获得的IOU取高斯指数后乘上原得分，之后重新排序。继续循环。

视频中实现的代码是numpy形式，而且库比较久远。这里改成pytorch的形式，且适应当前的库。

实现代码如下：

def bbox_iou(self, box1, box2, x1y1x2y2=True):
    """
        计算IOU
    """
    if not x1y1x2y2:
        b1_x1, b1_x2 = box1[:, 0] - box1[:, 2] / 2, box1[:, 0]   box1[:, 2] / 2
        b1_y1, b1_y2 = box1[:, 1] - box1[:, 3] / 2, box1[:, 1]   box1[:, 3] / 2
        b2_x1, b2_x2 = box2[:, 0] - box2[:, 2] / 2, box2[:, 0]   box2[:, 2] / 2
        b2_y1, b2_y2 = box2[:, 1] - box2[:, 3] / 2, box2[:, 1]   box2[:, 3] / 2
    else:
        b1_x1, b1_y1, b1_x2, b1_y2 = box1[:, 0], box1[:, 1], box1[:, 2], box1[:, 3]
        b2_x1, b2_y1, b2_x2, b2_y2 = box2[:, 0], box2[:, 1], box2[:, 2], box2[:, 3]
    inter_rect_x1 = torch.max(b1_x1, b2_x1)
    inter_rect_y1 = torch.max(b1_y1, b2_y1)
    inter_rect_x2 = torch.min(b1_x2, b2_x2)
    inter_rect_y2 = torch.min(b1_y2, b2_y2)
    inter_area = torch.clamp(inter_rect_x2 - inter_rect_x1, min=0) * \
                torch.clamp(inter_rect_y2 - inter_rect_y1, min=0)
    b1_area = (b1_x2 - b1_x1) * (b1_y2 - b1_y1)
    b2_area = (b2_x2 - b2_x1) * (b2_y2 - b2_y1)
    iou = inter_area / torch.clamp(b1_area   b2_area - inter_area, min = 1e-6)
    return iou
def non_max_suppression(self, prediction, num_classes, input_shape, image_shape, letterbox_image, conf_thres=0.5, nms_thres=0.4, sigma=0.5):
    #----------------------------------------------------------#
    #   将预测结果的格式转换成左上角右下角的格式。
    #   prediction  [batch_size, num_anchors, 85]
    #----------------------------------------------------------#
    box_corner          = prediction.new(prediction.shape)
    box_corner[:, :, 0] = prediction[:, :, 0] - prediction[:, :, 2] / 2
    box_corner[:, :, 1] = prediction[:, :, 1] - prediction[:, :, 3] / 2
    box_corner[:, :, 2] = prediction[:, :, 0]   prediction[:, :, 2] / 2
    box_corner[:, :, 3] = prediction[:, :, 1]   prediction[:, :, 3] / 2
    prediction[:, :, :4] = box_corner[:, :, :4]
    output = [None for _ in range(len(prediction))]
    for i, image_pred in enumerate(prediction):
        #----------------------------------------------------------#
        #   对种类预测部分取max。
        #   class_conf  [num_anchors, 1]    种类置信度
        #   class_pred  [num_anchors, 1]    种类
        #----------------------------------------------------------#
        class_conf, class_pred = torch.max(image_pred[:, 5:5   num_classes], 1, keepdim=True)
        #----------------------------------------------------------#
        #   利用置信度进行第一轮筛选
        #----------------------------------------------------------#
        conf_mask = (image_pred[:, 4] * class_conf[:, 0] >= conf_thres).squeeze()
        #----------------------------------------------------------#
        #   根据置信度进行预测结果的筛选
        #----------------------------------------------------------#
        image_pred = image_pred[conf_mask]
        class_conf = class_conf[conf_mask]
        class_pred = class_pred[conf_mask]
        if not image_pred.size(0):
            continue
        #-------------------------------------------------------------------------#
        #   detections  [num_anchors, 7]
        #   7的内容为：x1, y1, x2, y2, obj_conf, class_conf, class_pred
        #-------------------------------------------------------------------------#
        detections = torch.cat((image_pred[:, :5], class_conf.float(), class_pred.float()), 1)
        #------------------------------------------#
        #   获得预测结果中包含的所有种类
        #------------------------------------------#
        unique_labels = detections[:, -1].cpu().unique()
        if prediction.is_cuda:
            unique_labels = unique_labels.cuda()
            detections = detections.cuda()
        for c in unique_labels:
            #------------------------------------------#
            #   获得某一类得分筛选后全部的预测结果
            #------------------------------------------#
            detections_class = detections[detections[:, -1] == c]
            # #------------------------------------------#
            # #   使用官方自带的非极大抑制会速度更快一些！
            # #------------------------------------------#
            # keep = nms(
            #     detections_class[:, :4],
            #     detections_class[:, 4] * detections_class[:, 5],
            #     nms_thres
            # )
            # max_detections = detections_class[keep]
            # 按照存在物体的置信度排序
            _, conf_sort_index = torch.sort(detections_class[:, 4]*detections_class[:, 5], descending=True)
            detections_class = detections_class[conf_sort_index]
            # 进行非极大抑制
            max_detections = []
            while detections_class.size(0):
                # 取出这一类置信度最高的，一步一步往下判断，判断重合程度是否大于nms_thres，如果是则去除掉
                max_detections.append(detections_class[0].unsqueeze(0))
                if len(detections_class) == 1:
                    break
                ious                    = self.bbox_iou(max_detections[-1], detections_class[1:])
                detections_class[1:, 4] = torch.exp(-(ious * ious) / sigma) * detections_class[1:, 4]
                detections_class        = detections_class[1:]
                detections_class        = detections_class[detections_class[:, 4] >= conf_thres]
                arg_sort                = torch.argsort(detections_class[:, 4], descending = True)
                detections_class        = detections_class[arg_sort]
            # 堆叠
            max_detections = torch.cat(max_detections).data
            # Add max detections to outputs
            output[i] = max_detections if output[i] is None else torch.cat((output[i], max_detections))
        if output[i] is not None:
            output[i]           = output[i].cpu().numpy()
            box_xy, box_wh      = (output[i][:, 0:2]   output[i][:, 2:4])/2, output[i][:, 2:4] - output[i][:, 0:2]
            output[i][:, :4]    = self.yolo_correct_boxes(box_xy, box_wh, input_shape, image_shape, letterbox_image)
    return output

以上就是python目标检测非极大抑制NMS与Soft-NMS的详细内容，更多关于非极大抑制NMS Soft-NMS的资料请关注Devmax其它相关文章！

python目标检测非极大抑制NMS与Soft-NMS的更多相关文章

XCode 3.2 Ruby和Python模板

在xcode3.2下,我的ObjectiveCPython/Ruby项目仍然可以打开更新和编译,但是你无法创建新项目.鉴于xcode3.2中缺少ruby和python的所有痕迹(即创建项目并添加新的ruby/python文件),是否有一种简单的方法可以再次安装模板？我发现了一些关于将它们复制到某个文件夹的信息,但我似乎无法让它工作,我怀疑文件夹的位置已经改变为3.2.解决方法3.2中的应用程序模板
Swift基本使用-函数和闭包(三)

声明函数和其他脚本语言有相似的地方，比较明显的地方是声明函数的关键字swift也出现了Python中的组元，可以通过一个组元返回多个值。传递可变参数，函数以数组的形式获取参数swift中函数可以嵌套，被嵌套的函数可以访问外部函数的变量。可以通过函数的潜逃来重构过长或者太复杂的函数。
10 个Python中Pip的使用技巧分享

众所周知，pip 可以安装、更新、卸载 Python 的第三方库，非常方便。本文小编为大家总结了Python中Pip的使用技巧，需要的可以参考一下
Swift、Go、Julia与R能否挑战 Python 的王者地位

本站仅提供信息存储空间服务，不拥有所有权，不承担相关法律责任。如发现本站有涉嫌侵权/违法违规的内容，请发送邮件至dio@foxmail.com举报，一经查实，本站将立刻删除。
红薯因 Swift 重写开源中国失败，貌似欲改用 Python

本站仅提供信息存储空间服务，不拥有所有权，不承担相关法律责任。如发现本站有涉嫌侵权/违法违规的内容，请发送邮件至dio@foxmail.com举报，一经查实，本站将立刻删除。
你没看错：Swift可以直接调用Python函数库

上周Perfect又推出了新一轮服务器端Swift增强函数库：Perfect-Python。对，你没看错，在服务器端Swift其实可以轻松从其他语种的函数库中直接拿来调用，不需要修改任何内容。以如下python脚本为例：Perfect-Python可以用下列方法封装并调用以上函数，您所需要注意的仅仅是其函数名称以及参数。
Swift中的列表解析

在Swift中完成这个的最简单的方法是什么？我在寻找类似的东西：从Swift2.x开始，有一些与你的Python样式列表解析相当的东西。(在这个意义上，它更像是Python的xrange。如果你想保持集合懒惰一路通过，只是这样说：与Python中的列表解析语法不同，Swift中的这些操作遵循与其他操作相同的语法。
swift抛出终端的python错误

每当我尝试启动与python相关的swift时,我都会收到错误.我该如何解决？
在Android上用Java嵌入Python

解决方法看看this,它适用于J2SE,你可以尝试在Android上运行.
在android studio中使用python代码构建android应用程序

我有一些python代码和它的机器人,我正在寻找一种方法来使用android项目中的那些python代码.有没有办法做到这一点！？解决方法有两种主要工具可供使用,它们彼此不同：>QPython>Kivy使用Kivy,大致相同的代码也可以部署到IOS.

随机推荐

10 个Python中Pip的使用技巧分享

众所周知，pip 可以安装、更新、卸载 Python 的第三方库，非常方便。本文小编为大家总结了Python中Pip的使用技巧，需要的可以参考一下
python数学建模之三大模型与十大常用算法详情

这篇文章主要介绍了python数学建模之三大模型与十大常用算法详情，文章围绕主题展开详细的内容介绍，具有一定的参考价值，感想取得小伙伴可以参考一下
Python爬取奶茶店数据分析哪家最好喝以及性价比

这篇文章主要介绍了用Python告诉你奶茶哪家最好喝性价比最高，文中通过示例代码介绍的非常详细，对大家的学习或者工作具有一定的参考学习价值，需要的朋友们下面随着小编来一起学习吧
使用pyinstaller打包.exe文件的详细教程

PyInstaller是一个跨平台的Python应用打包工具，能够把 Python 脚本及其所在的 Python 解释器打包成可执行文件,下面这篇文章主要给大家介绍了关于使用pyinstaller打包.exe文件的相关资料,需要的朋友可以参考下
基于Python实现射击小游戏的制作

这篇文章主要介绍了如何利用Python制作一个自己专属的第一人称射击小游戏，文中的示例代码讲解详细，感兴趣的小伙伴可以跟随小编一起动手试一试
Python list append方法之给列表追加元素

这篇文章主要介绍了Python list append方法如何给列表追加元素，具有很好的参考价值，希望对大家有所帮助。如有错误或未考虑完全的地方，望不吝赐教
Pytest+Request+Allure+Jenkins实现接口自动化

这篇文章介绍了Pytest+Request+Allure+Jenkins实现接口自动化的方法，文中通过示例代码介绍的非常详细。对大家的学习或工作具有一定的参考借鉴价值，需要的朋友可以参考下
利用python实现简单的情感分析实例教程

商品评论挖掘、电影推荐、股市预测……情感分析大有用武之地,下面这篇文章主要给大家介绍了关于利用python实现简单的情感分析的相关资料,文中通过示例代码介绍的非常详细,需要的朋友可以参考下
利用Python上传日志并监控告警的方法详解

这篇文章将详细为大家介绍如何通过阿里云日志服务搭建一套通过Python上传日志、配置日志告警的监控服务，感兴趣的小伙伴可以了解一下
Pycharm中运行程序在Python console中执行,不是直接Run问题

这篇文章主要介绍了Pycharm中运行程序在Python console中执行,不是直接Run问题，具有很好的参考价值，希望对大家有所帮助。如有错误或未考虑完全的地方，望不吝赐教