
Proceedings of the 2022 Seventh International Conference on Research in Intelligent and Computing in Engineering

Annals of Computer Science and Information Systems, Volume 33

GreedySlide: An Efficient Sliding Window for Improving Edge-Object Detectors


DOI: http://dx.doi.org/10.15439/2022R09

Citation: Proceedings of the 2022 Seventh International Conference on Research in Intelligent and Computing in Engineering, Vu Dinh Khoa, Shivani Agarwal, Gloria Jeanette Rincon Aponte, Nguyen Thi Hong Nga, Vijender Kumar Solanki, Ewa Ziemba (eds). ACSIS, Vol. 33, pages 243–248 (2022)


Abstract. Recent developments in deep learning and edge hardware architecture have given artificial intelligence applications a robust foundation for real-life deployment, allowing models to run inference directly on edge devices. With a well-trained edge object detection (OD) model, scenarios such as autonomous driving, autonomous hospital management, or a self-shopping cart become achievable. However, for a model to run efficiently on the edge, it must be quantized to reduce its size and speed up inference. This quantization scheme degrades the model: each layer is restricted to a lower-precision representation, leaving the output layer fewer options for bounding an object. Quantization also limits model generalization, since the finer-grained behavior of the data is cut off at each activation layer. To address this problem, we propose GreedySlide, a novel sliding-window method that divides a captured frame into windows so that each object fits better within the quantization bounds. Although the technique sounds simple, it increases the number of options available for bounding an object and clips the variance incurred by scanning the whole image at once. Our experiments show that the method improves an original edge model on its corresponding benchmark and increases its generalization to other related datasets without retraining the model.
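To make the sliding-window idea concrete, the sketch below shows one plausible reading of the scheme described in the abstract: divide the frame into overlapping windows, run a detector on each crop, map the resulting boxes back to global coordinates, and merge overlaps with greedy non-maximum suppression. This is a minimal illustration, not the paper's implementation; the function names, window size, stride, and the `detect` callable (which would wrap a quantized edge detector) are all our own assumptions.

```python
import numpy as np

def greedy_slide(image, detect, win=320, stride=240, iou_thr=0.5):
    """Run `detect` on overlapping windows of `image` and merge results.

    image:  HxWxC uint8 array.
    detect: callable(crop) -> (boxes Nx4 as [x1, y1, x2, y2], scores N).
            Hypothetical wrapper around a quantized edge detector.
    win, stride: window size and step; overlap = win - stride.
    """
    h, w = image.shape[:2]
    boxes, scores = [], []
    for y in range(0, max(h - win, 0) + 1, stride):
        for x in range(0, max(w - win, 0) + 1, stride):
            crop = image[y:y + win, x:x + win]
            b, s = detect(crop)
            if len(b):
                b = np.asarray(b, dtype=np.float32)
                b[:, [0, 2]] += x  # shift window-local boxes back
                b[:, [1, 3]] += y  # to global image coordinates
                boxes.append(b)
                scores.append(np.asarray(s, dtype=np.float32))
    if not boxes:
        return np.empty((0, 4)), np.empty((0,))
    return nms(np.concatenate(boxes), np.concatenate(scores), iou_thr)

def nms(boxes, scores, thr):
    """Greedy non-maximum suppression over [x1, y1, x2, y2] boxes."""
    order = scores.argsort()[::-1]
    keep = []
    while order.size:
        i = order[0]
        keep.append(i)
        # Intersection of the top-scoring box with the remaining boxes.
        xx1 = np.maximum(boxes[i, 0], boxes[order[1:], 0])
        yy1 = np.maximum(boxes[i, 1], boxes[order[1:], 1])
        xx2 = np.minimum(boxes[i, 2], boxes[order[1:], 2])
        yy2 = np.minimum(boxes[i, 3], boxes[order[1:], 3])
        inter = np.clip(xx2 - xx1, 0, None) * np.clip(yy2 - yy1, 0, None)
        area_i = (boxes[i, 2] - boxes[i, 0]) * (boxes[i, 3] - boxes[i, 1])
        area_o = (boxes[order[1:], 2] - boxes[order[1:], 0]) * \
                 (boxes[order[1:], 3] - boxes[order[1:], 1])
        iou = inter / (area_i + area_o - inter + 1e-9)
        order = order[1:][iou <= thr]  # drop boxes overlapping the kept one
    return boxes[keep], scores[keep]
```

The overlap between windows (here 80 pixels, from win − stride) is what lets objects near window borders still land fully inside at least one crop, which matches the abstract's claim that windowing gives the quantized output layer more options for bounding each object.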
