Research on Newspaper Layout Segmentation Method Based on Transformer


Zhu Yifan, Gao Hua, Ye Ning
(College of Information Science and Technology & Artificial Intelligence, Nanjing Forestry University, Nanjing 210037, China)
layout segmentation; DETR; ShuffleNet V2; Feature Pyramid Networks (FPN); ECA
Retrieving and analyzing information in the era of big data makes the digitization of large volumes of traditional paper media a challenge. With the continuing development of computer vision and artificial intelligence, the DETR model can be applied to newspaper layout segmentation. To address the problems of the original model in layout segmentation, namely slow detection speed, a large number of parameters, and inaccurate classification, this paper proposes an improved model that uses the lightweight ShuffleNet V2 backbone network, which improves computational efficiency and reduces the number of model parameters, thereby easing the computational load of the Transformer structure. In addition, a feature pyramid structure lets the model fuse global and detail information, significantly enhancing its ability to recognize multi-scale targets. The model also introduces an Efficient Channel Attention (ECA) module that extracts key target features and suppresses irrelevant background information, keeping the design lightweight while preserving segmentation performance. Experimental results show that the improved model has 38.5 M parameters, reaches a frame rate (FPS) of 47.5 img/s, and achieves an mAP0.5 of 0.806. Compared with the original DETR model, the improved model has 2.8 M fewer parameters, a frame rate 28.3 img/s higher, and an mAP0.5 improved by 3.2%. The proposed model can provide preliminary technical support for OCR of newspaper layouts.
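
The abstract does not give implementation details for the Efficient Channel Attention module, so the following is only a minimal sketch of the commonly described ECA formulation: global average pooling per channel, a small 1D convolution across neighbouring channels, a sigmoid gate, and channel-wise rescaling. The function names `eca_kernel_size` and `eca`, the nested-list feature-map layout `feature_map[c][h][w]`, and the defaults `gamma=2`, `b=1` are assumptions made for illustration; they are not taken from the paper, and a real model would implement this in a deep-learning framework with learned weights.

```python
import math


def eca_kernel_size(channels, gamma=2, b=1):
    """Pick an odd 1D kernel size that grows with the channel count.

    Assumed rule: truncate (log2(C) + b) / gamma, then round up to the next odd number.
    """
    t = int(abs((math.log2(channels) + b) / gamma))
    return t if t % 2 == 1 else t + 1


def eca(feature_map, weights):
    """Apply channel attention to feature_map[c][h][w] (nested lists of floats).

    `weights` is the shared 1D kernel (odd length); in a trained model it is learned.
    Returns the rescaled feature map and the per-channel gates.
    """
    k = len(weights)
    if k % 2 != 1:
        raise ValueError("kernel length must be odd")
    pad = k // 2

    # 1. Global average pooling: one scalar descriptor per channel.
    pooled = [sum(map(sum, ch)) / (len(ch) * len(ch[0])) for ch in feature_map]

    # 2. 1D convolution over the channel axis (zero padding, shared weights),
    #    so each channel interacts only with its k nearest neighbours.
    padded = [0.0] * pad + pooled + [0.0] * pad
    mixed = [
        sum(w * padded[i + j] for j, w in enumerate(weights))
        for i in range(len(pooled))
    ]

    # 3. Sigmoid turns each mixed descriptor into a gate in (0, 1).
    gates = [1.0 / (1.0 + math.exp(-m)) for m in mixed]

    # 4. Rescale every spatial position of a channel by that channel's gate.
    scaled = [
        [[value * g for value in row] for row in ch]
        for ch, g in zip(feature_map, gates)
    ]
    return scaled, gates


if __name__ == "__main__":
    # Toy input: 4 channels of size 2x2, with an illustrative (not learned) kernel.
    fmap = [[[float(c + 1)] * 2 for _ in range(2)] for c in range(4)]
    out, gates = eca(fmap, weights=[0.25, 0.5, 0.25])
    print("kernel size for 256 channels:", eca_kernel_size(256))
    print("gates:", [round(g, 3) for g in gates])
```

Because the attention is computed by a single short kernel shared across all channels, the module adds only `k` parameters, which is consistent with the abstract's aim of keeping the model lightweight.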


