
A Cross-Modal Face Generation Model for Government Blockchain

Journal of Nanjing Normal University (Natural Science Edition) [ISSN:1001-4616/CN:32-1239/N]

Volume:: 48
Issue:: 2025, No. 02

Pages:: 102-111

Section:: Computer Science and Technology

Publication date:: 2025-04-15

Article Info

Title:: Cross-Modal Face Generation Model for Government Blockchain

Article ID:: 1001-4616(2025)02-0102-10

Author(s):: Cui Siying¹; Tan Zhijie¹; Yuan Xiang¹; Li Weiping¹; Mo Tong¹; Qiao Xiuquan²; Wu Zhonghai¹
(1. School of Software and Microelectronics, Peking University, Beijing 102600, China)
(2. State Key Laboratory of Networking and Switching Technology, Beijing University of Posts and Telecommunications, Beijing 100876, China)

Keywords:: blockchain; cross-modal face generation; controllable image generation; diffusion model; face recognition

CLC number:: TP391

DOI:: 10.3969/j.issn.1001-4616.2025.02.011

Document code:: A

Abstract:: Blockchain technology is now applied to government data sharing, including electronic licenses and face images, but current large-scale blockchain systems generally suffer from low bandwidth and high storage costs. This paper proposes a cross-modal face generation model for government blockchains: face images are converted to the text modality for on-chain storage, and users can generate the face image of a specified person from text and a mask. First, a face classifier based on the ResNet-18 architecture is trained with multi-task learning to convert each face image into an identity-code text that is stored on the chain. Then, region-aware codebooks and a Transformer-based mixture-of-experts sampler are designed; the sampler uses a diffusion-model approach to sample indices from the codebooks, and a learnable decoder converts the sampled result into a fine-grained face image. Experiments on the augmented Casia Face V5 dataset show that the model reaches a face classification accuracy above 95%, and that its compression requires 1/10 000 of the persistence time and 1/200 of the file size of traditional image compression methods. Compared with other advanced face image generation methods, the model controllably generates high-fidelity face images of specified individuals and, with 1/20 of the parameters, achieves face generation quality close to that of large pre-trained models.
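
The codebook step mentioned in the abstract can be pictured with a generic vector-quantization sketch. This is only an illustration under stated assumptions, not the paper's region-aware codebook, sampler, or decoder: the names `quantize` and `lookup`, the toy two-dimensional vectors, and the use of nearest-entry matching are all hypothetical. In the described model the indices are produced by the diffusion-based sampler; here they come from nearest-entry quantization so that the index-to-vector lookup can be shown end to end.

```python
from typing import List, Sequence

Vector = Sequence[float]


def squared_distance(a: Vector, b: Vector) -> float:
    """Squared Euclidean distance between two equal-length vectors."""
    return sum((x - y) ** 2 for x, y in zip(a, b))


def quantize(features: List[Vector], codebook: List[Vector]) -> List[int]:
    """Map each feature vector to the index of its nearest codebook entry."""
    return [
        min(range(len(codebook)), key=lambda k: squared_distance(f, codebook[k]))
        for f in features
    ]


def lookup(indices: List[int], codebook: List[Vector]) -> List[Vector]:
    """Turn a list of indices back into codebook vectors (decoder input)."""
    return [codebook[i] for i in indices]


# Toy data (hypothetical): three codebook entries, three feature vectors.
codebook = [(0.0, 0.0), (1.0, 1.0), (4.0, 0.0)]
features = [(0.1, -0.2), (0.9, 1.2), (3.5, 0.4)]

indices = quantize(features, codebook)       # compact discrete representation
reconstructed = lookup(indices, codebook)    # vectors a decoder would consume
```

Only the integer indices need to be passed between stages, which is what makes a discrete codebook attractive when the generator has to work from a compact representation.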

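Memo

Memo:: Received: 2023-06-28.
Funding: Liaoning Provincial Science and Technology Plan open-competition ("Jiebang Guashuai") project (2021JH1/10400010).
Corresponding author: Mo Tong, Ph.D., associate professor; research interests: big data analysis and mining, knowledge graphs, etc. E-mail: motong@ss.pku.edu.cn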

Last Update: 2025-04-15