Publications

My papers can be found at: Google Scholar


Journal Articles

  • Multi-modal Validation and Domain Interaction Learning for Knowledge-based Visual Question Answering. [PDF]
    Ning Xu, Yifei Gao, An-An Liu*, Hongshuo Tian, Yongdong Zhang.
    IEEE Transactions on Knowledge and Data Engineering, 2024.

  • Learning to Supervise Knowledge Retrieval over a Tree Structure for Visual Question Answering. [PDF]
    Ning Xu, Zimu Lu, Hongshuo Tian, Rongbao Kang, Jinbo Cao, Yongdong Zhang, An-An Liu*.
    IEEE Transactions on Multimedia, vol. 26, pp. 6689-6700, 2024.

  • Event-aware Retrospective Learning for Knowledge-based Image Captioning. [PDF]
    An-An Liu, Yingchen Zhai, Ning Xu*, Hongshuo Tian, Weizhi Nie, Yongdong Zhang.
    IEEE Transactions on Multimedia, vol. 26, pp. 4898-4911, 2024.

  • Counterfactual Visual Dialog: Robust Commonsense Knowledge Learning from Unbiased Training. [PDF]
    An-An Liu, Chenxi Huang, Ning Xu*, Hongshuo Tian, Jing Liu, Yongdong Zhang.
    IEEE Transactions on Multimedia, vol. 26, pp. 1639-1651, 2024.

  • Prior knowledge guided text to image generation. [PDF]
    An-An Liu, Zefang Sun, Ning Xu*, Rongbao Kang, Jinbo Cao, Fan Yang, Weijun Qin, Shenyuan Zhang, Xuanya Li.
    Pattern Recognition Letters, vol. 177, pp. 89-95, 2024.

  • Exploring visual relationship for social media popularity prediction. [PDF]
    An-An Liu, Hongwei Du, Ning Xu*, Quan Zhang, Shenyuan Zhang, Yejun Tang, Xuanya Li.
    Journal of Visual Communication and Image Representation, vol. 90, 103738, 2023.

  • CD-GAN: Commonsense-Driven Generative Adversarial Network with Hierarchical Refinement for Text-to-Image Synthesis. [PDF]
    Guokai Zhang, Ning Xu*, Chenggang Yan, Bolun Zheng, Yulong Duan, Bo Lv, An-An Liu.
    Intelligent Computing, vol. 2, 0017, 2023.

  • Multi-stage reasoning on introspecting and revising bias for visual question answering. [PDF]
    An-An Liu, Zimu Lu, Ning Xu, Min Liu, Chenggang Yan, Bolun Zheng, Bo Lv, Yulong Duan, Zhuang Shao, Xuanya Li*.
    ACM Transactions on the Web, 2023.

  • A Comprehensive Survey on Deep-Learning-Based Visual Captioning. [PDF]
    Bowen Xin, Ning Xu*, Yingchen Zhai, Tingting Zhang, Zimu Lu, Jing Liu, Weizhi Nie, Xuanya Li, An-An Liu.
    Multimedia Systems, vol. 29, no. 6, pp. 3781-3804, 2023.

  • SMPC: Boosting Social Media Popularity Prediction with Caption. [PDF]
    An-An Liu, Xiaowen Wang, Ning Xu*, Jing Liu, Yuting Su, Quan Zhang, Shenyuan Zhang, Yejun Tang, Junbo Guo, Guoqing Jin, Xuanya Li*.
    Multimedia Systems, vol. 29, no. 2, pp. 577-586, 2023.

  • High-Order Interaction Learning for Image Captioning. [PDF]
    Yanhui Wang, Ning Xu*, An-An Liu*, Wenhui Li, Yongdong Zhang.
    IEEE Transactions on Circuits and Systems for Video Technology, vol. 32, no. 7, pp. 4417-4430, 2022.

  • Region-Aware Image Captioning via Interaction Learning. [PDF]
    An-An Liu, Yingchen Zhai, Ning Xu*, Weizhi Nie, Wenhui Li*, Yongdong Zhang.
    IEEE Transactions on Circuits and Systems for Video Technology, vol. 32, no. 6, pp. 3685-3696, 2022.

  • Closed-loop reasoning with graph-aware dense interaction for visual dialog. [PDF]
    An-An Liu, Guokai Zhang, Ning Xu*, Junbo Guo, Guoqing Jin, Xuanya Li.
    Multimedia Systems, vol. 28, no. 5, pp. 1823-1832, 2022.

  • A review of feature fusion-based media popularity prediction methods. [PDF]
    An-An Liu, Xiaowen Wang, Ning Xu*, Junbo Guo, Guoqing Jin, Quan Zhang, Yejun Tang, Shenyuan Zhang.
    Visual Informatics, vol. 6, no. 4, pp. 78-89, 2022.

  • Toward Region-Aware Attention Learning for Scene Graph Generation. [PDF]
    An-An Liu, Hongshuo Tian, Ning Xu*, Weizhi Nie, Yongdong Zhang, M. Kankanhalli.
    IEEE Transactions on Neural Networks and Learning Systems, vol. 33, no. 12, pp. 7655-7666, 2022.

  • Multi-type decision fusion network for visual Q&A. [PDF]
    An-An Liu, Zimu Lu, Ning Xu*, Weizhi Nie, Wenhui Li*.
    Image and Vision Computing, vol. 115, 104281, 2021.

  • Scene-Graph-Guided message passing network for dense captioning. [PDF]
    An-An Liu, Yanhui Wang, Ning Xu*, Shan Liu, Xuanya Li*.
    Pattern Recognition Letters, vol. 145, pp. 187-193, 2021.

  • Coupled-dynamic learning for vision and language: Exploring Interaction between different tasks. [PDF]
    Ning Xu, Hongshuo Tian, Yanhui Wang, Weizhi Nie, Dan Song, An-An Liu*, Wu Liu.
    Pattern Recognition, vol. 113, 107829, 2021.

  • Image Captioning with multi-level similarity-guided semantic matching. [PDF]
    Jiesi Li, Ning Xu*, Weizhi Nie, Shenyuan Zhang.
    Visual Informatics, vol. 5, no. 4, pp. 41-48, 2021.

  • Adaptively Clustering-Driven Learning for Visual Relationship Detection. [PDF]
    An-An Liu, Yanhui Wang, Ning Xu*, Weizhi Nie, Jie Nie*, Yongdong Zhang.
    IEEE Transactions on Multimedia, vol. 23, pp. 4515-4525, 2021.

  • Scene Graph Inference via Multi-Scale Context Modeling. [PDF]
    Ning Xu, An-An Liu*, Yongkang Wong, Weizhi Nie, Yuting Su, Mohan Kankanhalli.
    IEEE Transactions on Circuits and Systems for Video Technology, vol. 31, no. 3, pp. 1031-1041, 2021.

  • Multi-Level Policy and Reward-Based Deep Reinforcement Learning Framework for Image Captioning. [PDF]
    Ning Xu, Hanwang Zhang, An-An Liu*, Weizhi Nie, Yuting Su, Jie Nie*, Yongdong Zhang.
    IEEE Transactions on Multimedia, vol. 22, no. 5, pp. 1372-1383, 2020.

  • Hierarchical Deep Neural Network for Image Captioning. [PDF]
    Yuting Su, Yuqian Li, Ning Xu*, An-An Liu*.
    Neural Processing Letters, vol. 52, no. 2, pp. 1057-1067, 2020.

  • Dual-Stream Recurrent Neural Network for Video Captioning. [PDF]
    Ning Xu, An-An Liu*, Yongkang Wong, Yongdong Zhang, Weizhi Nie, Yuting Su, Mohan S. Kankanhalli.
    IEEE Transactions on Circuits and Systems for Video Technology, vol. 29, no. 8, pp. 2482-2493, 2019.

  • Multi-Guiding Long-Short Term Memory for Video Captioning. [PDF]
    Ning Xu, An-An Liu*, Weizhi Nie*, Yuting Su.
    Multimedia Systems, vol. 25, no. 6, pp. 663-672, 2019.

  • Multi-Domain and Multi-Task Learning for Human Action Recognition. [PDF]
    An-An Liu*, Ning Xu*, Weizhi Nie*, Yuting Su, Yongdong Zhang.
    IEEE Transactions on Image Processing, vol. 28, no. 2, pp. 853-867, 2019.

  • Scene graph captioner: Image captioning based on structural visual representation. [PDF]
    Ning Xu, An-An Liu*, Jing Liu*, Weizhi Nie, Yuting Su.
    Journal of Visual Communication and Image Representation, vol. 58, pp. 477-485, 2019.

  • Attention-in-Attention Networks for Surveillance Video Understanding in Internet of Things. [PDF]
    Ning Xu, An-An Liu*, Weizhi Nie, Yuting Su.
    IEEE Internet of Things Journal, vol. 5, no. 5, pp. 3419-3429, 2018.

  • Hierarchical & Multimodal Video Captioning: Discovering and Transferring Multimodal Knowledge for Vision to Language. [PDF]
    An-An Liu*, Ning Xu, Yongkang Wong, Junnan Li, Yuting Su, Mohan S. Kankanhalli.
    Computer Vision and Image Understanding, vol. 163, pp. 113-125, 2017.

  • Benchmarking a Multimodal and Multiview and Interactive Dataset for Human Action Recognition. [PDF]
    An-An Liu*, Ning Xu, Weizhi Nie, Yuting Su*, Yongkang Wong, Mohan S. Kankanhalli.
    IEEE Transactions on Cybernetics, vol. 47, no. 7, pp. 1781-1794, 2017.

  • Single/multi-view human action recognition via regularized multi-task learning. [PDF]
    An-An Liu, Ning Xu, Yuting Su, Hong Lin, Tong Hao, Zhaoxuan Yang.
    Neurocomputing, vol. 151, pp. 544-553, 2014.

Conference Papers

  • Towards Confidence-Aware Commonsense Knowledge Integration for Scene Graph Generation. [PDF]
    Hongshuo Tian, Ning Xu*, Yanhui Wang, Chenggang Yan, Bolun Zheng, Xuanya Li, An-An Liu*.
    ICME, pp. 2255-2260, 2023. (CCF-B)

  • Knowledge Prompt Makes Composed Pre-Trained Models Zero-Shot News Captioner. [PDF]
    Yanhui Wang, Ning Xu, Hongshuo Tian, Bo Lv, Yulong Duan, Xuanya Li*, An-An Liu.
    ICME, pp. 2879-2884, 2023. (CCF-B)

  • Triangle-Reward Reinforcement Learning: A Visual-Linguistic Semantic Alignment for Image Captioning. [PDF]
    Weizhi Nie, Jiesi Li, Ning Xu*, An-An Liu*, Xuanya Li, Yongdong Zhang.
    ACM Multimedia, pp. 4510-4518, 2021. (CCF-A)

  • Mask and Predict: Multi-step Reasoning for Scene Graph Generation. [PDF]
    Hongshuo Tian, Ning Xu*, An-An Liu*, Chenggang Yan, Zhendong Mao, Quan Zhang, Yongdong Zhang.
    ACM Multimedia, pp. 4128-4136, 2021. (CCF-A)

  • Self-Attention Graph Residual Convolutional Networks for Event Detection with dependency relations. [PDF]
    An-An Liu*, Ning Xu*, Haozhe Liu.
    EMNLP (Findings), pp. 302-311, 2021. (CCF-B)

  • Part-Aware Interactive Learning for Scene Graph Generation. [PDF]
    Hongshuo Tian, Ning Xu*, An-An Liu*, Yongdong Zhang.
    ACM Multimedia, pp. 3155-3163, 2020. (CCF-A)

  • Multi-Level Policy and Reward Reinforcement Learning for Image Captioning. [PDF]
    An-An Liu*, Ning Xu, Hanwang Zhang, Weizhi Nie, Yuting Su, Yongdong Zhang.
    IJCAI, pp. 821-827, 2018. (CCF-A)

  • A Multi-modal & Multi-view & Interactive Benchmark Dataset for Human Action Recognition. [PDF]
    Ning Xu, An-An Liu*, Weizhi Nie, Yongkang Wong, Fuwu Li, Yuting Su.
    ACM Multimedia, pp. 1195-1198, 2015. (CCF-A)

Technical Reports

  • TJU-NUS@TRECVID 2017: Video to Text Description. [PDF]
    An-An Liu, Yurui Qiu, Yongkang Wong, Ning Xu, Yuting Su, Mohan S. Kankanhalli.
    TRECVID Challenge Workshop, 2017.

  • MSR Video to Language Challenge. [PDF]
    Ning Xu, Junnan Li, Yang Li, An-An Liu, Yongkang Wong, Weizhi Nie, Yuting Su, Mohan S. Kankanhalli.
    Microsoft Research Video to Text (MSR-VTT) Challenge, 2016.

  • TJU-TJUT@TRECVID 2015: Surveillance Event Detection. [PDF]
    Yuting Su, An-An Liu*, Zan Gao, Weizhi Nie, Ning Xu, Fuwu Li.
    TRECVID Challenge Workshop, 2015.