Publications

My papers can be found at: Google Scholar


Selected Journal Articles

  • Rule-driven News Captioning. [PDF]
    Ning Xu, Tingting Zhang, Hongshuo Tian, An-An Liu*.
    IEEE Transactions on Circuits and Systems for Video Technology, vol. 34, no. 11, pp. 11657-11667, 2024.

  • Enriched Image Captioning based on Knowledge Divergence and Focus. [PDF]
    An-An Liu, Quanhan Wu, Ning Xu*, Hongshuo Tian, Lanjun Wang.
    IEEE Transactions on Circuits and Systems for Video Technology, 2024.

  • Model can be subtle: Two important mechanisms for Social Media Popularity Prediction. [PDF]
    Ning Xu, Xiaowen Wang, Jing Liu, Lanjun Wang, Xuanya Li, Mengxiao Zhu, Yongdong Zhang, An-An Liu*.
    ACM Transactions on Multimedia Computing, Communications, and Applications, 2024.

  • Multi-modal Validation and Domain Interaction Learning for Knowledge-based Visual Question Answering. [PDF]
    Ning Xu, Yifei Gao, An-An Liu*, Hongshuo Tian, Yongdong Zhang.
    IEEE Transactions on Knowledge and Data Engineering, vol. 36, no. 11, pp. 6628-6640, 2024.

  • Learning to Supervise Knowledge Retrieval over a Tree Structure for Visual Question Answering. [PDF]
    Ning Xu, Zimu Lu, Hongshuo Tian, Rongbao Kang, Jinbo Cao, Yongdong Zhang, An-An Liu*.
    IEEE Transactions on Multimedia, vol. 26, pp. 6689-6700, 2024.

  • Gaussian Distribution-Aware Commonsense Knowledge Learning for Scene Graph Generation. [PDF]
    Hongshuo Tian, Ning Xu*, Mohan Kankanhalli, An-An Liu*.
    IEEE Transactions on Circuits and Systems for Video Technology, vol. 34, no. 12, pp. 13044-13057, 2024.

  • Event-aware Retrospective Learning for Knowledge-based Image Captioning. [PDF]
    An-An Liu, Yingchen Zhai, Ning Xu*, Hongshuo Tian, Weizhi Nie, Yongdong Zhang.
    IEEE Transactions on Multimedia, vol. 26, pp. 4898-4911, 2024.

  • Counterfactual Visual Dialog: Robust Commonsense Knowledge Learning from Unbiased Training. [PDF]
    An-An Liu, Chenxi Huang, Ning Xu*, Hongshuo Tian, Jing Liu, Yongdong Zhang.
    IEEE Transactions on Multimedia, vol. 26, pp. 1639-1651, 2024.

  • Prior knowledge guided text to image generation. [PDF]
    An-An Liu, Zefang Sun, Ning Xu*, Rongbao Kang, Jinbo Cao, Fan Yang, Weijun Qin, Shenyuan Zhang, Xuanya Li.
    Pattern Recognition Letters, vol. 177, pp. 89-95, 2024.

  • Exploring visual relationship for social media popularity prediction. [PDF]
    An-An Liu, Hongwei Du, Ning Xu*, Quan Zhang, Shenyuan Zhang, Yejun Tang, Xuanya Li.
    Journal of Visual Communication and Image Representation, vol. 90, 103738, 2023.

  • Multi-stage reasoning on introspecting and revising bias for visual question answering. [PDF]
    An-An Liu, Zimu Lu, Ning Xu, Min Liu, Chenggang Yan, Bolun Zheng, Bo Lv, Yulong Duan, Zhuang Shao, Xuanya Li*.
    ACM Transactions on the Web, 2023.

  • High-Order Interaction Learning for Image Captioning. [PDF]
    Yanhui Wang, Ning Xu*, An-An Liu*, Wenhui Li, Yongdong Zhang.
    IEEE Transactions on Circuits and Systems for Video Technology, vol. 32, no. 7, pp. 4417-4430, 2022.

  • Region-Aware Image Captioning via Interaction Learning. [PDF]
    An-An Liu, Yingchen Zhai, Ning Xu*, Weizhi Nie, Wenhui Li*, Yongdong Zhang.
    IEEE Transactions on Circuits and Systems for Video Technology, vol. 32, no. 6, pp. 3685-3696, 2022.

  • Toward Region-Aware Attention Learning for Scene Graph Generation. [PDF]
    An-An Liu, Hongshuo Tian, Ning Xu*, Weizhi Nie, Yongdong Zhang, Mohan Kankanhalli.
    IEEE Transactions on Neural Networks and Learning Systems, vol. 33, no. 12, pp. 7655-7666, 2022.

  • Coupled-dynamic learning for vision and language: Exploring Interaction between different tasks. [PDF]
    Ning Xu, Hongshuo Tian, Yanhui Wang, Weizhi Nie, Dan Song, An-An Liu*, Wu Liu.
    Pattern Recognition, vol. 113, 107829, 2021.

  • Adaptively Clustering-Driven Learning for Visual Relationship Detection. [PDF]
    An-An Liu, Yanhui Wang, Ning Xu*, Weizhi Nie, Jie Nie*, Yongdong Zhang.
    IEEE Transactions on Multimedia, vol. 23, pp. 4515-4525, 2021.

  • Scene Graph Inference via Multi-Scale Context Modeling. [PDF]
    Ning Xu, An-An Liu*, Yongkang Wong, Weizhi Nie, Yuting Su, Mohan Kankanhalli.
    IEEE Transactions on Circuits and Systems for Video Technology, vol. 31, no. 3, pp. 1031-1041, 2021.

  • Multi-Level Policy and Reward-Based Deep Reinforcement Learning Framework for Image Captioning. [PDF]
    Ning Xu, Hanwang Zhang, An-An Liu*, Weizhi Nie, Yuting Su, Jie Nie*, Yongdong Zhang.
    IEEE Transactions on Multimedia, vol. 22, no. 5, pp. 1372-1383, 2020.

  • Hierarchical Deep Neural Network for Image Captioning. [PDF]
    Yuting Su, Yuqian Li, Ning Xu*, An-An Liu*.
    Neural Processing Letters, vol. 52, no. 2, pp. 1057-1067, 2020.

  • Dual-Stream Recurrent Neural Network for Video Captioning. [PDF]
    Ning Xu, An-An Liu*, Yongkang Wong, Yongdong Zhang, Weizhi Nie, Yuting Su, Mohan S. Kankanhalli.
    IEEE Transactions on Circuits and Systems for Video Technology, vol. 29, no. 8, pp. 2482-2493, 2019.

  • Multi-Guiding Long-Short Term Memory for Video Captioning. [PDF]
    Ning Xu, An-An Liu*, Weizhi Nie*, Yuting Su.
    Multimedia Systems, vol. 25, no. 6, pp. 663-672, 2019.

  • Multi-Domain and Multi-Task Learning for Human Action Recognition. [PDF]
    An-An Liu*, Ning Xu*, Weizhi Nie*, Yuting Su, Yongdong Zhang.
    IEEE Transactions on Image Processing, vol. 28, no. 2, pp. 853-867, 2019.

  • Scene graph captioner: Image captioning based on structural visual representation. [PDF]
    Ning Xu, An-An Liu*, Jing Liu*, Weizhi Nie, Yuting Su.
    Journal of Visual Communication and Image Representation, vol. 58, pp. 477-485, 2019.

  • Attention-in-Attention Networks for Surveillance Video Understanding in Internet of Things. [PDF]
    Ning Xu, An-An Liu*, Weizhi Nie, Yuting Su.
    IEEE Internet of Things Journal, vol. 5, no. 5, pp. 3419-3429, 2018.

  • Hierarchical & Multimodal Video Captioning: Discovering and Transferring Multimodal Knowledge for Vision to Language. [PDF]
    An-An Liu*, Ning Xu, Yongkang Wong, Junnan Li, Yuting Su, Mohan S. Kankanhalli.
    Computer Vision and Image Understanding, vol. 163, pp. 113-125, 2017.

  • Benchmarking a Multimodal and Multiview and Interactive Dataset for Human Action Recognition. [PDF]
    An-An Liu*, Ning Xu, Weizhi Nie, Yuting Su*, Yongkang Wong, Mohan S. Kankanhalli.
    IEEE Transactions on Cybernetics, vol. 47, no. 7, pp. 1781-1794, 2017.

  • Single/multi-view human action recognition via regularized multi-task learning. [PDF]
    An-An Liu, Ning Xu, Yuting Su, Hong Lin, Tong Hao, Zhaoxuan Yang.
    Neurocomputing, vol. 151, pp. 544-553, 2014.

Selected Conference Papers

  • Cross-Modal Coherence-Enhanced Feedback Prompting for News Captioning. [PDF]
    Ning Xu, Yifei Gao, Ting-Ting Zhang, Hongshuo Tian*, An-An Liu*.
    ACM Multimedia, pp. 9369-9377, 2024. (CCF-A)

  • Towards Confidence-Aware Commonsense Knowledge Integration for Scene Graph Generation. [PDF]
    Hongshuo Tian, Ning Xu*, Yanhui Wang, Chenggang Yan, Bolun Zheng, Xuanya Li, An-An Liu*.
    ICME, pp. 2255-2260, 2023. (CCF-B)

  • Knowledge Prompt Makes Composed Pre-Trained Models Zero-Shot News Captioner. [PDF]
    Yanhui Wang, Ning Xu, Hongshuo Tian, Bo Lv, Yulong Duan, Xuanya Li*, An-An Liu.
    ICME, pp. 2879-2884, 2023. (CCF-B)

  • Triangle-Reward Reinforcement Learning: A Visual-Linguistic Semantic Alignment for Image Captioning. [PDF]
    Weizhi Nie, Jiesi Li, Ning Xu*, An-An Liu*, Xuanya Li, Yongdong Zhang.
    ACM Multimedia, pp. 4510-4518, 2021. (CCF-A)

  • Mask and Predict: Multi-step Reasoning for Scene Graph Generation. [PDF]
    Hongshuo Tian, Ning Xu*, An-An Liu*, Chenggang Yan, Zhendong Mao, Quan Zhang, Yongdong Zhang.
    ACM Multimedia, pp. 4128-4136, 2021. (CCF-A)

  • Self-Attention Graph Residual Convolutional Networks for Event Detection with dependency relations. [PDF]
    An-An Liu*, Ning Xu*, Haozhe Liu.
    EMNLP (Findings), pp. 302-311, 2021. (CCF-B)

  • Part-Aware Interactive Learning for Scene Graph Generation. [PDF]
    Hongshuo Tian, Ning Xu*, An-An Liu*, Yongdong Zhang.
    ACM Multimedia, pp. 3155-3163, 2020. (CCF-A)

  • Multi-Level Policy and Reward Reinforcement Learning for Image Captioning. [PDF]
    An-An Liu*, Ning Xu, Hanwang Zhang, Weizhi Nie, Yuting Su, Yongdong Zhang.
    IJCAI, pp. 821-827, 2018. (CCF-A)

  • A Multi-modal & Multi-view & Interactive Benchmark Dataset for Human Action Recognition. [PDF]
    Ning Xu, An-An Liu*, Weizhi Nie, Yongkang Wong, Fuwu Li, Yuting Su.
    ACM Multimedia, pp. 1195-1198, 2015. (CCF-A)

Technical Reports

  • TJU-NUS@TRECVID 2017: Video to Text Description. [PDF]
    An-An Liu, Yurui Qiu, Yongkang Wong, Ning Xu, Yuting Su, Mohan S. Kankanhalli.
    TRECVID Challenge Workshop, 2017.

  • MSR Video to Language Challenge. [PDF]
    Ning Xu, Junnan Li, Yang Li, An-An Liu, Yongkang Wong, Weizhi Nie, Yuting Su, Mohan S. Kankanhalli.
    Microsoft Research Video to Text (MSR-VTT) Challenge, 2016.

  • TJU-TJUT@TRECVID 2015: Surveillance Event Detection. [PDF]
    Yuting Su, An-An Liu*, Zan Gao, Weizhi Nie, Ning Xu, Fuwu Li.
    TRECVID Challenge Workshop, 2015.