CelebA-Spoof: Large-scale face anti-spoofing dataset with rich annotations Y Zhang, ZF Yin, Y Li, G Yin, J Yan, J Shao, Z Liu Computer Vision–ECCV 2020: 16th European Conference, Glasgow, UK, August 23 …, 2020 | 208 | 2020 |
LAMM: Language-assisted multi-modal instruction-tuning dataset, framework, and benchmark Z Yin, J Wang, J Cao, Z Shi, D Liu, M Li, X Huang, Z Wang, L Sheng, L Bai, ... Advances in Neural Information Processing Systems 36, 2024 | 126 | 2024 |
INTERN: A New Learning Paradigm Towards General Vision J Shao, S Chen, Y Li, K Wang, Z Yin, Y He, J Teng, Q Sun, M Gao, J Liu, ... arXiv preprint arXiv:2111.08687, 2021 | 34 | 2021 |
Benchmarking omni-vision representation through the lens of visual realms Y Zhang, Z Yin, J Shao, Z Liu European Conference on Computer Vision, 594-611, 2022 | 23 | 2022 |
Depicting Beyond Scores: Advancing Image Quality Assessment through Multi-modal Language Models Z You, Z Li, J Gu, Z Yin, T Xue, C Dong arXiv preprint arXiv:2312.08962, 2023 | 19 | 2023 |
Bamboo: Building Mega-Scale Vision Dataset Continually with Human-Machine Synergy Y Zhang, Q Sun, Y Zhou, Z He, Z Yin, K Wang, L Sheng, Y Qiao, J Shao, ... arXiv preprint arXiv:2203.07845, 2022 | 18 | 2022 |
Octavius: Mitigating task interference in MLLMs via MoE Z Chen, Z Wang, Z Wang, H Liu, Z Yin, S Liu, L Sheng, W Ouyang, Y Qiao, ... arXiv preprint arXiv:2311.02684, 2023 | 17 | 2023 |
Few-Shot Domain Expansion for Face Anti-Spoofing B Yang, J Zhang, Z Yin, J Shao arXiv preprint arXiv:2106.14162, 2021 | 17 | 2021 |
MineDreamer: Learning to Follow Instructions via Chain-of-Imagination for Simulated-World Control E Zhou, Y Qin, Z Yin, Y Huang, R Zhang, L Sheng, Y Qiao, J Shao arXiv preprint arXiv:2403.12037, 2024 | 16 | 2024 |
CelebA-Spoof Challenge 2020 on Face Anti-Spoofing: Methods and Results Y Zhang, Z Yin, J Shao, Z Liu, S Yang, Y Xiong, W Xia, Y Xu, M Luo, J Liu, ... arXiv preprint arXiv:2102.12642, 2021 | 15 | 2021 |
MP5: A multi-modal open-ended embodied system in Minecraft via active perception Y Qin, E Zhou, Q Liu, Z Yin, L Sheng, R Zhang, Y Qiao, J Shao 2024 IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR …, 2024 | 14 | 2024 |
From GPT-4 to Gemini and Beyond: Assessing the Landscape of MLLMs on Generalizability, Trustworthiness and Causality through Four Modalities C Lu, C Qian, G Zheng, H Fan, H Gao, J Zhang, J Shao, J Deng, J Fu, ... arXiv preprint arXiv:2401.15071, 2024 | 12 | 2024 |
Uni3D-LLM: Unifying Point Cloud Perception, Generation and Editing with Large Language Models D Liu, X Huang, Y Hou, Z Wang, Z Yin, Y Gong, P Gao, W Ouyang arXiv preprint arXiv:2402.03327, 2024 | 11 | 2024 |
ChEF: A Comprehensive Evaluation Framework for Standardized Assessment of Multimodal Large Language Models Z Shi, Z Wang, H Fan, Z Yin, L Sheng, Y Qiao, J Shao arXiv preprint arXiv:2311.02692, 2023 | 9 | 2023 |
3D Point Cloud Pre-training with Knowledge Distillation from 2D Images Y Yao, Y Zhang, Z Yin, J Luo, W Ouyang, X Huang arXiv preprint arXiv:2212.08974, 2022 | 9 | 2022 |
Assessment of Multimodal Large Language Models in Alignment with Human Values Z Shi, Z Wang, H Fan, Z Zhang, L Li, Y Zhang, Z Yin, L Sheng, Y Qiao, ... arXiv preprint arXiv:2403.17830, 2024 | 8 | 2024 |
Towards Tracing Trustworthiness Dynamics: Revisiting Pre-training Period of Large Language Models C Qian, J Zhang, W Yao, D Liu, Z Yin, Y Qiao, Y Liu, J Shao arXiv preprint arXiv:2402.19465, 2024 | 8 | 2024 |
SPA-VL: A Comprehensive Safety Preference Alignment Dataset for Vision Language Model Y Zhang, L Chen, G Zheng, Y Gao, R Zheng, J Fu, Z Yin, S Jin, Y Qiao, ... arXiv preprint arXiv:2406.12030, 2024 | 6 | 2024 |
LAMM: Language-Assisted Multi-Modal Instruction-Tuning Dataset, Framework, and Benchmark Z Yin, J Wang, J Cao, Z Shi, D Liu, M Li, L Sheng, L Bai, X Huang, Z Wang, ... 1-37, 2023 | 6 | 2023 |
X-learner: Learning cross sources and tasks for universal visual representation Y He, G Huang, S Chen, J Teng, K Wang, Z Yin, L Sheng, Z Liu, Y Qiao, ... European Conference on Computer Vision, 509-528, 2022 | 6 | 2022 |