Tanmay Gupta
Research Scientist @ PRIOR, Allen AI (Ai2)
Verified email at allenai.org
Title
Cited by
Year
Visual programming: Compositional visual reasoning without training
T Gupta, A Kembhavi
Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern …, 2023
Cited by: 363, Year: 2023
Completing 3d object shape from one depth image
J Rock, T Gupta, J Thorsen, JY Gwak, D Shin, D Hoiem
Proceedings of the IEEE conference on computer vision and pattern …, 2015
Cited by: 212, Year: 2015
No-frills human-object interaction detection: Factorization, layout encodings, and training techniques
T Gupta, A Schwing, D Hoiem
Proceedings of the IEEE/CVF International Conference on Computer Vision …, 2019
Cited by: 169, Year: 2019
Contrastive learning for weakly supervised phrase grounding
T Gupta, A Vahdat, G Chechik, X Yang, J Kautz, D Hoiem
European Conference on Computer Vision, 752-768, 2020
Cited by: 146, Year: 2020
Open X-Embodiment: Robotic Learning Datasets and RT-X Models: Open X-Embodiment Collaboration
A O’Neill, A Rehman, A Maddukuri, A Gupta, A Padalkar, A Lee, A Pooley, ...
2024 IEEE International Conference on Robotics and Automation (ICRA), 6892-6903, 2024
Cited by: 95, Year: 2024
Towards general purpose vision systems: An end-to-end task-agnostic vision-language architecture
T Gupta, A Kamath, A Kembhavi, D Hoiem
Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern …, 2022
Cited by: 91, Year: 2022
Imagine this! scripts to compositions to videos
T Gupta, D Schwenk, A Farhadi, D Hoiem, A Kembhavi
Proceedings of the European conference on computer vision (ECCV), 598-613, 2018
Cited by: 88, Year: 2018
Visual semantic role labeling for video understanding
A Sadhu, T Gupta, M Yatskar, R Nevatia, A Kembhavi
Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern …, 2021
Cited by: 71, Year: 2021
Webly supervised concept expansion for general purpose vision models
A Kamath, C Clark, T Gupta, E Kolve, D Hoiem, A Kembhavi
European Conference on Computer Vision, 662-681, 2022
Cited by: 58, Year: 2022
Vico: Word embeddings from visual co-occurrences
T Gupta, A Schwing, D Hoiem
Proceedings of the IEEE/CVF International Conference on Computer Vision …, 2019
Cited by: 34, Year: 2019
Grit: General robust image task benchmark
T Gupta, R Marten, A Kembhavi, D Hoiem
arXiv preprint arXiv:2204.13653, 2022
Cited by: 31, Year: 2022
Object 3dit: Language-guided 3d-aware image editing
O Michel, A Bhattad, E VanderBilt, R Krishna, A Kembhavi, T Gupta
Advances in Neural Information Processing Systems 36, 2024
Cited by: 27, Year: 2024
Learning curves for analysis of deep networks
D Hoiem, T Gupta, Z Li, M Shlapentokh-Rothman
International conference on machine learning, 4287-4296, 2021
Cited by: 25, Year: 2021
Aligned Image-Word Representations Improve Inductive Transfer Across Vision-Language Tasks
T Gupta, K Shih, S Singh, D Hoiem
International Conference on Computer Vision (ICCV), 2017
Cited by: 24, Year: 2017
SPOC: Imitating Shortest Paths in Simulation Enables Effective Navigation and Manipulation in the Real World
K Ehsani, T Gupta, R Hendrix, J Salvador, L Weihs, KH Zeng, KP Singh, ...
Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern …, 2024
Cited by: 20*, Year: 2024
Molmo and pixmo: Open weights and open data for state-of-the-art multimodal models
M Deitke, C Clark, S Lee, R Tripathi, Y Yang, JS Park, M Salehi, ...
arXiv preprint arXiv:2409.17146, 2024
Cited by: 16, Year: 2024
Task Me Anything
J Zhang, W Huang, Z Ma, O Michel, D He, T Gupta, WC Ma, A Farhadi, ...
arXiv preprint arXiv:2406.11775, 2024
Cited by: 12, Year: 2024
m&m’s: A Benchmark to Evaluate Tool-Use for multi-step multi-modal Tasks
Z Ma, W Huang, J Zhang, T Gupta, R Krishna
European Conference on Computer Vision, 18-34, 2025
Cited by: 9, Year: 2025
3dfs: Deformable dense depth fusion and segmentation for object reconstruction from a handheld camera
T Gupta, D Shin, N Sivagnanadasan, D Hoiem
arXiv preprint arXiv:1606.05002, 2016
Cited by: 6, Year: 2016
Joint representation learning from images and text
A Vahdat, T Gupta, X Yang, J Kautz
US Patent 11,948,078, 2024
Cited by: 2, Year: 2024