Session Four

MAIN CONFERENCE

All papers will be presented in the same manner. Each paper will have a five-minute pre-recorded video and a PDF of the poster. An asynchronous text chat will be available for each paper. Attendees can view the papers and videos on demand at any time. Authors will also hold individual Q&A sessions at the times posted below.

All posted times are in EDT; the chart linked below gives the conversions for all other time zones. When the virtual site is up, you will be able to select the sessions you are interested in, and the site will populate your own schedule.

Presentation Schedule

  • All times are Eastern Daylight Time

Date: Tuesday, June 22, 2021   11:00 – 13:30
Paper Session Four:

Paper ID Paper Title Authors
8890 Line Segment Detection Using Transformers Without Edges Yifan Xu, Weijian Xu, David Cheung, Zhuowen Tu
1139 Predator: Registration of 3D Point Clouds With Low Overlap Shengyu Huang, Zan Gojcic, Mikhail Usvyatsov, Andreas Wieser, Konrad Schindler
1824 Point2Skeleton: Learning Skeletal Representations from Point Clouds Cheng Lin, Changjian Li, Yuan Liu, Nenglun Chen, Yi-King Choi, Wenping Wang
4945 Neural Lumigraph Rendering Petr Kellnhofer, Lars C. Jebe, Andrew Jones, Ryan Spicer, Kari Pulli, Gordon Wetzstein
2419 Rotation Coordinate Descent for Fast Globally Optimal Rotation Averaging Alvaro Parra, Shin-Fang Chng, Tat-Jun Chin, Anders Eriksson, Ian Reid
3659 Towards Evaluating and Training Verifiably Robust Neural Networks Zhaoyang Lyu, Minghao Guo, Tong Wu, Guodong Xu, Kehuan Zhang, Dahua Lin
1929 Human POSEitioning System (HPS): 3D Human Pose Estimation and Self-Localization in Large Scenes From Body-Mounted Sensors Vladimir Guzov, Aymen Mir, Torsten Sattler, Gerard Pons-Moll
3358 Discover Cross-Modality Nuances for Visible-Infrared Person Re-Identification Qiong Wu, Pingyang Dai, Jie Chen, Chia-Wen Lin, Yongjian Wu, Feiyue Huang, Bineng Zhong, Rongrong Ji
1427 Dual Pixel Exploration: Simultaneous Depth Estimation and Image Restoration Liyuan Pan, Shah Chowdhury, Richard Hartley, Miaomiao Liu, Hongguang Zhang, Hongdong Li
3586 Towards Good Practices for Efficiently Annotating Large-Scale Image Classification Datasets Yuan-Hong Liao, Amlan Kar, Sanja Fidler
1619 ForgeryNet: A Versatile Benchmark for Comprehensive Forgery Analysis Yinan He, Bei Gan, Siyu Chen, Yichun Zhou, Guojun Yin, Luchuan Song, Lu Sheng, Jing Shao, Ziwei Liu
5085 Spatial-Temporal Correlation and Topology Learning for Person Re-Identification in Videos Jiawei Liu, Zheng-Jun Zha, Wei Wu, Kecheng Zheng, Qibin Sun
5822 SSN: Soft Shadow Network for Image Compositing Yichen Sheng, Jianming Zhang, Bedrich Benes
11788 Soft-IntroVAE: Analyzing and Improving the Introspective Variational Autoencoder Tal Daniel, Aviv Tamar
4081 Learning Placeholders for Open-Set Recognition Da-Wei Zhou, Han-Jia Ye, De-Chuan Zhan
1896 ReNAS: Relativistic Evaluation of Neural Architecture Search Yixing Xu, Yunhe Wang, Kai Han, Yehui Tang, Shangling Jui, Chunjing Xu, Chang Xu
4117 Learning To Filter: Siamese Relation Network for Robust Tracking Siyuan Cheng, Bineng Zhong, Guorong Li, Xin Liu, Zhenjun Tang, Xianxian Li, Jing Wang
2644 Generative Hierarchical Features From Synthesizing Images Yinghao Xu, Yujun Shen, Jiapeng Zhu, Ceyuan Yang, Bolei Zhou
8086 Continual Adaptation of Visual Representations via Domain Randomization and Meta-Learning Riccardo Volpi, Diane Larlus, Grégory Rogez
8866 NewtonianVAE: Proportional Control and Goal Identification From Pixels via Physical Latent Spaces Miguel Jaques, Michael Burke, Timothy M. Hospedales
6073 3D-to-2D Distillation for Indoor Scene Parsing Zhengzhe Liu, Xiaojuan Qi, Chi-Wing Fu
10030 Repurposing GANs for One-Shot Semantic Part Segmentation Nontawat Tritrong, Pitchaporn Rewatbowornwong, Supasorn Suwajanakorn
2958 Temporal Query Networks for Fine-Grained Video Understanding Chuhan Zhang, Ankush Gupta, Andrew Zisserman
3887 ManipulaTHOR: A Framework for Visual Object Manipulation Kiana Ehsani, Winson Han, Alvaro Herrasti, Eli VanderBilt, Luca Weihs, Eric Kolve, Aniruddha Kembhavi, Roozbeh Mottaghi
344 Omnimatte: Associating Objects and Their Effects in Video Erika Lu, Forrester Cole, Tali Dekel, Andrew Zisserman, William T. Freeman, Michael Rubinstein
582 MeGA-CDA: Memory Guided Attention for Category-Aware Unsupervised Domain Adaptive Object Detection Vibashan VS, Vikram Gupta, Poojan Oza, Vishwanath A. Sindagi, Vishal M. Patel
2564 Generalized Few-Shot Object Detection Without Forgetting Zhibo Fan, Yuchen Ma, Zeming Li, Jian Sun
6392 DAP: Detection-Aware Pre-Training With Weak Supervision Yuanyi Zhong, Jianfeng Wang, Lijuan Wang, Jian Peng, Yu-Xiong Wang, Lei Zhang
5538 A Multiplexed Network for End-to-End, Multilingual OCR Jing Huang, Guan Pang, Rama Kovvuri, Mandy Toh, Kevin J Liang, Praveen Krishnan, Xi Yin, Tal Hassner
6276 Scene Text Retrieval via Joint Text Detection and Similarity Learning Hao Wang, Xiang Bai, Mingkun Yang, Shenggao Zhu, Jing Wang, Wenyu Liu
1987 Data-Uncertainty Guided Multi-Phase Learning for Semi-Supervised Object Detection Zhenyu Wang, Yali Li, Ye Guo, Lu Fang, Shengjin Wang
1291 pixelNeRF: Neural Radiance Fields From One or Few Images Alex Yu, Vickie Ye, Matthew Tancik, Angjoo Kanazawa
4891 From Points to Multi-Object 3D Reconstruction Francis Engelmann, Konstantinos Rematas, Bastian Leibe, Vittorio Ferrari
3949 4D Hyperspectral Photoacoustic Data Restoration With Reliability Analysis Weihang Liao, Art Subpa-asa, Yinqiang Zheng, Imari Sato
2306 RfD-Net: Point Scene Understanding by Semantic Instance Reconstruction Yinyu Nie, Ji Hou, Xiaoguang Han, Matthias Nießner
4083 Style-Based Point Generator With Adversarial Rendering for Point Cloud Completion Chulin Xie, Chuxin Wang, Bo Zhang, Hao Yang, Dong Chen, Fang Wen
7033 Denoise and Contrast for Category Agnostic Shape Completion Antonio Alliegro, Diego Valsesia, Giulia Fracastoro, Enrico Magli, Tatiana Tommasi
8562 Neural Surface Maps Luca Morreale, Noam Aigerman, Vladimir G. Kim, Niloy J. Mitra
3438 RGB-D Local Implicit Function for Depth Completion of Transparent Objects Luyang Zhu, Arsalan Mousavian, Yu Xiang, Hammad Mazhar, Jozef van Eenbergen, Shoubhik Debnath, Dieter Fox
10157 Uncertainty-Aware Camera Pose Estimation From Points and Lines Alexander Vakhitov, Luis Ferraz, Antonio Agudo, Francesc Moreno-Noguer
2491 Patch2Pix: Epipolar-Guided Pixel-Level Correspondences Qunjie Zhou, Torsten Sattler, Laura Leal-Taixé
7332 Deep Multi-Task Learning for Joint Localization, Perception, and Prediction John Phillips, Julieta Martinez, Ioan Andrei Bârsan, Sergio Casas, Abbas Sadat, Raquel Urtasun
4114 IBRNet: Learning Multi-View Image-Based Rendering Qianqian Wang, Zhicheng Wang, Kyle Genova, Pratul P. Srinivasan, Howard Zhou, Jonathan T. Barron, Ricardo Martin-Brualla, Noah Snavely, Thomas Funkhouser
7304 Unsupervised Learning of 3D Object Categories From Videos in the Wild Philipp Henzler, Jeremy Reizenstein, Patrick Labatut, Roman Shapovalov, Tobias Ritschel, Andrea Vedaldi, David Novotny
754 LiDAR-Aug: A General Rendering-Based Augmentation Framework for 3D Object Detection Jin Fang, Xinxin Zuo, Dingfu Zhou, Shengze Jin, Sen Wang, Liangjun Zhang
3226 Delving Into Localization Errors for Monocular 3D Object Detection Xinzhu Ma, Yinmin Zhang, Dan Xu, Dongzhan Zhou, Shuai Yi, Haojie Li, Wanli Ouyang
1652 3D CNNs With Adaptive Temporal Feature Resolutions Mohsen Fayyaz, Emad Bahrami, Ali Diba, Mehdi Noroozi, Ehsan Adeli, Luc Van Gool, Jürgen Gall
313 3D Human Action Representation Learning via Cross-View Consistency Pursuit Linguo Li, Minsi Wang, Bingbing Ni, Hang Wang, Jiancheng Yang, Wenjun Zhang
1723 Three Birds with One Stone: Multi-Task Temporal Action Detection via Recycling Temporal Annotations Zhihui Li, Lina Yao
960 Delving into Data: Effectively Substitute Training for Black-box Attack Wenxuan Wang, Bangjie Yin, Taiping Yao, Li Zhang, Yanwei Fu, Shouhong Ding, Jilin Li, Feiyue Huang, Xiangyang Xue
7686 Data-Free Model Extraction Jean-Baptiste Truong, Pratyush Maini, Robert J. Walls, Nicolas Papernot
7269 Adaptive Weighted Discriminator for Training Generative Adversarial Networks Vasily Zadorozhnyy, Qiang Cheng, Qiang Ye
1898 Monocular Reconstruction of Neural Face Reflectance Fields Mallikarjun B R, Ayush Tewari, Tae-Hyun Oh, Tim Weyrich, Bernd Bickel, Hans-Peter Seidel, Hanspeter Pfister, Wojciech Matusik, Mohamed Elgharib, Christian Theobalt
2267 Towards Accurate 3D Human Motion Prediction From Incomplete Observations Qiongjie Cui, Huaijiang Sun
1429 Monocular Real-Time Full Body Capture With Inter-Part Correlations Yuxiao Zhou, Marc Habermann, Ikhsanul Habibie, Ayush Tewari, Christian Theobalt, Feng Xu
6178 Cross-Modal Collaborative Representation Learning and a Large-Scale RGBT Benchmark for Crowd Counting Lingbo Liu, Jiaqi Chen, Hefeng Wu, Guanbin Li, Chenglong Li, Liang Lin
2827 One Shot Face Swapping on Megapixels Yuhao Zhu, Qi Li, Jian Wang, Cheng-Zhong Xu, Zhenan Sun
2974 Dynamic Probabilistic Graph Convolution for Facial Action Unit Intensity Estimation Tengfei Song, Zijun Cui, Yuru Wang, Wenming Zheng, Qiang Ji
815 Joint Noise-Tolerant Learning and Meta Camera Shift Adaptation for Unsupervised Person Re-Identification Fengxiang Yang, Zhun Zhong, Zhiming Luo, Yuanzheng Cai, Yaojin Lin, Shaozi Li, Nicu Sebe
10905 Prototype-Guided Saliency Feature Learning for Person Search Hanjae Kim, Sunghun Joung, Ig-Jae Kim, Kwanghoon Sohn
3944 Labeled From Unlabeled: Exploiting Unlabeled Data for Few-Shot Deep HDR Deghosting K. Ram Prabhakar, Gowtham Senthil, Susmit Agrawal, R. Venkatesh Babu, Rama Krishna Sai S Gorthi
2053 Learning Spatially-Variant MAP Models for Non-Blind Image Deblurring Jiangxin Dong, Stefan Roth, Bernt Schiele
7780 NBNet: Noise Basis Learning for Image Denoising With Subspace Projection Shen Cheng, Yuzhi Wang, Haibin Huang, Donghao Liu, Haoqiang Fan, Shuaicheng Liu
6177 Image De-Raining via Continual Learning Man Zhou, Jie Xiao, Yifan Chang, Xueyang Fu, Aiping Liu, Jinshan Pan, Zheng-Jun Zha
6156 Exploring Sparsity in Image Super-Resolution for Efficient Inference Longguang Wang, Xiaoyu Dong, Yingqian Wang, Xinyi Ying, Zaiping Lin, Wei An, Yulan Guo
4280 From Shadow Generation To Shadow Removal Zhihao Liu, Hui Yin, Xinyi Wu, Zhenyao Wu, Yang Mi, Song Wang
1962 Spatiotemporal Registration for Event-Based Visual Odometry Daqi Liu, Alvaro Parra, Tat-Jun Chin
302 BasicVSR: The Search for Essential Components in Video Super-Resolution and Beyond Kelvin C.K. Chan, Xintao Wang, Ke Yu, Chao Dong, Chen Change Loy
3343 Fast Bayesian Uncertainty Estimation and Reduction of Batch Normalized Single Image Super-Resolution Network Aupendu Kar, Prabir Kumar Biswas
2552 Learning Temporal Consistency for Low Light Video Enhancement From Single Images Fan Zhang, Yu Li, Shaodi You, Ying Fu
2107 Towards Semantic Segmentation of Urban-Scale 3D Point Clouds: A Dataset, Benchmarks and Challenges Qingyong Hu, Bo Yang, Sheikh Khalid, Wen Xiao, Niki Trigoni, Andrew Markham
8839 Neural Side-by-Side: Predicting Human Preferences for No-Reference Super-Resolution Evaluation Valentin Khrulkov, Artem Babenko
5369 Slimmable Compressive Autoencoders for Practical Neural Image Compression Fei Yang, Luis Herranz, Yongmei Cheng, Mikhail G. Mozerov
1390 Distilling Knowledge via Knowledge Review Pengguang Chen, Shu Liu, Hengshuang Zhao, Jiaya Jia
1043 Manifold Regularized Dynamic Network Pruning Yehui Tang, Yunhe Wang, Yixing Xu, Yiping Deng, Chao Xu, Dacheng Tao, Chang Xu
2224 Learnable Companding Quantization for Accurate Low-Bit Neural Networks Kohei Yamamoto
10100 Lips Don’t Lie: A Generalisable and Robust Approach To Face Forgery Detection Alexandros Haliassos, Konstantinos Vougioukas, Stavros Petridis, Maja Pantic
10338 Guided Integrated Gradients: An Adaptive Path Method for Removing Noise Andrei Kapishnikov, Subhashini Venugopalan, Besim Avci, Ben Wedin, Michael Terry, Tolga Bolukbasi
3934 Scalable Differential Privacy With Sparse Network Finetuning Zelun Luo, Daniel J. Wu, Ehsan Adeli, Li Fei-Fei
2115 Deep Graph Matching Under Quadratic Constraint Quankai Gao, Fudong Wang, Nan Xue, Jin-Gang Yu, Gui-Song Xia
8225 T2VLAD: Global-Local Sequence Alignment for Text-Video Retrieval Xiaohan Wang, Linchao Zhu, Yi Yang
5722 FaceInpainter: High Fidelity Face Adaptation to Heterogeneous Domains Jia Li, Zhaoyang Li, Jie Cao, Xingguang Song, Ran He
5597 Partition-Guided GANs Mohammadreza Armandpour, Ali Sadeghian, Chunyuan Li, Mingyuan Zhou
1451 Repopulating Street Scenes Yifan Wang, Andrew Liu, Richard Tucker, Jiajun Wu, Brian L. Curless, Steven M. Seitz, Noah Snavely
2855 Image Inpainting With External-Internal Learning and Monochromic Bottleneck Tengfei Wang, Hao Ouyang, Qifeng Chen
2878 DG-Font: Deformable Generative Networks for Unsupervised Font Generation Yangchen Xie, Xinyuan Chen, Li Sun, Yue Lu
7839 Drafting and Revision: Laplacian Pyramid Network for Fast High-Quality Artistic Style Transfer Tianwei Lin, Zhuoqi Ma, Fu Li, Dongliang He, Xin Li, Errui Ding, Nannan Wang, Jie Li, Xinbo Gao
5214 StylePeople: A Generative Model of Fullbody Human Avatars Artur Grigorev, Karim Iskakov, Anastasia Ianina, Renat Bashirov, Ilya Zakharkin, Alexander Vakhitov, Victor Lempitsky
11062 Synthesize-It-Classifier: Learning a Generative Classifier Through Recurrent Self-Analysis Arghya Pal, Raphaël C.-W. Phan, KokSheik Wong
244 Understanding Object Dynamics for Interactive Image-to-Video Synthesis Andreas Blattmann, Timo Milbich, Michael Dorkenwald, Björn Ommer
5839 Learning Dynamic Alignment via Meta-Filter for Few-Shot Learning Chengming Xu, Yanwei Fu, Chen Liu, Chengjie Wang, Jilin Li, Feiyue Huang, Li Zhang, Xiangyang Xue
2175 Jo-SRC: A Contrastive Approach for Combating Noisy Labels Yazhou Yao, Zeren Sun, Chuanyi Zhang, Fumin Shen, Qi Wu, Jian Zhang, Zhenmin Tang
4599 On Focal Loss for Class-Posterior Probability Estimation: A Theoretical Perspective Nontawat Charoenphakdee, Jayakorn Vongkulbhisal, Nuttapong Chairatanakul, Masashi Sugiyama
6434 MetaSAug: Meta Semantic Augmentation for Long-Tailed Visual Recognition Shuang Li, Kaixiong Gong, Chi Harold Liu, Yulin Wang, Feng Qiao, Xinjing Cheng
7256 Open World Compositional Zero-Shot Learning Massimiliano Mancini, Muhammad Ferjad Naeem, Yongqin Xian, Zeynep Akata
6107 Deep Texture Recognition via Exploiting Cross-Layer Statistical Self-Similarity Zhile Chen, Feng Li, Yuhui Quan, Yong Xu, Hui Ji
3260 Combinatorial Learning of Graph Edit Distance via Dynamic Embedding Runzhong Wang, Tianqi Zhang, Tianshu Yu, Junchi Yan, Xiaokang Yang
7448 TransNAS-Bench-101: Improving Transferability and Generalizability of Cross-Task Neural Architecture Search Yawen Duan, Xin Chen, Hang Xu, Zewei Chen, Xiaodan Liang, Tong Zhang, Zhenguo Li
10123 An Alternative Probabilistic Interpretation of the Huber Loss Gregory P. Meyer
7795 Joint Deep Model-Based MR Image and Coil Sensitivity Reconstruction Network (Joint-ICNet) for Fast MRI Yohan Jun, Hyungseob Shin, Taejoon Eo, Dosik Hwang
6230 Automatic Vertebra Localization and Identification in CT by Spine Rectification and Anatomically-Constrained Optimization Fakai Wang, Kang Zheng, Le Lu, Jing Xiao, Min Wu, Shun Miao
3472 Alpha-Refine: Boosting Tracking Performance by Precise Bounding Box Estimation Bin Yan, Xinyu Zhang, Dong Wang, Huchuan Lu, Xiaoyun Yang
1576 Learnable Graph Matching: Incorporating Graph Partitioning With Deep Feature Learning for Multiple Object Tracking Jiawei He, Zehao Huang, Naiyan Wang, Zhaoxiang Zhang
1821 Group-aware Label Transfer for Domain Adaptive Person Re-identification Kecheng Zheng, Wu Liu, Lingxiao He, Tao Mei, Jiebo Luo, Zheng-Jun Zha
8382 Double Low-Rank Representation With Projection Distance Penalty for Clustering Zhiqiang Fu, Yao Zhao, Dongxia Chang, Xingxing Zhang, Yiming Wang
453 Multiple Instance Active Learning for Object Detection Tianning Yuan, Fang Wan, Mengying Fu, Jianzhuang Liu, Songcen Xu, Xiangyang Ji, Qixiang Ye
2199 Learning Compositional Representation for 4D Captures With Neural ODE Boyan Jiang, Yinda Zhang, Xingkui Wei, Xiangyang Xue, Yanwei Fu
2879 Curriculum Graph Co-Teaching for Multi-Target Domain Adaptation Subhankar Roy, Evgeny Krivosheev, Zhun Zhong, Nicu Sebe, Elisa Ricci
4872 Instance Level Affinity-Based Transfer for Unsupervised Domain Adaptation Astuti Sharma, Tarun Kalluri, Manmohan Chandraker
3798 Deep Stable Learning for Out-of-Distribution Generalization Xingxuan Zhang, Peng Cui, Renzhe Xu, Linjun Zhou, Yue He, Zheyan Shen
3143 ORDisCo: Effective and Efficient Usage of Incremental Unlabeled Data for Semi-Supervised Continual Learning Liyuan Wang, Kuo Yang, Chongxuan Li, Lanqing Hong, Zhenguo Li, Jun Zhu
2689 Dynamic Metric Learning: Towards a Scalable Metric Space To Accommodate Multiple Semantic Scales Yifan Sun, Yuke Zhu, Yuhan Zhang, Pengkun Zheng, Xi Qiu, Chi Zhang, Yichen Wei
8722 Learning Cross-Modal Retrieval With Noisy Labels Peng Hu, Xi Peng, Hongyuan Zhu, Liangli Zhen, Jie Lin
7087 How Well Do Self-Supervised Models Transfer? Linus Ericsson, Henry Gouk, Timothy M. Hospedales
2643 Generic Perceptual Loss for Modeling Structured Output Dependencies Yifan Liu, Hao Chen, Yu Chen, Wei Yin, Chunhua Shen
7564 EDNet: Efficient Disparity Estimation With Cost Volume Combination and Attention-Based Spatial Residual Songyan Zhang, Zhicheng Wang, Qiang Wang, Jinshuo Zhang, Gang Wei, Xiaowen Chu
1909 BoxInst: High-Performance Instance Segmentation With Box Annotations Zhi Tian, Chunhua Shen, Xinlong Wang, Hao Chen
2060 PhySG: Inverse Rendering With Spherical Gaussians for Physics-Based Material Editing and Relighting Kai Zhang, Fujun Luan, Qianqian Wang, Kavita Bala, Noah Snavely
458 MaX-DeepLab: End-to-End Panoptic Segmentation With Mask Transformers Huiyu Wang, Yukun Zhu, Hartwig Adam, Alan Yuille, Liang-Chieh Chen
3092 Scale-Aware Graph Neural Network for Few-Shot Semantic Segmentation Guo-Sen Xie, Jie Liu, Huan Xiong, Ling Shao
8742 Part-Aware Panoptic Segmentation Daan de Geus, Panagiotis Meletis, Chenyang Lu, Xiaoxiao Wen, Gijs Dubbelman
4107 Railroad Is Not a Train: Saliency As Pseudo-Pixel Supervision for Weakly Supervised Semantic Segmentation Seungho Lee, Minhyun Lee, Jongwuk Lee, Hyunjung Shim
6035 Mask-Embedded Discriminator With Region-Based Semantic Regularization for Semi-Supervised Class-Conditional Image Synthesis Yi Liu, Xiaoyang Huo, Tianyi Chen, Xiangping Zeng, Si Wu, Zhiwen Yu, Hau-San Wong
1953 Unsupervised Hyperbolic Representation Learning via Message Passing Auto-Encoders Jiwoong Park, Junho Cho, Hyung Jin Chang, Jin Young Choi
3346 4D Panoptic LiDAR Segmentation Mehmet Aygün, Aljoša Ošep, Mark Weber, Maxim Maximov, Cyrill Stachniss, Jens Behley, Laura Leal-Taixé
3147 EffiScene: Efficient Per-Pixel Rigidity Inference for Unsupervised Joint Learning of Optical Flow, Depth, Camera Pose and Motion Segmentation Yang Jiao, Trac D. Tran, Guangming Shi
2976 Learning by Aligning Videos in Time Sanjay Haresh, Sateesh Kumar, Huseyin Coskun, Shahram N. Syed, Andrey Konin, Zeeshan Zia, Quoc-Huy Tran
1786 Modular Interactive Video Object Segmentation: Interaction-to-Mask, Propagation and Difference-Aware Fusion Ho Kei Cheng, Yu-Wing Tai, Chi-Keung Tang
5785 Polygonal Point Set Tracking Gunhee Nam, Miran Heo, Seoung Wug Oh, Joon-Young Lee, Seon Joo Kim
7322 VinVL: Revisiting Visual Representations in Vision-Language Models Pengchuan Zhang, Xiujun Li, Xiaowei Hu, Jianwei Yang, Lei Zhang, Lijuan Wang, Yejin Choi, Jianfeng Gao
3268 Visual Semantic Role Labeling for Video Understanding Arka Sadhu, Tanmay Gupta, Mark Yatskar, Ram Nevatia, Aniruddha Kembhavi
2308 Can Audio-Visual Integration Strengthen Robustness Under Multimodal Attacks? Yapeng Tian, Chenliang Xu
1835 Relation-aware Instance Refinement for Weakly Supervised Visual Grounding Yongfei Liu, Bo Wan, Lin Ma, Xuming He
10687 Learning Better Visual Dialog Agents With Pretrained Visual-Linguistic Representation Tao Tu, Qing Ping, Govindarajan Thattai, Gokhan Tur, Prem Natarajan
1154 Separating Skills and Concepts for Novel Visual Question Answering Spencer Whitehead, Hui Wu, Heng Ji, Rogerio Feris, Kate Saenko
2281 Generating Manga From Illustrations via Mimicking Manga Creation Workflow Lvmin Zhang, Xinrui Wang, Qingnan Fan, Yi Ji, Chunping Liu
3476 SelfDoc: Self-Supervised Document Representation Learning Peizhao Li, Jiuxiang Gu, Jason Kuen, Vlad I. Morariu, Handong Zhao, Rajiv Jain, Varun Manjunatha, Hongfu Liu
11090 Affect2MM: Affective Analysis of Multimedia Content Using Emotion Causality Trisha Mittal, Puneet Mathur, Aniket Bera, Dinesh Manocha
2542 Vectorization and Rasterization: Self-Supervised Learning for Sketch and Handwriting Ayan Kumar Bhunia, Pinaki Nath Chowdhury, Yongxin Yang, Timothy M. Hospedales, Tao Xiang, Yi-Zhe Song