MAIN CONFERENCE
All papers will be presented in the same manner. Each paper will have a five-minute pre-recorded video and a PDF of the poster. An asynchronous text chat will be available for each paper. Attendees can view the papers and videos on demand at any time. Authors will also have individual Q&A sessions at the posted times below.
All posted times are in EDT; the chart linked below gives conversions for all time zones. Once the virtual site is up, you will be able to select the sessions you are interested in, and they will populate your personal schedule.
Presentation Schedule
All times are Eastern Daylight Time
Date: Thursday, June 24, 2021, 6:00–8:30
Paper Session Nine:
Paper ID | Paper Title | Authors |
4256 | Indoor Panorama Planar 3D Reconstruction via Divide and Conquer | Cheng Sun, Chi-Wei Hsiao, Ning-Hsu Wang, Min Sun, Hwann-Tzong Chen |
3271 | SOE-Net: A Self-Attention and Orientation Encoding Network for Point Cloud Based Place Recognition | Yan Xia, Yusheng Xu, Shuang Li, Rui Wang, Juan Du, Daniel Cremers, Uwe Stilla |
9941 | Neural Geometric Level of Detail: Real-Time Rendering With Implicit 3D Shapes | Towaki Takikawa, Joey Litalien, Kangxue Yin, Karsten Kreis, Charles Loop, Derek Nowrouzezahrai, Alec Jacobson, Morgan McGuire, Sanja Fidler |
4862 | SOLD2: Self-Supervised Occlusion-Aware Line Description and Detection | Rémi Pautrat, Juan-Ting Lin, Viktor Larsson, Martin R. Oswald, Marc Pollefeys |
1000 | PGT: A Progressive Method for Training Models on Long Videos | Bo Pang, Gao Peng, Yizhuo Li, Cewu Lu |
3232 | Dual Attention Guided Gaze Target Detection in the Wild | Yi Fang, Jiapeng Tang, Wang Shen, Wei Shen, Xiao Gu, Li Song, Guangtao Zhai |
2221 | ChallenCap: Monocular 3D Capture of Challenging Human Performances Using Multi-Modal References | Yannan He, Anqi Pang, Xin Chen, Han Liang, Minye Wu, Yuexin Ma, Lan Xu |
7777 | Blocks-World Cameras | Jongho Lee, Mohit Gupta |
1635 | Real-Time Sphere Sweeping Stereo From Multiview Fisheye Images | Andreas Meuleman, Hyeonjoong Jang, Daniel S. Jeon, Min H. Kim |
3454 | Optimal Gradient Checkpoint Search for Arbitrary Computation Graphs | Jianwei Feng, Dong Huang |
10440 | Black-Box Explanation of Object Detectors via Saliency Maps | Vitali Petsiuk, Rajiv Jain, Varun Manjunatha, Vlad I. Morariu, Ashutosh Mehra, Vicente Ordonez, Kate Saenko |
3367 | GIRAFFE: Representing Scenes As Compositional Generative Neural Feature Fields | Michael Niemeyer, Andreas Geiger |
3592 | CoCosNet v2: Full-Resolution Correspondence Learning for Image Translation | Xingran Zhou, Bo Zhang, Ting Zhang, Pan Zhang, Jianmin Bao, Dong Chen, Zhongfei Zhang, Fang Wen |
1802 | Your “Flamingo” is My “Bird”: Fine-Grained, or Not | Dongliang Chang, Kaiyue Pang, Yixiao Zheng, Zhanyu Ma, Yi-Zhe Song, Jun Guo |
4146 | Inception Convolution With Efficient Dilation Search | Jie Liu, Chuming Li, Feng Liang, Chen Lin, Ming Sun, Junjie Yan, Wanli Ouyang, Dong Xu |
10409 | Geo-FARM: Geodesic Factor Regression Model for Misaligned Pre-Shape Responses in Statistical Shape Analysis | Chao Huang, Anuj Srivastava, Rongjie Liu |
2823 | UnrealPerson: An Adaptive Pipeline Towards Costless Person Re-Identification | Tianyu Zhang, Lingxi Xie, Longhui Wei, Zijie Zhuang, Yongfei Zhang, Bo Li, Qi Tian |
11469 | Transferable Semantic Augmentation for Domain Adaptation | Shuang Li, Mixue Xie, Kaixiong Gong, Chi Harold Liu, Yulin Wang, Wei Li |
1391 | Jigsaw Clustering for Unsupervised Visual Representation Learning | Pengguang Chen, Shu Liu, Jiaya Jia |
1165 | SliceNet: Deep Dense Depth Estimation From a Single Indoor Panorama Using a Slice-Based Representation | Giovanni Pintore, Marco Agus, Eva Almansa, Jens Schneider, Enrico Gobbetti |
5623 | Fully Convolutional Scene Graph Generation | Hengyue Liu, Ning Yan, Masood Mortazavi, Bir Bhanu |
2061 | Meta Pseudo Labels | Hieu Pham, Zihang Dai, Qizhe Xie, Quoc V. Le |
3468 | ArtEmis: Affective Language for Visual Art | Panos Achlioptas, Maks Ovsjanikov, Kilichbek Haydarov, Mohamed Elhoseiny, Leonidas J. Guibas |
872 | RobustNet: Improving Domain Generalization in Urban-Scene Segmentation via Instance Selective Whitening | Sungha Choi, Sanghun Jung, Huiwon Yun, Joanne T. Kim, Seungryong Kim, Jaegul Choo |
765 | Simultaneously Localize, Segment and Rank the Camouflaged Objects | Yunqiu Lv, Jing Zhang, Yuchao Dai, Aixuan Li, Bowen Liu, Nick Barnes, Deng-Ping Fan |
7467 | Interpolation-Based Semi-Supervised Learning for Object Detection | Jisoo Jeong, Vikas Verma, Minsung Hyun, Juho Kannala, Nojun Kwak |
5510 | There Is More Than Meets the Eye: Self-Supervised Multi-Object Detection and Tracking With Sound by Distilling Multimodal Knowledge | Francisco Rivera Valverde, Juana Valeria Hurtado, Abhinav Valada |
11502 | Variational Pedestrian Detection | Yuang Zhang, Huanyu He, Jianguo Li, Yuxi Li, John See, Weiyao Lin |
1513 | Generalized Focal Loss V2: Learning Reliable Localization Quality Estimation for Dense Object Detection | Xiang Li, Wenhai Wang, Xiaolin Hu, Jun Li, Jinhui Tang, Jian Yang |
7272 | Unveiling the Potential of Structure Preserving for Weakly Supervised Object Localization | Xingjia Pan, Yingguo Gao, Zhiwen Lin, Fan Tang, Weiming Dong, Haolei Yuan, Feiyue Huang, Changsheng Xu |
2774 | Deep Active Surface Models | Udaranga Wickramasinghe, Pascal Fua, Graham Knott |
6565 | Normalized Avatar Synthesis Using StyleGAN and Perceptual Refinement | Huiwen Luo, Koki Nagano, Han-Wei Kung, Qingguo Xu, Zejian Wang, Lingyu Wei, Liwen Hu, Hao Li |
280 | Omni-Supervised Point Cloud Segmentation via Gradual Receptive Field Component Reasoning | Jingyu Gong, Jiachen Xu, Xin Tan, Haichuan Song, Yanyun Qu, Yuan Xie, Lizhuang Ma |
2972 | PU-GCN: Point Cloud Upsampling Using Graph Convolutional Networks | Guocheng Qian, Abdulellah Abualshour, Guohao Li, Ali Thabet, Bernard Ghanem |
5820 | CGA-Net: Category Guided Aggregation for Point Cloud Semantic Segmentation | Tao Lu, Limin Wang, Gangshan Wu |
738 | UV-Net: Learning From Boundary Representations | Pradeep Kumar Jayaraman, Aditya Sanghi, Joseph G. Lambourne, Karl D.D. Willis, Thomas Davies, Hooman Shayani, Nigel Morris |
1247 | Joint Learning of 3D Shape Retrieval and Deformation | Mikaela Angelina Uy, Vladimir G. Kim, Minhyuk Sung, Noam Aigerman, Siddhartha Chaudhuri, Leonidas J. Guibas |
3360 | Square Root Bundle Adjustment for Large-Scale Reconstruction | Nikolaus Demmel, Christiane Sommer, Daniel Cremers, Vladyslav Usenko |
5522 | Pixel-Aligned Volumetric Avatars | Amit Raj, Michael Zollhöfer, Tomas Simon, Jason Saragih, Shunsuke Saito, James Hays, Stephen Lombardi |
4105 | Learning To Identify Correct 2D-2D Line Correspondences on Sphere | Haoang Li, Kai Chen, Ji Zhao, Jiangliu Wang, Pyojin Kim, Zhe Liu, Yun-Hui Liu |
2104 | SpinNet: Learning a General Surface Descriptor for 3D Point Cloud Registration | Sheng Ao, Qingyong Hu, Bo Yang, Andrew Markham, Yulan Guo |
5511 | Self-Supervised Collision Handling via Generative 3D Garment Models for Virtual Try-On | Igor Santesteban, Nils Thuerey, Miguel A. Otaduy, Dan Casas |
7345 | End-to-End Rotation Averaging With Multi-Source Propagation | Luwei Yang, Heng Li, Jamal Ahmed Rahim, Zhaopeng Cui, Ping Tan |
3493 | Center-Based 3D Object Detection and Tracking | Tianwei Yin, Xingyi Zhou, Philipp Krähenbühl |
6010 | PointAugmenting: Cross-Modal Augmentation for 3D Object Detection | Chunwei Wang, Chao Ma, Ming Zhu, Xiaokang Yang |
4008 | Removing the Background by Adding the Background: Towards Background Robust Self-Supervised Video Representation Learning | Jinpeng Wang, Yuting Gao, Ke Li, Yiqi Lin, Andy J. Ma, Hao Cheng, Pai Peng, Feiyue Huang, Rongrong Ji, Xing Sun |
5981 | Trajectory Prediction With Latent Belief Energy-Based Model | Bo Pang, Tianyang Zhao, Xu Xie, Ying Nian Wu |
2505 | End-to-End Human Object Interaction Detection With HOI Transformer | Cheng Zou, Bohan Wang, Yue Hu, Junqi Liu, Qian Wu, Yu Zhao, Boxun Li, Chenguang Zhang, Chi Zhang, Yichen Wei, Jian Sun |
4438 | Simulating Unknown Target Models for Query-Efficient Black-Box Attacks | Chen Ma, Li Chen, Jun-Hai Yong |
3644 | Improving Transferability of Adversarial Patches on Face Recognition With Generative Models | Zihao Xiao, Xianfeng Gao, Chilin Fu, Yinpeng Dong, Wei Gao, Xiaolu Zhang, Jun Zhou, Jun Zhu |
3353 | When Human Pose Estimation Meets Robustness: Adversarial Algorithms and Benchmarks | Jiahang Wang, Sheng Jin, Wentao Liu, Weizhong Liu, Chen Qian, Ping Luo |
1281 | Body2Hands: Learning To Infer 3D Hands From Conversational Gesture Body Dynamics | Evonne Ng, Shiry Ginosar, Trevor Darrell, Hanbyul Joo |
4685 | SMPLicit: Topology-Aware Generative Model for Clothed People | Enric Corona, Albert Pumarola, Guillem Alenyà, Gerard Pons-Moll, Francesc Moreno-Noguer |
5051 | Multi-View Multi-Person 3D Pose Estimation With Plane Sweep Stereo | Jiahao Lin, Gim Hee Lee |
2234 | Progressive Semantic-Aware Style Transformation for Blind Face Restoration | Chaofeng Chen, Xiaoming Li, Lingbo Yang, Xianhui Lin, Lei Zhang, Kwan-Yee K. Wong |
1651 | Variational Prototype Learning for Deep Face Recognition | Jiankang Deng, Jia Guo, Jing Yang, Alexandros Lattas, Stefanos Zafeiriou |
3649 | Learning Spatial-Semantic Relationship for Facial Attribute Recognition With Limited Labeled Data | Ying Shu, Yan Yan, Si Chen, Jing-Hao Xue, Chunhua Shen, Hanzi Wang |
3279 | Intra-Inter Camera Similarity for Unsupervised Person Re-Identification | Shiyu Xuan, Shiliang Zhang |
8428 | Digital Gimbal: End-to-End Deep Image Stabilization With Learnable Exposure Times | Omer Dahary, Matan Jacoby, Alex M. Bronstein |
2803 | Learning Scalable ℓ∞-Constrained Near-Lossless Image Compression via Joint Lossy Image and Residual Compression | Yuanchao Bai, Xianming Liu, Wangmeng Zuo, Yaowei Wang, Xiangyang Ji |
5198 | Explore Image Deblurring via Encoded Blur Kernel Space | Phong Tran, Anh Tuan Tran, Quynh Phung, Minh Hoai |
788 | Self-Aligned Video Deraining With Transmission-Depth Consistency | Wending Yan, Robby T. Tan, Wenhan Yang, Dengxin Dai |
3474 | Nighttime Visibility Enhancement by Increasing the Dynamic Range and Suppression of Light Effects | Aashish Sharma, Robby T. Tan |
2151 | High-Quality Stereo Image Restoration From Double Refraction | Hakyeong Kim, Andreas Meuleman, Daniel S. Jeon, Min H. Kim |
5857 | Spk2ImgNet: Learning To Reconstruct Dynamic Scene From Continuous Spike Stream | Jing Zhao, Ruiqin Xiong, Hangfan Liu, Jian Zhang, Tiejun Huang |
5882 | Learning Tensor Low-Rank Prior for Hyperspectral Image Reconstruction | Shipeng Zhang, Lizhi Wang, Lei Zhang, Hua Huang |
1760 | ClassSR: A General Framework to Accelerate Super-Resolution Networks by Data Characteristic | Xiangtao Kong, Hengyuan Zhao, Yu Qiao, Chao Dong |
8483 | Scene Text Telescope: Text-Focused Scene Image Super-Resolution | Jingye Chen, Bin Li, Xiangyang Xue |
5698 | Real-Time Selfie Video Stabilization | Jiyang Yu, Ravi Ramamoorthi, Keli Cheng, Michel Sarkis, Ning Bi |
5430 | Rethinking Text Segmentation: A Novel Dataset and a Text-Specific Refinement Approach | Xingqian Xu, Zhifei Zhang, Zhaowen Wang, Brian Price, Zhonghao Wang, Humphrey Shi |
3951 | PQA: Perceptual Question Answering | Yonggang Qi, Kai Zhang, Aneeshan Sain, Yi-Zhe Song |
8341 | Communication Efficient SGD via Gradient Sampling With Bayes Prior | Liuyihan Song, Kang Zhao, Pan Pan, Yu Liu, Yingya Zhang, Yinghui Xu, Rong Jin |
4920 | Student-Teacher Learning From Clean Inputs to Noisy Inputs | Guanzhe Hong, Zhiyuan Mao, Xiaojun Lin, Stanley H. Chan |
6370 | Towards Extremely Compact RNNs for Video Recognition With Fully Decomposed Hierarchical Tucker Structure | Miao Yin, Siyu Liao, Xiao-Yang Liu, Xiaodong Wang, Bo Yuan |
11155 | Optimal Quantization Using Scaled Codebook | Yerlan Idelbayev, Pavlo Molchanov, Maying Shen, Hongxu Yin, Miguel Á. Carreira-Perpiñán, Jose M. Alvarez |
5163 | Causal Hidden Markov Model for Time Series Disease Forecasting | Jing Li, Botong Wu, Xinwei Sun, Yizhou Wang |
8035 | Fair Feature Distillation for Visual Recognition | Sangwon Jung, Donggyu Lee, Taeeon Park, Taesup Moon |
10280 | DISCO: Dynamic and Invariant Sensitive Channel Obfuscation for Deep Neural Networks | Abhishek Singh, Ayush Chopra, Ethan Garza, Emily Zhang, Praneeth Vepakomma, Vivek Sharma, Ramesh Raskar |
2849 | Person Re-Identification Using Heterogeneous Local Graph Attention Networks | Zhong Zhang, Haijia Zhang, Shuang Liu |
7258 | Hierarchical Video Prediction Using Relational Layouts for Human-Object Interactions | Navaneeth Bodla, Gaurav Shrivastava, Rama Chellappa, Abhinav Shrivastava |
1346 | Content-Aware GAN Compression | Yuchen Liu, Zhixin Shu, Yijun Li, Zhe Lin, Federico Perazzi, Sun-Yuan Kung |
8339 | Efficient Conditional GAN Transfer With Knowledge Propagation Across Classes | Mohamad Shahbazi, Zhiwu Huang, Danda Pani Paudel, Ajad Chhatkuli, Luc Van Gool |
4467 | Discovering Interpretable Latent Space Directions of GANs Beyond Binary Attributes | Huiting Yang, Liangyu Chai, Qiang Wen, Shuang Zhao, Zixun Sun, Shengfeng He |
3750 | Leveraging Line-Point Consistence To Preserve Structures for Wide Parallax Image Stitching | Qi Jia, ZhengJun Li, Xin Fan, Haotian Zhao, Shiyu Teng, Xinchen Ye, Longin Jan Latecki |
4836 | Rethinking Style Transfer: From Pixels to Parameterized Brushstrokes | Dmytro Kotovenko, Matthias Wright, Arthur Heimbrecht, Björn Ommer |
1070 | Scene-Aware Generative Network for Human Motion Synthesis | Jingbo Wang, Sijie Yan, Bo Dai, Dahua Lin |
875 | Stable View Synthesis | Gernot Riegler, Vladlen Koltun |
10334 | Understanding and Simplifying Perceptual Distances | Dan Amir, Yair Weiss |
7155 | Behavior-Driven Synthesis of Human Dynamics | Andreas Blattmann, Timo Milbich, Michael Dorkenwald, Björn Ommer |
10843 | Adaptive Image Transformer for One-Shot Object Detection | Ding-Jie Chen, He-Yen Hsieh, Tyng-Luh Liu |
10455 | Quality-Agnostic Image Recognition via Invertible Decoder | Insoo Kim, Seungju Han, Ji-won Baek, Seong-Jin Park, Jae-Joon Han, Jinwoo Shin |
6614 | Self-Supervised Wasserstein Pseudo-Labeling for Semi-Supervised Image Classification | Fariborz Taherkhani, Ali Dabouei, Sobhan Soleymani, Jeremy Dawson, Nasser M. Nasrabadi |
3724 | Improving Unsupervised Image Clustering With Robust Learning | Sungwon Park, Sungwon Han, Sundong Kim, Danu Kim, Sungkyu Park, Seunghoon Hong, Meeyoung Cha |
1094 | Group Collaborative Learning for Co-Salient Object Detection | Qi Fan, Deng-Ping Fan, Huazhu Fu, Chi-Keung Tang, Ling Shao, Yu-Wing Tai |
2435 | Pre-Trained Image Processing Transformer | Hanting Chen, Yunhe Wang, Tianyu Guo, Chang Xu, Yiping Deng, Zhenhua Liu, Siwei Ma, Chunjing Xu, Chao Xu, Wen Gao |
805 | DOTS: Decoupling Operation and Topology in Differentiable Architecture Search | Yu-Chao Gu, Li-Juan Wang, Yun Liu, Yi Yang, Yu-Huan Wu, Shao-Ping Lu, Ming-Ming Cheng |
2909 | Involution: Inverting the Inherence of Convolution for Visual Recognition | Duo Li, Jie Hu, Changhu Wang, Xiangtai Li, Qi She, Lei Zhu, Tong Zhang, Qifeng Chen |
5829 | Cross-Iteration Batch Normalization | Zhuliang Yao, Yue Cao, Shuxin Zheng, Gao Huang, Stephen Lin |
415 | Learning Calibrated Medical Image Segmentation via Multi-Rater Agreement Modeling | Wei Ji, Shuang Yu, Junde Wu, Kai Ma, Cheng Bian, Qi Bi, Jingjing Li, Hanruo Liu, Li Cheng, Yefeng Zheng |
511 | Track To Detect and Segment: An Online Multi-Object Tracker | Jialian Wu, Jiale Cao, Liangchen Song, Yu Wang, Ming Yang, Junsong Yuan |
7330 | Rotation Equivariant Siamese Networks for Tracking | Deepak K. Gupta, Devanshu Arya, Efstratios Gavves |
9904 | SiamMOT: Siamese Multi-Object Tracking | Bing Shuai, Andrew Berneshawi, Xinyu Li, Davide Modolo, Joseph Tighe |
3384 | On Feature Normalization and Data Augmentation | Boyi Li, Felix Wu, Ser-Nam Lim, Serge Belongie, Kilian Q. Weinberger |
2278 | Learning a Self-Expressive Network for Subspace Clustering | Shangzhi Zhang, Chong You, René Vidal, Chun-Guang Li |
3960 | Dual-GAN: Joint BVP and Noise Modeling for Remote Physiological Measurement | Hao Lu, Hu Han, S. Kevin Zhou |
717 | Prototypical Pseudo Label Denoising and Target Structure Learning for Domain Adaptive Semantic Segmentation | Pan Zhang, Bo Zhang, Ting Zhang, Dong Chen, Yong Wang, Fang Wen |
3337 | RPN Prototype Alignment for Domain Adaptive Object Detector | Yixin Zhang, Zilei Wang, Yushi Mao |
10370 | PixMatch: Unsupervised Domain Adaptation via Pixelwise Consistency Training | Luke Melas-Kyriazi, Arjun K. Manrai |
10692 | Adversarial Invariant Learning | Nanyang Ye, Jingxuan Tang, Huayu Deng, Xiao-Yun Zhou, Qianxiao Li, Zhenguo Li, Guang-Zhong Yang, Zhanxing Zhu |
6619 | Few-Shot Incremental Learning With Continually Evolved Classifiers | Chi Zhang, Nan Song, Guosheng Lin, Yun Zheng, Pan Pan, Yinghui Xu |
7960 | Unsupervised Hyperbolic Metric Learning | Jiexi Yan, Lei Luo, Cheng Deng, Heng Huang |
1704 | Audio-Visual Instance Discrimination With Cross-Modal Agreement | Pedro Morgado, Nuno Vasconcelos, Ishan Misra |
10375 | CoCoNets: Continuous Contrastive 3D Scene Representations | Shamit Lal, Mihir Prabhudesai, Ishita Mediratta, Adam W. Harley, Katerina Fragkiadaki |
1461 | Bilateral Grid Learning for Stereo Matching Networks | Bin Xu, Yuhua Xu, Xiaoli Yang, Wei Jia, Yulan Guo |
10612 | Radar-Camera Pixel Depth Association for Depth Completion | Yunfei Long, Daniel Morris, Xiaoming Liu, Marcos Castro, Punarjay Chakravarty, Praveen Narayanan |
4372 | Panoptic Segmentation Forecasting | Colin Graber, Grace Tsai, Michael Firman, Gabriel Brostow, Alexander G. Schwing |
3630 | Probabilistic Modeling of Semantic Ambiguity for Scene Graph Generation | Gengcong Yang, Jingyi Zhang, Yong Zhang, Baoyuan Wu, Yujiu Yang |
1976 | Learning Statistical Texture for Semantic Segmentation | Lanyun Zhu, Deyi Ji, Shiping Zhu, Weihao Gan, Wei Wu, Junjie Yan |
4498 | (AF)²-S3Net: Attentive Feature Fusion With Adaptive Feature Selection for Sparse Semantic Segmentation Network | Ran Cheng, Ryan Razani, Ehsan Taghavi, Enxu Li, Bingbing Liu |
6104 | Scale-Localized Abstract Reasoning | Yaniv Benny, Niv Pekar, Lior Wolf |
5901 | Few-Shot Open-Set Recognition by Transformation Consistency | Minki Jeong, Seokeon Choi, Changick Kim |
2548 | I3Net: Implicit Instance-Invariant Network for Adapting One-Stage Object Detectors | Chaoqi Chen, Zebiao Zheng, Yue Huang, Xinghao Ding, Yizhou Yu |
4591 | Unsupervised Feature Learning by Cross-Level Instance-Group Discrimination | Xudong Wang, Ziwei Liu, Stella X. Yu |
207 | Multi-Shot Temporal Event Localization: A Benchmark | Xiaolong Liu, Yao Hu, Song Bai, Fei Ding, Xiang Bai, Philip H. S. Torr |
3921 | Learning the Predictability of the Future | Dídac Surís, Ruoshi Liu, Carl Vondrick |
8466 | SSAN: Separable Self-Attention Network for Video Representation Learning | Xudong Guo, Xun Guo, Yan Lu |
2811 | Action Shuffle Alternating Learning for Unsupervised Action Segmentation | Jun Li, Sinisa Todorovic |
1215 | Towards Accurate Text-Based Image Captioning With Content Diversity Exploration | Guanghui Xu, Shuaicheng Niu, Mingkui Tan, Yucheng Luo, Qing Du, Qi Wu |
1104 | Kaleido-BERT: Vision-Language Pre-Training on Fashion Domain | Mingchen Zhuge, Dehong Gao, Deng-Ping Fan, Linbo Jin, Ben Chen, Haoming Zhou, Minghui Qiu, Ling Shao |
11014 | Transitional Adaptation of Pretrained Models for Visual Storytelling | Youngjae Yu, Jiwan Chung, Heeseung Yun, Jongseok Kim, Gunhee Kim |
8091 | Multi-Stage Aggregated Transformer Network for Temporal Language Localization in Videos | Mingxing Zhang, Yang Yang, Xinghan Chen, Yanli Ji, Xing Xu, Jingjing Li, Heng Tao Shen |
3613 | Connecting What To Say With Where To Look by Modeling Human Attention Traces | Zihang Meng, Licheng Yu, Ning Zhang, Tamara L. Berg, Babak Damavandi, Vikas Singh, Amy Bearman |
6452 | SOON: Scenario Oriented Object Navigation With Graph-Based Exploration | Fengda Zhu, Xiwen Liang, Yi Zhu, Qizhi Yu, Xiaojun Chang, Xiaodan Liang |
3123 | Counterfactual VQA: A Cause-Effect Look at Language Bias | Yulei Niu, Kaihua Tang, Hanwang Zhang, Zhiwu Lu, Xian-Sheng Hua, Ji-Rong Wen |
778 | Learning by Watching | Jimuyang Zhang, Eshed Ohn-Bar |
5920 | Personalized Outfit Recommendation With Learnable Anchors | Zhi Lu, Yang Hu, Yan Chen, Bing Zeng |
7318 | Safe Local Motion Planning With Self-Supervised Freespace Forecasting | Peiyun Hu, Aaron Huang, John Dolan, David Held, Deva Ramanan |
6805 | Anomaly Detection in Video via Self-Supervised and Multi-Task Learning | Mariana-Iuliana Georgescu, Antonio Bărbălău, Radu Tudor Ionescu, Fahad Shahbaz Khan, Marius Popescu, Mubarak Shah |