MAIN CONFERENCE
All papers will be presented in the same manner. Each paper will have a five-minute pre-recorded video and a PDF of the poster. An asynchronous text chat will be available for each paper. Attendees can view the papers and videos on demand at any time. Authors will also have individual Q&A sessions at the posted times below.
All posted times are in EDT; the chart linked below gives conversions for all time zones. Once the virtual site is up, you will be able to select the sessions you are interested in, and it will populate your personal schedule.
Presentation Schedule
All times are Eastern Daylight Time
Date: Tuesday, June 22, 2021, 6:00–8:30
Paper Session Three:
Paper ID | Paper Title | Authors |
2557 | DyStaB: Unsupervised Object Segmentation via Dynamic-Static Bootstrapping | Yanchao Yang, Brian Lai, Stefano Soatto |
456 | Diffusion Probabilistic Models for 3D Point Cloud Generation | Shitong Luo, Wei Hu |
1292 | Learned Initializations for Optimizing Coordinate-Based Neural Representations | Matthew Tancik, Ben Mildenhall, Terrance Wang, Divi Schmidt, Pratul P. Srinivasan, Jonathan T. Barron, Ren Ng |
4729 | Neural Scene Graphs for Dynamic Scenes | Julian Ost, Fahim Mannan, Nils Thuerey, Julian Knodt, Felix Heide |
6513 | Consensus Maximisation Using Influences of Monotone Boolean Functions | Ruwan Tennakoon, David Suter, Erchuan Zhang, Tat-Jun Chin, Alireza Bab-Hadiashar |
566 | Task Programming: Learning Data Efficient Behavior Representations | Jennifer J. Sun, Ann Kennedy, Eric Zhan, David J. Anderson, Yisong Yue, Pietro Perona |
1058 | SCANimate: Weakly Supervised Learning of Skinned Clothed Avatar Networks | Shunsuke Saito, Jinlong Yang, Qianli Ma, Michael J. Black |
1342 | Diverse Part Discovery: Occluded Person Re-Identification With Part-Aware Transformer | Yulin Li, Jianfeng He, Tianzhu Zhang, Xiang Liu, Yongdong Zhang, Feng Wu |
8456 | What’s in the Image? Explorable Decoding of Compressed Images | Yuval Bahat, Tomer Michaeli |
7798 | Simple Copy-Paste Is a Strong Data Augmentation Method for Instance Segmentation | Golnaz Ghiasi, Yin Cui, Aravind Srinivas, Rui Qian, Tsung-Yi Lin, Ekin D. Cubuk, Quoc V. Le, Barret Zoph |
732 | Face Forgery Detection by 3D Decomposition | Xiangyu Zhu, Hao Wang, Hongyan Fei, Zhen Lei, Stan Z. Li |
1902 | Convolutional Hough Matching Networks | Juhong Min, Minsu Cho |
2997 | L2M-GAN: Learning To Manipulate Latent Space Semantics for Facial Attribute Editing | Guoxing Yang, Nanyi Fei, Mingyu Ding, Guangzhen Liu, Zhiwu Lu, Tao Xiang |
4865 | Patchwise Generative ConvNet: Training Energy-Based Models From a Single Natural Image for Internal Learning | Zilong Zheng, Jianwen Xie, Ping Li |
10084 | Generative Classifiers as a Basis for Trustworthy Image Classification | Radek Mackowiak, Lynton Ardizzone, Ullrich Köthe, Carsten Rother |
543 | HR-NAS: Searching Efficient High-Resolution Neural Architectures With Lightweight Transformers | Mingyu Ding, Xiaochen Lian, Linjie Yang, Peng Wang, Xiaojie Jin, Zhiwu Lu, Ping Luo |
2273 | Progressive Unsupervised Learning for Visual Object Tracking | Qiangqiang Wu, Jia Wan, Antoni B. Chan |
5934 | FFB6D: A Full Flow Bidirectional Fusion Network for 6D Pose Estimation | Yisheng He, Haibin Huang, Haoqiang Fan, Qifeng Chen, Jian Sun |
3270 | DER: Dynamically Expandable Representation for Class Incremental Learning | Shipeng Yan, Jiangwei Xie, Xuming He |
6831 | Dense Contrastive Learning for Self-Supervised Visual Pre-Training | Xinlong Wang, Rufeng Zhang, Chunhua Shen, Tao Kong, Lei Li |
7928 | S2R-DepthNet: Learning a Generalizable Depth-Specific Structural Representation | Xiaotian Chen, Yuwang Wang, Xuejin Chen, Wenjun Zeng |
7444 | Depth-Aware Mirror Segmentation | Haiyang Mei, Bo Dong, Wen Dong, Pieter Peers, Xin Yang, Qiang Zhang, Xiaopeng Wei |
7876 | Video Prediction Recalling Long-Term Motion Context via Memory Alignment Learning | Sangmin Lee, Hak Gu Kim, Dae Hwi Choi, Hyung-Il Kim, Yong Man Ro |
1637 | Room-and-Object Aware Knowledge Reasoning for Remote Embodied Referring Expression | Chen Gao, Jinyu Chen, Si Liu, Luting Wang, Qiong Zhang, Qi Wu |
7655 | GATSBI: Generative Agent-Centric Spatio-Temporal Object Interaction | Cheol-Hui Min, Jinseok Bae, Junho Lee, Young Min Kim |
6262 | Crossing Cuts Polygonal Puzzles: Models and Solvers | Peleg Harel, Ohad Ben-Shahar |
1769 | Transformation Invariant Few-Shot Object Detection | Aoxue Li, Zhenguo Li |
5174 | Adaptive Class Suppression Loss for Long-Tail Object Detection | Tong Wang, Yousong Zhu, Chaoyang Zhao, Wei Zeng, Jinqiao Wang, Ming Tang |
4180 | What if We Only Use Real Datasets for Scene Text Recognition? Toward Scene Text Recognition With Fewer Labels | Jeonghun Baek, Yusuke Matsui, Kiyoharu Aizawa |
2714 | Fourier Contour Embedding for Arbitrary-Shaped Text Detection | Yiqin Zhu, Jianyong Chen, Lingyu Liang, Zhanghui Kuang, Lianwen Jin, Wayne Zhang |
10709 | Humble Teachers Teach Better Students for Semi-Supervised Object Detection | Yihe Tang, Weifeng Chen, Yijun Luo, Yuting Zhang |
10513 | Cross-Modal Center Loss for 3D Cross-Modal Retrieval | Longlong Jing, Elahe Vahdani, Jiaxing Tan, Yingli Tian |
4554 | Single-View 3D Object Reconstruction From Shape Priors in Memory | Shuo Yang, Min Xu, Haozhe Xie, Stuart Perry, Jiahao Xia |
1125 | NeuralFusion: Online Depth Fusion in Latent Space | Silvan Weder, Johannes L. Schönberger, Marc Pollefeys, Martin R. Oswald |
2252 | PAConv: Position Adaptive Convolution With Dynamic Kernel Assembling on Point Clouds | Mutian Xu, Runyu Ding, Hengshuang Zhao, Xiaojuan Qi |
3524 | Self-Supervised Pillar Motion Learning for Autonomous Driving | Chenxu Luo, Xiaodong Yang, Alan Yuille |
6946 | Scan2Cap: Context-Aware Dense Captioning in RGB-D Scans | Zhenyu Chen, Ali Gholami, Matthias Nießner, Angel X. Chang |
7082 | Neural Parts: Learning Expressive 3D Shape Abstractions With Invertible Neural Networks | Despoina Paschalidou, Angelos Katharopoulos, Andreas Geiger, Sanja Fidler |
5581 | Universal Spectral Adversarial Attacks for Deformable Shapes | Arianna Rampini, Franco Pestarini, Luca Cosmo, Simone Melzi, Emanuele Rodolà |
7781 | Large-Scale Localization Datasets in Crowded Indoor Spaces | Donghwan Lee, Soohyun Ryu, Suyong Yeon, Yonghan Lee, Deokhwa Kim, Cheolho Han, Yohann Cabon, Philippe Weinzaepfel, Nicolas Guérin, Gabriela Csurka, Martin Humenberger |
2430 | Learnable Motion Coherence for Correspondence Pruning | Yuan Liu, Lingjie Liu, Cheng Lin, Zhen Dong, Wenping Wang |
2989 | Back to the Feature: Learning Robust Camera Localization From Pixels To Pose | Paul-Edouard Sarlin, Ajaykumar Unagar, Måns Larsson, Hugo Germain, Carl Toft, Viktor Larsson, Marc Pollefeys, Vincent Lepetit, Lars Hammarstrand, Fredrik Kahl, Torsten Sattler |
3629 | Wide-Baseline Relative Camera Pose Estimation With Directional Learning | Kefan Chen, Noah Snavely, Ameesh Makadia |
6536 | Deep Optimized Priors for 3D Shape Modeling and Reconstruction | Mingyue Yang, Yuxin Wen, Weikai Chen, Yongwei Chen, Kui Jia |
2912 | PVGNet: A Bottom-Up One-Stage 3D Object Detector With Integrated Multi-Level Features | Zhenwei Miao, Jikai Chen, Hongyu Pan, Ruiwen Zhang, Kaixuan Liu, Peihan Hao, Jun Zhu, Yang Wang, Xin Zhan |
2471 | Objects Are Different: Flexible Monocular 3D Object Detection | Yunpeng Zhang, Jiwen Lu, Jie Zhou |
466 | A Large-Scale Study on Unsupervised Spatiotemporal Representation Learning | Christoph Feichtenhofer, Haoqi Fan, Bo Xiong, Ross Girshick, Kaiming He |
8549 | Representing Videos As Discriminative Sub-Graphs for Action Recognition | Dong Li, Zhaofan Qiu, Yingwei Pan, Ting Yao, Houqiang Li, Tao Mei |
880 | Learning Salient Boundary Feature for Anchor-free Temporal Action Localization | Chuming Lin, Chengming Xu, Donghao Luo, Yabiao Wang, Ying Tai, Chengjie Wang, Jilin Li, Feiyue Huang, Yanwei Fu |
679 | QAIR: Practical Query-Efficient Black-Box Attacks for Image Retrieval | Xiaodan Li, Jinfeng Li, Yuefeng Chen, Shaokai Ye, Yuan He, Shuhui Wang, Hang Su, Hui Xue |
7333 | Defending Multimodal Fusion Models Against Single-Source Adversaries | Karren Yang, Wan-Yi Lin, Manash Barman, Filipe Condessa, Zico Kolter |
844 | Training Generative Adversarial Networks in One Stage | Chengchao Shen, Youtan Yin, Xinchao Wang, Xubin Li, Jie Song, Mingli Song |
1895 | Learning Complete 3D Morphable Face Models From Images and Videos | Mallikarjun B R, Ayush Tewari, Hans-Peter Seidel, Mohamed Elgharib, Christian Theobalt |
2253 | We Are More Than Our Joints: Predicting How 3D Bodies Move | Yan Zhang, Michael J. Black, Siyu Tang |
1017 | HybrIK: A Hybrid Analytical-Neural Inverse Kinematics Solution for 3D Human Pose and Shape Estimation | Jiefeng Li, Chao Xu, Zhicun Chen, Siyuan Bian, Lixin Yang, Cewu Lu |
3902 | Learning To Count Everything | Viresh Ranjan, Udbhav Sharma, Thu Nguyen, Minh Hoai |
1650 | Information Bottleneck Disentanglement for Identity Swapping | Gege Gao, Huaibo Huang, Chaoyou Fu, Zhaoyang Li, Ran He |
9944 | Mitigating Face Recognition Bias via Group Adaptive Classifier | Sixue Gong, Xiaoming Liu, Anil K. Jain |
277 | Meta Batch-Instance Normalization for Generalizable Person Re-Identification | Seokeon Choi, Taekyung Kim, Minki Jeong, Hyoungseob Park, Changick Kim |
10117 | Refining Pseudo Labels With Clustering Consensus Over Generations for Unsupervised Object Re-Identification | Xiao Zhang, Yixiao Ge, Yu Qiao, Hongsheng Li |
8305 | Back to Event Basics: Self-Supervised Learning of Image Reconstruction for Event Cameras via Photometric Constancy | Federico Paredes-Vallés, Guido C. H. E. de Croon |
548 | DeFMO: Deblurring and Shape Recovery of Fast Moving Objects | Denys Rozumnyi, Martin R. Oswald, Vittorio Ferrari, Jiří Matas, Marc Pollefeys |
5416 | Efficient Multi-Stage Video Denoising With Recurrent Spatio-Temporal Fusion | Matteo Maggioni, Yibin Huang, Cheng Li, Shuai Xiao, Zhongqian Fu, Fenglong Song |
5436 | ZeroScatter: Domain Transfer for Long Distance Imaging and Vision Through Scattering Media | Zheng Shi, Ethan Tseng, Mario Bijelic, Werner Ritter, Felix Heide |
6037 | Restoring Extremely Dark Images in Real Time | Mohit Lamba, Kaushik Mitra |
3964 | Practical Wide-Angle Portraits Correction With Deep Structured Models | Jing Tan, Shan Zhao, Pengfei Xiong, Jiangyu Liu, Haoqiang Fan, Shuaicheng Liu |
9951 | End-to-End Learning for Joint Image Demosaicing, Denoising and Super-Resolution | Wenzhu Xing, Karen Egiazarian |
267 | Image Super-Resolution With Non-Local Sparse Attention | Yiqun Mei, Yuchen Fan, Yuqian Zhou |
3036 | Video Rescaling Networks With Joint Optimization Strategies for Downscaling and Upscaling | Yan-Cheng Huang, Yi-Hsin Chen, Cheng-You Lu, Hui-Po Wang, Wen-Hsiao Peng, Ching-Chun Huang |
2170 | Restore From Restored: Video Restoration With Pseudo Clean Video | Seunghwan Lee, Donghyeon Cho, Jiwon Kim, Tae Hyun Kim |
1092 | Enriching ImageNet With Human Similarity Judgments and Psychological Embeddings | Brett D. Roads, Bradley C. Love |
8131 | Conceptual 12M: Pushing Web-Scale Image-Text Pre-Training To Recognize Long-Tail Visual Concepts | Soravit Changpinyo, Piyush Sharma, Nan Ding, Radu Soricut |
7919 | CondenseNet V2: Sparse Feature Reactivation for Deep Networks | Le Yang, Haojun Jiang, Ruojin Cai, Yulin Wang, Shiji Song, Gao Huang, Qi Tian |
941 | Revisiting Knowledge Distillation: An Inheritance and Exploration Framework | Zhen Huang, Xu Shen, Jun Xing, Tongliang Liu, Xinmei Tian, Houqiang Li, Bing Deng, Jianqiang Huang, Xian-Sheng Hua |
946 | Minimally Invasive Surgery for Sparse Neural Networks in Contrastive Manner | Chong Yu |
11586 | Effective Sparsification of Neural Networks With Global Sparsity Constraint | Xiao Zhou, Weizhong Zhang, Hang Xu, Tong Zhang |
7211 | Improving the Efficiency and Robustness of Deepfakes Detection Through Precise Geometric Features | Zekun Sun, Yujie Han, Zeyu Hua, Na Ruan, Weijia Jia |
8409 | Right for the Right Concept: Revising Neuro-Symbolic Concepts by Interacting With Their Explanations | Wolfgang Stammer, Patrick Schramowski, Kristian Kersting |
3722 | Protecting Intellectual Property of Generative Adversarial Networks From Ambiguity Attacks | Ding Sheng Ong, Chee Seng Chan, Kam Woh Ng, Lixin Fan, Qiang Yang |
572 | VIGOR: Cross-View Image Geo-Localization Beyond One-to-One Retrieval | Sijie Zhu, Taojiannan Yang, Chen Chen |
4736 | On Semantic Similarity in Video Retrieval | Michael Wray, Hazel Doughty, Dima Damen |
3760 | Flow-Guided One-Shot Talking Face Generation With a High-Resolution Audio-Visual Dataset | Zhimeng Zhang, Lincheng Li, Yu Ding, Changjie Fan |
5478 | Navigating the GAN Parameter Space for Semantic Image Editing | Anton Cherepkov, Andrey Voynov, Artem Babenko |
1299 | IMAGINE: Image Synthesis by Image-Guided Model Inversion | Pei Wang, Yijun Li, Krishna Kumar Singh, Jingwan Lu, Nuno Vasconcelos |
1550 | Human De-Occlusion: Invisible Perception and Recovery for Humans | Qiang Zhou, Shiyin Wang, Yitong Wang, Zilong Huang, Xinggang Wang |
2573 | Learning To Warp for Style Transfer | Xiao-Chang Liu, Yong-Liang Yang, Peter Hall |
7828 | StEP: Style-Based Encoder Pre-Training for Multi-Modal Image Synthesis | Moustafa Meshry, Yixuan Ren, Larry S. Davis, Abhinav Shrivastava |
5132 | ANR: Articulated Neural Rendering for Virtual Avatars | Amit Raj, Julian Tanke, James Hays, Minh Vo, Carsten Stoll, Christoph Lassner |
8365 | LayoutTransformer: Scene Layout Generation With Conceptual and Spatial Diversity | Cheng-Fu Yang, Wan-Cyuan Fan, Fu-En Yang, Yu-Chiang Frank Wang |
222 | Stochastic Image-to-Video Synthesis Using cINNs | Michael Dorkenwald, Timo Milbich, Andreas Blattmann, Robin Rombach, Konstantinos G. Derpanis, Björn Ommer |
5343 | Prototype Completion With Primitive Knowledge for Few-Shot Learning | Baoquan Zhang, Xutao Li, Yunming Ye, Zhichao Huang, Lisai Zhang |
11705 | Dynamic Class Queue for Large Scale Face Recognition in the Wild | Bi Li, Teng Xi, Gang Zhang, Haocheng Feng, Junyu Han, Jingtuo Liu, Errui Ding, Wenyu Liu |
10159 | Truly Shift-Invariant Convolutional Neural Networks | Anadi Chaman, Ivan Dokmanić |
3293 | RSG: A Simple but Effective Module for Learning Imbalanced Datasets | Jianfeng Wang, Thomas Lukasiewicz, Xiaolin Hu, Jianfei Cai, Zhenghua Xu |
6754 | Goal-Oriented Gaze Estimation for Zero-Shot Learning | Yang Liu, Lei Zhou, Xiao Bai, Yifei Huang, Lin Gu, Jun Zhou, Tatsuya Harada |
4294 | Uncalibrated Neural Inverse Rendering for Photometric Stereo of General Surfaces | Berk Kaya, Suryansh Kumar, Carlos Oliveira, Vittorio Ferrari, Luc Van Gool |
10421 | Robust Bayesian Neural Networks by Spectral Expectation Bound Regularization | Jiaru Zhang, Yang Hua, Zhengui Xue, Tao Song, Chengyu Zheng, Ruhui Ma, Haibing Guan |
6542 | MobileDets: Searching for Object Detection Architectures for Mobile Accelerators | Yunyang Xiong, Hanxiao Liu, Suyog Gupta, Berkin Akin, Gabriel Bender, Yongzhe Wang, Pieter-Jan Kindermans, Mingxing Tan, Vikas Singh, Bo Chen |
2698 | Hilbert Sinkhorn Divergence for Optimal Transport | Qian Li, Zhichao Wang, Gang Li, Jun Pang, Guandong Xu |
2785 | Object Classification From Randomized EEG Trials | Hamad Ahmed, Ronnie B. Wilbur, Hari M. Bharadwaj, Jeffrey Mark Siskind |
4924 | Leveraging Large-Scale Weakly Labeled Data for Semi-Supervised Mass Detection in Mammograms | Yuxing Tang, Zhenjie Cao, Yanbo Zhang, Zhicheng Yang, Zongcheng Ji, Yiwei Wang, Mei Han, Jie Ma, Jing Xiao, Peng Chang |
3400 | Tracking Pedestrian Heads in Dense Crowd | Ramana Sundararaman, Cédric De Almeida Braga, Eric Marchand, Julien Pettré |
1310 | Multiple Object Tracking With Correlation Learning | Qiang Wang, Yun Zheng, Pan Pan, Yinghui Xu |
8823 | SMURF: Self-Teaching Multi-Frame Unsupervised RAFT With Full-Image Warping | Austin Stone, Daniel Maurer, Alper Ayvaci, Anelia Angelova, Rico Jonschkowski |
2448 | Bilinear Parameterization for Non-Separable Singular Value Penalties | Marcus Valtonen Örnhag, José Pedro Iglesias, Carl Olsson |
420 | DSC-PoseNet: Learning 6DoF Object Pose Estimation via Dual-Scale Consistency | Zongxin Yang, Xin Yu, Yi Yang |
1223 | Unsupervised Disentanglement of Linear-Encoded Facial Semantics | Yutong Zheng, Yu-Kai Huang, Ran Tao, Zhiqiang Shen, Marios Savvides |
2813 | MetaCorrection: Domain-Aware Meta Loss Correction for Unsupervised Domain Adaptation in Semantic Segmentation | Xiaoqing Guo, Chen Yang, Baopu Li, Yixuan Yuan |
4620 | Cross-Domain Gradient Discrepancy Minimization for Unsupervised Domain Adaptation | Zhekai Du, Jingjing Li, Hongzu Su, Lei Zhu, Ke Lu |
1262 | Generative Interventions for Causal Learning | Chengzhi Mao, Augustine Cha, Amogh Gupta, Hao Wang, Junfeng Yang, Carl Vondrick |
2806 | Distilling Causal Effect of Data in Class-Incremental Learning | Xinting Hu, Kaihua Tang, Chunyan Miao, Xian-Sheng Hua, Hanwang Zhang |
2428 | Embedding Transfer With Label Relaxation for Improved Metric Learning | Sungyeon Kim, Dongwon Kim, Minsu Cho, Suha Kwak |
8012 | M3P: Learning Universal Representations via Multitask Multilingual Multimodal Pre-Training | Minheng Ni, Haoyang Huang, Lin Su, Edward Cui, Taroon Bharti, Lijuan Wang, Dongdong Zhang, Nan Duan |
6397 | Instance Localization for Self-Supervised Detection Pretraining | Ceyuan Yang, Zhirong Wu, Bolei Zhou, Stephen Lin |
1314 | VIP-DeepLab: Learning Visual Perception With Depth-Aware Video Panoptic Segmentation | Siyuan Qiao, Yukun Zhu, Hartwig Adam, Alan Yuille, Liang-Chieh Chen |
6985 | AdaBins: Depth Estimation Using Adaptive Bins | Shariq Farooq Bhat, Ibraheem Alhashim, Peter Wonka |
1418 | Deep Occlusion-Aware Instance Segmentation With Overlapping BiLayers | Lei Ke, Yu-Wing Tai, Chi-Keung Tang |
10761 | Information-Theoretic Segmentation by Inpainting Error Maximization | Pedro Savarese, Sunnie S. Y. Kim, Michael Maire, Greg Shakhnarovich, David McAllester |
448 | PLOP: Learning Without Forgetting for Continual Semantic Segmentation | Arthur Douillard, Yifu Chen, Arnaud Dapogny, Matthieu Cord |
2873 | Coarse-To-Fine Domain Adaptive Semantic Segmentation With Photometric Alignment and Category-Center Regularization | Haoyu Ma, Xiangru Lin, Zifeng Wu, Yizhou Yu |
6053 | HyperSeg: Patch-Wise Hypernetwork for Real-Time Semantic Segmentation | Yuval Nirkin, Lior Wolf, Tal Hassner |
4016 | Anti-Adversarially Manipulated Attributions for Weakly and Semi-Supervised Semantic Segmentation | Jungbeom Lee, Eunji Kim, Sungroh Yoon |
4572 | Instant-Teaching: An End-to-End Semi-Supervised Object Detection Framework | Qiang Zhou, Chaohui Yu, Zhibin Wang, Qi Qian, Hao Li |
11326 | Unbiased Mean Teacher for Cross-Domain Object Detection | Jinhong Deng, Wen Li, Yuhua Chen, Lixin Duan |
10922 | MeanShift++: Extremely Fast Mode-Seeking With Applications to Segmentation and Object Tracking | Jennifer Jang, Heinrich Jiang |
2940 | FlowStep3D: Model Unrolling for Self-Supervised Scene Flow Estimation | Yair Kittenplon, Yonina C. Eldar, Dan Raviv |
2034 | Recognizing Actions in Videos From Unseen Viewpoints | AJ Piergiovanni, Michael S. Ryoo |
1665 | VSPW: A Large-scale Dataset for Video Scene Parsing in the Wild | Jiaxu Miao, Yunchao Wei, Yu Wu, Chen Liang, Guangrui Li, Yi Yang |
4755 | Learning Position and Target Consistency for Memory-Based Video Object Segmentation | Li Hu, Peng Zhang, Bang Zhang, Pan Pan, Yinghui Xu, Rong Jin |
1719 | UC2: Universal Cross-Lingual Cross-Modal Vision-and-Language Pre-Training | Mingyang Zhou, Luowei Zhou, Shuohang Wang, Yu Cheng, Linjie Li, Zhou Yu, Jingjing Liu |
10239 | Fingerspelling Detection in American Sign Language | Bowen Shi, Diane Brentari, Greg Shakhnarovich, Karen Livescu |
1845 | Pose-Controllable Talking Face Generation by Implicitly Modularized Audio-Visual Representation | Hang Zhou, Yasheng Sun, Wayne Wu, Chen Change Loy, Xiaogang Wang, Ziwei Liu |
1631 | Collaborative Spatial-Temporal Modeling for Language-Queried Video Actor Segmentation | Tianrui Hui, Shaofei Huang, Si Liu, Zihan Ding, Guanbin Li, Wenguan Wang, Jizhong Han, Fei Wang |
7642 | Cascaded Prediction Network via Segment Tree for Temporal Video Grounding | Yang Zhao, Zhou Zhao, Zhu Zhang, Zhijie Lin |
990 | How Transferable Are Reasoning Patterns in VQA? | Corentin Kervadec, Théo Jaunet, Grigory Antipov, Moez Baccouche, Romain Vuillemot, Christian Wolf |
3065 | PointFlow: Flowing Semantics Through Points for Aerial Image Segmentation | Xiangtai Li, Hao He, Xia Li, Duo Li, Guangliang Cheng, Jianping Shi, Lubin Weng, Yunhai Tong, Zhouchen Lin |
2979 | HDMapGen: A Hierarchical Graph Generative Model of High Definition Maps | Lu Mi, Hang Zhao, Charlie Nash, Xiaohan Jin, Jiyang Gao, Chen Sun, Cordelia Schmid, Nir Shavit, Yuning Chai, Dragomir Anguelov |
3362 | A Circular-Structured Representation for Visual Emotion Distribution Learning | Jingyuan Yang, Jie Li, Leida Li, Xiumei Wang, Xinbo Gao |
2304 | More Photos Are All You Need: Semi-Supervised Learning for Fine-Grained Sketch Based Image Retrieval | Ayan Kumar Bhunia, Pinaki Nath Chowdhury, Aneeshan Sain, Yongxin Yang, Tao Xiang, Yi-Zhe Song |