MAIN CONFERENCE
All papers will be presented in the same manner. Each paper will have a five-minute pre-recorded video and a PDF of the poster. An asynchronous text chat will be available for each paper. Attendees can view the papers and videos on demand at any time. Authors will also have individual Q&A sessions at the posted times below.
All posted times are in EDT; the chart linked below gives conversions for all time zones. Once the virtual site is up, you will be able to select the sessions you are interested in, and the site will populate your personal schedule.
Presentation Schedule
All times are Eastern Daylight Time
Date: Monday, June 21, 2021, 22:00 – 00:30 (ends Tuesday, June 22)
Paper Session Two:
Paper ID | Paper Title | Authors |
1567 | Deep RGB-D Saliency Detection With Depth-Sensitive Attention and Automatic Multi-Modal Fusion | Peng Sun, Wenhu Zhang, Huanyu Wang, Songyuan Li, Xi Li |
5860 | SAIL-VOS 3D: A Synthetic Dataset and Baselines for Object Detection and 3D Mesh Reconstruction From Video Data | Yuan-Ting Hu, Jiahong Wang, Raymond A. Yeh, Alexander G. Schwing |
944 | Deep Implicit Templates for 3D Shape Representation | Zerong Zheng, Tao Yu, Qionghai Dai, Yebin Liu |
3893 | Pulsar: Efficient Sphere-Based Neural Rendering | Christoph Lassner, Michael Zollhöfer |
4796 | Neural Deformation Graphs for Globally-Consistent Non-Rigid Reconstruction | Aljaž Božič, Pablo Palafox, Michael Zollhöfer, Justus Thies, Angela Dai, Matthias Nießner |
6299 | Modeling Multi-Label Action Dependencies for Temporal Action Localization | Praveen Tirupattur, Kevin Duarte, Yogesh S Rawat, Mubarak Shah |
3479 | ContactOpt: Optimizing Contact To Improve Grasps | Patrick Grady, Chengcheng Tang, Christopher D. Twigg, Minh Vo, Samarth Brahmbhatt, Charles C. Kemp |
6014 | From Synthetic to Real: Unsupervised Domain Adaptation for Animal Pose Estimation | Chen Li, Gim Hee Lee |
6772 | Deep Homography for Efficient Stereo Image Compression | Xin Deng, Wenzhe Yang, Ren Yang, Mai Xu, Enpeng Liu, Qianhan Feng, Radu Timofte |
3985 | FVC: A New Framework Towards Deep Video Compression in Feature Space | Zhihao Hu, Guo Lu, Dong Xu |
5222 | Zero-Shot Adversarial Quantization | Yuang Liu, Wei Zhang, Jun Wang |
1456 | Farewell to Mutual Information: Variational Distillation for Cross-Modal Person Re-Identification | Xudong Tian, Zhizhong Zhang, Shaohui Lin, Yanyun Qu, Yuan Xie, Lizhuang Ma |
1478 | Closed-Form Factorization of Latent Semantics in GANs | Yujun Shen, Bolei Zhou |
1640 | High-Fidelity Neural Human Motion Transfer From Monocular Video | Moritz Kappel, Vladislav Golyanik, Mohamed Elgharib, Jann-Ole Henningson, Hans-Peter Seidel, Susana Castillo, Christian Theobalt, Marcus Magnor |
8309 | Correlated Input-Dependent Label Noise in Large-Scale Image Classification | Mark Collier, Basil Mustafa, Efi Kokiopoulou, Rodolphe Jenatton, Jesse Berent |
4210 | Bi-GCN: Binary Graph Convolutional Network | Junfu Wang, Yunhong Wang, Zhen Yang, Liang Yang, Yuanfang Guo |
2141 | Transformer Meets Tracker: Exploiting Temporal Context for Robust Visual Tracking | Ning Wang, Wengang Zhou, Jie Wang, Houqiang Li |
3966 | FS-Net: Fast Shape-Based Network for Category-Level 6D Object Pose Estimation With Decoupled Rotation Mechanism | Wei Chen, Xi Jia, Hyung Jin Chang, Jinming Duan, Linlin Shen, Aleš Leonardis |
2634 | On Learning the Geodesic Path for Incremental Learning | Christian Simon, Piotr Koniusz, Mehrtash Harandi |
5513 | UP-DETR: Unsupervised Pre-Training for Object Detection With Transformers | Zhigang Dai, Bolun Cai, Yugeng Lin, Junying Chen |
5075 | Robust Consistent Video Depth Estimation | Johannes Kopf, Xuejian Rong, Jia-Bin Huang |
6630 | Differentiable Multi-Granularity Human Representation Learning for Instance-Aware Human Semantic Parsing | Tianfei Zhou, Wenguan Wang, Si Liu, Yi Yang, Luc Van Gool |
5298 | Global Transport for Fluid Reconstruction With Learned Self-Supervision | Erik Franz, Barbara Solenthaler, Nils Thuerey |
784 | VLN BERT: A Recurrent Vision-and-Language BERT for Navigation | Yicong Hong, Qi Wu, Yuankai Qi, Cristian Rodriguez-Opazo, Stephen Gould |
4361 | Single-View Robot Pose and Joint Angle Estimation via Render & Compare | Yann Labbé, Justin Carpentier, Mathieu Aubry, Josef Sivic |
3735 | Learning Deep Classifiers Consistent With Fine-Grained Novelty Detection | Jiacheng Cheng, Nuno Vasconcelos |
2983 | CRFace: Confidence Ranker for Model-Agnostic Face Detection Refinement | Noranart Vesdapunt, Baoyuan Wang |
2222 | Equalization Loss v2: A New Gradient Balance Approach for Long-Tailed Object Detection | Jingru Tan, Xin Lu, Gang Zhang, Changqing Yin, Quanquan Li |
3057 | Semantic-Aware Video Text Detection | Wei Feng, Fei Yin, Xu-Yao Zhang, Cheng-Lin Liu |
8441 | Improved Handling of Motion Blur in Online Object Detection | Mohamed Sayed, Gabriel Brostow |
5030 | IQDet: Instance-Wise Quality Distribution Sampling for Object Detection | Yuchen Ma, Songtao Liu, Zeming Li, Jian Sun |
6084 | One Thing One Click: A Self-Training Approach for Weakly Supervised 3D Semantic Segmentation | Zhengzhe Liu, Xiaojuan Qi, Chi-Wing Fu |
4316 | Learning Monocular 3D Reconstruction of Articulated Categories From Motion | Filippos Kokkinos, Iasonas Kokkinos |
490 | SPSG: Self-Supervised Photometric Scene Generation From RGB-D Scans | Angela Dai, Yawar Siddiqui, Justus Thies, Julien Valentin, Matthias Nießner |
1506 | Semantic Segmentation for Real Point Cloud Scenes via Bilateral Augmentation and Adaptive Fusion | Shi Qiu, Saeed Anwar, Nick Barnes |
3516 | Unsupervised 3D Shape Completion Through GAN Inversion | Junzhe Zhang, Xinyi Chen, Zhongang Cai, Liang Pan, Haiyu Zhao, Shuai Yi, Chai Kiat Yeo, Bo Dai, Chen Change Loy |
6913 | 3D AffordanceNet: A Benchmark for Visual Object Affordance Understanding | Shengheng Deng, Xun Xu, Chaozheng Wu, Ke Chen, Kui Jia |
5873 | Deep Implicit Moving Least-Squares Functions for 3D Reconstruction | Shi-Lin Liu, Hao-Xiang Guo, Hao Pan, Peng-Shuai Wang, Xin Tong, Yang Liu |
1233 | Using Shape To Categorize: Low-Shot Learning With an Explicit Shape Bias | Stefan Stojanov, Anh Thai, James M. Rehg |
7160 | Privacy Preserving Localization and Mapping From Uncalibrated Cameras | Marcel Geppert, Viktor Larsson, Pablo Speciale, Johannes L. Schönberger, Marc Pollefeys |
2187 | HumanGPS: Geodesic PreServing Feature for Dense Human Correspondences | Feitong Tan, Danhang Tang, Mingsong Dou, Kaiwen Guo, Rohit Pandey, Cem Keskin, Ruofei Du, Deqing Sun, Sofien Bouaziz, Sean Fanello, Ping Tan, Yinda Zhang |
2631 | Learning Camera Localization via Dense Scene Matching | Shitao Tang, Chengzhou Tang, Rui Huang, Siyu Zhu, Ping Tan |
3096 | PlückerNet: Learn To Register 3D Line Reconstructions | Liu Liu, Hongdong Li, Haodong Yao, Ruyi Zha |
6766 | MultiLink: Multi-Class Structure Recovery via Agglomerative Clustering and Model Selection | Luca Magri, Filippo Leveni, Giacomo Boracchi |
2540 | 3D-MAN: 3D Multi-Frame Attention Network for Object Detection | Zetong Yang, Yin Zhou, Zhifeng Chen, Jiquan Ngiam |
1796 | Exploring Intermediate Representation for Monocular Vehicle Pose Estimation | Shichao Li, Zengqiang Yan, Hongyang Li, Kwang-Ting Cheng |
433 | Towards Long-Form Video Understanding | Chao-Yuan Wu, Philipp Krähenbühl |
8068 | TDN: Temporal Difference Networks for Efficient Action Recognition | Limin Wang, Zhan Tong, Bin Ji, Gangshan Wu |
759 | Self-Supervised Learning for Semi-Supervised Temporal Action Proposal | Xiang Wang, Shiwei Zhang, Zhiwu Qing, Yuanjie Shao, Changxin Gao, Nong Sang |
3889 | WOAD: Weakly Supervised Online Action Detection in Untrimmed Videos | Mingfei Gao, Yingbo Zhou, Ran Xu, Richard Socher, Caiming Xiong |
6930 | Enhancing the Transferability of Adversarial Attacks Through Variance Tuning | Xiaosen Wang, Kun He |
10158 | You See What I Want You To See: Exploring Targeted Black-Box Transferability Attack for Hash-Based Image Retrieval Systems | Yanru Xiao, Cong Wang |
9906 | Pose Recognition With Cascade Transformers | Ke Li, Shijie Wang, Xiang Zhang, Yifan Xu, Weijian Xu, Zhuowen Tu |
1740 | End-to-End Human Pose and Mesh Reconstruction With Transformers | Kevin Lin, Lijuan Wang, Zicheng Liu |
1013 | Beyond Static Features for Temporally Consistent 3D Human Pose and Shape From a Video | Hongsuk Choi, Gyeongsik Moon, Ju Yong Chang, Kyoung Mu Lee |
3209 | A Generalized Loss Function for Crowd Counting and Localization | Jia Wan, Ziquan Liu, Antoni B. Chan |
10252 | LOHO: Latent Optimization of Hairstyles via Orthogonalization | Rohit Saha, Brendan Duke, Florian Shkurti, Graham W. Taylor, Parham Aarabi |
8057 | Pseudo Facial Generation With Extreme Poses for Face Recognition | Guoli Wang, Jiaqi Ma, Qian Zhang, Jiwen Lu, Jie Zhou |
261 | Joint Generative and Contrastive Learning for Unsupervised Person Re-Identification | Hao Chen, Yaohui Wang, Benoit Lagadec, Antitza Dantcheva, Francois Bremond |
6862 | BiCnet-TKS: Learning Efficient Spatial-Temporal Representation for Video Person Re-Identification | Ruibing Hou, Hong Chang, Bingpeng Ma, Rui Huang, Shiguang Shan |
8051 | Learning To Reconstruct High Speed and High Dynamic Range Videos From Events | Yunhao Zou, Yinqiang Zheng, Tsuyoshi Takatani, Ying Fu |
208 | Iterative Filter Adaptive Network for Single Image Defocus Deblurring | Junyong Lee, Hyeongseok Son, Jaesung Rim, Sunghyun Cho, Seungyong Lee |
5129 | Recorrupted-to-Recorrupted: Unsupervised Deep Learning for Image Denoising | Tongyao Pang, Huan Zheng, Yuhui Quan, Hui Ji |
4108 | Closing the Loop: Joint Rain Generation and Removal via Disentangled Image Translation | Yuntong Ye, Yi Chang, Hanyu Zhou, Luxin Yan |
5493 | Deep Denoising of Flash and No-Flash Pairs for Photography in Low-Light Environments | Zhihao Xia, Michaël Gharbi, Federico Perazzi, Kalyan Sunkavalli, Ayan Chakrabarti |
3932 | Controllable Image Restoration for Under-Display Camera in Smartphones | Kinam Kwon, Eunhee Kang, Sangwon Lee, Su-Jin Lee, Hyong-Euk Lee, ByungIn Yoo, Jae-Joon Han |
7697 | MetaSCI: Scalable and Adaptive Reconstruction for Video Compressive Sensing | Zhengjue Wang, Hao Zhang, Ziheng Cheng, Bo Chen, Xin Yuan |
224 | Learning the Non-Differentiable Optimization for Blind Super-Resolution | Zheng Hui, Jie Li, Xiumei Wang, Xinbo Gao |
2840 | Robust Reference-Based Super-Resolution via C2-Matching | Yuming Jiang, Kelvin C.K. Chan, Xintao Wang, Chen Change Loy, Ziwei Liu |
1463 | Space-Time Distillation for Video Super-Resolution | Zeyu Xiao, Xueyang Fu, Jie Huang, Zhen Cheng, Zhiwei Xiong |
1090 | Person30K: A Dual-Meta Generalization Network for Person Re-Identification | Yan Bai, Jile Jiao, Wang Ce, Jun Liu, Yihang Lou, Xuetao Feng, Ling-Yu Duan |
7260 | Zillow Indoor Dataset: Annotated Floor Plans With 360° Panoramas and 3D Room Layouts | Steve Cruz, Will Hutchcroft, Yuguang Li, Naji Khosravan, Ivaylo Boyadzhiev, Sing Bing Kang |
6195 | The Heterogeneity Hypothesis: Finding Layer-Wise Differentiated Network Architectures | Yawei Li, Wen Li, Martin Danelljan, Kai Zhang, Shuhang Gu, Luc Van Gool, Radu Timofte |
376 | Distilling Object Detectors via Decoupled Features | Jianyuan Guo, Kai Han, Yunhe Wang, Han Wu, Xinghao Chen, Chunjing Xu, Chang Xu |
562 | S2-BNN: Bridging the Gap Between Self-Supervised Real and 1-Bit Neural Networks via Guided Distribution Calibration | Zhiqiang Shen, Zechun Liu, Jie Qin, Lei Huang, Kwang-Ting Cheng, Marios Savvides |
10832 | BCNet: Searching for Network Width With Bilaterally Coupled Network | Xiu Su, Shan You, Fei Wang, Chen Qian, Changshui Zhang, Chang Xu |
4819 | Multi-Attentional Deepfake Detection | Hanqing Zhao, Wenbo Zhou, Dongdong Chen, Tianyi Wei, Weiming Zhang, Nenghai Yu |
7128 | A Peek Into the Reasoning of Neural Networks: Interpreting With Structural Visual Concepts | Yunhao Ge, Yao Xiao, Zhi Xu, Meng Zheng, Srikrishna Karanam, Terrence Chen, Laurent Itti, Ziyan Wu |
3239 | Probabilistic Selective Encryption of Convolutional Neural Networks for Hierarchical Services | Jinyu Tian, Jiantao Zhou, Jia Duan |
3620 | Multi-Modal Relational Graph for Cross-Modal Video Moment Retrieval | Yawen Zeng, Da Cao, Xiaochi Wei, Meng Liu, Zhou Zhao, Zheng Qin |
4009 | PhD Learning: Learning With Pompeiu-Hausdorff Distances for Video-Based Vehicle Re-Identification | Jianan Zhao, Fengliang Qi, Guangyu Ren, Lin Xu |
819 | Pareidolia Face Reenactment | Linsen Song, Wayne Wu, Chaoyou Fu, Chen Qian, Chen Change Loy, Ran He |
5476 | Hyper-LifelongGAN: Scalable Lifelong Learning for Image Conditioned Generation | Mengyao Zhai, Lei Chen, Greg Mori |
743 | TediGAN: Text-Guided Diverse Face Image Generation and Manipulation | Weihao Xia, Yujiu Yang, Jing-Hao Xue, Baoyuan Wu |
1133 | TransFill: Reference-Guided Image Inpainting by Merging Multiple Color and Spatial Transformations | Yuqian Zhou, Connelly Barnes, Eli Shechtman, Sohrab Amirghodsi |
1437 | ArtCoder: An End-to-End Method for Generating Scanning-Robust Stylized QR Codes | Hao Su, Jianwei Niu, Xuefeng Liu, Qingfeng Li, Ji Wan, Mingliang Xu, Tao Ren |
7078 | Encoding in Style: A StyleGAN Encoder for Image-to-Image Translation | Elad Richardson, Yuval Alaluf, Or Patashnik, Yotam Nitzan, Yaniv Azar, Stav Shapiro, Daniel Cohen-Or |
4560 | Few-Shot Human Motion Transfer by Personalized Geometry and Texture Modeling | Zhichao Huang, Xintong Han, Jia Xu, Tong Zhang |
7586 | OCONet: Image Extrapolation by Object Completion | Richard Strong Bowen, Huiwen Chang, Charles Herrmann, Piotr Teterwak, Ce Liu, Ramin Zabih |
6394 | Greedy Hierarchical Variational Autoencoders for Large-Scale Video Prediction | Bohan Wu, Suraj Nair, Roberto Martín-Martín, Li Fei-Fei, Chelsea Finn |
3984 | Mutual CRF-GNN for Few-Shot Learning | Shixiang Tang, Dapeng Chen, Lei Bai, Kaijian Liu, Yixiao Ge, Wanli Ouyang |
4535 | Re-Labeling ImageNet: From Single to Multi-Labels, From Global to Localized Labels | Sangdoo Yun, Seong Joon Oh, Byeongho Heo, Dongyoon Han, Junsuk Choe, Sanghyuk Chun |
8886 | Differentiable Patch Selection for Image Recognition | Jean-Baptiste Cordonnier, Aravindh Mahendran, Alexey Dosovitskiy, Dirk Weissenborn, Jakob Uszkoreit, Thomas Unterthiner |
2836 | Distribution Alignment: A Unified Framework for Long-Tail Visual Recognition | Songyang Zhang, Zeming Li, Shipeng Yan, Xuming He, Jian Sun |
3177 | Contrastive Embedding for Generalized Zero-Shot Learning | Zongyan Han, Zhenyong Fu, Shuo Chen, Jian Yang |
4268 | Normal Integration via Inverse Plane Fitting With Minimum Point-to-Plane Distance | Xu Cao, Boxin Shi, Fumio Okura, Yasuyuki Matsushita |
7985 | Bayesian Nested Neural Networks for Uncertainty Calibration and Adaptive Compression | Yufei Cui, Ziquan Liu, Qiao Li, Antoni B. Chan, Chun Jason Xue |
4614 | NetAdaptV2: Efficient Neural Architecture Search With Fast Super-Network Training and Architecture Optimization | Tien-Ju Yang, Yi-Lun Liao, Vivienne Sze |
2167 | MIST: Multiple Instance Spatial Transformer | Baptiste Angles, Yuhe Jin, Simon Kornblith, Andrea Tagliasacchi, Kwang Moo Yi |
2067 | Multi-Institutional Collaborations for Improving Deep Learning-Based Magnetic Resonance Image Reconstruction Using Federated Learning | Pengfei Guo, Puyang Wang, Jinyuan Zhou, Shanshan Jiang, Vishal M. Patel |
4271 | A Self-Boosting Framework for Automated Radiographic Report Generation | Zhanyu Wang, Luping Zhou, Lei Wang, Xiu Li |
2427 | Learning a Proposal Classifier for Multiple Object Tracking | Peng Dai, Renliang Weng, Wongun Choi, Changshui Zhang, Zhangping He, Wei Ding |
234 | Improving Multiple Object Tracking With Single Object Tracking | Linyu Zheng, Ming Tang, Yingying Chen, Guibo Zhu, Jinqiao Wang, Hanqing Lu |
5850 | Feature-Level Collaboration: Joint Unsupervised Learning of Optical Flow, Stereo Depth and Camera Motion | Cheng Chi, Qingjie Wang, Tianyu Hao, Peng Guo, Xin Yang |
7377 | MaxUp: Lightweight Adversarial Training With Data Augmentation Improves Neural Network Training | Chengyue Gong, Tongzheng Ren, Mao Ye, Qiang Liu |
6748 | Unsupervised Human Pose Estimation Through Transforming Shape Templates | Luca Schmidtke, Athanasios Vlontzos, Simon Ellershaw, Anna Lukens, Tomoki Arichi, Bernhard Kainz |
8484 | Understanding the Behaviour of Contrastive Loss | Feng Wang, Huaping Liu |
2728 | Cross-Domain Adaptive Clustering for Semi-Supervised Domain Adaptation | Jichang Li, Guanbin Li, Yemin Shi, Yizhou Yu |
4528 | Divergence Optimization for Noisy Universal Domain Adaptation | Qing Yu, Atsushi Hashimoto, Yoshitaka Ushiku |
11434 | Limitations of Post-Hoc Feature Alignment for Robustness | Collin Burns, Jacob Steinhardt |
1954 | Semantic-Aware Knowledge Distillation for Few-Shot Class-Incremental Learning | Ali Cheraghian, Shafin Rahman, Pengfei Fang, Soumava Kumar Roy, Lars Petersson, Mehrtash Harandi |
11197 | Adaptive Aggregation Networks for Class-Incremental Learning | Yaoyao Liu, Bernt Schiele, Qianru Sun |
6113 | Progressive Modality Reinforcement for Human Multimodal Emotion Recognition From Unaligned Multimodal Sequences | Fengmao Lv, Xiang Chen, Yanyong Huang, Lixin Duan, Guosheng Lin |
5916 | Unsupervised Visual Representation Learning by Tracking Patches in Video | Guangting Wang, Yizhou Zhou, Chong Luo, Wenxuan Xie, Wenjun Zeng, Zhiwei Xiong |
983 | HoHoNet: 360 Indoor Holistic Understanding With Latent Horizontal Features | Cheng Sun, Min Sun, Hwann-Tzong Chen |
5458 | Depth Completion With Twin Surface Extrapolation at Occlusion Boundaries | Saif Imran, Xiaoming Liu, Daniel Morris |
1395 | Zero-Shot Instance Segmentation | Ye Zheng, Jiahong Wu, Yongqiang Qin, Faen Zhang, Li Cui |
10636 | Unsupervised Discovery of the Long-Tail in Instance Segmentation Using Hierarchical Self-Supervision | Zhenzhen Weng, Mehmet Giray Ogut, Shai Limonchik, Serena Yeung |
446 | Semi-Supervised Semantic Segmentation With Cross Pseudo Supervision | Xiaokang Chen, Yuhui Yuan, Gang Zeng, Jingdong Wang |
2532 | Non-Salient Region Object Mining for Weakly Supervised Semantic Segmentation | Yazhou Yao, Tao Chen, Guo-Sen Xie, Chuanyi Zhang, Fumin Shen, Qi Wu, Zhenmin Tang, Jian Zhang |
5925 | ABMDRNet: Adaptive-Weighted Bi-Directional Modality Difference Reduction Network for RGB-T Semantic Segmentation | Qiang Zhang, Shenlu Zhao, Yongjiang Luo, Dingwen Zhang, Nianchang Huang, Jungong Han |
4014 | BBAM: Bounding Box Attribution Map for Weakly Supervised Semantic and Instance Segmentation | Jungbeom Lee, Jihun Yi, Chaehun Shin, Sungroh Yoon |
4558 | Positive-Unlabeled Data Purification in the Wild for Object Detection | Jianyuan Guo, Kai Han, Han Wu, Chao Zhang, Xinghao Chen, Chunjing Xu, Chang Xu, Yunhe Wang |
7652 | Ranking Neural Checkpoints | Yandong Li, Xuhui Jia, Ruoxin Sang, Yukun Zhu, Bradley Green, Liqiang Wang, Boqing Gong |
10682 | SelfAugment: Automatic Augmentation Policies for Self-Supervised Learning | Colorado J Reed, Sean Metzger, Aravind Srinivas, Trevor Darrell, Kurt Keutzer |
2077 | Self-Supervised Multi-Frame Monocular Scene Flow | Junhwa Hur, Stefan Roth |
1549 | Skip-Convolutions for Efficient Video Processing | Amirhossein Habibian, Davide Abati, Taco S. Cohen, Babak Ehteshami Bejnordi |
1609 | Learning To Associate Every Segment for Video Panoptic Segmentation | Sanghyun Woo, Dahun Kim, Joon-Young Lee, In So Kweon |
4598 | Triple-Cooperative Video Shadow Detection | Zhihao Chen, Liang Wan, Lei Zhu, Jia Shen, Huazhu Fu, Wennan Liu, Jing Qin |
10358 | Image Change Captioning by Learning From an Auxiliary Task | Mehrdad Hosseinzadeh, Yang Wang |
4172 | How2Sign: A Large-Scale Multimodal Dataset for Continuous American Sign Language | Amanda Duarte, Shruti Palaskar, Lucas Ventura, Deepti Ghadiyaram, Kenneth DeHaan, Florian Metze, Jordi Torres, Xavier Giro-i-Nieto |
1731 | Cyclic Co-Learning of Sounding Object Visual Grounding and Sound Separation | Yapeng Tian, Di Hu, Chenliang Xu |
8850 | LipSync3D: Data-Efficient Learning of Personalized 3D Talking Faces From Video Using Pose and Lighting Normalization | Avisek Lahiri, Vivek Kwatra, Christian Frueh, John Lewis, Chris Bregler |
7458 | Interventional Video Grounding With Dual Contrastive Learning | Guoshun Nan, Rui Qiao, Yao Xiao, Jun Liu, Sicong Leng, Hao Zhang, Wei Lu |
987 | Roses Are Red, Violets Are Blue… but Should VQA Expect Them To? | Corentin Kervadec, Grigory Antipov, Moez Baccouche, Christian Wolf |
2862 | ReDet: A Rotation-Equivariant Detector for Aerial Object Detection | Jiaming Han, Jian Ding, Nan Xue, Gui-Song Xia |
2191 | Roof-GAN: Learning To Generate Roof Geometry and Relations for Residential Houses | Yiming Qian, Hao Zhang, Yasutaka Furukawa |
6778 | PANDA: Adapting Pretrained Features for Anomaly Detection and Segmentation | Tal Reiss, Niv Cohen, Liron Bergman, Yedid Hoshen |
7658 | Differentiable SLAM-Net: Learning Particle SLAM for Visual Navigation | Peter Karkus, Shaojun Cai, David Hsu |