MAIN CONFERENCE
All papers will be presented in the same manner. Each paper will have a five-minute pre-recorded video and a PDF of the poster. An asynchronous text chat will be available for each paper. Attendees can view the papers and videos on demand at any time. Authors will also hold individual Q&A sessions at the times posted below.
All posted times are in EDT; the chart linked below gives the conversions for all time zones. Once the virtual site is up, you will be able to select the sessions you are interested in, and it will populate your personal schedule.
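As a rough illustration of the conversion, the sketch below shifts the Wednesday, June 23, 2021 session window (11:00 to 13:30 EDT, as listed in the schedule) to other UTC offsets using only Python's standard library. The helper names and the sample target offsets are assumptions chosen for illustration and are not part of the conference site; EDT is treated as a fixed UTC-4 offset.

```python
from datetime import datetime, timedelta, timezone

# EDT is UTC-4. A fixed offset is sufficient here because the session
# falls on a single summer date (assumption: no DST change is involved).
EDT = timezone(timedelta(hours=-4), "EDT")

# Session window taken from the schedule: June 23, 2021, 11:00 to 13:30 EDT.
SESSION_START = datetime(2021, 6, 23, 11, 0, tzinfo=EDT)
SESSION_END = datetime(2021, 6, 23, 13, 30, tzinfo=EDT)


def to_offset(moment, utc_offset_hours, label):
    """Convert an aware datetime to a fixed UTC offset (hypothetical helper)."""
    target = timezone(timedelta(hours=utc_offset_hours), label)
    return moment.astimezone(target)


def session_window(utc_offset_hours, label):
    """Format the session window in the requested offset (hypothetical helper)."""
    start = to_offset(SESSION_START, utc_offset_hours, label)
    end = to_offset(SESSION_END, utc_offset_hours, label)
    return f"{start:%Y-%m-%d %H:%M} to {end:%Y-%m-%d %H:%M} {label}"


if __name__ == "__main__":
    # Sample targets only; substitute your own offset in hours from UTC.
    for hours, label in [(0, "UTC"), (2, "CEST"), (9, "JST")]:
        print(session_window(hours, label))
```

Note that for offsets far ahead of EDT, such as UTC+9, the session falls on the following calendar day, so it is worth checking the date as well as the time.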
Presentation Schedule
All times are Eastern Daylight Time
Date: Wednesday, June 23, 2021 11:00 – 13:30
Paper Session Seven:
Paper ID | Paper Title | Authors |
5050 | VarifocalNet: An IoU-Aware Dense Object Detector | Haoyang Zhang, Ying Wang, Feras Dayoub, Niko Sünderhauf |
2755 | Variational Relational Point Completion Network | Liang Pan, Xinyi Chen, Zhongang Cai, Junzhe Zhang, Haiyu Zhao, Shuai Yi, Ziwei Liu |
8458 | NeX: Real-Time View Synthesis With Neural Basis Expansion | Suttisak Wizadwongsa, Pakkapon Phongthawee, Jiraphon Yenphraphai, Supasorn Suwajanakorn |
2610 | Robust Neural Routing Through Space Partitions for Camera Relocalization in Dynamic Indoor Environments | Siyan Dong, Qingnan Fan, He Wang, Ji Shi, Li Yi, Thomas Funkhouser, Baoquan Chen, Leonidas J. Guibas |
3916 | Categorical Depth Distribution Network for Monocular 3D Object Detection | Cody Reading, Ali Harakeh, Julia Chae, Steven L. Waslander |
1610 | Dual Attention Suppression Attack: Generate Adversarial Camouflage in Physical World | Jiakai Wang, Aishan Liu, Zixin Yin, Shunchang Liu, Shiyu Tang, Xianglong Liu |
902 | PoseAug: A Differentiable Pose Augmentation Framework for 3D Human Pose Estimation | Kehong Gong, Jianfeng Zhang, Jiashi Feng |
5465 | Passive Inter-Photon Imaging | Atul Ingle, Trevor Seets, Mauro Buttafava, Shantanu Gupta, Alberto Tosi, Mohit Gupta, Andreas Velten |
2663 | Adaptive Consistency Prior Based Deep Network for Image Denoising | Chao Ren, Xiaohai He, Chuncheng Wang, Zhibo Zhao |
4547 | Dynamic Slimmable Network | Changlin Li, Guangrun Wang, Bing Wang, Xiaodan Liang, Zhihui Li, Xiaojun Chang |
7174 | The Neural Tangent Link Between CNN Denoisers and Non-Local Filters | Julián Tachella, Junqi Tang, Mike Davies |
6581 | Learning Continuous Image Representation With Local Implicit Image Function | Yinbo Chen, Sifei Liu, Xiaolong Wang |
327 | Image-to-Image Translation via Hierarchical Style Disentanglement | Xinyang Li, Shengchuan Zhang, Jie Hu, Liujuan Cao, Xiaopeng Hong, Xudong Mao, Feiyue Huang, Yongjian Wu, Rongrong Ji |
5483 | Dynamic Neural Radiance Fields for Monocular 4D Facial Avatar Reconstruction | Guy Gafni, Justus Thies, Michael Zollhöfer, Matthias Nießner |
3633 | Adversarial Robustness Under Long-Tailed Distribution | Tong Wu, Ziwei Liu, Qingqiu Huang, Yu Wang, Dahua Lin |
1364 | Representative Batch Normalization With Feature Calibration | Shang-Hua Gao, Qi Han, Duo Li, Ming-Ming Cheng, Pai Peng |
829 | Learning to Track Instances without Video Annotations | Yang Fu, Sifei Liu, Umar Iqbal, Shalini De Mello, Humphrey Shi, Jan Kautz |
7002 | Reducing Domain Gap by Reducing Style Bias | Hyeonseob Nam, HyunJae Lee, Jongchan Park, Wonjun Yoon, Donggeun Yoo |
5588 | Taskology: Utilizing Task Relations at Scale | Yao Lu, Sören Pirk, Jan Dlabal, Anthony Brohan, Ankita Pasad, Zhao Chen, Vincent Casser, Anelia Angelova, Ariel Gordon |
6415 | MOS: Towards Scaling Out-of-Distribution Detection for Large Semantic Space | Rui Huang, Yixuan Li |
8088 | DCT-Mask: Discrete Cosine Transform Mask Representation for Instance Segmentation | Xing Shen, Jirui Yang, Chunbo Wei, Bing Deng, Jianqiang Huang, Xian-Sheng Hua, Xiaoliang Cheng, Kewei Liang |
3872 | Fine-Grained Angular Contrastive Learning With Coarse Labels | Guy Bukchin, Eli Schwartz, Kate Saenko, Ori Shahar, Rogerio Feris, Raja Giryes, Leonid Karlinsky |
5858 | End-to-End Video Instance Segmentation With Transformers | Yuqing Wang, Zhaoliang Xu, Xinlong Wang, Chunhua Shen, Baoshan Cheng, Hao Shen, Huaxia Xia |
5007 | TAP: Text-Aware Pre-Training for Text-VQA and Text-Caption | Zhengyuan Yang, Yijuan Lu, Jianfeng Wang, Xi Yin, Dinei Florencio, Lijuan Wang, Cha Zhang, Lei Zhang, Jiebo Luo |
10509 | Real-Time High-Resolution Background Matting | Shanchuan Lin, Andrey Ryabtsev, Soumyadip Sengupta, Brian L. Curless, Steven M. Seitz, Ira Kemelmacher-Shlizerman |
7869 | Camouflaged Object Segmentation With Distraction Mining | Haiyang Mei, Ge-Peng Ji, Ziqi Wei, Xin Yang, Xiaopeng Wei, Deng-Ping Fan |
5445 | Semantic Relation Reasoning for Shot-Stable Few-Shot Object Detection | Chenchen Zhu, Fangyi Chen, Uzair Ahmed, Zhiqiang Shen, Marios Savvides |
1616 | Beyond Bounding-Box: Convex-Hull Feature Adaptation for Oriented and Densely Packed Object Detection | Zonghao Guo, Chang Liu, Xiaosong Zhang, Jianbin Jiao, Xiangyang Ji, Qixiang Ye |
10173 | TextOCR: Towards Large-Scale End-to-End Reasoning for Arbitrary-Shaped Scene Text | Amanpreet Singh, Guan Pang, Mandy Toh, Jing Huang, Wojciech Galuba, Tal Hassner |
11134 | MOST: A Multi-Oriented Scene Text Detector With Localization Refinement | Minghang He, Minghui Liao, Zhibo Yang, Humen Zhong, Jun Tang, Wenqing Cheng, Cong Yao, Yongpan Wang, Xiang Bai |
3082 | Points As Queries: Weakly Semi-Supervised Object Detection by Points | Liangyu Chen, Tong Yang, Xiangyu Zhang, Wei Zhang, Jian Sun |
2192 | Holistic 3D Scene Understanding From a Single Image With Implicit Representation | Cheng Zhang, Zhaopeng Cui, Yinda Zhang, Bing Zeng, Marc Pollefeys, Shuaicheng Liu |
5540 | Shelf-Supervised Mesh Prediction in the Wild | Yufei Ye, Shubham Tulsiani, Abhinav Gupta |
2868 | Mesh Saliency: An Independent Perceptual Measure or a Derivative of Image Saliency? | Ran Song, Wei Zhang, Yitian Zhao, Yonghuai Liu, Paul L. Rosin |
2771 | MetaSets: Meta-Learning on Point Sets for Generalizable Representations | Chao Huang, Zhangjie Cao, Yunbo Wang, Jianmin Wang, Mingsheng Long |
4761 | Few-Shot 3D Point Cloud Semantic Segmentation | Na Zhao, Tat-Seng Chua, Gim Hee Lee |
11241 | Point Cloud Instance Segmentation Using Probabilistic Embeddings | Biao Zhang, Peter Wonka |
772 | Robust Point Cloud Registration Framework Based on Deep Graph Matching | Kexue Fu, Shaolei Liu, Xiaoyuan Luo, Manning Wang |
10511 | Nutrition5k: Towards Automatic Nutritional Understanding of Generic Food | Quin Thames, Arjun Karpur, Wade Norris, Fangting Xia, Liviu Panait, Tobias Weyand, Jack Sim |
4401 | Differentiable Diffusion for Dense Depth Estimation From Multi-View Images | Numair Khan, Min H. Kim, James Tompkin |
3434 | LoFTR: Detector-Free Local Feature Matching With Transformers | Jiaming Sun, Zehong Shen, Yuang Wang, Hujun Bao, Xiaowei Zhou |
1972 | DI-Fusion: Online Implicit 3D Reconstruction With Deep Priors | Jiahui Huang, Shi-Sheng Huang, Haoxuan Song, Shi-Min Hu |
6130 | SMD-Nets: Stereo Mixture Density Networks | Fabio Tosi, Yiyi Liao, Carolin Schmitt, Andreas Geiger |
3297 | Deep Two-View Structure-From-Motion Revisited | Jianyuan Wang, Yiran Zhong, Yuchao Dai, Stan Birchfield, Kaihao Zhang, Nikolai Smolyanskiy, Hongdong Li |
2149 | Back-Tracing Representative Points for Voting-Based 3D Object Detection in Point Clouds | Bowen Cheng, Lu Sheng, Shaoshuai Shi, Ming Yang, Dong Xu |
5545 | GrooMeD-NMS: Grouped Mathematically Differentiable NMS for Monocular 3D Object Detection | Abhinav Kumar, Garrick Brazil, Xiaoming Liu |
2472 | Graph-Based High-Order Relation Modeling for Long-Term Action Recognition | Jiaming Zhou, Kun-Yu Lin, Haoxin Li, Wei-Shi Zheng |
4028 | SGCN: Sparse Graph Convolution Network for Pedestrian Trajectory Prediction | Liushuai Shi, Le Wang, Chengjiang Long, Sanping Zhou, Mo Zhou, Zhenxing Niu, Gang Hua |
1453 | Reformulating HOI Detection As Adaptive Set Prediction | Mingfei Chen, Yue Liao, Si Liu, Zhiyuan Chen, Fei Wang, Chen Qian |
3072 | MagDR: Mask-Guided Detection and Reconstruction for Defending Deepfakes | Zhikai Chen, Lingxi Xie, Shanmin Pang, Yong He, Bo Zhang |
11041 | Improving the Transferability of Adversarial Samples With Adversarial Transformations | Weibin Wu, Yuxin Su, Michael R. Lyu, Irwin King |
2440 | FCPose: Fully Convolutional Multi-Person Pose Estimation With Dynamic Instance-Aware Convolutions | Weian Mao, Zhi Tian, Xinlong Wang, Chunhua Shen |
387 | DexYCB: A Benchmark for Capturing Hand Grasping of Objects | Yu-Wei Chao, Wei Yang, Yu Xiang, Pavlo Molchanov, Ankur Handa, Jonathan Tremblay, Yashraj S. Narang, Karl Van Wyk, Umar Iqbal, Stan Birchfield, Jan Kautz, Dieter Fox |
4286 | Neural Body: Implicit Neural Representations With Structured Latent Codes for Novel View Synthesis of Dynamic Humans | Sida Peng, Yuanqing Zhang, Yinghao Xu, Qianqian Wang, Qing Shuai, Hujun Bao, Xiaowei Zhou |
3459 | PCLs: Geometry-Aware Neural Reconstruction of 3D Pose With Perspective Crop Layers | Frank Yu, Mathieu Salzmann, Pascal Fua, Helge Rhodin |
4259 | Affective Processes: Stochastic Modelling of Temporal Context for Emotion and Facial Expression Recognition | Enrique Sanchez, Mani Kumar Tellamekala, Michel Valstar, Georgios Tzimiropoulos |
685 | Structure-Aware Face Clustering on a Large-Scale Graph With 10^7 Nodes | Shuai Shen, Wanhua Li, Zheng Zhu, Guan Huang, Dalong Du, Jiwen Lu, Jie Zhou |
7274 | Cross-View Gait Recognition With Deep Universal Linear Embeddings | Shaoxiong Zhang, Yunhong Wang, Annan Li |
2457 | Partial Person Re-Identification With Part-Part Correspondence Learning | Tianyu He, Xu Shen, Jianqiang Huang, Zhibo Chen, Xian-Sheng Hua |
5463 | Mask-ToF: Learning Microlens Masks for Flying Pixel Correction in Time-of-Flight Imaging | Ilya Chugunov, Seung-Hwan Baek, Qiang Fu, Wolfgang Heidrich, Felix Heide |
4269 | Effective Snapshot Compressive-Spectral Imaging via Deep Denoising and Total Variation Priors | Haiquan Qiu, Yao Wang, Deyu Meng |
4914 | Test-Time Fast Adaptation for Dynamic Scene Deblurring via Meta-Auxiliary Learning | Zhixiang Chi, Yang Wang, Yuanhao Yu, Jin Tang |
412 | Removing Raindrops and Rain Streaks in One Go | Ruijie Quan, Xin Yu, Yuanzhi Liang, Yi Yang |
1532 | Learning Multi-Scale Photo Exposure Correction | Mahmoud Afifi, Konstantinos G. Derpanis, Björn Ommer, Michael S. Brown |
1193 | Towards Real-World Blind Face Restoration With Generative Facial Prior | Xintao Wang, Yu Li, Honglun Zhang, Ying Shan |
4882 | Image Restoration for Under-Display Camera | Yuqian Zhou, David Ren, Neil Emerton, Sehoon Lim, Timothy Large |
3062 | LAU-Net: Latitude Adaptive Upscaling Network for Omnidirectional Image Super-Resolution | Xin Deng, Hao Wang, Mai Xu, Yichen Guo, Yuhang Song, Li Yang |
1089 | Interpreting Super-Resolution Networks With Local Attribution Maps | Jinjin Gu, Chao Dong |
7145 | Deep Burst Super-Resolution | Goutam Bhat, Martin Danelljan, Luc Van Gool, Radu Timofte |
7674 | Towards Rolling Shutter Correction and Deblurring in Dynamic Scenes | Zhihang Zhong, Yinqiang Zheng, Imari Sato |
3977 | Towards Fast and Accurate Real-World Depth Super-Resolution: Benchmark Dataset and Baseline | Lingzhi He, Hongguang Zhu, Feng Li, Huihui Bai, Runmin Cong, Chunjie Zhang, Chunyu Lin, Meiqin Liu, Yao Zhao |
10750 | Learning To Restore Hazy Video: A New Real-World Dataset and a New Method | Xinyi Zhang, Hang Dong, Jinshan Pan, Chao Zhu, Ying Tai, Chengjie Wang, Jilin Li, Feiyue Huang, Fei Wang |
5272 | Gradient Forward-Propagation for Large-Scale Temporal Video Modelling | Mateusz Malinowski, Dimitrios Vytiniotis, Grzegorz Świrszcz, Viorica Pătrăucean, João Carreira |
3978 | Complementary Relation Contrastive Distillation | Jinguo Zhu, Shixiang Tang, Dapeng Chen, Shijie Yu, Yakun Liu, Mingzhe Rong, Aijun Yang, Xiaohua Wang |
5524 | Network Pruning via Performance Maximization | Shangqian Gao, Feihu Huang, Weidong Cai, Heng Huang |
8376 | Distribution-Aware Adaptive Multi-Bit Quantization | Sijie Zhao, Tao Yue, Xuemei Hu |
10749 | The Affective Growth of Computer Vision | Norman Makoto Su, David J. Crandall |
1263 | Fair Attribute Classification Through Latent Space De-Biasing | Vikram V. Ramaswamy, Sunnie S. Y. Kim, Olga Russakovsky |
7885 | Soteria: Provable Defense Against Privacy Leakage in Federated Learning From Representation Perspective | Jingwei Sun, Ang Li, Binghui Wang, Huanrui Yang, Hai Li, Yiran Chen |
1957 | Deep Compositional Metric Learning | Wenzhao Zheng, Chengkun Wang, Jiwen Lu, Jie Zhou |
5411 | Physically-Aware Generative Network for 3D Shape Modeling | Mariem Mezghanni, Malika Boulkenafed, André Lieutier, Maks Ovsjanikov |
1179 | Semantic Palette: Guiding Scene Generation With Class Proportions | Guillaume Le Moing, Tuan-Hung Vu, Himalaya Jain, Patrick Pérez, Matthieu Cord |
7612 | Linear Semantics in Generative Adversarial Networks | Jianjin Xu, Changxi Zheng |
2910 | Region-Aware Adaptive Instance Normalization for Image Harmonization | Jun Ling, Han Xue, Li Song, Rong Xie, Xiao Gu |
6187 | PD-GAN: Probabilistic Diverse GAN for Image Inpainting | Hongyu Liu, Ziyu Wan, Wei Huang, Yibing Song, Xintong Han, Jing Liao |
4675 | In the Light of Feature Distributions: Moment Matching for Neural Style Transfer | Nikolai Kalischek, Jan D. Wegner, Konrad Schindler |
11816 | High-Resolution Photorealistic Image Translation in Real-Time: A Laplacian Pyramid Translation Network | Jie Liang, Hui Zeng, Lei Zhang |
6691 | Synthesizing Long-Term 3D Human Motion and Interaction in 3D Scenes | Jiashun Wang, Huazhe Xu, Jingwei Xu, Sifei Liu, Xiaolong Wang |
5246 | A Sliced Wasserstein Loss for Neural Texture Synthesis | Eric Heitz, Kenneth Vanhoey, Thomas Chambon, Laurent Belcour |
3892 | Space-Time Neural Irradiance Fields for Free-Viewpoint Video | Wenqi Xian, Jia-Bin Huang, Johannes Kopf, Changil Kim |
8703 | Rethinking Class Relations: Absolute-Relative Supervised and Unsupervised Few-Shot Learning | Hongguang Zhang, Piotr Koniusz, Songlei Jian, Hongdong Li, Philip H. S. Torr |
7719 | Joint Negative and Positive Learning for Noisy Labels | Youngdong Kim, Juseung Yun, Hyounguk Shon, Junmo Kim |
6322 | Out-of-Distribution Detection Using Union of 1-Dimensional Subspaces | Alireza Zaeemzadeh, Niccolò Bisagno, Zeno Sambugaro, Nicola Conci, Nazanin Rahnavard, Mubarak Shah |
866 | OpenMix: Reviving Known Knowledge for Discovering Novel Visual Categories in an Open World | Zhun Zhong, Linchao Zhu, Zhiming Luo, Shaozi Li, Yi Yang, Nicu Sebe |
419 | Calibrated RGB-D Salient Object Detection | Wei Ji, Jingjing Li, Shuang Yu, Miao Zhang, Yongri Piao, Shunyu Yao, Qi Bi, Kai Ma, Yefeng Zheng, Huchuan Lu, Li Cheng |
6016 | Permuted AdaIN: Reducing the Bias Towards Global Statistics in Image Classification | Oren Nuriel, Sagie Benaim, Lior Wolf |
7354 | Binary Graph Neural Networks | Mehdi Bahri, Gaétan Bahl, Stefanos Zafeiriou |
11671 | Contrastive Neural Architecture Search With Neural Architecture Comparators | Yaofo Chen, Yong Guo, Qi Chen, Minli Li, Wei Zeng, Yaowei Wang, Mingkui Tan |
2687 | Group Whitening: Balancing Learning Efficiency and Representational Capacity | Lei Huang, Yi Zhou, Li Liu, Fan Zhu, Ling Shao |
2227 | Towards Unified Surgical Skill Assessment | Daochang Liu, Qiyue Li, Tingting Jiang, Yizhou Wang, Rulin Miao, Fei Shan, Ziyu Li |
4326 | Every Annotation Counts: Multi-Label Deep Supervision for Medical Image Segmentation | Simon Reiß, Constantin Seibold, Alexander Freytag, Erik Rodner, Rainer Stiefelhagen |
6911 | Graph Attention Tracking | Dongyan Guo, Yanyan Shao, Ying Cui, Zhenhua Wang, Liyan Zhang, Chunhua Shen |
8651 | Discriminative Appearance Modeling With Multi-Track Pooling for Real-Time Multi-Object Tracking | Chanho Kim, Li Fuxin, Mazen Alotaibi, James M. Rehg |
3648 | Scale-Aware Automatic Augmentation for Object Detection | Yukang Chen, Yanwei Li, Tao Kong, Lu Qi, Ruihang Chu, Lei Li, Jiaya Jia |
2574 | Confluent Vessel Trees With Accurate Bifurcations | Zhongwen Zhang, Dmitrii Marin, Maria Drangova, Yuri Boykov |
3996 | Sequential Graph Convolutional Network for Active Learning | Razvan Caramalau, Binod Bhattarai, Tae-Kyun Kim |
11307 | CausalVAE: Disentangled Representation Learning via Neural Structural Causal Models | Mengyue Yang, Furui Liu, Zhitang Chen, Xinwei Shen, Jianye Hao, Jun Wang |
3181 | Domain-Specific Suppression for Adaptive Object Detection | Yu Wang, Rui Zhang, Shuo Zhang, Miao Li, Yangyang Xia, Xishan Zhang, Shaoli Liu |
6132 | Uncertainty Reduction for Model Adaptation in Semantic Segmentation | Prabhu Teja S, François Fleuret |
5777 | Open Domain Generalization with Domain-Augmented Meta-Learning | Yang Shu, Zhangjie Cao, Chenyu Wang, Jianmin Wang, Mingsheng Long |
6235 | Layerwise Optimization by Gradient Decomposition for Continual Learning | Shixiang Tang, Dapeng Chen, Jinguo Zhu, Shijie Yu, Wanli Ouyang |
3886 | SLADE: A Self-Training Framework for Distance Metric Learning | Jiali Duan, Yen-Liang Lin, Son Tran, Larry S. Davis, C.-C. Jay Kuo |
11511 | DualGraph: A Graph-Based Method for Reasoning About Label Noise | HaiYang Zhang, XiMing Xing, Liang Liu |
8732 | CutPaste: Self-Supervised Learning for Anomaly Detection and Localization | Chun-Liang Li, Kihyuk Sohn, Jinsung Yoon, Tomas Pfister |
916 | Self-Supervised Visibility Learning for Novel View Synthesis | Yujiao Shi, Hongdong Li, Xin Yu |
9996 | Boosting Monocular Depth Estimation Models to High-Resolution via Content-Adaptive Multi-Resolution Merging | S. Mahdi H. Miangoleh, Sebastian Dille, Long Mai, Sylvain Paris, Yağiz Aksoy |
3597 | Seesaw Loss for Long-Tailed Instance Segmentation | Jiaqi Wang, Wenwei Zhang, Yuhang Zang, Yuhang Cao, Jiangmiao Pang, Tao Gong, Kai Chen, Ziwei Liu, Chen Change Loy, Dahua Lin |
1368 | Exploiting Edge-Oriented Reasoning for 3D Point-Based Scene Graph Analysis | Chaoyi Zhang, Jianhui Yu, Yang Song, Weidong Cai |
1585 | Rethinking BiSeNet for Real-Time Semantic Segmentation | Mingyuan Fan, Shenqi Lai, Junshi Huang, Xiaoming Wei, Zhenhua Chai, Junfeng Luo, Xiaolin Wei |
4281 | Exploit Visual Dependency Relations for Semantic Segmentation | Mingyuan Liu, Dan Schonfeld, Wei Tang |
4845 | Abstract Spatial-Temporal Reasoning via Probabilistic Abduction and Execution | Chi Zhang, Baoxiong Jia, Song-Chun Zhu, Yixin Zhu |
1386 | Anti-Aliasing Semantic Reconstruction for Few-Shot Semantic Segmentation | Binghao Liu, Yao Ding, Jianbin Jiao, Xiangyang Ji, Qixiang Ye |
315 | Domain Consensus Clustering for Universal Domain Adaptation | Guangrui Li, Guoliang Kang, Yi Zhu, Yunchao Wei, Yi Yang |
4039 | Progressive Stage-Wise Learning for Unsupervised Feature Representation Enhancement | Zefan Li, Chenxi Liu, Alan Yuille, Bingbing Ni, Wenjun Zhang, Wen Gao |
7060 | NExT-QA: Next Phase of Question-Answering to Explaining Temporal Actions | Junbin Xiao, Xindi Shang, Angela Yao, Tat-Seng Chua |
862 | Spatio-temporal Contrastive Domain Adaptation for Action Recognition | Xiaolin Song, Sicheng Zhao, Jingyu Yang, Huanjing Yue, Pengfei Xu, Runbo Hu, Hua Chai |
5125 | Shot Contrastive Self-Supervised Learning for Scene Boundary Detection | Shixing Chen, Xiaohan Nie, David Fan, Dongqing Zhang, Vimal Bhat, Raffay Hamid |
2351 | Anchor-Constrained Viterbi for Set-Supervised Action Segmentation | Jun Li, Sinisa Todorovic |
7419 | SG-Net: Spatial Granularity Network for One-Stage Video Instance Segmentation | Dongfang Liu, Yiming Cui, Wenbo Tan, Yingjie Chen |
393 | Thinking Fast and Slow: Efficient Text-to-Visual Retrieval With Transformers | Antoine Miech, Jean-Baptiste Alayrac, Ivan Laptev, Josef Sivic, Andrew Zisserman |
7043 | Open-Book Video Captioning With Retrieve-Copy-Generate Network | Ziqi Zhang, Zhongang Qi, Chunfeng Yuan, Ying Shan, Bing Li, Ying Deng, Weiming Hu |
6641 | Causal Attention for Vision-Language Tasks | Xu Yang, Hanwang Zhang, Guojun Qi, Jianfei Cai |
3112 | Locate Then Segment: A Strong Pipeline for Referring Image Segmentation | Ya Jing, Tao Kong, Wei Wang, Liang Wang, Lei Li, Tieniu Tan |
3639 | Pushing It Out of the Way: Interactive Visual Navigation | Kuo-Hao Zeng, Luca Weihs, Ali Farhadi, Roozbeh Mottaghi |
2065 | SUTD-TrafficQA: A Question Answering Benchmark and an Efficient Network for Video Reasoning Over Traffic Events | Li Xu, He Huang, Jun Liu |
11433 | User-Guided Line Art Flat Filling With Split Filling Mechanism | Lvmin Zhang, Chengze Li, Edgar Simo-Serra, Yi Ji, Tien-Tsin Wong, Chunping Liu |
4041 | CT-Net: Complementary Transfering Network for Garment Transfer With Arbitrary Geometric Changes | Fan Yang, Guosheng Lin |
4790 | AdvSim: Generating Safety-Critical Scenarios for Self-Driving Vehicles | Jingkang Wang, Ava Pun, James Tu, Sivabalan Manivasagam, Abbas Sadat, Sergio Casas, Mengye Ren, Raquel Urtasun |
8342 | AIFit: Automatic 3D Human-Interpretable Feedback Models for Fitness Training | Mihai Fieraru, Mihai Zanfir, Silviu Cristian Pirlea, Vlad Olaru, Cristian Sminchisescu |