ICCV 2021 acceptance results are out! The latest 40 papers grouped by research direction (with bundled download)

Jishi Platform (極市平臺) 2021-07-23

Report | Jishi Platform

Jishi Editor's Note

The ICCV 2021 results are out! Did your paper get in?

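The acceptance results for ICCV 2021, one of the three top computer vision conferences, were announced recently. ICCV received 6,236 valid submissions this year, of which 1,617 were accepted, for an acceptance rate of 25.9%.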
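As a quick sanity check on the quoted figure, the acceptance rate can be recomputed from the two counts given above. This is a minimal sketch; the function name `acceptance_rate` and the variable names are illustrative choices, not something taken from the ICCV announcement.

```python
# Counts quoted in the article for ICCV 2021.
valid_submissions = 6236
accepted_papers = 1617


def acceptance_rate(accepted: int, submitted: int) -> float:
    """Return the acceptance rate as a percentage (hypothetical helper)."""
    if submitted <= 0:
        raise ValueError("submitted must be positive")
    return 100.0 * accepted / submitted


rate = acceptance_rate(accepted_papers, valid_submissions)
print(f"{rate:.1f}%")  # prints "25.9%"
```

The result agrees with the 25.9% reported above once rounded to one decimal place.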

Accepted paper IDs:

https://docs.google.com/spreadsheets/u/1/d/e/2PACX-1vRfaTmsNweuaA0Gjyu58H_Cx56pGwFhcTYII0u1pg0U7MbhlgY0R6Y-BbK3xFhAiwGZ26u3TAtN5MnS/pubhtml

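For now, only the officially published list of accepted paper IDs is available, so the accepted papers themselves are not yet known. Since the results came out, however, some authors have shared their accepted work on social media, and a few have already released open-source code.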

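Jishi Platform has sorted the papers accepted to ICCV 2021 into categories, including detection, segmentation, estimation, tracking, visual localization, low-level image processing, image and video retrieval, and 3D vision. All of our ICCV 2021 paper compilations are collected in our GitHub project, which has received 1,200 stars so far.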

This GitHub project will be updated continuously. Project URL:

https://github.com/extreme-assistant/ICCV2021-Paper-Code-Interpretation/edit/master/ICCV2021.md

The latest 40 papers are listed below. Reply "ICCV2021" to the Jishi Platform WeChat official account to get a download of the latest ICCV 2021 paper collection.

Neural Network Structure Design

Transformer

[3] Rethinking Spatial Dimensions of Vision Transformers
paper:https:///abs/2103.16302
code:https://github.com/naver-ai/pit

[2] Generic Attention-model Explainability for Interpreting Bi-Modal and Encoder-Decoder Transformers(Oral)
paper:https:///pdf/2103.15679.pdf
code:https://github.com/hila-chefer/Transformer-MM-Explainability

[1] Pyramid Vision Transformer: A Versatile Backbone for Dense Prediction without Convolutions(Oral)
paper:https:///abs/2102.12122
code:https://github.com/whai362/PVT
Interpretation: Major PVT upgrade: three improvements, substantial performance gains

Detection

2D Object Detection

[5] Active Learning for Deep Object Detection via Probabilistic Modeling
paper:https:///abs/2103.16130

[4] Detecting Invisible People
paper:https:///abs/2012.08419
project:https://www.cs./~tkhurana/invisible.htm
video:https:///StEfnshXrCE

[3] Conditional Variational Capsule Network for Open Set Recognition
paper:https:///abs/2104.09159
code:https://github.com/guglielmocamporese/cvaecaposr

[2] MDETR : Modulated Detection for End-to-End Multi-Modal Understanding(Oral)
paper:https:///pdf/2104.12763
code:https://github.com/ashkamath/mdetr
project:https://ashkamath./mdetr_page/
colab:https://colab.research.google.com/github/ashkamath/mdetr/blob/colab/notebooks/MDETR_demo.ipynb

[1] DetCo: Unsupervised Contrastive Learning for Object Detection
paper:https:///abs/2102.04803
code:https://github.com/xieenze/DetCo

Segmentation

Image Segmentation

[2] Labels4Free: Unsupervised Segmentation using StyleGAN
paper:https:///abs/2103.14968
code:https://rameenabdal./Labels4Free
project:https://rameenabdal./Labels4Free/

[1] Mining Latent Classes for Few-shot Segmentation(Oral)
paper:https:///abs/2103.15402
code:https://github.com/LiheYoung/MiningFSS

Instance Segmentation

[2] Crossover Learning for Fast Online Video Instance Segmentation
code:https://github.com/hustvl/CrossVIS

[1] Instances as Queries
paper:https:///abs/2105.01928
code:https://github.com/hustvl/QueryInst

Semantic Segmentation

[1] Calibrated Adversarial Refinement for Stochastic Semantic Segmentation
paper:https:///abs/2006.13144
code:https://github.com/EliasKassapis/CARSSS

GAN/Generative/Adversarial

[2] Labels4Free: Unsupervised Segmentation using StyleGAN
paper:https:///abs/2103.14968
code:https://rameenabdal./Labels4Free
project:https://rameenabdal./Labels4Free/

[1] EigenGAN: Layer-Wise Eigen-Learning for GANs
paper:https:///abs/2104.12476
code:https://github.com/LynnHo/EigenGAN-Tensorflow

Image Processing

[1] Equivariant Imaging: Learning Beyond the Range Space(Oral)
paper:https:///pdf/2103.14756.pdf

Super Resolution

[1] Learning for Scale-Arbitrary Super-Resolution from Scale-Specific Networks
paper:https:///abs/2004.03791
code:https://github.com/LongguangWang/ArbSR

Style Transfer

[1] Multiple Heads are Better than One: Few-shot Font Generation with Multiple Localized Experts (font generation)
paper:https:///abs/2104.00887
code:https://github.com/clovaai/mxfont

Estimation

Human Pose Estimation

[1] HuMoR: 3D Human Motion Model for Robust Pose Estimation(Oral)
paper:https://geometry./projects/humor/docs/humor.pdf
video:https:///5VWirxUHG0Y
project:https://geometry./projects/humor/

Image & Video Retrieval/Video Understanding

Person Re-Identification/Detection

[1] TransReID: Transformer-based Object Re-Identification
paper:https:///abs/2102.04378
code:https://github.com/heshuting555/TransReID
Interpretation: A decisive blow from Transformers: leading across all ReID tasks, Alibaba and Zhejiang University propose TransReID

Visual Localization

[2] TS-CAM: Token Semantic Coupled Attention Map for Weakly Supervised Object Localization
paper:https:///abs/2103.14862
code:https://github.com/vasgaowei/TS-CAM

[1] Boundary-sensitive Pre-training for Temporal Localization in Videos
paper:https:///abs/2011.10830

Image Matching

[1] COTR: Correspondence Transformer for Matching Across Images
paper:https:///abs/2103.14167

3D Vision

[1] MVTN: Multi-View Transformation Network for 3D Shape Recognition
paper:https:///abs/2011.13244

Object Tracking

[1] Detecting Invisible People
paper:https:///abs/2012.08419
project:https://www.cs./~tkhurana/invisible.htm
video:https:///StEfnshXrCE

Remote Sensing Image

[1] Seasonal Contrast: Unsupervised Pre-Training from Uncurated Remote Sensing Data
paper:https:///abs/2103.16607
code:https://github.com/ElementAI/seasonal-contrast

Scene Graph

Scene Graph Generation

[1] Unconstrained Scene Generation with Locally Conditioned Radiance Fields
paper:https:///abs/2104.00670

Scene Graph Prediction

[1] Generative Compositional Augmentations for Scene Graph Prediction
paper:https:///abs/2007.05756
code:https://github.com/bknyaz/sgg

Data Processing

Data Augmentation

[1] MixMo: Mixing Multiple Inputs for Multiple Outputs via Deep Subnetworks
paper:https:///abs/2103.06132

Anomaly Detection

[1] Weakly-supervised Video Anomaly Detection with Robust Temporal Feature Magnitude Learning
paper:https:///abs/2101.10030
code:https://github.com/tianyu0207/RTFM

Representation Learning

[1] In-Place Scene Labelling and Understanding with Implicit Scene Representation(Oral)
paper:https:///abs/2103.15875
project:https:///Semantic-NeRF/

Transfer Learning

[2] Seasonal Contrast: Unsupervised Pre-Training from Uncurated Remote Sensing Data
paper:https:///abs/2103.16607
code:https://github.com/ElementAI/seasonal-contrast

[1] Calibrated prediction in and out-of-domain for state-of-the-art saliency modeling
paper:https:///abs/2105.12441

Metric Learning

[1] Learning with Memory-based Virtual Classes for Deep Metric Learning
paper:https:///abs/2103.16940

Incremental Learning

[1] Always Be Dreaming: A New Approach for Data-Free Class-Incremental Learning
paper:https:///abs/2106.09701
code:https://github.com/GT-RIPL/AlwaysBeDreaming-DFCIL
project:https://jamessealesmith./project/dfcil/

Contrastive Learning

[1] CoMatch: Semi-supervised Learning with Contrastive Graph Regularization
paper:https:///abs/2011.11183
code:https://github.com/salesforce/CoMatch

Active Learning

[1] Active Learning for Deep Object Detection via Probabilistic Modeling
paper:https:///abs/2103.16130

Visual Reasoning/VQA

[2] On the hidden treasure of dialog in video question answering
paper:https:///abs/2103.14517

[1] Just Ask: Learning to Answer Questions from Millions of Narrated Videos(Oral)
paper:https:///abs/2012.00451
code:https://github.com/antoyang/just-ask
project:https://antoyang./just-ask.html

Dataset

[1] 4DComplete: Non-Rigid Motion Estimation Beyond the Observable Surface (4D reconstruction)
paper:https:///abs/2105.01905
dataset:https://github.com/rabbityl/DeformingThings4D
video:https:///QrSsVoTRpWk

Other Categories

Pathdreamer: A World Model for Indoor Navigation (visual navigation)
paper:https:///abs/2105.08756

iPOKE: Poking a Still Image for Controlled Stochastic Video Synthesis
paper:https:///abs/2107.02790
code:https://github.com/CompVis/ipoke
project:https://compvis./ipoke/

Putting NeRF on a Diet: Semantically Consistent Few-Shot View Synthesis
paper:https:///abs/2104.00677
project:https://www./dietnerf

KiloNeRF: Speeding up Neural Radiance Fields with Thousands of Tiny MLPs
paper:https:///abs/2103.13744
code:https://github.com/creiser/kilonerf

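Jishi has always followed the major computer vision conferences closely. Every year we compile resources for them, including paper interpretations, code, live technical talks, roundups by research direction, and best paper summaries, and this work has earned the support of many developers. This year we will likewise track ICCV 2021 as it unfolds.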