Representation Learning for Computer Vision

Labs from the Master MVA Representation Learning for Computer Vision class, taught by Pietro Gori and Loïc Le Folgoc. This graduate course is an introduction to representation learning in computer vision and medical imaging applications. It covers topics such as Transfer Learning, Self-Supervised Learning, Vision Transformers, and Explainability in Neural Networks, among others. Each lab assignment explores one of these topics.

TP1. Intriguing Properties

This lab's goal is to reproduce some results of the paper "Intriguing properties of neural networks" (Szegedy et al., 2014). In particular, we produce adversarial examples: by applying a small, hardly perceptible perturbation to an image, we can cause the network to misclassify it. This illustrates the non-smoothness of the representations learned by deep neural networks.
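As a rough illustration of how a small perturbation can flip a prediction, here is a minimal NumPy sketch. It makes several assumptions that are not taken from the lab: a toy logistic-regression model stands in for the network, and the perturbation is a single gradient-sign step rather than the box-constrained L-BFGS optimization used in the paper. The names `gradient_sign_attack`, `w`, `b` and `eps` are hypothetical.

```python
import numpy as np

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

def gradient_sign_attack(x, y, w, b, eps):
    """One gradient-sign step on the input of a logistic-regression model.

    Assumption: this toy linear model replaces the deep network of the lab.
    """
    p = sigmoid(w @ x + b)
    # Gradient of the binary cross-entropy loss with respect to the input x.
    grad_x = (p - y) * w
    # Move each coordinate by at most eps in the direction that increases the loss.
    return x + eps * np.sign(grad_x)

rng = np.random.default_rng(0)
w = rng.normal(size=100)   # fixed "trained" weights (random here, for illustration)
b = 0.0
x = 0.02 * np.sign(w)      # an input that the model assigns to class 1
y = 1.0                    # its true label

x_adv = gradient_sign_attack(x, y, w, b, eps=0.05)
pred_clean = sigmoid(w @ x + b) > 0.5
pred_adv = sigmoid(w @ x_adv + b) > 0.5
```

In this toy setting no coordinate moves by more than `eps`, yet the predicted class changes, because the many small per-coordinate changes add up along the weight vector.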

TP2. Domain Adaptation

In this lab, we implement the method of "Unsupervised Visual Domain Adaptation Using Subspace Alignment" (Fernando et al., 2013) for Unsupervised Domain Adaptation. The goal is to learn a model from a labeled source dataset that generalizes to an unlabeled target dataset whose input distribution differs (covariate shift), while assuming the labeling function remains the same.
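The core of subspace alignment can be sketched in a few lines of NumPy: compute a PCA basis for each domain, then map the source basis onto the target basis with the matrix `M = Ps.T @ Pt`. This is a minimal sketch, not the lab's code; the function names, the mean-centering step and the choice of `d` are assumptions made for illustration.

```python
import numpy as np

def pca_basis(X, d):
    """Return the top-d principal directions of X as a (D, d) orthonormal matrix."""
    Xc = X - X.mean(axis=0)
    _, _, vt = np.linalg.svd(Xc, full_matrices=False)
    return vt[:d].T

def subspace_alignment(S, T, d):
    """Project source S and target T into d-dimensional aligned subspaces."""
    Ps = pca_basis(S, d)          # source subspace
    Pt = pca_basis(T, d)          # target subspace
    M = Ps.T @ Pt                 # alignment matrix between the two bases
    S_aligned = (S - S.mean(axis=0)) @ Ps @ M
    T_projected = (T - T.mean(axis=0)) @ Pt
    return S_aligned, T_projected
```

A classifier would then be trained on `S_aligned` with the source labels and applied to `T_projected`. When both domains are identical, `M` reduces to the identity and the two projections coincide.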

TP3. Self-Supervised Learning 1: Rotation Prediction

Self-supervised learning trains models on unlabeled data by creating an artificial (pretext) task from the data itself, enabling the model to learn meaningful features without human-provided labels. This lab implements the method proposed in "Unsupervised Representation Learning by Predicting Image Rotations" (Gidaris et al., 2018), whose pretext task consists of predicting which rotation (0°, 90°, 180°, 270°) has been applied to an image.
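The pretext dataset for rotation prediction can be built directly from unlabeled images. The following is a minimal sketch, assuming square images stored as a NumPy array of shape `(N, H, W, C)`; the function name `make_rotation_batch` is hypothetical.

```python
import numpy as np

def make_rotation_batch(images):
    """Build the rotation pretext task from unlabeled square images.

    images: array of shape (N, H, W, C) with H == W.
    Returns the four rotated copies of every image and the rotation labels
    (0, 1, 2, 3 for 0, 90, 180 and 270 degrees, counterclockwise).
    """
    rotated = [np.rot90(images, k=k, axes=(1, 2)) for k in range(4)]
    labels = np.repeat(np.arange(4), len(images))
    return np.concatenate(rotated, axis=0), labels
```

A classifier trained to predict `labels` from the rotated images never sees a human annotation: the supervision signal comes entirely from the transformation applied to the data.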

TP4. Self-Supervised Learning 2: Contrastive Learning

In this lab, we implement SimCLR, the method introduced in "A Simple Framework for Contrastive Learning of Visual Representations" (Chen et al., 2020). SimCLR is a contrastive learning method: it trains a model, in a self-supervised way, to distinguish between similar and dissimilar data points by pulling together the representations of positive pairs and pushing apart those of negative pairs.
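The "pull together, push apart" objective can be written as a normalized temperature-scaled cross-entropy loss over cosine similarities. The sketch below is a NumPy version under stated assumptions: `z1` and `z2` are the embeddings of two augmented views of the same batch (row `i` of each forms a positive pair), and the default `temperature` value is illustrative rather than the one used in the lab.

```python
import numpy as np

def nt_xent_loss(z1, z2, temperature=0.5):
    """Contrastive loss over a batch of positive pairs (z1[i], z2[i])."""
    n = len(z1)
    z = np.concatenate([z1, z2], axis=0)
    z = z / np.linalg.norm(z, axis=1, keepdims=True)   # cosine similarity
    sim = z @ z.T / temperature
    np.fill_diagonal(sim, -np.inf)                     # exclude self-similarity
    # Index of the positive partner for each of the 2n embeddings.
    pos = np.concatenate([np.arange(n, 2 * n), np.arange(0, n)])
    # Numerically stable log-sum-exp over all other embeddings.
    m = sim.max(axis=1, keepdims=True)
    logsumexp = m[:, 0] + np.log(np.exp(sim - m).sum(axis=1))
    loss = logsumexp - sim[np.arange(2 * n), pos]
    return loss.mean()
```

The loss is low when each embedding is closer to its positive partner than to every other embedding in the batch, which act as negatives.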

TP5. Vision Transformers

This lab's goal is to reimplement a Vision Transformer (ViT), a model based on self-attention mechanisms and introduced in "An Image Is Worth 16x16 Words: Transformers for Image Recognition at Scale" (Dosovitskiy et al., 2021). We then train and test the model on the CIFAR-10 dataset.
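Two building blocks of a ViT are easy to show in isolation: cutting an image into a sequence of flattened patches, and single-head scaled dot-product self-attention over that sequence. The sketch below uses NumPy with hypothetical names (`patchify`, `self_attention`, `Wq`, `Wk`, `Wv`); the patch size and projection dimensions are assumptions, and a full model would add positional embeddings, multiple heads, MLP blocks and normalization.

```python
import numpy as np

def patchify(image, patch_size):
    """Split an (H, W, C) image into a sequence of flattened patches."""
    H, W, C = image.shape
    assert H % patch_size == 0 and W % patch_size == 0
    gh, gw = H // patch_size, W // patch_size
    patches = image.reshape(gh, patch_size, gw, patch_size, C)
    patches = patches.transpose(0, 2, 1, 3, 4)          # (gh, gw, p, p, C)
    return patches.reshape(gh * gw, patch_size * patch_size * C)

def self_attention(x, Wq, Wk, Wv):
    """Single-head scaled dot-product self-attention over a token sequence x."""
    q, k, v = x @ Wq, x @ Wk, x @ Wv
    scores = q @ k.T / np.sqrt(k.shape[-1])
    scores = scores - scores.max(axis=-1, keepdims=True)  # stable softmax
    weights = np.exp(scores)
    weights = weights / weights.sum(axis=-1, keepdims=True)
    return weights @ v, weights

rng = np.random.default_rng(0)
image = rng.random((32, 32, 3))            # CIFAR-10-sized input
tokens = patchify(image, patch_size=4)     # 64 tokens of dimension 48
Wq, Wk, Wv = (rng.normal(size=(48, 8)) for _ in range(3))
out, attn = self_attention(tokens, Wq, Wk, Wv)
```

Each row of `attn` is a probability distribution over all patches, so every output token is a weighted average of the value vectors of the whole image.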

TP6. Masked Auto-Encoding

In this lab, we pretrain a ViT in a self-supervised way, using the Masked Auto-Encoding (MAE) approach introduced in "Masked Autoencoders Are Scalable Vision Learners" (He et al., 2021). To do so, we mask random patches of the input images and train an auto-encoder to reconstruct the missing pixels. After pretraining, the ViT encoder has learned efficient representations and can be fine-tuned for downstream tasks.
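The masking step and the reconstruction loss can be sketched as follows. This is a minimal NumPy illustration with hypothetical names; it assumes the image has already been split into flattened patches, and the mask ratio is left as a parameter since the value used in the lab is not stated here.

```python
import numpy as np

def random_masking(patches, mask_ratio, rng):
    """Keep a random subset of patches; the rest are hidden from the encoder.

    patches: array of shape (num_patches, patch_dim).
    Returns the visible patches, a boolean mask (True = masked) and the
    indices of the visible patches.
    """
    n = patches.shape[0]
    n_keep = int(round(n * (1 - mask_ratio)))
    keep_idx = np.sort(rng.permutation(n)[:n_keep])
    mask = np.ones(n, dtype=bool)
    mask[keep_idx] = False
    return patches[keep_idx], mask, keep_idx

def masked_mse(pred, target, mask):
    """Mean squared error computed on the masked patches only."""
    per_patch = ((pred - target) ** 2).mean(axis=1)
    return per_patch[mask].mean()
```

Only the visible patches are fed to the encoder; the decoder predicts every patch, and the loss is evaluated on the masked ones, so the model cannot succeed by copying its input.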

TP7. Variational Auto-Encoder

In this lab, we train a Variational Auto-Encoder (VAE), and compare different VAE models in terms of image generation, reconstruction and disentanglement. In particular, we focus on the $\beta$-VAE, introduced in "beta-VAE: Learning Basic Visual Concepts with a Constrained Variational Framework" (Higgins et al., 2017). We explore how the $\beta$ hyperparameter offers a tradeoff between image reconstruction and disentanglement.
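The role of $\beta$ is easiest to see in the loss itself: a reconstruction term plus $\beta$ times the KL divergence between the approximate posterior and a standard normal prior. The sketch below is a NumPy version under stated assumptions: a diagonal Gaussian posterior parameterized by `mu` and `logvar`, and a squared-error reconstruction term (a lab implementation may use a different likelihood). The function name is hypothetical.

```python
import numpy as np

def beta_vae_loss(x, x_recon, mu, logvar, beta):
    """Return (total, reconstruction, KL) averaged over the batch.

    x, x_recon: arrays of shape (batch, data_dim).
    mu, logvar: arrays of shape (batch, latent_dim) describing q(z|x).
    """
    recon = ((x - x_recon) ** 2).sum(axis=1)
    # Closed-form KL(N(mu, diag(exp(logvar))) || N(0, I)).
    kl = -0.5 * (1 + logvar - mu ** 2 - np.exp(logvar)).sum(axis=1)
    total = recon + beta * kl
    return total.mean(), recon.mean(), kl.mean()
```

With `beta = 1` this is the usual VAE objective; larger values weight the KL term more heavily, which pushes the posterior toward the factorized prior at the cost of reconstruction quality.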

TP8. Interpretability

In this last lab, we implement two explainability methods, and visualize them on two datasets from MedMNIST, a database of biomedical images.

Here, we visualize occlusion maps ("Visualizing and Understanding Convolutional Networks", Zeiler and Fergus, 2014) on a dataset of blood cell images.
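An occlusion map can be computed by sliding a patch over the image and recording how much the model's score for the class of interest drops when that region is hidden. The following is a minimal NumPy sketch; `score_fn` is a hypothetical callable standing in for the trained classifier, and the patch size, stride and fill value are assumptions.

```python
import numpy as np

def occlusion_map(image, score_fn, patch=4, stride=4, fill=0.0):
    """Heatmap of score drops caused by occluding each region of the image.

    image: array of shape (H, W) or (H, W, C).
    score_fn: callable mapping an image to a scalar class score.
    """
    H, W = image.shape[:2]
    base = score_fn(image)
    heat = np.zeros((H, W))
    counts = np.zeros((H, W))
    for top in range(0, H - patch + 1, stride):
        for left in range(0, W - patch + 1, stride):
            occluded = image.copy()
            occluded[top:top + patch, left:left + patch] = fill
            drop = base - score_fn(occluded)
            heat[top:top + patch, left:left + patch] += drop
            counts[top:top + patch, left:left + patch] += 1
    # Average the drops where patches overlap (stride < patch).
    return heat / np.maximum(counts, 1)
```

Regions with a large value are those the model relies on most: hiding them lowers the class score the most.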

README.md
TP1_IntriguingProperties.ipynb
TP2_DomainAdaptation.ipynb
TP3_SelfSupervised_Rotation.ipynb
TP4_SelfSupervised_Contrastive.ipynb
TP5_Lab_RL_ViT.ipynb
TP6_Lab_RL_MAE.ipynb
TP7_Lab_RL_VAE.ipynb
TP8_Interp_MedMNIST.ipynb
