
ridhwanrazaliwork/Computer-Vision-Video-Lectures

 
 


Computer Vision Video Lectures

A curated list of free, high-quality, university-level courses with video lectures related to the field of Computer Vision.

Checkboxes indicate the progress of our learning: [ ] [x]

Table of Contents

Signal Processing

  • Signals and Systems 6.003 (MIT), Prof. Dennis Freeman [ ] [Course]

    Signals and Systems 6.003 covers the fundamentals of signal and system analysis, focusing on representations of discrete-time and continuous-time signals (singularity functions, complex exponentials and geometrics, Fourier representations, Laplace and Z transforms, sampling) and representations of linear, time-invariant systems (difference and differential equations, block diagrams, system functions, poles and zeros, convolution, impulse and step responses, frequency responses). Applications are drawn broadly from engineering and physics, including feedback and control, communications, and signal processing.

  • Digital Signal Processing ECSE-4530 (Rensselaer Polytechnic Institute), Richard Radke [ ] [Course] [YouTube]

    This course provides a comprehensive treatment of the theory, design, and implementation of digital signal processing algorithms. In the first half of the course, we emphasize frequency-domain and Z-transform analysis. In the second half of the course, we investigate advanced topics in signal processing, including multirate signal processing, filter design, adaptive filtering, quantizer design, and power spectrum estimation. The course is fairly application-independent, to provide a strong theoretical foundation for future study in communications, control, or image processing. This course was originally offered at the graduate level but retooled in 2009 to be senior-level.

Image and Video Processing

  • Image and Video Processing: From Mars to Hollywood with a Stop at the Hospital (Duke University), Prof. Guillermo Sapiro [ ] [Course] [YouTube]

    In this course, you will learn the science behind how digital images and video are made, altered, stored, and used. We will look at the vast world of digital imaging, from how computers and digital cameras form images to how digital special effects are used in Hollywood movies to how the Mars Rover was able to send photographs across millions of miles of space.

    The course starts by looking at how the human visual system works and then teaches you about the engineering, mathematics, and computer science that makes digital images work. You will learn the basic algorithms used for adjusting images, explore JPEG and MPEG standards for encoding and compressing video images, and go on to learn about image segmentation, noise removal and filtering. Finally, we will end with image processing techniques used in medicine.

  • Introduction to Digital Image Processing ECSE-4540 (Rensselaer Polytechnic Institute), Richard Radke [ ] [Course] [YouTube]

    An introduction to the field of image processing, covering both analytical and implementation aspects. Topics include the human visual system, cameras and image formation, image sampling and quantization, spatial- and frequency-domain image enhancement, filter design, image restoration, image coding and compression, morphological image processing, color image processing, image segmentation, and image reconstruction. Real-world examples and assignments drawn from consumer digital imaging, security and surveillance, and medical image processing. This course forms a good basis for our extensive graduate image processing and computer vision courses.

  • Fundamentals of Digital Image and Video Processing (Northwestern University), Prof. Aggelos K. Katsaggelos [ ] [Course]

    This course will cover the fundamentals of image and video processing. We will provide a mathematical framework to describe and analyze images and videos as two- and three-dimensional signals in the spatial, spatio-temporal, and frequency domains. In this class not only will you learn the theory behind fundamental processing tasks including image/video enhancement, recovery, and compression - but you will also learn how to perform these key processing tasks in practice using state-of-the-art techniques and tools. We will introduce and use a wide variety of such tools – from optimization toolboxes to statistical techniques. Emphasis on the special role sparsity plays in modern image and video processing will also be given. In all cases, example images and videos pertaining to specific application domains will be utilized.

  • Image and Multidimensional Signal Processing EENG 510 (Colorado School of Mines), William Hoff [ ] [Course] [YouTube]

    This course provides the student with the theoretical background to allow them to apply state of the art image and multi-dimensional signal processing techniques. The course teaches students to solve practical problems involving the processing of multidimensional data such as imagery, video sequences, and volumetric data. The types of problems students are expected to solve are automated mensuration from multidimensional data, and the restoration, reconstruction, or compression of multidimensional data. The tools used in solving these problems include a variety of feature extraction methods, filtering techniques, segmentation techniques, and transform methods.

  • Digital Image Processing (IIT Kharagpur), Prof. P.K. Biswas [ ] [Course] [YouTube]

  • Image Processing and Analysis ECS 173 (UC Davis), Prof. Owen Carmichael [ ] [Course] [YouTube]

    Techniques for automated extraction of high-level information from images generated by cameras, three-dimensional surface sensors, and medical devices. Typical applications include detection of objects in various types of images and describing populations of biological specimens appearing in medical imagery.

  • Digital Image Processing EE225B (UC Berkeley), Prof. Avideh Zakhor [ ] [Course]

    This course covers the following topics: 2-D sequences and systems, separable systems, projection-slice theorem, reconstruction from projections and partial Fourier information, Z transform, difference equations, recursive computability, 2D DFT and FFT, 2D FIR filter design; human eye, perception, psychophysical vision properties, photometry and colorimetry, optics and image systems; image enhancement, image restoration, geometrical image modification, morphological image processing, halftoning, edge detection; image compression: scalar quantization, lossless coding, Huffman coding, arithmetic coding, dictionary techniques, waveform and transform coding (DCT, KLT, Hadamard), multiresolution coding (pyramid, subband coding), fractal coding, vector quantization, motion estimation and compensation; standards: JPEG, MPEG, H.xxx; pre- and post-processing, scalable image and video coding, image and video communication over noisy channels.

  • Digital Image Processing I EE637 (Purdue University), Prof. Charles A. Bouman [ ] [Course] [YouTube]

    Introduction to digital image processing techniques for enhancement, compression, restoration, reconstruction, and analysis. Lecture and laboratory experiments covering a wide range of topics including 2-D signals and systems, image analysis, image segmentation; achromatic vision, color image processing, color imaging systems, image sharpening, interpolation, decimation, linear and nonlinear filtering, printing and display of images; image compression, image restoration, and tomography.

  • Quantitative Big Imaging: From Images to Statistics (ETH Zurich), K. S. Mader, M. Stampanoni [ ] [Course] [YouTube] [GitHub]

    The lecture focuses on the challenging task of extracting robust, quantitative metrics from imaging data and is intended to bridge the gap between pure signal processing and the experimental science of imaging. The course will focus on techniques, scalability, and science-driven analysis.

Introductory Computer Vision

  • First Principles of Computer Vision, Shree Nayar [ ] [Website] [YouTube]

    First Principles of Computer Vision is a lecture series presented by Shree Nayar, who is on the faculty of the Computer Science Department, School of Engineering and Applied Sciences, Columbia University. Computer Vision is the enterprise of building machines that “see.” This series focuses on the physical and mathematical underpinnings of vision and has been designed for students, practitioners, and enthusiasts who have no prior knowledge of computer vision.

  • Computer Vision CAP5415 (UCF), Dr. Mubarak Shah [ ] [Course 2012] [Course 2014] [YouTube 2012] [YouTube 2014]

    The course is introductory level. It will cover the basic topics of computer vision, and introduce some fundamental approaches for computer vision research.

  • Computer Vision EENG 512 (Colorado School of Mines), William Hoff [ ] [YouTube]

    This course provides an overview of this field, starting with image formation and low level image processing. We then go into detail on the theory and techniques for extracting features from images, measuring shape and location, and recognizing objects.

  • 3D Computer Vision CS4277/CS5477 (National University of Singapore), Gim Hee Lee [ ] [YouTube]

    This is an introductory course on 3D Computer Vision which was recorded for online learning at NUS due to COVID-19. The topics covered include: Lecture 1: 2D and 1D projective geometry. Lecture 2: Rigid body motion and 3D projective geometry. Lecture 3: Circular points and Absolute conic. Lecture 4: Robust homography estimation. Lecture 5: Camera models and calibration. Lecture 6: Single view metrology. Lecture 7: The fundamental and essential matrices. Lecture 8: Absolute pose estimation from points or lines. Lecture 9: Three-view geometry from points and/or lines. Lecture 10: Structure-from-Motion (SfM) and bundle adjustment. Lecture 11: Two-view and multi-view stereo. Lecture 12: Generalized cameras. Lecture 13: Auto-Calibration.

  • Multiple View Geometry in Computer Vision (IT Sligo), Sean Mullery [ ] [YouTube]

  • Computer Vision (IIT Kharagpur), Prof. Jayanta Mukhopadhyay [ ] [Course]

    The course will have a comprehensive coverage of theory and computation related to imaging geometry, and scene understanding. It will also provide exposure to clustering, classification and deep learning techniques applied in this area.

  • Computer Vision CS-442 (EPFL), Pascal Fua [ ] [Course]

    The students will be introduced to the basic techniques of the field of Computer Vision. They will learn to apply Image Processing techniques where appropriate. We will concentrate on the black and white and color images acquired using standard video cameras. We will introduce basic processing techniques, such as edge detection, segmentation, texture characterization, and shape recognition.

  • Computer Vision CS 543 (University of Illinois), Derek Hoiem [ ] [Course] [Recordings]

    In this course, we will cover many of the basic concepts and algorithms of computer vision: single-view and multi-view geometry, lighting, linear filters, texture, interest points, tracking, RANSAC, K-means clustering, segmentation, EM algorithm, recognition, and so on. In homeworks, you will put many of these concepts into practice. As this is a survey course, we will not go into great depth on any topic, but at the end of the course, you should be prepared for any further vision-related investigation and application.

  • Computer Vision for Visual Effects ECSE-6969 (Rensselaer Polytechnic Institute), Richard Radke [ ] [Course] [YouTube]

    This course emphasizes research topics that underlie the advanced visual effects that are becoming increasingly common in commercials, music videos and movies. Topics include classical computer vision algorithms used on a regular basis in Hollywood (such as blue-screen matting, structure from motion, optical flow, and feature tracking) and exciting recent developments that form the basis for future effects (such as natural image matting, multi-image compositing, image retargeting, and view synthesis). We also discuss the technologies behind motion capture and three-dimensional data acquisition. Analysis of behind-the-scenes videos and in-depth interviews with Hollywood visual effects artists tie the mathematical concepts to real-world filmmaking.

  • Image processing and Computer Vision (CBCSL), Aleix M. Martinez [ ] [YouTube]

  • The Ancient Secrets of Computer Vision (University of Washington), Joseph Redmon [ ] [Course] [YouTube]

    This class is a general introduction to computer vision. It covers standard techniques in image processing like filtering, edge detection, stereo, flow, etc. (old-school vision), as well as newer, machine-learning based computer vision.

  • Introduction to Computer Vision, free Udacity course (Georgia Institute of Technology), Prof. Aaron Bobick, Irfan Essa, and Arpan Chakraborty [ ] [Course]

    This course provides a comprehensive introduction to computer vision, covering fundamental concepts and techniques across multiple modules. It begins with image processing, including images as functions, filtering, convolution, edge detection, and the Hough transform for detecting lines and circles. The course explores camera models and calibration, addressing perspective imaging, stereo geometry, epipolar geometry, and projective geometry concepts like homographies, essential, and fundamental matrices. It delves into visual features, including corner detection, SIFT descriptors, and robust matching with RANSAC. Additional topics include photometry, shape from shading, and motion analysis with dense flow and Lucas-Kanade methods. Tracking is covered through inference, Kalman filters, and particle filters, while recognition includes generative and discriminative models, SVMs, boosting, and video analysis with HMMs. The course concludes with color spaces, segmentation techniques, 3D perception, and an overview of the human vision system, providing a solid foundation in both theoretical and practical aspects of computer vision.

  • Computer Vision EE 046746 (Technion), Tal Daniel, Elias Nehme, Dalia Urbach, Anat Levin [ ] [Course]

    This course provides a hands-on introduction to computer vision and deep learning, guiding students through setting up a Python-based environment with Anaconda, PyTorch, and Google Colab for GPU support, and covering fundamental image processing techniques using NumPy, Matplotlib, and OpenCV for tasks like thresholding and blurring. It explores deep learning basics with multi-layer perceptrons (MLPs) and convolutional neural networks (CNNs), addressing datasets like MNIST, Fashion-MNIST, and CIFAR-10, alongside concepts like regularization, dropout, data augmentation, and CNN challenges (e.g., adversarial attacks). Advanced topics include edge and line detection (Canny, Hough transform, RANSAC), semantic segmentation (e.g., FCN, Mask R-CNN, DeepLab), generative adversarial networks (GANs) (e.g., Conditional GANs, CycleGAN), and deep object detection (e.g., R-CNN, YOLO, SSD). The course also covers camera models, epipolar geometry, stereo imaging, 3D deep learning (e.g., PointNet, Voxnet), deep object tracking (e.g., GOTURN, Deep SORT), deep uncertainty, and computational imaging, including monocular depth estimation and compressive sensing, offering a blend of theoretical foundations and practical applications.

  • Edge Computer Vision (Pragmatic AI Labs) [ ] [Course]

    This course introduces applied computer vision through a structured exploration of modern techniques and technologies over six lessons. It begins with an overview of the syllabus and a lecture on computing trends driving advancements in computer vision, such as increased computational power and data availability. Lesson 2 delves into emerging computer vision technologies, highlighting cutting-edge developments. Lesson 3 focuses on computer vision APIs, teaching how to leverage pre-built AI tools for vision tasks. Lesson 4 covers AutoML, with practical examples using Ludwig and a tutorial on Google Cloud Platform (GCP) for automated machine learning model development. Lesson 5 explores running computer vision ML models on edge devices, addressing efficient deployment in resource-constrained environments. The course concludes with final presentations in Lesson 6, where students showcase their projects, applying concepts learned throughout the course to real-world computer vision challenges.

  • Computer Vision CS 763, Spring 2018 (IIT Bombay), Arjun Jain [ ] [Course]

    This course module provides a comprehensive overview of computer vision principles and deep learning applications, focusing on both geometric and data-driven approaches. It covers camera geometry, including camera calibration, vanishing points, transformations, and homographies, to understand image formation. Image registration is explored through RANSAC for robust point-matching and an overview of SIFT for feature detection. The data-driven paradigm of deep learning is introduced, detailing feed-forward networks, back-propagation, and convolutional neural networks (CNNs), alongside generative adversarial networks (GANs) for generative tasks. Practical deep learning applications include face detection, CNN compression, and siamese/triplet networks for face recognition. The module also addresses classical algorithms like shape from shading, optical flow with the Kanade-Lucas-Tomasi algorithm, and their applications. Photometric stereo is covered for deriving object shapes from multiple lighting conditions, with applications in illumination-invariant face recognition and face relighting. Finally, stereo vision is explored through epipolar geometry, the fundamental matrix, shape from stereo, and structure from motion, bridging traditional and modern computer vision techniques.

  • Computer Vision CSCI 1430, Spring 2025 (Brown University), James Tompkin [ ] [Course]

    This course provides an introduction to computer vision, including fundamentals of image formation, camera imaging geometry, feature detection and matching, stereo, motion estimation and tracking, image classification, scene understanding, and deep learning with neural networks. We will develop basic methods for applications that include finding known models in images, depth recovery from stereo, camera calibration, image stabilization, automated alignment, tracking, boundary detection, and recognition. We will develop the intuitions and mathematics of the methods in class, and then learn about the difference between theory and practice in homeworks.

Advanced Computer Vision

  • Advanced Computer Vision CAP6412 (UCF), Dr. Mubarak Shah [ ] [Course 2019] [YouTube]

    This is an advanced computer vision course that will expose graduate students to cutting-edge research. In each class we will discuss one recent research paper related to active areas of current research, in particular those employing deep learning. Computer vision has been a very active area of research for many decades, and researchers have been working on solving important, challenging problems. During the last few years, deep learning involving artificial neural networks has been a disruptive force in computer vision. Employing deep learning, tremendous progress has been made in a very short time in solving difficult problems, and very impressive results have been obtained in image and video classification, localization, semantic segmentation, etc. New techniques, datasets, hardware, and software libraries are emerging almost every day. Deep computer vision is impacting research in robotics, natural language understanding, computer graphics, multi-modal analysis, etc.

  • Computer Vision I: Variational Methods (TU München), Prof. Daniel Cremers [ ] [Course] [YouTube]

    Variational Methods are among the most classical techniques for optimization of cost functions in higher dimension. Many challenges in Computer Vision and in other domains of research can be formulated as variational methods. Examples include denoising, deblurring, image segmentation, tracking, optical flow estimation, depth estimation from stereo images or 3D reconstruction from multiple views.

    In this class, I will introduce the basic concepts of variational methods, the Euler-Lagrange calculus and partial differential equations. I will discuss how respective computer vision and image analysis challenges can be cast as variational problems and how they can be efficiently solved. Towards the end of the class, I will discuss convex formulations and convex relaxations, which make it possible to compute optimal or near-optimal solutions in the variational setting.

  • Computer Vision II: Multiple View Geometry (TU München), Prof. Daniel Cremers [ ] [Course] [YouTube]

    The lecture introduces the basic concepts of image formation: perspective projection and camera motion. The goal is to reconstruct the three-dimensional world and the camera motion from multiple images. To this end, one determines correspondences between points in various images and the respective constraints that make it possible to compute motion and 3D structure. A particular emphasis of the lecture is on mathematical descriptions of rigid body motion and of perspective projection. For estimating camera motion and 3D geometry we will make use of both spectral methods and methods of nonlinear optimization.

  • Advanced Computer Vision (CBCSL), Aleix M. Martinez [ ] [YouTube]

  • Graduate Summer School on Computer Vision (IPAM at UCLA) [ ] [Course]

  • Photogrammetry I & II (University of Bonn), Cyrill Stachniss [ ] [Course] [YouTube]

  • Mobile Sensing And Robotics I (University of Bonn), Cyrill Stachniss [ ] [Course]

  • Mobile Sensing And Robotics II (University of Bonn), Cyrill Stachniss [ ] [Course] [YouTube]

  • Robot Mapping (University of Bonn), Cyrill Stachniss [ ] [Course] [YouTube]

    The lecture will cover different topics and techniques in the context of environment modeling with mobile robots. We will cover techniques such as SLAM with the family of Kalman filters, information filters, particle filters. We will furthermore investigate graph-based approaches, least-squares error minimization, techniques for place recognition and appearance-based mapping, and data association.

  • Biometrics (IIT Kanpur), Prof. Phalguni Gupta [ ] [Course] [YouTube]

    Introduction to biometric traits and their aims, image processing basics, basic image operations, filtering, enhancement, sharpening, edge detection, smoothing, thresholding, localization. Fourier series, DFT, inverse DFT. Biometric system, identification and verification. FAR/FRR, system design issues. Positive/negative identification. Biometric system security, authentication protocols, matching score distribution, ROC curve, DET curve, FAR/FRR curve. Expected overall error, EER, biometric myths and misrepresentations. Selection of suitable biometric. Biometric attributes, Zephyr charts, types of multi-biometrics. Verification on multimodal system, normalization strategy, fusion methods, multimodal identification. Biometric system security, biometric system vulnerabilities, circumvention, covert acquisition, quality control, template generation, interoperability, data storage. Recognition systems: face, signature, fingerprint, ear, iris, etc.

  • Deep Vision and Graphics (Yandex), YSDA fall'24 [ ] [Course]

    This 12-week course provides a comprehensive exploration of deep learning for computer vision, starting with foundational concepts like neural network basics, optimization, and backpropagation, and progressing to advanced topics such as convolutional neural networks (CNNs), transformers, and non-convolutional architectures like mixers and FFT convolutions. It covers practical applications including object detection, semantic segmentation, instance/panoptic segmentation, 2D/3D pose estimation, and representation learning tasks like face recognition and self-supervised learning. The course also dives into generative models, including generative adversarial networks (GANs), diffusion models, generative transformers, latent models (e.g., VQ-VAE, CLIP, DALL-E), and flow models. Advanced topics like shape/motion estimation, optical flow, neural radiance fields, and new view synthesis are addressed, blending theoretical insights with cutting-edge techniques in visualization, adversarial examples, and neural rendering.

  • Community Computer Vision Course (Hugging Face) [ ] [Course]

    This course provides a comprehensive journey through computer vision, starting with Unit 1: Fundamentals, which introduces the need, basics, and applications of computer vision, covering image formation, preprocessing, and feature extraction. Unit 2: Convolutional Neural Networks (CNNs) explores CNN architectures, pre-trained models, transfer learning, and fine-tuning. Unit 3: Vision Transformers examines transformer architectures like Swin, DETR, and CvT, comparing them to CNNs and covering their adaptation techniques. Unit 4: Multimodal Models focuses on text-vision fusion with models like CLIP, GroupViT, and BLIP, emphasizing image-to-text and text-to-image tasks. Unit 5: Generative Models covers GANs, VAEs, and diffusion models for tasks like text-to-image and inpainting. Unit 6: Basic Computer Vision Tasks addresses image classification, object detection (YOLO), and segmentation (SAM), including relevant metrics. Unit 7: Video Processing explores temporal continuity, motion estimation, and video challenges. Unit 8: 3D Vision delves into scene rendering and reconstruction with NeRF and GQN. Unit 9: Model Optimization covers model compression, distillation, pruning, and TinyML for deployment. Unit 10: Synthetic Data Creation examines point clouds and diffusion models for synthetic datasets. Unit 11: Zero-Shot Learning explores generalization and transfer learning for tasks like zero-shot recognition. Unit 12: Ethics and Biases addresses ethical concerns, bias evaluation, and mitigation strategies in AI. Unit 13: Emerging Trends highlights innovative architectures like Retentive Network, Hiera, Hyena, and I-JEPA, offering a forward-looking perspective on computer vision advancements.

  • Computer Vision (University of Tübingen), Prof. Dr. Andreas Geiger [ ] [Course]

    This course, spanning April to July, provides a comprehensive exploration of computer vision and deep learning, blending theoretical foundations with practical applications. It begins with a math recap for deep learning and an introduction to computer vision history and image formation (geometric and photometric, including transformations and sensing pipelines). Subsequent lectures cover structure-from-motion, including two-frame methods, factorization, and bundle adjustment, followed by stereo reconstruction with block matching, siamese networks, and end-to-end learning. The course delves into probabilistic graphical models like Markov Random Fields, factor graphs, and belief propagation, with applications in stereo and optical flow, and extends to conditional random fields and deep structured models. Further topics include shape-from-shading, photometric stereo, volumetric fusion, and advanced coordinate-based networks like neural radiance fields (NeRF) and generative radiance fields. The course concludes with recognition tasks (image classification, semantic segmentation, object detection), self-supervised learning with contrastive learning and pretext tasks, and diverse topics like input optimization, compositional models, human body models, and deepfakes, supported by practical exercises throughout.

  • Advances in Computer Vision (MIT) [ ] [Course]

    This course, spanning February to May, provides a comprehensive exploration of computer vision and its applications, organized into an introduction followed by three modules. Module 0: Introduction outlines the course, historical context, and the question "What is vision?" Module 1: Geometry, 3D, and 4D covers image formation via pinhole cameras, projective geometry, and image filtering (e.g., convolutions, Laplacian pyramids), alongside representation theory (steerable bases, invariant operators), geometric deep learning (equivariance, group convolutions), optical flow (including RAFT), point tracking, scene flow, SIFT, multi-view geometry (epipolar geometry, eight-point algorithm, bundle adjustment), and differentiable rendering (neural fields, Gaussian splatting, novel view synthesis), with a guest lecture on deep learning for 3D reconstruction. Module 2: Unsupervised Representation Learning and Generative Modeling explores representation learning (compression, denoising), generative modeling with diffusion models (including score matching, classifier-free guidance, and spectral perspectives), sequence generative models (auto-regressive models, diffusion forcing), and self-supervised learning, highlighted by a guest lecture. Module 3: Vision for Embodied Agents introduces robotic perception, vision-based robot control, and imitation learning from demonstrations. The course integrates practical exercises, problem sets, and a final project, blending classical vision techniques with advanced deep learning methods.

  • OpenCV University free courses (OpenCV.org) [ ] [Course] (https://opencv.org/university/free-courses/)

    Free bootcamps: Vision Language Model Bootcamp, PyTorch Bootcamp, TensorFlow Bootcamp, and OpenCV Bootcamp.

Deep Learning for Computer Vision

  • CS231n Convolutional Neural Networks for Visual Recognition (Stanford) [ ] [Course] [YouTube]

    This course is a deep dive into the details of deep learning architectures, with a focus on learning end-to-end models for visual recognition tasks, particularly image classification. During the 10-week course, students will learn to implement, train and debug their own neural networks and gain a detailed understanding of cutting-edge research in computer vision. The final assignment will involve training a multi-million parameter convolutional neural network and applying it to the largest image classification dataset (ImageNet). We will focus on teaching how to set up the problem of image recognition, the learning algorithms (e.g. backpropagation), and practical engineering tricks for training and fine-tuning the networks, and we will guide the students through hands-on assignments and a final course project. Much of the background and materials of this course will be drawn from the ImageNet Challenge.

  • Deep Learning for Computer Vision (University of Michigan), Justin Johnson [ ] [Course]

    This course is a deep dive into details of neural-network based deep learning methods for computer vision. During this course, students will learn to implement, train and debug their own neural networks and gain a detailed understanding of cutting-edge research in computer vision. We will cover learning algorithms, neural network architectures, and practical engineering tricks for training and fine-tuning networks for visual recognition tasks.

  • Convolutional Neural Networks, Prof. Andrew Ng [ ] [Course]

    This course will teach you how to build convolutional neural networks and apply them to image data. Thanks to deep learning, computer vision is working far better than just two years ago, and this is enabling numerous exciting applications ranging from safe autonomous driving, to accurate face recognition, to automatic reading of radiology images.

  • Convolutional Networks, Ian Goodfellow [ ] [YouTube]

  • Deep Learning for Computer Vision (NPTEL) [ ] [Course]

    This 12-week course offers a comprehensive dive into computer vision and deep learning, starting with image formation, capture, and representation, alongside linear filtering, correlation, and convolution in Week 1. Week 2 focuses on visual features like edge, blob, and corner detection, exploring SIFT, SURF, HoG, and LBP. Week 3 covers visual matching with bag-of-words, VLAD, RANSAC, Hough transform, pyramid matching, and optical flow. Week 4 reviews deep learning fundamentals, including multi-layer perceptrons and backpropagation. Weeks 5 and 6 introduce convolutional neural networks (CNNs), their evolution (AlexNet, VGG, ResNets), and visualization techniques like Deep Dream, Neural Style Transfer, and Grad-CAM. Week 7 applies CNNs to recognition, verification (Siamese Networks, Triplet Loss), object detection (R-CNN, YOLO, SSD), and segmentation (FCN, U-Net, Mask R-CNN). Week 8 explores recurrent neural networks (RNNs) and CNN+RNN models for video understanding and action recognition. Week 9 introduces attention models, including spatial transformers, image captioning, and visual QA. Weeks 10 and 11 cover deep generative models like GANs, VAEs, CycleGANs, and PixelRNNs, with applications in image editing, superresolution, and 3D object generation. Week 12 concludes with recent trends like zero-shot, self-supervised learning, and reinforcement learning in vision, providing a robust foundation in both classical and cutting-edge computer vision techniques.

  • computer-vision-and-deep-learning-course, University of Oviedo [ ] [Course]

    The Second Quarter University Extension Course on computer vision offered by the University of Oviedo (online) focuses on equipping students with practical skills in digital image processing and deep learning, as outlined in a repository of Jupyter Notebooks and resources. It introduces the current landscape of computer vision within the artificial intelligence era, emphasizing tools and strategies for problem-solving with diverse data sources. The course covers OpenCV and Python for foundational image handling, basic image treatment, and advanced image processing techniques. It also includes machine learning with Scikit-learn and deep learning frameworks like TensorFlow, Keras, and PyTorch. Aimed at a broad audience, including those with computer science experience (potentially at a graduate level) unfamiliar with OpenCV and students from other fields interested in exploring computer vision, the course provides a hands-on approach to mastering essential methods and tools.

Human Vision and Perception

  • Sensory Systems 9.04 (MIT), Prof. Peter H. Schiller, Prof. M. Christian Brown [ ] [Course] [YouTube]

    This course examines the neural bases of sensory perception. The focus is on physiological and anatomical studies of the mammalian nervous system as well as behavioral studies of animals and humans. Topics include visual pattern, color and depth perception, auditory responses and sound localization, and somatosensory perception.

  • Visual Perception and the Brain (Duke University), Dale Purves [ ] [Course]

    Learners will be introduced to the problems that vision faces, using perception as a guide. The course will consider how what we see is generated by the visual system, what the central problem for vision is, and what visual perception indicates about how the brain works. The evidence will be drawn from neuroscience, psychology, the history of vision science and what philosophy has contributed. Although the discussions will be informed by visual system anatomy and physiology, the focus is on perception. We see the physical world in a strange way, and the goal is to understand why.

  • High-level Vision (CBCSL) [ ] [YouTube]

Machine Learning

  • Machine Learning CS229 (Stanford), Prof. Andrew Ng [ ] [Course] [YouTube]

    This course provides a broad introduction to machine learning and statistical pattern recognition. Topics include: supervised learning (generative/discriminative learning, parametric/non-parametric learning, neural networks, support vector machines); unsupervised learning (clustering, dimensionality reduction, kernel methods); learning theory (bias/variance tradeoffs; VC theory; large margins); reinforcement learning and adaptive control. The course will also discuss recent applications of machine learning, such as to robotic control, data mining, autonomous navigation, bioinformatics, speech recognition, and text and web data processing.

  • Machine Learning CS156 (Caltech), Prof. Yaser Abu-Mostafa [ ] [Course] [YouTube]

    This is an introductory course by Caltech Professor Yaser Abu-Mostafa on machine learning that covers the basic theory, algorithms, and applications. Machine learning (ML) enables computational systems to adaptively improve their performance with experience accumulated from the observed data. ML techniques are widely applied in engineering, science, finance, and commerce to build systems for which we do not have full mathematical specification (and that covers a lot of systems). The course balances theory and practice, and covers the mathematical as well as the heuristic aspects.

  • Machine Learning for Computer Vision (Heidelberg University), Prof. Fred Hamprecht [ ] [Course] [YouTube]

    This course covers advanced machine learning methods allowing for so-called "structured prediction". The goal is to make multiple predictions that interact in a nontrivial way; and we take these interactions into account both during training and at test time.

  • Machine Learning for Robotics and Computer Vision (TU München), Dr. Rudolph Triebel [ ] [Course] [YouTube]

    In this lecture, students will be introduced to the most frequently used machine learning methods in computer vision and robotics applications. The major aim of the lecture is to obtain a broad overview of existing methods, and to understand their motivations and main ideas in the context of computer vision and pattern recognition.

  • Machine Learning for Intelligent Systems CS4780 (Cornell), Prof. Kilian Weinberger [ ] [Course] [YouTube]

    The goal of this course is to give an introduction to the field of machine learning. The course will teach you basic skills to decide which learning algorithm to use for what problem, code up your own learning algorithm and evaluate and debug it.

  • Introduction to Machine Learning and Pattern Recognition (CBCSL), Aleix M. Martinez [ ] [YouTube]

  • Applied Machine Learning COMS W4995 (Columbia), Andreas C. Müller [ ] [Course] [YouTube]

    This class offers a hands-on approach to machine learning and data science. The class discusses the application of machine learning methods like SVMs, Random Forests, Gradient Boosting and neural networks on real-world datasets, including data preparation, model selection and evaluation. This class complements COMS W4721 in that it relies entirely on available open-source implementations in scikit-learn and TensorFlow. Apart from applying models, we will also discuss software development tools and practices relevant to productionizing machine learning models.

  • Probabilistic and Statistical Machine Learning (University of Tübingen), Prof. Philipp Hennig, Prof. U. von Luxburg [ ] [Course] [YouTube]

    The focus of the lecture is on both algorithmic and theoretic aspects of machine learning. We will cover many of the standard algorithms and learn about the general principles and theoretic results for building good machine learning algorithms. Topics range from well-established results to very recent results.

  • Introduction to Machine Learning for Coders (fast.ai), Jeremy Howard [ ] [Course] [YouTube]

    Taught by Jeremy Howard (Kaggle's #1 competitor 2 years running, and founder of Enlitic). Learn the most important machine learning models, including how to create them yourself from scratch, as well as key skills in data preparation, model validation, and building data products. There are around 24 hours of lessons, and you should plan to spend around 8 hours a week for 12 weeks to complete the material. The course is based on lessons recorded at the University of San Francisco for the Masters of Science in Data Science program. We assume that you have at least one year of coding experience, and either remember what you learned in high school math, or are prepared to do some independent study to refresh your knowledge.

  • Introduction to Machine Learning ECE 5984 (Virginia Tech), Prof. Dhruv Batra [ ] [Course] [YouTube]

Deep Learning

  • Deep Learning CS230 (Stanford), Prof. Andrew Ng, Kian Katanforoosh [ ] [Course] [YouTube]

    Deep Learning is one of the most highly sought after skills in AI. In this course, you will learn the foundations of Deep Learning, understand how to build neural networks, and learn how to lead successful machine learning projects. You will learn about Convolutional networks, RNNs, LSTM, Adam, Dropout, BatchNorm, Xavier/He initialization, and more.

  • Deep Learning Specialization, Prof. Andrew Ng, Kian Katanforoosh [ ] [Course]

    In five courses, you will learn the foundations of Deep Learning, understand how to build neural networks, and learn how to lead successful machine learning projects. You will learn about Convolutional networks, RNNs, LSTM, Adam, Dropout, BatchNorm, Xavier/He initialization, and more. You will work on case studies from healthcare, autonomous driving, sign language reading, music generation, and natural language processing. You will master not only the theory, but also see how it is applied in industry. You will practice all these ideas in Python and in TensorFlow, which we will teach.

  • Deep Learning EE-559 (EPFL), François Fleuret [ ] [Course]

    This course is a thorough introduction to deep-learning, with examples in the PyTorch framework: machine learning objectives and main challenges, tensor operations, automatic differentiation, gradient descent, deep-learning specific techniques (batchnorm, dropout, residual networks), image understanding, generative models, adversarial generative models, recurrent models, attention models, NLP.

  • Introduction to Deep Learning 6.S191 (MIT), Alexander Amini and Ava Soleimany [ ] [Course] [YouTube]

    MIT's introductory course on deep learning methods with applications to computer vision, natural language processing, biology, and more! Students will gain foundational knowledge of deep learning algorithms and get practical experience in building neural networks in TensorFlow. The course concludes with a project proposal competition with feedback from staff and a panel of industry sponsors. Prerequisites assume calculus (i.e. taking derivatives) and linear algebra (i.e. matrix multiplication); we'll try to explain everything else along the way! Experience in Python is helpful but not necessary.

  • Practical Deep Learning for Coders (fast.ai), Jeremy Howard [ ] [Course] [YouTube]

    Deep Learning for Coders with fastai and PyTorch: AI Applications Without a PhD.

  • Deep Learning for Perception ECE 6504 (Virginia Tech), Prof. Dhruv Batra [ ] [Course] [YouTube]

    This course will expose students to cutting-edge research — starting from a refresher in basics of neural networks, to recent developments.

  • Deep Learning and Artificial Intelligence Lectures (MIT) [ ] [Course] [YouTube]

  • Introduction to Deep Learning 11-785 (Carnegie Mellon University) [ ] [Course] [YouTube]

    In this course we will learn about the basics of deep neural networks, and their applications to various AI tasks. By the end of the course, it is expected that students will have significant familiarity with the subject, and be able to apply Deep Learning to a variety of tasks. They will also be positioned to understand much of the current literature on the topic and extend their knowledge through further study.

  • Deep Learning 6.7960 (MIT EECS) [ ] [Course]

    Fundamentals of deep learning, including both theory and applications. Topics include neural net architectures (MLPs, CNNs, RNNs, graph nets, transformers), geometry and invariances in deep learning, backpropagation and automatic differentiation, learning theory and generalization in high-dimensions, and applications to computer vision, natural language processing, and robotics.

Computer Graphics

  • Computer Graphics CMU 15-462/662 (Carnegie Mellon University) [ ] [Website] [YouTube]

    Lecture videos for the introductory Computer Graphics class at Carnegie Mellon University.

  • Computer Graphics (Utrecht University), Wolfgang Hürst [ ] [YouTube]

    Recordings from an introductory lecture about computer graphics given by Wolfgang Hürst, Utrecht University, The Netherlands, from April 2012 till June 2012.

  • Computer Graphics ECS175 (UC Davis), Prof. Kenneth Joy [ ] [YouTube]

    Computer Graphics (ECS175) teaches the basic principles of 3-dimensional computer graphics. The focus will be the elementary mathematics techniques for positioning objects in three dimensional space, the geometric optics necessary to determine how light bounces off surfaces, and the ways to utilize a computer system and methods to implement the algorithms and techniques necessary to produce basic 3-dimensional illustrations. Detailed topics will include the following: transformational geometry, positioning of virtual cameras and light sources, hierarchical modeling of complex objects, rendering of complex models, shading algorithms, and methods for rendering and shading curved objects.

  • Computer Graphics CS184 (UC Berkeley), Ravi Ramamoorthi [ ] [Course]

    This course is an introduction to the foundations of 3-dimensional computer graphics. Topics covered include 2D and 3D transformations, interactive 3D graphics programming with OpenGL, shading and lighting models, geometric modeling using Bézier and B-Spline curves, computer graphics rendering including ray tracing and global illumination, signal processing for anti-aliasing and texture mapping, and animation and inverse kinematics. There will be an emphasis on both the mathematical and geometric aspects of graphics, as well as the ability to write complete 3D graphics programs.

  • Rendering / Ray Tracing Course (TU Wien), Károly Zsolnai-Fehér [ ] [Course] [YouTube]

    This course aims to give an overview of basic and state-of-the-art methods of rendering. Offline methods such as ray and path tracing, photon mapping and many other algorithms are introduced and various refinements are explained. The basics of the involved physics, such as geometric optics, surface and media interaction with light and camera models are outlined. The apparatus of Monte Carlo methods, which is heavily used in several algorithms, is introduced, and its refinements in the form of stratified sampling and the Metropolis-Hastings method are explained.
