CS231n: Deep Learning for Computer Vision

Stanford’s 10-week CS231n moves from first principles to state-of-the-art vision research, starting with image-classification basics, loss functions and optimization, then building from fully connected nets up to modern CNNs, residual networks and vision-transformer architectures. Lectures span training tricks, regularization, visualization, transfer learning, detection, segmentation, video, 3D and generative models. Three hands-on PyTorch assignments guide students from k-NN/SVM classifiers through deep CNNs and network visualization, and a capstone project lets teams train large-scale models on a vision task of their choice, so that students graduate with the skills to design, debug and deploy real-world deep-learning pipelines.
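
To give a flavor of where the assignments start, here is a minimal sketch (not course material) of the kind of nearest-neighbor image classifier the first assignment builds up from; the array names and tiny synthetic data are placeholders for illustration only.

```python
# Illustrative sketch only: a 1-nearest-neighbor classifier over flattened images.
import numpy as np

def nearest_neighbor_predict(X_train, y_train, X_test):
    """Label each test row with the label of the closest training row (L2 distance)."""
    preds = np.empty(len(X_test), dtype=y_train.dtype)
    for i, x in enumerate(X_test):
        dists = np.sqrt(np.sum((X_train - x) ** 2, axis=1))  # distance to every training image
        preds[i] = y_train[np.argmin(dists)]                  # copy the nearest neighbor's label
    return preds

# Tiny synthetic example: four 3-pixel "images", two classes.
X_train = np.array([[0.0, 0.0, 0.0], [1.0, 1.0, 1.0], [0.1, 0.0, 0.2], [0.9, 1.0, 0.8]])
y_train = np.array([0, 1, 0, 1])
X_test  = np.array([[0.05, 0.1, 0.0], [0.95, 0.9, 1.0]])
print(nearest_neighbor_predict(X_train, y_train, X_test))  # expected: [0 1]
```

The course then replaces this memorization-based approach with learned, parametric models such as linear classifiers and, ultimately, deep networks.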

Introduction

Computer Vision has become ubiquitous in our society, with applications in search, image understanding, apps, mapping, medicine, drones, and self-driving cars. Core to many of these applications are visual recognition tasks such as image classification, localization and detection. Recent developments in neural network (aka “deep learning”) approaches have greatly advanced the performance of these state-of-the-art visual recognition systems. This course is a deep dive into the details of deep learning architectures with a focus on learning end-to-end models for these tasks, particularly image classification. During the 10-week course, students will learn to implement and train their own neural networks and gain a detailed understanding of cutting-edge research in computer vision. Additionally, the final course project will give them the opportunity to train and apply multi-million parameter networks on real-world vision problems of their choice. Through multiple hands-on assignments and the final course project, students will acquire the toolset for setting up deep learning tasks and practical engineering tricks for training and fine-tuning deep neural networks.
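
As a rough illustration of what “learning end-to-end models” means in practice, the sketch below shows a single PyTorch training step for a tiny CNN image classifier. It is an assumption-laden example, not part of the course assignments: the architecture, hyperparameters, and random data are arbitrary placeholders standing in for a real dataset and model.

```python
# Illustrative sketch only: one end-to-end training step for image classification.
import torch
import torch.nn as nn

model = nn.Sequential(                        # tiny CNN: conv -> ReLU -> pool -> linear classifier
    nn.Conv2d(3, 16, kernel_size=3, padding=1),
    nn.ReLU(),
    nn.MaxPool2d(2),
    nn.Flatten(),
    nn.Linear(16 * 16 * 16, 10),              # 10 output classes for 32x32 RGB inputs
)
criterion = nn.CrossEntropyLoss()             # softmax classification loss
optimizer = torch.optim.SGD(model.parameters(), lr=1e-2, momentum=0.9)

# Fake batch of 8 RGB images (32x32) with random labels, just to show the loop.
images = torch.randn(8, 3, 32, 32)
labels = torch.randint(0, 10, (8,))

optimizer.zero_grad()
loss = criterion(model(images), labels)       # forward pass and loss
loss.backward()                               # backpropagation through the whole model
optimizer.step()                              # gradient-based parameter update
print(f"loss: {loss.item():.3f}")
```

In the course, this loop is scaled up to real datasets, deeper architectures, and the regularization, tuning and debugging practices covered in lecture.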
