X-AnyLabeling is a powerful annotation tool integrated with an AI engine for fast, automatic labeling. Designed for multi-modal data engineers, it offers industrial-grade solutions for complex tasks. It supports images and videos, GPU acceleration, custom models, one-click inference over all images in a task, and import/export in formats such as COCO, VOC, and YOLO. It handles classification, detection, segmentation, captioning, rotated-object detection, tracking, pose estimation, OCR, VQA, grounding, and more, with annotation styles that include polygons, rectangles, and rotated boxes.
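As a rough illustration of how two of those export formats differ (this is a generic sketch, not X-AnyLabeling's API): YOLO labels store a class id plus normalized center/size, while VOC stores absolute pixel corners.

```python
def yolo_to_voc(box, img_w, img_h):
    """Convert one YOLO box (class_id, x_center, y_center, w, h, all in [0, 1])
    into VOC-style absolute pixel corners (class_id, xmin, ymin, xmax, ymax)."""
    cls_id, xc, yc, w, h = box
    xmin = (xc - w / 2) * img_w
    ymin = (yc - h / 2) * img_h
    xmax = (xc + w / 2) * img_w
    ymax = (yc + h / 2) * img_h
    return cls_id, int(round(xmin)), int(round(ymin)), int(round(xmax)), int(round(ymax))

# Example: a box covering the central quarter of a 640x480 image
print(yolo_to_voc((0, 0.5, 0.5, 0.5, 0.5), 640, 480))  # (0, 160, 120, 480, 360)
```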
MiniMind-V is an open-source tiny visual-language model (VLM) project that demonstrates how to train a 26M-parameter multimodal VLM from scratch quickly and cheaply (roughly one hour on a single RTX 3090, at very low GPU-rental cost). The repo provides end-to-end code for data cleaning, pretraining, supervised fine-tuning (SFT), evaluation, and a demo, using CLIP as the visual encoder and MiniMind as the base LLM.
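A minimal PyTorch sketch of the core recipe, with a stand-in transformer in place of the real CLIP tower and MiniMind decoder (all dimensions, names, and the smoke-test shapes below are illustrative assumptions, not the repo's code): project CLIP patch features into the LLM's embedding space and prepend them to the text token embeddings.

```python
import torch
import torch.nn as nn

class TinyVLM(nn.Module):
    """Sketch of the MiniMind-V idea: frozen vision features -> linear projection
    -> concatenated with text embeddings -> language-model head."""
    def __init__(self, vision_dim=768, llm_dim=512, vocab_size=6400):
        super().__init__()
        self.proj = nn.Linear(vision_dim, llm_dim)        # vision space -> LLM space
        self.tok_emb = nn.Embedding(vocab_size, llm_dim)  # stand-in token embeddings
        layer = nn.TransformerEncoderLayer(llm_dim, nhead=8, batch_first=True)
        self.backbone = nn.TransformerEncoder(layer, num_layers=2)  # stand-in for the LLM
        self.lm_head = nn.Linear(llm_dim, vocab_size)

    def forward(self, patch_feats, input_ids):
        # patch_feats: (B, num_patches, vision_dim) from a frozen CLIP vision encoder
        vis_tokens = self.proj(patch_feats)               # (B, P, llm_dim)
        txt_tokens = self.tok_emb(input_ids)              # (B, T, llm_dim)
        x = torch.cat([vis_tokens, txt_tokens], dim=1)    # visual tokens come first
        return self.lm_head(self.backbone(x))             # per-position vocabulary logits

# Smoke test with CLIP ViT-B/16-like shapes: 196 patches of dimension 768
model = TinyVLM()
logits = model(torch.randn(2, 196, 768), torch.randint(0, 6400, (2, 16)))
print(logits.shape)  # torch.Size([2, 212, 6400])
```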
Stanford’s 10-week CS231n goes from first principles to state-of-the-art vision research, starting with image-classification basics, loss functions, and optimization, then building from fully-connected nets to modern CNNs, residual networks, and vision-transformer architectures. Lectures span training tricks, regularization, visualization, transfer learning, detection, segmentation, video, 3D, and generative models. Three hands-on PyTorch assignments guide students from k-NN and SVM classifiers through deep CNNs and network visualization, and a capstone project lets teams train large-scale models on a vision task of their choice, so students graduate with the skills to design, debug, and deploy real-world deep-learning pipelines.
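To give a flavor of the early assignments, here is a vectorized multiclass SVM (hinge) loss of the kind implemented in the first assignment; the exact signature and regularization constant are illustrative, not the official starter code.

```python
import numpy as np

def svm_loss_vectorized(W, X, y, reg=1e-4):
    """Multiclass SVM (hinge) loss.
    W: (D, C) weights, X: (N, D) examples, y: (N,) integer class labels."""
    N = X.shape[0]
    scores = X @ W                                    # (N, C) class scores
    correct = scores[np.arange(N), y][:, None]        # (N, 1) correct-class scores
    margins = np.maximum(0, scores - correct + 1.0)   # hinge margins with delta = 1
    margins[np.arange(N), y] = 0                      # don't count the correct class
    return margins.sum() / N + reg * np.sum(W * W)    # data loss + L2 regularization

# Toy check: 5 examples, 10 classes, 3072-dim flattened CIFAR-style inputs
rng = np.random.default_rng(0)
W = rng.normal(scale=1e-3, size=(3072, 10))
X = rng.normal(size=(5, 3072))
y = rng.integers(0, 10, size=5)
print(svm_loss_vectorized(W, X, y))
```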