nanochat is a full-stack, minimal codebase for training, fine-tuning, evaluating, and deploying a ChatGPT-like large language model (LLM) from scratch on a single 8xH100 GPU node for under $100.
MiniMind is an open-source GitHub project that enables users to train a 26M-parameter tiny LLM from scratch in as little as 2 hours at a cost of about 3 RMB. It provides native PyTorch implementations of tokenizer training, pretraining, supervised fine-tuning (SFT), LoRA, DPO, and PPO/GRPO reinforcement learning, along with an MoE architecture and vision multimodal extensions. It includes high-quality open datasets, supports single-GPU training, and is compatible with Transformers, llama.cpp, and other frameworks, making it well suited to LLM beginners.
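For readers new to LoRA, the idea behind MiniMind's LoRA stage can be conveyed in a few lines of PyTorch. The sketch below is not MiniMind's code; it is a generic low-rank adapter wrapped around a frozen nn.Linear, with the rank r and scaling alpha chosen arbitrarily for illustration.

```python
import torch
import torch.nn as nn

class LoRALinear(nn.Module):
    """Frozen linear layer plus a trainable low-rank update: y = W x + (alpha/r) * B A x."""
    def __init__(self, base: nn.Linear, r: int = 8, alpha: int = 16):
        super().__init__()
        self.base = base
        for p in self.base.parameters():
            p.requires_grad = False  # the pretrained weights stay frozen
        # Only A and B receive gradients during fine-tuning.
        self.A = nn.Parameter(torch.randn(r, base.in_features) * 0.01)  # down-projection
        self.B = nn.Parameter(torch.zeros(base.out_features, r))        # up-projection, zero-initialized
        self.scale = alpha / r

    def forward(self, x):
        return self.base(x) + self.scale * (x @ self.A.T @ self.B.T)
```

Because only the small A and B matrices are trained, adapters like this fit comfortably on a single consumer GPU, which is the point of MiniMind's LoRA stage.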
The OpenAI Cookbook is an open-source GitHub repository from OpenAI that provides example code, guides, and recipes for using the OpenAI API. It contains practical examples covering prompt engineering, text generation, embeddings, retrieval-augmented generation (RAG), image generation, fine-tuning, integrations, and more. Most examples are in Python and designed to help developers learn and integrate the API quickly.
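Most Cookbook recipes boil down to a short Python script against the OpenAI API. As a rough sketch (assuming the openai Python package v1+ is installed, OPENAI_API_KEY is set in the environment, and the model name is just a placeholder), a basic chat completion looks like this:

```python
from openai import OpenAI

client = OpenAI()  # reads OPENAI_API_KEY from the environment
response = client.chat.completions.create(
    model="gpt-4o-mini",  # placeholder; any chat-capable model works
    messages=[
        {"role": "system", "content": "You are a concise assistant."},
        {"role": "user", "content": "Summarize retrieval-augmented generation in one sentence."},
    ],
)
print(response.choices[0].message.content)
```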
A free 21-lesson course by Microsoft Cloud Advocates teaching beginners the fundamentals of building Generative AI applications. It covers concepts like Generative AI and LLMs, prompt engineering, chat apps, image generation, security, and more, with Python and TypeScript code samples supporting Azure OpenAI, OpenAI API, and open-source models. Features multi-language support, video intros, extra resources, and a Discord community.
An open-source GitHub repository by Sebastian Raschka that contains the official code for the book "Build A Large Language Model (From Scratch)". It provides step-by-step PyTorch implementations to build, pretrain, and fine-tune a GPT-like LLM for educational purposes, along with exercises, bonus material, and companion video content.
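To give a sense of what "pretrain" means here, the following is a deliberately tiny sketch of the next-token-prediction objective, not the book's code: a single embedding plus linear layer trained with cross-entropy on a toy character sequence (the book builds a full GPT with attention instead of this context-free stand-in).

```python
import torch
import torch.nn as nn

text = "build a large language model from scratch"
vocab = sorted(set(text))
stoi = {ch: i for i, ch in enumerate(vocab)}
data = torch.tensor([stoi[ch] for ch in text])

# Toy "model": embedding + linear head; a real GPT would add attention blocks in between.
model = nn.Sequential(nn.Embedding(len(vocab), 32), nn.Linear(32, len(vocab)))
optimizer = torch.optim.AdamW(model.parameters(), lr=1e-2)

for step in range(200):
    x, y = data[:-1], data[1:]                      # inputs and shifted next-token targets
    logits = model(x)                               # (seq_len, vocab_size)
    loss = nn.functional.cross_entropy(logits, y)   # the pretraining objective
    optimizer.zero_grad()
    loss.backward()
    optimizer.step()
```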
Claude Cookbooks is a GitHub repository from Anthropic featuring a collection of notebooks and recipes that demonstrate fun and effective ways to use Claude models. It offers copy-paste Python code snippets for developers to integrate into their projects, covering topics such as classification, retrieval-augmented generation, tool use, third-party integrations, multimodal capabilities, and advanced techniques.
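In spirit, most of these recipes are short calls against the Anthropic Messages API. A minimal sketch (assuming the anthropic Python package is installed, ANTHROPIC_API_KEY is set in the environment, and the model id is a placeholder) of a classification-style example might look like:

```python
import anthropic

client = anthropic.Anthropic()  # reads ANTHROPIC_API_KEY from the environment
message = client.messages.create(
    model="claude-3-haiku-20240307",  # placeholder id; substitute a current Claude model
    max_tokens=256,
    messages=[
        {"role": "user", "content": "Classify this review as positive or negative: 'Great battery life.'"},
    ],
)
print(message.content[0].text)
```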
Awesome LLM Apps is a curated open-source repository of LLM applications built with RAG, AI agents, multi-agent teams, MCP, voice agents, and more, using models from OpenAI, Anthropic, Gemini, and xAI as well as open-source alternatives such as Qwen or Llama that can run locally.
AI Engineering Hub is a comprehensive GitHub repository offering in-depth tutorials and 93+ production-ready projects on LLMs, RAGs, AI agents, and real-world AI applications for all skill levels.
An interactive prompt engineering tutorial released by Anthropic. The GitHub repository provides a step-by-step course (9 chapters + appendix) with lessons and hands-on exercises for building and troubleshooting prompts for Claude. It uses Claude 3 Haiku for examples, includes example playgrounds and an answer key, and is targeted at people who want to learn practical prompt design and common failure modes.
Official code repository for the O'Reilly book "Hands-On Large Language Models" by Jay Alammar and Maarten Grootendorst. It provides runnable notebooks, visual explanations, and practical examples across chapters covering tokens and embeddings, transformer internals, text classification, semantic search, fine-tuning, multimodal models, and more. Recommended to run in Google Colab for easy setup.
Foundations of LLMs is an open-source book by the ZJU-LLMs team that teaches the fundamentals and advanced topics of large language models. It covers language model basics, the evolution of LLM architectures, prompt engineering, parameter-efficient fine-tuning, model editing, and retrieval-augmented generation. The repo provides chapter PDFs and paper lists and is updated monthly.
This tutorial offers a detailed, line-by-line PyTorch implementation of the Transformer model introduced in "Attention Is All You Need." It walks through the model's architecture, including the encoder-decoder structure, multi-head self-attention, and position-wise feed-forward layers, using annotated code and accompanying explanations. The resource serves as both an educational tool and a practical guide for implementing and understanding Transformer-based models.
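To make the core mechanism concrete, here is a self-attention-only sketch of scaled dot-product and multi-head attention in PyTorch, in the spirit of the tutorial but not its actual code; the full encoder-decoder additionally has cross-attention, residual connections, layer normalization, and positional encodings.

```python
import math
import torch
import torch.nn as nn

def scaled_dot_product_attention(q, k, v, mask=None):
    # Attention(Q, K, V) = softmax(Q K^T / sqrt(d_k)) V, as in "Attention Is All You Need".
    d_k = q.size(-1)
    scores = q @ k.transpose(-2, -1) / math.sqrt(d_k)
    if mask is not None:
        scores = scores.masked_fill(mask == 0, float("-inf"))
    return torch.softmax(scores, dim=-1) @ v

class MultiHeadSelfAttention(nn.Module):
    def __init__(self, d_model=512, num_heads=8):
        super().__init__()
        assert d_model % num_heads == 0
        self.num_heads, self.d_head = num_heads, d_model // num_heads
        self.wq = nn.Linear(d_model, d_model)
        self.wk = nn.Linear(d_model, d_model)
        self.wv = nn.Linear(d_model, d_model)
        self.wo = nn.Linear(d_model, d_model)

    def forward(self, x, mask=None):
        b, t, d = x.shape
        # Project, then split the model dimension into heads: (batch, heads, seq, d_head).
        split = lambda z: z.view(b, t, self.num_heads, self.d_head).transpose(1, 2)
        out = scaled_dot_product_attention(
            split(self.wq(x)), split(self.wk(x)), split(self.wv(x)), mask
        )
        # Merge the heads back and apply the output projection.
        return self.wo(out.transpose(1, 2).contiguous().view(b, t, d))
```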