LogoAIAny

Tag

Explore by tags

  • All
  • 30u30
  • ASR
  • ChatGPT
  • GNN
  • IDE
  • RAG
  • ai-agent
  • ai-api
  • ai-api-management
  • ai-client
  • ai-coding
  • ai-demos
  • ai-development
  • ai-framework
  • ai-image
  • ai-image-demos
  • ai-inference
  • ai-leaderboard
  • ai-library
  • ai-rank
  • ai-serving
  • ai-tools
  • ai-train
  • ai-video
  • ai-workflow
  • AIGC
  • alibaba
  • amazon
  • anthropic
  • audio
  • blog
  • book
  • bytedance
  • chatbot
  • chemistry
  • claude
  • course
  • deepmind
  • deepseek
  • engineering
  • foundation
  • foundation-model
  • gemini
  • google
  • gradient-boosting
  • grok
  • huggingface
  • LLM
  • math
  • mcp
  • mcp-client
  • mcp-server
  • meta-ai
  • microsoft
  • mlops
  • NLP
  • nvidia
  • openai
  • paper
  • physics
  • plugin
  • RL
  • science
  • sora
  • translation
  • tutorial
  • vibe-coding
  • video
  • vision
  • xAI
  • xai

Nano Banana

2025
Google DeepMind, Google AI Studio

Nano Banana (aka Gemini 2.5 Flash Image) is a state-of-the-art image generation and editing model from Google. It enables users to blend multiple images, maintain character consistency, perform targeted transformations via natural language, use world knowledge in image editing, and includes invisible SynthID watermarking to mark AI-generated or edited images.

ai-tools, ai-image, vision

Seedream

2025
ByteDance Seed

As a new-generation image creation model, Seedream 4.0 integrates image generation and image editing capabilities into a single, unified architecture. It supports multimodal inputs, reference images, and can produce high-definition images up to 4K with fast inference speed.

ai-tools, ai-image, vision

Veo

2024
Google DeepMind

Veo is a state-of-the-art video generation model developed by Google DeepMind, designed to empower filmmakers and storytellers.

ai-tools, ai-video, vision

FLUX.1

2024
Black Forest Labs

Amazing AI models from the Black Forest.

ai-tools, ai-image, vision

Midjourney

2022
Midjourney, Inc.

An independent research lab exploring new mediums of thought and expanding the imaginative powers of the human species.

ai-tools, ai-image, vision

Runway

2023
Runway AI, Inc.

With Runway Gen-4, you are now able to precisely generate consistent characters, locations and objects across scenes. Simply set your look and feel and the model will maintain coherent world environments while preserving the distinctive style, mood and cinematographic elements of each frame. Then, regenerate those elements from multiple perspectives and positions within your scenes.

ai-tools, ai-video, vision

KlingAI

2024
Kuaishou Technology

Kling AI offers tools for creating imaginative images and videos, based on state-of-the-art generative AI methods.

ai-tools, ai-image, ai-video, vision

Sora

2024
OpenAI

OpenAI's video generation model. Sora is able to generate complex scenes with multiple characters, specific types of motion, and accurate details of the subject and background. The model understands not only what the user has asked for in the prompt, but also how those things exist in the physical world.

ai-tools, ai-video, vision

ImageNet Classification with Deep Convolutional Neural Networks

2012
Alex Krizhevsky, Ilya Sutskever +1

The 2012 paper “ImageNet Classification with Deep Convolutional Neural Networks” by Krizhevsky, Sutskever, and Hinton introduced AlexNet, a deep CNN that dramatically improved image classification accuracy on ImageNet, halving the top-5 error rate from ~26% to ~15%. Its innovations, such as ReLU activations, dropout, GPU training, and data augmentation, sparked the deep learning revolution, laying the foundation for modern computer vision and advancing AI across industries.

vision, 30u30, paper, foundation

Generative Adversarial Networks

2014
Ian J. Goodfellow, Jean Pouget-Abadie +6

The 2014 paper “Generative Adversarial Nets” (GAN) by Ian Goodfellow et al. introduced a groundbreaking framework where two neural networks — a generator and a discriminator — compete in a minimax game: the generator tries to produce realistic data, while the discriminator tries to distinguish real from fake. This approach avoids Markov chains and approximate inference, relying solely on backpropagation. GANs revolutionized generative modeling, enabling realistic image, text, and audio generation, sparking massive advances in AI creativity, deepfake technology, and research on adversarial training and robustness.

vision, AIGC, paper, foundation
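The minimax game described above is captured by the paper's value function, which the discriminator D maximizes while the generator G minimizes:

```latex
\min_G \max_D V(D, G) =
  \mathbb{E}_{x \sim p_{\mathrm{data}}(x)}\big[\log D(x)\big]
  + \mathbb{E}_{z \sim p_z(z)}\big[\log\big(1 - D(G(z))\big)\big]
```

At the optimum, the generator's distribution matches the data distribution and the discriminator can do no better than guessing, outputting 1/2 everywhere.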

CS231n: Deep Learning for Computer Vision

2015
Fei-Fei Li

Stanford’s 10-week CS231n dives from first principles to state-of-the-art vision research, starting with image-classification basics, loss functions and optimization, then building from fully-connected nets to modern CNNs, residual and vision-transformer architectures. Lectures span training tricks, regularization, visualization, transfer learning, detection, segmentation, video, 3-D and generative models. Three hands-on PyTorch assignments guide students from k-NN/SVM through deep CNNs and network visualization, and a capstone project lets teams train large-scale models on a vision task of their choice, graduating with the skills to design, debug and deploy real-world deep-learning pipelines.

foundation, vision, 30u30, course, tutorial

Multi-Scale Context Aggregation by Dilated Convolutions

2015
Fisher Yu, Vladlen Koltun

This paper introduces a novel module for semantic segmentation using dilated convolutions, which enables exponential expansion of the receptive field without losing resolution. By aggregating multi-scale contextual information efficiently, the proposed context module significantly improves dense prediction accuracy when integrated into existing architectures. The work has had a lasting impact on dense prediction and semantic segmentation, laying the foundation for many modern segmentation models.

30u30, paper, vision
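The exponential receptive-field growth the paper describes can be sketched in a few lines; the helper name below is illustrative, not from the paper, and assumes stride-1 convolutions with no pooling:

```python
def receptive_field(dilations, kernel_size=3):
    """Receptive field of a stack of stride-1 dilated convolutions.

    Each layer with kernel size k and dilation d extends the
    receptive field of the stack so far by (k - 1) * d.
    """
    rf = 1
    for d in dilations:
        rf += (kernel_size - 1) * d
    return rf


# Doubling the dilation each layer (1, 2, 4, ...) grows the
# receptive field exponentially in depth, with no loss of resolution:
print(receptive_field([1]))        # 3
print(receptive_field([1, 2]))     # 7
print(receptive_field([1, 2, 4]))  # 15
```

With n such layers the receptive field is 2^(n+1) - 1 pixels, whereas stacking ordinary kernel-3 convolutions grows it only linearly (2n + 1).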