Awesome-Diffusion-Models

HTML ★ 12k updated 1y ago

A collection of resources and papers on Diffusion Models

A curated index of research papers, blog posts, videos, and runnable notebooks on diffusion models, the AI technique behind many image, audio, and other content generation tools.

setup: easy, complexity: 1/5

Diffusion models are a class of generative AI models that produce new images, audio, and other content by learning to reverse a process that gradually adds random noise to data. This repository is a curated index of research papers, blog posts, videos, lectures, and runnable tutorials on the topic, organized into sections so that readers can find resources matching their background and area of interest.
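For readers new to the topic, the noising half of that idea can be sketched in a few lines. The snippet below is not taken from the repository (which contains no code of its own); it is a minimal illustration of one common formulation, the DDPM-style forward process with a linear noise schedule. The function names, the schedule values, and the toy data are all assumptions chosen for illustration.

```python
import math
import random


def linear_beta_schedule(num_steps, beta_start=1e-4, beta_end=0.02):
    """Per-step noise variances that grow linearly (illustrative values)."""
    if num_steps == 1:
        return [beta_start]
    step = (beta_end - beta_start) / (num_steps - 1)
    return [beta_start + i * step for i in range(num_steps)]


def forward_noise(x0, t, betas, rng):
    """Sample x_t from x_0 in one shot.

    Uses x_t = sqrt(alpha_bar) * x_0 + sqrt(1 - alpha_bar) * eps,
    where alpha_bar is the product of (1 - beta) over steps 0..t.
    """
    alpha_bar = 1.0
    for beta in betas[: t + 1]:
        alpha_bar *= 1.0 - beta
    eps = [rng.gauss(0.0, 1.0) for _ in x0]
    xt = [
        math.sqrt(alpha_bar) * x + math.sqrt(1.0 - alpha_bar) * e
        for x, e in zip(x0, eps)
    ]
    return xt, eps, alpha_bar


rng = random.Random(0)            # fixed seed so runs are repeatable
betas = linear_beta_schedule(1000)
x0 = [1.0, -1.0, 0.5, 0.0]        # toy "data" standing in for pixel values

for t in (0, 499, 999):
    xt, eps, alpha_bar = forward_noise(x0, t, betas, rng)
    print(f"t={t:4d}  fraction of signal kept = {math.sqrt(alpha_bar):.4f}")
```

As `t` grows, less of the original signal survives and the sample approaches pure Gaussian noise. In this formulation a neural network is then trained to predict `eps` from `xt` and `t`, which is what allows generation to run the process in reverse; that training step is omitted here.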

The collection spans a wide range of applications. On the vision side, papers cover image generation, classification, segmentation, image translation, medical imaging, 3D vision, and adversarial robustness. For audio, the list includes work on generation, voice conversion, speech enhancement, sound separation, and text-to-speech synthesis. There are also dedicated sections on natural language, time series forecasting, graph generation, molecular design, and reinforcement learning, showing how broadly the underlying technique has been applied across different fields.

Beyond technical papers, the repository links to introductory blog posts written to make the core ideas accessible without assuming a graduate-level math background. Companion resources include YouTube videos, recorded university lectures, and Jupyter notebooks that let someone run diffusion experiments directly in the browser. A separate companion website hosts a version of the list that may be more current than what appears on the GitHub page.

The list is maintained under an MIT license. It is not a software package and contains no runnable code of its own. Its purpose is to serve as a reference index for anyone who wants to understand where diffusion model research stands, find a starting point for learning, or locate a specific paper on a particular sub-topic.

Where it fits