gitmyhub

anomaly-detection-resources

Python ★ 9.3k updated 3mo ago

Anomaly detection related books, papers, videos, and toolboxes. Last update late 2025 for LLM and VLM works!

A curated reading list of academic papers, textbooks, lectures, benchmark datasets, and open-source Python libraries covering anomaly and outlier detection across tabular, time series, and graph data.

Python · setup: easy · complexity 1/5

This repository is a curated reading list for anyone who wants to learn about anomaly detection, the practice of finding data points that behave unusually compared to everything else. The field is also called outlier detection, and the two names are used interchangeably. Typical applications include catching fraudulent credit card charges, spotting network intrusions, and flagging defective parts coming off a production line.
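To make the idea of "unusual compared to everything else" concrete, here is a minimal sketch of one classic technique, the modified z-score based on the median absolute deviation (MAD). It is not taken from the repository, which contains no runnable code. The function name `mad_outliers`, the sample charges, and the cutoff of 3.5 (a common rule of thumb) are assumptions chosen for illustration.

```python
from statistics import median


def mad_outliers(values, threshold=3.5):
    """Return the indices of values whose modified z-score exceeds threshold.

    Hypothetical helper for illustration; threshold=3.5 is a conventional
    rule of thumb, not a value prescribed by the repository.
    """
    med = median(values)
    abs_dev = [abs(v - med) for v in values]
    mad = median(abs_dev)
    if mad == 0:
        # Simplification: with no spread there is nothing to scale by,
        # so this sketch flags nothing.
        return []
    # 0.6745 rescales the MAD so the score is comparable to an ordinary
    # z-score when the data are roughly normal.
    return [i for i, d in enumerate(abs_dev) if 0.6745 * d / mad > threshold]


# Made-up card charges with one obviously unusual amount.
charges = [12.5, 9.9, 14.2, 11.0, 13.7, 10.4, 950.0, 12.1]
flagged = mad_outliers(charges)
print(flagged)  # index of the 950.0 charge
```

The median and MAD are used instead of the mean and standard deviation because a single extreme value barely moves them, so the outlier cannot hide itself by inflating the scale it is measured against. Dedicated libraries such as those linked from the list cover far more capable methods.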

The repository does not contain runnable code. It is a structured collection of references: textbooks, academic papers, recorded lectures and seminars, datasets you can use for testing, and links to open-source software libraries. There are also pointers to the major research conferences and journals where new work in this area gets published. The author, a researcher at USC, maintains it alongside a separate Python library called PyOD and a benchmarking suite called ADBench, both of which appear in the resource links.

The paper section is extensive and organized by topic. Separate sections cover methods that use large language models for anomaly detection, techniques that work with limited labeled data, approaches based on neural networks, methods designed for high-dimensional data, time series detection, graph-based detection, and surveys that give broad overviews of the field. Each entry lists the paper title, the venue where it was published, the year, and links to the PDF and code where available.

The toolbox section groups software by data type. There are libraries for tabular data outlier detection, tools specifically for time series, tools for graph-structured data, and a real-time option built around Elasticsearch. The datasets section lists public collections commonly used to evaluate detection methods.

This is a reference resource, not a starter project. If you are trying to understand what anomaly detection is, where it is used, what the main techniques are, or which software exists to do it, this list is a useful map of the territory. The repository was last updated in late 2025 with new entries covering large language model and vision-language model approaches.

Where it fits