-
VAR
[NeurIPS 2024 Best Paper Award] [GPT beats diffusion 🔥] [scaling laws in visual generation 📈] Official implementation of "Visual Autoregressive Modeling: Scalable Image Generation via Next-Scale Prediction". An *ultra-simple, user-friendly yet state-of-the-art* codebase for autoregressive image generation!
Jupyter Notebook ★ 8.7k 7mo ago
-
ByteTrack
[ECCV 2022] ByteTrack: Multi-Object Tracking by Associating Every Detection Box
Python ★ 6.5k 2y ago
-
LlamaGen
Autoregressive Model Beats Diffusion: 🦙 Llama for Scalable Image Generation
Python ★ 2.0k 1y ago
-
Infinity
[CVPR 2025 Oral] Infinity ∞: Scaling Bitwise AutoRegressive Modeling for High-Resolution Image Synthesis
Python ★ 1.6k 2mo ago
-
GLEE
[CVPR 2024 Highlight] GLEE: General Object Foundation Model for Images and Videos at Scale
Python ★ 1.2k 1y ago
-
Waver
Industry-level video foundation model for unified Text-to-Video (T2V) and Image-to-Video (I2V) generation.
★ 942 9mo ago
-
InfinityStar
[NeurIPS 2025 Oral] Infinity⭐️: Unified Spacetime AutoRegressive Modeling for Visual Generation
Python ★ 767 2mo ago
-
Liquid
(Accepted by IJCV) Liquid: Language Models are Scalable and Unified Multi-modal Generators
Python ★ 643 19d ago
-
VNext ▣
Next-generation video instance recognition framework built on Detectron2, supporting InstMove (CVPR 2023), SeqFormer (ECCV Oral), and IDOL (ECCV Oral)
Python ★ 617 2y ago
-
Groma
[ECCV 2024] Grounded Multimodal Large Language Model with Localized Visual Tokenization
Python ★ 587 2y ago
-
UniTok
[NeurIPS 2025 Spotlight] A Unified Tokenizer for Visual Generation and Understanding
Python ★ 527 7mo ago
-
FlashVideo
[AAAI 2026] FlashVideo: Flowing Fidelity to Detail for Efficient High-Resolution Video Generation
Python ★ 459 1y ago
-
Alive
[Tech Report] Alive: A Unified Audio-Video Generation Model
★ 457 2mo ago
-
OmniTokenizer
[NeurIPS 2024] OmniTokenizer: one model and one weight for image-video joint tokenization.
Python ★ 324 1y ago
-
UniRef
[ICCV 2023] Segment Every Reference Object in Spatial and Temporal Spaces
Python ★ 238 1y ago
-
GenerateU
[CVPR 2024] Generative Region-Language Pretraining for Open-Ended Object Detection
Python ★ 196 1y ago
-
vaex
🔥 Stable, simple, state-of-the-art VQVAE toolkit & cookbook
Python ★ 107 2y ago
-
BitVAE
Official training and inference code for the bitwise tokenizer
Python ★ 72 1y ago
-
.github
No description.
★ 0 7mo ago
-
flashvideo-page
No description.
HTML ★ 0 1y ago
-
infinity.project
No description.
HTML ★ 0 1y ago