- Grounded-Segment-Anything ★ PINNED
Grounded SAM: Marrying Grounding DINO with Segment Anything & Stable Diffusion & Recognize Anything - Automatically Detect, Segment and Generate Anything
Jupyter Notebook ★ 18k 1y ago
- detrex ★ PINNED
detrex is a research platform for DETR-based object detection, segmentation, pose estimation and other visual recognition tasks.
Python ★ 2.3k 9mo ago
- GroundingDINO ★ PINNED
[ECCV 2024] Official implementation of the paper "Grounding DINO: Marrying DINO with Grounded Pre-Training for Open-Set Object Detection"
Python ★ 10k 1y ago
- OpenSeeD ★ PINNED
[ICCV 2023] Official implementation of the paper "A Simple Framework for Open-Vocabulary Segmentation and Detection"
Python ★ 759 2y ago
- MaskDINO ★ PINNED
[CVPR 2023] Official implementation of the paper "Mask DINO: Towards A Unified Transformer-based Framework for Object Detection and Segmentation"
Python ★ 1.5k 2y ago
- DINO ★ PINNED
[ICLR 2023] Official implementation of the paper "DINO: DETR with Improved DeNoising Anchor Boxes for End-to-End Object Detection"
Python ★ 2.8k 1y ago
- Grounded-SAM-2
Grounded SAM 2: Ground and Track Anything in Videos with Grounding DINO, Florence-2 and SAM 2
Jupyter Notebook ★ 3.6k 7mo ago
- DWPose
"Effective Whole-body Pose Estimation with Two-stages Distillation" (ICCV 2023, CV4Metaverse Workshop)
Python ★ 2.8k 2y ago
- T-Rex
[ECCV2024] API code for T-Rex2: Towards Generic Object Detection via Text-Visual Prompt Synergy
Python ★ 2.7k 8mo ago
- Rex-Omni
[CVPR2026] Detect Anything via Next Point Prediction
Jupyter Notebook ★ 1.5k 3mo ago
- awesome-detection-transformer
A collection of papers on transformers for detection and segmentation. Awesome Detection Transformer for Computer Vision (CV)
★ 1.4k 1y ago
- DINO-X-API
DINO-X: The World's Top-Performing Vision Model for Open-World Object Detection and Understanding
Python ★ 1.4k 11mo ago
- Grounding-DINO-1.5-API
Grounding DINO 1.5: IDEA Research's Most Capable Open-World Object Detection Model Series
Python ★ 1.1k 1y ago
- Motion-X
[NeurIPS 2023] Official implementation of the paper "Motion-X: A Large-scale 3D Expressive Whole-body Human Motion Dataset"
Python ★ 874 1y ago
- X-Pose
[ECCV 2024] Official implementation of the paper "X-Pose: Detecting Any Keypoints"
Python ★ 810 1y ago
- OSX
[CVPR 2023] Official implementation of the paper "One-Stage 3D Whole-Body Mesh Recovery with Component Aware Transformer"
Python ★ 793 1y ago
- DN-DETR
[CVPR 2022 Oral] Official implementation of DN-DETR
Python ★ 605 2y ago
- DAB-DETR
[ICLR 2022] Official implementation of the paper "DAB-DETR: Dynamic Anchor Boxes are Better Queries for DETR"
Jupyter Notebook ★ 579 3y ago
- MotionLLM
[Arxiv-2024] MotionLLM: Understanding Human Behaviors from Human Motions and Videos
Python ★ 386 1y ago
- HumanTOMATO
[ICML 2024] 🍅HumanTOMATO: Text-aligned Whole-body Motion Generation
Python ★ 364 2y ago
- HumanSD
[ICCV 2023] The official implementation of paper "HumanSD: A Native Skeleton-Guided Diffusion Model for Human Image Generation"
Python ★ 307 2y ago
- HumanArt
[CVPR 2023] The official implementation of CVPR 2023 paper "Human-Art: A Versatile Human-Centric Dataset Bridging Natural and Artificial Scenes"
★ 282 2y ago
- TAPTR
[ECCV 2024 & NeurIPS 2024 & ICLR 2026] Official implementation of the paper TAPTR & TAPTRv2 & TAPTRv3
★ 280 4mo ago
- deepdataspace
The Go-To Choice for CV Data Visualization, Annotation, and Model Analysis.
TypeScript ★ 263 2mo ago
- Stable-DINO
[ICCV 2023] Official implementation of the paper "Detection Transformer with Stable Matching"
Python ★ 242 2y ago
- ChatRex
Code for ChatRex: Taming Multimodal LLM for Joint Perception and Understanding
Python ★ 214 8mo ago
- Lite-DETR
[CVPR 2023] Official implementation of the paper "Lite DETR: An Interleaved Multi-Scale Encoder for Efficient DETR"
Python ★ 209 3y ago
- DreamWaltz
[NeurIPS 2023] Official implementation of the paper "DreamWaltz: Make a Scene with Complex 3D Animatable Avatars".
Python ★ 190 1y ago
- ED-Pose
[ICLR 2023] Official implementation of the paper "Explicit Box Detection Unifies End-to-End Multi-Person Pose Estimation"
Python ★ 188 2y ago
- 3D-deformable-attention
[ICCV 2023] Official implementation of the paper "DFA3D: 3D Deformable Attention For 2D-to-3D Feature Lifting"
Python ★ 186 1y ago
- RexSeek
[ICCV2025] Referring to any person or object given a natural language description. Code base for RexSeek and HumanRef Benchmark
Python ★ 184 8mo ago
- Rex-Thinker
[ICLR-2026] Rex-Thinker: Grounded Object Referring via Chain-of-Thought Reasoning
Python ★ 151 11mo ago
- MP-Former
[CVPR 2023] Official implementation of the paper: MP-Former: Mask-Piloted Transformer for Image Segmentation
Python ★ 142 2y ago
- SceneMaker
[CVPR 2026] Implementation of paper "SceneMaker: Open-set 3D Scene Generation with Decoupled De-occlusion and Pose Estimation Model"
Python ★ 137 1mo ago
- DINO-X-MCP
Official DINO-X Model Context Protocol (MCP) server that empowers LLMs with real-world visual perception through image object detection, localization, and captioning APIs.
TypeScript ★ 112 3d ago
- Click-Pose
[ICCV 2023] Official implementation of the paper "Neural Interactive Keypoint Detection"
Python ★ 88 2y ago
- DiffHOI
Official implementation of the paper "Boosting Human-Object Interaction Detection with Text-to-Image Diffusion Model"
Python ★ 68 2y ago
- SegDINO3D
[AAAI 2026] Official implementation of the paper "SegDINO3D: 3D Instance Segmentation Empowered by Both Image-Level and Object-Level 2D Features"
Python ★ 59 5mo ago
- DQ-DETR
[AAAI 2023] DQ-DETR: Dual Query Detection Transformer for Phrase Extraction and Grounding
★ 59 3y ago
- DisCo-CLIP
Official PyTorch implementation of the paper "DisCo-CLIP: A Distributed Contrastive Loss for Memory Efficient CLIP Training".
Python ★ 59 2y ago
- V-Reflection
Related code, checkpoints and project page for V-Reflection
Python ★ 58 2mo ago
- LipsFormer
No description.
Python ★ 44 3y ago
- TOSS
[ICLR 2024] Official implementation of the paper "Toss: High-quality text-guided novel view synthesis from a single image"
Python ★ 24 2y ago
- hana
Implementation and checkpoints of Imagen, Google's text-to-image synthesis neural network, in Pytorch
Python ★ 18 3y ago
- MotionCLR
[Arxiv 2024] MotionCLR: Motion Generation and Training-free Editing via Understanding Attention Mechanisms
Python ★ 17 1y ago
- SegVGGT
Official implementation of the paper "SegVGGT: Joint 3D Reconstruction and Instance Segmentation from Multi-View Images"
Python ★ 14 1mo ago
- IYFC
No description.
C++ ★ 10 2y ago
- detrex-storage
No description.
★ 4 1y ago
- HandOSweb
No description.
HTML ★ 2 1y ago