ARC Lab, Tencent PCG ORG

@TencentARC · arc.tencent.com

86 repos
2.8k followers
0 following

Python 89%
Jupyter Notebook 8%
JavaScript 1%
HTML 1%

Members

yxgeee
wbhu
thuzhaowang
wondervictor
xinntao
bluestyle97
yeliudev
ARCer
xt4d

All public repos (86)


GFPGAN

GFPGAN aims at developing Practical Algorithms for Real-world Face Restoration.

Python tool that restores blurry, damaged, or low-resolution face photos to sharp, detailed images using AI trained on realistic faces.

Python ★ 37k 1y ago
PhotoMaker

PhotoMaker [CVPR 2024]

An AI tool from Tencent that generates new photos of a real person in any scene or style from just a few reference photos, keeping their face consistent across images without any model training step.

Jupyter Notebook ★ 10k 1y ago
InstantMesh

InstantMesh: Efficient 3D Mesh Generation from a Single Image with Sparse-view Large Reconstruction Models

Python ★ 4.4k 1y ago
T2I-Adapter

T2I-Adapter

Python ★ 3.8k 2y ago
Pixal3D

[SIGGRAPH 2026] Pixal3D: Pixel-Aligned 3D Generation from Images

Research project from Tencent ARC and Tsinghua that turns a single image into a textured 3D model by back-projecting each pixel into 3D space. SIGGRAPH 2026 paper, ships training code and a Gradio demo.

Python ★ 1.8k 28d ago
BrushNet

[ECCV 2024] The official implementation of paper "BrushNet: A Plug-and-Play Image Inpainting Model with Decomposed Dual-Branch Diffusion"

Python ★ 1.7k 1y ago
MotionCtrl

Official Code for MotionCtrl [SIGGRAPH 2024]

Python ★ 1.5k 1y ago
SEED-Voken

SEED-Voken: A Series of Powerful Visual Tokenizers

Python ★ 1.0k 6mo ago
SEED-Story

SEED-Story: Multimodal Long Story Generation with Large Language Model

Python ★ 885 1y ago
MasaCtrl

[ICCV 2023] Consistent Image Synthesis and Editing

Python ★ 845 1y ago
VideoPainter

[SIGGRAPH2025] Official repo for paper "Any-length Video Inpainting and Editing with Plug-and-Play Context Control"

Python ★ 611 1y ago
BrushEdit

[under review] The official implementation of paper "BrushEdit: All-In-One Image Inpainting and Editing"

Python ★ 588 9mo ago
ToonComposer

[ICLR 2026] Streamlining Cartoon Production with Generative Post-Keyframing

Python ★ 575 10mo ago
LLaMA-Pro

[ACL 2024] Progressive LLaMA with Block Expansion.

Python ★ 513 2y ago
StereoCrafter

A framework to convert any 2D videos to immersive stereoscopic 3D

Python ★ 492 25d ago
ColorFlow

The official implementation of the paper "ColorFlow: Retrieval-Augmented Image Sequence Colorization".

Python ★ 462 6mo ago
GeometryCrafter

[ICCV 2025] GeometryCrafter: Consistent Geometry Estimation for Open-world Videos with Diffusion Priors

Python ★ 449 8mo ago
Mix-of-Show

NeurIPS 2023, Mix-of-Show: Decentralized Low-Rank Adaptation for Multi-Concept Customization of Diffusion Models

Python ★ 430 2y ago
RollingForcing

[ICLR 2026] Official Repo for Rolling Forcing: Autoregressive Long Video Diffusion in Real Time

Python ★ 424 7mo ago
VerseCrafter

VerseCrafter: Dynamic Realistic Video World Model with 4D Geometric Control

Python ★ 400 3mo ago
SmartEdit

Official code of SmartEdit [CVPR-2024 Highlight]

Python ★ 375 2y ago
AnimeSR

Codes for "AnimeSR: Learning Real-World Super-Resolution Models for Animation Videos"

Python ★ 368 2y ago
VQFR

ECCV 2022, Oral, VQFR: Blind Face Restoration with Vector-Quantized Dictionary and Parallel Decoder

Python ★ 355 3y ago
AnimeGamer

[ICCV 2025] AnimeGamer: Infinite Anime Life Simulation with Next Game State Prediction

Python ★ 346 1y ago
DiTCtrl

[CVPR 2025] Official code of "DiTCtrl: Exploring Attention Control in Multi-Modal Diffusion Transformer for Tuning-Free Multi-Prompt Longer Video Generation"

Python ★ 325 1y ago
AudioStory

AudioStory: Generating Long-Form Narrative Audio with Large Language Models

Jupyter Notebook ★ 302 9mo ago
CustomNet

No description.

Python ★ 290 1y ago
FreeSplatter

[ICCV 2025] FreeSplatter: Pose-free Gaussian Splatting for Sparse-view 3D Reconstruction

JavaScript ★ 240 10mo ago
UMT

UMT is a unified and flexible framework which can handle different input modality combinations, and output video moment retrieval and/or highlight detection results.

Python ★ 240 2y ago
Track4World

[ECCV 2026] Track4World: Feedforward World-centric Dense 3D Tracking of All Pixels

Python ★ 238 3d ago
ARC-Hunyuan-Video-7B

Structured Video Comprehension of Real-World Shorts

Python ★ 238 9mo ago
TokLIP

TokLIP: Marry Visual Tokens to CLIP for Multimodal Comprehension and Generation

Python ★ 236 10mo ago
ViT-Lens

[CVPR 2024] ViT-Lens: Towards Omni-modal Representations

Python ★ 190 1y ago
Moto

[ICCV2025 Oral] Latent Motion Token as the Bridging Language for Learning Robot Manipulation from Videos

Python ★ 179 8mo ago
MM-RealSR

Codes for "Metric Learning based Interactive Modulation for Real-World Super-Resolution"

Python ★ 179 2y ago
MotionCrafter

[CVPR 2026 Highlight🔥] MotionCrafter: Dense Geometry and Motion Reconstruction with a 4D VAE

Python ★ 169 9d ago
IC-Custom

[ICLR'26] IC-Custom: Diverse Image Customization via In-Context Learning

Python ★ 162 9mo ago
GenCompositor

[ICLR 2026] GenCompositor: Generative Video Compositing with Diffusion Transformer

Python ★ 158 3mo ago
ST-LLM

[ECCV 2024🔥] Official implementation of the paper "ST-LLM: Large Language Models Are Effective Temporal Learners"

Python ★ 155 1y ago
TimeLens

[CVPR 2026] TimeLens: Rethinking Video Temporal Grounding with Multimodal LLMs

Python ★ 146 1mo ago
DeSRA

Official codes for DeSRA (ICML 2023)

Python ★ 143 2y ago
MCQ

Official code for "Bridging Video-text Retrieval with Multiple Choice Questions", CVPR 2022 (Oral).

Python ★ 142 3y ago
MindOmni

[NeurIPS2025] The official implementation of MindOmni: Unleashing Reasoning Generation in Vision Language Models with RGPO

Python ★ 140 8mo ago
DI-PCG

Code release of our paper "DI-PCG: Diffusion-based Efficient Inverse Procedural Content Generation for High-quality 3D Asset Creation".

Python ★ 138 1y ago
NVComposer

[CVPR 2025] Boosting Generative Novel View Synthesis with Sparse and Unposed Images

Python ★ 130 1y ago
CubeComposer

[CVPR 2026] Spatio-Temporal Autoregressive 4K 360° Video Generation from Perspective Video

Python ★ 123 2mo ago
FAIG

NeurIPS 2021, Spotlight, Finding Discriminative Filters for Specific Degradations in Blind Super-Resolution

Python ★ 121 3y ago
FluxKits

No description.

Python ★ 110 1y ago
ArcNerf

NeRF and its extensions, all in one.

Jupyter Notebook ★ 108 3y ago
SEED-Bench-R1

No description.

Python ★ 101 1y ago
mllm-npu

mllm-npu: training multimodal large language models on Ascend NPUs

Python ★ 95 1y ago
Video-Holmes

[ECCV 2026] Video-Holmes: Can MLLM Think Like Holmes for Complex Video Reasoning?

Python ★ 94 11mo ago
Divot

Diffusion Powers Video Tokenizer for Comprehension and Generation (CVPR 2025)

Python ★ 87 1y ago
GRPO-CARE

[ACL2026 Findings] GRPO-CARE: Consistency-Aware Reinforcement Learning for Multimodal Reasoning

Python ★ 84 1y ago
SurfelNeRF

SurfelNeRF: Neural Surfel Radiance Fields for Online Photorealistic Reconstruction of Indoor Scenes

★ 79 2y ago
RepSR

Codes for "RepSR: Training Efficient VGG-style Super-Resolution Networks with Structural Re-Parameterization and Batch Normalization"

★ 76 4y ago
DSR_Suite

No description.

Jupyter Notebook ★ 74 2mo ago
HOSNeRF

HOSNeRF: Dynamic Human-Object-Scene Neural Radiance Fields from a Single Video

Python ★ 70 2y ago
FastRealVSR

Codes for "Mitigating Artifacts in Real-World Video Super-Resolution Models"

★ 60 3y ago
GVT

Official code for "What Makes for Good Visual Tokenizers for Large Language Models?".

Python ★ 60 3y ago
ConMIM

Official codes for ConMIM (ICLR 2023)

Python ★ 59 3y ago
TVTS

Turning to Video for Transcript Sorting

Jupyter Notebook ★ 50 2y ago
BEBR

Official code for "Binary embedding based retrieval at Tencent"

Python ★ 45 2y ago
ARC-Chapter

Structuring Hour-Long Videos into Navigable Chapters and Hierarchical Summaries

★ 43 7mo ago
ViSFT

No description.

Python ★ 39 2y ago
SGAT4PASS

[IJCAI 2023] Official implementation of the paper "SGAT4PASS: Spherical Geometry-Aware Transformer for PAnoramic Semantic Segmentation"

Python ★ 37 3y ago
pi-Tuning

Official code for "pi-Tuning: Transferring Multimodal Foundation Models with Optimal Multi-task Interpolation", ICML 2023.

Python ★ 34 2y ago
BTS

BTS: A Bi-lingual Benchmark for Text Segmentation in the Wild

★ 34 2y ago
FLM

Accelerating Vision-Language Pretraining with Free Language Modeling (CVPR 2023)

Python ★ 33 3y ago
Efficient-VSR-Training

Codes for "Accelerating the Training of Video Super-Resolution"

★ 31 4y ago
DTN

Official code for "Dynamic Token Normalization Improves Vision Transformer", ICLR 2022.

Python ★ 30 4y ago
OpenCompatible

OpenCompatible provides a standard compatible training benchmark, covering practical training scenarios.

Python ★ 26 4y ago
Plot2Code

No description.

Python ★ 24 1y ago
BlobCtrl

[SIGGRAPH ASIA'25] BlobCtrl: Taming Controllable Blob for Element-level Image Editing

Python ★ 24 7mo ago
SFDA

No description.

Python ★ 22 3y ago
OmniScript

OmniScript: Towards Audio-Visual Script Generation for Long-Form Cinematic Video

★ 17 1mo ago
TaCA

Official code for the paper, "TaCA: Upgrading Your Visual Foundation Model with Task-agnostic Compatible Adapter".

★ 17 3y ago
common_trainer

Common template for PyTorch projects. Easy to extend and modify for new projects.

Python ★ 14 3y ago
TransFusion

The code repo for the ACM MM paper "TransFusion: Multi-Modal Fusion for Video Tag Inference via Translation-based Knowledge Embedding".

★ 10 4y ago
ArcVis

Interactive visualization of 3D and 2D components.

Jupyter Notebook ★ 7 3y ago
BasicVQ-GEN

No description.

★ 7 3y ago
Sculpt4D

[CVPR 2026] Sculpt4D: Generating 4D Shapes via Sparse-Attention Diffusion Transformers

Python ★ 6 15d ago
VTLayout

No description.

★ 4 2y ago
vllm

vllm for ARC-Hunyuan-Video-7B

Python ★ 3 8mo ago
TencentARC.github.io

No description.

HTML ★ 1 10mo ago
Forward-Warp ⑂

A library for optical-flow forward warping with backpropagation, built on PyTorch.

Python ★ 0 1mo ago



GitHub is a trademark of GitHub, Inc. gitmyhub is independent and not affiliated with or endorsed by GitHub. Public data is shown via the GitHub API.