
video-SALMONN-2

Python ★ 197 updated 3mo ago

video-SALMONN 2 is an audio-visual large language model (LLM) that generates high-quality video captions covering both audio and visual content. It was developed by the Department of Electronic Engineering at Tsinghua University and ByteDance.
