Hi there, I'm Ye Liu
I'm a Ph.D. Candidate at the Department of Computing, The Hong Kong Polytechnic University.
Research Interests
- Multi-modal Large Language Models
- Agentic Visual Reasoning
- Video Understanding
GitHub Statistics
LemonJournal
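A WeChat mini program demo based on the Wafer2 framework: Lemon Journal (柠檬手帐), an image-editing app that supports moving, rotating, and scaling images and text, saving the editing state, and generating preview images.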
JavaScript ★ 394 4y ago
VideoMind
🧠 VideoMind: A Chain-of-LoRA Agent for Temporal-Grounded Video Reasoning (ICLR 2026)
Python ★ 341 4mo ago
R2-Tuning
🌀 R2-Tuning: Efficient Image-to-Video Transfer Learning for Video Temporal Grounding (ECCV 2024)
Python ★ 92 1y ago
CATNet
🛰️ Learning to Aggregate Multi-Scale Context for Instance Segmentation in Remote Sensing Images (TNNLS 2024)
Python ★ 88 2y ago
SeatKiller-GUI
A GUI library seat management tool for Wuhan University. It supports scheduled seat booking, a mode for picking up released seats, seat changes, email reminders, and seat locking.
C# ★ 87 3y ago
FakeHanMove
A small tool that lets you finish your Hanmu (汉姆) runs while lying in bed.
Python ★ 37 4y ago
ConsNet
🚴‍♂️ ConsNet: Learning Consistency Graph for Zero-Shot Human-Object Interaction Detection (MM 2020)
Python ★ 35 11mo ago
nncore
📦 A lightweight machine learning toolkit for researchers, providing common model design & learning functionalities.
Python ★ 29 11mo ago
SeatKiller
A library seat booking script for Wuhan University.
Python ★ 26 4y ago
weapp-inputbox
A modularized modal inputbox component for WeChat mini programs.
JavaScript ★ 16 4y ago
FEEDIE
An intelligent feeding robot with voice control and visual servo support.
C++ ★ 8 4y ago
UMT
🎬 UMT: Unified Multi-modal Transformers for Joint Video Moment Retrieval and Highlight Detection (CVPR 2022)
★ 7 4y ago
geoviewer
An online, lightweight, and user-friendly geographical information visualizer.
JavaScript ★ 7 4y ago
easyresume
A cross-platform and responsive online resume management system.
PHP ★ 5 4y ago
geoviewer-server
Server-side code for GeoViewer.
PLpgSQL ★ 3 4y ago
yeliudev
No description.
★ 0 6mo ago
Awesome-Multimodal-Large-Language-Models ⑂
✨✨ Latest Advances on Multimodal Large Language Models
★ 0 1y ago
bb-8
An online animated BB-8 robot from Star Wars, made with CSS.
CSS ★ 0 3y ago
navigator-plus-plus
A cross-platform ESRI e00 file visualizer & shortest path generator.
C++ ★ 0 4y ago
trust-game
A GUI trust game experiment based on PyQt5.
Python ★ 0 5y ago
PathGenerator
A cross-platform ESRI e00 file visualizer & shortest path generator.
Java ★ 0 4y ago
FoodLeader
A WeChat mini program demo for restaurant recommendation.
JavaScript ★ 0 4y ago
BmpEditor
A command line tool for raster data transformation.
C++ ★ 0 4y ago