xzf-thu

@xzf-thu

8 repos
90 followers
1 following

Python 83%
HTML 17%

117 contributions in the last year

6-day longest streak

All public repos (8)

Audio-Reasoner ★ PINNED

The first large audio language model to enable native in-depth thinking, trained on large-scale audio Chain-of-Thought data.

Python ★ 297 1y ago
Mega-ASR ★ PINNED

The first foundation ASR model built for the real world: 7 atomic acoustic conditions, 54 compound scenarios, 2.6M samples, and gains of up to ~30% over SOTA where other models fall apart. You'll come back to MEGA-ASR after the rest fail in the wild.

Tsinghua speech recognition foundation model tuned for noisy, far-field, in-the-wild audio, claiming up to 30 percent lower WER than Whisper and Qwen3-ASR on hard clips.

Python ★ 978 9d ago
Mini-Omni-Reasoner ★ PINNED

Mini-Omni-Reasoner: a real-time speech reasoning framework that interleaves silent reasoning tokens with spoken response tokens (“thinking-in-speaking”), exploiting the LLM–audio throughput gap to keep speech fluent and low-latency while maintaining structured internal reasoning.

★ 165 9mo ago
Pask ★ PINNED

Towards Self-Evolving Proactive AI with Perpetual Memory

Python ★ 197 1mo ago
Audio-Interaction

No description.

Python ★ 352 8d ago
Voices-in-the-Wild-Bench

Bilingual Chinese and English benchmark of 5000 noisy real and synthetic audio clips with a Python toolkit to score ASR models like Whisper and Canary.

Python ★ 24 21d ago
xzf-thu.github.io

I'm Xie Zhifei

★ 0 2d ago
MMRC

Measuring Massive-Computational Math Reasoning with Code in LLMs

HTML ★ 0 7mo ago