umiswing

@umiswing

35 repos
30 followers
122 following

C 33%
Python 17%
Emacs Lisp 17%
Cuda 17%
HTML 17%

200 contributions in the last year

6-day longest streak

All public repos (35)

Paddle ★ PINNED ⑂

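PArallel Distributed Deep LEarning: Machine Learning Framework from Industrial Practice (the 『飞桨』 PaddlePaddle core framework: high-performance single-machine and distributed training, and cross-platform deployment, for deep learning and machine learning)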

C++ ★ 0 26d ago
.emacs.d ★ PINNED

No description.

Emacs Lisp ★ 0 2y ago
test_flashmask

No description.

Python ★ 3 1mo ago
flash-attention ⑂

Fast and memory-efficient exact attention

★ 1 1mo ago
auto-gpu-kernel ⑂

Winner 🏆 (Agent-only) MLSys 2026 - FlashInfer AI Kernel Generation Contest for the DeepSeek Sparse Attention (DSA) track with an average speedup of 34.93x

★ 0 1mo ago
ncu-report-skill ⑂

No description.

★ 0 27d ago
KernelWiki ⑂

No description.

★ 0 27d ago
kernel-design-agents ⑂

No description.

★ 0 27d ago
PaddleFleet ⑂

Core Functional Library for Distributed Training

★ 0 2d ago
claude-code ⑂

An independent Python feature port of Claude Code, entirely rewritten from scratch using oh-my-codex. For educational purposes only.

★ 0 2mo ago
claude-code-analysis ⑂

Comprehensive reverse-engineering analysis of Claude Code's internal architecture, modules, and design patterns

★ 0 2mo ago
PaddleFormers ⑂

PaddleFormers is an easy-to-use model zoo library of pre-trained large language models, based on PaddlePaddle.

★ 0 4d ago
NeMo ⑂

A scalable generative AI framework built for researchers and developers working on Large Language Models, Multimodal, and Speech AI (Automatic Speech Recognition and Text-to-Speech)

★ 0 1y ago
PaddleNLP ⑂

👑 Easy-to-use and powerful NLP and LLM library with 🤗 Awesome model zoo, supporting a wide range of NLP tasks from research to industrial applications, including 🗂 Text Classification, 🔍 Neural Search, ❓ Question Answering, ℹ️ Information Extraction, 📄 Document Intelligence, 💌 Sentiment Analysis, etc.

★ 0 9mo ago
DocumentSASS ⑂

Unofficial description of the CUDA assembly (SASS) instruction sets.

★ 0 3y ago
flux ⑂

A fast communication-overlapping library for tensor parallelism on GPUs.

★ 0 1y ago
PaddleApiTest ⑂

No description.

★ 0 2y ago
PaddleFlashattnTest ⑂

Additional tests of flash attention api in paddle

★ 0 1y ago
ppl.llm.kernel.cuda ⑂

No description.

★ 0 2y ago
DissectingTensorCores ⑂

No description.

★ 0 2y ago
nv_isa_solver ⑂

No description.

★ 0 1y ago
maxas ⑂

Assembler for NVIDIA Maxwell architecture

★ 0 3y ago
NiuTrans.NMT ⑂

A Fast Neural Machine Translation System. It is developed in C++ and resorts to NiuTensor for fast tensor APIs.

★ 0 2y ago
NiuTrans.ST ⑂

No description.

★ 0 2y ago
st

No description.

C ★ 0 2y ago
dwm

No description.

C ★ 0 2y ago
emacs-abyss-theme ⑂

A dark theme for Emacs

★ 0 8y ago
cudnn_test

No description.

Cuda ★ 0 2y ago
umiswing.github.io

No description.

HTML ★ 0 2y ago
draw.io

No description.

★ 0 3y ago
emacs-catppuccin ⑂

🍄 Soothing pastel theme for Emacs

Emacs Lisp ★ 0 3y ago
cutlassProfilerUsage

No description.

★ 0 3y ago
YHs_Sample ⑂

Yinghan's Code Sample

★ 0 3y ago
how-to-optimize-gemm ⑂

No description.

★ 0 4y ago
How_to_optimize_in_GPU ⑂

This is a series of GPU optimization topics. Here we will introduce how to optimize the program on the GPU in detail. I will introduce several basic kernel optimizations, including: elementwise, reduce, sgemv, sgemm, etc. The performance of these kernels is basically at or near the theoretical limit.

★ 0 4y ago