🔭 I'm an RLer + NLPer/2 + MLSyser/2.
Awesome-LLM-Strawberry ★ PINNED
A collection of LLM papers, blogs, and projects, with a focus on OpenAI o1 🍓 and reasoning techniques.
★ 6.9k 6mo ago
pymarl2 ★ PINNED
Fine-tuned MARL algorithms on SMAC (100% win rates on most scenarios)
Python ★ 711 2y ago
alpha-zero-gomoku ★ PINNED
A Multi-threaded Implementation of AlphaZero (C++)
Python ★ 387 3y ago
RL ★ PINNED ⑂
Scalable toolkit for efficient model reinforcement
Python ★ 0 2mo ago
vllm ★ PINNED ⑂
A high-throughput and memory-efficient inference and serving engine for LLMs
★ 0 1y ago
cuda-neural-network
Convolutional Neural Network with CUDA (MNIST 99.23%)
C++ ★ 197 4y ago
deep-reinforcement-learning-notes
Deep Reinforcement Learning Notes
★ 119 7y ago
mini-os-kernel
A mini Unix-Like OS kernel
C ★ 100 7y ago
reinforcement-learning-wechat-jump
Reinforcement Learning for WeChat Jump
Python ★ 93 7y ago
mini-interpreter
A Simple Scripting Language
Go ★ 80 7y ago
noisy-mappo
Multi-agent PPO with noise (97% win rates on Hard scenarios of SMAC)
Python ★ 77 3y ago
prisma
Prisma
Python ★ 71 7y ago
dht-crawler
A DHT Crawler based on Goroutine
Go ★ 65 7y ago
web-server
A Web Server designed with Reactor I/O Model
C++ ★ 64 7y ago
deep-learning-notes
Deep Learning Notes
★ 51 6y ago
reinforcement-learning-trading-robot
Trading Robot based on LSTM-PPO
Python ★ 30 6y ago
awesome-RLHF ⑂
A curated list of reinforcement learning with human feedback resources (continually updated)
★ 4 1y ago
Awesome-LLM-Inference ⑂
📖A curated list of Awesome LLM Inference Paper with codes, TensorRT-LLM, vLLM, streaming-llm, AWQ, SmoothQuant, WINT8/4, Continuous Batching, FlashAttention, PagedAttention etc.
★ 3 1y ago
hijkzzz.github.io
Homepage
HTML ★ 3 1y ago
Awesome-LLM-Long-Context-Modeling ⑂
📰 Must-read papers and blogs on LLM based Long Context Modeling 🔥
★ 2 1y ago
leetcode
LeetCode & LintCode
C++ ★ 2 7y ago
ring-flash-attention ⑂
Ring attention implementation with flash attention
Python ★ 1 11mo ago
vllm-project.github.io ⑂
No description.
HTML ★ 1 1y ago
verl ⑂
verl: Volcano Engine Reinforcement Learning for LLMs
Python ★ 0 11mo ago
flashinfer ⑂
FlashInfer: Kernel Library for LLM Serving
★ 0 9mo ago
hijkzzz
No description.
★ 0 1y ago
NTU-Thesis-LaTeX-Template ⑂
🎓 Unofficial LaTeX templates for your graduate thesis (both master's theses and doctoral dissertations) at National Taiwan University. 國立臺灣大學碩博士學位論文 LaTeX 模板
★ 0 5y ago
mame-street-fighter-3-ai
Reinforcement Learning for Street Fighter III: 3rd Strike
Python ★ 0 6y ago
reinforcement-learning.pytorch
Reinforcement Learning Library
Python ★ 0 6y ago