Qiang He

@sweetice · Tuebingen, Germany · sweetice.github.io

PhD @ Ruhr University Bochum

104 repos
263 followers
26 following

Python 50%
HTML 30%
JavaScript 10%
Jupyter Notebook 10%

41 contributions in the last year

7-day longest streak

All public repos (104)

Deep-reinforcement-learning-with-pytorch ★ PINNED

PyTorch implementation of DQN, AC, ACER, A2C, A3C, PG, DDPG, TRPO, PPO, SAC, TD3 and ....

Python ★ 4.6k 3y ago
PEER-CVPR23

Authors' implementation of PEER

Python ★ 11 3y ago
learning-to-communicate-pytorch ⑂

Learning to Communicate with Deep Multi-Agent Reinforcement Learning in PyTorch

Python ★ 5 7y ago
RL-Adventure ⑂

Pytorch Implementation of DQN / DDQN / Prioritized replay/ noisy networks/ distributional values/ Rainbow/ hierarchical RL

Jupyter Notebook ★ 3 7y ago
PyTorch-GAN ⑂

PyTorch implementations of Generative Adversarial Networks.

Python ★ 2 7y ago
reinforcement-learning-algorithms ⑂

This repository contains most of classic deep reinforcement learning algorithms, including - DQN, DDQN, Dueling Network, DDPG, A3C, PPO, TRPO. (More algorithms are still in progress)

Python ★ 2 7y ago
sweetice_stop_0708_2025.github.io

No description.

HTML ★ 1 1y ago
BEER-ICLR2024

The present anonymous repository serves as a guide for reproducing the results of the "BEER" method proposed in our ICLR submission "Adaptive Regularization of Representation Rank as an Implicit Constraint of Bellman Equation".

Python ★ 1 2y ago
ERC-ECML-23

Anonymous code for ICML submission 45

Python ★ 1 2y ago
Algorithm_Interview_Notes-Chinese ⑂

2018/2019 / campus recruiting / spring recruiting / autumn recruiting / algorithms / Machine Learning / Deep Learning / Natural Language Processing (NLP) / C/C++ / Python / interview notes

Python ★ 1 7y ago
POPO

Code for POPO: Pessimistic Offline Policy Optimization

★ 1 5y ago
Distributional-Soft-Actor-Critic ⑂

No description.

★ 1 6y ago
Paper-plotting ⑂

Plotting of captured Tensorboard runs for seeded RL comparison

★ 1 6y ago
VirtualTaobao ⑂

Virtual-Taobao simulators with OpenAI Gym interface

Python ★ 1 7y ago
baselines ⑂

OpenAI Baselines: high-quality implementations of reinforcement learning algorithms

Python ★ 1 7y ago
qhe97.github.io ⑂

Github Pages template for academic personal websites, forked from mmistakes/minimal-mistakes

★ 0 5y ago
sweetice.github.io_old

No description.

HTML ★ 0 4y ago
Online-RLHF ⑂

A recipe for online RLHF.

★ 0 2y ago
tianshou ⑂

An elegant, flexible, and superfast PyTorch deep Reinforcement Learning platform.

★ 0 6y ago
llama ⑂

Inference code for LLaMA models

★ 0 3y ago
MEPE

Official implementation of MEPE

Python ★ 0 3y ago
trl ⑂

Train transformer language models with reinforcement learning.

★ 0 2y ago
LLM4Arxiv ⑂

No description.

★ 0 2y ago
reward-surfaces ⑂

No description.

★ 0 4y ago
ColossalAI ⑂

Making large AI models cheaper, faster and more accessible

★ 0 3y ago
dalai_llama ⑂

The simplest way to run LLaMA on your local machine

★ 0 3y ago
stanford_alpaca ⑂

Code and documentation to train Stanford's Alpaca models, and generate the data.

★ 0 3y ago
voltron-robotics ⑂

Voltron: Language-Driven Representation Learning for Robotics

★ 0 3y ago
RWKV-LM ⑂

RWKV is a RNN with transformer-level LLM performance. It can be directly trained like a GPT (parallelizable). So it's combining the best of RNN and transformer - great performance, fast inference, saves VRAM, fast training, "infinite" ctx_len, and free sentence embedding.

★ 0 3y ago
tqc_pytorch_1epo ⑂

Implementation of Truncated Quantile Critics method for continuous reinforcement learning. https://bayesgroup.github.io/tqc/

★ 0 5y ago
deep-successor-features-for-transfer ⑂

A reusable framework for successor features for transfer in deep reinforcement learning using keras.

★ 0 5y ago
learned-fourier-features ⑂

Code for the paper "Functional Regularization for Reinforcement Learning via Learned Fourier Features"

★ 0 3y ago
ffn_geyang ⑂

Public Repo for the paper "Overcoming The Spectral-Bias of Neural Value Approximation"

★ 0 3y ago
LibMTL ⑂

A PyTorch Library for Multi-Task Learning

★ 0 3y ago
sweetice.github.io_abondon

No description.

JavaScript ★ 0 3y ago
neural-approx-ss-lfi ⑂

Codes for ICLR 21 paper: Neural Approximate Sufficient Statistics for Implicit Models

★ 0 5y ago
revisiting-ppo ⑂

No description.

★ 0 5y ago
Mirror-Descent-Policy-Optimization ⑂

Mirror Descent Policy Optimization

★ 0 5y ago
gulf ⑂

GULF: GUided Learning through successive Functional gradient optimization (author implementation of DPCNN included)

★ 0 5y ago
drqv2 ⑂

DrQ-v2: Improved Data-Augmented Reinforcement Learning

★ 0 5y ago
mpo ⑂

PyTorch Implementation of the Maximum a Posteriori Policy Optimisation

★ 0 5y ago
rlkit ⑂

Collection of reinforcement learning algorithms

Python ★ 0 7y ago
adaptive_estimators ⑂

Code for ICLR 2019 paper "Adaptive Estimators Show Information Compression in Deep Neural Networks" (https://openreview.net/forum?id=SkeZisA5t7)

★ 0 6y ago
snrl ⑂

No description.

★ 0 5y ago
dice_rl ⑂

No description.

★ 0 5y ago
TD3_BC ⑂

Author's PyTorch implementation of TD3+BC, a simple variant of TD3 for offline RL

★ 0 5y ago
Evolutionary-Reinforcement-Learning ⑂

Codebase for Evolutionary Reinforcement Learning (ERL) from the paper "Evolution-Guided Policy Gradients in Reinforcement Learning" published at NeurIPS 2018

★ 0 5y ago
pderl ⑂

Code for "Proximal Distilled Evolutionary Reinforcement Learning", accepted at AAAI 2020

★ 0 5y ago
stein_ksd ⑂

No description.

★ 0 8y ago
boots ⑂

No description.

★ 0 6y ago
simple-complexities.github.io ⑂

Ameya's math and CS blog.

★ 0 6y ago
previous_blog

blog pages

HTML ★ 0 5y ago
cfg-gan-pt ⑂

CFG-GAN (Composite Functional Gradient learning of GAN) in pyTorch

★ 0 6y ago
CQL ⑂

Code for conservative Q-learning

★ 0 5y ago
mopo ⑂

Code for MOPO: Model-based Offline Policy Optimization

★ 0 5y ago
BAIL ⑂

No description.

★ 0 5y ago
d4rl-pybullet ⑂

Datasets for Data-Driven Deep Reinforcement Learning with Pybullet environments

★ 0 6y ago
google-research ⑂

Google Research

★ 0 6y ago
Streamlined-Off-Policy-Learning ⑂

ICRL 2020

★ 0 6y ago
BEAR ⑂

Code for Stabilizing Off-Policy RL via Bootstrapping Error Reduction

Python ★ 0 6y ago
code-for-paper ⑂

No description.

★ 0 6y ago
d4pg-pytorch ⑂

PyTorch implementation of Distributed Distributional Deterministic Policy Gradients (https://arxiv.org/abs/1804.08617)

★ 0 6y ago
Mine_pytorch ⑂

MINE: Mutual Information Neural Estimation in pytorch

★ 0 7y ago
BCQ ⑂

PyTorch implementation of BCQ for "Off-Policy Deep Reinforcement Learning without Exploration"

★ 0 6y ago
neural-processes ⑂

Pytorch implementation of Neural Processes for functions and images :fireworks:

★ 0 7y ago
Machine-Learning-Session ⑂

No description.

★ 0 7y ago
Probabilistic-Programming-and-Bayesian-Methods-for-Hackers ⑂

aka "Bayesian Methods for Hackers": An introduction to Bayesian methods + probabilistic programming with a computation/understanding-first, mathematics-second point of view. All in pure Python ;)

★ 0 6y ago
probmods2 ⑂

probmods 2: electric boogaloo

★ 0 7y ago
machine-learning-notes ⑂

My continuously updated Machine Learning, Probabilistic Models and Deep Learning notes and demos (1000+ slides), with video links

Jupyter Notebook ★ 0 7y ago
self-imitation-learning ⑂

ICML 2018 Self-Imitation Learning

Python ★ 0 8y ago
MathsDL-spring18 ⑂

Topics course Mathematics of Deep Learning, NYU, Spring 18

★ 0 8y ago
loss-landscape ⑂

Code for visualizing the loss landscape of neural nets

Python ★ 0 7y ago
aftershocks_issues ⑂

Issues with Deep Learning of Aftershocks by DeVries

Jupyter Notebook ★ 0 7y ago
noreward-rl ⑂

[ICML 2017] TensorFlow code for Curiosity-driven Exploration for Deep Reinforcement Learning

Python ★ 0 7y ago
pytorch-noreward-rl ⑂

pytorch implementation of Curiosity-driven Exploration by Self-supervised Prediction

Python ★ 0 7y ago
random-network-distillation ⑂

Code for the paper "Exploration by Random Network Distillation"

Python ★ 0 7y ago
gym-super-mario-bros ⑂

An OpenAI Gym interface to Super Mario Bros. & Super Mario Bros. 2 (Lost Levels) on The NES

Python ★ 0 7y ago
Super-Mario-Bros-RL ⑂

This project explores deep reinforcement learning, hybrid actor-critic approach with A3C/PPO combined with curiosity for playing Super Mario Bros

Jupyter Notebook ★ 0 7y ago
Recommenders ⑂

Recommender Systems

Jupyter Notebook ★ 0 7y ago
feudal-montezuma ⑂

Pytorch implementation of "FeUdal Networks for Hierarchical Reinforcement Learning" for Montezuma's Revenge

Python ★ 0 7y ago
pytorch-trpo ⑂

PyTorch implementation of Trust Region Policy Optimization

Python ★ 0 7y ago
SLM-Lab ⑂

Modular Deep Reinforcement Learning framework in PyTorch.

Python ★ 0 7y ago
stable-baselines ⑂

Mirror of Stable-Baselines: a fork of OpenAI Baselines, implementations of reinforcement learning algorithms

Python ★ 0 7y ago
go-explore ⑂

Code for Go-Explore: a New Approach for Hard-Exploration Problems

Python ★ 0 7y ago
modular_rl ⑂

Implementation of TRPO and related algorithms

Python ★ 0 8y ago
RL-Gallery ⑂

A gallery for reinforcement learning, including frameworks, tutorials, papers, implementations, applications, etc.

★ 0 7y ago
tushare ⑂

TuShare is a utility for crawling historical data of China stocks

Python ★ 0 7y ago
spinningup ⑂

An educational resource to help anyone learn deep reinforcement learning.

Python ★ 0 7y ago
learning-to-communicate ⑂

Learning to Communicate with Deep Multi-Agent Reinforcement Learning

Lua ★ 0 7y ago
ItChat ⑂

A complete and graceful API for Wechat. A WeChat personal-account interface, WeChat bot, and command-line WeChat; a custom personal-account bot takes only thirty lines.

Python ★ 0 7y ago
Lihang ⑂

Statistical Learning Methods (统计学习方法) by Li Hang (李航), worth rereading. [Notes, code, notebook, references, errata]

Python ★ 0 7y ago
RL-Adventure-2 ⑂

PyTorch0.4 implementation of: actor critic / proximal policy optimization / acer / ddpg / twin dueling ddpg / soft actor critic / generative adversarial imitation learning / hindsight experience replay

Jupyter Notebook ★ 0 8y ago
softlearning ⑂

Softlearning is a reinforcement learning framework for training maximum entropy policies in continuous domains.

Python ★ 0 7y ago
PyTorch-RL ⑂

PyTorch implementation of Deep Reinforcement Learning: Policy Gradient methods (TRPO, PPO, A2C) and Generative Adversarial Imitation Learning (GAIL). Fast Fisher vector product TRPO.

Python ★ 0 7y ago
glow ⑂

Code for reproducing results in "Glow: Generative Flow with Invertible 1x1 Convolutions"

Python ★ 0 7y ago
models ⑂

Models and examples built with TensorFlow

Python ★ 0 7y ago
stanford-cs-229-machine-learning ⑂

VIP cheatsheets for Stanford's CS 229 Machine Learning

★ 0 7y ago
TD3 ⑂

PyTorch implementation of TD3 and DDPG for OpenAI gym tasks

Python ★ 0 7y ago
copy_blog ⑂

BY Blog ->

HTML ★ 0 7y ago
MARL-Papers ⑂

Paper list of multi-agent reinforcement learning (MARL)

★ 0 8y ago
reinforcement-learning-an-introduction ⑂

Python Implementation of Reinforcement Learning: An Introduction

Python ★ 0 8y ago
Titanic

Predict survival on the Titanic

Jupyter Notebook ★ 0 8y ago
tensorflow-tutorial ⑂

Example TensorFlow codes and Caicloud TensorFlow as a Service dev environment.

Jupyter Notebook ★ 0 9y ago
tensorflow ⑂

Computation using data flow graphs for scalable machine learning

C++ ★ 0 9y ago