Deep-reinforcement-learning-with-pytorch ★ PINNED
PyTorch implementation of DQN, AC, ACER, A2C, A3C, PG, DDPG, TRPO, PPO, SAC, TD3 and ....
Python ★ 4.6k 3y ago
PEER-CVPR23
Authors' implementation of PEER
Python ★ 11 3y ago
learning-to-communicate-pytorch ⑂
Learning to Communicate with Deep Multi-Agent Reinforcement Learning in PyTorch
Python ★ 5 7y ago
RL-Adventure ⑂
PyTorch implementation of DQN / DDQN / prioritized replay / noisy networks / distributional values / Rainbow / hierarchical RL
Jupyter Notebook ★ 3 7y ago
PyTorch-GAN ⑂
PyTorch implementations of Generative Adversarial Networks.
Python ★ 2 7y ago
reinforcement-learning-algorithms ⑂
This repository contains most of the classic deep reinforcement learning algorithms, including DQN, DDQN, Dueling Network, DDPG, A3C, PPO, and TRPO. (More algorithms are still in progress.)
Python ★ 2 7y ago
sweetice_stop_0708_2025.github.io
No description.
HTML ★ 1 1y ago
BEER-ICLR2024
The present anonymous repository serves as a guide for reproducing the results of the "BEER" method proposed in our ICLR submission "Adaptive Regularization of Representation Rank as an Implicit Constraint of Bellman Equation".
Python ★ 1 2y ago
ERC-ECML-23
Anonymous code for ICML submission 45
Python ★ 1 2y ago
Algorithm_Interview_Notes-Chinese ⑂
2018/2019 / campus recruitment / spring recruitment / autumn recruitment / algorithms / Machine Learning / Deep Learning / Natural Language Processing (NLP) / C/C++ / Python / interview notes
Python ★ 1 7y ago
POPO
Code for POPO: Pessimistic Offline Policy Optimization
★ 1 5y ago
Distributional-Soft-Actor-Critic ⑂
No description.
★ 1 6y ago
Paper-plotting ⑂
Plotting of captured Tensorboard runs for seeded RL comparison
★ 1 6y ago
VirtualTaobao ⑂
Virtual-Taobao simulators with OpenAI Gym interface
Python ★ 1 7y ago
baselines ⑂
OpenAI Baselines: high-quality implementations of reinforcement learning algorithms
Python ★ 1 7y ago
qhe97.github.io ⑂
Github Pages template for academic personal websites, forked from mmistakes/minimal-mistakes
★ 0 5y ago
sweetice.github.io_old
No description.
HTML ★ 0 4y ago
Online-RLHF ⑂
A recipe for online RLHF.
★ 0 2y ago
tianshou ⑂
An elegant, flexible, and superfast PyTorch deep Reinforcement Learning platform.
★ 0 6y ago
llama ⑂
Inference code for LLaMA models
★ 0 3y ago
MEPE
Official implementation of MEPE
Python ★ 0 3y ago
trl ⑂
Train transformer language models with reinforcement learning.
★ 0 2y ago
LLM4Arxiv ⑂
No description.
★ 0 2y ago
reward-surfaces ⑂
No description.
★ 0 4y ago
ColossalAI ⑂
Making large AI models cheaper, faster and more accessible
★ 0 3y ago
dalai_llama ⑂
The simplest way to run LLaMA on your local machine
★ 0 3y ago
stanford_alpaca ⑂
Code and documentation to train Stanford's Alpaca models, and generate the data.
★ 0 3y ago
voltron-robotics ⑂
Voltron: Language-Driven Representation Learning for Robotics
★ 0 3y ago
RWKV-LM ⑂
RWKV is an RNN with transformer-level LLM performance. It can be directly trained like a GPT (parallelizable), so it combines the best of RNN and transformer: great performance, fast inference, low VRAM use, fast training, "infinite" ctx_len, and free sentence embedding.
★ 0 3y ago
tqc_pytorch_1epo ⑂
Implementation of Truncated Quantile Critics method for continuous reinforcement learning. https://bayesgroup.github.io/tqc/
★ 0 5y ago
deep-successor-features-for-transfer ⑂
A reusable framework for successor features for transfer in deep reinforcement learning using keras.
★ 0 5y ago
learned-fourier-features ⑂
Code for the paper "Functional Regularization for Reinforcement Learning via Learned Fourier Features"
★ 0 3y ago
ffn_geyang ⑂
Public Repo for the paper "Overcoming The Spectral-Bias of Neural Value Approximation"
★ 0 3y ago
LibMTL ⑂
A PyTorch Library for Multi-Task Learning
★ 0 3y ago
sweetice.github.io_abondon
No description.
JavaScript ★ 0 3y ago
neural-approx-ss-lfi ⑂
Codes for ICLR 21 paper: Neural Approximate Sufficient Statistics for Implicit Models
★ 0 5y ago
revisiting-ppo ⑂
No description.
★ 0 5y ago
Mirror-Descent-Policy-Optimization ⑂
Mirror Descent Policy Optimization
★ 0 5y ago
gulf ⑂
GULF: GUided Learning through successive Functional gradient optimization (author implementation of DPCNN included)
★ 0 5y ago
drqv2 ⑂
DrQ-v2: Improved Data-Augmented Reinforcement Learning
★ 0 5y ago
mpo ⑂
PyTorch Implementation of the Maximum a Posteriori Policy Optimisation
★ 0 5y ago
rlkit ⑂
Collection of reinforcement learning algorithms
Python ★ 0 7y ago
adaptive_estimators ⑂
Code for ICLR 2019 paper "Adaptive Estimators Show Information Compression in Deep Neural Networks" (https://openreview.net/forum?id=SkeZisA5t7)
★ 0 6y ago
snrl ⑂
No description.
★ 0 5y ago
dice_rl ⑂
No description.
★ 0 5y ago
TD3_BC ⑂
Author's PyTorch implementation of TD3+BC, a simple variant of TD3 for offline RL
★ 0 5y ago
Evolutionary-Reinforcement-Learning ⑂
Codebase for Evolutionary Reinforcement Learning (ERL) from the paper "Evolution-Guided Policy Gradients in Reinforcement Learning" published at NeurIPS 2018
★ 0 5y ago
pderl ⑂
Code for "Proximal Distilled Evolutionary Reinforcement Learning", accepted at AAAI 2020
★ 0 5y ago
stein_ksd ⑂
No description.
★ 0 8y ago
boots ⑂
No description.
★ 0 6y ago
simple-complexities.github.io ⑂
Ameya's math and CS blog.
★ 0 6y ago
previous_blog
blog pages
HTML ★ 0 5y ago
cfg-gan-pt ⑂
CFG-GAN (Composite Functional Gradient learning of GAN) in pyTorch
★ 0 6y ago
CQL ⑂
Code for conservative Q-learning
★ 0 5y ago
mopo ⑂
Code for MOPO: Model-based Offline Policy Optimization
★ 0 5y ago
BAIL ⑂
No description.
★ 0 5y ago
d4rl-pybullet ⑂
Datasets for Data-Driven Deep Reinforcement Learning with Pybullet environments
★ 0 6y ago
google-research ⑂
Google Research
★ 0 6y ago
Streamlined-Off-Policy-Learning ⑂
ICRL 2020
★ 0 6y ago
BEAR ⑂
Code for Stabilizing Off-Policy RL via Bootstrapping Error Reduction
Python ★ 0 6y ago
code-for-paper ⑂
No description.
★ 0 6y ago
d4pg-pytorch ⑂
PyTorch implementation of Distributed Distributional Deterministic Policy Gradients (https://arxiv.org/abs/1804.08617)
★ 0 6y ago
Mine_pytorch ⑂
MINE: Mutual Information Neural Estimation in pytorch
★ 0 7y ago
BCQ ⑂
PyTorch implementation of BCQ for "Off-Policy Deep Reinforcement Learning without Exploration"
★ 0 6y ago
neural-processes ⑂
Pytorch implementation of Neural Processes for functions and images
★ 0 7y ago
Machine-Learning-Session ⑂
No description.
★ 0 7y ago
Probabilistic-Programming-and-Bayesian-Methods-for-Hackers ⑂
aka "Bayesian Methods for Hackers": An introduction to Bayesian methods + probabilistic programming with a computation/understanding-first, mathematics-second point of view. All in pure Python ;)
★ 0 6y ago
probmods2 ⑂
probmods 2: electric boogaloo
★ 0 7y ago
machine-learning-notes ⑂
My continuously updated Machine Learning, Probabilistic Models and Deep Learning notes and demos (1000+ slides), with video links
Jupyter Notebook ★ 0 7y ago
self-imitation-learning ⑂
ICML 2018 Self-Imitation Learning
Python ★ 0 8y ago
MathsDL-spring18 ⑂
Topics course Mathematics of Deep Learning, NYU, Spring 18
★ 0 8y ago
loss-landscape ⑂
Code for visualizing the loss landscape of neural nets
Python ★ 0 7y ago
aftershocks_issues ⑂
Issues with Deep Learning of Aftershocks by DeVries
Jupyter Notebook ★ 0 7y ago
noreward-rl ⑂
[ICML 2017] TensorFlow code for Curiosity-driven Exploration for Deep Reinforcement Learning
Python ★ 0 7y ago
pytorch-noreward-rl ⑂
pytorch implementation of Curiosity-driven Exploration by Self-supervised Prediction
Python ★ 0 7y ago
random-network-distillation ⑂
Code for the paper "Exploration by Random Network Distillation"
Python ★ 0 7y ago
gym-super-mario-bros ⑂
An OpenAI Gym interface to Super Mario Bros. & Super Mario Bros. 2 (Lost Levels) on The NES
Python ★ 0 7y ago
Super-Mario-Bros-RL ⑂
This project explores deep reinforcement learning, hybrid actor-critic approach with A3C/PPO combined with curiosity for playing Super Mario Bros
Jupyter Notebook ★ 0 7y ago
Recommenders ⑂
Recommender Systems
Jupyter Notebook ★ 0 7y ago
feudal-montezuma ⑂
Pytorch implementation of "FeUdal Networks for Hierarchical Reinforcement Learning" for Montezuma's Revenge
Python ★ 0 7y ago
pytorch-trpo ⑂
PyTorch implementation of Trust Region Policy Optimization
Python ★ 0 7y ago
SLM-Lab ⑂
Modular Deep Reinforcement Learning framework in PyTorch.
Python ★ 0 7y ago
stable-baselines ⑂
Mirror of Stable-Baselines: a fork of OpenAI Baselines, implementations of reinforcement learning algorithms
Python ★ 0 7y ago
go-explore ⑂
Code for Go-Explore: a New Approach for Hard-Exploration Problems
Python ★ 0 7y ago
modular_rl ⑂
Implementation of TRPO and related algorithms
Python ★ 0 8y ago
RL-Gallery ⑂
A gallery for reinforcement learning, including frameworks, tutorials, papers, implementations, applications, etc.
★ 0 7y ago
tushare ⑂
TuShare is a utility for crawling historical data of China stocks
Python ★ 0 7y ago
spinningup ⑂
An educational resource to help anyone learn deep reinforcement learning.
Python ★ 0 7y ago
learning-to-communicate ⑂
Learning to Communicate with Deep Multi-Agent Reinforcement Learning
Lua ★ 0 7y ago
ItChat ⑂
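A complete and graceful API for Wechat: a personal-account interface, a WeChat bot, and a command-line WeChat client. A custom personal-account bot takes about thirty lines of code.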
Python ★ 0 7y ago
Lihang ⑂
Statistical Learning Methods (统计学习方法) by Li Hang (李航), worth rereading. [Notes, code, notebook, references, errata]
Python ★ 0 7y ago
RL-Adventure-2 ⑂
PyTorch 0.4 implementation of: actor critic / proximal policy optimization / acer / ddpg / twin dueling ddpg / soft actor critic / generative adversarial imitation learning / hindsight experience replay
Jupyter Notebook ★ 0 8y ago
softlearning ⑂
Softlearning is a reinforcement learning framework for training maximum entropy policies in continuous domains.
Python ★ 0 7y ago
PyTorch-RL ⑂
PyTorch implementation of Deep Reinforcement Learning: Policy Gradient methods (TRPO, PPO, A2C) and Generative Adversarial Imitation Learning (GAIL). Fast Fisher vector product TRPO.
Python ★ 0 7y ago
glow ⑂
Code for reproducing results in "Glow: Generative Flow with Invertible 1x1 Convolutions"
Python ★ 0 7y ago
models ⑂
Models and examples built with TensorFlow
Python ★ 0 7y ago
stanford-cs-229-machine-learning ⑂
VIP cheatsheets for Stanford's CS 229 Machine Learning
★ 0 7y ago
TD3 ⑂
PyTorch implementation of TD3 and DDPG for OpenAI gym tasks
Python ★ 0 7y ago
copy_blog ⑂
BY Blog ->
HTML ★ 0 7y ago
MARL-Papers ⑂
Paper list of multi-agent reinforcement learning (MARL)
★ 0 8y ago
reinforcement-learning-an-introduction ⑂
Python Implementation of Reinforcement Learning: An Introduction
Python ★ 0 8y ago
Titanic
Predict survival on the Titanic
Jupyter Notebook ★ 0 8y ago
tensorflow-tutorial ⑂
Example TensorFlow codes and Caicloud TensorFlow as a Service dev environment.
Jupyter Notebook ★ 0 9y ago
tensorflow ⑂
Computation using data flow graphs for scalable machine learning
C++ ★ 0 9y ago