Hi, I'm Asim
AI Research Engineer based in Cape Town, South Africa 🇿🇦
I specialize in Multi-Agent Reinforcement Learning and LLM Agents & Engineering
Currently at InstaDeep working on MARL research, where I focus on combining Contrastive Goal-Conditioned Reinforcement Learning and Unsupervised Environment Design (UED) in multi-agent settings.
🎓 MSc in AI from University of Cape Town & AIMS South Africa (Google DeepMind Scholar)
---
🔬 What I Work On
- Multi-Agent RL — Contrastive learning, goal-conditioned RL, and curriculum strategies in JAX
- LLM Agents — Autonomous agents for ML engineering, scientific discovery, and code generation
- Inference-Time Scaling — Making open-source LLMs competitive with proprietary models
- LLM Engineering — Fine-tuning, RLHF (PPO/GRPO/DPO), vLLM serving, distributed training
Skills
I'm good with: Python JAX/Flax PyTorch vLLM HuggingFace TRL Unsloth LangGraph/LangSmith TPU/GPU



ITS-bench ★ PINNED ⑂
Benchmarking inference-time scaling strategies on MLE-bench, which measures how well AI agents perform at machine learning engineering
Python ★ 0 1y ago
Arabic-to-Swahili-Machine-Translation ★ PINNED
Graduation Project
Jupyter Notebook ★ 0 2y ago
Bayesian-Deep-Active-Learning ★ PINNED
Active Learning experiments using Bayesian neural networks (BNNs)
Jupyter Notebook ★ 0 1y ago
tunix-jax-llms ★ PINNED
No description.
Python ★ 0 8mo ago
aide-agent ★ PINNED
Automatic tree-search, LLM-based agent
Python ★ 0 8mo ago
Comparing-Uncertainty-Methods-Active-Learning
Using Information theory to quantify uncertainty in deep learning models
Python ★ 1 1y ago
IndabaX-2022
The content of the Introductory sessions at IndabaX
Jupyter Notebook ★ 1 3y ago
asimawad.github.io
Personal portfolio - Asim Osman
HTML ★ 0 3d ago
Contrastive-PPO
No description.
Python ★ 0 3d ago
JaxGCRL-marl ⑂
Online Goal-Conditioned Reinforcement Learning in JAX. ICLR 2025 Spotlight.
Python ★ 0 1mo ago
Asimawad
No description.
★ 0 2mo ago
Recommender-Sys-Engine-with-Collaborative-Filtering
This project presents the development of a collaborative filtering-based recommender system using Alternating Least Squares (ALS), with latent factor embeddings for users and movies alongside their biases
Jupyter Notebook ★ 0 4mo ago
movie-recommender-api
Movie Recommender API - FastAPI + Collaborative Filtering
Python ★ 0 4mo ago
DUKKAN
No description.
Java ★ 0 4mo ago
Assembly-Code
No description.
Assembly ★ 0 4mo ago
Computer-Graphics-code-CPP
No description.
C++ ★ 0 4mo ago
Cryptography-and-Security-Technology
Python implementation of the SHA, AES, and Vigenère security algorithms
Python ★ 0 4mo ago
Attempt-at-Feature-Visualization-NMA
Neuromatch-Academy-Project-Code
Jupyter Notebook ★ 0 4mo ago
web-newsletter
No description.
★ 0 4mo ago
My-progress
No description.
Jupyter Notebook ★ 0 4mo ago
Web-Development
Web Development projects
HTML ★ 0 2y ago
SemEval2026-task9 ⑂
No description.
★ 0 8mo ago
VinePPO
This repository contains an experimental implementation of **Fine-Grained Credit Assignment for RL Training (CAL)** on top of Google's Tunix framework. This research explores token-level reward assignment to improve training stability and sample efficiency in reinforcement learning for large language models.
Python ★ 0 7mo ago
JaxMARL-crl ⑂
Multi-Agent Reinforcement Learning with JAX
★ 0 2mo ago
llm-workshop-aims-2025 ⑂
Materials and notes for the Science and Engineering of Large Language Models workshop at AIMS South Africa, Cape Town, 2025.
★ 0 7mo ago
Mava-ssl ⑂
🦁 A research-friendly codebase for fast experimentation of multi-agent reinforcement learning in JAX
★ 0 7mo ago
gc-marl-ued ⑂
No description.
★ 0 5mo ago
minimax-ued ⑂
Efficient baselines for autocurricula in JAX.
★ 0 1y ago
nanochat ⑂
The best ChatGPT that $100 can buy.
★ 0 8mo ago
HighPerfLLMs ⑂
No description.
★ 0 1y ago
DeepSpeed ⑂
DeepSpeed is a deep learning optimization library that makes distributed training and inference easy, efficient, and effective.
★ 0 10mo ago
indaba-pracs-2025 ⑂
No description.
★ 0 10mo ago
jax ⑂
Composable transformations of Python+NumPy programs: differentiate, vectorize, JIT to GPU/TPU, and more
★ 0 10mo ago
Stoix ⑂
🏛️ A research-friendly codebase for fast experimentation of single-agent reinforcement learning in JAX • End-to-End JAX RL
★ 0 10mo ago
VinePPO-Starters-Pack ⑂
No description.
★ 0 10mo ago
ITS-Thesis ⑂
Thesis Essay
TeX ★ 0 1y ago
aide-ds ⑂
AIDE: the Machine Learning CodeGen Agent
Python ★ 0 1y ago
Reinforcement-Learning-Cartpole-DQN
The implementation incorporates various critical reinforcement learning techniques, including Q-learning, epsilon-greedy exploration, and neural network-based function approximation.
Jupyter Notebook ★ 0 1y ago
NeuroSimInf ⑂
Course repository for Simulation and Inference for Neuroscience
Jupyter Notebook ★ 0 1y ago
XOR-Neural-Network-from-Scratch
This project demonstrates the step-by-step implementation of a neural network from scratch using Python, covering data preparation, core functions, training, and visualization.
Jupyter Notebook ★ 0 1y ago
Machine-Learning-Overtraining-
This project explores binary classification methods using decision stumps and decision trees, focusing on their performance, overtraining tendencies
Jupyter Notebook ★ 0 1y ago
Machine-Learning-Calorimeter-Showers-Classifier
The challenge is to build a classifier that distinguishes electron showers (signal) from hadron showers (background) based on their depth and width.
Jupyter Notebook ★ 0 1y ago
Reinforcement-Learning-LunarLander-REINFORCE
This project demonstrates the application of the REINFORCE algorithm, a policy gradient method, to solve the LunarLander-v2 environment using reinforcement learning.
Jupyter Notebook ★ 0 1y ago
REINFORCE-Algorithm-CartPole
This project implements the REINFORCE algorithm to solve the CartPole-v1 environment. The algorithm uses policy gradients to optimize the agent’s performance by reinforcing actions that lead to higher returns.
Jupyter Notebook ★ 0 1y ago
Reinforcement-Learning-LunarLander-DQN
No description.
Jupyter Notebook ★ 0 1y ago
Vehicle-Re-Identification-Using-the-VeRi-Dataset
This project aims to develop and evaluate a vehicle re-identification system using the VeRi dataset. Leveraging a pre-trained ResNet50 architecture and triplet loss, we designed a system to learn embeddings for vehicle images. Two triplet mining strategies—random and semi-hard
Jupyter Notebook ★ 0 1y ago
Fully-Convolutional-Networks-for-Image-Denoising
This project demonstrates the implementation of a fully convolutional network for image denoising using the color version of the LFWcrop dataset. It aims to reconstruct clean images from their noisy counterparts by leveraging a small yet effective autoencoder architecture.
Jupyter Notebook ★ 0 1y ago
Computer-Vision-Transfer-Learning
Training a classification model for CIFAR-10 with transfer learning, and investigating model focus with saliency maps
Jupyter Notebook ★ 0 1y ago
Active-Learning-Core-Concepts
This project explores the principles of active learning through the Forest Covertype Dataset, using various query strategies to demonstrate its impact on model performance and data efficiency.
Jupyter Notebook ★ 0 1y ago
Computer-Vision-Basics-CIFAR10
Implementing and evaluating various Convolutional Neural Network (CNN) architectures to classify images from the CIFAR-10 dataset. Through systematic experimentation, we explore the effects of model complexity, regularization techniques, and data augmentation on classification accuracy.
Jupyter Notebook ★ 0 1y ago
COVID19NPISecondWave ⑂
No description.
★ 0 4y ago
The-coupon-collector
No description.
Python ★ 0 1y ago
Field-Monitoring
Using satellite imagery and computer vision, farmers can map fields, track crop health, and receive actionable recommendations via a user-friendly dashboard
Python ★ 0 2y ago
Crop-Monitoring
Project on crop monitoring-Amundata
Python ★ 0 2y ago
agriAI ⑂
Using satellite imagery and computer vision, farmers can map fields, track crop health, and receive actionable recommendations via a user-friendly dashboard.
★ 0 2y ago
Competitve-Programming
Competitive programming practice in Python
HTML ★ 0 2y ago