Hi there 👋
I'm Derek, a professor at NYU, teaching Machine Learning in Financial Engineering, and the founder of Sov.ai and Vedex.ai.
- Previously a researcher at the University of Cambridge, Oxford University, and the Alan Turing Institute.
- Developing applications to optimize asset management and alpha discovery for large quantitative funds.

<!-- feed start -->
- Jan 09 - Quant Letter: January 2024, Week-2
- Jan 03 - Quant Letter: January 2024, Week-1
- Dec 20 - Quant Letter: December 2023, Week-3
- 👥 Researcher at the @alan-turing-institute
- 📓 Associate member @oxford-man-institute
- 🧭 Founder of the @firmai open-source project.
- 👥 Curator for ML-Quant resources.
- 🦌 GitStars rank of 658 out of 104,530,383.
Packages
Everything listed here is available under the Unlicense.
My research has been used by large institutional banks and quantitative hedge funds (see SSRN).
Firmai - Open Source packages:
- DeltaPy (29,888 Downloads) — First tabular data augmentation package in Python (market data) [code][report]
- PandaPy (43,442 Downloads) — Pandas alternative that mimics ‘Structs’ in the C Language (market data) [code][report]
- AtsPy (113,303 Downloads) — First automated time series package in Python (alternative data) [code][report]
- DataGene (4,246 Downloads) — First package assessing dataset similarity (market data) [code][report]
- MLAM (35,409 Downloads) — First repository for machine learning in asset management (market data) [code][report]
Other packages under this license include MTSS-GAN [code][report], the first multivariate conditional time series generator, FairPut [code][report], a FAIR package using LightGBM, and PandasVault [code], an advanced Pandas repository.
🌟 We Are Growing!
We're seeking to collaborate with motivated, independent PhD graduates or doctoral students on approximately seven new projects in 2024. If you’re interested in contributing to cutting-edge investment insights and data analysis, please get in touch! This could be in collaboration with a university or as independent study.
🚀 About Sov.ai
Sov.ai is at the forefront of integrating advanced machine learning techniques with financial data analysis to revolutionize investment strategies. We are working with three of the top 10 quantitative hedge funds, and with many mid-sized and boutique firms.
Our platform leverages diverse data sources and innovative algorithms to deliver actionable insights that drive smarter investment decisions.
By joining Sov.ai, you'll be part of a dynamic research team dedicated to pushing the boundaries of what's possible in finance through technology. Before expressing your interest, please be aware that the research will be predominantly challenging and experimental in nature.
🔍 Research and Project Opportunities
We offer a wide range of projects that cater to various interests and expertise within machine learning and finance. Some of the exciting recent projects include:
- Predictive Modeling with GitHub Logs: Develop models to predict market trends and investment opportunities using GitHub activity and developer data.
- Satellite Data Analysis: Explore non-traditional data sources such as social media sentiment, satellite imagery, or web traffic to enhance financial forecasting.
- Data Imputation Techniques: Investigate new methods for handling missing or incomplete data to improve the robustness and accuracy of our models.
🌐 Why Join Sov.ai?
- Innovative Environment: Engage with the latest technologies and methodologies in machine learning and finance.
- Collaborative Team: Work alongside a team of experts passionate about driving innovation in investment insights.
- Flexible Projects: Tailor your research to align with your interests and expertise, with the freedom to explore new ideas.
- Experienced Researchers: Experts previously from NYU, Columbia, Oxford-Man Institute, Alan Turing Institute, and Cambridge.
- Post Research: Connect with alumni who have moved on to DRW, Citadel Securities, Virtu Financial, Akuna Capital, and HRT.
🤝 How to Apply
If you’re excited about leveraging your expertise in machine learning and finance to drive impactful research and projects, we’d love to hear from you! Please reach out to us at [[email protected]](mailto:[email protected]) with your resume and a brief description of your research interests.
Join us in shaping the future of investment insights and making a meaningful impact in the world of finance!
<!--- - 👁️ Advisor at ... --->



.. .-- .-. .. - . .- - - .... . .--. .- .-. .-.. --- ..- .-.
- deltapy ★ PINNED
  DeltaPy - Tabular Data Augmentation (by @firmai)
  Jupyter Notebook ★ 557 2y ago
- python-business-analytics ★ PINNED
  Python solutions to solve practical business problems.
  Jupyter Notebook ★ 530 1y ago
- machine-learning-asset-management ★ PINNED
  Machine Learning in Asset Management (by @firmai)
  Jupyter Notebook ★ 1.7k 4y ago
- financial-machine-learning ★ PINNED
  A curated list of practical financial machine learning tools and applications.
  Python ★ 8.7k 1y ago
- pandapy ★ PINNED
  PandaPy has the speed of NumPy and the usability of Pandas 10x to 50x faster (by @firmai)
  Python ★ 547 4y ago
- atspy ★ PINNED
  AtsPy: Automated Time Series Models in Python (by @firmai)
  Python ★ 520 3y ago
- industry-machine-learning
  A curated list of applied machine learning and data science notebooks and libraries across different industries (by @firmai)
  Jupyter Notebook ★ 7.5k 1y ago
- awesome-google-colab
  Google Colaboratory Notebooks and Repositories (by @firmai)
  Jupyter Notebook ★ 1.5k 4y ago
- data-science-career
  Career Resources for Data Science, Machine Learning, Big Data and Business Analytics Career Repository
  ★ 1.0k 1y ago
- business-machine-learning
  A curated list of practical business machine learning (BML) and business data science (BDS) applications for Accounting, Customer, Employee, Legal, Management and Operations (by @firmai)
  Jupyter Notebook ★ 819 1y ago
- pandasvault
  Advanced Pandas Vault — Utilities, Functions and Snippets (by @firmai).
  Python ★ 434 4y ago
- datagene
  DataGene - Identify How Similar TS Datasets Are to One Another (by @firmai)
  Jupyter Notebook ★ 205 4y ago
- interactive-corporate-report
  ICR - Automated and Intelligent Company Report Built in Python (by @firmai)
  Jupyter Notebook ★ 184 3y ago
- business-analytics-and-mathematics-python-book
  Advanced Business Analytics and Mathematics with Python (by @firmai)
  ★ 139 5y ago
- tsgan
  Time-series Generative Adversarial Networks (fork from the ML-AIM research group on bitbucket))
  Python ★ 122 4y ago
- scrapers
  Scrapers from a project in 2018. Yelp, Spyfu, Similarweb, Morningstar, Linkedin, Instagram, Inside, Glassdoor, Facebook, Eat24, Doordash, Angellist.
  Python ★ 97 7y ago
- mtss-gan ▣
  MTSS-GAN: Multivariate Time Series Simulation with Generative Adversarial Networks (by @firmai)
  ★ 92 5y ago
- techniques
  Jupyter Notebook and Python business intelligence tools and techniques. [Raw upload]
  Jupyter Notebook ★ 86 3y ago
- ml-fairness-framework
  FairPut - Machine Learning Fairness Framework with LightGBM — Explainability, Robustness, Fairness (by @firmai)
  Jupyter Notebook ★ 72 4y ago
- research ⑂
  Notebooks based on financial machine learning.
  ★ 61 6y ago
- firmai
  No description.
  ★ 22 8h ago
- machine-learning-for-trading ⑂
  Code for Machine Learning for Algorithmic Trading, 2nd edition.
  ★ 21 3y ago
- business-datasets
  A selection of business datasets
  ★ 18 7y ago
- firmai.github.io
  Open Business Analytics and Data Science Research
  JavaScript ★ 17 5y ago
- business-machine-learning-vendors
  A directory of the top business machine learning vendors
  ★ 16 5y ago
- financial-machine-learning-regulation
  A look at regulatory challenges and recommendation in the age of AI. Investigating topics like monopoly formation, machine learning auditability, bias mitigation strategies and automated regulatory monitoring.
  ★ 14 7y ago
- python-for-finance
  No description.
  Jupyter Notebook ★ 14 7y ago
- DeepLearningForTimeSeriesForecasting ⑂
  A tutorial demonstrating how to implement deep learning models for time series forecasting
  ★ 12 6y ago
- reddit-data-science-project-ideas
  Reddit Data Science Project Ideas
  ★ 11 6y ago
- Datacamp-Courses ⑂
  Some course contents
  ★ 9 6y ago
- Awesome-Quant-Machine-Learning-Trading ⑂
  Quant/Algorithm trading resources with an emphasis on Machine Learning
  ★ 8 4y ago
- simple-machine-learning-glossary
  Simple Machine Learning and Data Science Definitions without Copyright
  ★ 8 6y ago
- mlfinlab ⑂
  MlFinlab helps portfolio managers and traders who want to leverage the power of machine learning by providing reproducible, interpretable, and easy to use tools.
  ★ 8 6y ago
- numfin
  Numpy for Finance Examples
  ★ 7 6y ago
- awesome-quant ⑂
  A curated list of insanely awesome libraries, packages and resources for Quants (Quantitative Finance)
  Python ★ 7 4y ago
- awesome-ai-in-finance ⑂
  🔬 A curated list of awesome machine learning strategies & tools in financial market.
  ★ 7 4y ago
- Quant-Finance-Resources ⑂
  Courses, Articles and many more which can help beginners or professionals.
  ★ 7 4y ago
- tabular-data-generators
  A Collection of Cross-Sectional and Time-Series Generators
  ★ 6 6y ago
- tflm
  Advanced Transformations and Interactions for Linear Models using Hybrid Machine Learning Models and SHapley Additive exPlanations
  Python ★ 6 6y ago
- xaib
  XAIB - Explainable AI in Business
  Jupyter Notebook ★ 6 6y ago
- quant-finance-seminars
  Weekly Quant Finance Seminars
  ★ 6 4y ago
- universal-portfolios ⑂
  Collection of algorithms for online portfolio selection
  Jupyter Notebook ★ 6 7y ago
- awesome-deep-trading ⑂
  List of awesome resources for machine learning-based algorithmic trading
  ★ 5 4y ago
- text_summurization_abstractive_methods ⑂
  Multiple implementations for abstractive text summurization , using google colab
  ★ 5 6y ago
- kaggle_learn ⑂
  Functions used in kaggle competitions: data preprocessing/feature engineering/model training etc. Many of these functions are collected from kaggle community, credits are belong to the authors :)
  ★ 5 6y ago
- functime ⑂
  Time-series machine learning at scale. Built with Polars for embarrassingly parallel feature extraction and forecasts on panel data.
  Python ★ 4 1y ago
- financial-pde-discovery
  Financial PDE Discovery using Machine Learning
  ★ 4 6y ago
- datastat
  Dataset Statistics to Compare Real or Training Data with Generated or Test Data
  ★ 4 6y ago
- random-assets
  No description.
  Jupyter Notebook ★ 4 4y ago
- google-colab-website
  FirmAI Labs - World's First Google Colab Website
  ★ 4 5y ago
- ImputeGAP ⑂
  ImputeGAP: A library of Imputation Techniques for Time Series Data
  Jupyter Notebook ★ 3 8mo ago
- ffood
  FFOOD - Framework for Feature and Observation Outlier Detection using ML-based Residual Analysis Methods
  Python ★ 3 6y ago
- firmai_analytics
  Website
  HTML ★ 3 4y ago
- experimental-statistics
  A repository of experimental statistical techniques that improve on "well-accepted" solutions.
  ★ 3 7y ago
- teoxoy ⑂
  github profile readme
  ★ 2 9h ago
- AIF360 ⑂
  A comprehensive set of fairness metrics for datasets and machine learning models, explanations for these metrics, and algorithms to mitigate bias in datasets and models.
  Jupyter Notebook ★ 2 1y ago
- random-assets-two
  No description.
  HTML ★ 2 8mo ago
- SovPy
  No description.
  Python ★ 2 3y ago
- citieslover-public
  Allows users to scrape select web pages and upload them to S3 bucket.
  ★ 2 1y ago
- art-of-readme ⑂
  :love_letter: Things I've learned about writing good READMEs.
  ★ 2 5y ago
- edgar10q-dataset ⑂
  Dataset introduced in the paper "Zero-Shot Open Information Extraction using Question Generation and Reading Comprehension"
  ★ 2 3y ago
- numfy
  Fast Vectorised NumPy Functions for Finance
  ★ 2 6y ago
- demo-ml-ai-invest
  No description.
  ★ 2 4y ago
- CompanyLogs ⑂
  Introduction to Data Science course project
  ★ 1 7y ago
- helper-functions
  Helper Functions
  Python ★ 1 9mo ago
- profitable_free_apps ⑂
  Analyzing free apps in the Google Play store and the App store to see which kind would be profitable for a company.
  ★ 1 6y ago
- ShapEng
  Shapley Feature Engineering
  Jupyter Notebook ★ 1 6y ago
- Satellite-Image-Time-Series-Datasets ⑂
  This page presents a list of satellite imagery datasets with a temporal dimension, mainly satellite image time series (SITS) and satellite videos, for various computer vision and deep learning tasks. It covers multi-temporal datasets with more than two acquisitions but not bi-temporal datasets.
  ★ 1 1y ago
- deepscatter ⑂
  Zoomable, animated scatterplots in the browser that scales over a billion points
  ★ 1 1y ago
- penglab ⑂
  Abuse of Google Colab for fun and profit. 🐧
  ★ 1 6y ago
- ts-datapoint-selection-vis ⑂
  Data Point Selection for Line Chart Visualization: analysis notebooks and implementation details
  ★ 1 3y ago
- office-location-project ⑂
  This application, developed in Python, helps you to choose the right place for open a new office, based on the preferences of the company's workers.
  ★ 1 6y ago
- Multicriteria-Portfolio-Construction-with-Python ⑂
  Source code for Multicriteria Portfolio Construction with Python
  ★ 1 5y ago
- fastpages ⑂
  An easy to use blogging platform, with enhanced support for Jupyter Notebooks.
  ★ 1 6y ago
- plotsfinml
  No description.
  HTML ★ 1 4y ago
- sov.ai
  AI Asset Management Research
  ★ 1 4y ago
- claytonjhamilton ⑂
  My profile README repo. Share some love with a :star:!
  ★ 1 5y ago
- iampavangandhi ⑂
  Hey 👋, Glad to see you here! Check out this repository to learn more about me 🤓. You can also use it to make your awesome GitHub README ✨ (Don't Just Fork, Star Too 😅)
  ★ 1 5y ago
- Business-Analysis-of-a-Clothing-Brand ⑂
  Analyzing the Customer database of a clothing brand and finding relation between dependent parameters and amount spent. Also analyzing, which area the company should focus on.
  ★ 1 6y ago
- balance ⑂
  Balancing Algorithms for Stochastic Inventory Control
  ★ 1 8y ago
- DILATE ⑂
  Code for our NeurIPS 2019 paper "Shape and Time Distortion Loss for Training Deep Time Series Forecasting Models"
  ★ 1 6y ago
- tsfresh ⑂
  Automatic extraction of relevant features adapted for DeltaPy
  Jupyter Notebook ★ 1 6y ago
- contributor
  Medium Contributor Guidelines
  ★ 1 6y ago
- fairdata
  A Python package that implements model-agnostic pre-and post-processing to mitigate unfairness in machine learning prediction
  ★ 1 6y ago
- bit
  Forked template from Christoph Molnar, testing out website integration
  HTML ★ 1 7y ago
- admin
  No description.
  ★ 1 7y ago
- Fin-Fact ⑂
  A Benchmark Dataset for Multimodal Scientific Fact Checking
  ★ 0 1y ago
- Corporate-Defaults---ML-Models ⑂
  One-year PD models that combine structural and ML methods in Python
  ★ 0 5y ago
- devjobs ⑂
  Backend for DevJobs. This is where the magic happens.
  ★ 0 1y ago
- mochiday ⑂
  🎬 Open-source solution to find the latest SWE jobs that are easy to apply
  ★ 0 1y ago
- alphalens-reloaded ⑂
  Performance analysis of predictive (alpha) stock factors
  ★ 0 2y ago
- test-repo
  No description.
  ★ 0 2y ago
- scraper_senate-lobbying-disclosures ⑂
  No description.
  ★ 0 2y ago
- MFLES ⑂
  No description.
  ★ 0 2y ago
- docs
  No description.
  MDX ★ 0 2y ago
- video
  No description.
  ★ 0 4y ago
- Vue2BaremetricsCalendar ⑂
  No description.
  ★ 0 5y ago
- github-badges ⑂
  Star / Fork badges for your GitHub Repository!
  ★ 0 6y ago
- xcaq-1 ⑂
  No description.
  ★ 0 6y ago
- xcaq
  My personal repo.
  ★ 0 5y ago
- gargakshit ⑂
  My GitHub README 📜 updated automatically using actions ⚡!
  ★ 0 5y ago
- traffic2badge ⑂
  Traffic to badge action usage template. Use repositories Insights/traffic data to generate badges that include views and clones.
  ★ 0 5y ago
- novatorem ⑂
  Dynamic realtime profile ReadMe linked with spotify
  Python ★ 0 5y ago
- filipporeds ⑂
  ✨ uwa, special ✨
  ★ 0 5y ago
- guilyx ⑂
  A readme with github actions generated statistics
  ★ 0 5y ago
- martonlederer ⑂
  No description.
  ★ 0 5y ago
- anmol098 ⑂
  If you are forking please do not forget to star the repo
  ★ 0 5y ago
- MacroPower ⑂
  Profile with dynamic realtime coding stats. Please star if you like it!
  ★ 0 5y ago
- kittinan ⑂
  No description.
  ★ 0 5y ago
- rusty-sj ⑂
  GitHub Profile README.md
  ★ 0 5y ago
- abhisheknaiidu ⑂
  👀
  ★ 0 5y ago
- the-parlour
  Long Articles
  ★ 0 5y ago
- snowde.github.io ⑂
  No description.
  ★ 0 5y ago
- oostindische
  No description.
  HTML ★ 0 5y ago
- oostindische-dev
  No description.
  HTML ★ 0 3y ago
- private
  No description.
  Jupyter Notebook ★ 0 6y ago
- fairxgb
  FairXGB - Fair eXplainable Gradient Boosting Method
  ★ 0 6y ago
- vault
  No description.
  Jupyter Notebook ★ 0 3y ago
- taib
  TAIB - Trustable AI in Business
  ★ 0 6y ago
- ibaa ⑂
  A public available dataset for using market sentiment for financial asset allocation.
  ★ 0 7y ago
- Zindi_Challenge_Sendy_Logistics ⑂
  Zindi Challenge for Sendy Logistics Company in Nairobi Final Submissions
  ★ 0 6y ago
- Budget-Optimization-in-Ecommerce-using-Market-Mix-Modelling ⑂
  To create a market mix model for ElecKart (an e-commerce firm from Ontario, Canada) for several products categories - to observe the actual impact of various marketing variables over the past and recommend the optimal budget allocation for different marketing levers for the next year. Built several Linear Regression models like Additive, Multiplicative, Koyck & Distributive Lag to identify the important KPIs that influence the company revenue and their contributions towards the revenue. The main data set is available below:
  ★ 0 6y ago
- finding_best_markets_advertise ⑂
  Theoretical Scenario: Finding the best markets to advertise in as an e-learning company offering courses in programming.
  ★ 0 6y ago
- CompaniesGeolocation ⑂
  To place the 'new company offices' in the best place for the company to grow.
  ★ 0 6y ago
- walter-p-moore-data-challenge ⑂
  This is the data challenge held by Walter P Moore, which an international company providing engineering services. The data is their detailed project information in time series manner and the goal is to predict the profitability of a project
  ★ 0 6y ago
- mentalhealthprediction ⑂
  I cleaned over 125 columns and curated final 25 relevant columns, for the model used XG Boost + Hyperparameter + Standard Scaler to classify and predict mental health at work. The dataset were real survey from apprx 1000 workers in Tech Companies.
  ★ 0 6y ago
- Northwind-Company-Findings ⑂
  No description.
  ★ 0 6y ago