I created pandas and co-created Apache Arrow and Ibis.
Founder at Kenn Software · Part-time: GP at Composed Ventures, AI and Python at Posit · Apache Member · Blog · Previously: Two Sigma, Cloudera, DataPad
What I'm working on
| Project | Description |
| --- | --- |
| roborev | Continuous code review for AI coding agents. Runs in the background, reviews every commit as agents write, and surfaces issues in seconds, before they compound. |
| middleman | Local-first GitHub dashboard for maintainers to triage, review, and merge PRs and issues across repos. |
| agentsview | Local coding agent session viewer for Claude, Codex, and Gemini with analytics dashboard and full text search. agentsview usage is also a 100x faster replacement for ccusage. |
| msgvault | Archive a lifetime of email and chat locally. Full Gmail backup with search, DuckDB-powered analytics, an interactive TUI, and an MCP server for AI queries, all entirely offline. |
| moneyflow | Personal finance data interface for power users, supporting backends like Monarch Money and YNAB. |
| Spicy Takes | 20+ prolific tech writers (Paul Graham, Martin Fowler, and others) analyzed by LLMs. Every post gets a TL;DR, quotations, and a spiciness rating. |
| VibePulse | Simple macOS menubar app to monitor your Claude Code and Codex token consumption. |
| kata | Local-first issue tracking for AI-assisted software work, with an agent-friendly CLI and human-facing TUI. |
Major past projects (created or less active now)
| Project | Role | Description |
| --- | --- | --- |
| Positron | Contributor | A next-generation data science IDE built on VS Code, supporting Python and R. |
| pandas | Creator | The most widely used data analysis library in Python. |
| Apache Arrow | Co-creator | Language-independent columnar memory format for analytics. |
| Ibis | Creator | Portable Python dataframe API for any backend. |
---
My book Python for Data Analysis is the most widely used introduction to the Python data stack -- pandas, NumPy, IPython, Jupyter. Now in its 3rd edition.
I support open-source data science through NumFOCUS and donate at least $1000/year.
arrow ★ PINNED ⑂
Mirror of Apache Arrow
C++ ★ 8 1y ago
pydata-book
Materials and IPython notebooks for "Python for Data Analysis" by Wes McKinney, published by O'Reilly Media
Jupyter Notebook ★ 25k 8mo ago
feather
Feather: fast, interoperable binary data frame storage for Python, R, and more powered by Apache Arrow
JavaScript ★ 2.8k 6mo ago
pandas2 ▣
Design documents and code for the pandas 2.0 effort.
Python ★ 305 7y ago
moneyflow
Moneyflow: Personal Finance Data Interface for Power Users (supporting backends like Monarch Money, YNAB)
Python ★ 257 5d ago
vbench
vbench: A tool for benchmarking your code through time, for showing performance improvement or regressions
Python ★ 244 8y ago
pandas ⑂
Flexible and powerful data analysis / manipulation library for Python, providing labeled data structures similar to R data.frame objects, statistical functions, and much more
Python ★ 148 7y ago
archived-agent-session-viewer ▣
Browse, search, and revisit your AI coding sessions.
Python ★ 88 4mo ago
vibepulse
Simple macOS menubar app to monitor your Claude Code and Codex token consumption using ccusage
Swift ★ 50 10h ago
vldb-2019-apache-arrow-workshop
Materials for Apache Arrow workshop at VLDB 2019
Jupyter Notebook ★ 41 6y ago
statlib
Bayesian State Space and Dynamic Models
Python ★ 34 14y ago
spicytakes.org
No description.
Python ★ 21 1h ago
dataframe-protocol
A Python object protocol for projects to interchange data frame-like data without forcing pandas.DataFrame as the intermediary
★ 15 6y ago
strata-sj-2015
Materials for PyData at Strata/Hadoop World San Jose 2015
Python ★ 11 11y ago
llm-arithmetic-benchmark
No description.
Python ★ 11 7mo ago
ipython ⑂
Official IPython repository
Python ★ 10 11y ago
bitmaps-vs-sentinels
No description.
Jupyter Notebook ★ 9 7y ago
statsmodels ⑂
main repo of statsmodels
Python ★ 6 14y ago
numpy ⑂
Numpy main repository
C ★ 5 15y ago
wesm
No description.
★ 4 1mo ago
Slides-SciPyConf-2018 ⑂
A repository for public storage of slides given at the 17th Python in Science Conferences (2018)
Jupyter Notebook ★ 4 8y ago
mapd-core ⑂
The MapD Core database
C++ ★ 4 8y ago
arrow-plasma-object-store
Plasma Object Store code for proposed import to Apache Arrow
C++ ★ 4 9y ago
scikit-learn ⑂
scikit-learn main repo
C ★ 4 15y ago
zipline ⑂
Zipline, a Pythonic Algorithmic Trading Library
Python ★ 4 13y ago
read-table
Working on IO utilities for loading structured data into Python
★ 4 15y ago
freshell ⑂
No description.
★ 3 4mo ago
jira-wrangle
Convert JIRA dump into something more analyzable
Python ★ 3 12y ago
crossbow
No description.
★ 3 7y ago
r_vs_py ⑂
Simple comparison of Python and R for a basic OLS analysis
Python ★ 3 15y ago
gmail-backup ⑂
A Python script to download all your mail from Gmail to your local hard drive.
Python ★ 3 12y ago
pymc ⑂
Bayesian inference in Python
FORTRAN ★ 3 15y ago
tokyo ⑂
A Cython wrapper to BLAS and LAPACK
Python ★ 3 15y ago
fye_2010
No description.
★ 3 15y ago
pyodbc ⑂
Python ODBC bridge
C++ ★ 3 14y ago
transcript-playbook
No description.
Python ★ 2 5mo ago
argh
No description.
Python ★ 2 1y ago
velox ⑂
A new C++ vectorized database acceleration library aimed at optimizing query engines and data processing systems.
★ 2 4y ago
kudu ⑂
Mirror of Apache Kudu
★ 2 6y ago
conbench ⑂
General purpose, language-agnostic Continuous Benchmarking (CB) framework
★ 2 6y ago
fastparquet ⑂
python implementation of the parquet columnar file format.
Python ★ 2 7y ago
arrow-io-test
Continuous integration for the trickier bits in Apache Arrow
Shell ★ 2 9y ago
tidy-data ⑂
A paper on data tidying
R ★ 2 15y ago
textreader ⑂
Yet another text file reader for numpy.
C ★ 2 14y ago
nose-ipdb ⑂
A nose plugin to use iPDB instead of PDB
Python ★ 2 14y ago
tkmx-client ⑂
No description.
★ 1 2mo ago
infrastructure-puppet ⑂
Apache Infrastructure Puppet
★ 1 6y ago
pymapd ⑂
No description.
Python ★ 1 8y ago
arrow-rs ⑂
Official Rust implementation of Apache Arrow
Rust ★ 1 5y ago
orc-feedstock ⑂
A conda-smithy repository for orc.
Shell ★ 1 7y ago
turbodbc ⑂
Turbodbc is a Python module to access relational databases via the Open Database Connectivity (ODBC) interface. The module complies with the Python Database API Specification 2.0.
C++ ★ 1 6y ago
arrow-site ⑂
Mirror of Apache Arrow site
Python ★ 1 5y ago
arrow-activity
No description.
Python ★ 1 1y ago
pandas-governance ⑂
Project governance documents for the pandas Project
★ 1 9y ago
PyTables ⑂
PyTables is a package for managing hierarchical datasets, designed to cope efficiently and easily with extremely large amounts of data. This is a git-svn clone of the Pro version recently released under a BSD-flavored license by Francesc Alted!
Python ★ 1 15y ago
grpc-cpp-feedstock ⑂
A conda-smithy repository for grpc-cpp.
Shell ★ 1 7y ago
distributed ⑂
A distributed task scheduler for Dask
Python ★ 1 7y ago
tmp-parquet-merge
No description.
C++ ★ 1 7y ago
dedupe ⑂
A free Python library for accurate and scalable deduplication and entity resolution.
C ★ 1 13y ago
charlton
Describing statistical models in Python
Python ★ 1 15y ago
pyarrow-windows-wheels
No description.
Batchfile ★ 1 9y ago
ibis ⑂
Productivity-centric Python big data framework for high performance at Hadoop-scale, with first-class integration with Impala. Co-founded by the creator of pandas
Python ★ 1 9y ago
datarray ⑂
Prototyping numpy arrays with named axes for data management. Docs are available at URL below
Python ★ 1 15y ago
cyhello
Minimal Cython project
★ 1 14y ago
scipy_proceedings ⑂
Tools used to generate the SciPy conference proceedings
Python ★ 1 14y ago
core ⑂
there are failing tests. please find any bugs you may have introduced, fix and submit.
C ★ 1 14y ago
pymaging ⑂
Pure Python imaging library with Python 2.6, 2.7, 3.1 and 3.2 support
Python ★ 1 14y ago
scipy ⑂
Scipy main repository
C ★ 1 14y ago
yasnippets-latex ⑂
LaTeX snippets for use with the yasnippet Emacs plugin
★ 1 15y ago
maclocal-api ⑂
'afm' command cli: macOS server and single prompt mode that exposes Apple's Foundation and MLX Models and other APIs running on your Mac through a single aggregated OpenAI-compatible API endpoint. Supports Apple Vision and single command (non-server) inference with piping as well. Now with Web Browser and local AI API aggregator.
Swift ★ 0 5d ago
benchmarks ⑂
Language-independent Continuous Benchmarking (CB) for Apache Arrow
★ 0 9d ago
arrowbench ⑂
R package for benchmarking
★ 0 9d ago
arrow-benchmarks-ci ⑂
Benchmarks CI for Apache Arrow project
★ 0 8d ago
conbench-tmp
Temporary Conbench new-vision documentation preview
HTML ★ 0 13d ago
claudechic ⑂
A stylish terminal UI for Claude Code
Python ★ 0 4mo ago
video-assets
Video assets for GitHub README embedding
★ 0 5mo ago
earl-uk-2025
No description.
HTML ★ 0 8mo ago
monarchmoney ⑂
Python API for Monarch Money
★ 0 1y ago
arrow-testing ⑂
Auxiliary testing files for Apache Arrow
★ 0 6y ago
duckdb ⑂
DuckDB is an embeddable SQL OLAP Database Management System
C++ ★ 0 4y ago
arrow-datafusion ⑂
Apache Arrow DataFusion and Ballista query engines
★ 0 5y ago
flatbuffers ⑂
FlatBuffers: Memory Efficient Serialization Library
C++ ★ 0 6y ago
dataframe_spec ⑂
No description.
★ 0 6y ago
xxHash ⑂
Extremely fast non-cryptographic hash algorithm
★ 0 6y ago
pelican-bootstrap3 ⑂
Bootstrap 3 theme for Pelican
CSS ★ 0 4y ago
hello
No description.
★ 0 11y ago
cyavro ⑂
Cython based wrapper for libavro
Python ★ 0 10y ago
hpat ⑂
No description.
Python ★ 0 8y ago
libprotobuf-feedstock ⑂
A conda-smithy repository for libprotobuf.
Shell ★ 0 7y ago
benchmark-feedstock ⑂
A conda-smithy repository for benchmark.
Shell ★ 0 7y ago
benchmark ⑂
A microbenchmark support library
C++ ★ 0 7y ago
zstd-feedstock ⑂
A conda-smithy repository for zstd.
Shell ★ 0 7y ago
c-ares-feedstock ⑂
A conda-smithy repository for c-ares.
Shell ★ 0 7y ago
libgdf ⑂
C GPU Dataframe Library
Cuda ★ 0 8y ago
re2-feedstock ⑂
A conda-smithy repository for re2.
Shell ★ 0 7y ago
parquet-cpp ⑂
Mirror of Apache Parquet
C++ ★ 0 7y ago
gtest-feedstock ⑂
A conda-smithy repository for gtest.
Shell ★ 0 8y ago
arrow-dist ⑂
Apache Arrow
Ruby ★ 0 8y ago
parquet-format ⑂
Mirror of Apache Parquet
Java ★ 0 8y ago
setuptools_scm ⑂
the blessed package to manage your versions by scm tags
Python ★ 0 8y ago
arrow-1 ⑂
Graphistry's TypeScript implementation of the Apache Arrow columnar data format
TypeScript ★ 0 9y ago
libhdfs3-downstream
Downstream copy of libhdfs3 for simpler packaging in conda-forge. Please submit changes to https://github.com/apache/incubator-hawq
C++ ★ 0 9y ago
lz4-c-feedstock ⑂
A conda-smithy repository for lz4-c.
Shell ★ 0 9y ago
toolchain-build
No description.
★ 0 9y ago
jemalloc-feedstock ⑂
A conda-smithy repository for jemalloc.
Shell ★ 0 9y ago
turbodbc-feedstock ⑂
A conda-smithy repository for turbodbc.
Shell ★ 0 9y ago
protobuf-feedstock ⑂
A conda-smithy repository for protobuf.
Shell ★ 0 9y ago
gflags-feedstock ⑂
A conda-smithy repository for gflags.
Shell ★ 0 9y ago
dask ⑂
Versatile parallel programming with task scheduling
Python ★ 0 9y ago
libndtypes2 ⑂
Datashape C library
C ★ 0 9y ago
brotli-feedstock ⑂
A conda-smithy repository for brotli.
Shell ★ 0 8y ago
spark ⑂
Mirror of Apache Spark
Scala ★ 0 9y ago
ibis-framework-feedstock ⑂
A conda-smithy repository for ibis-framework.
Shell ★ 0 9y ago
impyla ⑂
Pure Python client for Impala & Hive using HiveServer2
Python ★ 0 9y ago
arrow-cpp-feedstock ⑂
A conda-smithy repository for arrow-cpp.
Shell ★ 0 8y ago
pyarrow-feedstock ⑂
A conda-smithy repository for pyarrow.
Shell ★ 0 8y ago
bottleneck-feedstock ⑂
A conda-smithy repository for bottleneck.
Shell ★ 0 9y ago
parquet-cpp-feedstock ⑂
A conda-smithy repository for parquet-cpp.
Shell ★ 0 8y ago
feather-format-feedstock ⑂
A conda-smithy repository for feather-format.
Python ★ 0 9y ago
impyla-feedstock ⑂
A conda-smithy repository for impyla.
Python ★ 0 9y ago
zlib-feedstock ⑂
A conda-smithy repository for zlib.
Python ★ 0 10y ago
snappy-feedstock ⑂
A conda-smithy repository for snappy.
Python ★ 0 9y ago
boost-feedstock ⑂
A conda-smithy repository for boost.
Python ★ 0 10y ago
staged-recipes ⑂
A place to submit conda recipes before they become fully fledged conda-forge feedstocks
Python ★ 0 8y ago
conda ⑂
OS-agnostic, system-level binary package manager and ecosystem
Python ★ 0 10y ago
grin ⑂
A grep program configured the way I like it.
Python ★ 0 10y ago
native-toolchain ⑂
No description.
Shell ★ 0 10y ago
avro ⑂
Mirror of Apache Avro
Java ★ 0 10y ago
cysqlite3 ⑂
No description.
Python ★ 0 10y ago
bootswatch ⑂
Themes for Bootstrap
HTML ★ 0 10y ago
pytest-ipdb ⑂
Provides ipdb on failures for py.test.
Python ★ 0 11y ago
pelican-octopress-theme ⑂
Octopress default theme copied for pelican
CSS ★ 0 11y ago
hdfs ⑂
API and command line interface for HDFS
Python ★ 0 11y ago
nose ⑂
nose is nicer testing for python
Python ★ 0 12y ago
drawarray
No description.
★ 0 13y ago
DVL ⑂
Dynamic Visualization LEGO
JavaScript ★ 0 13y ago