xgboost4j-spark-scalability
a benchmark to test scalability of xgboost4j-spark and relevant projects
Scala ★ 22 6y ago
KittenWhisker
debugging performance issues for Spark applications
C ★ 9 7y ago
XGBoostExperiments
repo containing XGBoost-based ML project for various purposes
Scala ★ 7 7y ago
HappyHadooping
an automatic tool to deploy Hadoop on EC2
Shell ★ 6 13y ago
HederaInFloodlight
Implementation of Hedera based on Floodlight
Java ★ 3 12y ago
mininet_stuffs
a fat tree topology developed within mininet env
Python ★ 2 12y ago
LoadWeaver
a flexible and lightweight workload generator for Hadoop 1.x
Java ★ 2 12y ago
lerobot-explorer
No description.
Jupyter Notebook ★ 1 5mo ago
spark ⑂
Mirror of Apache Spark
Scala ★ 1 6mo ago
Self-Learning-Notebooks
RLLearning
HTML ★ 1 7y ago
docker-scripts
docker-scripts for daily dev
Shell ★ 1 11y ago
SparkNet ⑂
Distributed Neural Networks for Spark
C++ ★ 1 10y ago
LongTermFairScheduler
LongTermFairScheduler
Java ★ 1 13y ago
Isaac-GR00T ⑂
NVIDIA Isaac GR00T N1.6 - A Foundation Model for Generalist Robots.
★ 0 6mo ago
ray-playground
local ray env for learning
Python ★ 0 5mo ago
learning_ray ⑂
Notebooks for the O'Reilly book "Learning Ray"
★ 0 2y ago
incubator-celeborn ⑂
Apache Celeborn is an elastic and high-performance service for shuffle and spilled data.
Java ★ 0 5mo ago
spark-rapids-jni ⑂
RAPIDS Accelerator JNI For Apache Spark
★ 0 8mo ago
spark-rapids-examples ⑂
A repo for all spark examples using Rapids Accelerator including ETL, ML/DL, etc.
★ 0 9mo ago
spark-rapids ⑂
Spark RAPIDS plugin - accelerate Apache Spark with GPUs
★ 0 8mo ago
spark-jobserver ⑂
REST job server for Spark
Scala ★ 0 12y ago
BigDL ⑂
BigDL: Distributed Deep Learning Library for Apache Spark
Scala ★ 0 7y ago
analytics-zoo ⑂
Distributed Tensorflow, Keras and BigDL on Apache Spark
Jupyter Notebook ★ 0 7y ago
celeborn-website ⑂
Apache Celeborn Site
★ 0 2y ago
gluten ⑂
No description.
Scala ★ 0 2y ago
incubator-uniffle ⑂
Uniffle is a high performance, general purpose Remote Shuffle Service.
★ 0 2y ago
ec2-selector-cli
the cli tool to select ec2 instances based on filters
Rust ★ 0 3y ago
frameless ⑂
Expressive types for Spark.
★ 0 3y ago
velox-intel ⑂
A new C++ vectorized database acceleration library aimed to optimizing query engines and data processing systems.
C++ ★ 0 3y ago
gazelle_plugin ⑂
Native SQL Engine plugin for Spark SQL with vectorized SIMD optimizations.
Scala ★ 0 3y ago
incubator-sedona ⑂
A cluster computing framework for processing large-scale geospatial data
Java ★ 0 3y ago
string_encoder
No description.
Rust ★ 0 3y ago
terraform-aws-eks-node-group ⑂
Terraform module to provision a fully managed AWS EKS Node Group
★ 0 4y ago
iceberg ⑂
Apache Iceberg
Java ★ 0 2y ago
spark-lineage ⑂
Spark SQL listener to record lineage information
★ 0 5y ago
spark-sql-macros ⑂
Spark SQL Macros provides a mechanism similar to Spark User-Defined function registration; with the key enhancement being that custom code gets compiled to equivalent Catalyst Expressions at macro define time.
★ 0 5y ago
how-query-engines-work ⑂
This is the companion repository for the book How Query Engines Work.
★ 0 5y ago
arrow-datafusion ⑂
Apache Arrow DataFusion and Ballista query engines
Rust ★ 0 4y ago
delta ⑂
An open-source storage layer that brings scalable, ACID transactions to Apache Spark™ and big data workloads.
Scala ★ 0 5y ago
xgboost ⑂
Large-scale and Distributed Gradient Boosting (GBDT, GBRT or GBM) Library, on single node, hadoop yarn and more.
C++ ★ 0 5y ago
cockroachdb_playground
some programs to play around cockroachdb
Python ★ 0 5y ago
cockroachdb-todo-apps ⑂
CockroachDB To-Do Apps
★ 0 5y ago
chisel3 ⑂
Chisel 3
Scala ★ 0 7y ago
firrtl ⑂
Flexible Intermediate Representation for RTL
Scala ★ 0 7y ago
noisepage ⑂
Self-Driving Database Management System from Carnegie Mellon University
★ 0 5y ago
rabit ⑂
Reliable Allreduce and Broadcast Interface for distributed machine learning
★ 0 6y ago
tvm ⑂
bring deep learning workloads to bare metal
C++ ★ 0 8y ago
morpheus ⑂
Morpheus brings the leading graph query language, Cypher, onto the leading distributed processing platform, Spark.
Scala ★ 0 7y ago
github-markdown-toc ⑂
Easy TOC creation for GitHub README.md
Shell ★ 0 7y ago
HiBench ⑂
HiBench is a big data benchmark suite.
Java ★ 0 9y ago
dmlc-core ⑂
A common bricks library for building scalable and portable distributed machine learning.
C++ ★ 0 7y ago
pinot ⑂
A realtime distributed OLAP datastore
Java ★ 0 7y ago
TransmogrifAI ⑂
TransmogrifAI (pronounced trăns-mŏgˈrə-fī) is an AutoML library for building modular, reusable, strongly typed machine learning workflows on Spark with minimal hand tuning
Scala ★ 0 7y ago
mlflow ⑂
Open source platform for the machine learning lifecycle
Python ★ 0 7y ago
brainsuck ⑂
A simple optimizing Brainfuck compiler (used as demo for my QCon Beijing 2015 talk)
Brainfuck ★ 0 8y ago
arrow ⑂
Mirror of Apache Arrow
C++ ★ 0 8y ago
parquet-cpp ⑂
Apache Parquet
C++ ★ 0 8y ago
web-data ⑂
The repo to host all the web data including images for documents in dmlc projects.
HTML ★ 0 8y ago
mxnet ⑂
Efficient and Flexible Distributed Deep Learning Framework, for python, R, Julia and more
Python ★ 0 8y ago
ps-lite ⑂
A lightweight parameter server interface
C++ ★ 0 8y ago
tree-lite
fast tree inference
C++ ★ 0 8y ago
TLAPlusCourse
TLAPlusCourseProject
TLA ★ 0 8y ago
TLAPlusFun
No description.
TLA ★ 0 8y ago
parquet-mr ⑂
Mirror of Apache Parquet
Java ★ 0 8y ago
spark-eventhubs ⑂
Achieving Real-time Data Analytics with Spark and EventHubs
Scala ★ 0 8y ago
azure-storage-java ⑂
Microsoft Azure Storage Library for Java
Java ★ 0 8y ago
hive ⑂
Mirror of Apache Hive
Java ★ 0 9y ago
incubator-livy ⑂
Mirror of Apache livy (Incubating)
Scala ★ 0 9y ago
FiloDB ⑂
Distributed. Columnar. Versioned. Streaming. SQL.
Scala ★ 0 9y ago
spark-deep-learning ⑂
Deep Learning Pipelines for Apache Spark
Python ★ 0 9y ago
perf-map-agent ⑂
A java agent to generate method mappings to use with the linux `perf` tool
C ★ 0 9y ago
perfj ⑂
PerfJ is a wrapper of linux perf for java programs.
C ★ 0 9y ago
HDInsightOMS ⑂
No description.
Shell ★ 0 9y ago
ambari ⑂
Mirror of Apache Ambari
Java ★ 0 9y ago
benchm-ml ⑂
A minimal benchmark for scalability, speed and accuracy of commonly used open source implementations (R packages, Python scikit-learn, H2O, xgboost, Spark MLlib etc.) of the top machine learning algorithms for binary classification (random forests, gradient boosted trees, deep neural networks etc.).
R ★ 0 9y ago
WebLoggingGenerator
No description.
Python ★ 0 9y ago
CustomerTests
example code to show
Scala ★ 0 9y ago
dr-elephant ⑂
Performance monitoring and tuning tool for Apache Hadoop
Java ★ 0 9y ago
tpcds-kit ⑂
TPC-DS benchmark kit with some modifications/additions
Smarty ★ 0 10y ago
streaming-benchmarks ⑂
Benchmarks for Low Latency (Streaming) solutions including Apache Storm, Apache Spark, Apache Flink, ...
Java ★ 0 9y ago
tpcds ⑂
Port of TPC-DS dsdgen to Java
Java ★ 0 9y ago
script-actions ⑂
script actions in powershell and bash to install/update new components on HDInsight clusters
Shell ★ 0 9y ago
azure-documentdb-java ⑂
No description.
Java ★ 0 9y ago
tensorframes ⑂
Tensorflow wrapper for DataFrames on Apache Spark
Scala ★ 0 9y ago
test_timestamp
test_timestamp
Scala ★ 0 9y ago
eventhubs-sample-event-producer ⑂
Example showing how events can be generated and pushed to Microsoft Azure Servicebus Eventhubs
Scala ★ 0 9y ago
xgboost_test
integration test for xgboost
Scala ★ 0 9y ago
azure-event-hubs-java ⑂
Java client library for Azure Event Hubs https://azure.microsoft.com/services/event-hubs
Java ★ 0 9y ago
eventhubs-client ⑂
A generic Java client for Microsoft Azure EventHubs
Java ★ 0 9y ago
spark-streaming-data-persistence-examples ⑂
Examples showing how streaming events can be persisted to Azure blob, Hive table and Azure SQL Table through Spark.
Scala ★ 0 9y ago
McGill-COMP535-Fall-2015
COMP 535: Computer Network
Java ★ 0 10y ago
typescript_learn
start learning typescript
TypeScript ★ 0 9y ago
peloton ⑂
The Self-Driving Database Management System
C++ ★ 0 9y ago
astore ⑂
Avro Data Store based on Akka
Scala ★ 0 11y ago
macrobase ⑂
http://macrobase.io/
Java ★ 0 10y ago
SEEP ⑂
No description.
Java ★ 0 10y ago
dmlc.github.io ⑂
the homepage http://dmlc.ml
HTML ★ 0 9y ago
The-Art-Of-Programming-by-July ⑂
This project is the e-book version of July's "The Art of Programming" (《程序员编程艺术》)
C ★ 0 12y ago
mediator ⑂
a medium inspired jekyll theme
HTML ★ 0 10y ago
LearningHaskell
personal repo for learning Haskell
Haskell ★ 0 10y ago
mapdb ⑂
MapDB provides concurrent Maps, Sets and Queues backed by disk storage or off-heap-memory. It is a fast and easy to use embedded Java database engine.
Java ★ 0 11y ago
rocksdb ⑂
A library that provides an embeddable, persistent key-value store for fast storage.
C++ ★ 0 11y ago
maiter
Automatically exported from code.google.com/p/maiter
C++ ★ 0 11y ago
gpudb
Automatically exported from code.google.com/p/gpudb
Python ★ 0 11y ago
LambdaEva
No description.
Scala ★ 0 11y ago
benchmarkingMapDB
a benchmark program for evaluating MapDB performance
Scala ★ 0 11y ago
play-silhouette2-slick-seed
an example on how to use silhouette 2.0 with slick
Scala ★ 0 11y ago
securesocial ⑂
A module that provides OAuth, OAuth2 and OpenID authentication for Play Framework applications
Scala ★ 0 11y ago
scala-js-temp-project
No description.
Scala ★ 0 11y ago
scala-js ⑂
Scala.js, the Scala to JavaScript compiler
Scala ★ 0 11y ago
akka ⑂
Akka Project
Scala ★ 0 11y ago
geohash-java ⑂
Implementation of GeoHashes in java. We try to be/stay compliant to the spec, as far as possible.
Java ★ 0 12y ago
kafka-manager ⑂
A tool for managing Apache Kafka.
Scala ★ 0 11y ago
log_analyzer
No description.
Scala ★ 0 11y ago
akka_benchmark
No description.
Scala ★ 0 11y ago
spark-pr-dashboard ⑂
Dashboard to aid in Spark pull request reviews
★ 0 11y ago
elasticsearch ⑂
Open Source, Distributed, RESTful Search Engine
★ 0 11y ago
NetworkFlowSimulator
No description.
Java ★ 0 12y ago
ractive ⑂
Next-generation DOM manipulation
★ 0 12y ago
SparkTest
some test code for spark development
★ 0 12y ago
hadoop-common ⑂
Mirror of Apache Hadoop common
Java ★ 0 12y ago
spark-ec2 ⑂
Scripts used to setup a Spark cluster on EC2 - my fork
Shell ★ 0 12y ago
floodlight ⑂
Floodlight SDN OpenFlow Controller- my fork
Java ★ 0 13y ago
homepage
my homepage in mcgill university
JavaScript ★ 0 13y ago
spark_based_bbnp
spark_based_bbnp
Scala ★ 0 13y ago
COMP512
No description.
Java ★ 0 12y ago
riplpox ⑂
RipL-POX (Ripcord-Lite for POX): A simple network controller for OpenFlow-based data centers
Python ★ 0 12y ago
pox ⑂
The POX Controller
Python ★ 0 12y ago
SWIM ⑂
Statistical Workload Injector for MapReduce - Project at UC Berkeley AMP Lab
Java ★ 0 14y ago