spark
Projects with this topic
Apache Spark in a master-slave setup that works out of the box with OpenHPC and Slurm.
"Cloud container data analytics, statistical modeling, and machine learning on distributed databases." "A free, open-source alternative to SPSS, SAS, MATLAB, PowerBI, Tableau, and Alteryx." Runs on Linux, Windows, macOS, and in the cloud via containers.
This web app finds the best configuration for a Spark application given the cluster's hardware.
Stack Exchange releases "data dumps" of all its publicly available content roughly every three months via archive.org.
This project is an example and a framework for building ETL for this data with Apache Spark and Java.
This repo presents performance comparisons between a serial, an MPI-based, and a Spark-based implementation of a document clustering algorithm.
Analytics suite for Tor Project metrics data.