Apache Spark
Projects with this topic
Notebooks for Pandas, Spark and Python experiments.
The SalesStream dashboard is an application for monitoring and analyzing revenue data in real time. It uses Apache Spark and Apache Kafka to process financial data efficiently and promptly, giving companies up-to-date insight into their revenue streams.
Implementation of Geo-Temporally Weighted Regression (GTWR) using Apache Spark, Spark ML and Apache Sedona.
Implementation of Geographically Weighted Regression (GWR) using Apache Spark, Spark ML and Apache Sedona.
The "Stage Metrics" plugin for Apache Spark, which creates metrics by stage status.
Sentiment analysis using the Spark ML library. Implements classic ML models (SVM, Logistic Regression, Naive Bayes, and Random Forest), embeddings (Word2Vec and TF-IDF), and ensemble and hybrid (ML plus lexicon-based) methods.
Execute Hadoop and Spark applications on the BigData@Polito cluster with a single command
Stack Exchange releases "data dumps" of all its publicly available content roughly every three months via archive.org.
This project is an example and a framework for building ETL for this data with Apache Spark and Java.