Building a real-time big data pipeline 6: Spark Core, Hadoop, SBT
Apache Spark is an open-source cluster computing system that provides high-level APIs in Java, Scala, Python, and R. Spark also ships with higher-level libraries for SQL (Spark SQL), machine learning (MLlib), streaming (Spark Streaming), and graph processing (GraphX).