Apache Spark is a unified analytics engine for large-scale data processing. It provides high-level APIs in Java, Scala, Python and R, and an optimized engine that supports general execution graphs. It also supports a rich set of higher-level tools including Spark SQL for SQL and structured data processing, pandas API on Spark for pandas workloads, MLlib for machine learning, GraphX for graph processing, and Structured Streaming for incremental computation and stream processing.

Get Spark from the downloads page of the project website. This documentation is for Spark version 3.3.1. Spark uses Hadoop’s client libraries for HDFS and YARN. Downloads are pre-packaged for a handful of popular Hadoop versions. Users can also download a “Hadoop free” binary and run Spark with any Hadoop version.