How to Install Apache Spark on Mac
If you work in analytics or Python development, PySpark quickly becomes part of your daily routine, whether you are analyzing datasets, applying machine learning, or using Python in other areas of development, so having its prerequisites set up on your Mac is essential. However, installing Apache Spark on a Mac is not a single-package installation; it requires some preliminary checks and several installation steps. Follow the steps below to install Apache Spark on your Mac:
Checking prerequisites:
Before installing Apache Spark, you need Java, Homebrew, and a few other packages in place for Spark to run properly on your Mac. To begin, let's install Homebrew and then Java (a consolidated command sequence follows the list below):
- Homebrew Installation: Before installing Java, you need to install Homebrew; visit brew.sh, which shows the command to run, and copy and paste it into the terminal: /bin/bash -c "$(curl -fsSL https://raw.githubusercontent.com/Homebrew/install/master/install.sh)"
- Now update Homebrew with the following command: brew update && brew upgrade
- Now we're ready to install Java. First, check which Java version (if any) is already installed on your Mac by typing the following command in the terminal: java -version
- Use this command to install the latest Java 8 package: brew cask install java8, or install the most recent Java release instead with: brew cask install java
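Taken together, the Homebrew and Java steps above look like this in the terminal. The cask names follow this article's Spark 2.4-era setup; on newer Homebrew, brew cask install has been replaced by brew install --cask and the Java casks have been renamed, so if brew reports the cask is unavailable, run brew search java to find a current alternative:

/bin/bash -c "$(curl -fsSL https://raw.githubusercontent.com/Homebrew/install/master/install.sh)"
brew update && brew upgrade
java -version                  # check for an existing Java install first
brew cask install java8        # Java 8, the version commonly paired with Spark 2.4
# or, for the most recent Java release instead:
# brew cask install java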
Install Xcode:
Xcode provides Apple's developer toolchain for macOS, and its command-line tools are needed to build further packages. Install the Xcode command-line tools using: xcode-select --install
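If you are not sure whether the tools are already present, xcode-select can tell you; it prints the active developer directory when they are installed:

xcode-select -p    # prints e.g. /Library/Developer/CommandLineTools when the tools are installed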
Installing Scala and other prerequisite packages:
Run the commands below in the terminal, one at a time:
- Scala Installation: brew install scala
- Apache Spark Installation: brew install apache-spark
- Launch the Spark shell: spark-shell
- To check that the shell is active, type: val s = "hello world"
- Run pyspark to start the PySpark shell (a quick smoke test follows this list)
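As an end-to-end check, you can run a tiny PySpark job with spark-submit; the script path /tmp/spark_smoke_test.py is just an illustrative name:

cat > /tmp/spark_smoke_test.py <<'EOF'
from pyspark.sql import SparkSession

# build a local SparkSession and count a small range of numbers
spark = SparkSession.builder.appName("smoke-test").getOrCreate()
print(spark.range(5).count())  # should print 5
spark.stop()
EOF
spark-submit /tmp/spark_smoke_test.py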
Adding Spark to bash:
Add the following lines to your shell profile, each on its own line, and then reload it. The paths below match the Homebrew layout for Spark 2.4.4; adjust the version number to whatever version brew installed:
nano ~/.bash_profile
export SPARK_HOME=/usr/local/Cellar/apache-spark/2.4.4/libexec
export PYTHONPATH=/usr/local/Cellar/apache-spark/2.4.4/libexec/python/:$PYTHONPATH
source ~/.bash_profile
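After reloading the profile, you can verify that the variables point at a real Spark installation; these checks use only the paths defined above:

echo $SPARK_HOME       # should print the libexec path set above
ls $SPARK_HOME/bin     # should list spark-shell, pyspark, spark-submit, and friends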
Finalize installation:
Now that all the required packages are installed and the environment is set up, you can start the PySpark shell with: pyspark
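Optionally, if the shell startup is too verbose, Spark 2.x reads conf/log4j.properties from $SPARK_HOME; copying the shipped template and lowering the root log level is a common, purely optional tweak:

cd "$SPARK_HOME/conf"
cp log4j.properties.template log4j.properties
# lower the root logger from INFO to WARN to quiet the shell output
sed -i '' 's/log4j.rootCategory=INFO/log4j.rootCategory=WARN/' log4j.properties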
Using Spark:
You can now monitor Spark from your browser at the following addresses (see the note below on when each UI is available):
- Spark Master UI: http://localhost:8080/
- Spark Application UI: http://localhost:4040/
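The Application UI on port 4040 is served only while a shell or job is running, and the Master UI on port 8080 appears only after you start a standalone master. The script names below are the ones shipped with Spark 2.4 in $SPARK_HOME/sbin:

$SPARK_HOME/sbin/start-master.sh                        # serves the Master UI on :8080
# attach a worker; use the spark:// URL shown at the top of the Master UI if localhost does not resolve
$SPARK_HOME/sbin/start-slave.sh spark://localhost:7077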
Conclusion
After running all the commands and installing the prerequisite packages, check the installed versions and confirm that the packages actually work; a quick version check follows below. If packages are still missing, we recommend installing Apache Spark through the Anaconda Python distribution instead. Also note that you should enter the commands line by line rather than pasting them all on one line, since the terminal executes each command as soon as it is entered rather than compiling everything at the end.
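A quick way to run those version checks from the terminal:

java -version
scala -version
spark-submit --version             # prints the Spark version banner
brew list --versions apache-spark  # shows what Homebrew installed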