Apache Spark is a lightning-fast cluster computing engine suited to big data processing. To learn how to work with it, there is currently a MOOC conducted by UC Berkeley here. However, the course uses a pre-configured VM set up specifically for the MOOC and its lab exercises. I wanted to get a taste of this technology on my personal computer, so I spent two days searching the internet to find out how to install and configure it in a Windows-based environment. Finally, I came up with the following brief steps, which led me to a working instance of Apache Spark.

To install Spark in a Windows-based environment, the following prerequisites should be fulfilled first.

If you are a Python user, install Python 2.6 or above; otherwise this step is not required. If you are not a Python user, you also do not need to set up the Python path as an environment variable.

Download a pre-built Spark binary for Hadoop. I chose Spark release 1.2.1, package type "Pre-built for Hadoop 2.3 or later", from here. Once downloaded, I unzipped the *.tar file to the D drive using WinRAR. (You can unzip it to any drive on your computer.) The benefit of using a pre-built binary is that you will not have to go through the trouble of building the Spark binaries from scratch.
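
For readers who already have Python installed, the version check and the unzip step can also be scripted instead of done by hand with WinRAR. The following is only a minimal sketch: the function names `python_is_supported` and `extract_spark_archive` are made up for illustration, the archive and destination paths are whatever you chose on your own machine, and the code needs Python 2.7 or later even though the post's stated minimum for Spark is 2.6.

```python
import sys
import tarfile


def python_is_supported(version_info=None, minimum=(2, 6)):
    """Return True if the Python version meets the minimum named in the post (2.6)."""
    if version_info is None:
        version_info = sys.version_info
    # Compare only (major, minor), e.g. (2, 7) >= (2, 6).
    return tuple(version_info[:2]) >= tuple(minimum)


def extract_spark_archive(archive_path, dest_dir):
    """Extract a .tar or .tgz archive into dest_dir.

    Returns the sorted top-level entry names, which for a pre-built
    Spark download is normally the single Spark folder.
    Hypothetical helper: not part of Spark or of the original post.
    """
    # "r:*" lets tarfile detect the compression (plain, gzip, bz2) itself.
    with tarfile.open(archive_path, "r:*") as archive:
        names = archive.getnames()
        archive.extractall(dest_dir)
    return sorted({name.split("/")[0] for name in names if name})
```

To use it, call `extract_spark_archive` with the path of the downloaded archive and the target drive or folder (the D drive in my case), then look at the returned folder name to find where Spark was unpacked.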