Hi@akhtar,
To integrate Hadoop with Spark, you first need to configure a cluster. After that, make sure the following software is installed on your system:
- Python
- Apache Spark
- findspark library
- NumPy
- Jupyter
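If Python and Apache Spark are already set up, the remaining Python packages can usually be installed with pip (this assumes a standard pip-based environment; adjust to conda or your own package manager if needed):

$ pip install findspark numpy jupyter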
After installing these packages, open a Jupyter notebook and initialize findspark in a cell. Note that these are Python statements to run inside the notebook, not shell commands:

import findspark
findspark.init('/path/to/spark')  # replace with the path to your Spark installation
Once findspark is initialized, you can create a SparkSession and run Spark jobs from the notebook, just as you did in your Machine Learning work.
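As a quick sanity check, here is a minimal sketch of starting a SparkSession and running a small job in a notebook cell (the app name and sample data below are just placeholders):

import findspark
findspark.init('/path/to/spark')  # replace with the path to your Spark installation

from pyspark.sql import SparkSession

# Build (or reuse) a SparkSession; the app name is arbitrary.
spark = SparkSession.builder \
    .appName("JupyterSparkTest") \
    .getOrCreate()

# Create a tiny DataFrame and display it to confirm Spark is working.
df = spark.createDataFrame([(1, "a"), (2, "b")], ["id", "label"])
df.show()

If the DataFrame prints without errors, your notebook is correctly connected to Spark.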