From the error message you shared, the problem appears to be related to the numpy package. We suggest running the commands below in your terminal to install pip first and then numpy. After that, try importing Tokenizer again.
1. Add the EPEL Repository
Pip is not available in the CentOS 7 core repositories. To install pip, we first need to enable the EPEL repository:
sudo yum install epel-release
2. Install pip
Once the EPEL repository is enabled, we can install pip and all of its dependencies with the following command:
sudo yum install python-pip
3. Verify the pip installation
To verify that pip is installed correctly, run the following command, which prints the pip version:
pip --version
After this, install the numpy package with:
pip install numpy
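Before retrying the Tokenizer import, it can help to confirm that numpy is visible to the Python interpreter you are actually using. The following is only a minimal sketch: the helper names is_installed and report are made up for this example, and it assumes you run it with the same Python that pip installed numpy into.

```python
import importlib


def is_installed(module_name):
    """Return True if this interpreter can import `module_name`."""
    try:
        importlib.import_module(module_name)
        return True
    except ImportError:
        return False


def report(module_name):
    """Build a one-line status message for `module_name`."""
    if not is_installed(module_name):
        return "%s is NOT importable from this interpreter" % module_name
    module = importlib.import_module(module_name)
    # Not every module defines __version__, so fall back to a placeholder.
    version = getattr(module, "__version__", "unknown version")
    return "%s is importable (%s)" % (module_name, version)


if __name__ == "__main__":
    print(report("numpy"))
```

If the script reports that numpy is not importable even though pip install numpy succeeded, pip and the interpreter you are running probably belong to different Python installations.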
Hope this helps!
Thanks.