It can be difficult to install a Python machine learning environment on some platforms.
Python itself must be installed first, and then there are many packages to install, which can be confusing for beginners.
In this tutorial, you will discover how to set up a Python machine learning development environment using Anaconda.
After completing this tutorial, you will have a working Python environment to begin learning, practicing, and developing machine learning and deep learning software.
These instructions are suitable for Windows, Mac OS X, and Linux platforms. I will demonstrate them on OS X, so you may see some Mac dialogs and file extensions.
Let’s get started.
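In this tutorial, we will cover the following steps:
- Download Anaconda
- Install Anaconda
- Start and update Anaconda
- Update the scikit-learn library
- Install deep learning libraries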
In this step, we will download the Anaconda Python package for your platform.
Anaconda is a free and easy-to-use environment for scientific Python.
This will download the Anaconda Python package to your workstation.
I’m on OS X, so I chose the OS X version. The file is about 426 MB.
You should have a file with a name like:
- Anaconda3-4.2.0-MacOSX-x86_64.pkg
In this step, we will install the Anaconda Python software on your system.
This step assumes you have sufficient administrative privileges to install software on your system.
Installation is quick and painless.
There should be no tricky questions or sticking points.
The installation should take less than 10 minutes and take up a little more than 1 GB of space on your hard drive.
In this step, we will confirm that your Anaconda Python environment is up to date.
Anaconda comes with a suite of graphical tools called Anaconda Navigator. You can start Anaconda Navigator by opening it from your application launcher.
You can learn all about the Anaconda Navigator here(link:https://docs.continuum.io/anaconda/navigator.html).
You can use the Anaconda Navigator and graphical development environments later; for now, I recommend starting with the Anaconda command line environment called conda(link:http://conda.pydata.org/docs/index.html).
Conda is fast and simple, error messages are hard to miss, and you can quickly confirm that your environment is installed and working correctly.
Open a terminal (command line window) and confirm conda is installed correctly by typing:
- conda -V
You should see the following (or something similar):
- conda 4.2.9
Confirm Python is installed correctly by typing:
- python -V
You should see the following (or something similar):
- Python 3.5.2 :: Anaconda 4.2.0 (x86_64)
If either command fails or reports an error, check the documentation for your platform.
See some of the resources in the “Further Reading” section.
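If either command fails, one common cause is that the Anaconda directory is not on your PATH. The short script below is a minimal sketch (it is not part of the Anaconda tooling) that reports where the conda and python commands resolve from and which interpreter is running. The helper name locate_tools is hypothetical.

```python
import shutil
import sys


def locate_tools(names):
    """Map each command name to its full path on PATH, or None if not found."""
    return {name: shutil.which(name) for name in names}


if __name__ == "__main__":
    for tool, path in locate_tools(["conda", "python"]).items():
        print('%s: %s' % (tool, path or 'not found on PATH'))
    # The interpreter actually running this script and its version number
    print('interpreter: %s' % sys.executable)
    print('version: %s' % sys.version.split()[0])
```

If conda is reported as not found, opening a new terminal after installation (so that PATH changes take effect) is a reasonable first thing to try.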
Confirm your conda environment is up to date by typing:
- conda update conda
- conda update anaconda
You may need to install some packages and confirm the updates.
The script below will print the version number of the key SciPy libraries you require for machine learning development, specifically: SciPy, NumPy, Matplotlib, Pandas, Statsmodels, and Scikit-learn.
You can type “python” at the command line and enter the commands directly. Alternatively, I recommend opening a text editor and copy-pasting the script into it.
- # scipy
- import scipy
- print('scipy: %s' % scipy.__version__)
- # numpy
- import numpy
- print('numpy: %s' % numpy.__version__)
- # matplotlib
- import matplotlib
- print('matplotlib: %s' % matplotlib.__version__)
- # pandas
- import pandas
- print('pandas: %s' % pandas.__version__)
- # statsmodels
- import statsmodels
- print('statsmodels: %s' % statsmodels.__version__)
- # scikit-learn
- import sklearn
- print('sklearn: %s' % sklearn.__version__)
Save the script as a file with the name: versions.py.
On the command line, change your directory to where you saved the script and type:
- python versions.py
You should see output like the following:
- scipy: 0.18.1
- numpy: 1.11.1
- matplotlib: 1.5.3
- pandas: 0.18.1
- statsmodels: 0.6.1
- sklearn: 0.17.1
What versions did you get?
Paste the output in the comments below.
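If one of the imports fails, the script above stops at the first missing library. As a hedged alternative, the sketch below uses the standard importlib module to report every library in one pass and mark any that are missing. The function name collect_versions is hypothetical and is not part of the original script.

```python
import importlib


def collect_versions(module_names):
    """Return {module name: version string, or None if it cannot be imported}."""
    versions = {}
    for name in module_names:
        try:
            module = importlib.import_module(name)
        except ImportError:
            versions[name] = None
            continue
        # Not every module defines __version__, so fall back to 'unknown'
        versions[name] = getattr(module, '__version__', 'unknown')
    return versions


if __name__ == "__main__":
    libraries = ['scipy', 'numpy', 'matplotlib', 'pandas', 'statsmodels', 'sklearn']
    for name, version in collect_versions(libraries).items():
        print('%s: %s' % (name, version or 'NOT INSTALLED'))
```

Any library reported as NOT INSTALLED can then be added with conda install followed by the package name.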
In this step, we will update the main library used for machine learning in Python called scikit-learn.
At the time of writing, the version of scikit-learn shipped with Anaconda is out of date (0.17.1 instead of 0.18.1). You can update a specific library using the conda command; below is an example of updating scikit-learn to the latest version.
At the terminal, type:
- conda update scikit-learn
Alternatively, you can update a library to a specific version by typing:
- conda install -c anaconda scikit-learn=0.18.1
Confirm that the installation was successful and scikit-learn was updated by re-running the versions.py script:
- python versions.py
You should see output like the following:
- scipy: 0.18.1
- numpy: 1.11.3
- matplotlib: 1.5.3
- pandas: 0.18.1
- statsmodels: 0.6.1
- sklearn: 0.18.1
What versions did you get?
Paste the output in the comments below.
You can use these commands to update machine learning and SciPy libraries as needed.
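To decide whether an update is needed, you can compare an installed version against the minimum you want. The sketch below is one simple way to do that using only the standard library. It assumes version strings start with dot-separated integers (as in the outputs above) and ignores any suffix such as a development tag. The names version_tuple and meets_minimum are hypothetical helpers.

```python
import re


def version_tuple(version, width=3):
    """Convert '0.18.1' to (0, 18, 1); non-numeric suffixes are ignored."""
    match = re.match(r'\d+(?:\.\d+)*', version)
    if match is None:
        raise ValueError('no leading version number in %r' % version)
    parts = [int(p) for p in match.group(0).split('.')]
    # Pad short versions so '0.18' compares equal to '0.18.0'
    parts += [0] * (width - len(parts))
    return tuple(parts)


def meets_minimum(installed, minimum):
    """True if the installed version is at least the minimum version."""
    return version_tuple(installed) >= version_tuple(minimum)


if __name__ == "__main__":
    print(meets_minimum('0.17.1', '0.18.1'))  # False
    print(meets_minimum('0.18.1', '0.18.1'))  # True
```

For example, passing sklearn.__version__ as the first argument and '0.18.1' as the second would tell you whether the scikit-learn update above is still needed.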
In this step, we will install Python libraries used for deep learning, specifically: Theano, TensorFlow, and Keras.
NOTE: I recommend using Keras for deep learning and Keras only requires one of Theano or TensorFlow to be installed. You do not need both! There may be problems installing TensorFlow on some Windows machines.
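Install the Theano deep learning library by typing:
- conda install theano
Install the TensorFlow deep learning library by typing:
- conda install -c conda-forge tensorflow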
Alternatively, you may choose to install using pip and a specific version of tensorflow for your platform.
See the installation instructions for tensorflow(link:https://www.tensorflow.org/get_started/os_setup#anaconda_installation).
Install Keras by typing:
- pip install keras
Create a script that prints the version numbers of each library, as we did before for the SciPy environment.
- # theano
- import theano
- print('theano: %s' % theano.__version__)
- # tensorflow
- import tensorflow
- print('tensorflow: %s' % tensorflow.__version__)
- # keras
- import keras
- print('keras: %s' % keras.__version__)
Save the script to a file named deep_versions.py. Run the script by typing:
- python deep_versions.py
You should see output like:
- theano: 0.8.2.dev-901275534cbfe3fbbe290ce85d1abf8bb9a5b203
- tensorflow: 0.12.1
- Using TensorFlow backend.
- keras: 1.2.1
What versions did you get?
Paste your output in the comments below.
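Because Keras needs only one backend, it can be handy to check which of the deep learning libraries are installed without importing them, since importing them can be slow. The following is a minimal sketch using the standard library function importlib.util.find_spec. The function name installed_modules is hypothetical.

```python
import importlib.util


def installed_modules(candidates):
    """Return the candidate top-level module names that can be found."""
    return [name for name in candidates
            if importlib.util.find_spec(name) is not None]


if __name__ == "__main__":
    backends = installed_modules(['theano', 'tensorflow'])
    print('backends found: %s' % (', '.join(backends) or 'none'))
    if 'keras' in installed_modules(['keras']) and not backends:
        print('Keras is installed but no backend was found.')
```

This only checks that the modules can be located. Running deep_versions.py remains the real test that they import and work.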
This section provides some links for further reading.