Workshop Azure Machine Learning – Building a Human Activity Classifier

Building a Human Activity Classifier on Azure

Workshop Setup and Instruction Guide

At most of our Microsoft Data Science meetups, hosted by companies such as Infi, InSpark, Winvision, and Macaw, we organize workshops. This time you will learn how to build a Human Activity Classifier with Azure Machine Learning. This classifier predicts somebody’s activity class (sitting, standing up, standing, sitting down, walking) based on data from wearable sensors.

The point of this workshop is to introduce you to the basics of creating and deploying a machine learning model in Azure ML. It is not intended to be a deep dive into model design, validation and improvement. If you want to learn more, please check out our Principles of Machine Learning, Applied Machine Learning or other courses.

This workshop contains the following tasks:

  1. Set up your Azure ML environment
  2. Build your model
  3. Publish your model

What You’ll Need

To perform the tasks, you will need the following:

  • A Windows, Linux, or Mac OS X computer
  • A web browser and an Internet connection

1. Set up your Azure ML environment

There are several ways to get started with Azure ML. The easiest is to go to https://azure.microsoft.com/en-us/services/machine-learning/ and click on the Get started now button.

get started with azure

Next, select the Free Workspace option. You will need a Microsoft account (formerly Windows Live ID) to sign in. If you don’t have one, you can sign up here: https://signup.live.com/

2. Build the Human Activity Classifier

This classifier predicts somebody’s activity class (sitting, standing up, standing, sitting down, walking). It is based on the Human Activity Recognition dataset. Human Activity Recognition (HAR) is an active research area whose results have the potential to benefit the development of assistive technologies that support care of the elderly, the chronically ill and people with special needs. Activity recognition can be used to provide information about patients’ routines to support the development of e-health systems. Two approaches are commonly used for HAR: image processing and the use of wearable sensors. In this case we will use information generated by wearable sensors (Ugulino et al., 2012).

In this workshop we use the Human Activity Recognition data from its source: http://groupware.les.inf.puc-rio.br/har#ixzz2PyRdbAfA. More info can also be found on the UCI repository. The data was collected during 8 hours of activities, 2 hours with each of 4 healthy adult subjects (2 men and 2 women). Each subject wore 4 LilyPad Arduino accelerometers, positioned on the waist, left thigh, right ankle, and right upper arm respectively. This resulted in a dataset with 165634 rows and 19 columns.

Get and Understand the data

You have several options to start building this experiment:

  1. Start from scratch and get the data from its source: http://groupware.les.inf.puc-rio.br/har#ixzz2PyRdbAfA. More info can also be found on the UCI repository. You can download the data from http://groupware.les.inf.puc-rio.br/static/har/dataset-har-PUC-Rio-ugulino.zip and extract the downloaded zip file to a convenient folder on your local computer. Jorg Klein wrote a nice blog post on how to prepare data with the Azure Machine Learning Workbench.
  2. Start with this experiment: Experiment on Cortana Intelligence Gallery. From here you can open the experiment in your own Azure Machine Learning Studio by clicking on the “open” button. We will continue with this option.

This will open a window where you have to sign in with your Azure account to access the Azure Machine Learning Studio. You can click on your newly downloaded experiment to start.

Human Activity Classifier - DataChangers Inspiration - azure machine learning studio
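If you prefer the first option (starting from scratch), the download and extraction step could be scripted. The following is only a minimal sketch: the URL is the one quoted above and may no longer be reachable, and the function names and the target folder are hypothetical.

```python
import io
import zipfile
from pathlib import Path
from urllib.request import urlopen

# URL quoted in the workshop text; it may have moved since this was written.
DATASET_URL = (
    "http://groupware.les.inf.puc-rio.br/static/har/"
    "dataset-har-PUC-Rio-ugulino.zip"
)


def extract_zip_bytes(zip_bytes, target_dir):
    """Extract an in-memory zip archive and return the names of its members."""
    target = Path(target_dir)
    target.mkdir(parents=True, exist_ok=True)
    with zipfile.ZipFile(io.BytesIO(zip_bytes)) as archive:
        archive.extractall(target)
        return archive.namelist()


def download_dataset(target_dir, url=DATASET_URL):
    """Download the archive from url and extract it into target_dir."""
    with urlopen(url, timeout=60) as response:
        return extract_zip_bytes(response.read(), target_dir)
```

Calling download_dataset("har_data") would place the extracted files in a local har_data folder, from where you can upload them to Azure ML Studio as a new dataset.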

Prepare the data

Before you can use the data to train a classification model, you must inspect and prepare it:

  1. Click on RUN (in the menu at the bottom). Wait until the experiment has finished running before you continue with the next step.
  2. To visualize the output of the dataset, right-click on the output port of the data module and select Visualize.
    Human Activity Classifier - DataChangers Inspiration - visualize data
    Now you can review the data it contains by selecting the columns. Note that the dataset contains the following variables:
  • user (text)
  • gender (text)
  • age (integer)
  • how_tall_in_meters (real)
  • weight (int)
  • body_mass_index (real)
  • x1 (type int, value of the axis ‘x’ of the 1st accelerometer, mounted on waist)
  • y1 (type int, value of the axis ‘y’ of the 1st accelerometer, mounted on waist)
  • z1 (type int, value of the axis ‘z’ of the 1st accelerometer, mounted on waist)
  • x2 (type int, value of the axis ‘x’ of the 2nd accelerometer, mounted on the left thigh)
  • y2 (type int, value of the axis ‘y’ of the 2nd accelerometer, mounted on the left thigh)
  • z2 (type int, value of the axis ‘z’ of the 2nd accelerometer, mounted on the left thigh)
  • x3 (type int, value of the axis ‘x’ of the 3rd accelerometer, mounted on the right ankle)
  • y3 (type int, value of the axis ‘y’ of the 3rd accelerometer, mounted on the right ankle)
  • z3 (type int, value of the axis ‘z’ of the 3rd accelerometer, mounted on the right ankle)
  • x4 (type int, value of the axis ‘x’ of the 4th accelerometer, mounted on the right upper-arm)
  • y4 (type int, value of the axis ‘y’ of the 4th accelerometer, mounted on the right upper-arm)
  • z4 (type int, value of the axis ‘z’ of the 4th accelerometer, mounted on the right upper-arm)
  • class (text, ‘sitting-down’, ‘standing-up’, ‘standing’, ‘walking’, and ‘sitting’)
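To make the column types above concrete, here is a small sketch that reads records with these 19 columns and converts them to the listed types. It rests on assumptions not stated in this workshop: that the file is semicolon-separated and may use a decimal comma for real values. The sample row is made up for illustration and is not taken from the dataset.

```python
import csv
import io

# Column names as listed above; x1..z4 are the four accelerometers.
SENSOR_COLUMNS = [f"{axis}{n}" for n in range(1, 5) for axis in "xyz"]
COLUMNS = ["user", "gender", "age", "how_tall_in_meters", "weight",
           "body_mass_index", *SENSOR_COLUMNS, "class"]
INT_COLUMNS = ["age", "weight", *SENSOR_COLUMNS]
REAL_COLUMNS = ["how_tall_in_meters", "body_mass_index"]

# Made-up example record (NOT a real row from the dataset).
SAMPLE_TEXT = (
    ";".join(COLUMNS) + "\n"
    "subject_a;Woman;46;1,62;75;28,6;-3;92;-63;-23;18;-19;"
    "5;104;-92;-150;-103;-147;sitting\n"
)


def to_real(text):
    """Parse a real number; a decimal comma is assumed to be possible."""
    return float(text.replace(",", "."))


def read_records(text, delimiter=";"):
    """Return a list of dicts with integer and real columns converted."""
    records = []
    for raw in csv.DictReader(io.StringIO(text), delimiter=delimiter):
        record = dict(raw)
        for name in INT_COLUMNS:
            record[name] = int(record[name])
        for name in REAL_COLUMNS:
            record[name] = to_real(record[name])
        records.append(record)
    return records
```

A conversion like int(record[name]) raises a ValueError on a malformed value, which is one way to spot rows that need cleaning before training.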
    
    
  1. After cleaning the data, we can inspect it. We start with some descriptive statistics by adding a Summarize Data module.
    Run the experiment after adding this module.
    Human Activity Classifier - DataChangers Inspiration - summarize clean data
  2. In addition, we can inspect the correlation between the numeric columns. First add a Select Columns in Dataset module: drag it onto the canvas and connect the output port of the Clean Missing Data module to its input port. Then select the numeric columns: on the WITH RULES page, begin with NO COLUMNS and then select Include, column types, Numeric:
    Human Activity Classifier - DataChangers Inspiration - Azure-ML-Select-Numerics
  3. Now we can add the Compute Linear Correlation module to calculate the (Pearson’s) correlation. Click RUN to run the changed experiment.
    Human Activity Classifier - DataChangers Inspiration - Azure-ML-Correlation
    Observe that there is a strong correlation between height (how_tall_in_meters), weight (weight) and BMI (body_mass_index). This is not surprising, as BMI is calculated from height and weight.
    Human Activity Classifier - DataChangers Inspiration - correlation
  4. Just as an illustration, we will remove ‘body_mass_index’ using the Select Columns in Dataset module. Connect this module to the output of the Clean Missing Data module. We also exclude ‘user’, as we don’t need this identifier later on in our model.
    Human Activity Classifier - DataChangers Inspiration - remove user and bmi
    Select the Select Columns in Dataset module, and in the Properties pane launch the column selector. Then use the column selector to exclude the following columns:

    • user
    • body_mass_index

You can use the WITH RULES page of the column selector to accomplish this as shown here, starting with ALL COLUMNS and EXCLUDING user and body_mass_index:

Human Activity Classifier - DataChangers Inspiration - Azure-ML-Select-Columns
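The reason body_mass_index carries little extra information can be sketched in a few lines. The snippet below uses the standard BMI formula (weight in kilograms divided by height in meters squared) and a hand-written Pearson correlation; it is an illustration of the idea, not the implementation of the Compute Linear Correlation module, and the heights and weights are hypothetical values rather than the subjects in the dataset.

```python
import math


def body_mass_index(weight_kg, height_m):
    """Standard BMI formula: weight (kg) divided by height (m) squared."""
    return weight_kg / height_m ** 2


def pearson(xs, ys):
    """Pearson correlation coefficient of two equally long sequences."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    covariance = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    spread_x = sum((x - mean_x) ** 2 for x in xs)
    spread_y = sum((y - mean_y) ** 2 for y in ys)
    return covariance / math.sqrt(spread_x * spread_y)


# Hypothetical subjects, for illustration only.
heights = [1.58, 1.62, 1.71, 1.83]
weights = [55, 67, 75, 83]
bmis = [body_mass_index(w, h) for w, h in zip(weights, heights)]

print("weight vs. BMI:", round(pearson(weights, bmis), 3))
print("height vs. weight:", round(pearson(heights, weights), 3))
```

Because BMI is a deterministic function of two other columns, a model can do without it, which is why it is excluded here.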

  1. Now we transform gender to be a categorical variable by adding an Edit Metadata module to the experiment, and connect the Select Columns in Dataset output to its input. Set the properties of the Edit Metadata module as follows:
    • Column: gender
    • Data type: Unchanged
    • Categorical: Make categorical
    • Fields: Features
    • New column names: Leave blank
  2. We apply a similar transformation to our dependent variable ‘class’: we make it categorical and define it as our label. Add another Edit Metadata module to the experiment, and connect the output of the previous Edit Metadata module to its input. Set the properties of this Edit Metadata module as follows:
    • Column: class
    • Data type: Unchanged
    • Categorical: Make categorical
    • Fields: Label
    • New column names: Leave blank
  3. When the experiment has finished running, visualize the output of the Edit Metadata module and verify that:
    • The columns you specified have been removed.
    • All numeric columns now have a Feature Type of Numeric Feature.
    • All string columns now have a Feature Type of Categorical Feature.

 

Create and Evaluate a Classification Model

Now that you have prepared the data, you will construct and evaluate a classification model. The goal of this model is to identify a human activity and to find out if somebody is ‘sitting-down’, ‘standing-up’, ‘standing’, ‘walking’, or ‘sitting’.

  1. We are now ready to split the data into separate training and test datasets. We will train the model with the training dataset, and test the model with the test dataset. Therefore, add a Split Data module to the Human Activity Classifier experiment, and connect the output of the Edit Metadata module to the input of the Split Data module. Set the properties of the Split Data module as follows:
    • Splitting mode: Split Rows
    • Fraction of rows in the first output dataset: 0.7
    • Randomized split: Checked
    • Random seed: 123
    • Stratified split: False
  2. Add a Train Model module to the experiment, and connect the Results dataset1 (left) output of the Split Data module to the Dataset (right) input of the Train Model module. In the Properties pane for the Train Model module, use the column selector to select the class column. This sets the label column that the classification model will be trained to predict.
  3. Add a Multiclass Decision Forest module to the experiment, and connect the output of the Multiclass Decision Forest module to the Untrained model (left) input of the Train Model module. This specifies that the classification model will be trained using the multiclass decision forest algorithm.
  4. Set the properties of the Multiclass Decision Forest module as follows:
    • Resampling method: Bagging
    • Create trainer mode: Single Parameter
    • Number of decision trees: 8
    • Maximum depth of decision trees: 32
    • Number of random splits per node: 128
    • Minimum number of samples per leaf: 1
    • Allow unknown categorical levels: Checked
  5. Add a Score Model module to the experiment. Then connect the output of the Train Model module to the Trained model (left) input of the Score Model module, and connect the Results dataset2 (right) output of the Split Data module to the Dataset (right) input of the Score Model module.
  6. On the Properties pane for the Score Model module, ensure that the Append score columns to output checkbox is selected.
  7. Add an Evaluate Model module to the experiment, and connect the output of the Score Model module to the Scored dataset (left) input of the Evaluate Model module.
  8. Verify that your experiment resembles the figure below, then save and run the experiment.
    Human Activity Classifier - DataChangers Inspiration - Azure-ML-Model-1
  9. When the experiment has finished running, visualize the output of the Score Model module, and compare the predicted values in the Scored Labels column with the actual values from the test dataset in the class column.
  10. Visualize the output of the Evaluate Model module, and review the results (shown below). We see the score per class. Then review the Overall Accuracy figure for the model, which should be around 0.994. This indicates that the classifier is correct about 99% of the time on the test dataset, which is a good figure for an initial model, keeping in mind the original distribution of the classification (see below).
    Human Activity Classifier - DataChangers Inspiration - Azure-ML-Metrics
    Human Activity Classifier - DataChangers Inspiration - Azure-ML-Confusion-Matrix
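What the Split Data and Evaluate Model modules compute can be sketched in plain Python. This is a rough illustration under assumptions, not the modules' actual implementation: the function names are hypothetical, and a seed of 123 in Python will not reproduce the same rows that Azure ML selects with its own random seed of 123.

```python
import random
from collections import Counter


def split_rows(rows, fraction=0.7, seed=123):
    """Randomized, non-stratified split into (training, test) lists."""
    shuffled = list(rows)
    random.Random(seed).shuffle(shuffled)
    cut = round(len(shuffled) * fraction)
    return shuffled[:cut], shuffled[cut:]


def confusion_matrix(actual, predicted):
    """Count how often each (actual, predicted) pair of classes occurs."""
    return Counter(zip(actual, predicted))


def overall_accuracy(actual, predicted):
    """Fraction of records whose predicted class equals the actual class."""
    correct = sum(a == p for a, p in zip(actual, predicted))
    return correct / len(actual)
```

For example, overall_accuracy(["sitting", "walking"], ["sitting", "standing"]) gives 0.5, and the confusion matrix then holds one count for ("sitting", "sitting") and one for ("walking", "standing").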

Detailed Accuracy from the original paper

Correctly Classified Instances     164662   99.4144 %

Incorrectly Classified Instances   970      0.5856 %

Root mean squared error            0.0463

Relative absolute error            0.7938 %

So if we compare our results to those of the original paper, we are pretty close.
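As a quick check, the two percentages can be recomputed from the instance counts listed above:

```python
# Instance counts as listed in the accuracy summary above.
correct = 164662
incorrect = 970
total = correct + incorrect

accuracy_pct = 100 * correct / total
error_pct = 100 * incorrect / total

print(f"{accuracy_pct:.4f} % correct, {error_pct:.4f} % incorrect")
# prints "99.4144 % correct, 0.5856 % incorrect"
```

This is the same kind of figure as the Overall Accuracy of about 0.994 reported by the Evaluate Model module, which makes the two results directly comparable.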

 

3. Publish your Human Activity Classifier

Publish the Model as a Web Service

  1. Make sure you have saved and run the experiment. With the Human Activity Classifier experiment open, click the SET UP WEB SERVICE icon at the bottom of the Azure ML Studio page and click Predictive Web Service [Recommended]. A new Predictive Experiment tab will be automatically created.
  2. Verify that, with a bit of rearranging, the Predictive Experiment resembles this figure:
    Human Activity Classifier - DataChangers Inspiration - Azure-ML-Predictive
  3. We can now remove the variables we don’t need for prediction. Besides ‘user’ and ‘body_mass_index’, we can now also remove ‘class’, as that is what we want as output from the model. To do so, drag the Select Columns in Dataset module up, add ‘class’ to the excluded columns, connect its input to the original dataset, and connect its output to the Execute R Script module.
  4. In addition, we will make sure to send a numeric value for ‘z4’, so we can move the Web service input module and connect it directly to the Edit Metadata module where we make ‘gender’ categorical.
  5. For this experiment, we will also make sure to send complete records, so we can remove the Clean Missing Data module.
  6. Delete the connection between the Score Model module and the Web service output module.
  7. Add a Select Columns in Dataset module to the experiment, and connect the output of the Score Model module to its input. Then connect the output of the Select Columns in Dataset module to the input of the Web service output module.
  8. Select the Select Columns in Dataset module, and use the column selector to select only the Scored Labels column. This ensures that when the web service is called, only the predicted value is returned.
  9. Ensure that the predictive experiment now looks like the following, and then save and run the predictive experiment:
    Human Activity Classifier - DataChangers Inspiration - Azure-ML-Predictive-Setup

 

  10. When the experiment has finished running, visualize the output of the last Select Columns in Dataset module and verify that only the Scored Labels column is returned.

Deploy and Use the Web Service

  1. In the Human Activity Classifier [Predictive Exp.] experiment, click the Deploy Web Service icon at the bottom of the Azure ML Studio window.
  2. Wait a few seconds for the dashboard page to appear, and note the API key and the Request/Response link. You will use these to connect to the web service from a client application.
    Human Activity Classifier - DataChangers Inspiration - Azure-ML-Testing-Model
  3. You have several options to connect to the web service. To test it, you can click on New Web Services Experience (preview). This will open a new browser tab.
  4. Here you have the option to test your model (Test endpoint option under BASICS):
    Human Activity Classifier - DataChangers Inspiration - Azure-ML-Test-Endpoint
  5. When clicking on Test endpoint, you have the option to enable the usage of sample data, which will generate a sample record to test your model with:
    Human Activity Classifier - DataChangers Inspiration - Azure-ML-Sample-Data
  6. After enabling this sample data, you will see the generated sample data:
    Human Activity Classifier - DataChangers Inspiration - Azure-ML-Test-Webservice
  7. The final step would be pressing the Test Request-Response button: what kind of activity is this woman doing according to your model?
  8. Another option is to click on the blue TEST button.
    Azure-ML-Test-API
  9. This will open a pop-up window, where you can fill out some test values:
    Human Activity Classifier - DataChangers Inspiration - Azure-ML-Enter-Data
  10. The last option is to open an Excel file, which will automatically create sample data. Opening this file will add the Azure Machine Learning add-in to the workbook. If that doesn’t work, or you don’t have Excel on your laptop, you could follow the next steps to make a workbook online:
  11. Open a new browser tab.
  12. In the new browser tab, navigate to https://office.live.com/start/Excel.aspx. If prompted, sign in with your Microsoft account (use the same credentials you use to access Azure ML).
  13. In Excel Online, create a new blank workbook.
  14. On the Insert tab, click Office Add-ins. Then in the Office Add-ins dialog box, select Store, search for Azure Machine Learning, and add the Azure Machine Learning add-in as shown below:
    Human Activity Classifier - DataChangers Inspiration - Azure-ML-Office-Addin
  15. After the add-in is installed, in the Azure Machine Learning pane on the right of the Excel workbook, click Add Web Service. Boxes for the URL and API key of the web service will appear.
  16. On the browser tab containing the dashboard page for your Azure ML web service, right-click the Request/Response link you noted earlier and copy the web service URL to the clipboard. Then return to the browser tab containing the Excel Online workbook and paste the URL into the URL box.
  17. On the browser tab containing the dashboard page for your Azure ML web service, click the Copy button for the API key you noted earlier to copy the key to the clipboard. Then return to the browser tab containing the Excel Online workbook and paste it into the API key box.
  18. Verify that the Azure Machine Learning pane in your workbook now resembles this, and click Add:
    Human Activity Classifier - DataChangers Inspiration - Azure-ML-Excel-Azure-Key
  19. After the web service has been added, the Azure Machine Learning pane opens on step 2, Predict. Here you have the option to generate sample data by clicking on Use sample data. This enters some sample input values in the worksheet.
  20. Select the cells containing the input data (cells A1 to P6), and in the Azure Machine Learning pane, click the button to select the input range and confirm that it is 'Sheet1'!A1:P6.
  21. Ensure that the My data has headers box is checked.
  22. In the Output box type Q1, and ensure the Include headers box is checked.
  23. Click the Predict button, and after a few seconds, view the predicted label in cell Q2.
    Human Activity Classifier - DataChangers Inspiration - Azure-ML-Excel
  24. Change some values of row 2 and click Predict. Then view the updated label that is predicted by the web service.
  25. Try changing a few of the input variables and predicting the human activity class. You can add multiple rows to the input range and try various combinations at once.
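Besides the test pages and Excel, a client application can call the web service with the Request/Response URL and API key noted on the dashboard. The sketch below only builds the HTTP request. The payload layout (Inputs, input1, ColumnNames, Values, GlobalParameters) and the Bearer authorization header are assumptions based on the sample code that Azure ML Studio typically shows on the API help page of a web service, so check that page for the exact format of your own service. The function name, URL and key in the example are placeholders.

```python
import json
from urllib.request import Request


def build_scoring_request(url, api_key, record):
    """Build a POST request that sends one input record to the web service.

    Assumption: the service expects a single input named "input1" with
    column names and a list of value rows, as in the Studio sample code.
    """
    body = {
        "Inputs": {
            "input1": {
                "ColumnNames": list(record.keys()),
                "Values": [[str(value) for value in record.values()]],
            }
        },
        "GlobalParameters": {},
    }
    return Request(
        url,
        data=json.dumps(body).encode("utf-8"),
        headers={
            "Content-Type": "application/json",
            "Authorization": f"Bearer {api_key}",
        },
        method="POST",
    )
```

You would pass the Request/Response URL, the API key, and a dictionary with the 16 input columns (for example {"gender": "Woman", "age": 46, ...}), then send the result with urllib.request.urlopen and read the JSON response, which should contain the Scored Labels column.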

 

Summary

By completing this lab, you have prepared your environment and data, and built and deployed your own Azure ML model. We hope you enjoyed this introductory lab and that you will build many more machine learning solutions!

 
