Azure Machine Learning Studio in action
In the previous post, we talked about what Azure Machine Learning is. In this post, we are going to see Azure Machine Learning Studio in action.
I am going to demonstrate machine learning with a simple experiment. First, I create a new experiment by clicking the New button.
There is a list of templates I can choose from, and it is quite extensive. It includes an experiment tutorial, but here I am going to choose a blank experiment.
The experiment opens as a blank canvas in the center of the screen, with components to choose from on the left side and properties on the right side. The first thing I need is a name, and I am going to call this the Income Experiment.
There is an option to import data: a data reader that I can drag from the component list on the left onto my canvas.
On the right side I can now choose from various import options, so I can import into my experiment basically any data available to me on the internet.
To save time, I am going to use pre-imported data. I add the Adult Census Income dataset from the sample Saved Datasets.
I can visualize this dataset by clicking on the circle (its output port) and selecting Visualize from the menu.
You will see the data here. There are about thirty-two thousand five hundred records.
This dataset contains a lot of census information, e.g. the age of the person, education, etc.
Each column of the data can be visualized as a chart. For instance, here is the age distribution.
Also, as you can see below, income, which is the important field for us, has been split into a binary value: people earning more than $50,000 or not.
This is because true-or-false (binary) decisions are a bit easier for a machine learning model to predict than exact values. So we have an algorithm that trains itself on these 32,000 records so that it can predict the income of people it has never seen, with combinations of values it does not know.
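To make the idea of a binary target concrete, here is a minimal sketch of how exact incomes could be turned into two classes. The dataset in this demo already comes with the binary field, so the threshold constant, the label strings, and the sample incomes below are my own assumptions for illustration.

```python
# Hypothetical threshold and label strings, for illustration only.
THRESHOLD = 50_000


def income_label(income: float) -> str:
    """Map an exact income to one of two classes."""
    return ">50K" if income > THRESHOLD else "<=50K"


# Made-up incomes, not taken from the dataset.
incomes = [23_000, 50_000, 81_500]
labels = [income_label(x) for x in incomes]
print(labels)
```

The model then only has to decide between two labels instead of guessing an exact number.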
So, first we need to select only a couple of fields. Among the data transformation and manipulation functions there is a Project Columns function.
I simply hook up the dataset to this Project Columns function.
Then I launch the column selector and tell it to use only the age, education, marital-status, relationship, race, sex, and income fields.
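Outside the Studio, the same column projection can be sketched in a few lines of Python. This is only an illustration of the idea, not what the Studio runs internally: the function name and the sample row are made up, and only the selected column names come from the step above.

```python
# Column names as selected in the column selector above.
SELECTED = ["age", "education", "marital-status", "relationship",
            "race", "sex", "income"]


def project_columns(rows, columns):
    """Keep only the named columns from each row (rows are dicts)."""
    return [{name: row[name] for name in columns} for row in rows]


# A made-up sample row with a few extra fields that should be dropped.
rows = [{
    "age": 39, "workclass": "State-gov", "education": "Bachelors",
    "marital-status": "Never-married", "occupation": "Adm-clerical",
    "relationship": "Not-in-family", "race": "White", "sex": "Male",
    "hours-per-week": 40, "income": "<=50K",
}]

projected = project_columns(rows, SELECTED)
print(projected[0])
```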
With these fields selected, the data is filtered down to only the fields needed in the experiment. Next, we need a machine learning algorithm that is good at predicting this kind of data. As you can see in the Machine Learning part of the selection menu, under Initialize Model there are a lot of models, and under Classification
there is the Two-Class Boosted Decision Tree. Two-class means that this model is well suited to predicting true-or-false scenarios.
Now we need to train this model. Under the Train menu there is a Train Model function that is capable of training a model, and we just connect our model to this trainer.
It also needs data to train the model, and for that we are going to split the data. There is a Split Data function; we connect our data source to it and tell it to send 80 percent (0.8) of the data to the left output.
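The 80/20 split can be sketched like this in plain Python. I am assuming a randomized split here; the function name, the seed, and the stand-in rows are hypothetical and only show the principle.

```python
import random


def split_data(rows, fraction=0.8, seed=42):
    """Shuffle the rows and return (first, second); first holds `fraction` of them."""
    shuffled = list(rows)                  # copy so the caller's list is untouched
    random.Random(seed).shuffle(shuffled)  # fixed seed keeps the split repeatable
    cut = int(len(shuffled) * fraction)
    return shuffled[:cut], shuffled[cut:]


rows = list(range(100))                    # stand-in for 100 data records
train_rows, test_rows = split_data(rows, fraction=0.8)
print(len(train_rows), len(test_rows))
```

The first part plays the role of the left output and goes to the trainer; the second part is held back.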
We feed this 80 percent of the data to our trainer, and then we need to tell the trainer that the field called income is the one it should learn to predict.
So the trainer now knows that the model is supposed to be able to guess the income based on all the other data it receives.
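In code terms, telling the trainer which field to predict amounts to separating the label from the other fields. A small sketch, with a hypothetical helper name and made-up rows:

```python
def separate_label(rows, label="income"):
    """Split each row into its input fields and the value the model should predict."""
    features = [{k: v for k, v in row.items() if k != label} for row in rows]
    targets = [row[label] for row in rows]
    return features, targets


# Made-up rows, for illustration only.
rows = [
    {"age": 39, "education": "Bachelors", "sex": "Male", "income": "<=50K"},
    {"age": 52, "education": "Masters", "sex": "Female", "income": ">50K"},
]

features, targets = separate_label(rows)
print(features[0])
print(targets)
```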
After training the model, we want to score it to see how well it has performed, and then we want to evaluate it to visualize its performance. We add Score Model and Evaluate Model and connect them to each other.
The Score Model function needs other data to check how the model has performed. So we use the 20 percent of the data that the split held back from training for the Score Model function. This way the scoring function has data that the trained algorithm has never seen before, and it uses that data to find out whether the algorithm is actually capable of guessing a person's income. Now we save this experiment and run it. All the steps are executed in Azure and not on my local computer.
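What scoring on held-out data boils down to is comparing predictions with the known answers. The sketch below counts the four possible outcomes and computes the accuracy; the function name and both lists are made up, and this is not the Studio's own implementation.

```python
def evaluate(actual, predicted, positive=">50K"):
    """Count true/false positives and negatives and compute the accuracy."""
    pairs = list(zip(actual, predicted))
    tp = sum(a == positive and p == positive for a, p in pairs)
    tn = sum(a != positive and p != positive for a, p in pairs)
    fp = sum(a != positive and p == positive for a, p in pairs)
    fn = sum(a == positive and p != positive for a, p in pairs)
    return {"tp": tp, "tn": tn, "fp": fp, "fn": fn,
            "accuracy": (tp + tn) / len(pairs)}


# Hypothetical held-out answers and model predictions.
actual = [">50K", "<=50K", "<=50K", ">50K", "<=50K"]
predicted = [">50K", "<=50K", ">50K", "<=50K", "<=50K"]

result = evaluate(actual, predicted)
print(result)
```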
After it has finished running, let's visualize the result by clicking on the Visualize menu item. Here is a curve that leans toward the upper left, which means that the model is quite precise in predicting the income of a person.
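Assuming the curve shown here is a ROC curve (true positive rate against false positive rate), this is roughly how such a curve and the area under it can be computed from scored probabilities. The labels and scores are made up, and the sketch assumes there are no tied scores.

```python
def roc_points(labels, scores):
    """Return (false positive rate, true positive rate) points, lowering the threshold step by step."""
    positives = sum(labels)
    negatives = len(labels) - positives
    ranked = sorted(zip(scores, labels), reverse=True)  # highest score first
    points = [(0.0, 0.0)]
    tp = fp = 0
    for _, label in ranked:
        if label:
            tp += 1
        else:
            fp += 1
        points.append((fp / negatives, tp / positives))
    return points


def auc(points):
    """Area under the curve using the trapezoid rule."""
    return sum((x2 - x1) * (y1 + y2) / 2
               for (x1, y1), (x2, y2) in zip(points, points[1:]))


# Hypothetical data: 1 means the person really earns more than $50,000.
labels = [1, 1, 0, 1, 0, 0]
scores = [0.95, 0.85, 0.70, 0.60, 0.30, 0.10]

points = roc_points(labels, scores)
print(points)
print(round(auc(points), 3))
```

The closer the curve hugs the upper left corner, the closer the area gets to 1, while a model that guesses at random stays near 0.5.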
We can set up a web service, meaning we tell the system to save our trained model as a finished model that is going to be reused and published as a web service. The input of the web service is the point where we feed data into our trained model, and the output is the result of the scoring, that is, the predicted answer and how confident the model is in it. I need to remove the income field from the input, because the model has now learned to predict the income by itself.
The web service is basically a function in the cloud that any software can connect to. My software sends the web service the input data about a person and asks what that person's income would be, and the web service responds with a value and a percentage of how sure it is.
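Here is a rough sketch of how client software might prepare such a request in Python. The URL and API key are placeholders, and the payload layout ("Inputs", "input1", "ColumnNames", "Values") is an assumption on my side; the page of the published web service shows the exact request format to use. The sketch only builds the request and does not send it.

```python
import json
import urllib.request

# Placeholders: the real values come from the page of the published web service.
SERVICE_URL = "https://example.invalid/score"
API_KEY = "<your-api-key>"


def build_request(person: dict) -> urllib.request.Request:
    """Build (but do not send) a POST request carrying one person's data."""
    # Assumed payload layout; check the sample request shown for your own service.
    payload = {
        "Inputs": {
            "input1": {
                "ColumnNames": list(person.keys()),
                "Values": [[str(value) for value in person.values()]],
            }
        },
        "GlobalParameters": {},
    }
    headers = {
        "Content-Type": "application/json",
        "Authorization": "Bearer " + API_KEY,
    }
    body = json.dumps(payload).encode("utf-8")
    return urllib.request.Request(SERVICE_URL, data=body,
                                  headers=headers, method="POST")


# Made-up person; note that there is no income field in the input.
person = {
    "age": 45, "education": "Masters", "marital-status": "Married-civ-spouse",
    "relationship": "Husband", "race": "White", "sex": "Male",
}

request = build_request(person)
print(request.get_method(), request.full_url)
# Sending it would be: urllib.request.urlopen(request)
```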
After saving and running, I can now publish the web service.
I can click on the Test button and enter a person's information here.
It is evaluated by our trained algorithm, and the result is that this person would earn more than fifty thousand dollars with a probability of over ninety percent. That percentage is how confident the trained model is in this particular answer.
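The answer is a label together with a probability. A tiny sketch of how the two relate, assuming the label is derived from the probability with a 0.5 threshold; the function name, the threshold, and the label strings are my assumptions.

```python
def interpret(scored_probability: float, threshold: float = 0.5) -> str:
    """Turn a probability of earning more than $50,000 into a readable answer."""
    label = ">50K" if scored_probability >= threshold else "<=50K"
    return f"{label} (probability of >50K: {scored_probability:.0%})"


print(interpret(0.93))
print(interpret(0.12))
```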
This concludes this small demo of machine learning and I hope it is helpful.