Chapter 1. Model Building Guide

This tutorial shows how to build an example classification model in AdvancedMiner.

Before building a model, the user is encouraged to look at the QuickStart tutorial. A training dataset from the database will be required to follow the tutorial. Example data can be found in the "/AdvancedMiner/Client/scripts/scripts/data/classification/" directory. After selecting one of the available scripts and executing it, a new table with training data should appear in the database. The following example is based on the 'german_credit' dataset. All quoted names are example names of created objects.

During the model building process, the following MR objects are used:

Figure 1.1. Schema of the model building process


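The model building task consists of the following steps:

  1. Create a PhysicalData object that points to the training data (here the 'german_credit' table) and save it in the MR.

  2. Create a MiningFunctionSettings object (here ClassificationFunctionSettings), then set its algorithmSettings (here TreeSettings), its logicalData, and the target attribute in its attributeUsageSet (here 'Class'). Save the object in the MR.

  3. Create a MiningBuildTask object and set its buildData, its functionSettings, and the name of the output model. Save the task in the MR.

  4. Execute the MiningBuildTask.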

Note

  • The MiningFunctionSettings object can use different subtypes of the algorithmSettings object. For example, ClassificationFunctionSettings can use the settings for the tree learning algorithm (as in the example above) or for the logistic regression algorithm. This means that various algorithms can be used with the same function specification.

  • AdvancedMiner is able to build various types of models, such as approximation, clustering or time series models. The user selects the required model type by choosing the appropriate MiningFunctionSettings object. Regardless of the model type or the particular algorithm used for model building, the scheme described above remains the same and consists of the same steps.

  • When experiencing problems with executing any of the steps or with setting up object parameters, the user may click the 'Test' action, which can be found in the context menu of any MR object. This action displays detailed information about errors in the object definition (see the movie).
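The first note can be illustrated with a small sketch in plain Python. It is not the AdvancedMiner API: the classes ToyTreeSettings, ToyLogisticSettings and ToyClassificationSettings and the helper describe_with are hypothetical stand-ins that only mirror the relationship between a function settings object and the algorithm settings it holds.

```python
# Toy stand-ins, NOT AdvancedMiner classes. They only mirror how one
# function specification can hold different algorithm settings.
class ToyTreeSettings(object):
    algorithm = 'decision tree'


class ToyLogisticSettings(object):
    algorithm = 'logistic regression'


class ToyClassificationSettings(object):
    function = 'classification'

    def __init__(self):
        self.algorithm_settings = None

    def setAlgorithmSettings(self, settings):
        # The function type stays fixed; only the algorithm changes.
        self.algorithm_settings = settings

    def describe(self):
        return '%s via %s' % (self.function,
                              self.algorithm_settings.algorithm)


def describe_with(algorithm_settings):
    """Build function settings around the given algorithm settings."""
    mfs = ToyClassificationSettings()
    mfs.setAlgorithmSettings(algorithm_settings)
    return mfs.describe()


print(describe_with(ToyTreeSettings()))
print(describe_with(ToyLogisticSettings()))
```

In this sketch, swapping the algorithm changes a single call while the rest of the set-up stays the same, which is the point the note makes about the function specification.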

If all the required objects are set up correctly, executing MiningBuildTask adds a new classification tree model to the MR repository. The user can browse the model details by double-clicking the model icon (see the movie). The figure below shows the content of the MR repository and the model created by carrying out the steps described above.

Figure 1.2. The result of MiningBuildTask execution


The steps described above can also be executed as a script.

Example 1.1. ModelBuilding

#                               MODEL BUILDING PROCESS    

# step 1 -- create and save PhysicalData in MR 
#        -- we choose building data: 'german_credit' 
PD = PhysicalData('german_credit')
save('german_credit_pd',PD)

# step 2 -- create MiningFunctionSettings -->> ClassificationFunctionSettings()
#        -- we define that we are going to prepare a classification model
MFS = ClassificationFunctionSettings()

# step 2a -- MiningFunctionSettings: add algorithmSettings -->> TreeSettings
#         -- we choose the classification algorithm: decision trees
MFS.setAlgorithmSettings(TreeSettings())

# step 2b -- MiningFunctionSettings: add logicalData
#         -- we define the logical structure of the build data
MFS.setLogicalData(LogicalData(PD))

# step 2c -- MiningFunctionSettings -> attributeUsageSet: set target attribute
#         -- we choose the 'Class' attribute as the target
MFS.getAttributeUsageSet().getAttribute('Class').usage = UsageOption.target

# steps 2,2a,2b,2c -- save MiningFunctionSettings into MR
save('MiningFunctionSettings',MFS)

# step 3 -- create MiningBuildTask
#        -- we define the action we are going to do: model building
MBT = MiningBuildTask()

# step 3a -- MiningBuildTask: add buildData
#         -- we choose the data for model building
MBT.setBuildDataName('german_credit_pd')

# step 3b -- MiningBuildTask: add functionSettings
#         -- we choose the method and algorithm settings for model building
MBT.setFunctionSettingsName('MiningFunctionSettings')

# step 3c -- MiningBuildTask: add model 
#         -- we set the new output model name
MBT.setModelName('MODEL')

# steps 3,3a,3b,3c -- save MiningBuildTask into MR
save('MiningBuildTask',MBT)

# step 4 -- execute task
execute('MiningBuildTask')

print "Done..."

Output:

Done...
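
The script links objects by name: the task stores 'german_credit_pd' and 'MiningFunctionSettings' as strings, and these names have to match objects saved in the MR. The following sketch models that wiring in plain Python. It is a toy under stated assumptions, not the AdvancedMiner implementation: the dictionary repository and the functions check_task and run_task are hypothetical, and check_task only loosely imitates what the 'Test' action might report.

```python
# Toy model of name-based wiring in a repository. None of these names
# belong to AdvancedMiner; they are hypothetical stand-ins.
repository = {}


def save(name, obj):
    """Store an object under a name, as the script does with save()."""
    repository[name] = obj


def check_task(task):
    """Return a list of problems found in the task definition."""
    problems = []
    for field in ('build_data', 'function_settings'):
        name = task.get(field)
        if name is None:
            problems.append('%s is not set' % field)
        elif name not in repository:
            problems.append("%s refers to missing object '%s'"
                            % (field, name))
    if not task.get('model'):
        problems.append('model is not set')
    return problems


def run_task(task_name):
    """Resolve the task's references and store a placeholder model."""
    task = repository[task_name]
    problems = check_task(task)
    if problems:
        raise ValueError('; '.join(problems))
    repository[task['model']] = {
        'built_from': task['build_data'],
        'settings': task['function_settings'],
    }


save('german_credit_pd', {'table': 'german_credit'})
save('MiningFunctionSettings', {'target': 'Class'})

task = {'build_data': 'german_credit_pd',
        'function_settings': 'Settings',  # deliberately wrong name
        'model': 'MODEL'}
print(check_task(task))

task['function_settings'] = 'MiningFunctionSettings'
save('MiningBuildTask', task)
run_task('MiningBuildTask')
print(sorted(repository))
```

The first print reports the mismatched name before anything is executed. After the name is corrected, the placeholder model is stored in the toy repository under 'MODEL', next to the objects it was built from.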