Icon:
Function: R_Cluster
Property window:
Short description:
K-Medoids Clustering
Long Description:
This Action is mainly for explanatory/teaching purposes. If you want to create a better segmentation, you should use Stardust.
K-Medoids is an alternative clustering technique that performs better than K-Means on non-spherical segments. It is, however, quite slow, and it cannot be applied to large datasets without sampling. K-Medoids outputs a new column with the cluster number, plus one column per segment giving the distance between each point and the center of that segment. You can transform these distances into membership probabilities, for example by normalizing the inverse distances so that they sum to 1.
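As an illustration of the last point, here is a minimal Python sketch of one possible way to turn a row of distances into probabilities. The helper name `distances_to_probabilities` and the inverse-distance weighting are assumptions made for this example; the Action itself does not prescribe a formula, and other choices (such as a softmax over negative distances) are equally valid.

```python
def distances_to_probabilities(distances, eps=1e-12):
    """Turn one row of distances-to-centers into weights that sum to 1.

    Assumption: closeness is measured as 1 / distance. This weighting is
    illustrative only and is not taken from the Action itself.
    """
    # A point lying exactly on one or more centers is shared among them.
    on_center = [i for i, d in enumerate(distances) if d <= eps]
    if on_center:
        share = 1.0 / len(on_center)
        return [share if i in on_center else 0.0 for i in range(len(distances))]

    inverse = [1.0 / d for d in distances]
    total = sum(inverse)
    return [w / total for w in inverse]


# Hypothetical row: distance to the center of segment 1 and of segment 2.
row = [1.0, 3.0]
probs = distances_to_probabilities(row)
print(probs)
```

The closer a point is to a center, the larger its share for that segment; a point three times farther from segment 2 than from segment 1 receives one third of the weight for segment 2.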
Parameters:
Method: you can use either PAM (Partitioning Around Medoids) or CLARA (Clustering Large Applications, which runs PAM on samples of the data).
Scale Matrix before clustering: normalize the data so that variables on a larger scale do not dominate the distance computation.
Distance computation: select either the Euclidean distance (sensitive to outliers) or the Manhattan distance (sum of absolute differences).
Seed: set a seed number so you can run the same analysis again, with consistent results.
Number of segments: Select the number of segments to keep.
Number of samples: number of samples to use in the process. 1 means that the whole dataset is used (may be very slow).
Cluster Name: name of the variable with the cluster results.
Include distance from center: include Euclidean distance from centers as new variables.
Plot Results: select whether or not to display a distribution chart.
Chart title: set the title of the chart (if you selected the previous option)
Model Name: Name of the model to use for later scoring.
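
To make the role of the scaling, distance, seed and number-of-segments parameters more concrete, here is a minimal, self-contained Python sketch of a K-Medoids run. It is not the code behind this Action: it uses a simple alternating heuristic (assign each point to its nearest medoid, then re-pick the most central point of each segment) rather than the PAM swap search or the CLARA sampling scheme. All function names and the toy data are hypothetical.

```python
import random
from statistics import mean, pstdev


def scale_columns(rows):
    """Standardize each column to mean 0 and standard deviation 1."""
    cols = list(zip(*rows))
    mus = [mean(c) for c in cols]
    sds = [pstdev(c) or 1.0 for c in cols]  # constant columns become all zeros
    return [[(v - m) / s for v, m, s in zip(r, mus, sds)] for r in rows]


def euclidean(a, b):
    return sum((x - y) ** 2 for x, y in zip(a, b)) ** 0.5


def manhattan(a, b):
    return sum(abs(x - y) for x, y in zip(a, b))


def assign(rows, medoids, dist):
    """Index (0-based) of the nearest medoid for every row."""
    return [min(range(len(medoids)), key=lambda j: dist(p, rows[medoids[j]]))
            for p in rows]


def k_medoids(rows, k, dist=euclidean, seed=0, max_iter=100):
    """Alternating K-Medoids heuristic (illustrative; not PAM or CLARA)."""
    rng = random.Random(seed)                 # same seed, same starting medoids
    medoids = rng.sample(range(len(rows)), k)
    for _ in range(max_iter):
        labels = assign(rows, medoids, dist)
        new_medoids = []
        for j in range(k):
            members = [i for i, lab in enumerate(labels) if lab == j]
            if not members:                   # empty segment: keep old medoid
                new_medoids.append(medoids[j])
                continue
            # The new medoid is the member with the smallest total distance
            # to the other members of its segment.
            new_medoids.append(min(
                members,
                key=lambda c: sum(dist(rows[c], rows[i]) for i in members)))
        if new_medoids == medoids:
            break
        medoids = new_medoids
    labels = assign(rows, medoids, dist)
    # One distance per segment for every row, like the optional output columns.
    distances = [[dist(p, rows[m]) for m in medoids] for p in rows]
    return medoids, labels, distances


# Toy data: the second variable is on a scale 100 times larger than the first.
data = [[1.0, 100.0], [1.2, 110.0], [0.8, 90.0],
        [5.0, 500.0], [5.2, 510.0], [4.8, 490.0]]
scaled = scale_columns(data)
medoids, labels, distances = k_medoids(scaled, k=2, dist=euclidean, seed=42)
print(medoids, labels)
```

Passing `dist=manhattan` switches the distance computation, and changing `seed` changes only the starting medoids. Because every pair of points in a segment is compared at each pass, the cost grows quickly with the number of rows, which is why sampling is needed on large datasets.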