Imputation (imputation)
Imputation replaces missing feature values with appropriate substitutes. The example below replaces the missing values with each variable’s average:
import Orange
bridges = Orange.data.Table("bridges")
imputed_bridges = Orange.data.imputation.ImputeTable(bridges,
    method=Orange.feature.imputation.AverageConstructor())
print "Original data set:"
for e in bridges[:3]:
    print e
print "Imputed data set:"
for e in imputed_bridges[:3]:
    print e
The output of this code is:
Original data set:
['M', 1818, 'HIGHWAY', ?, 2, 'N', 'THROUGH', 'WOOD', 'SHORT', 'S', 'WOOD']
['A', 1819, 'HIGHWAY', 1037, 2, 'N', 'THROUGH', 'WOOD', 'SHORT', 'S', 'WOOD']
['A', 1829, 'AQUEDUCT', ?, 1, 'N', 'THROUGH', 'WOOD', '?', 'S', 'WOOD']
Imputed data set:
['M', 1818, 'HIGHWAY', 1300, 2, 'N', 'THROUGH', 'WOOD', 'SHORT', 'S', 'WOOD']
['A', 1819, 'HIGHWAY', 1037, 2, 'N', 'THROUGH', 'WOOD', 'SHORT', 'S', 'WOOD']
['A', 1829, 'AQUEDUCT', 1300, 1, 'N', 'THROUGH', 'WOOD', 'MEDIUM', 'S', 'WOOD']
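To make the effect visible outside Orange, the following is a minimal sketch of column-wise average imputation in plain Python. It is not Orange's implementation: the `impute_average` helper, the `MISSING` marker and the toy rows are all hypothetical, and filling non-numeric columns with the most frequent value is an assumption based on the output above, where a missing discrete value became 'MEDIUM'.

```python
from collections import Counter

MISSING = None  # hypothetical stand-in for the "?" shown in the output above


def impute_average(rows):
    """Return a copy of rows with every MISSING cell filled in.

    Numeric columns get the mean of their known values; all other
    columns get their most frequent known value (an assumption of
    this sketch, not a documented Orange rule).
    """
    fills = []
    for column in zip(*rows):
        known = [v for v in column if v is not MISSING]
        if not known:
            fills.append(MISSING)  # nothing to learn from, leave the gap
        elif all(isinstance(v, (int, float)) for v in known):
            fills.append(sum(known) / float(len(known)))
        else:
            fills.append(Counter(known).most_common(1)[0][0])
    return [[fills[i] if v is MISSING else v for i, v in enumerate(row)]
            for row in rows]


# Toy rows invented for illustration; they are not taken from "bridges".
rows = [
    ["M", 1818, MISSING, "SHORT"],
    ["A", 1819, 1000, "SHORT"],
    ["A", 1829, 2000, MISSING],
]
imputed = impute_average(rows)
for row in imputed:
    print(row)
```

The helper builds a new table and leaves `rows` untouched, which mirrors how the example above keeps both `bridges` and `imputed_bridges`. Within Orange itself, the `ImputeTable` call shown earlier remains the way to do this.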
ImputeTable uses the feature imputation methods from Imputation (imputation), that is, the Orange.feature.imputation module, and applies them to the entire data set. The supported methods are:
- imputation of the minimal, maximal, or average value (uses Orange.feature.imputation.Defaults),
- imputation of a random value (uses Orange.feature.imputation.Random),
- imputation based on a predictive model (uses Orange.feature.imputation.Model),
- imputation where a missing value is treated as a value of its own (uses Orange.feature.imputation.AsValue).
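The ideas behind three of these methods can be sketched on a single column in plain Python. This is only an illustration under stated assumptions, not a description of how the Orange classes work internally: the helper names, the `MISSING` marker, the "N/A" label and the sample columns are hypothetical, and drawing random fills from the observed values is this sketch's own choice. Model-based imputation is left out because it needs a trained learner.

```python
import random

MISSING = None  # hypothetical stand-in for a missing value


def known_values(column):
    """Return the values of a column that are not missing."""
    return [v for v in column if v is not MISSING]


def impute_constant(column, pick):
    """Fill every gap with one value computed from the known ones.

    `pick` is any function of the known values, e.g. min or max.
    """
    fill = pick(known_values(column))
    return [fill if v is MISSING else v for v in column]


def impute_random(column, rng):
    """Fill each gap with a value drawn at random from the known ones."""
    known = known_values(column)
    return [rng.choice(known) if v is MISSING else v for v in column]


def impute_as_value(column, label="N/A"):
    """Treat missingness as a category of its own."""
    return [label if v is MISSING else v for v in column]


# Sample columns invented for illustration.
lengths = [1037, MISSING, 1200, MISSING]
spans = ["SHORT", MISSING, "LONG", "SHORT"]

print(impute_constant(lengths, min))  # [1037, 1037, 1200, 1037]
print(impute_constant(lengths, max))  # [1037, 1200, 1200, 1200]
print(impute_random(lengths, random.Random(0)))
print(impute_as_value(spans))         # ['SHORT', 'N/A', 'LONG', 'SHORT']
```

Passing a seeded `random.Random` instance keeps the random variant reproducible. Treating a missing value as its own category preserves the fact that the value was absent, which the other strategies discard.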