Quick Summary

Refer to the full EDM documentation for complete details.

The software is a collection of routines for efficient mining of big data. It includes both classical prediction methods and more computationally expensive state-of-the-art ones. Using a standard spreadsheet data format, the kit implements all of the data-mining tasks described in the book Predictive Data Mining: A Practical Guide.

Availability:

Software Highlights:

Data Preparation
  • editing
  • normalization
  • text transformation
  • segmentation
  • clustering
Feature Reduction and Selection
  • significance testing
  • tree selection
Value Reduction and Smoothing
  • rounding
  • k-means clustering
  • entropy
Case Reduction and Sampling
  • random
  • bootstrapping/bagging
  • adaptive/boosting
  • voting/averaging
Prediction Methods - Classification and Regression
  • math - linear
  • math - neural nets
  • distance - nearest neighbors
  • logic - decision trees
  • logic - decision rules
  • logic - association rules
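
To make a few of the highlights above concrete, here is a minimal Python sketch of three of the listed techniques: normalization (Data Preparation), bootstrap sampling (Case Reduction and Sampling), and nearest-neighbor classification (Prediction Methods). It is an illustration only and is not the kit's own code or interface. The function names, the z-score form of normalization, the Euclidean distance, and the toy data are all assumptions made for this example.

```python
import math
import random


def normalize(column):
    """Rescale a numeric column to zero mean and unit standard deviation.

    Assumption: z-score scaling; the kit may offer other normalizations.
    """
    mean = sum(column) / len(column)
    variance = sum((x - mean) ** 2 for x in column) / len(column)
    std = math.sqrt(variance) or 1.0  # a constant column has std 0; avoid dividing by it
    return [(x - mean) / std for x in column]


def bootstrap_sample(cases, rng):
    """Draw len(cases) cases uniformly at random, with replacement."""
    return [rng.choice(cases) for _ in cases]


def nearest_neighbor(train, query):
    """Return the label of the training case closest to `query`.

    Assumption: Euclidean distance and a single neighbor (k = 1).
    """
    _features, label = min(train, key=lambda case: math.dist(case[0], query))
    return label


# Hypothetical toy data: (features, label) pairs.
cases = [
    ((1.0, 1.0), "a"),
    ((1.2, 0.8), "a"),
    ((5.0, 5.0), "b"),
    ((4.8, 5.3), "b"),
]

rng = random.Random(0)                 # fixed seed so the resample is repeatable
resampled = bootstrap_sample(cases, rng)
predicted = nearest_neighbor(cases, (1.1, 0.9))
```

In this sketch a query point near (1, 1) is assigned the label of the nearby "a" cases, and the bootstrap sample has the same size as the original case set but may repeat some cases and omit others.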
