Richard Lees

Data Mining

Data mining is the use of powerful software tools to discover significant traits or relationships in databases or data warehouses. It is typically (but not necessarily) used to predict future events.  Classical uses of data mining technologies include:

  • Risk assessment
  • Claim likelihood
  • Customer profitability predictions
  • Fraud detection
  • Treatment efficacy
  • Product suggestion
  • Future sales

SQL Server has had data mining algorithms since SQL Server 2000.  Because these algorithms ship in the base product, SQL Server has helped turn data mining into a commodity technology.  SQL Server 2005 includes the following data mining algorithms:

  • Decision Trees
  • Clustering
  • Sequence Clustering
  • Naïve Bayes
  • Association
  • Time Series
  • Neural Networks
  • Linear Regression
  • Logistic Regression

SQL Server also includes tools that assist in testing the validity of algorithm predictions and in comparing multiple algorithms, using sample data that was held back from model training.
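The idea of testing against held-back data can be sketched outside SQL Server. The Python below is a minimal illustration only, not the SQL Server tooling: the rows, the `prior_claim` and `age_band` fields, and the two toy models (a majority-class baseline and a one-level decision tree) are all hypothetical, chosen just to show two models being trained on one part of the data and scored on the part that was held back.

```python
import random


def holdout_split(rows, test_fraction=0.3, seed=0):
    """Shuffle the rows and hold back a fraction of them for testing."""
    shuffled = rows[:]
    random.Random(seed).shuffle(shuffled)
    test_size = round(len(shuffled) * test_fraction)
    cut = len(shuffled) - test_size
    return shuffled[:cut], shuffled[cut:]


def train_majority(rows):
    """Baseline model: always predict the most common training label."""
    labels = [label for _, label in rows]
    majority = max(sorted(set(labels)), key=labels.count)
    return lambda features: majority


def train_stump(rows, feature):
    """One-level decision tree: the most common label per value of one feature."""
    by_value = {}
    for features, label in rows:
        by_value.setdefault(features[feature], []).append(label)
    table = {
        value: max(sorted(set(labels)), key=labels.count)
        for value, labels in by_value.items()
    }
    fallback = train_majority(rows)
    # Values never seen in training fall back to the majority label.
    return lambda features: table.get(features[feature], fallback(features))


def accuracy(model, rows):
    """Fraction of rows whose label the model predicts correctly."""
    return sum(model(features) == label for features, label in rows) / len(rows)


def compare_models(rows, seed=0):
    """Train both models on the training rows, score them on the held-back rows."""
    train, test = holdout_split(rows, seed=seed)
    models = {
        "baseline": train_majority(train),
        "stump": train_stump(train, "prior_claim"),
    }
    return {name: accuracy(model, test) for name, model in models.items()}


# Hypothetical data: in this made-up set, a prior claim fully determines the label.
rows = (
    [({"prior_claim": "yes", "age_band": "under_30"}, "claim")] * 4
    + [({"prior_claim": "yes", "age_band": "over_30"}, "claim")] * 4
    + [({"prior_claim": "no", "age_band": "under_30"}, "no_claim")] * 6
    + [({"prior_claim": "no", "age_band": "over_30"}, "no_claim")] * 6
)

scores = compare_models(rows)
print(scores)
```

Because neither model ever sees the held-back rows during training, the scores give a fairer picture of how each would behave on new data than accuracy measured on the training rows would.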

Here is a sample data mining report that gets real-time predictions for web request response times.  See real-time data mining report.

Here is a sample data mining report that predicts book titles that library borrowers might take out, based on the titles they have borrowed previously and their age and sex.  The data mining model uses real data from the Wellington Municipal Library.  See Library Suggestion Tool.

In 2003, Richard helped ComputerFleet and their BI partner develop a revised business strategy built around a bespoke data mining application that used Decision Trees to predict the inertia of their leased assets.  See the case study.

Here is just one view of just one algorithm from SQL Server 2005.  This model was created from the clickstream data of RichardLees.com.au.  This particular view is the State Transition view, with the focus on the ClickStreamAnalytics flyer.  It is an example of the traditional, elite use of data mining.  Note: a really interesting and fast-growing area of data mining is “embedded data mining”, where data mining models are embedded into applications that use them to make predictions or suggestions for non-technical staff or customers.
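To give a feel for what a state transition view summarises, and how an application might embed such a model, here is a minimal Python sketch. It only estimates first-order transition probabilities between pages and is not the SQL Server Sequence Clustering algorithm itself. The sessions, the page names, and the `transition_probabilities` and `suggest_next` helpers are hypothetical, made up for illustration.

```python
from collections import Counter, defaultdict


def transition_probabilities(sessions):
    """Estimate P(next page | current page) from ordered page-view sessions."""
    counts = defaultdict(Counter)
    for pages in sessions:
        # Pair each page with the page viewed immediately after it.
        for current, following in zip(pages, pages[1:]):
            counts[current][following] += 1
    return {
        page: {nxt: n / sum(nexts.values()) for nxt, n in nexts.items()}
        for page, nexts in counts.items()
    }


def suggest_next(probabilities, page):
    """Return the most likely next page, or None if no transition was observed."""
    nexts = probabilities.get(page)
    if not nexts:
        return None
    # Sorting first makes ties resolve alphabetically and deterministically.
    return max(sorted(nexts), key=nexts.get)


# Hypothetical clickstream sessions; these page names are invented.
sessions = [
    ["Home", "DataMining", "ClickStreamFlyer"],
    ["Home", "DataMining", "PerfmonCube"],
    ["Home", "DataMining", "ClickStreamFlyer"],
    ["Home", "Downloads"],
]

probs = transition_probabilities(sessions)
print(probs["DataMining"])
print(suggest_next(probs, "DataMining"))
```

An application that calls something like `suggest_next` on each page view is "embedded" in the sense described above: the person seeing the suggestion never works with the model directly.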