Data Mining Practice

"Data mining" uses computing power and advanced analytical techniques to discover useful relationships in large datasets.  At Friedrich, Klatt and Associates, we can mine the data that you already have, or the data that we've collected for you. Most importantly, we can show you how to understand your data and use it to your best advantage. 

Our custom software solutions transform data about your business into information about your business.  In turn, our applications allow you to leverage your knowledge and experience in your field, and apply it to the information that we provide for you.  By complementing your strengths, data mining enables you to make better business decisions.

Now is an excellent time to begin mining your data.  Desktop computers have become dramatically more powerful, data has become better and more plentiful, and the software that we are able to create has become more user-friendly and easier to learn.

Our data mining services bring together leading tools, a number of disciplines, and cutting-edge analysis techniques to create strategic business intelligence from corporate data. 

Although we can apply these services to any data set across a broad range of industries and analysis topics, we have pre-defined interfaces for a number of common applications that can serve as data sources. 


Data Mining Tools

Our data mining practice incorporates a number of leading tools, including:

Statistics

  • SPSS Base Statistics
  • SPSS Advanced Statistics
  • SAS Enterprise Miner

Time Series

  • SPSS Trends

Data Exploration

  • SPSS Diamond
  • SAS JMP

Cluster

  • SPSS AnswerTree
  • Cognos Scenario

Data Cube

  • MicroStrategy Agent
  • MicroStrategy Web
  • Cognos PowerPlay
  • Business Objects
  • InfoAdvantage
  • Microsoft OLAP services, including Excel pivot table services

Neural Net

  • Cognos 4Thought

Report Writer

  • Seagate Crystal Reports
  • Microsoft Access
  • Cognos Impromptu
  • Wall Data Arpeggio

Spreadsheet

  • Microsoft Excel
  • Lotus 1-2-3

Geographical

  • ESRI ArcView
  • ESRI MapObjects
  • Visio Maps

Database

  • Microsoft Access
  • Leading database servers, including Microsoft SQL Server, Oracle, Sybase Adaptive Server, Sybase SQL Anywhere, NCR Teradata, IBM DB2, and Informix

Compiler

  • Delphi
  • C++, Java
  • Visual Basic, VBA

Process Automation

  • Our Process Automation Suite

Data Mining Disciplines

Our data mining practice combines tools, techniques, and know-how from many disciplines, including: 

  • Database administrator.  The database administrator skills required are those normally associated with a system DBA. 
  • Statistician.  Data mining requires the skills of the statistician and mathematician.  Modern graphical statistical tools have eased this work, but have not come close to eliminating it.  See our Statistical Analysis Practice.
  • Business analyst.  These are the kinds of skills we associate with experienced business managers or consultants with MBA degrees or similar training. 
  • Accountant.  The data that we analyze often includes financial elements, or data from accounting systems.  When this is true, it is important to understand the structure of this data, charts of accounts, and the like.  These are the traditional concerns of the management accounting discipline and of the CPA.
  • Data visualization.  In our experience, creating meaningful, intuitive, aesthetically pleasing displays of data and analysis results is immensely important in realizing the benefits of data mining.  See our Data Visualization Practice.
  • Programming.  One must combine advanced technical skills with the other skills here to create a complete, automated solution tailored to your organization.
  • Process automation.  Our focus is to create repeatable data mining processes, so that, once you have identified opportunities, you can continue taking advantage of them.  (This is an area where traditional management consulting practices fall short.)  Our process automation tools and techniques allow us to do this.
  • Data warehousing.  We are really talking about building decision support systems, and such systems usually benefit from building on a data warehouse that pulls together data from a variety of operational systems into a central historical data store that supports management decisions.  We take advantage of the techniques that this rapidly evolving discipline has developed and proven.
  • Report writing.  We use report writing tools such as Crystal Reports, Microsoft Access, and others.  See our Reporting Practice.
  • Data querying.  Nearly all data mining operations begin by executing queries against source data tables, usually using SQL.  One must be expert at the SQL syntax supported by a variety of database drivers and servers to take full advantage of their feature set and performance.
  • Multi-dimensional data analysis.  This is often referred to as On-Line Analytical Processing (OLAP), to distinguish it from the On-Line Transaction Processing (OLTP) of operational systems.
  • Project management.  The work across all of these disciplines must be efficiently coordinated to complete data mining projects. 
  • Application integration.  Includes automating integration with e-mail systems and automating mail merges with word processing programs such as Microsoft Word.
  • Data exploration specialist.  Includes using data pattern exploration techniques such as neural nets, genetic algorithms, memory-based reasoning, cluster analysis, and decision trees.

Data Mining Techniques

We incorporate a number of leading analytical techniques into our data mining practice, including (in alphabetical order):

Charts
We have various ways of organizing and presenting data visually rather than numerically.  A particularly rich tool is the charting capability of the SPSS Interactive Graphics program.  With it we can create a 3-D scatterplot of cases in which the color, style, and size of each point represent different attributes of the data, and which can be viewed in different ways.  So, for example, we could create a graph representing six different attributes of a company's customers (age, income, gender, whether they are repeat customers, etc.) and then arrange to view them by region. See our Data Visualization Practice.
Correlations
Correlations tell us the degree to which two variables are related, and whether they tend to move in the same or in opposite directions.  One uses correlations to find out whether the variables in a given set vary systematically relative to one another.
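As a minimal sketch of the idea, the Pearson coefficient below measures how closely two variables move together on a scale from -1 to 1.  The function name and the sample figures (advertising spend against units sold) are illustrative assumptions, not client data.

```python
import math

def pearson_r(xs, ys):
    """Pearson correlation coefficient of two equal-length sequences."""
    if len(xs) != len(ys) or len(xs) < 2:
        raise ValueError("need two sequences of equal length, at least 2")
    mean_x = sum(xs) / len(xs)
    mean_y = sum(ys) / len(ys)
    # Co-variation of the two variables around their means
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    if var_x == 0 or var_y == 0:
        raise ValueError("correlation is undefined for a constant variable")
    return cov / math.sqrt(var_x * var_y)

# Hypothetical data: advertising spend vs. units sold
ad_spend = [10, 20, 30, 40, 50]
units_sold = [12, 25, 31, 45, 48]
r = pearson_r(ad_spend, units_sold)
```

A value of r near 1 would suggest that the two variables rise together; a value near -1 would suggest that one falls as the other rises.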
Decision Tree/Cluster
Decision trees help to analyze a problem (or data) by breaking it down into a series of mutually exclusive alternatives (clusters or categories), each of which may be subdivided further.  They are used to identify people or entities likely to belong to a particular class, to assign cases to one of several categories (e.g., high, medium, and low), and to create rules and use them to predict unknown attributes of new cases.   Applications include direct mail, market analysis, credit scoring, quality control, and policy analysis.  One of the steps in creating a decision tree may be cluster analysis, in which one segments a heterogeneous population into more homogeneous subgroups or clusters.
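The core step in growing a decision tree is choosing where to split the data.  The sketch below shows one common way to do this for a single numeric attribute, by minimizing Gini impurity; the function names and the income/response data are hypothetical, and commercial tools may use other criteria.

```python
def gini(labels):
    """Gini impurity: 0.0 when every case in the group has the same class."""
    n = len(labels)
    if n == 0:
        return 0.0
    counts = {}
    for label in labels:
        counts[label] = counts.get(label, 0) + 1
    return 1.0 - sum((c / n) ** 2 for c in counts.values())

def best_split(values, labels):
    """Return (threshold, weighted_impurity) of the best 'value <= threshold' split."""
    pairs = sorted(zip(values, labels))
    n = len(pairs)
    best = (None, gini(labels))  # no split at all is the baseline
    for i in range(1, n):
        if pairs[i - 1][0] == pairs[i][0]:
            continue  # cannot split between identical values
        threshold = (pairs[i - 1][0] + pairs[i][0]) / 2
        left = [label for _, label in pairs[:i]]
        right = [label for _, label in pairs[i:]]
        score = (len(left) * gini(left) + len(right) * gini(right)) / n
        if score < best[1]:
            best = (threshold, score)
    return best

# Hypothetical direct-mail data: household income (thousands) and response
incomes = [18, 22, 25, 40, 52, 60]
responded = ["no", "no", "no", "yes", "yes", "yes"]
threshold, impurity = best_split(incomes, responded)
```

A full tree would apply the same search recursively to each resulting subgroup, across all available attributes.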
Expert Systems
Expert systems solve problems or make decisions in a particular field by systematizing the knowledge of and rules defined by experts in that field.  This systematized knowledge is combined with an inference engine so that non-experts can apply the experts' knowledge to new  problems and decisions.  
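The combination of rules and an inference engine can be sketched with simple forward chaining, as below.  The rule format and the loan-approval rules are hypothetical examples chosen for illustration; real expert system shells are considerably richer.

```python
def forward_chain(facts, rules):
    """Apply if-then rules repeatedly until no new conclusions can be drawn.

    Each rule is (conditions, conclusion): when every condition is known,
    the conclusion becomes known too.
    """
    known = set(facts)
    changed = True
    while changed:
        changed = False
        for conditions, conclusion in rules:
            if conclusion not in known and set(conditions) <= known:
                known.add(conclusion)
                changed = True
    return known

# Hypothetical rules captured from a lending expert
rules = [
    (("income_verified", "low_debt"), "good_credit"),
    (("good_credit", "long_term_customer"), "approve_loan"),
]
facts = {"income_verified", "low_debt", "long_term_customer"}
conclusions = forward_chain(facts, rules)
```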
Genetic (or Evolutionary) Algorithms
Genetic algorithms find solutions through an approach modelled on the long-term process of evolution.  The idea is to develop information discovery systems that can organize and adapt themselves based only on exposure to the environment (i.e., to various inputs) as a feedback loop.
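A minimal sketch of this feedback loop follows: candidate solutions are bit strings, the fitter half survives each generation, and new candidates are bred by crossover and mutation.  The function name, the parameter defaults, and the toy fitness function (count of 1 bits) are all assumptions made for illustration.

```python
import random

def evolve(fitness, genome_len=20, pop_size=30, generations=40,
           mutation_rate=0.02, seed=0):
    """Evolve bit-string genomes toward higher fitness; returns the best one."""
    rng = random.Random(seed)
    pop = [[rng.randint(0, 1) for _ in range(genome_len)]
           for _ in range(pop_size)]
    for _ in range(generations):
        # Selection: the fitter half survives unchanged
        pop.sort(key=fitness, reverse=True)
        survivors = pop[: pop_size // 2]
        children = []
        while len(survivors) + len(children) < pop_size:
            mom, dad = rng.sample(survivors, 2)
            cut = rng.randrange(1, genome_len)      # single-point crossover
            child = mom[:cut] + dad[cut:]
            # Mutation: occasionally flip a bit
            child = [bit ^ 1 if rng.random() < mutation_rate else bit
                     for bit in child]
            children.append(child)
        pop = survivors + children
    return max(pop, key=fitness)

# Toy problem: the "environment" rewards genomes with more 1 bits
best = evolve(fitness=sum)
```

In a business setting the fitness function would instead score each candidate against real data, for example the profit of a proposed schedule.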
Geographical Information Systems (GIS)
Geographical information systems can be used to represent and analyze data when at least one of the attributes is a spatial dimension (that is, the data has longitude and latitude, or one can derive such attributes).  A GIS typically shows a map, on which the other attributes of the dataset are represented by colors or shapes.  For example, a large retailer might represent all its stores as circles on a map of the United States.  The size of each circle could indicate the volume of sales at that store, and its color the type of store format. See our Geographic Analysis Practice.
Linear Programming
Linear programming (or LP) is used to find the values of a set of decision variables that maximize or minimize a linear objective function, subject to linear constraints.  It is useful for analyzing problems where we can assume that the most important relationships are linear in nature (for example, where we could assume that it costs twice as much to ship two bundles of a commodity as to ship one). See our Optimization Practice.
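For a problem with only two decision variables, the idea can be sketched by checking every corner of the feasible region, since the best value of a bounded linear program is found at a corner.  This brute-force helper and the product-mix figures are illustrative assumptions; real problems would call for a proper solver such as the simplex method.

```python
from itertools import combinations

def solve_lp_2d(c, constraints, eps=1e-9):
    """Maximize c[0]*x + c[1]*y subject to a*x + b*y <= r for each (a, b, r).

    Enumerates intersections of constraint boundaries and keeps the best
    feasible one. Suitable only for tiny, bounded two-variable problems.
    """
    best_point, best_value = None, float("-inf")
    for (a1, b1, r1), (a2, b2, r2) in combinations(constraints, 2):
        det = a1 * b2 - a2 * b1
        if abs(det) < eps:
            continue  # parallel boundaries never intersect
        # Cramer's rule for the intersection of the two boundary lines
        x = (r1 * b2 - r2 * b1) / det
        y = (a1 * r2 - a2 * r1) / det
        if all(a * x + b * y <= r + eps for a, b, r in constraints):
            value = c[0] * x + c[1] * y
            if value > best_value:
                best_point, best_value = (x, y), value
    return best_point, best_value

# Hypothetical product mix: maximize profit 3x + 5y under capacity limits
constraints = [
    (1, 0, 4),    # x <= 4
    (0, 2, 12),   # 2y <= 12
    (3, 2, 18),   # 3x + 2y <= 18
    (-1, 0, 0),   # x >= 0
    (0, -1, 0),   # y >= 0
]
point, profit = solve_lp_2d((3, 5), constraints)
```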
Linear Regression
Linear regression finds the line that best fits a dataset plotted on a graph.  The line describes the way in which a dependent variable is related to an independent variable. See our Optimization Practice.
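A minimal sketch of fitting such a line by ordinary least squares, for one independent variable, is shown below.  The function name and the sample numbers are hypothetical.

```python
def fit_line(xs, ys):
    """Least-squares slope and intercept for y = slope * x + intercept."""
    if len(xs) != len(ys) or len(xs) < 2:
        raise ValueError("need two sequences of equal length, at least 2")
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    if sxx == 0:
        raise ValueError("independent variable must not be constant")
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return slope, intercept

# Hypothetical data that lies exactly on the line y = 2x + 1
slope, intercept = fit_line([1, 2, 3, 4], [3, 5, 7, 9])
predicted = slope * 5 + intercept  # estimate the dependent variable at x = 5
```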
Machine-Generated Data Rules
Methods and tools through which rules for handling data are generated by a computer program (rather than by a human analyst) from example cases.  Includes rule induction, case-based reasoning, neural computing, and intelligent agents.
Memory-Based Reasoning
Memory-Based Reasoning, or MBR (closely related to Case-Based Reasoning), tries to classify new data points by finding their nearest neighbors in historical data.  This is a geometrically based approach, and it lends itself to simple visual representations of the model.
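The nearest-neighbor idea can be sketched in a few lines: find the k most similar historical cases and let them vote.  The function name, the choice of straight-line distance, and the customer records below are assumptions for illustration; in practice the attributes would usually be rescaled first so that no single attribute dominates the distance.

```python
import math
from collections import Counter

def knn_classify(history, query, k=3):
    """Classify `query` by majority vote of its k nearest historical cases.

    `history` is a list of (attributes, label) pairs, where attributes is a
    tuple of numbers with the same length as `query`.
    """
    nearest = sorted(history, key=lambda rec: math.dist(rec[0], query))[:k]
    votes = Counter(label for _, label in nearest)
    return votes.most_common(1)[0][0]

# Hypothetical history: (age, income in thousands) -> outcome
history = [
    ((25, 30), "non-buyer"), ((30, 35), "non-buyer"), ((28, 32), "non-buyer"),
    ((50, 90), "buyer"), ((55, 95), "buyer"), ((48, 85), "buyer"),
]
label = knn_classify(history, (52, 88))
```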
Multi-dimensional Pivot Tables/Data Cube Analysis
A way of viewing and analyzing data by multiple dimensions.  It allows users to perform complex queries without writing SQL, using simple drill-down and drag-and-drop operations with the mouse.  It is often a key component of On-Line Analytical Processing (OLAP) decision support database systems.
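Behind the drag-and-drop interface, a pivot table is essentially a total of one measure grouped by two or more dimensions.  The sketch below shows a two-dimensional case; the function name, field names, and sales figures are hypothetical.

```python
from collections import defaultdict

def pivot(rows, row_dim, col_dim, measure):
    """Sum `measure` for every combination of `row_dim` and `col_dim`."""
    table = defaultdict(lambda: defaultdict(float))
    for row in rows:
        table[row[row_dim]][row[col_dim]] += row[measure]
    # Convert to plain dicts for easier display and comparison
    return {r: dict(cols) for r, cols in table.items()}

# Hypothetical sales records
sales = [
    {"region": "East", "quarter": "Q1", "amount": 100.0},
    {"region": "East", "quarter": "Q1", "amount": 50.0},
    {"region": "East", "quarter": "Q2", "amount": 80.0},
    {"region": "West", "quarter": "Q1", "amount": 60.0},
]
by_region_quarter = pivot(sales, "region", "quarter", "amount")
```

Swapping the two dimension arguments gives the transposed view, which is the programmatic equivalent of dragging a column heading onto the rows.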
Neural Net
A type of artificial-intelligence system that finds solutions by loose analogy with the way the human brain works.  A neural network is designed as an interconnected system of processing elements, each of which receives inputs and, after various adjustments, delivers appropriate outputs.  Used in such areas as pattern recognition and speech synthesis.  See our Optimization Practice.
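The behavior of a single processing element can be sketched with a perceptron, the simplest kind of artificial neuron: it weighs its inputs, produces an output, and adjusts its weights whenever the output is wrong.  The function names, the learning settings, and the logical-AND training set are assumptions for illustration; practical networks connect many such elements in layers.

```python
def predict(weights, bias, inputs):
    """Fire (1) if the weighted sum of the inputs exceeds zero, else 0."""
    activation = sum(w * x for w, x in zip(weights, inputs)) + bias
    return 1 if activation > 0 else 0

def train_perceptron(samples, epochs=20, learning_rate=1.0):
    """Learn weights from (inputs, target) examples by error correction."""
    weights = [0.0] * len(samples[0][0])
    bias = 0.0
    for _ in range(epochs):
        for inputs, target in samples:
            error = target - predict(weights, bias, inputs)
            # Adjust each weight in proportion to its input and the error
            weights = [w + learning_rate * error * x
                       for w, x in zip(weights, inputs)]
            bias += learning_rate * error
    return weights, bias

# Training examples for logical AND: output 1 only when both inputs are 1
and_samples = [((0, 0), 0), ((0, 1), 0), ((1, 0), 0), ((1, 1), 1)]
weights, bias = train_perceptron(and_samples)
```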
Sampling
Sampling encompasses the methods used for selecting a particular subset of observations or data to be studied from the "population" of all possible observations or data.  For example, a study of retired men might draw a random sample of 1,000 men over the age of 65 from each state. See our Statistical Analysis Practice.
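Drawing a fixed-size random sample from each state, as in the example above, is a form of stratified sampling.  A minimal sketch follows; the function name, the record layout, and the tiny population are hypothetical.

```python
import random
from collections import defaultdict

def stratified_sample(records, stratum_key, per_stratum, seed=None):
    """Draw up to `per_stratum` random records from each stratum."""
    rng = random.Random(seed)
    groups = defaultdict(list)
    for record in records:
        groups[record[stratum_key]].append(record)
    sample = {}
    for stratum, members in groups.items():
        # A stratum smaller than the requested size is taken whole
        k = min(per_stratum, len(members))
        sample[stratum] = rng.sample(members, k)
    return sample

# Hypothetical population: five retirees in each of three states
population = [{"state": s, "id": i}
              for s in ("OH", "PA", "NY") for i in range(5)]
sample = stratified_sample(population, "state", per_stratum=2, seed=42)
```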
Time Series Analysis
Time series analysis uses data obtained by measuring a single variable regularly over a specific period of time in the past, usually to forecast and plan for the future.    An economist, for example, might forecast the growth of household income five years into the future by projecting from trends of the past 15 years and use this information, in turn, to predict average household expenditures on entertainment.
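One of the simplest ways to project a past trend forward is the "drift" method, which extends the average historical change per period into the future.  The sketch below is an illustration under that assumption, with a hypothetical function name and income figures; dedicated tools would also account for seasonality and uncertainty.

```python
def drift_forecast(series, horizon):
    """Project `horizon` future values using the average past change per period."""
    if len(series) < 2:
        raise ValueError("need at least two observations")
    # Average change per period between the first and last observation
    slope = (series[-1] - series[0]) / (len(series) - 1)
    return [series[-1] + slope * step for step in range(1, horizon + 1)]

# Hypothetical household income index over four past years
income = [100, 110, 120, 130]
forecast = drift_forecast(income, horizon=2)
```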