Data Mining Practice
"Data mining" uses computing power and advanced analytical
techniques to discover useful relationships in large datasets. At
Friedrich, Klatt and Associates, we can mine the data that you already have, or
the data that we've collected for you. Most importantly, we can show you how to
understand your data and use it to your best advantage.

Our custom software solutions transform data about your business
into information about your business. In turn, our applications
allow you to leverage your knowledge and experience in your field, and apply it
to the information that we provide for you. By complementing your
strengths, data mining enables you to make better business decisions.

More than ever before, now is the perfect time to begin mining your data.
Desktop computers have become dramatically more powerful, data has
become better and more plentiful, and the software that we are able to create
has become more user-friendly and easier to learn.

Our data mining services bring together leading tools, a number of
disciplines, and cutting-edge analysis techniques to create strategic
business intelligence from corporate data.
Although we can apply these services to any data set across a broad range of
industries and analysis topics, we have pre-defined interfaces for a number of
common applications that can serve as data sources.
Our data mining practice incorporates a number of leading tools, including:
Statistics
- SPSS Base Statistics
- SPSS Advanced Statistics
- SAS Enterprise Miner
Time Series
Data Exploration
Cluster
- SPSS AnswerTree
- Cognos Scenario
Data Cube
- MicroStrategy Agent
- MicroStrategy Web
- Cognos PowerPlay
- Business Objects
- InfoAdvantage
- Microsoft OLAP services, including Excel pivot table services
Neural Net
Report Writer
- Seagate Crystal Reports
- Microsoft Access
- Cognos Impromptu
- Wall Data Arpeggio
Spreadsheet
- Microsoft Excel
- Lotus 1-2-3
Geographical
- ESRI ArcView
- ESRI MapObjects
- Visio Maps
Database
- Microsoft Access
- Leading database servers, including Microsoft SQL Server, Oracle, Sybase
Adaptive Server, Sybase SQL Anywhere, NCR Teradata, IBM DB2, and Informix
Compiler
- Delphi
- C++, Java
- Visual Basic, VBA
Process Automation
- Our Process Automation Suite
Our data mining practice combines tools, techniques, and know-how from many
disciplines, including:
- Database administrator. The database administrator
skills required are those normally associated with a system DBA.
- Statistician. Data mining requires the skills of the
statistician and mathematician. Modern graphical statistical tools have
eased this work, but have not come close to eliminating it. See our
Statistical Analysis Practice.
- Business analyst. These are the kinds of skills we
associate with experienced business managers or consultants with MBA degrees or
similar training.
- Accountant. The data that we analyze often includes
financial elements, or data from accounting systems. When this is true,
it is important to understand the structure of this data, charts of accounts,
and the like. These are the traditional concerns of the management
accounting discipline and of the CPA.
- Data visualization. In our experience, creating
meaningful, intuitive, aesthetically pleasing displays of data and analysis
results is immensely important in realizing the benefits of data mining.
See our Data Visualization Practice.
- Programming. One must combine advanced technical
skills with the other skills here to create a complete, automated solution
tailored to your organization.
- Process automation. Our focus is to create
repeatable data mining processes, so that, once you have identified
opportunities, you can continue taking advantage of them. (This is an
area where traditional management consulting practices fall short.) Our
process automation tools and techniques allow us to do this.
- Data warehousing. We are really talking about
building decision support systems, and such systems usually benefit from
building on a data warehouse that pulls together data from a variety of
operational systems into a central historical data store that supports
management decisions. We take advantage of the techniques that this
rapidly evolving discipline has developed and proven.
- Report writing. We use report writing tools such as
Crystal Reports, Microsoft Access, and others. See our Reporting Practice.
- Data querying. Nearly all data mining operations
begin by executing queries against source data tables, usually using SQL.
One must be expert at the SQL syntax supported by a variety of database drivers
and servers to take full advantage of their feature set and performance.
- Multi-dimensional data analysis. This is often referred to
as On-Line Analytical Processing (OLAP), to distinguish it from the On-Line
Transaction Processing (OLTP) of operational systems.
- Project management. The work across all of these
disciplines must be efficiently coordinated to complete data mining
projects.
- Application integration. Includes automating
integration with e-mail systems and automating mail merges with word processing
programs such as Microsoft Word.
- Data exploration specialist. Includes using data pattern exploration
techniques such as neural nets, genetic algorithms, memory-based reasoning,
cluster analysis, and decision trees.
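As a rough illustration of the data querying discipline in the list above, the following minimal sketch runs a SQL aggregation from Python using the standard-library sqlite3 module and an in-memory database. The `orders` table, its columns, the function name, and the sample rows are all hypothetical and chosen only for illustration; a real project would query your own source tables through whichever database driver applies.

```python
import sqlite3


def sales_by_region(rows):
    """Load hypothetical (region, amount) rows into an in-memory table
    and return total sales per region, largest total first."""
    conn = sqlite3.connect(":memory:")
    try:
        conn.execute("CREATE TABLE orders (region TEXT, amount REAL)")
        conn.executemany(
            "INSERT INTO orders (region, amount) VALUES (?, ?)", rows
        )
        cursor = conn.execute(
            "SELECT region, SUM(amount) AS total "
            "FROM orders GROUP BY region ORDER BY total DESC"
        )
        return cursor.fetchall()
    finally:
        conn.close()


# Hypothetical sample data, for illustration only.
sample_orders = [("East", 120.0), ("West", 75.5), ("East", 30.0), ("West", 10.0)]

for region, total in sales_by_region(sample_orders):
    print(region, total)
```

The same GROUP BY pattern is typically the first step before handing summarized data to a statistics, reporting, or charting tool.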
We incorporate a number of leading analytical techniques into our data
mining practice, including (in alphabetical order):
- Charts
- We have various ways of organizing and presenting data visually rather than
numerically. A particularly rich tool is the charting capability of the
SPSS interactive Graphics program. With it we can create a 3-D
scatterplot of cases, each of which has a different color, style, and size,
representing different attributes of the data, and all of which can be viewed
in different ways. So, for example, we could create a graph representing
six different attributes of a company's customers (age, income, gender, whether
they repeat, etc.) and then arrange to view them by region. See our
Data Visualization Practice.
- Correlations
- Correlations tell us the degree to which two variables are related.
One uses correlations to find out whether a given set of variables varies
systematically relative to one another.
- Decision Tree/Cluster
- Decision trees help to analyze a problem (or data) by breaking it down into
a series of mutually exclusive alternatives (clusters or categories), each of
which may be subdivided further. They are used to identify people or
entities likely to belong to a particular class, to assign cases to one of
several categories (e.g., high, medium, and low), and to create rules and use
them to predict unknown attributes of new cases. Applications include
direct mail, market analysis, credit scoring, quality control, and policy
analysis. One of the steps in creating a decision tree may be cluster
analysis, in which one segments a heterogeneous population into more
homogeneous subgroups or clusters.
- Expert Systems
- Expert systems solve problems or make decisions in a particular field by
systematizing the knowledge of and rules defined by experts in that
field. This systematized knowledge is combined with an inference engine
so that non-experts can apply the experts' knowledge to new problems and
decisions.
- Genetic (or Evolutionary) Algorithms
- Genetic algorithms find solutions through an approach modelled on the
long-term process of evolution. The idea is to develop information
discovery systems that can organize and adapt themselves based only on exposure
to the environment (i.e., to various inputs) as a feedback loop.
- Geographical Information Systems (GIS)
- A GIS can be used to represent and analyze data when at least one of
the attributes is a spatial dimension (that is, the data has longitude and
latitude, or one can derive such attributes). A GIS typically
shows a map, on which the other attributes of the dataset are represented by
colors or shapes. For example, a large retailer might represent all its
stores as circles on a map of the United States. The size of each circle
could indicate volume of sales at that store and its color the type of store
format. See our Geographic Analysis Practice.
- Linear Programming
- Linear programming (or LP) is used to optimize a linear function of one or
more decision variables, subject to linear constraints. It is useful for
analyzing problems where we can
assume that the most important relationship is linear in nature (for example,
where we could assume that it costs twice as much to ship two bundles of a
commodity as to ship one). See our Optimization
Practice.
- Linear Regression
- Linear regression finds the line that best fits a dataset plotted on a
graph. The line describes the way in which a dependent variable is
related to an independent variable. See our Optimization Practice.
- Machine-Generated Data Rules
- Methods and tools through which rules for handling data are generated by a
computer program (rather than by a human analyst) from example cases.
Includes rule induction, case-based reasoning, neural computing, and
intelligent agents.
- Memory-Based Reasoning
- Memory-Based Reasoning, or MBR (also known as Case-Based Reasoning), tries
to classify new data points by finding their nearest neighbors in historical
data. This becomes a geometrically based approach, and lends itself to
simple visual representations of the model.
- Multi-dimensional Pivot Tables/Data Cube Analysis
- A way of viewing and analyzing data by multiple dimensions. It allows
users to perform complex queries without using SQL, using simple drill-down
and drag-and-drop operations with the mouse. Often a key component of
On-Line Analytical Processing (OLAP) decision support database systems.
- Neural Net
- A type of artificial-intelligence system that finds solutions by
loose analogy with the way the human brain works. A neural network is
designed as an interconnected system of processing elements each of which
receives inputs and, after various adjustments, delivers appropriate
outputs. Used in such areas as pattern recognition and speech
synthesis. See our Optimization Practice.
- Sampling
- Sampling encompasses the methods used for selecting a particular subset of
observations or data to be studied from the "population" of all
possible observations or data. For example, a study of retired men might
take a random sample of 1,000 men over the age of 65 from each state.
See our Statistical Analysis Practice.
- Time Series Analysis
- Time series analysis uses data obtained by measuring a single variable
regularly over a specific period of time in the past, usually to forecast and
plan for the future. An economist, for example, might forecast the
growth of household income five years into the future by projecting from trends
of the past 15 years and use this information, in turn, to predict average
household expenditures on entertainment.
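To make the Correlations entry above more concrete, here is a minimal sketch of the Pearson correlation coefficient, one common measure of how strongly two variables move together. It is written in plain Python. The function name and the sample figures are hypothetical, and a statistics package would normally do this work in practice.

```python
import math


def pearson_correlation(xs, ys):
    """Return the Pearson correlation coefficient of two equal-length series.

    The result lies between -1 (perfect inverse relationship) and
    +1 (perfect direct relationship); values near 0 suggest no linear
    relationship.
    """
    if len(xs) != len(ys) or len(xs) < 2:
        raise ValueError("need two series of equal length, at least 2 points")
    mean_x = sum(xs) / len(xs)
    mean_y = sum(ys) / len(ys)
    # Sum of products of deviations, and sums of squared deviations.
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    syy = sum((y - mean_y) ** 2 for y in ys)
    if sxx == 0 or syy == 0:
        raise ValueError("correlation is undefined for a constant series")
    return sxy / math.sqrt(sxx * syy)


# Hypothetical figures: advertising spend versus units sold.
ad_spend = [10, 20, 30, 40, 50]
units_sold = [12, 25, 31, 48, 52]
print(round(pearson_correlation(ad_spend, units_sold), 3))
```

A coefficient close to +1 or -1 indicates that the two variables vary systematically together. It does not by itself show that one causes the other.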
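The Linear Regression entry above can be sketched with an ordinary least-squares fit of a single independent variable. This is a minimal illustration in plain Python, not a description of any particular tool's method. The function names and the sample data are hypothetical.

```python
def fit_line(xs, ys):
    """Fit y = slope * x + intercept by ordinary least squares.

    Returns (slope, intercept) for the line that minimizes the sum of
    squared vertical distances to the data points.
    """
    if len(xs) != len(ys) or len(xs) < 2:
        raise ValueError("need two series of equal length, at least 2 points")
    mean_x = sum(xs) / len(xs)
    mean_y = sum(ys) / len(ys)
    sxx = sum((x - mean_x) ** 2 for x in xs)
    if sxx == 0:
        raise ValueError("all x values are identical; slope is undefined")
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return slope, intercept


def predict(slope, intercept, x):
    """Evaluate the fitted line at a new value of the independent variable."""
    return slope * x + intercept


# Hypothetical data: year index versus household income (in thousands).
years = [0, 1, 2, 3, 4]
income = [40.0, 41.5, 43.0, 44.5, 46.0]
slope, intercept = fit_line(years, income)
print(slope, intercept, predict(slope, intercept, 9))
```

Projecting the fitted line beyond the observed range, as in the last line, is also the simplest form of the trend projection described under Time Series Analysis, on the assumption that the past linear trend continues.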
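As an illustration of the Memory-Based Reasoning entry above, the following minimal sketch classifies a new case by majority vote among its nearest neighbors in historical data. It assumes numeric attributes and straight-line (Euclidean) distance. The function name, the attributes, and the sample cases are hypothetical.

```python
import math
from collections import Counter


def classify_by_neighbors(history, point, k=3):
    """Classify `point` by majority vote among its k nearest historical cases.

    `history` is a list of (features, label) pairs, where `features` is a
    tuple of numbers with the same length as `point`.
    """
    if k < 1 or k > len(history):
        raise ValueError("k must be between 1 and the number of cases")
    # Sort historical cases by distance to the new point and keep the k closest.
    nearest = sorted(history, key=lambda case: math.dist(case[0], point))[:k]
    votes = Counter(label for _, label in nearest)
    return votes.most_common(1)[0][0]


# Hypothetical history: (age, income in thousands) -> spending category.
customers = [
    ((25, 30), "low"),
    ((27, 35), "low"),
    ((45, 90), "high"),
    ((50, 95), "high"),
    ((48, 85), "high"),
]

print(classify_by_neighbors(customers, (47, 88)))
print(classify_by_neighbors(customers, (26, 32)))
```

In practice the attributes would usually be rescaled first, so that one with a large numeric range (such as income) does not dominate the distance calculation.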
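The Sampling entry above describes drawing a fixed-size random sample from each state, which is a form of stratified sampling. Below is a minimal sketch of that idea in plain Python. The function name, the record layout, and the tiny sample population are hypothetical.

```python
import random


def stratified_sample(population, stratum_of, per_stratum, seed=None):
    """Draw up to `per_stratum` random items from each stratum.

    `stratum_of` is a function that maps an item to its stratum (for
    example, a person's state). Returns a dict of stratum -> sampled items.
    """
    rng = random.Random(seed)  # seeded generator, so runs can be repeated
    strata = {}
    for item in population:
        strata.setdefault(stratum_of(item), []).append(item)
    sample = {}
    for stratum, members in strata.items():
        size = min(per_stratum, len(members))  # small strata are taken whole
        sample[stratum] = rng.sample(members, size)
    return sample


# Hypothetical population of (person_id, state) records.
retirees = [(i, "OH" if i % 2 else "PA") for i in range(20)]
picked = stratified_sample(retirees, lambda person: person[1], 3, seed=1)
for state, people in picked.items():
    print(state, len(people))
```

Sampling the same number of cases per stratum keeps small groups represented. Results then need to be weighted by stratum size before drawing conclusions about the whole population.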