STATISTICA Enterprise-wide Data Mining System
The popularity of Data Mining methodology is rapidly growing in a wide variety of areas where
specific tools are needed to make sense of ever-increasing amounts of information and to search for
significant patterns and trends in large databases.
Data Mining can be used on all types of data including quality data (see
Predictive Quality Control).
What is Data Mining?
Data Mining is the process of exploring and analysing large quantities of data, using various
methods ranging from graphical exploratory techniques to highly specific methods.
Data Mining methods are used to extract meaningful new information from the data and fall into
several categories, including description and visualisation, classification and clustering, estimation
and prediction.
Applications of data mining include fraud detection, credit card scoring, personal profile
marketing to enhance customer relations, direct marketing, trend analysis, financial
market forecasting, bioinformatics, and product quality control and improvement. Web mining, in
which data from the Web are analysed, helps businesses understand customers' online "click-stream"
behaviour.
Why choose StatSoft and the STATISTICA Data Miner?
StatSoft is one of the world's largest producers of statistics and analytic graphics software and has
been established in this market since 1984. StatSoft is supported by a network of
20 international offices on all continents.
STATISTICA, StatSoft's major product line, enjoys an unprecedented record of recognition amongst
users and reviewers. It has been rated FIRST in EVERY independent comparative review since its release in
1993.
STATISTICA is available in English, Spanish, German, French and many other languages.
The STATISTICA Data Miner is a comprehensive and user-friendly set of complete data mining tools,
designed to enable users to easily and quickly analyse their data in order to uncover hidden trends,
explain known patterns and predict the future. See below for a more detailed description of the STATISTICA
Data Miner.
With over 18 years of experience in the field of data analysis, StatSoft offers not only the expertise
to install and run the STATISTICA Data Miner at your site, but also to train your staff and provide
consultancy services to help you make the most of the program.
Overview of the unique features of the STATISTICA Data Miner
From querying databases to generating final reports and graphs, STATISTICA Data Miner offers
ease of use without sacrificing power or comprehensiveness. STATISTICA Data Miner features a wide
selection of algorithms for classification, prediction, clustering and modelling, as well as an intuitive,
icon-based interface.
STATISTICA Data Miner can work within a client-server architecture in order to offload time-consuming
tasks from less powerful computers to dedicated servers. It also offers options to manage projects over
the Web and work collaboratively across the corridor or across continents.
STATISTICA Data Miner comes with a wide selection of predefined projects and a "point-and-click" user
interface, allowing users to easily build complex data mining projects without programming. The Data Miner
is also fully programmable using the industry-standard STATISTICA Visual Basic. For customers who need a
complete, deployed and ready-to-use solution designed to solve a specific type of problem, StatSoft provides
deployment, on-site training and programming services.
The data mining solutions provided by STATISTICA Data Miner are driven by powerful
procedures from five general data mining "techniques":
- General Slicer/Dicer and Drill-down Explorer
- General Classifier
- General Modeler/Multivariate Explorer
- General Forecaster
- General Neural Networks Explorer
In addition to all of the general statistical and graphical options available in STATISTICA, STATISTICA Data Miner features a number of highly specialised Data Mining Modules, including:
- Feature Selection and Variable Filtering (for very large data sets)
- Mining for Association Rules
- Interactive Drill-Down Explorer
- Generalized Additive Models (GAM)
- General Classification and Regression Trees (GTrees)
- General CHAID (Chi-square Automatic Interaction Detection) Models
- Interactive Trees (C&RT, CHAID)
- Boosted Tree Classifiers and Regression
- Multivariate Adaptive Regression Splines (MARSplines)
- Goodness of Fit Computations
- Support Vector Machines
- Naive Bayes Classifiers
- K-Nearest Neighbour
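To give a feel for what one of the listed techniques does, here is a minimal Python sketch of mining for association rules. It is not STATISTICA code and does not reflect how the STATISTICA module is implemented: the transactions, the function names `support` and `pair_rules`, and the thresholds are all assumptions chosen for illustration. It only considers rules with one item on each side and uses the standard definitions of support (how often the items occur together) and confidence (how often the consequent occurs when the antecedent does).

```python
from itertools import combinations

# Hypothetical market-basket transactions; not taken from any STATISTICA data set.
TRANSACTIONS = [
    {"bread", "milk"},
    {"bread", "butter", "milk"},
    {"bread", "butter"},
    {"milk", "butter"},
    {"bread", "milk", "butter"},
]


def support(itemset, transactions):
    """Fraction of transactions that contain every item in itemset."""
    hits = sum(1 for t in transactions if itemset <= t)
    return hits / len(transactions)


def pair_rules(transactions, min_support=0.4, min_confidence=0.6):
    """Return (antecedent, consequent, support, confidence) for single-item rules."""
    items = sorted(set().union(*transactions))
    rules = []
    for a, b in combinations(items, 2):
        pair_support = support({a, b}, transactions)
        if pair_support < min_support:
            continue
        # Each frequent pair yields two candidate rules: a -> b and b -> a.
        for lhs, rhs in ((a, b), (b, a)):
            confidence = pair_support / support({lhs}, transactions)
            if confidence >= min_confidence:
                rules.append((lhs, rhs, pair_support, confidence))
    return rules


RULES = pair_rules(TRANSACTIONS)
for lhs, rhs, sup, conf in RULES:
    print(f"{lhs} -> {rhs}: support={sup:.2f}, confidence={conf:.2f}")
```

Raising `min_support` or `min_confidence` shortens the list of rules. Full rule-mining tools also handle larger itemsets and very large transaction tables, which this sketch does not attempt.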
STATISTICA Text Miner
STATISTICA Text Miner is an optional extension of STATISTICA Data Miner,
ideal for translating unstructured text data into meaningful, valuable clusters of decision-making "gold".
As most users familiar with data mining already know, real-world data come in a variety of forms and are not
always organized or ready to analyze. STATISTICA Text Miner digs for the underlying information that is not
readily apparent in traditional structured data.
STATISTICA Text Miner was specifically designed
as a general and open-architecture tool for mining unstructured information. The feature extraction/selection
and other analytic tools available in STATISTICA Text Miner are not only applicable to text documents
or Web pages, but can also be used to index, classify, cluster, or otherwise include in your analyses
unstructured information such as (pre-processed) bitmaps, sound files, etc.
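The general idea of extracting features from unstructured text can be sketched in a few lines of Python. This is an illustration under stated assumptions, not STATISTICA Text Miner's method or API: the sample documents, the short stop-word list, and the helper names `term_counts` and `cosine_similarity` are all hypothetical. Each document is reduced to word counts, and two documents are compared by the cosine of the angle between their count vectors, a common basis for clustering or classifying text.

```python
import math
import re
from collections import Counter

# Hypothetical free-text notes, e.g. short insurance-claim narratives.
DOCUMENTS = [
    "Water damage to kitchen floor after pipe burst",
    "Pipe burst caused water damage in the basement",
    "Rear bumper damaged in car park collision",
]

# A tiny illustrative stop-word list; real tools use much longer ones.
STOP_WORDS = {"to", "in", "the", "after", "a", "of"}


def term_counts(text):
    """Lower-case the text, split it into words and count the non-stop words."""
    words = re.findall(r"[a-z]+", text.lower())
    return Counter(w for w in words if w not in STOP_WORDS)


def cosine_similarity(a, b):
    """Cosine of the angle between two sparse term-count vectors."""
    dot = sum(a[term] * b[term] for term in a.keys() & b.keys())
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


VECTORS = [term_counts(doc) for doc in DOCUMENTS]
SIM_01 = cosine_similarity(VECTORS[0], VECTORS[1])
SIM_02 = cosine_similarity(VECTORS[0], VECTORS[2])
print(f"claims 0 and 1: {SIM_01:.2f}")
print(f"claims 0 and 2: {SIM_02:.2f}")
```

The two water-damage narratives share most of their terms and so score as more similar to each other than to the collision narrative. Because the sketch does no stemming, "damaged" and "damage" count as different words; production text-mining tools usually normalise such variants.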
How can I use STATISTICA Text Miner?
- Analyze the contents of Web pages. For example, users can automatically process and
summarize all Web pages of particular companies, message boards, etc.
- Include unstructured notes in predictive data mining projects. For example, users may
include responses to open-ended interview questions, patients' own descriptions of medical symptoms, etc.
in data mining projects involving the clustering of patients and symptoms.
- Analyze large document repositories. For example, users may analyze repositories of
documents such as narratives of insurance claims, etc., to include such information in fraud detection projects.