Cooperative Intelligent Data Analysis: an application to diabetic
patients management
Riccardo Bellazzi, Cristiana Larizza, and Alberto Riva
This paper outlines methodologies for performing
distributed intelligent data analysis in a telemedicine system for
the management of diabetic patients. We present a decision-support system
architecture based on two modules, a Patient Unit and a Medical Unit,
connected by telecommunication services. We outline how the two
modules can cooperatively interpret the data by combining temporal
abstraction techniques with time series analysis.
Ptah: A System for Supporting Nosocomial Infection Therapy
Marko Bohanec, Miran Rems, Smiljana Slavec, and Bozo Urh
We present Ptah, a system for supporting medical doctors in making
decisions related to the therapy of nosocomial (hospital-acquired)
infections. The system is based on a chronologically organized database
of infections and therapies. It facilitates four types of analyses
related to the effectiveness of antibiotics and resistance of bacteria
to antibiotics. The underlying methods construct time series of
resistance vectors from the database, and present their results
graphically.
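
The idea of building time series of resistance vectors from a chronological database can be illustrated with a minimal sketch. The abstract does not define the record layout or the vector itself, so everything here is an assumption: each record is a hypothetical `(period, antibiotic, is_resistant)` tuple, and the "resistance vector" for a period is taken to be the fraction of resistant isolates per antibiotic. The function name `resistance_series` and the sample data are invented for illustration.

```python
from collections import defaultdict


def resistance_series(records):
    """Build a chronological list of (period, {antibiotic: resistant fraction}).

    `records` is an iterable of (period, antibiotic, is_resistant) tuples.
    This layout is an assumption; the Ptah abstract does not specify one.
    """
    # counts[period][antibiotic] = [resistant isolates, tested isolates]
    counts = defaultdict(lambda: defaultdict(lambda: [0, 0]))
    for period, antibiotic, is_resistant in records:
        tally = counts[period][antibiotic]
        if is_resistant:
            tally[0] += 1
        tally[1] += 1

    series = []
    for period in sorted(counts):
        vector = {
            antibiotic: resistant / tested
            for antibiotic, (resistant, tested) in sorted(counts[period].items())
        }
        series.append((period, vector))
    return series


# Made-up sample data, for illustration only.
sample_records = [
    ("1996-01", "ampicillin", True),
    ("1996-01", "ampicillin", False),
    ("1996-01", "gentamicin", False),
    ("1996-02", "ampicillin", True),
]
```

Calling `resistance_series(sample_records)` gives one vector per period in chronological order, which is the kind of series that could then be plotted.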
Remembering the Unexpected Cases - CBR for Experts in Urology
Hans-Dieter Burkhard, Gabriela Lindemann, Stephan A. Loening, and
Jörg Neymeyer
Diagnosis is often treated as a classification task.
This assumes that sufficient relevant information (symptom values)
is already available. Related CBR approaches match the new
"problem" (e.g. described by a symptom vector) against similar
cases from the case base. The "solutions" (diagnoses) contained
in the retrieved cases are proposed as possible solutions for the new
case. This mainly targets "standard" solutions.
More subtle diagnosis support is needed for non-standard,
"unexpected" problems. Experience with such cases is an
important skill of experts. Cases of this kind are usually not
described by the common predefined categories, and the diagnosis
process is more complicated. Nevertheless, it can be guided by
previous cases to get hints concerning feasible tests and therapies,
together with expected results. Previous cases are used for
argumentation in complicated situations.
Cases from urology are investigated. As a flexible technique,
Case Retrieval Nets are used.
A Pruning Method for Decision Trees in Uncertain Domains:
Applications in Medicine
Bruno Crémilleux and Claudine Robert
Most pruning methods for decision trees minimize a classification
error rate. In uncertain domains, some sub-trees that do not reduce
the error rate can still be relevant for pointing out populations of
specific interest or for giving a representation of a large data file.
We propose a pruning method in which we build a new attribute binding the
root of a tree with its leaves, each value of this attribute
corresponding to a branch leading to a leaf. This permits computation of
the global quality of a tree. The best sub-tree for pruning is the
one that yields the highest-quality pruned tree. This pruning method
is not tied to the use of the pruned tree as a classifier. The
graphical representation of the global quality index as a function of
the number of pruned sub-trees allows the selection of a few trees from
the list of nested pruned trees derived from the entire (unpruned) tree.
We give typical examples in medicine, highlighting routine use of
induction in this domain even if the targeted diagnosis cannot be
reached for many cases from the findings under investigation.
Diterpene Structure Elucidation from 13C NMR Spectra with
Machine Learning
Saso Dzeroski, Steffen Schulze-Kremer, Karsten R. Heidtke,
Karsten Siems, and Dietrich Wettschereck
Diterpenes are organic compounds of low molecular weight that are
based on a skeleton of 20 carbon atoms. They are of significant
chemical and commercial interest because of their use as lead
compounds in the search for new pharmaceutical effectors. The
structure elucidation of diterpenes based on 13C NMR spectra is
usually done manually by human experts with specialized background
knowledge on peak patterns and chemical structures. Given a database
of peak patterns for diterpenes with known structure, we applied
machine learning to discover correlations between peak patterns and
chemical structure. Three machine learning approaches were used:
neural networks, nearest neighbor pattern classification, and
decision-tree induction. Simple pre-processing of the raw 13C NMR
spectra according to expert background knowledge yielded noticeable
performance improvements. All three approaches achieve very high
classification accuracy, but decision-tree induction has the advantage
of explicitly stating the knowledge discovered.
Induction of Redundant Rules for Medical Applications
Dragan Gamberger
This paper presents some rule induction methods specific to medical
domains. As an example, the application of the inductive learning
system ILLM to a breast cancer domain is described. The learning
domain consists of 699 cases of fine-needle aspiration biopsy collected
in the Wisconsin Breast Cancer Database. The unique characteristics of
ILLM have been used to construct rules with increased
sensitivity and improved interobserver reproducibility, in the hope
that these properties might significantly influence diagnostic
reliability in practical applications. The paper offers a complete
description of the generated rules, which can be used by physicians
and computer-based systems provided that attribute coding is done in
accordance with the learning cases.
A Theory of Interestingness for Knowledge Discovery in Databases
Exemplified in Medicine
Carsten Hausdorf and Michael Müller
In this paper, we present an approach to evaluate, filter, and sort
findings of Data Mining methods by applying multiple, subjective
interestingness measures. A theory of interestingness for Knowledge
Discovery in Databases (KDD) is proposed which is based on a
language-oriented KDD model. Interestingness is seen as a relation
between user and information. We operationalize interestingness by
decomposing it into several facets (e.g. novelty) for which measures
can be defined. User models hold the prior knowledge, goals, and long-term
interests necessary to evaluate interestingness subjectively. We
introduce the Data Mining Assistant, which helps to narrow the gap
between Data Mining and Knowledge Discovery by evaluating, sorting,
and translating Data Mining results. In the evaluation phase,
the interestingness ratings of the system were compared with those
of users. By adjusting the parameters of our interestingness measure,
we aim to adapt the interestingness measure of the
system to that of users. The theoretical concepts are illustrated and
evaluated with an example from a medical domain.
Classification of Human Brain Waves using Self-Organizing Maps
Udo Heuser, Joseph Göppert, Wolfgang Rosenstiel, and Andreas Stevens
This work presents a method for the classification of EEG spectra by
means of Kohonen's self-organizing map.
We use EEG data recorded by 19 electrodes (channels), sampled at 128
Hz. Data vectors are extracted at intervals of half a second with a
duration of one second each, so that consecutive vectors overlap by
half a second. Before the map is trained, the sample vectors are
compressed by either the Fast Fourier Transform or the
wavelet transform.
Data preprocessed by the Fourier transform result in short-time power
spectra. These spectra are filtered by Butterworth filters that match
the EEG frequency bands of the delta, theta, alpha, beta and
gamma rhythms. Data preprocessed by the wavelet transform result in
wavelet components that are combined and averaged.
The pre-processed vectors form "clusters" on the trained
self-organizing map that are related to specific EEG-patterns.
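
The windowing and Fourier preprocessing described above can be sketched for a single channel as follows. The sampling rate (128 Hz), window length (one second) and step (half a second) come from the abstract. The band edges in `BANDS` are commonly quoted values and are an assumption, since the abstract gives none. The sketch also swaps in a simpler technique: it sums power-spectrum bins per band instead of applying the Butterworth filters the authors mention. All function names are hypothetical.

```python
import cmath
import math

SAMPLE_RATE_HZ = 128   # from the abstract
WINDOW_SAMPLES = 128   # one second of data
STEP_SAMPLES = 64      # half a second, so consecutive windows overlap by half

# Assumed band edges in Hz; the abstract names the bands but not their limits.
BANDS = {
    "delta": (0.5, 4.0),
    "theta": (4.0, 8.0),
    "alpha": (8.0, 13.0),
    "beta": (13.0, 30.0),
    "gamma": (30.0, 64.0),
}


def sliding_windows(signal):
    """Cut one channel into one-second windows taken every half second."""
    last_start = len(signal) - WINDOW_SAMPLES
    return [signal[s:s + WINDOW_SAMPLES]
            for s in range(0, last_start + 1, STEP_SAMPLES)]


def power_spectrum(window):
    """Power per frequency bin via a direct DFT (slow but dependency-free)."""
    n = len(window)
    spectrum = []
    for k in range(n // 2 + 1):
        coeff = sum(x * cmath.exp(-2j * math.pi * k * t / n)
                    for t, x in enumerate(window))
        spectrum.append(abs(coeff) ** 2 / n)
    return spectrum


def band_powers(window):
    """Sum spectral power inside each assumed EEG band."""
    spectrum = power_spectrum(window)
    hz_per_bin = SAMPLE_RATE_HZ / len(window)
    return {
        name: sum(p for k, p in enumerate(spectrum)
                  if low <= k * hz_per_bin < high)
        for name, (low, high) in BANDS.items()
    }
```

For a 19-channel recording, one plausible input vector for the map would concatenate the five band powers of every channel for the same window; that layout is a guess, not something the abstract states.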
Temporal Abstraction of Medical Data: Deriving Periodicity
Elpida T. Keravnou
Temporal abstraction, the derivation of abstractions from time-stamped
data, is a central process in medical knowledge-based systems.
Important types of temporal abstractions include periodic occurrences,
trends, and other temporal patterns. The paper discusses the derivation
of periodic occurrences at a theoretical, domain-independent level,
and in the context of a specific temporal ontology.
Prognosing the Survival Time of the Patients with the
Anaplastic Thyroid Carcinoma with Machine Learning
Matjaz Kukar, Nikola Besic, Igor Kononenko, Marija Auersperg,
and Marko Robnik-Sikonja
Noise Elimination Applied in Early Diagnosis of Rheumatic
Diseases
Nada Lavrac, Dragan Gamberger, and Saso Dzeroski
Machine learning methods may be used to induce diagnostic rules from
patient records with known diagnoses. In a medical application it is
crucial that a machine learning system is capable of detecting
regularities in the data by appropriately dealing with imperfect data,
i.e., data that contains various kinds of errors, either random or
systematic. The paper presents a compression-based method that is
capable of detecting data which is suspected to contain errors and is
therefore unsuited for the extraction of regularities genuine to this
dataset. This noise elimination method is applied to a problem of
early diagnosis of rheumatic diseases which is known to be a difficult
problem, due both to its nature and to the imperfections in the
dataset. The method is evaluated by applying the noise elimination
algorithm in conjunction with the CN2 rule induction algorithm, and by
comparing their performance to earlier results obtained by CN2 in this
diagnostic domain.
An Explainable-Induction Approach for Diagnosing Retinal
Degeneration
Stan Matwin, Riverson Rios, and Jim Mount
Data Analysis of Patients With Severe Head Injury: Outcome
Prediction With Decision Trees
Iztok A. Pilih, Dunja Mladenic, Tine S. Prevec, and Nada Lavrac
The paper presents an application of decision tree induction
to the problem of predicting the outcome six months after a severe
head injury. Machine learning techniques
and tools have already been applied in a variety of medical
domains to help solve diagnostic and prognostic problems. These
tools enable the induction of diagnostic and prognostic
knowledge, for example in the form of rules or decision trees,
from training data. Patient records with corresponding diagnoses
and prognoses are provided as input.
The study shows that induced decision trees are useful for the
analysis of importance of clinical parameters and of their
combinations for the prediction of outcome after severe head
injury. Among the parameters studied, the brainstem syndrome
(BSS) turns out to be the most important. Since this syndrome is
subjectively evaluated, an experiment was made in which BSS was
replaced by basic attributes from which BSS is estimated. It was
shown that BSS can be replaced by the motor reaction to pain
stimuli, which, in conditions of this study, has a similar
predictive power. Because only a small number of patient records was
available for this study, the induced decision trees cannot yet be
considered a reliable prognostic tool. Nevertheless,
meaningful regularities have been discovered that help in the
analysis of this difficult prognostic task.
ASSOCIATE: An Approach to The Interpretation of ICU Data
Apkar Salatian and Jim Hunter
Intensive Care depends on sophisticated life support technology; the
effective management of device-supported patients is complex,
involving the interpretation of several time-dependent variables. The
ASSOCIATE system analyses historical data for summarisation and
patient state assessment and processes raw ICU data in real-time for
intelligent alarming. It uses a temporal expert system based on
associational reasoning and applies three consecutive processes:
'filtering', which is used to remove noise; 'segmentation' to generate
temporal intervals from the filtered data - intervals which are
characterised by a common direction of change (i.e. increasing,
decreasing or steady); and 'interpretation', which performs
summarisation and patient state assessment from a historical point of
view and intelligent alarming from a real-time point of view.
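
The first two of the three processes can be illustrated with a minimal sketch. The abstract does not say which filter or segmentation rule ASSOCIATE uses, so this example assumes a sliding median filter and a fixed tolerance on the change between neighbouring samples. The names `median_filter`, `direction` and `segment` and the default parameter values are hypothetical. The interpretation step, which relies on the system's expert knowledge, is not attempted here.

```python
from statistics import median


def median_filter(values, width=3):
    """Filtering: replace each sample by the median of its neighbourhood.

    A median filter is an assumed choice; it suppresses isolated spikes.
    """
    half = width // 2
    return [median(values[max(0, i - half):i + half + 1])
            for i in range(len(values))]


def direction(delta, tolerance):
    """Label a change between two samples as increasing, decreasing or steady."""
    if delta > tolerance:
        return "increasing"
    if delta < -tolerance:
        return "decreasing"
    return "steady"


def segment(values, tolerance=1.5):
    """Segmentation: merge consecutive steps that share a direction of change.

    Returns a list of (start_index, end_index, direction) intervals.
    """
    segments = []
    for i in range(1, len(values)):
        label = direction(values[i] - values[i - 1], tolerance)
        if segments and segments[-1][2] == label:
            start = segments[-1][0]
            segments[-1] = (start, i, label)   # extend the current interval
        else:
            segments.append((i - 1, i, label))  # open a new interval
    return segments


# Made-up heart-rate trace with one spike at index 2, for illustration only.
heart_rate = [80, 80, 120, 80, 81, 84, 87, 90, 90, 90]
intervals = segment(median_filter(heart_rate))
```

With this trace the spike is removed by the filter and `intervals` contains a steady interval, an increasing interval and a final steady interval. Running `segment` on the unfiltered trace would instead split the start of the trace into short spurious intervals, which is why filtering comes first.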
Experiments with Machine Learning in the Prediction of
Coronary Artery Disease Progression
Branko Ster, Matjaz Kukar, Andrej Dobnikar,
Igor Kranjec, and Igor Kononenko
We have applied fourteen classifiers to the problem of coronary artery
disease progression. The classifiers were taken from different paradigms
of machine learning (symbolic, statistical and neural) in order to
cover the different approaches. The unsolved problem of coronary
artery disease progression consists of predicting the change in stenosis
(narrowing of the coronary artery) on the basis of clinical, laboratory and
epidemiological attributes. A total of 263 patients belonging to two
classes (stenosis changed vs. unchanged) were described with 25 attributes.
The overall results are not promising and suggest that the attributes used
are not sufficiently relevant to enable a prediction of coronary artery
disease progression. It should also be pointed out that the simplest
classifiers (naive Bayes, linear discriminant analysis) generally
yield the best results. This phenomenon seems to be typical for medical
data and is consistent with previous experience.
Intelligent Documentation of Adverse Events by
Explanation-Based Abstraction
Bidjan Tschaitschian, Franz Schmalhofer, Jürgen Schmidt, Heiner
Gertzen, and Michael Aschenbrenner
The many advances which have been achieved in the field of
machine learning require additional application-oriented research so
that they can be successfully applied to real world problems. For the
domain of clinical studies in the pharmaceutical industry, we have
therefore developed a new learning procedure on the basis of
explanation-based abstraction. The background knowledge which is usually
needed for automated learning is never directly available in a formalized
manner. As an alternative to a complete formalization, we therefore
show that a partial de-automation of a learning algorithm and interactive
components are key success factors for utilizing machine learning in
industrial practice.
Automated Extraction of Medical Expert System Rules from
Clinical Databases
Shusaku Tsumoto and Hiroshi Tanaka
Incremental Learning of Probabilistic Rules from Clinical
Databases based on Rough Set Theory
Shusaku Tsumoto and Hiroshi Tanaka
Qualitative Model Approach to Computer Assisted Reasoning in
Physiology
Blaz Zupan, John A. Halter, and Marko Bohanec
Developing practical tools to aid in understanding
physiological systems is a formidable undertaking. This paper presents
a method that uses a property structure for the domain being
investigated. Furthermore, it employs realistic models to present
examples of the behavior of the system. From these examples the
principles that relate the properties are inferred through the use of
machine learning. To allow prediction of property values in
the quantitative domain, methods based on interval logic and fuzzy logic
are proposed for qualitative model interpretation.
The principles are expressed as qualitative rules that derive the
values of the properties. The structured approach and the qualitative
representation of principles provide a simplified means to reason
about the roles of properties and meaning of principles of the
physiological systems being investigated.