Nursery Dataset

nursery.data (example set with 12960 instances, C4.5 format)
nursery.names (C4.5 names file)

Creator: Vladislav Rajkovic et al. (13 experts)
Donors to UCI ML Repository: Marko Bohanec, Blaz Zupan
Date: June, 1997

Past Usage

The hierarchical decision model, from which this dataset is derived, was first presented in

M. Olave, V. Rajkovic, M. Bohanec: An application for admission in public school systems. In (I. Th. M. Snellen and W. B. H. J. van de Donk and J.-P. Baquiast, editors) Expert Systems in Public Administration, pages 145-160. Elsevier Science Publishers (North Holland), 1989.

Within machine learning, this dataset was used for the evaluation of HINT (Hierarchy INduction Tool). The results are presented in

B. Zupan, M. Bohanec, I. Bratko, J. Demsar (1997) Machine learning by function decomposition. In (D. Fisher, ed.) Proc. ICML-97, pages 421-429. Morgan-Kaufmann.

and show that HINT is able to completely reconstruct the original hierarchical model. The paper further compares the generalization capability of HINT and C4.5 using learning curves for both systems, where p is the percentage of examples used for learning and the y axis shows the classification accuracy on all remaining examples.

Relevant Information

The Nursery Database was derived from a hierarchical decision model originally developed to rank applications for nursery schools. The model was used for several years in the 1980s, when there was excessive enrollment to these schools in Ljubljana, Slovenia, and rejected applications frequently needed an objective explanation. The final decision depended on three subproblems: occupation of parents and child's nursery, family structure and financial standing, and social and health picture of the family. The model was developed within DEX, an expert system shell for decision making (M. Bohanec, V. Rajkovic: Expert system for decision making. Sistemica 1(1), pp. 145-157, 1990.).

The hierarchical model ranks nursery-school applications according to a concept structure built from the following features:

   NURSERY            Evaluation of applications for nursery schools
   . EMPLOY           Employment of parents and child's nursery
   . . parents        Parents' occupation
   . . has_nurs       Child's nursery
   . STRUCT_FINAN     Family structure and financial standings
   . . STRUCTURE      Family structure
   . . . form         Form of the family
   . . . children     Number of children
   . . housing        Housing conditions
   . . finance        Financial standing of the family
   . SOC_HEALTH       Social and health picture of the family
   . . social         Social conditions
   . . health         Health conditions

and can use the following sets of values:

   NURSERY            not_recom, recommend, very_recom, priority, spec_prior
   . EMPLOY           convenient, less_conv, inconv, critical
   . . parents        usual, pretentious, great_pret
   . . has_nurs       proper, less_proper, improper, critical, very_crit
   . STRUCT_FINAN     convenient, inconvenient, critical
   . . STRUCTURE      less_crit, critical, very_crit
   . . . form         complete, completed, incomplete, foster
   . . . children     1, 2, 3, more
   . . housing        convenient, less_conv, critical
   . . finance        convenient, inconv
   . SOC_HEALTH       not_recommended, recommended, priority
   . . social         non-prob, slightly_prob, problematic
   . . health         recommended, priority, not_recom

Input attributes are printed in lowercase. Besides the target concept (NURSERY), the model includes four intermediate concepts: EMPLOY, STRUCT_FINAN, STRUCTURE, and SOC_HEALTH. In the original model, every concept is related to its lower-level descendants by a set of examples; these sets are listed under "Datasets from the structured model".

The Nursery Database contains examples with the structural information removed, i.e., directly relates NURSERY to the eight input attributes: parents, has_nurs, form, children, housing, finance, social, health.

Because the underlying concept structure is known, this database may be particularly useful for testing constructive induction and structure discovery methods.

Statistics

Number of Instances: 12960 (instances completely cover the attribute space)
Number of Attributes: 8
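
The statement that the instances completely cover the attribute space can be checked by multiplying the sizes of the eight input value sets listed earlier. The following is a minimal sketch; the name DOMAINS is chosen here for illustration and is not part of the dataset files.

```python
from math import prod

# Value sets of the eight input attributes, copied from the listing above.
DOMAINS = {
    "parents": ["usual", "pretentious", "great_pret"],
    "has_nurs": ["proper", "less_proper", "improper", "critical", "very_crit"],
    "form": ["complete", "completed", "incomplete", "foster"],
    "children": ["1", "2", "3", "more"],
    "housing": ["convenient", "less_conv", "critical"],
    "finance": ["convenient", "inconv"],
    "social": ["non-prob", "slightly_prob", "problematic"],
    "health": ["recommended", "priority", "not_recom"],
}

# Size of the full Cartesian product: 3 * 5 * 4 * 4 * 3 * 2 * 3 * 3.
attribute_space_size = prod(len(values) for values in DOMAINS.values())
print(attribute_space_size)  # 12960
```

The product equals the stated number of instances, which is consistent with every combination of input values appearing once.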

Class distribution:

Class        N      N[%]
---------------------------
not_recom    4320   33.333%
recommend       2    0.015%
very_recom    328    2.531%
priority     4266   32.917%
spec_prior   4044   31.204%
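
The percentages in the class distribution follow directly from the counts. As a small illustrative sketch (the variable names are assumptions made for this example), they can be recomputed as follows:

```python
# Class counts as given in the class distribution above.
CLASS_COUNTS = {
    "not_recom": 4320,
    "recommend": 2,
    "very_recom": 328,
    "priority": 4266,
    "spec_prior": 4044,
}

total = sum(CLASS_COUNTS.values())

# Share of each class in percent, rounded to three decimals.
percentages = {
    label: round(100 * count / total, 3) for label, count in CLASS_COUNTS.items()
}

for label, share in percentages.items():
    print(f"{label:<11}{CLASS_COUNTS[label]:>6}{share:>9.3f}%")
```

The counts sum to 12960, and the class recommend accounts for only 2 instances, so the distribution is strongly imbalanced.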


Datasets from the structured model

Examples for NURSERY:

EMPLOY      STRUCT_FINAN  SOC_HEALTH       NURSERY
----------------------------------------------------
convenient  convenient    not_recommended  not_recom
less_conv   convenient    not_recommended  not_recom
inconv      convenient    not_recommended  not_recom
critical    convenient    not_recommended  not_recom
convenient  inconvenient  not_recommended  not_recom
less_conv   inconvenient  not_recommended  not_recom
inconv      inconvenient  not_recommended  not_recom
critical    inconvenient  not_recommended  not_recom
convenient  critical      not_recommended  not_recom
less_conv   critical      not_recommended  not_recom
inconv      critical      not_recommended  not_recom
critical    critical      not_recommended  not_recom
convenient  convenient    recommended      recommend
less_conv   convenient    recommended      very_recom
inconv      convenient    recommended      priority
critical    convenient    recommended      priority
convenient  inconvenient  recommended      very_recom
less_conv   inconvenient  recommended      very_recom
inconv      inconvenient  recommended      priority
critical    inconvenient  recommended      priority
convenient  critical      recommended      priority
less_conv   critical      recommended      priority
inconv      critical      recommended      priority
critical    critical      recommended      spec_prior
convenient  convenient    priority         priority
less_conv   convenient    priority         priority
inconv      convenient    priority         priority
critical    convenient    priority         priority
convenient  inconvenient  priority         priority
less_conv   inconvenient  priority         priority
inconv      inconvenient  priority         priority
critical    inconvenient  priority         spec_prior
convenient  critical      priority         priority
less_conv   critical      priority         priority
inconv      critical      priority         spec_prior
critical    critical      priority         spec_prior

Examples for EMPLOY:

parents      has_nurs     EMPLOY
------------------------------------
usual        proper       convenient
pretentious  proper       less_conv
great_pret   proper       inconv
usual        less_proper  less_conv
pretentious  less_proper  less_conv
great_pret   less_proper  inconv
usual        improper     less_conv
pretentious  improper     inconv
great_pret   improper     critical
usual        critical     inconv
pretentious  critical     critical
great_pret   critical     critical
usual        very_crit    critical
pretentious  very_crit    critical
great_pret   very_crit    critical

Examples for STRUCT_FINAN:

STRUCTURE  housing     finance     STRUCT_FINAN
-----------------------------------------------
less_crit  convenient  convenient  convenient
critical   convenient  convenient  inconvenient
very_crit  convenient  convenient  inconvenient
less_crit  less_conv   convenient  inconvenient
critical   less_conv   convenient  inconvenient
very_crit  less_conv   convenient  critical
less_crit  critical    convenient  inconvenient
critical   critical    convenient  critical
very_crit  critical    convenient  critical
less_crit  convenient  inconv      inconvenient
critical   convenient  inconv      inconvenient
very_crit  convenient  inconv      critical
less_crit  less_conv   inconv      inconvenient
critical   less_conv   inconv      inconvenient
very_crit  less_conv   inconv      critical
less_crit  critical    inconv      inconvenient
critical   critical    inconv      critical
very_crit  critical    inconv      critical

Examples for STRUCTURE:

form        children  STRUCTURE
-------------------------------
complete    1         less_crit
completed   1         critical
incomplete  1         critical
foster      1         very_crit
complete    2         critical
completed   2         critical
incomplete  2         very_crit
foster      2         very_crit
complete    3         very_crit
completed   3         very_crit
incomplete  3         very_crit
foster      3         very_crit
complete    more      very_crit
completed   more      very_crit
incomplete  more      very_crit
foster      more      very_crit

Examples for SOC_HEALTH:

social         health       SOC_HEALTH
-------------------------------------------
non-prob       recommended  recommended
slightly_prob  recommended  recommended
problematic    recommended  priority
non-prob       priority     priority
slightly_prob  priority     priority
problematic    priority     priority
non-prob       not_recom    not_recommended
slightly_prob  not_recom    not_recommended
problematic    not_recom    not_recommended
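
The five example sets above can be composed into a single function that maps the eight input attributes to NURSERY. The sketch below is one possible encoding, not the original DEX implementation: the helper make_table, the *_VALUES and *_TABLE names, and the function evaluate are all introduced here for illustration. It relies on the observation that each table lists its rows with the first column varying fastest, so only the output column needs to be written out.

```python
from collections import Counter
from itertools import product

# Value sets, in the order in which they appear in the tables above.
PARENTS = ["usual", "pretentious", "great_pret"]
HAS_NURS = ["proper", "less_proper", "improper", "critical", "very_crit"]
FORM = ["complete", "completed", "incomplete", "foster"]
CHILDREN = ["1", "2", "3", "more"]
HOUSING = ["convenient", "less_conv", "critical"]
FINANCE = ["convenient", "inconv"]
SOCIAL = ["non-prob", "slightly_prob", "problematic"]
HEALTH = ["recommended", "priority", "not_recom"]
EMPLOY_VALUES = ["convenient", "less_conv", "inconv", "critical"]
STRUCTURE_VALUES = ["less_crit", "critical", "very_crit"]
STRUCT_FINAN_VALUES = ["convenient", "inconvenient", "critical"]
SOC_HEALTH_VALUES = ["not_recommended", "recommended", "priority"]


def make_table(domains, outputs):
    """Build {input tuple: output}; rows have the first column varying fastest."""
    keys = [tuple(reversed(combo)) for combo in product(*reversed(domains))]
    values = outputs.split()
    assert len(keys) == len(values), "output column does not match table size"
    return dict(zip(keys, values))


# One text line per value of the slower-varying column(s).
EMPLOY_TABLE = make_table([PARENTS, HAS_NURS], """
    convenient less_conv inconv
    less_conv  less_conv inconv
    less_conv  inconv    critical
    inconv     critical  critical
    critical   critical  critical
""")
STRUCTURE_TABLE = make_table([FORM, CHILDREN], """
    less_crit critical  critical  very_crit
    critical  critical  very_crit very_crit
    very_crit very_crit very_crit very_crit
    very_crit very_crit very_crit very_crit
""")
STRUCT_FINAN_TABLE = make_table([STRUCTURE_VALUES, HOUSING, FINANCE], """
    convenient   inconvenient inconvenient
    inconvenient inconvenient critical
    inconvenient critical     critical
    inconvenient inconvenient critical
    inconvenient inconvenient critical
    inconvenient critical     critical
""")
SOC_HEALTH_TABLE = make_table([SOCIAL, HEALTH], """
    recommended     recommended     priority
    priority        priority        priority
    not_recommended not_recommended not_recommended
""")
NURSERY_TABLE = make_table(
    [EMPLOY_VALUES, STRUCT_FINAN_VALUES, SOC_HEALTH_VALUES], """
    not_recom  not_recom  not_recom  not_recom
    not_recom  not_recom  not_recom  not_recom
    not_recom  not_recom  not_recom  not_recom
    recommend  very_recom priority   priority
    very_recom very_recom priority   priority
    priority   priority   priority   spec_prior
    priority   priority   priority   priority
    priority   priority   priority   spec_prior
    priority   priority   spec_prior spec_prior
""")


def evaluate(parents, has_nurs, form, children, housing, finance, social, health):
    """Evaluate one application bottom-up through the concept hierarchy."""
    employ = EMPLOY_TABLE[parents, has_nurs]
    structure = STRUCTURE_TABLE[form, children]
    struct_finan = STRUCT_FINAN_TABLE[structure, housing, finance]
    soc_health = SOC_HEALTH_TABLE[social, health]
    return NURSERY_TABLE[employ, struct_finan, soc_health]


# Enumerate the whole attribute space and count the resulting classes.
INPUTS = [PARENTS, HAS_NURS, FORM, CHILDREN, HOUSING, FINANCE, SOCIAL, HEALTH]
distribution = Counter(evaluate(*combo) for combo in product(*INPUTS))

print(evaluate("usual", "proper", "complete", "1",
               "convenient", "convenient", "non-prob", "recommended"))  # recommend
print(dict(distribution))
```

Enumerating all 12960 input combinations with this encoding yields the same class counts as in the Statistics section (4320, 2, 328, 4266, and 4044), which is consistent with the database being the flattened form of the hierarchical model.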