nursery.data (example set with 12960 instances, C4.5 format)
nursery.names (C4.5 names file)
Creator: Vladislav Rajkovic et al. (13 experts)
Donors to UCI ML Repository: Marko Bohanec,
Blaz Zupan
Date: June, 1997
The hierarchical decision model, from which this dataset is derived,
was first presented in
M. Olave, V. Rajkovic, M. Bohanec: An application for admission in public
school systems. In (I. Th. M. Snellen and W. B. H. J. van de Donk and J.-P.
Baquiast, editors) Expert Systems in Public Administration, pages 145-160.
Elsevier Science Publishers (North Holland), 1989.
Within machine learning, this dataset was used for the evaluation of HINT
(Hierarchy INduction Tool). The results are presented in
B. Zupan, M. Bohanec, I. Bratko, J. Demsar (1997) Machine
learning by function decomposition. In (D. Fisher, ed.) Proc.
ICML-97, pages 421-429. Morgan-Kaufmann.
and show that HINT is able to completely reconstruct the original hierarchical model. The paper further compares the generalization capability of HINT and C4.5 by means of learning curves for both systems, where p is the percentage of examples used for learning and the y axis shows the classification accuracy when all remaining examples are classified.
The Nursery Database was derived from a hierarchical decision model originally developed to rank applications for nursery schools. It was used for several years in the 1980s, when there was excessive enrollment in these schools in Ljubljana, Slovenia, and rejected applications frequently needed an objective explanation. The final decision depended on three subproblems: occupation of parents and child's nursery, family structure and financial standing, and social and health picture of the family. The model was developed with DEX, an expert system shell for decision making (M. Bohanec, V. Rajkovic: Expert system for decision making. Sistemica 1(1), pp. 145-157, 1990).
The hierarchical model ranks nursery-school applications according to a concept structure built from the following features:
NURSERY            Evaluation of applications for nursery schools
. EMPLOY           Employment of parents and child's nursery
. . parents        Parents' occupation
. . has_nurs       Child's nursery
. STRUCT_FINAN     Family structure and financial standings
. . STRUCTURE      Family structure
. . . form         Form of the family
. . . children     Number of children
. . housing        Housing conditions
. . finance        Financial standing of the family
. SOC_HEALTH       Social and health picture of the family
. . social         Social conditions
. . health         Health conditions
and can use the following sets of values:
NURSERY            not_recom, recommend, very_recom, priority, spec_prior
. EMPLOY           convenient, less_conv, inconv, critical
. . parents        usual, pretentious, great_pret
. . has_nurs       proper, less_proper, improper, critical, very_crit
. STRUCT_FINAN     convenient, inconvenient, critical
. . STRUCTURE      less_crit, critical, very_crit
. . . form         complete, completed, incomplete, foster
. . . children     1, 2, 3, more
. . housing        convenient, less_conv, critical
. . finance        convenient, inconv
. SOC_HEALTH       not_recommended, recommended, priority
. . social         non-prob, slightly_prob, problematic
. . health         recommended, priority, not_recom
Input attributes are printed in lowercase. Besides the target concept (NURSERY), the model includes four intermediate concepts: EMPLOY, STRUCT_FINAN, STRUCTURE, SOC_HEALTH. In the original model, every concept is related to its lower-level descendants by a set of examples; the sets that define the target and intermediate concepts are tabulated at the end of this description.
The Nursery Database contains examples with the structural information removed, i.e., directly relates NURSERY to the eight input attributes: parents, has_nurs, form, children, housing, finance, social, health.
Because the underlying concept structure is known, this database may be particularly useful for testing constructive induction and structure discovery methods.
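As a minimal sketch of how the flat examples might be read, the Python below parses comma-separated cases with the class label last, which is the usual C4.5 data layout. The column order is assumed from the attribute list above, `read_c45_data` is a hypothetical helper, and the three sample rows are illustrative rather than copied from nursery.data; the value spellings follow this description and may differ from the strings in the file itself.

```python
import csv
import io

# Column order assumed from the list of eight input attributes, class last.
COLUMNS = ["parents", "has_nurs", "form", "children",
           "housing", "finance", "social", "health", "class"]

# Hypothetical sample rows for illustration, not copied from nursery.data.
SAMPLE = """\
usual,proper,complete,1,convenient,convenient,non-prob,recommended,recommend
usual,proper,complete,1,convenient,convenient,non-prob,priority,priority
usual,proper,complete,1,convenient,convenient,non-prob,not_recom,not_recom
"""


def read_c45_data(stream):
    """Parse comma-separated cases into a list of dicts keyed by COLUMNS."""
    records = []
    for row in csv.reader(stream, skipinitialspace=True):
        if not row:  # skip blank lines
            continue
        if len(row) != len(COLUMNS):
            raise ValueError(f"expected {len(COLUMNS)} fields, got {len(row)}")
        records.append(dict(zip(COLUMNS, row)))
    return records


records = read_c45_data(io.StringIO(SAMPLE))
for record in records:
    print(record["health"], "->", record["class"])
```

To read the real file instead, the same function could be given an open file handle, for example `open("nursery.data", newline="")`.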
Number of Instances: 12960 (instances completely cover the attribute
space)
Number of Attributes: 8
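The statement that the instances completely cover the attribute space can be checked with simple arithmetic: the number of instances should equal the product of the sizes of the eight input value sets. A small Python sketch, with the cardinalities counted from the value sets listed earlier (the variable names are illustrative):

```python
from math import prod

# Number of distinct values per input attribute, counted from the value sets.
cardinalities = {
    "parents": 3,
    "has_nurs": 5,
    "form": 4,
    "children": 4,
    "housing": 3,
    "finance": 2,
    "social": 3,
    "health": 3,
}

attribute_space_size = prod(cardinalities.values())
print(attribute_space_size)  # 3*5*4*4*3*2*3*3 = 12960
```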
Class distribution:
Class | N | N[%] |
---|---|---|
not_recom | 4320 | 33.333% |
recommend | 2 | 0.015% |
very_recom | 328 | 2.531% |
priority | 4266 | 32.917% |
spec_prior | 4044 | 31.204% |
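
The percentages in the table follow directly from the counts. The short Python sketch below recomputes them and also picks out the majority class, whose share is the accuracy a classifier would reach by always predicting that class; the variable names are illustrative.

```python
# Class counts as given in the class distribution table.
counts = {
    "not_recom": 4320,
    "recommend": 2,
    "very_recom": 328,
    "priority": 4266,
    "spec_prior": 4044,
}

total = sum(counts.values())
shares = {label: 100 * n / total for label, n in counts.items()}

for label, share in shares.items():
    print(f"{label:<10} {counts[label]:>5} {share:7.3f}%")

# Always predicting the most frequent class gives this baseline accuracy.
majority_label = max(counts, key=counts.get)
print(majority_label, f"{shares[majority_label]:.3f}%")
```
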

EMPLOY | STRUCT_FINAN | SOC_HEALTH | NURSERY |
---|---|---|---|
convenient | convenient | not_recommended | not_recom |
less_conv | convenient | not_recommended | not_recom |
inconv | convenient | not_recommended | not_recom |
critical | convenient | not_recommended | not_recom |
convenient | inconvenient | not_recommended | not_recom |
less_conv | inconvenient | not_recommended | not_recom |
inconv | inconvenient | not_recommended | not_recom |
critical | inconvenient | not_recommended | not_recom |
convenient | critical | not_recommended | not_recom |
less_conv | critical | not_recommended | not_recom |
inconv | critical | not_recommended | not_recom |
critical | critical | not_recommended | not_recom |
convenient | convenient | recommended | recommend |
less_conv | convenient | recommended | very_recom |
inconv | convenient | recommended | priority |
critical | convenient | recommended | priority |
convenient | inconvenient | recommended | very_recom |
less_conv | inconvenient | recommended | very_recom |
inconv | inconvenient | recommended | priority |
critical | inconvenient | recommended | priority |
convenient | critical | recommended | priority |
less_conv | critical | recommended | priority |
inconv | critical | recommended | priority |
critical | critical | recommended | spec_prior |
convenient | convenient | priority | priority |
less_conv | convenient | priority | priority |
inconv | convenient | priority | priority |
critical | convenient | priority | priority |
convenient | inconvenient | priority | priority |
less_conv | inconvenient | priority | priority |
inconv | inconvenient | priority | priority |
critical | inconvenient | priority | spec_prior |
convenient | critical | priority | priority |
less_conv | critical | priority | priority |
inconv | critical | priority | spec_prior |
critical | critical | priority | spec_prior |

parents | has_nurs | EMPLOY |
---|---|---|
usual | proper | convenient |
pretentious | proper | less_conv |
great_pret | proper | inconv |
usual | less_proper | less_conv |
pretentious | less_proper | less_conv |
great_pret | less_proper | inconv |
usual | improper | less_conv |
pretentious | improper | inconv |
great_pret | improper | critical |
usual | critical | inconv |
pretentious | critical | critical |
great_pret | critical | critical |
usual | very_crit | critical |
pretentious | very_crit | critical |
great_pret | very_crit | critical |

STRUCTURE | housing | finance | STRUCT_FINAN |
---|---|---|---|
less_crit | convenient | convenient | convenient |
critical | convenient | convenient | inconvenient |
very_crit | convenient | convenient | inconvenient |
less_crit | less_conv | convenient | inconvenient |
critical | less_conv | convenient | inconvenient |
very_crit | less_conv | convenient | critical |
less_crit | critical | convenient | inconvenient |
critical | critical | convenient | critical |
very_crit | critical | convenient | critical |
less_crit | convenient | inconv | inconvenient |
critical | convenient | inconv | inconvenient |
very_crit | convenient | inconv | critical |
less_crit | less_conv | inconv | inconvenient |
critical | less_conv | inconv | inconvenient |
very_crit | less_conv | inconv | critical |
less_crit | critical | inconv | inconvenient |
critical | critical | inconv | critical |
very_crit | critical | inconv | critical |

form | children | STRUCTURE |
---|---|---|
complete | 1 | less_crit |
completed | 1 | critical |
incomplete | 1 | critical |
foster | 1 | very_crit |
complete | 2 | critical |
completed | 2 | critical |
incomplete | 2 | very_crit |
foster | 2 | very_crit |
complete | 3 | very_crit |
completed | 3 | very_crit |
incomplete | 3 | very_crit |
foster | 3 | very_crit |
complete | more | very_crit |
completed | more | very_crit |
incomplete | more | very_crit |
foster | more | very_crit |

social | health | SOC_HEALTH |
---|---|---|
non-prob | recommended | recommended |
slightly_prob | recommended | recommended |
problematic | recommended | priority |
non-prob | priority | priority |
slightly_prob | priority | priority |
problematic | priority | priority |
non-prob | not_recom | not_recommended |
slightly_prob | not_recom | not_recommended |
problematic | not_recom | not_recommended |
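
The five example sets above compose into a single function from the eight input attributes to NURSERY. The Python below is a minimal sketch of that composition, not the original DEX implementation: the helper `table`, the `*_TABLE` names and the function `nursery` are illustrative, and the output columns are transcribed from the example sets in their row order (first column varying fastest). Enumerating the full attribute space with it yields the class counts given in the class distribution.

```python
from collections import Counter
from itertools import product

# Value sets, in the order in which the example sets enumerate them.
PARENTS = ["usual", "pretentious", "great_pret"]
HAS_NURS = ["proper", "less_proper", "improper", "critical", "very_crit"]
FORM = ["complete", "completed", "incomplete", "foster"]
CHILDREN = ["1", "2", "3", "more"]
HOUSING = ["convenient", "less_conv", "critical"]
FINANCE = ["convenient", "inconv"]
SOCIAL = ["non-prob", "slightly_prob", "problematic"]
HEALTH = ["recommended", "priority", "not_recom"]
EMPLOY = ["convenient", "less_conv", "inconv", "critical"]
STRUCT_FINAN = ["convenient", "inconvenient", "critical"]
STRUCTURE = ["less_crit", "critical", "very_crit"]
SOC_HEALTH = ["not_recommended", "recommended", "priority"]


def table(outputs, *domains):
    """Map input tuples to outputs listed in row order (first domain fastest)."""
    keys = [tuple(reversed(k)) for k in product(*reversed(domains))]
    values = outputs.split()
    assert len(keys) == len(values)
    return dict(zip(keys, values))


EMPLOY_TABLE = table("""
    convenient less_conv inconv   less_conv less_conv inconv
    less_conv inconv critical     inconv critical critical
    critical critical critical""", PARENTS, HAS_NURS)

STRUCTURE_TABLE = table("""
    less_crit critical critical very_crit
    critical critical very_crit very_crit
    very_crit very_crit very_crit very_crit
    very_crit very_crit very_crit very_crit""", FORM, CHILDREN)

STRUCT_FINAN_TABLE = table("""
    convenient inconvenient inconvenient
    inconvenient inconvenient critical
    inconvenient critical critical
    inconvenient inconvenient critical
    inconvenient inconvenient critical
    inconvenient critical critical""", STRUCTURE, HOUSING, FINANCE)

SOC_HEALTH_TABLE = table("""
    recommended recommended priority
    priority priority priority
    not_recommended not_recommended not_recommended""", SOCIAL, HEALTH)

NURSERY_TABLE = table("not_recom " * 12 + """
    recommend very_recom priority priority
    very_recom very_recom priority priority
    priority priority priority spec_prior
    priority priority priority priority
    priority priority priority spec_prior
    priority priority spec_prior spec_prior""", EMPLOY, STRUCT_FINAN, SOC_HEALTH)


def nursery(parents, has_nurs, form, children, housing, finance, social, health):
    """Evaluate the hierarchy bottom-up for one application."""
    employ = EMPLOY_TABLE[parents, has_nurs]
    structure = STRUCTURE_TABLE[form, children]
    struct_finan = STRUCT_FINAN_TABLE[structure, housing, finance]
    soc_health = SOC_HEALTH_TABLE[social, health]
    return NURSERY_TABLE[employ, struct_finan, soc_health]


# Enumerate the whole attribute space and count the resulting classes.
counts = Counter(
    nursery(*row)
    for row in product(PARENTS, HAS_NURS, FORM, CHILDREN,
                       HOUSING, FINANCE, SOCIAL, HEALTH)
)
print(sum(counts.values()), dict(counts))
```

Listing only the output column and generating the input tuples with `itertools.product` keeps the transcription short, at the cost of relying on the row order of the example sets.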