Nuevo sistema multiclasificador jerárquico. Posibilidades de aplicación
Date
2007-06-05
Authors
Rodríguez Abed, Abdel
Publisher
Universidad Central “Marta Abreu” de Las Villas
Abstract
In this thesis, we design and implement a multiclassifier system (an ensemble of classifiers) based on a sequence of classifiers, each specialized in the regions of the training set where the errors of its previously trained counterparts are concentrated. To separate these regions, and to determine how apt each classifier is to respond correctly to a new case, a second set of judge classifiers is built hierarchically. Two variants are used to combine the outputs of the classifiers: the first selects a single expert, and the second uses a weighted vote. Both models were validated with different base classifiers on 37 datasets (case bases), 11 of which are biomedical or bioinformatics datasets. The models were compared statistically with the most widely used ensembles, Bagging and Boosting. The hierarchical multiclassifier obtained significantly better results when using Multilayer Perceptron as the base classifier and combination by selection. This demonstrated the efficacy of the proposed model, as well as its applicability to datasets of a general nature.
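The abstract gives only the outline of the architecture, so the following is a minimal sketch of the selection variant under stated assumptions, not the thesis implementation. The toy learners (`NearestCentroid` as base classifier, `OneNearestNeighbour` as judge), the `max_levels` parameter, the function names, and the one-dimensional dataset are all hypothetical choices made for illustration. Each level trains a base classifier, trains a judge to recognize where that classifier was right, and passes the misclassified cases on to the next level.

```python
from collections import defaultdict


def dist2(a, b):
    """Squared Euclidean distance between two feature tuples."""
    return sum((p - q) ** 2 for p, q in zip(a, b))


class NearestCentroid:
    """Toy base classifier: predicts the class with the closest mean vector."""

    def fit(self, X, y):
        groups = defaultdict(list)
        for x, label in zip(X, y):
            groups[label].append(x)
        self.centroids = {
            label: tuple(sum(col) / len(rows) for col in zip(*rows))
            for label, rows in groups.items()
        }
        return self

    def predict(self, x):
        return min(self.centroids, key=lambda c: dist2(x, self.centroids[c]))


class OneNearestNeighbour:
    """Toy judge: copies the label of the closest training case."""

    def fit(self, X, y):
        self.X, self.y = list(X), list(y)
        return self

    def predict(self, x):
        nearest = min(range(len(self.X)), key=lambda i: dist2(x, self.X[i]))
        return self.y[nearest]


def train_hierarchy(X, y, max_levels=3):
    """Build [(base, judge), ...]; the last level has judge=None."""
    levels = []
    while True:
        base = NearestCentroid().fit(X, y)
        correct = [base.predict(x) == label for x, label in zip(X, y)]
        if all(correct) or len(levels) == max_levels - 1:
            levels.append((base, None))  # last expert answers unconditionally
            return levels
        # The judge learns where this base classifier is apt (True) or not.
        judge = OneNearestNeighbour().fit(X, correct)
        levels.append((base, judge))
        # The next classifier specializes in the region holding the errors.
        errors = [(x, label) for x, label, ok in zip(X, y, correct) if not ok]
        X, y = zip(*errors)


def predict_by_selection(levels, x):
    """Return the answer of the first expert whose judge accepts the case."""
    for base, judge in levels:
        if judge is None or judge.predict(x):
            return base.predict(x)


# Hypothetical data: class "a" has two clusters, so one centroid fits it badly.
X = [(0,), (1,), (2,), (8,), (9,), (10,), (11,), (12,), (20,), (21,)]
y = ["a", "a", "a", "b", "b", "b", "b", "b", "a", "a"]

levels = train_hierarchy(X, y)
flat = NearestCentroid().fit(X, y)
hits_hier = sum(predict_by_selection(levels, x) == t for x, t in zip(X, y))
hits_flat = sum(flat.predict(x) == t for x, t in zip(X, y))
print(hits_flat, hits_hier)  # training hits: 6 (single classifier), 10 (hierarchy)
```

On this toy data the single classifier misclassifies the four cases around 8, 9, 20 and 21. The judge routes those cases to a second classifier trained only on them. A weighted-vote variant would accumulate votes from every level instead of returning at the first accepted expert. The abstract does not specify the weights, so that variant is not sketched here.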
Keywords
Machine Learning, Hierarchical Multiclassifier, Classification, Bioinformatics Problems, Weka, Biomedicine, Universidad Central “Marta Abreu” de Las Villas (UCLV)