Construction of Multiclassifier Systems Using Ant Colony Optimization
Date
2015-07-04
Authors
Santos Martínez, Lester René
Publisher
Universidad Central “Marta Abreu” de Las Villas
Abstract
Classification techniques are widely used to solve a variety of problems in society. Several classification models are reported in the literature, such as neural networks, classification trees, and discriminant analysis, among others. In recent research, many authors use the term multiclassifier for a "classifier" that combines the outputs of a set of individual classifiers according to some criterion (e.g., average, majority vote, minimum). When combining classifiers, it is important to guarantee diversity among them, since there is no point in combining classifiers that all produce the same classification. There are several models for building a multiclassifier, and each guarantees this diversity in a different way. For those that use different base classifiers, there are statistical measures, called diversity measures, that can be used to estimate how diverse the classifiers are.
Selecting the base classifiers for a multiclassifier system is a complex task, precisely because of the large number of individual classifiers and the many combinations they can generate. To address this combinatorial problem, the use of metaheuristics together with diversity measures is proposed, in order to obtain a combination of diverse classifiers whose combined accuracy exceeds that of the best individual classifier.
During the previous academic year, a study (Hernández, 2014) used Genetic Algorithms specifically for this purpose; it produced the first version of a system called Splicing v1.2.
This work makes the necessary modifications to that system to obtain a more complete version that implements a new metaheuristic, ACO (Ant Colony Optimization), with several variants and several heuristics, to solve exactly the same problem.
The results are shown to be as good as those of Genetic Algorithms in terms of the classification accuracy of the resulting multiclassifier. In addition, the solutions obtained in this research contain fewer classifiers and are therefore less complex systems.
The ACO variants show very similar results to one another, although the best was MMAS (Max-Min Ant System), specifically with the diversity heuristic. Finally, an application in the field of medicine is presented.
Classification techniques are widely used in solving different problems in society. Several classification models are referenced in the literature, such as neural networks, classification trees, and discriminant analysis, among others. In recent research, many authors have introduced the term "multiple classifier" for a "classifier" that combines the outputs of a set of individual classifiers using certain criteria (e.g., average, majority vote, minimum). When combining classifiers, it is important to ensure diversity among them, because it would not make sense to combine classifiers whose classifications are the same. There are several models for constructing a multiple classifier system, and each ensures diversity in a different way. For those that use different base classifiers, some statistical measures, called diversity measures, can be used to estimate how diverse the classifiers are. The selection of the base classifiers for a multiple classifier system is a complex task, precisely because of the large number of individual classifiers and the many combinations they can generate. To address this combinatorial problem, the use of metaheuristics with diversity measures is proposed, to obtain a combination of diverse classifiers whose combined accuracy is superior to that of the best individual classifier. During the previous academic year, a study (Hernández, 2014) was carried out in which Genetic Algorithms were specifically used for this purpose; as a result, the first version of a system called Splicing v1.2 was obtained. In the present work, the necessary modifications are made to that system to obtain a more complete version in which a new metaheuristic, ACO, is implemented with different variants and several heuristics to solve exactly the stated problem.
The results obtained are shown to be as good as those of Genetic Algorithms with respect to the classification accuracy of the implemented multiclassifier. The solutions obtained in this research also contain fewer classifiers and are therefore less complex systems. The ACO variants show very similar results to one another, although the best was MMAS, specifically with the diversity heuristic. Finally, an application in the field of medicine is shown.
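As an illustration of the two ideas the abstract relies on, combining classifier outputs by majority vote and estimating how diverse the base classifiers are, the following is a minimal Python sketch. It is not the Splicing implementation. The abstract does not say which diversity measures the thesis uses, so the pairwise disagreement measure here is an assumption chosen for simplicity, and all function names and the toy data are hypothetical.

```python
from collections import Counter
from itertools import combinations


def majority_vote(predictions):
    """Combine per-classifier label lists (all the same length) by majority vote."""
    return [Counter(labels).most_common(1)[0][0] for labels in zip(*predictions)]


def accuracy(pred, truth):
    """Fraction of samples labelled correctly."""
    return sum(p == t for p, t in zip(pred, truth)) / len(truth)


def disagreement(pred_a, pred_b, truth):
    """Assumed diversity measure: fraction of samples on which exactly one
    of the two classifiers is correct."""
    differ = sum((a == t) != (b == t) for a, b, t in zip(pred_a, pred_b, truth))
    return differ / len(truth)


def mean_pairwise_disagreement(predictions, truth):
    """Average the disagreement measure over all pairs of classifiers."""
    pairs = list(combinations(predictions, 2))
    return sum(disagreement(a, b, truth) for a, b in pairs) / len(pairs)


# Toy data (hypothetical): three classifiers that err on different samples.
truth = [0, 1, 1, 0, 1, 0]
preds = [
    [0, 1, 1, 0, 0, 1],  # wrong on the last two samples
    [0, 1, 0, 1, 1, 0],  # wrong on the middle two samples
    [1, 0, 1, 0, 1, 0],  # wrong on the first two samples
]

ensemble = majority_vote(preds)
individual_accuracies = [accuracy(p, truth) for p in preds]
ensemble_accuracy = accuracy(ensemble, truth)
diversity = mean_pairwise_disagreement(preds, truth)
```

In this toy case each classifier is wrong on a different pair of samples, so the vote corrects every individual error. This is the situation the abstract describes: diverse base classifiers whose combination is more accurate than the best individual one. A metaheuristic such as ACO or a Genetic Algorithm would search over subsets of a much larger classifier pool using scores of this kind.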
Keywords
Multiclassifiers, Diversity Measures, Combinatorial Problem, ACO Metaheuristic, Max-Min Ant System (MMAS), Genetic Algorithms, Prediction, Arterial Hypertension, Children