Extensión de algoritmos representativos del aprendizaje automático al trabajo con datos tipo conjunto
Archivos
Fecha
2010-07-10
Autores
González Castellanos, Mabel
Título de la revista
ISSN de la revista
Título del volumen
Editor
Universidad Central “Marta Abreu” de Las Villas
Resumen
La forma de representación de los ejemplos en un conjunto de entrenamiento
resulta de extrema importancia para el concepto de aprendizaje automatizado.
Este constituye uno de los primeros pasos que se sigue en el diseño de un
sistema que se supere mediante la experiencia y consiste en la descripción del
problema mediante un conjunto de atributos con un determinado nivel de
medición.
Existen problemáticas con presencia de rasgos que pueden tomar varios valores
de manera simultánea para un mismo ejemplo. Dichos rasgos se pueden
representar de forma natural mediante conjuntos. Considerar conjuntos como
tipo de dato es una problemática poco abordada en el contexto del aprendizaje
automatizado. Las herramientas de aprendizaje automatizado disponibles no
brindan facilidades para el tratamiento de rasgos con las características
anteriormente descritas.
Considerando lo anterior esta investigación aborda el tratamiento de atributos de
tipo conjunto en el contexto de tres enfoques fundamentales del aprendizaje
automatizado: árboles de decisión, enfoque basado en instancias y el
probabilístico. Este trabajo incluye la presentación de nuevas propuestas
basadas en algoritmos de clasificación clásicos, que explotan los beneficios de
utilizar conjuntos como tipo de dato. El análisis experimental demuestra la
validez de las propuestas evaluadas.
The form of representation of the examples in a training set is of utmost importance to the concept of machine learning. This is one of the first steps to be followed in the design of a system that is exceeded by the experience and is the description of the problem through a set of attributes with a certain level of measurement. There are problems with the presence of traits that can take several values simultaneously for the same example. These features can be represented naturally by sets. Consider sets as a data type is a little problem addressed in the context of machine learning. Machine learning tools available do not provide facilities for the treatment of features with the characteristics described above. Considering the above, this research addresses the treatment of type attributes set in the context of three fundamental approaches to machine learning: focus lazy approach and probabilistic instances. This work includes the submission of new proposals based on traditional classification algorithms that exploit the benefits of using such type of data sets. The experimental analysis demonstrates the validity of the proposals evaluated.
The form of representation of the examples in a training set is of utmost importance to the concept of machine learning. This is one of the first steps to be followed in the design of a system that is exceeded by the experience and is the description of the problem through a set of attributes with a certain level of measurement. There are problems with the presence of traits that can take several values simultaneously for the same example. These features can be represented naturally by sets. Consider sets as a data type is a little problem addressed in the context of machine learning. Machine learning tools available do not provide facilities for the treatment of features with the characteristics described above. Considering the above, this research addresses the treatment of type attributes set in the context of three fundamental approaches to machine learning: focus lazy approach and probabilistic instances. This work includes the submission of new proposals based on traditional classification algorithms that exploit the benefits of using such type of data sets. The experimental analysis demonstrates the validity of the proposals evaluated.
Descripción
Palabras clave
Algoritmos Representativos, Extensión, Aprendizaje Automático, Datos Tipo Conjunto