Data Mining for Time Series in Weka and Its Application to Rainfall Forecasting
Date
2013-07-06
Authors
Soto Valero, César
Publisher
Universidad Central "Marta Abreu" de Las Villas
Abstract
Time series can describe a wide variety of phenomena that unfold over time. Models that analyze time series using data mining techniques can solve many problems, overcoming the limitations of traditional statistical methods. Weka is a powerful machine learning system; however, it offers very few facilities for working with time series.
This diploma thesis designs and implements a package for Weka with tools developed specifically for time series classification. These tools are: a distance function based on DTW, a nearest neighbor search algorithm for kNN, and a filter for reducing the numerosity of time series. In addition, it addresses the rainfall forecasting problem by applying data mining techniques to the numerical outputs of the GFS global model.
Keywords
Data Mining, Time Series, Weka, Rainfall Forecasting, Package, Classification, DTW, GFS