Utilizing the HTM algorithms for weather forecasting and anomaly detection
Abstract
Various studies have used different artificial neural networks (ANNs) for weather forecasting. This thesis examines how well the official implementation of a novel online ANN called Hierarchical Temporal Memory (HTM) can forecast the weather and detect anomalies in weather data. Created by Numenta (www.numenta.com), HTM emulates the structures and processes of the brain's neocortex to mimic its capacity for memory retention. By using sparse distributed representations instead of dense binary representations as its foundation for storing and representing information, it can learn complex patterns in noisy data sets, which can then be used to make predictions and detect anomalies in streamed data. Numenta has officially implemented the theory of HTM in an open-source Python platform called NuPIC. Although there are slight differences between the theory of HTM and its implementation, the most important difference is NuPIC's addition of several purely engineered algorithms. Two of the most notable additions are an algorithm that enables NuPIC to make the final decision when more than one prediction is possible, and an algorithm that makes it possible to input multiple metrics to NuPIC simultaneously. The weather data to be predicted consisted of measurements of several weather factors (wind direction, wind speed, atmospheric pressure, precipitation, temperature, and relative humidity) spanning a period of 12 years. Originally, the goal was to input the data sets simultaneously. However, because the functionality responsible for this feature was malfunctioning at the time of the thesis work, each weather data set had to be input separately. The results showed that NuPIC was able to make decent forecasts, but was for the most part outperformed by a simple technique that made predictions by calculating the average of the last few days.
The main reasons for this were the weather's lack of similarity between past and current conditions and NuPIC's inability to generalize its knowledge in order to factor weather trends into its predictions. Although there is also a minor issue with the current engineered prediction algorithm, the results indicate that prediction is not NuPIC's strong suit. NuPIC was completely unable to detect any noteworthy anomalies in the weather data, which again is most likely due to the chaotic nature of the weather data. Despite these negative results, there were also some positive ones. An unrelated experiment on anomaly detection in the oil price revealed that NuPIC was able to detect anomalies linked to major real-world economic and/or geopolitical events. This indicates that the quality of NuPIC's results is highly dependent on the properties of the data set it is given. Data sets that conform to NuPIC's strengths can lead to both decent predictions and anomaly detections, while those that do not conform produce poor results.
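
The simple averaging technique that outperformed NuPIC can be illustrated with a minimal sketch. This is not the code used in the thesis: the function names, the window size of three days, and the temperature values are all hypothetical and chosen only for illustration. Each forecast is the mean of the preceding `window` observations, and the forecasts are scored by mean absolute error (the choice of error measure is likewise an assumption).

```python
from statistics import mean


def moving_average_forecast(values, window=3):
    """Forecast each value as the mean of the previous `window` observations.

    The window size is an assumption; the abstract only says "the last few days".
    """
    if window < 1:
        raise ValueError("window must be at least 1")
    forecasts = []
    for i in range(window, len(values)):
        # Average the `window` observations that precede position i.
        forecasts.append(mean(values[i - window:i]))
    return forecasts


def mean_absolute_error(actual, predicted):
    """Average absolute difference between observed and forecast values."""
    return mean(abs(a - p) for a, p in zip(actual, predicted))


# Hypothetical daily temperatures, not data from the thesis.
temps = [10.0, 12.0, 11.0, 13.0, 15.0, 14.0]
forecasts = moving_average_forecast(temps, window=3)

# The first `window` days have no forecast, so compare from day 4 onward.
error = mean_absolute_error(temps[3:], forecasts)
print(forecasts, error)
```

A baseline of this kind needs no training and adapts immediately to recent conditions, which is one plausible reason it is hard to beat on data where past patterns rarely recur.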