Data scientist enters the office betting pool: whoever most accurately predicts the day Austin’s heat wave breaks (the first day with a high temperature below 100 degrees Fahrenheit) wins. Seeking a defensible approach, our hero generates an ARIMA time-series forecast based on the last eleven years of daily high temperatures:
The model suggests September 1st will be the next day when the high temperature does not reach 100 degrees F. Badass Data Science will report the model’s accuracy once the measurements are in.
The data come from the Camp Mabry weather station, downloaded from ftp.ncdc.noaa.gov/pub/data/gsod/. The model was fit with the “auto.arima” function from the R “forecast” package.
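Turning a forecast into a betting-pool entry takes one more step: scanning the predicted daily highs for the first one under 100 degrees F. Here is a minimal sketch of that step in Python (not the R code used for the post). The function name `first_day_below`, the start date, and the sample numbers are all made up for illustration and are not the model's actual output.

```python
from datetime import date, timedelta
from typing import Optional, Sequence

THRESHOLD_F = 100.0  # the heat wave "breaks" on the first high below this


def first_day_below(
    forecast_highs: Sequence[float],
    first_date: date,
    threshold: float = THRESHOLD_F,
) -> Optional[date]:
    """Return the date of the first forecast high strictly below threshold.

    forecast_highs[0] is the forecast for first_date, forecast_highs[1]
    for the following day, and so on. Returns None if the forecast never
    drops below the threshold within its horizon.
    """
    for offset, high in enumerate(forecast_highs):
        if high < threshold:
            return first_date + timedelta(days=offset)
    return None


# Hypothetical point forecasts and an arbitrary start date (illustration only).
example_highs = [103.2, 101.5, 100.0, 99.4, 98.7]
print(first_day_below(example_highs, date(2000, 1, 1)))  # prints 2000-01-04
```

Note that a forecast high of exactly 100 does not count as a break, matching the pool's "less than 100" rule. A point forecast also hides the model's uncertainty, so a more careful entry would look at the prediction intervals too.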
This forecast is low-hanging fruit. Next time, our hero will include more historical data, the southern oscillation pattern, Hadley cell expansion, sunspots, Congressional hot air, and the DJIA in the model.