Caveat: I am not a technical investor, just a hobbyist, so take this analysis with a grain of salt. I am also only just beginning my Master’s work in statistics.

I wanted to examine the correlation between *changes* in the daily closing price of the Dow Jones Industrial Average (DJIA) and lags of those changes, to see if there is a pattern I could use. First I downloaded the DJIA data from Yahoo using Pandas:

    #
    # load useful libraries
    #
    # (pandas.io.data was later moved out of pandas into the separate
    # pandas-datareader package)
    from pandas.io.data import DataReader
    from datetime import datetime
    import matplotlib.pyplot as plt
    import numpy as np
    import math
    from scipy.stats import pearsonr
    from scipy.stats import spearmanr

    #
    # load DJIA data from Yahoo server
    #
    djia = DataReader("DJIA", "yahoo", datetime(2000, 1, 1), datetime.today())

I then generated the autocorrelation plots for the one-day differenced closing prices and the signs of the one-day differenced closing prices:

    #
    # investigate the diff between DJIA closing prices
    #
    diff_1st_order = djia["Adj Close"].diff()
    diff_1st_order_as_list = []
    for d in diff_1st_order:
        if not np.isnan(d):
            diff_1st_order_as_list.append(d)
    plt.subplot(2, 1, 1)
    plt.acorr(diff_1st_order_as_list, maxlags=10)
    plt.title("Autocorrelation of Diff of DJIA Adjusted Close")
    plt.xlabel("Lag")
    plt.ylabel("Correlation")

    #
    # sign of diff, not diff itself
    #
    diff_1st_order_sign = []
    for d in diff_1st_order_as_list:
        if not np.isnan(d / abs(d)):
            diff_1st_order_sign.append(d / abs(d))
        else:
            diff_1st_order_sign.append(0)
    plt.subplot(2, 1, 2)
    plt.acorr(diff_1st_order_sign, maxlags=10)
    plt.title("Autocorrelation of Sign of Diff of DJIA Adjusted Close")
    plt.xlabel("Lag")
    plt.ylabel("Correlation")

There is a very small negative correlation between the one-day closing price difference and the one-day lag of the one-day closing price difference. Similarly, there is an even smaller positive correlation between the one-day closing price difference and the three-day lag of the one-day closing price difference.
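To make the plotted quantity concrete, here is a minimal sketch of how a single lag's value can be computed by hand. It assumes the default normalization of `plt.acorr` (no mean removal, division by the series' dot product with itself). The helper name `lag_autocorr` and the synthetic series are my own illustration, not DJIA data.

```python
import numpy as np

def lag_autocorr(x, lag):
    """Normalized autocorrelation of x at a positive lag (lag >= 1).

    No mean is subtracted; the lagged dot product is divided by dot(x, x).
    """
    x = np.asarray(x, dtype=float)
    return np.dot(x[:-lag], x[lag:]) / np.dot(x, x)

# Synthetic stand-in for daily price differences (NOT real market data):
# white noise with a mild negative dependence on the previous day's noise.
rng = np.random.default_rng(0)
noise = rng.normal(size=5000)
diffs = noise[1:] - 0.1 * noise[:-1]

for lag in (1, 2, 3):
    print("lag {}: {:.3f}".format(lag, lag_autocorr(diffs, lag)))
```

On a series like this, the lag-1 value should come out slightly negative and the other lags close to zero, which is the kind of faint signal the plot above shows.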

So I set out to find how often the sign of the one-day difference differs from the sign of its one-day lag, that is, the proportion of days on which the closing price changes direction from one day to the next:

    #
    # frequencies of 1-day lag changes in direction of closing price
    #
    count_opposite = 0
    count_same = 0
    i_list = []
    j_list = []
    for i in range(0, len(diff_1st_order_as_list) - 1):
        price_diff_i = diff_1st_order_as_list[i]
        price_diff_j = diff_1st_order_as_list[i+1] # one trading day ahead
        i_list.append(price_diff_i)
        j_list.append(price_diff_j)
        # sign is -1, 0 or +1; a zero difference gives nan and stays 0
        sign_of_price_diff_i = 0
        if not np.isnan(price_diff_i / abs(price_diff_i)):
            sign_of_price_diff_i = int(price_diff_i / abs(price_diff_i))
        sign_of_price_diff_j = 0
        if not np.isnan(price_diff_j / abs(price_diff_j)):
            sign_of_price_diff_j = int(price_diff_j / abs(price_diff_j))
        if sign_of_price_diff_i == sign_of_price_diff_j:
            count_same += 1
        else:
            count_opposite += 1
    print
    print 'Correlation coefficients for the diff lists:'
    print '\t', 'Pearson R: ', pearsonr(i_list, j_list)[0]
    print '\t', 'Spearman R: ', spearmanr(i_list, j_list)[0]
    print
    print 'Amount of time closing value direction remains the same: ', round(float(count_same) / (float(count_same) + float(count_opposite)), 3)
    amount_time_changes = float(count_opposite) / (float(count_same) + float(count_opposite))
    print 'Amount of time closing value direction changes: ', round(amount_time_changes, 3)
    # note: p +/- z*sqrt(p*(1-p)/n) is the normal-approximation (Wald) form of the interval
    half_width = 1.959964 * math.sqrt(amount_time_changes * (1.0 - amount_time_changes)) / math.sqrt(float(len(diff_1st_order_as_list)))
    L = amount_time_changes - half_width
    U = amount_time_changes + half_width
    print 'Agresti-Coull C.I.: ', round(L, 3), '< p <', round(U, 3)
    print
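The same count can be sketched more compactly with NumPy. This is an illustrative alternative, not the code used for the results: the function name `direction_change_rate` and the example numbers are made up. As in the loop above, a zero difference is treated as its own "sign".

```python
import numpy as np

def direction_change_rate(diffs):
    """Fraction of consecutive pairs of differences whose signs differ."""
    signs = np.sign(np.asarray(diffs, dtype=float))  # -1, 0 or +1
    changes = signs[:-1] != signs[1:]                # compare each day with the next
    return changes.mean()

# Made-up differences: signs are +, -, +, +, -
example = [1.5, -0.2, 0.7, 0.3, -1.1]
print(direction_change_rate(example))  # 3 of the 4 consecutive pairs change sign
```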

This analysis tells me that in the long run (at least over the period for which I pulled DJIA data), betting on one-day changes of direction in the closing price of the DJIA would slowly pay off. However, this analysis ignores the magnitudes of the changes, which may be too small to cover the price of a trade; a future analysis will investigate this. The test for correlation between the two series (the one-day difference and its one-day lag) shows that they are nearly independent, so a binomial confidence interval of the kind computed above is appropriate, although this betting scheme relies on the thinnest of the correlations detected by the autocorrelation plot.
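One note on the interval itself: the formula in the code above, p ± z·sqrt(p(1 − p)/n), is the plain normal-approximation (Wald) form. The Agresti-Coull interval first adds z²/2 successes and z² trials and then applies the same formula to the adjusted proportion. A minimal sketch comparing the two follows; the function names and the counts are hypothetical, not results from the DJIA data.

```python
import math

Z_95 = 1.959964  # standard normal quantile for a 95% two-sided interval

def wald_interval(successes, n, z=Z_95):
    """Normal-approximation interval: p +/- z*sqrt(p*(1-p)/n)."""
    p = float(successes) / n
    half = z * math.sqrt(p * (1.0 - p) / n)
    return p - half, p + half

def agresti_coull_interval(successes, n, z=Z_95):
    """Agresti-Coull interval: add z^2/2 successes and z^2 trials first."""
    n_adj = n + z * z
    p_adj = (successes + z * z / 2.0) / n_adj
    half = z * math.sqrt(p_adj * (1.0 - p_adj) / n_adj)
    return p_adj - half, p_adj + half

# Hypothetical counts for illustration only
print(wald_interval(520, 1000))
print(agresti_coull_interval(520, 1000))
```

With thousands of trading days the two intervals nearly coincide, so the choice matters little here. The adjustment mainly matters for small samples or proportions near 0 or 1.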

There was another possible pattern in the autocorrelation plot above: a positive correlation at a three-day lag alongside the negative correlation at a one-day lag. I decided to check how often the two combine, that is, how often the current day's change has the opposite sign to the one-day lag and the same sign as the three-day lag. If the signs were independent and equally likely, this would happen 25% of the time, so a useful prediction needs a success rate above that null:

    #
    # frequencies of combination of 1-day lag and 3-day lag changes in direction
    # of closing price
    #
    count_matches = 0
    count_non_matches = 0
    i_list = []
    j_list = []
    k_list = []
    for i in range(0, len(diff_1st_order_as_list) - 3):
        price_diff_i = diff_1st_order_as_list[i]
        price_diff_j = diff_1st_order_as_list[i+1] # one trading day ahead
        price_diff_k = diff_1st_order_as_list[i+3] # three trading days ahead
        i_list.append(price_diff_i)
        j_list.append(price_diff_j)
        k_list.append(price_diff_k)
        # price_diff_i represents 3-day lag
        sign_of_price_diff_i = 0
        if not np.isnan(price_diff_i / abs(price_diff_i)):
            sign_of_price_diff_i = int(price_diff_i / abs(price_diff_i))
        # price_diff_j represents 1-day lag
        sign_of_price_diff_j = 0
        if not np.isnan(price_diff_j / abs(price_diff_j)):
            sign_of_price_diff_j = int(price_diff_j / abs(price_diff_j))
        # price_diff_k represents current day
        sign_of_price_diff_k = 0
        if not np.isnan(price_diff_k / abs(price_diff_k)):
            sign_of_price_diff_k = int(price_diff_k / abs(price_diff_k))
        if sign_of_price_diff_k != sign_of_price_diff_j and sign_of_price_diff_k == sign_of_price_diff_i:
            count_matches += 1
        else:
            count_non_matches += 1
    print 'Correlation coefficients for the diff lists:'
    print '\t', 'Pearson R: ', pearsonr(i_list, j_list)[0]
    print '\t', 'Spearman R: ', spearmanr(i_list, j_list)[0]
    print '\t', 'Pearson R: ', pearsonr(i_list, k_list)[0]
    print '\t', 'Spearman R: ', spearmanr(i_list, k_list)[0]
    print '\t', 'Pearson R: ', pearsonr(j_list, k_list)[0]
    print '\t', 'Spearman R: ', spearmanr(j_list, k_list)[0]
    print
    amount_time_changes = float(count_matches) / (float(count_matches) + float(count_non_matches))
    print 'Amount of time 1-day change is opposite direction and 3-day change is same direction: ', round(amount_time_changes, 3)
    # note: p +/- z*sqrt(p*(1-p)/n) is the normal-approximation (Wald) form of the interval
    half_width = 1.959964 * math.sqrt(amount_time_changes * (1.0 - amount_time_changes)) / math.sqrt(float(len(diff_1st_order_as_list)))
    L = amount_time_changes - half_width
    U = amount_time_changes + half_width
    print 'Agresti-Coull C.I.: ', round(L, 3), '< p <', round(U, 3)
    print
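The 25% null used above can be checked by brute force. This sketch assumes each day's sign is independent and equally likely to be up or down, and ignores days with no change; the variable names are illustrative. It enumerates every sign triple and counts those that satisfy the matching rule.

```python
from itertools import product

# All equally likely sign triples: (3-day lag, 1-day lag, current day)
outcomes = list(product([-1, 1], repeat=3))

# Matching rule from the loop above: the current day differs from the
# 1-day lag and agrees with the 3-day lag
matches = [(i, j, k) for (i, j, k) in outcomes if k != j and k == i]

null_proportion = float(len(matches)) / len(outcomes)
print(null_proportion)  # 2 matching triples out of 8
```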

To complete the code, we need to show the plot:

    #
    # show the plot
    #
    plt.show()

I’m not certain the narrow margin of opportunity detected by this analysis is sufficient for the development of a trading strategy. More investigation is needed.