Efficient Online Portfolio Selection with Heuristic AI Algorithm
A. Nazir

Online portfolio selection is an important problem in several research communities, including finance, engineering, statistics, artificial intelligence, and machine learning. The primary aim of online portfolio selection is to determine portfolio weights in every investment period (e.g., daily, weekly, or monthly) so as to maximize the investor's final wealth at the end of the investment horizon (e.g., 1 year or longer). In this paper, we present an efficient online portfolio selection strategy that employs a heuristic artificial intelligence (AI) algorithm to maximise the total wealth based on historical stock prices. In empirical studies conducted on recent historical datasets for the period 2000 to 2017 on four different stock markets (i.e., NYSE, S&P500, DJIA, and TSX), the proposed algorithm outperformed both Anticor and OLMAR, the two most prominent portfolio selection strategies in contemporary literature. On the NYSE market, the algorithm achieved a total wealth of 34.22, while Anticor and OLMAR returned 1.08 and 4.52, respectively. On the S&P500 market, the algorithm returned a total wealth of 15.39, while Anticor and OLMAR returned 6.20 and 2.2, respectively. On the TSX market, the algorithm returned a total wealth of 2.43, while Anticor and OLMAR returned 0.96 and 0.41, respectively.


Introduction
Online portfolio selection and simulation have recently attracted increasing interest in the machine learning and AI communities. Empirical evidence has shown that high and low stock prices are temporary, while stock price relatives are likely to adhere to the mean reversion property, which assumes that poorly performing stocks will perform well in subsequent periods and vice versa. Most recently, portfolio optimization algorithms have been developed based on the principles of the mean reversion property. In 2004, [2] published an influential paper titled "Can We Learn to Beat the Best Stock" in the Journal of Artificial Intelligence Research, demonstrating how a simple heuristic algorithm was able to consistently outperform the best stocks in the NYSE, TSX, S&P500, and DJIA stock markets for 9 years from 1994 to 2003. Interestingly, despite its simplicity, the algorithm significantly outperformed all other more sophisticated and theoretically proven algorithms in the literature. The paper prompted the research community to pursue further research in the area in order to develop more profitable algorithms. Since then, several other new algorithms have been proposed, including PAMR [10], CORN [8], and OLMAR [9]. Notably, the OLMAR algorithm has been shown in recent empirical studies (conducted after Anticor [2]) to outperform all other existing algorithms when evaluated in four different stock markets (i.e., NYSE, TSE, SP500, and DJIA) [9]. The most extensive empirical study was on historical NYSE data, evaluated from January 1985 to June 2010. Based on these empirical studies, [9] claim that, by the end of the investment period on the NYSE dataset, the OLMAR algorithm improved on the original Anticor algorithm by 3 orders of magnitude.
In this paper, we re-evaluate both algorithms on recent historical market data (i.e., from 2000 until 2017) in order to examine their applicability under current market volatility. Our examination yields several new findings that expose the weaknesses of both the Anticor and OLMAR algorithms. We further present a new strategy and algorithm that aims to tackle these weaknesses. Our empirical studies demonstrate that the new algorithm is more effective in terms of generating higher returns (i.e., total wealth), and that it also incurs lower investment risk than existing online portfolio selection algorithms.

Organization of the Paper
The rest of this paper is structured as follows. Section 2 gives some background and reviews the current state of the art in the field of portfolio selection and optimization. In particular, we describe the Anticor and OLMAR algorithms, which are considered the state-of-the-art algorithms in the field. Section 3 evaluates both Anticor and OLMAR on the most recent historical datasets from four different stock markets and presents the results. The objective is to examine the applicability of both algorithms in current market conditions. Section 4 further discusses the strengths and weaknesses of both algorithms and presents the idea of using market indices or benchmark indices to minimize investment risks. Section 5 presents our new strategy, which makes use of both a benchmark index and a dynamic moving average model based on the mean reversion property. Section 6 presents the results of the empirical studies on the proposed strategy. Section 7 presents some caveats and obstacles to utilizing the new strategy, and Section 8 concludes the paper with final remarks and future work.

Portfolio Selection Algorithms
There are two categories of portfolio selection algorithms. The first category consists of theoretically grounded algorithms, while the second is based on heuristics. Earlier portfolio selection algorithms were developed around a theoretical guarantee of exponential growth, aiming to accumulate as much wealth as possible by rebalancing the portfolio after each trading day. The concept is to allocate a proportionate amount of the investment to a set of individual stocks so that the wealth accumulates at an exponential rate until the end of the investment period. Such algorithms include Universal Portfolio [3], Stochastic Linear-Quadratic-Exponential [?], and Online Newton Step [1]. These algorithms aim to accumulate wealth through a sequential rebalancing strategy given a sufficiently long period of time. Although very elegant in their mathematical formulation, they have displayed very disappointing performance in practical applications [2,10].
More recent algorithms have employed heuristic strategies to maximise the total wealth based on historical stock prices. They have been shown to outperform all theoretical algorithms in empirical studies. However, only a handful of heuristic strategies have been proposed so far. These include Anticor [2], Kalman Filtering [11], and OLMAR [7,9]. Anticor was the first algorithm shown to outperform all theory-based algorithms, including Nonparametric Nearest Neighbor [5], Nonparametric Nearest Neighbor Log-optimal [4], Online Newton Step [1], Exponential Gradient [6], and Universal Portfolio [3]. However, the latest empirical studies have shown that OLMAR outperformed both Anticor and all other algorithms in the literature on three major historical datasets: the NYSE, S&P500, and TSX markets. Independent studies conducted by Paul Perry [12] on more recent ETF datasets have also validated the superiority of the OLMAR algorithm over other existing algorithms. Interestingly, the OLMAR algorithm is based on the original concept of Anticor's price mean reversal. The difference is that Anticor is based on a single-period price reversal, while OLMAR exploits the multi-period price reversal correlation to further increase the accuracy of the prediction. Recently, [11] also claimed that their proposed algorithm gives better profitability than Anticor, but the algorithm has not been validated extensively enough to be considered a serious contender.
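The notion of accumulating wealth through sequential rebalancing can be made concrete with a short sketch. It assumes the usual convention that the price relative of a stock is the ratio of its closing price to the previous closing price, and that transaction costs are ignored; the function name and the toy data are illustrative only and do not come from the paper.

```python
import numpy as np

def cumulative_wealth(price_relatives, weights, initial_wealth=1.0):
    """Wealth after each period for a rebalanced portfolio.

    price_relatives: array (T, m), entry [t, i] = price_t / price_{t-1} of stock i
    weights:         array (T, m), portfolio held in period t (rows sum to 1)
    Assumption: no transaction costs, full rebalancing every period.
    """
    x = np.asarray(price_relatives, dtype=float)
    b = np.asarray(weights, dtype=float)
    period_growth = np.sum(b * x, axis=1)        # b_t . x_t for each period
    return initial_wealth * np.cumprod(period_growth)

# Toy data: two stocks, three periods, rebalanced to 50/50 every period.
x = np.array([[1.10, 0.90],
              [0.90, 1.10],
              [1.20, 1.00]])
b = np.full_like(x, 0.5)
wealth = cumulative_wealth(x, b)   # final entry is the "total wealth" multiple
```

A total wealth of, say, 4.52 in the sense used throughout this paper corresponds to the last entry of such a wealth path when the initial wealth is 1.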

The Anticor Algorithm Revisited
In 2004, [2] published a very simple heuristic algorithm that has been demonstrated to outperform all other existing portfolio selection algorithms in the literature. While traditional universal algorithms and technical trading heuristics attempt to predict winners or trends, their approach, known as the Anticor algorithm, relies on predictable statistical relations among all pairs of stocks in the portfolio. The principle of the Anticor (AC) algorithm is to evaluate changes in overall stocks' performance by dividing the historical sequence of past returns series into equal-size periods known as windows, each with a length of w days, where w is an adjustable parameter.
Following the mean reversion principle, the algorithm then transfers wealth from recently high-performing stocks to anti-correlated low-performing stocks. The idea is that low-performing stocks will eventually rise back toward their mean prices.
Initially, Anticor captures a short stock market history over two consecutive windows, LX_1 and LX_2, each of w trading days [13,2]. LX_1 and LX_2 are the two vector sequences constructed by taking the logarithm of the market subsequences corresponding to the time windows [t − 2w + 1, t − w] and [t − w + 1, t], respectively. The window size w is chosen based on historical performance; in the empirical studies of [2], the value chosen for the best performance is w = 30. Next, the cross-correlation matrix M_cor between the column vectors of LX_1 and LX_2 is calculated. The algorithm generates a transfer signal when two conditions hold. The first condition is that stock i has outperformed stock j during the last window, i.e., µ_2(i) ≥ µ_2(j), where µ_2(k) denotes the mean of the column of LX_2 for stock k. The second condition is that stock i's performance in the second-last window is positively correlated with stock j's performance in the last window, i.e., M_cor(i, j) > 0. If both criteria are met, the algorithm transfers weight allocation from stock i to stock j in the expectation that stock j will rise, leading to higher profits.
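For reference, the window and cross-correlation definitions described above can be written out as follows. This is a reconstruction following the standard formulation in [2], not a copy of the display equations of this paper; x_s denotes the vector of price relatives on day s, LX_k(j) the column of LX_k for stock j, and µ_k(j), σ_k(j) its mean and standard deviation.

```latex
\begin{align}
LX_1 &= \bigl(\log x_{t-2w+1}, \ldots, \log x_{t-w}\bigr)^{\top}, &
LX_2 &= \bigl(\log x_{t-w+1}, \ldots, \log x_{t}\bigr)^{\top}, \\
M_{\mathrm{cov}}(i,j) &= \frac{1}{w-1}\,
  \bigl(LX_1(i) - \mu_1(i)\bigr)^{\top}\bigl(LX_2(j) - \mu_2(j)\bigr), \\
M_{\mathrm{cor}}(i,j) &=
\begin{cases}
  \dfrac{M_{\mathrm{cov}}(i,j)}{\sigma_1(i)\,\sigma_2(j)} & \text{if } \sigma_1(i)\,\sigma_2(j) \neq 0, \\[1ex]
  0 & \text{otherwise.}
\end{cases}
\end{align}
```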
Despite the algorithm's simplicity, the empirical results for four major market indices (NYSE, S&P500, DJIA, and TSX) from July 1962 to April 2013 have provided strong evidence that the Anticor algorithm is able to significantly beat the market. Moreover, it also beat the best stocks in their respective markets.
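The two transfer conditions of Anticor described above can be sketched in a few lines of NumPy. This is a minimal illustration under stated assumptions, not the implementation evaluated in this paper: it only computes the cross-correlation matrix and the boolean transfer signals, and omits the later steps of the full algorithm in [2] (sizing the claims and normalizing the transfers). The function name is hypothetical.

```python
import numpy as np

def anticor_signals(price_relatives, w):
    """Cross-correlation matrix and transfer signals for the latest two windows.

    price_relatives: array (T, m) with T >= 2*w, rows ordered oldest to newest.
    Returns (m_cor, transfer) where transfer[i, j] is True when wealth
    should move from stock i to stock j.
    """
    x = np.asarray(price_relatives, dtype=float)
    if w < 2 or x.shape[0] < 2 * w:
        raise ValueError("need w >= 2 and at least 2*w periods of history")

    lx1 = np.log(x[-2 * w:-w])           # window [t-2w+1, t-w]
    lx2 = np.log(x[-w:])                 # window [t-w+1, t]
    mu1, mu2 = lx1.mean(axis=0), lx2.mean(axis=0)
    sigma1 = lx1.std(axis=0, ddof=1)
    sigma2 = lx2.std(axis=0, ddof=1)

    # m_cov[i, j]: covariance of stock i in window 1 with stock j in window 2
    m_cov = (lx1 - mu1).T @ (lx2 - mu2) / (w - 1)
    denom = np.outer(sigma1, sigma2)
    m_cor = np.divide(m_cov, denom, out=np.zeros_like(m_cov), where=denom != 0)

    outperformed = mu2[:, None] >= mu2[None, :]   # mu_2(i) >= mu_2(j)
    transfer = outperformed & (m_cor > 0)
    np.fill_diagonal(transfer, False)             # no transfer from a stock to itself
    return m_cor, transfer

# Illustrative call on synthetic data (seeded so the run is repeatable).
rng = np.random.default_rng(0)
demo_relatives = np.exp(rng.normal(0.0, 0.01, size=(60, 4)))
demo_cor, demo_transfer = anticor_signals(demo_relatives, w=30)
```

Because the signal depends on w through both window means and the correlation estimate, small changes in w can flip individual entries of the transfer matrix, which is one way to see why the choice of window size matters.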

The OLMAR Algorithm Revisited
The On-Line Portfolio Selection with Moving Average Reversion (OLMAR) algorithm was initially inspired by the Anticor algorithm, which makes use of a single-period mean reversion correlation to determine how wealth should be transferred from high-performing stocks to poorly performing stocks. [9] found that Anticor's single-period mean reversion approach has several drawbacks, as stock prices frequently fluctuate due to inherent noise.
To overcome this limitation, the authors proposed the OLMAR algorithm, which makes use of both the short-term and the long-term mean based on multiple-period mean reversion, the so-called "Moving Average Reversion" (MAR), to explicitly predict the next price relatives using moving averages. Initially, the OLMAR algorithm calculates the moving average reversion estimate, that is, the expected price relative vector for the next period, from the prices in the most recent window, where w is the window size and ⊗ denotes the element-wise product. The algorithm passively keeps the previous solution if the classification is correct, while aggressively approaching a new solution if the classification is incorrect. The algorithm then solves a quadratic optimization problem to determine the optimal weight allocation for each selected stock in the portfolio. The OLMAR algorithm employs two controlling parameters: the window size w and the reversion threshold ϵ. The window size is the maximum lookback period, i.e., the number of past days the algorithm considers when calculating the moving average. The reversion threshold ϵ is used to maximize the expected return based on historical performance. At each time step, the algorithm updates the portfolio based on the predicted price relative vector, the window size w, the reversion threshold ϵ, and the portfolio weight allocation b_t at time t. Based on the empirical evaluation, it was demonstrated that the parameters ϵ = 10 and w = 5 provide consistent results for OLMAR in all cases. The values of the parameters were chosen using "educated guesses", followed by some trial-and-error experimentation to fine-tune the performance. Li et al. found that with these settings OLMAR achieves the top performance among all competitors. On the well-known benchmark NYSE(O), OLMAR significantly outperforms all other algorithms, including Anticor. Similar observations were made for the NYSE(N) portfolio.
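The expected price relative vector referred to above can be written as follows. This is a reconstruction of the moving average reversion estimate as formulated in [9], given here as an aid to the reader; p_t is the price vector at time t, x_t the price relative vector, and divisions are element-wise.

```latex
\begin{equation}
\tilde{x}_{t+1}(w) \;=\; \frac{\mathrm{MA}_t(w)}{p_t}
 \;=\; \frac{1}{w}\left(\frac{p_t}{p_t} + \frac{p_{t-1}}{p_t} + \cdots + \frac{p_{t-w+1}}{p_t}\right)
 \;=\; \frac{1}{w}\left(\mathbf{1} + \frac{\mathbf{1}}{x_t} + \cdots
   + \frac{\mathbf{1}}{\bigotimes_{i=0}^{w-2} x_{t-i}}\right)
\end{equation}
```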
However, OLMAR was unable to beat the Anticor on the DJIA dataset. This has cast some doubt on the effectiveness of the OLMAR algorithm. Nonetheless, these doubts demonstrate the need for further examination of both algorithms in detail to determine both their strengths and weaknesses.
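To make the passive-aggressive update concrete, the following sketch implements one OLMAR step in the closed form commonly given for the algorithm in [9]: the portfolio is left unchanged when its predicted return already reaches ϵ, and is otherwise moved along the mean-centred prediction and projected back onto the simplex. The helper names are hypothetical, and the sketch is an illustration under these assumptions rather than the code used for the experiments in this paper.

```python
import numpy as np

def project_to_simplex(v):
    """Euclidean projection of v onto {b : b >= 0, sum(b) = 1}."""
    v = np.asarray(v, dtype=float)
    u = np.sort(v)[::-1]                        # sorted in descending order
    css = np.cumsum(u) - 1.0
    k = np.arange(1, v.size + 1)
    rho = np.nonzero(u - css / k > 0)[0][-1]    # last index satisfying the condition
    theta = css[rho] / (rho + 1)
    return np.maximum(v - theta, 0.0)

def olmar_step(prices, b_t, w=5, eps=10.0):
    """One OLMAR update from the last w price vectors.

    prices: array (T, m) of raw prices with T >= w, oldest to newest.
    b_t:    current portfolio (length m, non-negative, sums to 1).
    """
    p = np.asarray(prices, dtype=float)
    b_t = np.asarray(b_t, dtype=float)
    x_pred = p[-w:].mean(axis=0) / p[-1]        # moving average / latest price
    dev = x_pred - x_pred.mean()                # mean-centred prediction
    denom = float(dev @ dev)
    if denom == 0.0:
        lam = 0.0                               # all predictions equal: nothing to exploit
    else:
        lam = max(0.0, (eps - float(b_t @ x_pred)) / denom)
    return project_to_simplex(b_t + lam * dev)

# Stock A has fallen below its 5-day average, stock B has risen above it.
demo_prices = np.array([[10.0, 6.0], [9.0, 7.0], [8.0, 8.0], [7.0, 9.0], [6.0, 10.0]])
b_next = olmar_step(demo_prices, [0.5, 0.5], w=5, eps=10.0)
```

With ϵ = 10 the threshold is far above any realistic predicted return, so the update is aggressive at every step and tends to concentrate the portfolio in the stocks furthest below their moving averages, which is consistent with the large single-day losses discussed later in the paper.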

Evaluating Anticor and OLMAR
In [2], empirical studies on four historical market datasets were examined. Each study involved benchmarks from a different stock market index (i.e., NYSE, S&P500, DJIA, and TSX). Comparisons were made across all existing portfolio selection strategies. Anticor showed outstanding performance on the NYSE, TSX, S&P500, and DJIA indices for the period spanning Jan 1998 to 2003. [9] also conducted a detailed empirical study of Anticor and existing portfolio strategies using selected stocks from the NYSE, DJIA, and TSE stock indices. A portfolio of 17 stocks was selected from the NYSE market and tested from 1985 to 2010; a portfolio of 30 stocks was selected from the DJIA index and tested from 2001 to 2003; and a portfolio of 22 stocks was selected from the TSE market and tested from 1994 to 1998. [9] also proposed the OLMAR algorithm, which was shown to be more effective than the Anticor algorithm on these datasets. Interestingly, the empirical study also demonstrated that none of the other existing algorithms was able to outperform either Anticor or OLMAR on any stock index, while OLMAR was shown to perform significantly better than Anticor, by 3 orders of magnitude, on almost all datasets (apart from the DJIA market). [11] conducted a preliminary evaluation of various portfolio selection algorithms on previously untested market datasets. The evaluation includes the TOP (South Africa), FTSE (UK), TSE (Canada), and NASDAQ100 (US) indices over a recent period spanning Jan 2000 to Oct 2013. Datasets from previous studies were also used as benchmarks. The authors proposed an alternative variant of the Anticor algorithm that uses the Kalman Filtering method to enhance the strategy, which was shown to be effective based on empirical results. However, this achievement was no better than that of OLMAR, which was able to outperform the Anticor algorithm by 3 orders of magnitude.
Based on past empirical evaluations, Anticor has been evaluated on several stock market indices up to 2013, whereas the OLMAR algorithm has only been evaluated on market datasets up to 2010. It would therefore be interesting to find out how both algorithms perform on more recent market data up to the year 2017. We therefore re-evaluate both the Anticor and OLMAR algorithms using the most recent market data. In order to avoid experimental bias, we use the same stock selection as [2]. The aim is to examine how these algorithms would perform in the current market. At the time of writing, the latest market data that can be obtained is from December 2017. Figure 1 shows the total returns of both Anticor and OLMAR on four historical datasets (i.e., S&P500, NYSE, TSX, and DJIA) based on the stock selection of [2]. The OLMAR algorithm spectacularly outperforms Anticor in most markets, the exception being the S&P500 portfolio, where OLMAR incurs a significant loss during the 2008 recession and Anticor starts to outperform it. In all markets, both algorithms lost a significant portion of their previous gains in less than one year during 2008. Several important findings may be drawn from this observation. First, OLMAR works spectacularly well in bull market conditions. Empirical results show that it performed exceedingly well during the period spanning 2003 to 2007, which is considered a bull period in US stock market history. However, both Anticor and OLMAR carry great risk during recession or bear periods, when they wiped out a large proportion of investor wealth. In 2008, we observe a very sharp decline in total wealth due to the large losses the algorithms incur. In fact, all returns and wealth generated between 2000 and 2007 were wiped out in just one year after the 2008 market recession.
From these findings, the particular weaknesses of both algorithms are apparent. OLMAR performs spectacularly well in bull conditions but incurs significant losses in bear markets. Similarly, Anticor also incurs significant losses during the recession. It can be observed that the gains made by both algorithms from 2005 to 2007 were wiped out in 2008. Hence, both algorithms are considered risky, because there are several instances in which they lost more than 20% in a single day during the investment period. Furthermore, both algorithms are unable to generate consistent returns in most years. The OLMAR algorithm incurs significant losses from 2010 to 2013; this is clearly observed for the NYSE and TSX markets. Surprisingly, the Anticor algorithm outperforms the OLMAR algorithm during the same period. Nonetheless, both algorithms continue to incur further losses from 2014 onwards. The OLMAR algorithm is based on multi-period mean reversion, which assumes that the prices of poorly performing stocks will return to their historical averages over time. However, as can be observed from the empirical results, this assumption raises a few crucial issues. Given market volatility, it is not clear what the best way to compute the price mean is. What window size w should be used: the last 5 days, 10 days, 20 days, or 30 days? Depending on the window size, the trend will be interpreted differently. [2] suggested w = 30 as the optimal window size for the NYSE, DJIA, S&P500, and TSX historical datasets. Of course, it may be argued that the optimal parameter largely depends on the characteristics of the specific historical dataset. Hence, it is important to determine the optimal window size w in order to achieve the best performance.
Figures 2 and 3 show the cumulative wealth of Anticor and OLMAR on four portfolios drawn from four different market indices under varying window sizes (w = 3, w = 5, w = 10, and w = 15) during the period Jan 2000 to Dec 2017. As can be observed, the optimal window size varies from one index to another. For the NYSE portfolio, w = 10 gives the best total wealth, whereas for the other portfolios the best profits are achieved with different window sizes. The results reveal that a larger window does not always improve performance. For example, w = 3 gives better performance than w = 15 on the NYSE stock portfolio. In fact, the window size w = 30 gives the worst performance on the NYSE data.
The bottom left panel of Figure 3 further illustrates the instability of the window size. In previous studies, [2] claimed that window size does not have a significant impact on performance. Our results demonstrate otherwise. Window size w = 10 generates the best returns for the NYSE portfolio, but the same window size incurs the worst loss on the S&P500. This demonstrates that one specific window size can provide the best returns on one market index (i.e., NYSE) while at the same time providing the worst returns on another (i.e., S&P500). It is certainly not claimed here that the results presented previously in [2] were erroneous, or that the new results in this paper somehow invalidate those earlier results. Rather, in the light of empirical evaluation on more recent datasets, we have gained a new and more informed perspective on both the Anticor and OLMAR results presented in [2] and [9]; those earlier results now appear less promising in light of experiments with a more realistic scenario and more recent historical data.
Table 2. The longest and the largest drawdown for individual stocks (from different market indices) before the prices reversed back to their former historical averages
Next, we examine the reasons for the poor performance of these two algorithms. Earlier, we noted that both algorithms posted significant losses during the recession period of 2008-2009. Table 1 shows the maximum daily trading loss incurred by both algorithms during this period on the NYSE dataset. The Anticor algorithm suffers the largest maximum loss, with a one-day loss of 47% on 15 September 2008, while OLMAR also made a large one-day loss of 41% on 24 September 2008. This is due to the overoptimistic strategy of the mean reversion principle. In the historical data, there are a number of cases in which stock prices took a very long time to return to their former historical averages. For example, AIG declined continuously from November 2008 until the end of 2010 (more than 2 years) before it started to revert to its former historical average. Hence, a strategy that always anticipates that stocks will revert to their historical averages will incur significant losses during such a period. This explains why both algorithms (especially Anticor) experienced significant losses during the recession period or bear market.
To illustrate this point, Table 2 further shows the longest drawdown incurred by both Anticor and OLMAR (in number of days) for individual stocks (from various market indices) and the largest drawdown they incur before the prices started to recover to their former historical averages. It can be seen that the F stock (i.e., Ford Motor Company) took 333 days before its price returned to its former historical mean (from March 2011 to Nov 2012). On the other hand, the MCD stock (i.e., McDonald's) took only 29 days to revert to its price mean during the same time period. The results demonstrate that mean reversion between pairs of stock prices cannot be used to reliably determine when stocks will recover to their historical averages. In some cases, the waiting time may be very long, and it varies significantly from one stock to another. Hence, this simple observation indicates that a single optimal window size is not sufficient to determine even a short-term trend in stock prices.
Negative news and/or changes to a company's fundamentals can also have a sudden impact on its stock price, regardless of how good the stock's fundamentals are. Such an impact may last several days. For example, on 15 September 2008, the American International Group Inc. (AIG) posted its largest 1-day loss, a 60.8% fall in its share price, after the company failed to present a plan to raise capital and avert credit rating downgrades. In the following days, its share price continued to decline by 21% and 45%, respectively. As previously shown in Table 1, both OLMAR and Anticor made huge losses in a single day due to this phenomenon. A large portion of wealth was wiped out during such events, highlighting the inherent risk associated with both algorithms. Worryingly, there are also many historical cases in which stocks never revert to their original mean at all because of fundamental problems at the company. The AIG stock is one such example. AIG was initially listed in the DJIA stock index but was later removed due to its continuous streak of bad performance.
In 1991, [3] published an influential paper on the first online portfolio selection algorithm and made its experimental dataset available to the public. Since then, the datasets have been used as a benchmark by all newly proposed algorithms. There are, however, several considerable issues with the data. First, several stocks that were originally listed in the dataset are no longer among the largest constituents of the index. Second, they may no longer provide market liquidity. Third, using old datasets may introduce potential dataset selection and/or data-snooping biases. Dataset selection and snooping biases occur when a given set of data is used more than once for purposes of inference or model selection [14]. When such data reuse occurs, there is always a possibility that any satisfactory results obtained may simply be due to chance rather than to any merit inherent in the method yielding the results. Hence, a more comprehensive test of performance across variations is needed to provide some degree of confidence that results generated by chance are not mistaken for genuinely good results.
To avoid dataset selection and/or data-snooping biases, we created four new historical datasets from the four different markets (i.e., NYSE, DJIA, S&P500, and TSX). The stock selection is based on 2 important criteria: 1) the selected stock must have been listed in the index from the beginning of 2000 (since this is the starting point of our test runs); and 2) the selected stock must belong to one of the largest companies by market capitalization, with high liquidity. The results are summarized in Figure 4, which shows the accumulated total wealth attained by both Anticor and OLMAR. A comparison with the earlier results (i.e., Figure 1) reveals significant differences in performance (i.e., total wealth, annualized return, and Sharpe ratio). Both Anticor and OLMAR performed worse on the new stock selection (i.e., lower wealth and lower Sharpe ratio) in all 4 index markets: NYSE, TSX, S&P500, and DJIA. For the new NYSE portfolio, OLMAR only managed to achieve a total wealth of 4.17 and a Sharpe ratio of 0.14. Compared with Borodin's original NYSE dataset over the same period (i.e., 2000-2017), where the total wealth was 13.12 with a Sharpe ratio of 0.4, this amounts to less than one third of the total wealth.
In the new DJIA portfolio, the differences are more apparent when the portfolio of stocks is evaluated over a longer period. Earlier results on Borodin's DJIA dataset demonstrated that OLMAR generated a total wealth of 1.54, an annualized return of 24.34%, and a Sharpe ratio of 0.44 between 2001 and 2003. Fig. 1c shows the results of the DJIA portfolio over the extended period of 2000 to 2017. Anticor achieved a total wealth of 1.42, while OLMAR only achieved a total wealth of 1.21, which represents annualized returns of only 1.9% and 1.2%, respectively. This implies that the strategies gain no more than 0.2 in total wealth per year on average. Further, they obtained very low Sharpe ratios of 0.09 and 0.03, respectively. This is significantly lower than the total wealth of 1.54 that OLMAR achieved over the 2-year period (i.e., 2001-2003) on Borodin's dataset. Such large differences demonstrate that both the Anticor and OLMAR algorithms are not robust across a wide selection of stock portfolios, despite being validated on the same market indices. Significant differences in performance can be clearly observed not only across different market datasets, but also across different selections of stocks on the same market index and across different time periods. These findings cast serious doubt on past empirical results concerning the robustness of both Anticor and OLMAR. The results clearly demonstrate that both algorithms failed to adapt to new, unseen market data and market volatility. If the algorithms cannot cope with different sets of stocks under varying market conditions (i.e., bull and bear markets), they will attract little practical interest.
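The two quantities discussed above, the largest drawdown and the longest time spent below a previous level, can be computed from a price series as in the sketch below. It uses the running peak as the reference level, which is an assumption made for illustration; the reference used for Table 2 (a "former historical average") may be defined differently, and the function name is hypothetical.

```python
import numpy as np

def drawdown_stats(prices):
    """Return (largest drawdown as a fraction, longest run of periods below the peak)."""
    p = np.asarray(prices, dtype=float)
    running_peak = np.maximum.accumulate(p)   # highest price seen so far
    drawdown = 1.0 - p / running_peak         # 0 at a new peak, positive below it

    longest = current = 0
    for below_peak in drawdown > 0:
        current = current + 1 if below_peak else 0
        longest = max(longest, current)
    return float(drawdown.max()), longest

# Toy series: a 20% dip that takes three periods to recover, then a smaller dip.
largest_dd, longest_run = drawdown_stats([100, 90, 80, 95, 100, 110, 105])
```

A mean reversion strategy that keeps buying a stock throughout a long run of this kind is exposed to the whole drawdown, which is the mechanism behind the losses described for stocks such as F and AIG.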
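The relation between total wealth, annualized return, and Sharpe ratio used in the comparison above can be sketched as follows. The sketch assumes geometric compounding over whole years, 252 trading days per year, and a zero risk-free rate by default; the exact conventions behind the figures reported in this paper are not stated, so small differences from the reported values are to be expected. The function names are illustrative.

```python
import numpy as np

def annualized_return(total_wealth, years):
    """Constant yearly growth rate that compounds to total_wealth over the period."""
    return total_wealth ** (1.0 / years) - 1.0

def sharpe_ratio(period_returns, risk_free_rate=0.0, periods_per_year=252):
    """Annualized Sharpe ratio from per-period (e.g. daily) simple returns."""
    excess = np.asarray(period_returns, dtype=float) - risk_free_rate / periods_per_year
    return float(np.sqrt(periods_per_year) * excess.mean() / excess.std(ddof=1))

# Total wealth of 1.54 over 2 years compounds at roughly 24% per year,
# while 1.42 over 18 years compounds at roughly 2% per year.
short_run = annualized_return(1.54, 2)
long_run = annualized_return(1.42, 18)
```

The contrast between the two calls shows why a similar total wealth figure is far less impressive over 18 years than over 2 years.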

Using Market indices and/or Benchmark indices to Minimize Risks
In the previous section, we have seen how both Anticor and OLMAR incur significant losses during certain market periods, especially during bear or recession times. The risk are too great. We saw many instances in both Anticor and OLMAR in which more than 20% of investor's wealth could be wiped in a single day. This is due to the "nonmean reversion" phenomena, in which the invested stocks fail to recover their prices from their original means. The risks involved in such strategies are perhaps a primary reason for why investors prefer to invest in index funds rather than actively managed funds. Fund managers commonly agree that the market recession and the bull market can be detected earlier from the index trend since these indices take a broad economic view. In 2008, a steep declined in stock prices may be observed for all four market indices. Such indicator provided an early sign of recession or bear market periods. A major pragmatic question is whether one can utilize such index information to guide the investment decision by both Anticor and OLMAR. A promising approach is to make use of simple statistical relations of the overall market sentiments. Market indices can be very useful a complementary tool to detect early signs of bull, bear, or flat market. The portfolio strategy would then take advantage of such detection to determine whether to invest or not in a particular day. If bull market sign is detected, the best strategy may be to maximize investment; if a bear market sign is detected, the strategy may consider not investing in the market; if flat sign is discovered, the optimum strategy is perhaps to make a conservative investment.
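One simple way to turn an index series into the bull, bear, or flat signal described above is to compare the latest index level with its moving average. The sketch below is purely illustrative: the 50-day window, the 2% band, and the exposure levels are hypothetical choices made for this example and are not the rule or the parameters of the strategy proposed in this paper.

```python
import numpy as np

# Hypothetical mapping from detected regime to the fraction of wealth invested.
EXPOSURE = {"bull": 1.0, "flat": 0.5, "bear": 0.0}

def market_regime(index_levels, window=50, band=0.02):
    """Classify the market from the gap between the index and its moving average."""
    idx = np.asarray(index_levels, dtype=float)
    if idx.size < window:
        raise ValueError("not enough index history for the chosen window")
    moving_average = idx[-window:].mean()
    gap = idx[-1] / moving_average - 1.0      # relative distance from the average
    if gap > band:
        return "bull"
    if gap < -band:
        return "bear"
    return "flat"

rising = np.linspace(100.0, 150.0, 60)        # steadily rising index
falling = rising[::-1]                        # steadily falling index
regime_up = market_regime(rising)
regime_down = market_regime(falling)
fraction_invested = EXPOSURE[regime_down]     # stay out of the market in a bear regime
```

A rule of this kind reacts with a lag that grows with the window length, so the same trade-off between responsiveness and noise discussed for the window size w of Anticor and OLMAR applies here as well.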
Both Anticor and OLMAR employ a mean reversion strategy that exploits the properties of financial markets, assuming that poorly performing stocks will perform well in subsequent periods and vice versa. However, the empirical results presented earlier indicate that this is often not the case. History offers several cases in which individual stocks continued to decline in price, driven by two major factors: (1) a major recession, such as during a bear market, and (2) a negative outlook among investors due to bad fundamentals, rumours, and/or a company's internal problems. Such phenomena explain why both Anticor and OLMAR suffer heavily in selected periods, in which the majority of stocks tend to decline in value regardless of their fundamentals.
We propose a portfolio selection and optimization strategy that aims to minimize the risk of bad investments so that investors' wealth can be protected and sustained during bad periods. The first step of the proposed strategy is to use the market index to generate a benchmark index, which is used to evaluate the risk involved in investing on a particular day. Based on the calculated risk, a decision is made either to invest or not to invest. The second step, taken only when the first step has triggered a buy signal, is to determine a subset of stocks to invest in, assigning greater weights to the stocks that would offer higher gains under the mean reversion principle (i.e., stocks whose prices have a high likelihood of reverting to their historical averages). The rationale behind this approach is to avoid investing during a bear market while maximizing profits during a bull market. We demonstrate on real historical datasets from various markets that the proposed strategy offers better returns with lower risk than existing approaches.

Dynamic Moving Average Model with Benchmark Index (DMA-BI)
We propose the Dynamic Moving Average model with Benchmark Index (DMA-BI), which aims to take advantage of the mean reversion phenomenon at minimal risk. Like Anticor and OLMAR, the strategy exploits the fact that large potential gains can be realized from mean reversion (i.e., reversal to the mean), but candidate trades are filtered using the benchmark index to avoid risky trades. That is the motivation behind our DMA-BI strategy. By filtering out trades that carry high risk, DMA-BI achieves strong performance under reasonable transaction costs.
The strategy has two main steps. The first step is to identify the trend movement of the next day in order to determine whether it is worthwhile to trade. For this purpose, the strategy predicts the next day's trend movement from the current market trend. To do so, the strategy creates a benchmark index that accurately represents the market index from which the stocks are drawn. Let p, n and m be the number of stocks in the portfolio, the number of stocks in the benchmark index, and the number of stocks in the market index, respectively. The benchmark index may contain exactly the same number of stocks as its market index if that number is small (i.e., n = m), or it may contain fewer stocks than the market index (n < m). The former is often the case for a small market index such as the DJIA, which comprises only 30 stocks. However, some market indices include a large number of stocks, and it is costly and impractical to include every stock in the benchmark index. In such a scenario, the benchmark index must accurately replicate the movement of the market index with a relatively small number of stocks. In our algorithm, we select the stocks for the benchmark index by market capitalization: the largest 100 companies in each market index are selected. They are chosen because they have the largest impact on the index, offering a high probability of accurately replicating the market index.
Let s denote a stock selected for the benchmark index. The benchmark index comprises a subset of chosen stocks S = {s_1, s_2, ..., s_n}. The trend movement of stock s is calculated from the moving average of the previous k days' closing prices. If those prices are s_t, s_{t−1}, ..., s_{t−k+1}, then the trend movement for stock s is calculated based on its moving average, where k is chosen based on historical performance. In our empirical studies, k ranges over 1 ≤ k ≤ 300 to capture short- and medium-term trends.
Thus, the benchmark index for the current day can be computed as follows:

The benchmark index bi is calculated dynamically every day after the market closes. It serves as a metric for estimating the latest market trend before any investment decision is made. First, the trend movement between today's stock price and the historical price is computed for the chosen value of k. For example, if k = 1, the trend movement tm(s)_1 for stock s is computed; if k = 10, the trend movement tm(s)_10 for stock s is computed.
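The following Python sketch illustrates how a trend movement and a benchmark index of this kind could be computed. The exact formulas are not given in the text above, so the sketch fixes one plausible reading as an assumption: tm(s) is taken to be today's close divided by the mean of the k preceding closes (so that values below 1 indicate a decline), and bi is taken to be the equal-weighted mean of tm(s) over the benchmark stocks. The function names and the input layout are hypothetical.

```python
from statistics import mean

def trend_movement(closes, k):
    """Assumed form of tm(s): today's close over the mean of the k prior closes.

    `closes` is ordered oldest to newest and must hold at least k + 1 prices.
    With k = 1 this reduces to the day-over-day price relative.
    """
    if k < 1 or len(closes) < k + 1:
        raise ValueError("need k >= 1 and at least k + 1 closing prices")
    today = closes[-1]
    previous = closes[-k - 1:-1]
    return today / mean(previous)

def benchmark_index(benchmark_closes, k):
    """Assumed form of bi: equal-weighted mean of tm(s) over benchmark stocks.

    `benchmark_closes` maps a ticker to its list of closing prices.
    """
    return mean(trend_movement(c, k) for c in benchmark_closes.values())
```

Under these assumptions a stock whose latest close sits 10% above its recent average has tm(s) = 1.1, and a benchmark made of one such stock and one stock 10% below its average has bi = 1.0. A capitalization-weighted benchmark would be an equally reasonable alternative reading.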
Based on both tm(s) and the benchmark index bi, the algorithm then decides whether stock s is safe to invest in. If the individual stock s declines at a faster rate than the benchmark index, that is, tm(s) < 1, bi < 1 and tm(s) < bi, the stock is considered risky because it is declining faster than the benchmark index. This implies a high probability that the stock price will decline further due to negative factors (e.g., bad earnings, bad news events, negative shareholder reactions, etc.). Hence, such a stock will not be chosen for the next day's trade, which ensures that only non-risky stocks are selected for investment. The algorithm triggers a buy signal for stock s only if the benchmark index is positive and the trend movement for stock s exceeds the benchmark index.
Formally, let the set K represent all non-risky stocks in a portfolio, that is, the stocks that remain after filtering. The buy signal for a stock s is denoted f(s); it is triggered only when the benchmark index is positive and the trend movement of s exceeds the benchmark index. The algorithm filters out all risky stocks (i.e., s ∉ K), and once the set K of non-risky stocks is defined, the next step of our strategy is to maximize profit by selecting the most profitable stocks to trade. This is partly inspired by the mean reversion principle, which suggests that poorly performing stocks will revert to their original prices. Hence, the strategy prioritizes the worst performing stocks that have a high likelihood of returning to their historical averages.
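The filtering rules can be sketched in Python as follows. The risk test mirrors the three inequalities stated in the text; the buy signal additionally assumes that a "positive" benchmark index means bi > 1, which is an interpretation rather than something the text states. Function names and the dictionary input are hypothetical.

```python
def is_risky(tm, bi):
    """Risk test from the text: the stock falls faster than a falling benchmark."""
    return tm < 1 and bi < 1 and tm < bi

def buy_signal(tm, bi):
    """f(s), assuming a 'positive' benchmark index means bi > 1."""
    return bi > 1 and tm > bi

def filtered_candidates(tm_by_stock, bi):
    """Return the non-risky set K, ordered from lowest to highest tm(s).

    The ascending order reflects the stated preference for the worst
    performers among the stocks that pass the filter.
    """
    k_set = [s for s, tm in tm_by_stock.items() if buy_signal(tm, bi)]
    return sorted(k_set, key=lambda s: tm_by_stock[s])
```

For instance, with bi = 1.03 and trend movements {"A": 1.05, "B": 1.02, "C": 0.95, "D": 1.10}, only A and D pass the filter, and A is ranked first because its trend movement is lower.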
In order to give the worst performing stocks a higher weight allocation, we need to compute the total trend movement value of all stocks in the portfolio K. If there are n assets in the portfolio K, the total trend movement value tmv of the portfolio is computed over these n assets. Suppose y and z are portfolio weight vectors; the algorithm should then choose portfolio y over portfolio z if the total trend movement value of y is less than that of z.
Hence, if tmv(y) and tmv(z) represent the total trend movement values of portfolios y and z, we can define the preference relation as follows:

$$ y \succ z \iff \mathit{wa}\{y, z \in K \mid y.\mathrm{tmv}(y) < z.\mathrm{tmv}(z)\} < \mathit{wa}\{y, z \in K \mid z.\mathrm{tmv}(z) < y.\mathrm{tmv}(y)\} \tag{5} $$

where wa is a parameter that controls the proportion of stocks to be selected for allocation.

Next, we need to consider the risk level of each stock based on its historical volatility. The risk is identified as the variance of the portfolio σ², for which the variance-covariance matrix Σ is computed. We also calculate the average risk level avgσ² over all stocks in the portfolio. We can then find an optimal portfolio P (from the set of stocks in portfolio K) whose risk is less than the average risk level avgσ². Further, we need to find a solution that minimizes the total trend movement value subject to a maximum risk constraint. This is achieved by finding the minimum of the linear function m (a vector m) over the set P of portfolios respecting this constraint. Suppose y and z are arbitrary portfolios.

Table 3. DMA-BI results in terms of the total wealth, Sharpe ratio, and annualized returns based on the historical datasets for 4 market indices.
Finally, we can find the optimum portfolio by solving the following linear program with quadratic constraints:
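One program consistent with the description above can be written as follows. It is a sketch under stated assumptions rather than the exact formulation: w denotes the weight vector over the stocks in K, m the vector of their trend movement values, and Σ the variance-covariance matrix; the budget constraint and the long-only constraint are assumptions added to make the feasible set well defined.

```latex
\begin{aligned}
\min_{w} \quad & m^{\top} w \\
\text{subject to} \quad & w^{\top} \Sigma\, w \le \mathrm{avg}\sigma^{2}, \\
& \sum_{i \in K} w_i = 1, \qquad w_i \ge 0 \quad \forall i \in K .
\end{aligned}
```

The objective is linear in w and the risk bound is quadratic, which matches the description of a linear program with quadratic constraints. When Σ is positive semidefinite the problem is convex.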

Empirical Studies
The work reported in this paper is motivated by the belief that the portfolio selection strategy combining both a market index predictor (i.e., benchmark index) and selecting stocks with high likelihood of mean reversion would improve the performance in terms of achieving higher wealth and lower investment risk.
To validate this, we present an experimental comparison of the DMA-BI strategy with both Anticor and OLMAR. Four main historical datasets are used, each from a different market. The first, the NYSE dataset, comprises a selection of stocks from the NYSE market during the period 2000 to 2017. The stocks are chosen based on a number of criteria. The NYSE dataset comprises the top 100 stocks by market capitalization. To avoid data-snooping bias, market capitalization was taken as listed in the year 2000. This represents a realistic scenario, since the strategy does not know whether the same 100 stocks will remain in the largest market capitalization category in subsequent years. Similarly, the top 100 stocks (by market capitalization) are selected for the other market indices, such as the SP100 and TSX indices. The only exception is the DJIA, since the index comprises at most 30 stocks; hence, all 30 stocks from the DJIA are included in the DJIA dataset. The benchmark index is formed from the top 20 stocks (by market capitalization) for all markets (i.e., NYSE, S&P500, DJIA and TSX). To facilitate comparisons, all datasets begin in Jan 2000 and end in Dec 2017. Hence, we evaluate the performance of DMA-BI against Anticor and OLMAR over this period.

Table 3 reports the performance summary of the DMA-BI strategy against both Anticor and OLMAR on four different markets, i.e., NYSE, S&P500, DJIA, and TSX. The performance is shown in terms of the total wealth, the Sharpe ratio, and the annualized return. Overall, the DMA-BI strategy generates higher returns with less risk than the other two strategies. In particular, the strategy produces its strongest returns on the NYSE and S&P markets, with total wealth of 34.22 and 15.38, respectively.
For the TSX market, DMA-BI also outperforms both Anticor and OLMAR, generating a total wealth of 2.43, while Anticor achieves only 0.96 and OLMAR only 0.41. This shows that the DMA-BI strategy is superior to both Anticor and OLMAR on the NYSE, S&P, and TSX markets. However, we can also observe that the DMA-BI strategy fails to outperform Anticor on the DJIA market. To understand this, it is necessary to examine the overall index performance of all four markets, especially the DJIA and TSX markets. Table 4 shows the average performance of our portfolio on all four markets. On close examination, the DJIA and TSX portfolios perform very poorly compared with the NYSE and S&P portfolios. For the last 15 years, the DJIA returned a total wealth of only 2.34 on average, whereas all other stock index portfolios returned above 4.5. This explains the lack of performance of the DMA-BI strategy on the DJIA portfolio. Nonetheless, DMA-BI still achieves a higher total wealth (2.64) than the DJIA's average (2.34). This indicates that the DMA-BI strategy can outperform the market benchmark even when the index itself performs poorly. Furthermore, the DMA-BI strategy significantly outperforms both Anticor and OLMAR on the TSX portfolio. Surprisingly, both Anticor and OLMAR suffer wealth losses on the TSX portfolio, with Anticor losing 0.24% annually and OLMAR losing even more, at 5.4% annually. In contrast, the DMA-BI strategy achieves positive returns with an average annualized return of 5.75%.
For the DJIA dataset, Anticor achieves a total wealth of 5.18, which is more than double that of the DJIA (2.34). Further, while OLMAR incurs the worst return, Anticor surprisingly outperforms DMA-BI, generating a total wealth of 5.18 compared with the 2.64 generated by the DMA-BI strategy. This is quite impressive, but since the Anticor algorithm performs very poorly on the other three datasets, it is worth examining why Anticor gives such a high return in this case. Table 5 shows the breakdown of the largest gains achieved by both strategies during the investment period. The Anticor strategy generates a very high return of 46% in a single day, whereas the highest single-day gain achieved by DMA-BI is below 20% (i.e., 15.8%). However, the losses incurred by Anticor are also significantly higher than those of DMA-BI: Anticor incurred three large daily losses of more than 20% each. Further examination shows that the difference in Sharpe ratio between DMA-BI and Anticor is small: DMA-BI has a Sharpe ratio of 0.25 and Anticor has a Sharpe ratio of 0.21. This illustrates that Anticor's strategy is riskier than the DMA-BI strategy. Despite earning a lower return than Anticor, the DMA-BI strategy generates profit at much lower risk; unlike Anticor, DMA-BI never loses more than 20% of its wealth in a single day.

Table 6 provides more detailed information on the top losses incurred by Anticor, OLMAR, and DMA-BI on the four stock markets. DMA-BI incurs much smaller losses than Anticor and OLMAR in all four markets. For the NYSE market, Anticor incurred the worst loss at 55%, OLMAR also incurred a very high loss of 41%, while DMA-BI incurred a loss of only 9%.
Similar observations were also made on all remaining markets.
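The performance measures used in this comparison (total wealth, annualized return, Sharpe ratio, and the largest daily losses) can be computed from a series of daily portfolio returns as in the Python sketch below. The 252 trading days per year, the zero default risk-free rate, and all function names are assumptions made for illustration; the text does not state which conventions were used.

```python
from math import sqrt
from statistics import mean, stdev

TRADING_DAYS = 252  # assumed number of trading days per year

def total_wealth(daily_returns):
    """Final wealth per unit of initial capital."""
    wealth = 1.0
    for r in daily_returns:
        wealth *= 1.0 + r
    return wealth

def annualized_return(daily_returns):
    """Compound annual growth rate implied by the total wealth."""
    years = len(daily_returns) / TRADING_DAYS
    return total_wealth(daily_returns) ** (1.0 / years) - 1.0

def sharpe_ratio(daily_returns, risk_free_annual=0.0):
    """Annualized Sharpe ratio; the risk-free rate is an assumed input."""
    excess = [r - risk_free_annual / TRADING_DAYS for r in daily_returns]
    return mean(excess) / stdev(excess) * sqrt(TRADING_DAYS)

def worst_daily_losses(daily_returns, n=3):
    """The n most negative daily returns, largest loss first."""
    return sorted(r for r in daily_returns if r < 0)[:n]
```

Here a daily return of 0.01 means a 1% gain, so a strategy that loses 20% in one day contributes a factor of 0.8 to the total wealth.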
Table 6. Comparison of the top daily losses incurred by Anticor, OLMAR, and DMA-BI on NYSE, S&P, DJIA, and TSX markets. The results are reported in terms of percentage of daily loss, starting from the largest to the lowest losses for each strategy.

Next, we examine the performance of DMA-BI under varying values of the weight allocation parameter. As previously mentioned, the DMA-BI strategy employs the weight allocation parameter wa, which satisfies 1/f ≤ wa ≤ 1, where f is the number of filtered stocks. Given multiple filtered stocks, this parameter controls the proportion of stocks allocated in the trading decision. For example, if wa = 1, all filtered stocks are selected for allocation, whereas if wa = 1/f, only one stock is chosen for the final allocation. Figure 6 illustrates the performance achieved for various weight allocations wa. The results show that the performance of DMA-BI varies as wa changes: the total wealth decreases as the weight allocation increases. Interestingly, however, the Sharpe ratio increases when the weight allocation increases.

This seems contradictory on the surface, but on closer inspection the reason can be seen from Table 7, which reports the maximum profits and losses incurred for various weight allocation parameters. Both gains and losses decrease significantly when the weight allocation increases. With wa = 0.0125, the strategy generates gains of 15.8%, 13.0%, and 11.2% in a single day. However, as wa increases to 0.12, the strategy earns a maximum gain of only 10.5% in a single day, while the remaining gains in the top 10 are under 10%. This pattern is more apparent when the weight allocation is increased further to wa = 0.5: there is not a single day on which the strategy achieves a gain higher than 10%, and the highest gain is only 9.1%.
This implies that by increasing the weight allocation wa, the strategy tolerates less risk at the expense of lower profits. Hence, investors can tune this parameter appropriately to achieve the most comfortable level of risk.
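The role of wa can be illustrated with a short Python sketch that keeps a share wa of the ranked, filtered stocks. Rounding the count up with `ceil`, the equal split across the selected stocks, and the function names are assumptions for illustration only; the text specifies only the two endpoints (wa = 1/f keeps one stock, wa = 1 keeps all f).

```python
from math import ceil

def select_by_weight_allocation(ranked_stocks, wa):
    """Keep the top share `wa` of the ranked, filtered stocks.

    `ranked_stocks` is assumed to be ordered by preference (best first).
    With f filtered stocks, wa = 1/f keeps one stock and wa = 1 keeps all.
    """
    f = len(ranked_stocks)
    if f == 0:
        return []
    if not 1 / f <= wa <= 1:
        raise ValueError("wa must satisfy 1/f <= wa <= 1")
    # Small tolerance guards against floating-point overshoot in wa * f.
    count = ceil(wa * f - 1e-9)
    return ranked_stocks[:count]

def equal_weights(selected):
    """Assumed equal split of capital across the selected stocks."""
    return {s: 1 / len(selected) for s in selected}
```

A small wa concentrates capital in very few stocks, which is consistent with the larger single-day gains and losses reported for small wa, while a larger wa spreads capital across more stocks.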
Next, it is crucial to validate the DMA-BI strategy under trading costs and commissions. Previous studies assume that investors pay a rate of c/2 for each buy and for each sell, and the return of a sequence b_1, ..., b_n of portfolios with respect to a market sequence is computed net of these commissions. Figure 7 illustrates the performance of the strategy with proportional commission fees c = 0.1%, 0.2%, ..., 0.6%. For example, if the commission fee is 0.1% per transaction, $1 is paid for every $1000 of stocks bought or sold. Even such a commission rate is considered aggressive, since the current average commission charged by most brokers is $7.99, and active daily traders can find several quality brokerage firms charging $4 to $5 per trade. Nonetheless, the results show that the DMA-BI strategy withstands reasonable commission rates and still generates high profits. For example, on the NYSE market the algorithm still outperforms at c = 0.2% and continues to do so until c reaches 0.6%, where the total wealth starts to deteriorate. This is very reasonable, since most stock brokers charge very small commissions, and some charge a very small flat commission for high-volume trades. Given a large scale of investment (more than $1,000,000 in daily trades), investors can save further on commissions and therefore suffer only a very small proportional transaction deduction.
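A proportional commission model of this kind can be sketched in Python as follows. This is a simplified illustration under stated assumptions, not the exact expression used in the experiments: every unit of wealth bought or sold is charged c/2, the investor starts from cash, and the second-order effect of fees on the achievable weights is ignored. All function names are hypothetical.

```python
def rebalance_cost_factor(held, target, c):
    """Fraction of wealth left after rebalancing from `held` to `target`.

    Both arguments are weight lists. Each unit of wealth bought or sold
    is charged c/2, so the charge is (c/2) * sum(|target - held|).
    """
    turnover = sum(abs(t - h) for t, h in zip(target, held))
    return 1.0 - (c / 2.0) * turnover

def drifted_weights(weights, price_relatives):
    """Weights after one day of price moves, before any rebalancing."""
    grown = [w * x for w, x in zip(weights, price_relatives)]
    total = sum(grown)
    return [g / total for g in grown]

def net_wealth(portfolios, market, c):
    """Cumulative wealth of b_1..b_n on price relatives x_1..x_n, net of fees.

    Starts from cash, so the purchase of b_1 is also charged.
    """
    wealth = 1.0
    held = [0.0] * len(portfolios[0])
    for b, x in zip(portfolios, market):
        wealth *= rebalance_cost_factor(held, b, c)
        wealth *= sum(w * r for w, r in zip(b, x))
        held = drifted_weights(b, x)
    return wealth
```

Switching the whole portfolio from one stock to another sells one unit of wealth and buys one unit, so with c = 0.2% the factor is 1 − 0.001 × 2 = 0.998.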
We may conclude from the results presented here that DMA-BI outperforms both Anticor and OLMAR, as well as the market index, under various conditions, the one exception being Anticor's higher total wealth on the DJIA market. This indicates that using market index information as a benchmark index and using a dynamic moving average to select profitable filtered stocks are indeed beneficial. These results encourage us to speculate that more advanced prediction techniques could be developed that rely on the benchmark index to determine whether the market trend is a bull or a bear. Being able to determine this trend, we can further identify and select the most profitable stocks to invest in based on the mean reversion principle.

Other Caveats
Before concluding, we discuss some remaining caveats. A related problem that one faces when actually trading is the difference between bid and ask prices. These bid-ask spreads (and the availability of stocks for both buying and selling) are functions of stock liquidity and are typically small for large market capitalization stocks. Our DMA-BI strategy is largely unaffected by this problem because we consider only very large market cap stocks (i.e., the top 100 largest companies by market capitalization). Moreover, the strategy creates a benchmark index from the market index to estimate the market trend. Hence, we acknowledge that our strategy may not work effectively for a portfolio comprising small caps or penny stocks, because such stocks often do not move in tandem with the market index.

Any report of abnormal returns using historical markets should be suspected of data-snooping biases. In particular, previous studies assume that all stocks were traded every day and that there were no bankruptcies or stocks that became virtually worthless in any of these datasets. We have removed this assumption in our datasets: we included all stocks that were listed in the indices at the beginning of 2000 (since our datasets start from 2000). This represents a realistic scenario because it is not possible to anticipate which stocks will leave the index in the future. Even under these strict conditions, our DMA-BI strategy withstands the test of time and generates high returns.
Finally, stock selection is another potential data-snooping hazard. One could claim that we simply chose the stocks that would generate good performance for our strategy. However, our strategy has been tested with the 100 largest companies in four different markets over a long period (i.e., 15 years) ending in Dec 2017.

Concluding Remarks
From the results summarized and analysed in this paper, DMA-BI is a genuine improvement on both the Anticor and OLMAR strategies. Our investigation included a standard comparison of total wealth, annualized returns, and the Sharpe ratio as a measure of risk. We have demonstrated that the DMA-BI strategy substantially outperforms both Anticor and OLMAR, the best strategies currently reported in the literature. We have also proposed a reason for this difference in performance: the DMA-BI strategy learns to avoid risky investments based on the historical performance of the market index, rather than simply trading every day. Once a bull market has been detected, the DMA-BI strategy employs both moving averages and mean reversion with volatility (covariance) optimization techniques to identify the most profitable stock(s) while minimizing risk. To our knowledge, existing online portfolio selection algorithms (including Anticor and OLMAR) have not considered risk at all, let alone minimized it.
Whilst the strategy has proven effective in our experiments, it is plausible that other techniques, such as more sophisticated machine learning methods, may further improve its performance. In particular, the strategy chooses the k and wa values used to compute the moving average and the weight allocation so as to capture both short-term and long-term trends. In our empirical studies, these parameters were calculated dynamically on a daily basis from the benchmark index. However, a more intelligent approach would be to tune these parameters adaptively through multi-objective parameter optimization, using additional information such as individual stock historical patterns, correlations between stocks, and the benchmark index as well as their historical price averages. We have not explored this direction, but it would be interesting to examine the impact of such additional parameters. Searching for multi-objective optimal parameters is likely to require nonlinear multivariate analysis techniques. If such optimal parameters can be identified, the next problem is to find a way to adjust those independent parameter values "on the fly" as the market changes dynamically.
One final caveat must be mentioned. Namely, the efficient market theory states that any trading and/or portfolio selection algorithm has no extra edge on the market because the market will quickly react to any method which does consistently and substantially beat the market. Like any other strategy, the widespread use of the DMA-BI strategy may soon lead to the end of spectacular returns.