
[1911.11819v1] Cryptocurrency Price Prediction and Trading Strategies Using Support Vector Machines
In both classification accuracy and backtest performance, SVM was our best model
Abstract: Few assets in financial history have been as notoriously volatile as cryptocurrencies. While the long-term outlook for this asset class remains unclear, we are successful in making short-term price predictions for several major crypto assets. Using historical data from July 2015 to November 2019, we develop a large number of technical indicators to capture patterns in the cryptocurrency market. We then test various classification methods to forecast short-term future price movements based on these indicators. On both PPV and NPV metrics, our classifiers do well in identifying up and down market moves over the next 1 hour. Beyond evaluating classification accuracy, we also develop a strategy for translating 1-hour-ahead class predictions into trading decisions, along with a backtester that simulates trading in a realistic environment. We find that support vector machines yield the most profitable trading strategies, which outperform the market on average for Bitcoin, Ethereum and Litecoin over the past 22 months, since January 2018.
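To make the classification setup concrete, here is a minimal sketch of how 1-hour returns could be turned into three classes and how a per-class precision could be computed. Several things here are assumptions and are not taken from the paper: the class names `up`, `down`, and `flat` (the paper uses c1, c2, c3 without the mapping being given on this page), the use of simple rather than log returns, symmetric thresholds at -0.005 and +0.005, and the reading of PPV and NPV as the precision of "up" and "down" predictions respectively.

```python
def hourly_returns(prices):
    """Simple 1-hour-ahead returns r_t = p_{t+1} / p_t - 1 (assumed definition)."""
    return [nxt / cur - 1.0 for cur, nxt in zip(prices, prices[1:])]


def label(ret, threshold=0.005):
    """Map a return to a hypothetical class name using symmetric thresholds."""
    if ret > threshold:
        return "up"
    if ret < -threshold:
        return "down"
    return "flat"  # small moves; expected to be the majority class


def precision(actual, predicted, cls):
    """Fraction of points predicted as `cls` whose true class is `cls`.

    Returns None when the classifier never predicted `cls`.
    """
    truths = [a for a, p in zip(actual, predicted) if p == cls]
    if not truths:
        return None
    return sum(a == cls for a in truths) / len(truths)


# Toy hourly prices (made-up numbers, for illustration only).
prices = [100.0, 101.0, 101.2, 100.0]
labels = [label(r) for r in hourly_returns(prices)]

# Hypothetical true and predicted classes for four hours.
actual = ["up", "flat", "down", "up"]
predicted = ["up", "up", "down", "flat"]
ppv_like = precision(actual, predicted, "up")    # precision of "up" calls
npv_like = precision(actual, predicted, "down")  # precision of "down" calls
```

Because most points fall in the small-move class, per-class precision of this kind is more informative than overall accuracy, which is dominated by the majority class.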
Figure 1: Historical volumes for BTC, ETH, and LTC from July 2015 to the present are highly volatile, as well as highly correlated with each other and with historical prices. (Coinbase data)
Figure 2: LEFT: The distribution of 1-hour returns for BTCUSD from July 2015 to the present. The two vertical lines indicate the thresholds for class membership, at -0.005 and 0.005. RIGHT: We show the 9-month rolling proportion of points in each class, from 2016 to the present, for BTCUSD. We will use this 9-month rolling window to retrain our classifiers, as described in Section ??. These classes are highly imbalanced, with the majority of points falling in class c3. We clearly see a higher proportion of the classes c1 and c2 during the volatile bubble and crash periods of 2017-2018. (Problem Formulation)
Figure 3: The first 50 autocorrelations and partial autocorrelations of 1-hour log returns for BTCUSD are all close to zero. ETHUSD and LTCUSD exhibit similar properties. (Problem Formulation)
Figure 4: PPV and NPV values for our classifiers, averaged from Jan 2018 to present. (Classification accuracy)
Figure 5: This table shows monthly backtest returns for SVM for Bitcoin from Jan 2018 to the present, along with the total compounded return and standard deviation of returns. SVM outperforms the market somewhat consistently, while having lower volatility. The return correlation of SVM to the market is 35.5% over this period. The plot shows these returns cumulated over the backtest period, i.e. the hypothetical growth of $100 in our portfolio vs. the BTC market. (Backtest results)
Figure 6: We show monthly backtest returns for SVM for Ethereum and Litecoin from Jan 2018 to the present. For both assets, SVM outperforms the market while having lower volatility. The return correlation of SVM to the market over this period is 0.1% for ETHUSD, and 18.5% for LTCUSD. (Backtest results)
Figure 7: Hypothetical growth of $100 in our portfolio vs. the market, for ETH and LTC. (Backtest results)
Figure 8: We compare the monthly returns for SVM and the market in BTCUSD, along with rolling market volatility. We see that our strategy is more likely to trade when market volatility is high, and often does not trade at all in low-volatility periods, when it is more “uncertain” of any price moves. (We explore this further in Appendix D.) Our strategy is also relatively uncorrelated with the market. (Backtest results)
Figure 9: An illustration of Bollinger bands, the Bollinger level (current price relative to upper and lower bands), and the Bollinger width (rolling volatility). (A.1. Bollinger bands)
Figure 10: An illustration of the components of the moving-average convergence-divergence (MACD) metric. The blue line is the MACD line, which is the fast EMA minus the slow EMA. The red line is the signal line, which is an EMA of the MACD line. The gray bars show the histogram, computed as the MACD line minus the signal line. (A.2. Moving average convergence-divergence)
Figure 11: These tables compare the overall classification accuracy of SVM, random forest, and XGBoost for the 3 crypto assets we tested. SVM consistently performs better than the other 2 classifiers, and also performs significantly better than chance. It is worth noting that some of these figures may overstate the effectiveness of our models, since the classes are highly imbalanced, as described in Section ??. Thus, when certain months have very high overall accuracy, this may reflect the fact that the vast majority of points in that month fall into class c3, i.e. we rarely make any trades. (Appendix C. Extended classification accuracy results)
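The MACD and Bollinger captions describe indicators that are easy to sketch in code. The following is an illustration only, not the authors' feature pipeline. The MACD relations (fast EMA minus slow EMA, signal as an EMA of the MACD line, histogram as their difference) follow the Figure 10 caption. Everything else is an assumption: the spans 12, 26, and 9, the EMA smoothing factor 2 / (span + 1) seeded with the first value, the 20-period window and band multiplier of 2, and the specific formulas for Bollinger level and width.

```python
from statistics import mean, pstdev


def ema(values, span):
    """Exponential moving average, alpha = 2 / (span + 1), seeded with values[0]."""
    alpha = 2.0 / (span + 1)
    out = [values[0]]
    for v in values[1:]:
        out.append(alpha * v + (1.0 - alpha) * out[-1])
    return out


def macd(prices, fast=12, slow=26, signal=9):
    """Return (macd_line, signal_line, histogram) as lists aligned with prices."""
    fast_ema = ema(prices, fast)
    slow_ema = ema(prices, slow)
    macd_line = [f - s for f, s in zip(fast_ema, slow_ema)]  # fast EMA minus slow EMA
    signal_line = ema(macd_line, signal)                     # EMA of the MACD line
    histogram = [m - s for m, s in zip(macd_line, signal_line)]
    return macd_line, signal_line, histogram


def bollinger(prices, window=20, k=2.0):
    """Return (level, width) for the latest price over the trailing window.

    level: position of the latest price between the lower (0) and upper (1)
           band; values outside [0, 1] mean the price is outside the bands.
    width: band width relative to the rolling mean, a rolling-volatility proxy.
    """
    recent = prices[-window:]
    mid = mean(recent)
    sd = pstdev(recent)
    upper, lower = mid + k * sd, mid - k * sd
    if upper == lower:
        level = 0.5  # flat window: bands collapse onto the mean
    else:
        level = (prices[-1] - lower) / (upper - lower)
    width = (upper - lower) / mid
    return level, width


# Toy inputs (made-up numbers, for illustration only).
rising = [float(p) for p in range(1, 61)]
macd_line, signal_line, histogram = macd(rising)
spike_level, spike_width = bollinger([1.0] * 19 + [2.0])
```

On a steadily rising series the fast EMA stays above the slow EMA, so the MACD line is positive; a sudden jump pushes the Bollinger level above 1, meaning the price sits above the upper band.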

