Posts

The goal of this research was to find an indicator that helps predict the direction of the overall US equity market for the next week using sentiment data from the previous week. The hypothesis is that when sentiment volatility over the previous week is high, which means investors have differing opinions, the overall market will underperform over the subsequent week. When sentiment volatility is low or neutral, the crowd has reached a consensus and the general market will outperform over the next week.

The sentiment metric used to represent volatility is Raw-Volatility in SMA’s S-Factor data feed, which captures the volatility of sentiment in Twitter conversations. All Raw-Volatility data points were taken from the 3:40 pm ET timestamp (20 minutes before the market close). We summed Raw-Volatility across companies for each date as a proxy for the volatility of Twitter social sentiment on the entire market. The exact calculation is as follows, where “N” is the number of companies with sentiment on that date and “D” is the date:
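Read literally, the daily aggregate described above could be written as follows. The symbol names are assumptions introduced here for illustration; RawVolatility_{i,D} stands for the 3:40 pm ET Raw-Volatility reading for company i on date D.

```latex
\[
\text{Volatility}_D = \sum_{i=1}^{N} \text{RawVolatility}_{i,D}
\]
```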

We then created a 7-day standardized volatility using a 91-day benchmark:

This Z_Volatility score follows a roughly normal distribution.
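A minimal sketch of how such a score could be computed. It assumes Z_Volatility is the mean of the last 7 daily totals, standardized by the mean and standard deviation of the last 91 daily totals. The exact windowing is an assumption, and the function names are hypothetical.

```python
from statistics import mean, pstdev


def daily_total_volatility(raw_vol_by_company):
    """Sum Raw-Volatility across the N companies with sentiment on one date."""
    return sum(raw_vol_by_company.values())


def z_volatility(daily_totals, short_window=7, benchmark_window=91):
    """Return z-scores aligned with daily_totals (None until 91 days exist).

    Assumed definition (not confirmed by the source):
    (mean of last 7 totals - mean of last 91 totals) / std of last 91 totals.
    """
    scores = []
    for i in range(len(daily_totals)):
        if i + 1 < benchmark_window:
            scores.append(None)  # not enough history for the benchmark yet
            continue
        benchmark = daily_totals[i + 1 - benchmark_window : i + 1]
        recent = daily_totals[i + 1 - short_window : i + 1]
        sigma = pstdev(benchmark)
        if sigma == 0:
            scores.append(None)  # flat benchmark: z-score is undefined
        else:
            scores.append((mean(recent) - mean(benchmark)) / sigma)
    return scores
```

With this layout, the first 90 dates have no score because the 91-day benchmark is not yet full.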

Using the S&P 500 ETF Trust (SPY) as a proxy of general market performance, we then look at the relationship between Z_Volatility and SPY’s return series. The daily close-to-close return is calculated as:

Hypothesis: When Z_Volatility for the previous closing Date is high, the subsequent market performance will be lower. When Z_Volatility is low or neutral, the next day’s market performance will be higher.

To test this, our strategy is to open a short position in SPY when Z_Volatility > 1. When Z_Volatility is <= 1, the portfolio holds SPY as a long position. This hypothetical portfolio is then compared to SPY over the past 10 years:
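The rule above can be sketched as a simple daily backtest. This is an illustration under stated assumptions, not the code behind the charts: returns are simple close-to-close returns, the score known at one close sets the position held until the next close, and missing scores default to long. All function names are hypothetical.

```python
def close_to_close_returns(closes):
    """Daily simple returns: close[t] / close[t-1] - 1."""
    return [closes[t] / closes[t - 1] - 1.0 for t in range(1, len(closes))]


def strategy_returns(z_scores, closes, threshold=1.0):
    """Short SPY when the prior close's Z_Volatility > threshold, else long.

    z_scores[t] is the score known at the close of day t; it sets the
    position held from close t to close t+1. None is treated as long.
    """
    rets = close_to_close_returns(closes)
    out = []
    for t, r in enumerate(rets):  # rets[t] spans close t -> close t+1
        z = z_scores[t]
        position = -1.0 if (z is not None and z > threshold) else 1.0
        out.append(position * r)
    return out


def cumulative_growth(returns):
    """Compound a return series into the growth of 1 unit of capital."""
    value = 1.0
    for r in returns:
        value *= 1.0 + r
    return value
```

Comparing `cumulative_growth(strategy_returns(z, closes))` with `cumulative_growth(close_to_close_returns(closes))` gives the modified portfolio against buy-and-hold SPY.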

Prior to the COVID-19 pandemic, which began in early 2020, SPY outperformed the modified portfolio. However, since then the behavior of this factor changed drastically. Here is the same graph as above starting in 2020:

Taking a closer look, the separation since the beginning of 2020 is quite significant. Adding a short position in SPY when sentiment volatility is high has enhanced the portfolio’s return. Even though the portfolio stays long on many of the days, Z_Volatility has been predictive of market downturns since 2020. Traders could use this metric as an indicator to stay out of the market, or at the very least to trade with more caution. The COVID-19 pandemic led to a large amount of uncertainty surrounding the stock market and the direction it’s heading. A high Z_Volatility score indicates the public is more uncertain about the direction of various stocks. This research shows the value of sentiment from Social Market Analytics in predicting macro-level events and price movements.

If you are interested in learning more about how SMA’s S-Factor data can help your trading strategies, please email us at contactus@socialmarketanalytics.com or schedule a demo using this link.

Social Market Analytics has extensive intellectual property in three distinct areas: topic model creation, account filtering, and natural language processing (NLP). I have written blog posts about SMA’s topic model creation capabilities and the impact of our account filtering algorithms. This blog answers the question: “Do your machine learning algorithms really add value to the NLP process?” The answer is yes. The chart below illustrates the statistically significant benefits of Social Market Analytics’ machine learning algorithms in isolation.

The start date for this analysis is 11/20/2018 and the end date is 4/30/2019. This period was chosen because of the significant market drawdown in December. We use dictionaries with three distinct rule sets: static dictionaries as of the start date and as of the end date, whose predictive returns we compare with those of a point-in-time dictionary (production). Our patented NLP scores Tweets using each dictionary, and S-Scores are calculated from the resulting Tweet scores. The point-in-time dictionary reflects word additions, phrases, and grammatical logic as they are made.

We isolate the impact of our NLP process by turning off account filtering applied to the Twitter stream.  To ensure we are pulling Tweets only discussing companies and securities, we are using our topic model filtering algorithms.  We regularly publish our full return charts to illustrate the impact of our entire process. 

Let us start by defining the lines in our chart.

 

Red Line = Tweets are scored using our dictionary of words and phrases as of 11/20/2018.  This illustrates the performance with no machine learning applied on a go forward basis. This is the base case. This line represents the least amount of learned information.

Black Line = Tweets are scored using words and phrases applied Point-In-Time.  This is the production feed SMA customers receive.  We use Supervised and Unsupervised Machine Learning.  There are impacts from both during this period.

Green Line = Represents the Perfect Information scenario. Take the most up-to-date dictionary of words and phrases (4/30/2019) and apply it backwards. All information learned during the volatile period is included. This represents the values expected to be received on a go-forward basis.

The charts below represent the cumulative Open to Close return of securities selected based on S-Score 20 minutes prior to market open.  S-Score measures the tone of the current conversation relative to historical benchmarks.  We select securities with an |S-Score| > 2.  Securities with S-Score > 2 are purchased on the open.  Securities with S-Score < -2 are sold short on the open.  SMA Chart lines represent a theoretical long/short portfolio. Isolated long and short sides are available upon request. 
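The selection rule above can be sketched for a single trading day as follows. The equal weighting within each leg and the 50/50 averaging of the two legs are assumptions made for illustration, and the function name and tickers are hypothetical.

```python
def long_short_open_to_close(s_scores, opens, closes, threshold=2.0):
    """One day's return of a long/short book selected by pre-open S-Score.

    s_scores, opens, closes: dicts keyed by ticker for a single day.
    Longs: S-Score > threshold. Shorts: S-Score < -threshold.
    Each leg is equal-weighted (assumption); the result averages whichever
    legs are non-empty and is 0.0 when nothing qualifies.
    """
    def open_to_close(ticker):
        return closes[ticker] / opens[ticker] - 1.0

    longs = [open_to_close(t) for t, s in s_scores.items() if s > threshold]
    # A short position profits when the security falls, so negate its return.
    shorts = [-open_to_close(t) for t, s in s_scores.items() if s < -threshold]
    legs = [sum(leg) / len(leg) for leg in (longs, shorts) if leg]
    return sum(legs) / len(legs) if legs else 0.0
```

Compounding these daily values over the test period would give a cumulative open-to-close line of the kind the charts describe.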

For comparison, the S&P 500 open-to-close chart for the analyzed period is below.

The chart below shows the cumulative open-to-close performance and illustrates the impact of our ML algorithms. As expected, the lowest performance is the red line, representing the dictionary as of the start date. The black line represents SMA production data and the green line represents the perfect information case.

Again, this only looks at the impact of SMA NLP and does not include account filtering. At SMA we believe it’s not just what is being said but who is saying it. We employ a twelve-variable algorithm to score and filter all Twitter accounts Tweeting about companies and securities to identify our approved account universe. As you can see, SMA NLP is a learning system with demonstrable impact. To learn more, please contact us at contactUs@SocialMarketAnalytics.com.

Thanks,

Joe

Social Market Analytics aggregates the intentions of professional investors as expressed on Twitter. We apply our patented filtering and natural language processing (NLP) to Tweets to proactively select Twitter accounts to use in our predictive metrics. We track several metrics to gauge the predictive nature of our dataset. For this blog I am going to illustrate one of these metrics.

2018 was a rough year for the SP500, which lost about 9% (rolling one year). Given the market loss and the high volatility, we thought it would be an ideal dataset over which to run an experiment. Two questions we get regularly are: How would your data perform in a bear market? And what is the benefit of your NLP and account ratings systems? This blog will answer both questions from the perspective of 2018 market performance.

The table below illustrates the performance of six theoretical portfolios. These portfolios represent stocks with Social Market Analytics S-Scores of 2 or higher (long signal) or -2 or lower (short signal). S-Score compares the tone of current Twitter conversations with the average tone of Twitter conversations over the last twenty days. Social Market Analytics has multiple baselines for multiple prediction periods.

Each security in our universe represents a proprietary Topic Model.  Each Topic is a collection of rules used to include or exclude specific Tweets from security buckets.  For example, if you are looking for Tweets about Ethan Allen furniture (ETH) you do not want to include Tweets about Ethereum Crypto Currency (Also symbol ETH) conversations.

We created portfolios with our account filtering algorithms and compared them with portfolios of all Twitter accounts discussing our Equity Topic Models. The purpose of the run was to quantify the ability of our patented account filtering algorithms to identify professional, and hence more accurate, investors. Spoiler alert: our account filtering improved the long/short return by 50% (18.73 for 2018 versus 12.53 NLP only).

NLP applied only:

The NLP only portfolios illustrate the power of our NLP process to accurately identify and fine grain score Tweets discussing securities and companies.  Our patented process reads each Tweet multiple times to identify if and how strongly someone is voicing a view of expected future performance.  The NLP only portfolios illustrate the predictive power of our NLP in isolation.  When you apply the Account filtering you get a predictive boost.

Account Filtered + NLP applied:

Account Filtered plus NLP portfolios illustrate the benefit of applying our account filtering metrics. Early in the life of Social Market Analytics we learned it’s not just what is being said on Twitter but who is saying it. We developed proprietary metrics to identify investors more likely to be correct about the future direction of a security. When the conversation of these professional investors is significantly more positive than the average conversation over the last 20 days, those securities significantly outperform. When the conversation of these professional investors is significantly more negative than the average conversation over the last 20 days, those securities significantly underperform.

 Portfolio Construction

Portfolios are constructed of securities with an S-Score of 2 or higher (long) or -2 or lower (short). All portfolios are equally weighted. A negative value for a short portfolio denotes a positive return to that portfolio, since short portfolios are supposed to move lower. All securities are entered on the open based on 9:10 am Eastern time S-Scores and exited on the close. There is no overnight exposure.

Result Analysis

We use the SP500 as our performance benchmark. The SP500 return is calculated from open to close in the same manner as the selected securities. Using open-to-close performance, the SP500 returned -16.89% for comparison. As you can see from the table, securities with S-Score > 2 outperformed the market and negative S-Score securities significantly underperformed the market (generating positive alpha). The L/S portfolio with NLP only returned +12.54%; NLP plus account filtering improved that performance by 50% to +18.73%. We do not present this as a single-factor model, but even after removing 10% a year for slippage and commissions the portfolio still significantly outperforms.
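As a quick check on the 50% figure, using the two long/short returns quoted in this section:

```latex
\[
\frac{18.73 - 12.54}{12.54} = \frac{6.19}{12.54} \approx 0.494
\]
```

That is a relative improvement of roughly 49%, which rounds to the quoted 50%.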

Please contact us with any questions or to see how SMA’s NLP and filtering capabilities can be used in your investment process: ContactUs@SocialMarketAnalytics.com

Social Market Analytics (SMA) tracks real-time sentiment on equities, commodities, currencies, ETFs and cryptocurrencies. SMA has the most powerful and customizable Alerting API, combining Twitter sentiment and pricing metrics. Users receive custom real-time sentiment alerts on instruments in their watch list. For example, on December 11, 2018, SMA’s alerting system sent an alert on corn at 12:12 pm CT when corn was at $385.25. Below is the email and mobile alert.

Subsequent to the alert, corn moved lower starting at 12:17pm CT. The price continued to move lower the remainder of the day and closed at $383.25. (See chart below)

The above alert was based on SMA’s rolling 24-hour sentiment. SMA also calculates a long-term sentiment with longer price projection periods. Corn’s long-term S-Factor flipped from positive to negative on November 14th. 12/10 was the first day the long-term S-Factor for corn reached a significantly negative level: 1.5 standard deviations more negative than the longer-term baseline conversation. For more information, please contact us at contactUs@SocialMarketAnalytics.com

This year has been tough for most investment strategies. Firms using traditional sources of data are generating the same underwhelming returns. Two years ago, Social Market Analytics, Inc. (SMA) launched the SMLCW index in partnership with the CBOE. This index is rebalanced weekly and comprises the twenty-five securities from the CBOE large cap universe with the highest average S-Score over the prior week. It’s a long-only index of super-cap stocks with unusually positive Twitter conversations.
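The weekly selection step can be sketched as below. This illustrates only the ranking rule described above, not the index methodology itself. The function name, the input layout and the alphabetical tie-break are assumptions.

```python
from statistics import mean


def select_index_members(weekly_scores, n=25):
    """Pick the n tickers with the highest mean S-Score over the prior week.

    weekly_scores: dict mapping ticker -> list of that week's S-Scores.
    Tickers with no readings are skipped. Ties are broken alphabetically
    so the result is deterministic (an assumption, not a documented rule).
    """
    averaged = {t: mean(v) for t, v in weekly_scores.items() if v}
    ranked = sorted(averaged, key=lambda t: (-averaged[t], t))
    return ranked[:n]
```

Calling this once per week on the prior week's readings would give the constituent list for the next rebalance.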

SMA publishes a family of metrics providing a full representation of the Twitter conversation across equities (US and LSE), commodities, currencies, ETF’s & Cryptos.

S-Score is a normalized representation of the current Twitter conversation of professional investors as identified by Social Market Analytics patented algorithms.  SMA has access to the full Twitter feed through our licensed partnership with Twitter and listens in real-time for any mention of topics and securities of interest.  These Tweets are scanned in real-time for sentiment and influence of the poster and compared to prior conversations over the look back period.  Securities with higher S-Scores subsequently outperform and securities with negative S-Scores under-perform.

SMA S-Scores are predictive over multiple prediction periods.  With seven years of out-of-sample data we can extend our comparison baselines and predict over longer periods.

Year-to-date, the SMLCW index is up over 7.5% while the SP500 is flat. Even after subtracting a couple of percent for commissions and slippage, the index is still significantly positive. This is not a back-test; this index has been live and on your quote screens for nearly two years. The YTD actual performance chart from the CBOE site is below.

As mentioned, this is a long-only index. During the recent market drawdown this long index has been performing. SMA negative S-Score stocks have been moving lower at a significant rate, generating positive alpha. Below is a chart of the SMLCW index compared to the SP500. For any questions or to learn more, please contact us at: ContactUs@SocialMarketAnalytics.com.

Thanks,

Joe

 

Social Market Analytics, Inc. (SMA) aggregates the intentions of professional investors as expressed on Twitter & StockTwits and publishes a series of metrics that describes the current conversation relative to historical benchmarks.  Our data is a leading indicator of price movement both positive and negative.

There is unique predictive information in unstructured content. Social Market Analytics uses AI and machine learning techniques developed over the last eight years to convert this unstructured content into data suitable for quantitative analysis. This opens a whole new area of big data analysis.

Social Market Analytics (SMA) calculates predictive sentiment on the entire US equity universe, currencies, commodities, cryptocurrencies, ETFs and custom sources. This blog is about the predictive nature of our LSE security universe. We calculate our custom metrics on the top 1000 market cap securities listed on the LSE. Our LSE data starts on 1/1/2016. Below is a cumulative quintile distribution of returns based on our S-Score metrics. Our S-Score is effectively a Z-Score that compares 24-hour sentiment, based on the Tweets of professional investors, with a 20-day baseline. Prediction periods vary per asset class and baseline. Longer baseline comparisons lead to longer prediction periods.
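The Z-Score idea can be sketched in a few lines. This is a generic standardization, not SMA’s production calculation: the use of the sample standard deviation and the function name are assumptions.

```python
from statistics import mean, stdev


def s_score(current_sentiment, baseline_sentiments):
    """Z-score of the current 24-hour sentiment against a baseline window.

    baseline_sentiments: prior daily sentiment readings, for example 20
    values for a 20-day baseline. Returns None when the baseline has fewer
    than two points or no dispersion, since the score is then undefined.
    """
    if len(baseline_sentiments) < 2:
        return None
    sigma = stdev(baseline_sentiments)  # sample standard deviation (assumption)
    if sigma == 0:
        return None
    return (current_sentiment - mean(baseline_sentiments)) / sigma
```

A reading of +2 then means the current conversation is two standard deviations more positive than the baseline.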

Stocks with abnormally positive conversations typically outperform their peers, and stocks with abnormally negative conversations typically underperform their peers. As expected, stocks whose conversations have a normal positive or negative tone perform like the overall market.
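A quintile comparison of this kind can be sketched as follows for a single period. The bucketing by rank, the equal weighting within a quintile and the function names are assumptions. A cumulative chart would compound such per-period results.

```python
def quintile_buckets(scores):
    """Map each ticker to a quintile from 1 (lowest S-Score) to 5 (highest)."""
    if len(scores) < 5:
        raise ValueError("need at least 5 securities to form quintiles")
    ranked = sorted(scores, key=lambda t: (scores[t], t))
    n = len(ranked)
    return {t: (i * 5) // n + 1 for i, t in enumerate(ranked)}


def top_minus_bottom(scores, returns):
    """Equal-weighted return of quintile 5 minus that of quintile 1."""
    buckets = quintile_buckets(scores)

    def average_return(quintile):
        vals = [returns[t] for t, q in buckets.items() if q == quintile]
        return sum(vals) / len(vals)

    return average_return(5) - average_return(1)
```

A persistently positive `top_minus_bottom` value is what a widening spread between the top and bottom quintile lines would reflect.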

Below is a typical quintile chart for the LSE 1000 universe, tracked from post-Brexit to 8/31/2018. The spread between the top and bottom quintiles is 10% annualized. Sharpe and Sortino ratios are in the table below that. To learn more or to request a historical data set, contact SMA at ContactUS@SocialMarketAnalytics.com

Social Market Analytics (SMA) publishes real-time Twitter-based sentiment for nearly 300 cryptocurrencies, including Bitcoin. To view Bitcoin sentiment values and 35 other commodities in real time, go to the CME Active Traders website. Twitter-based sentiment has proven to be strongly predictive for Bitcoin and other commodities.

Today we will review a sentiment-based Z-Score strategy to generate profitable trades for Bitcoin.  This is similar to traditional standard deviation band strategies calculated with price.

When Twitter volume from certified investors is abnormally high, use the sentiment of the abnormally large conversation to select entry points. The strategy overview is below:

A visualization of the strategy is below. When the Z-Score of Social Market Analytics Indicative Twitter volume is greater than the threshold and the tone of the conversation is significant, enter or modify trades. When sentiment exceeds 2 standard deviations and the volume of the conversation is high, enter a position. Positions are modified based on further extensions of the Z-Score.
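The entry condition can be sketched as a small signal function. The volume threshold value is not stated in the text, so the default of 2.0 is an assumption. Treating strongly negative sentiment as a short entry is also an assumption. The rules for modifying positions are not covered, and the function name is hypothetical.

```python
def bitcoin_signal(volume_z, sentiment_z,
                   volume_threshold=2.0, sentiment_threshold=2.0):
    """Return +1 (enter long), -1 (enter short) or 0 (no entry).

    volume_z: Z-Score of indicative Twitter volume.
    sentiment_z: sentiment expressed in standard deviations.
    An entry requires abnormally high volume AND significant sentiment.
    """
    if volume_z <= volume_threshold:
        return 0  # conversation is not abnormally large
    if sentiment_z > sentiment_threshold:
        return 1
    if sentiment_z < -sentiment_threshold:
        return -1
    return 0  # high volume but no significant tone
```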

Test period is from 1/1/2017 to current.  Overall results below.  For more detailed results on this and other strategies contact ContactUS@SocialMarketAnalytics.com

SMA has examples of profitable applications of Twitter based sentiment to many coins.

Social Market Analytics (SMA) data is live on the CME Active Trader website. Real-time sentiment and indicative Twitter volume are used by traders to generate new ideas. Sentiment data is predictive across various time frames. High-sentiment commodities go on to outperform and negative-sentiment commodities underperform. SMA covers 36 commodities on the CME website across Agricultural, Equity Indexes, Energy, Metals, Interest Rates and FX.

On Monday 9/24 Gold Sentiment crossed through extreme positive at 7:30 am central time.    https://activetrader.cmegroup.com/Products/Metals

Clicking on the chart expands the time frame for further analysis.

To learn more about Social Market Analytics commodity sentiment data or about the CME implementation: ContactUs@SocialMarketAnalytics.com.

To receive alerts like this in real time follow us on Twitter at @sma_alpha.

Social Market Analytics (SMA) provides real-time sentiment data for equities (North America & LSE), commodities, foreign exchange, cryptocurrencies and ETFs.

In this blog I am going to explore a trading system using the SMA Twitter-based sentiment data to trade a basket of: EURUSD, EURGBP, GBPJPY, GBPUSD, USDCAD, USDCHF, USDJPY.

We will explore two straightforward trading systems:

  • Forex Sentiment RSI: Daily Long/Short Strategy
  • SMA S-Score Based Currency Selection Model

RSI Calculation Methodology 

This strategy is a single-factor model based solely on adjusting daily weights according to the 3-day Sentiment RSI on 7 of the highest daily volume Forex pairs. It is long-short, with the assumption that tails act with similar magnitude.

  • Long/Short
    1. RSI >= 50, Long
    2. RSI < 50, Short
  • 50% Long & 50% Short Asset Allocation
    1. Long weights are calculated using only longs
    2. Short weights are calculated using only shorts
  • Daily weight adjusted following:
    1. separately for the long side and the short side
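The rules in the list above can be sketched as below. The RSI and weighting formulas are not given in the text, so this uses the textbook simple-average RSI applied to daily sentiment values and, as a further assumption, weights each pair by its RSI distance from 50 within its own side. All function names are hypothetical.

```python
def sentiment_rsi(sentiment, period=3):
    """Simple-average RSI over the last `period` daily sentiment changes."""
    if len(sentiment) < period + 1:
        raise ValueError("need period + 1 sentiment readings")
    window = sentiment[-(period + 1):]
    changes = [window[i + 1] - window[i] for i in range(period)]
    avg_gain = sum(c for c in changes if c > 0) / period
    avg_loss = sum(-c for c in changes if c < 0) / period
    if avg_loss == 0:
        return 100.0 if avg_gain > 0 else 50.0  # no down moves in the window
    rs = avg_gain / avg_loss
    return 100.0 - 100.0 / (1.0 + rs)


def rsi_weights(rsi_by_pair):
    """RSI >= 50 goes long, RSI < 50 goes short; each side sums to 50%.

    Within a side, weight is proportional to |RSI - 50| (assumption).
    Short weights are returned as negative numbers.
    """
    longs = {p: r - 50.0 for p, r in rsi_by_pair.items() if r >= 50.0}
    shorts = {p: 50.0 - r for p, r in rsi_by_pair.items() if r < 50.0}
    weights = {}
    for side, sign in ((longs, 1.0), (shorts, -1.0)):
        total = sum(side.values())
        for pair, strength in side.items():
            # Fall back to equal shares if every RSI on this side is exactly 50.
            share = strength / total if total > 0 else 1.0 / len(side)
            weights[pair] = sign * 0.5 * share
    return weights
```

Because each side is normalized on its own, the long weights depend only on the longs and the short weights only on the shorts, as the list requires.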

 

The strategy significantly improves returns compared to an equal weighted baseline.  Sharpe and Sortino ratios are statistically significant:

  • Sharpe Ratio:
    • 2.77 Jan 03, 2017 to July 19, 2018
    • 3.40 YTD
  • Sortino Ratio:
    • 5.40 Jan 03, 2017 to July 19, 2018
    • 7.46 YTD

The volatility of each leg of the strategy is either kept stable or decreased in comparison with the baseline.

SMA S-Score Based Currency Selection Model

This daily trading strategy is based on the S-Score at 09:10:00 EST, executing a 24-hour hold based on these values at 09:15:00 EST. We find consistency across execution times. The goal is to assess sentiment and make a directional trade in agreement with it, provided the sentiment falls at least 1 standard deviation from the 20-day mean.

Equal weighted based on standard deviation criteria:

– Long: S-Score > 1

– Short: S-Score < -1

– Baseline: Equal Weighted Portfolio of the 7 Currency pairs

Long and short legs are capped at 50% of the daily portfolio, even on the occurrence of an outlier day where all pairs are long, or all pairs are short.
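The selection and capping rules can be sketched as follows. Splitting each leg's 50% equally among its pairs is how this sketch reads "equal weighted", and the function name is hypothetical.

```python
def currency_weights(s_scores, threshold=1.0, leg_cap=0.5):
    """Daily weights per pair: long if S-Score > 1, short if S-Score < -1.

    Each leg is equal-weighted and sized at leg_cap (50%) of the portfolio,
    so an all-long or all-short day still deploys only 50%.
    Short weights are negative; unselected pairs get 0.0.
    """
    longs = [p for p, s in s_scores.items() if s > threshold]
    shorts = [p for p, s in s_scores.items() if s < -threshold]
    weights = {p: 0.0 for p in s_scores}
    for p in longs:
        weights[p] = leg_cap / len(longs)
    for p in shorts:
        weights[p] = -leg_cap / len(shorts)
    return weights
```

On an outlier day where every pair is long, the weights sum to 0.5 rather than 1.0, matching the cap described above.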

The strategy drastically improves returns compared to an equal-weighted baseline: up to 40% cumulative over a 19-month period, with a consistent annual rate of return.

  • Sharpe Ratio:
    • 2.56 Jan 03, 2017 to July 19, 2018
    • 3.56 YTD
  • Sortino Ratio:
    • 4.93 Jan 03, 2017 to July 19, 2018
    • 7.72 YTD

These are straightforward strategies that illustrate the predictive nature of our dataset of Twitter- and StockTwits-based factors. To learn more about how Social Market Analytics sentiment data can help your trading, please contact us at contactus@Socialmarketanalytics.com or Doug Hopkins @ (312) 788-2621.