Posts

The goal of this research was to find an indicator that helps predict the direction of the overall US equity market for the next week using sentiment data from the previous week. The hypothesis is that when sentiment volatility over the previous week is high, meaning investors hold differing opinions, the overall market will underperform in the subsequent week. When sentiment volatility is low or neutral, the crowd has reached a consensus and the general market will outperform over the next week. The sentiment metric used to represent volatility is Raw-Volatility in SMA’s S-Factor data feed, which captures the volatility of sentiment in Twitter conversations. All Raw-Volatility data points were taken from the 3:40 pm ET timestamp (20 minutes before the market close). We summed Raw-Volatility across companies for each date as a proxy for the volatility of Twitter sentiment on the entire market. The exact calculation is as follows, where “N” is the number of companies with sentiment on that date and “D” is the date:

We then created a 7-day standardized volatility using a 91-day benchmark:

This Z_Volatility score follows a roughly normal distribution.
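
As a rough sketch of the two steps above, the daily total can be computed by summing Raw-Volatility over the N companies, and the standardized score by comparing a trailing 7-day mean with a trailing 91-day baseline. The exact windowing is an assumption here: the function names, the use of a trailing mean, and the sample standard deviation are illustrative choices, not SMA’s published formula.

```python
from statistics import mean, stdev

def daily_total(raw_vol_by_ticker):
    """Sum Raw-Volatility over the N companies with sentiment on one date."""
    return sum(raw_vol_by_ticker.values())

def z_volatility(daily_totals, short=7, long=91):
    """Z-score of the trailing `short`-day mean against a `long`-day baseline.

    ASSUMPTION: this is one plausible reading of "7-day standardized
    volatility using a 91-day benchmark". Entries are None until `long`
    observations are available.
    """
    scores = []
    for i in range(len(daily_totals)):
        if i + 1 < long:
            scores.append(None)
            continue
        baseline = daily_totals[i + 1 - long : i + 1]
        recent = daily_totals[i + 1 - short : i + 1]
        sigma = stdev(baseline)
        # A flat baseline has no dispersion; report a neutral score.
        scores.append((mean(recent) - mean(baseline)) / sigma if sigma > 0 else 0.0)
    return scores
```

With a series of daily totals in date order, `z_volatility(totals)` returns one score per date, aligned with the input.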

Using the S&P 500 ETF Trust (SPY) as a proxy of general market performance, we then look at the relationship between Z_Volatility and SPY’s return series. The daily close-to-close return is calculated as:

Hypothesis: When Z_Volatility for the previous closing date is high, the next day’s market performance will be lower. When Z_Volatility is low or neutral, the next day’s market performance will be higher.

To test this, our strategy is to open a short position in SPY when Z_Volatility > 1. When Z_Volatility <= 1, the portfolio holds SPY as a long position. This hypothetical portfolio is then compared to SPY over the past 10 years:
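
The rule can be sketched in a few lines. This is a simplified illustration, assuming the score at one day’s close sets the position for the return into the next close, and ignoring costs; the function names are hypothetical.

```python
def close_to_close_returns(closes):
    """Daily close-to-close returns: close[t] / close[t-1] - 1."""
    return [closes[i] / closes[i - 1] - 1.0 for i in range(1, len(closes))]

def strategy_returns(closes, z_scores, threshold=1.0):
    """Short SPY after a high Z_Volatility close, otherwise stay long.

    z_scores[i] is the score at the close of day i and decides the position
    held from close i to close i+1. ASSUMPTION: a missing score (None) is
    treated as "not high", so the portfolio stays long.
    """
    out = []
    for i, r in enumerate(close_to_close_returns(closes)):
        z = z_scores[i]
        position = -1.0 if (z is not None and z > threshold) else 1.0
        out.append(position * r)
    return out

def cumulative_return(returns):
    """Compound a series of simple returns."""
    growth = 1.0
    for r in returns:
        growth *= 1.0 + r
    return growth - 1.0
```

Comparing `cumulative_return(strategy_returns(closes, z))` with `cumulative_return(close_to_close_returns(closes))` gives the kind of side-by-side comparison described here.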

Prior to the COVID-19 pandemic, which began in early 2020, SPY outperformed the modified portfolio. However, since then the behavior of this factor changed drastically. Here is the same graph as above starting in 2020:

Taking a closer look, the separation since the beginning of 2020 is quite significant. Adding a short position in SPY when sentiment volatility is high has enhanced the portfolio’s return. Even though the portfolio is long on many of the days, Z_Volatility has been predictive of market downturns since 2020. Traders could use this metric as an indicator to stay out of the market, or at the very least trade with more caution. The COVID-19 pandemic led to a large amount of uncertainty surrounding the stock market and the direction it’s heading. A high Z_Volatility score indicates the public is more uncertain about the direction of various stocks. This research shows the value of sentiment from Social Market Analytics in predicting macro-level events and price movements.

If you are interested in learning more about how SMA’s S-Factor data can help your trading strategies, please email us at contactus@socialmarketanalytics.com or schedule a demo using this link.

My name is Campbell Taylor. I am a rising senior and a Statistics major at The Ohio State University. Through my first few weeks as a Quantitative Research Intern for Social Market Analytics, I’ve been exposed to alternative data and its applications in the financial market. In this research, I created a day trading strategy built around changes in sentiment on Twitter.

Social Market Analytics (SMA) captures unstructured data through alternative sources such as Twitter. Using unique Natural Language Processing sentiment analysis, SMA rates tweets in real time and creates metrics that enhance insights into equities’ market movements.

Sentiment factors used for this analysis are distributed through SMA’s S-Factor feed. The factors are:

  • S-Score: normalized representation of a stock’s sentiment on Twitter over 24 hours
  • SV-Score: normalized representation of a stock’s indicative tweet volume over 24 hours
  • S-Buzz: measurement of unusual Twitter activity compared to the universe of stocks

A large S-Score (> 2) is associated with extreme positive sentiment on Twitter, while a small S-Score (< -2) is associated with extreme negative sentiment. The same applies for SV-Score. S-Buzz ranges from 0 to 4.5, with 1 being the statistical mean. The goal of this research is to use these sentiment factors overlaid with pricing momentum to develop a profitable daily trading strategy.

The momentum used for this research is defined as the following:

This isolates the pricing momentum to strictly overnight movement. Similarly, I used the differences in sentiment to capture the overnight sentiment changes. The two sentiment timestamps are taken at 9:10 AM EST of the current trading day and 3:40 PM EST of the previous trading day, 20 minutes before the market open and close, respectively. Subtracting the previous day’s closing sentiment from the current day’s opening sentiment isolates the overnight sentiment change. The target return can be defined as:
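
A minimal sketch of these three quantities for one stock on one day follows. The formulas are an assumption based on the descriptions above (previous close to open for momentum, open to close for the target return); the `DayRecord` fields are hypothetical names, not fields of the S-Factor feed.

```python
from dataclasses import dataclass

@dataclass
class DayRecord:
    prev_close: float         # previous trading day's closing price
    open: float               # current day's opening price
    close: float              # current day's closing price
    s_score_prev_1540: float  # S-Score at 3:40 PM on the previous trading day
    s_score_0910: float       # S-Score at 9:10 AM on the current trading day

def overnight_momentum(rec):
    """ASSUMED definition: return from the previous close to today's open."""
    return rec.open / rec.prev_close - 1.0

def overnight_sentiment_change(rec):
    """Opening sentiment minus the previous day's closing sentiment."""
    return rec.s_score_0910 - rec.s_score_prev_1540

def open_to_close_return(rec):
    """ASSUMED target: return from today's open to today's close."""
    return rec.close / rec.open - 1.0
```

For example, a stock that closed at 100, opened at 98, and saw its S-Score rise from -0.5 to 1.8 overnight has negative overnight momentum and a sentiment change of 2.3.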

A popular trading strategy is buying securities with rising momentum and selling them when the momentum appears to be exhausted. My original hypothesis was that positive overnight sentiment movements would enhance the overnight pricing momentum. More specifically: a positive S-Score difference (increased positivity), SV-Score difference (increased volume of tweets), or S-Buzz difference (more unusual activity) would lead to the positive momentum continuing until the close of the trading day, and vice versa for negativity.

To find which sentiment movement was most significant in predicting returns, I built a logistic regression model. This models the probability of a discrete outcome given the input variables: in this case, the probability of positive open to close returns given the various overnight sentiment changes and the overnight pricing momentum. The idea is that parameters that increase the probability of positive returns will create a trading strategy that is more profitable than the market over time.
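
To make the modeling step concrete, here is a small self-contained logistic regression fitted by gradient descent. It is a sketch only: the toy rows below are invented purely to exercise the code and are not SMA data, and the two features (overnight S-Score change, and a 0/1 flag for positive overnight momentum) are a simplification of the full variable set.

```python
import math

def sigmoid(z):
    """Numerically stable logistic function."""
    if z >= 0:
        return 1.0 / (1.0 + math.exp(-z))
    e = math.exp(z)
    return e / (1.0 + e)

def predict_proba(w, b, x):
    """Modelled probability of a positive open to close return."""
    return sigmoid(b + sum(wj * xj for wj, xj in zip(w, x)))

def fit_logistic(X, y, lr=0.1, epochs=2000):
    """Fit weights by batch gradient descent on the average log-loss."""
    n, k = len(X), len(X[0])
    w, b = [0.0] * k, 0.0
    for _ in range(epochs):
        grad_w, grad_b = [0.0] * k, 0.0
        for xi, yi in zip(X, y):
            err = predict_proba(w, b, xi) - yi
            for j in range(k):
                grad_w[j] += err * xi[j]
            grad_b += err
        w = [wj - lr * g / n for wj, g in zip(w, grad_w)]
        b -= lr * grad_b / n
    return w, b

# Toy rows (hypothetical): [overnight S-Score change, momentum_is_positive]
X = [[2.5, 0], [1.8, 0], [0.5, 0], [-0.4, 1],
     [-2.2, 1], [-1.5, 1], [1.0, 1], [-1.0, 0]]
y = [1, 1, 1, 0, 0, 0, 1, 0]  # 1 = positive open to close return
w, b = fit_logistic(X, y)
```

The sign of each fitted weight indicates whether the corresponding variable raises or lowers the modelled probability of a positive return, which is how the findings in this post are read off the model.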

Before selecting the model, I checked the distribution of the variables to ensure there were no abnormalities. The histograms below show the following distributions (left to right, top to bottom): Overnight Difference in S-Score, Overnight Difference in SV-Score, Overnight Difference in S-Buzz, Overnight Pricing Momentum, and Open to Close Returns. All the variables appear to be roughly normally distributed, which is beneficial for statistical modeling and for trading on the tails of the distribution.

Using the four remaining variables, I used a stepwise information criterion method to aid the selection of the best parameters for trading. An information criterion measures the model’s performance while penalizing the number of parameters used. To my surprise, the model showed that positive (negative) overnight momentum decreased (increased) the probability of positive open to close returns for the next trading day. Additionally, overnight momentum was more significant as a factor variable than as a continuous variable, meaning the sign of the momentum is more important than its magnitude. Each of the sentiment changes was significant in predicting the return. A positive difference in S-Score and S-Buzz increased the probability of positive returns, while a positive difference in SV-Score lowered it. While each variable is significant, it is important to consider the number of stocks that will pass the conditions of all parameters when trading. Very few stocks will satisfy all 3 sentiment conditions on a given day, which will lead to a large variance in results. Thus, it makes sense to narrow the model to one S-Factor variable. Selecting the S-Factor variable that has the most occurrences of extreme changes will give the most robust results. The difference in S-Score had more than double the number of stocks with extreme changes compared with SV-Score and S-Buzz. Therefore, the final trading strategy will be built around the difference in S-Score overlaid with overnight pricing momentum.
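
The post does not say which information criterion was used. As an illustration, assuming the Akaike information criterion (AIC), the trade-off between fit and parameter count looks like this:

```python
import math

def log_likelihood(probs, labels):
    """Bernoulli log-likelihood of 0/1 labels under predicted probabilities."""
    eps = 1e-12  # guard against log(0)
    total = 0.0
    for p, y in zip(probs, labels):
        total += math.log(max(p, eps)) if y == 1 else math.log(max(1.0 - p, eps))
    return total

def aic(loglik, n_params):
    """AIC = 2k - 2 ln(L); lower is better. ASSUMPTION: AIC, not BIC."""
    return 2.0 * n_params - 2.0 * loglik
```

In a stepwise search, a variable is added or dropped only when doing so lowers the criterion, so an extra parameter has to improve the log-likelihood by more than 1 to be kept under AIC.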

Since the difference in S-Score is a continuous variable that follows a normal distribution, I only wanted to trade on stocks with extreme overnight changes. I defined extreme changes as above 2 and below -2, like the S-Score variable itself. A difference in S-Score over 2 indicates there is an extreme increase in sentiment surrounding that stock on Twitter. Similarly, a difference in S-Score below -2 indicates an extreme decrease in sentiment.

I calculated the cumulative returns of 4 different trading strategies and the S&P 500 ETF Trust (SPY) as the benchmark for the general market. Each of the strategies enters at market open and exits at market close with an equal weight placed on each stock. Two of the strategies are long positions and two are theoretical short positions. The long positions have parameters that increase the probability of positive returns, while the short positions have parameters that lower it.

The Long positions:

  • Trading only on stocks that had negative overnight momentum
  • Trading only on stocks that had negative overnight momentum, but an extreme increase in sentiment (difference in S-Score > 2)

The theoretical Short positions:

  • Trading only on stocks that had positive overnight momentum
  • Trading only on stocks that had positive overnight momentum, but an extreme decrease in sentiment (difference in S-Score < -2)
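
The four selection rules above can be sketched as a simple classifier applied to each stock each morning. The strategy names are hypothetical labels; the threshold of 2 is the one used in this post.

```python
def classify(momentum, s_score_change, threshold=2.0):
    """Return the strategies a stock qualifies for on one day."""
    names = set()
    if momentum < 0:
        names.add("long_momentum")
        if s_score_change > threshold:
            names.add("long_momentum_sentiment")
    elif momentum > 0:
        names.add("short_momentum")
        if s_score_change < -threshold:
            names.add("short_momentum_sentiment")
    return names

def equal_weight_return(open_to_close_returns):
    """Daily strategy return with an equal weight on every selected stock."""
    if not open_to_close_returns:
        return 0.0  # no trades that day
    return sum(open_to_close_returns) / len(open_to_close_returns)
```

A stock with flat overnight momentum qualifies for none of the strategies, and a stock in a sentiment strategy is always also in the matching momentum-only strategy.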

Trading with these 4 different strategies quantifies the effect that sentiment movement has on the overnight momentum. I expected the two long positions to give positive cumulative returns and the short positions to have negative cumulative returns. Based on the model, the long position with sentiment should give the highest returns while the short position with sentiment should give the lowest returns. Before calculating returns, I looked at the number of trades per day in the strategies with sentiment to ensure the trades won’t be too heavily weighted on one stock (top-down).

The x-axis of the histograms shows the number of trades made in a day, while the y-axis shows the number of days with that number. Both distributions suggest there will be some volatility in the number of trades per day. However, the mean and median number of trades for both strategies are high enough to ensure diversity for many of the days. There will be days with fewer than 10 trades, but those will be less than 15% of the trading days in a 10-year span. Therefore, the low-volume days will be spread out and will not affect the strength of the results. The average also isn’t so high that it is impossible to execute the trades at the market’s open. Knowing the number of trades was solid, I used these strategies to trade from December 1st 2011 to June 3rd 2022.

The time series graph shows the cumulative return of the strategies over time. Between April and June of 2020 there is a sharp increase in returns for the negative momentum with sentiment increase strategy. The abnormality can be attributed to the market conditions following the beginning of lockdowns for the COVID-19 pandemic. While markets were turbulent during this time, the long position with sentiment performed very well. Overnight sentiment movement had a significant impact on the pricing momentum. The long positions both gave positive cumulative returns, and the theoretical short positions gave negative cumulative returns. As the model suggested, trading stocks that had negative momentum with an extreme increase in sentiment gave the best returns. This strategy produced a cumulative return over 1400% in the 10-year time frame. The Sharpe and Sortino ratios suggest that the above-average returns are worth the potential volatility of this strategy. A Sharpe above 1 and Sortino above 2 are considered good for a portfolio. For the long positions, adding sentiment movement increased the Annualized Return by nearly 12%. While the effects were not as strong, adding sentiment decreased the Annualized Return of the short position by close to 7%. I then looked at how this strategy has performed since the start of 2020.
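
For reference, these performance statistics can be computed from a series of daily strategy returns roughly as follows. The annualization factor of 252 trading days and a zero risk-free rate and target are assumptions; the post does not state the conventions used.

```python
import math
from statistics import mean, stdev

def sharpe_ratio(daily_returns, risk_free_daily=0.0, periods=252):
    """Annualized mean excess return per unit of total volatility.

    Needs at least two non-identical returns.
    """
    excess = [r - risk_free_daily for r in daily_returns]
    return mean(excess) / stdev(excess) * math.sqrt(periods)

def sortino_ratio(daily_returns, target_daily=0.0, periods=252):
    """Like Sharpe, but only returns below the target count as risk.

    Needs at least one return below the target.
    """
    excess = [r - target_daily for r in daily_returns]
    downside = math.sqrt(sum(min(e, 0.0) ** 2 for e in excess) / len(excess))
    return mean(excess) / downside * math.sqrt(periods)

def annualized_return(daily_returns, periods=252):
    """Geometric annualization of compounded daily returns."""
    growth = 1.0
    for r in daily_returns:
        growth *= 1.0 + r
    return growth ** (periods / len(daily_returns)) - 1.0
```

Because the Sortino ratio ignores upside volatility, a strategy whose large moves are mostly gains will show a Sortino well above its Sharpe.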

The jump at the beginning is also during the lockdowns of the COVID-19 Pandemic. Each of the strategies jumped further in the direction the model predicted during this time. This time series graph follows the same behavior as the 10-year trend. The impact of the negative sentiment change on the positive momentum is more evident on this plot. Recently, the long position with sentiment strategy has performed even better than over the 10-year period. While maintaining strong Sharpe and Sortino ratios, the annualized return climbed to nearly 41%. Trading with this strategy would have given a 140% cumulative return since the first trading day of 2020. The short position with sentiment strategy also performed better in this time period. The negative overnight sentiment lowered the annualized returns by nearly 8%. Trading on the long/short positions with sentiment has been an effective trading strategy over time and shows no signs of slowing down.

The limitation of this strategy is that the market open is used both as part of the overnight momentum calculation and as the entry point for the trade. Therefore, there will be a delay in executing the trade. In practice this results in adding 5 cents to the opening price for the long positions and subtracting 5 cents from the opening price for the short positions. The returns will be a bit smaller than the ones calculated but will be very close.
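
The effect of that 5-cent adjustment on a single trade can be sketched as below. The function name and the `side` argument are hypothetical; the return is the raw price return measured from the adjusted entry, matching how the short strategies are reported in this post (a negative value is favorable for a short).

```python
def adjusted_open_to_close(open_price, close_price, side, slippage=0.05):
    """Open to close price return with a delayed, less favorable entry.

    Longs are assumed to buy `slippage` above the open; shorts are assumed
    to sell `slippage` below it.
    """
    if side == "long":
        entry = open_price + slippage
    elif side == "short":
        entry = open_price - slippage
    else:
        raise ValueError("side must be 'long' or 'short'")
    return close_price / entry - 1.0
```

For a $50 stock that rises to $51, the long return drops from 2.00% to about 1.90%; the impact is proportionally larger for low-priced stocks.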

Stocks have generally been shown to revert to their mean following overnight movement. Adding sentiment changes appears to enhance the probability and magnitude of reversion. That is why trading on stocks where the overnight sentiment contradicts the overnight pricing momentum is a very profitable strategy. Following this strategy also avoids holding stocks overnight, when there is a risk of news and events breaking after the market close. This research also exemplifies the predictive power of the S-Factors from Social Market Analytics. The overnight S-Score movement proved to have a significant impact on the open to close returns. Capturing the sentiment movement allows traders to identify securities where the price has not yet followed the direction of public opinion.

To learn more about Social Market Analytics email us at ContactUs@SocialMarketAnalytics.com or schedule a demo using this link.

In this video we explain how our S-Factor feed uses the Twitter firehose to build trading signals, dashboards and widgets to help investors increase alpha. To find out how SMA can help your firm, email us at ContactUs@SocialMarketAnalytics.com or schedule a meeting using our 1 on 1 Meeting Signup.

Since the now famous GameStop (GME) short squeeze, we have received inquiries about Twitter’s ability to identify short squeezes early and about whether we have any products to help identify these stocks prior to the squeeze. In fact, Twitter is excellent at identifying early discussions of short squeezes. We have a pre-market open Short Squeeze Alert Report to notify customers of securities with abnormally significant short squeeze discussions.

At Social Market Analytics we have identified a corpus of words and phrases that identify possible short squeezes. As we read Tweets in real-time, we identify the topic of conversation. For the last year we have been publishing a list of securities with a significant number of short squeeze markers prior to market open. Clients have found this report to be very helpful for risk and security selection. We can run this report multiple times a day.

To illustrate the power of this report let us look at GameStop. GME had been an active member of the SMA short-squeeze list prior to the main squeeze in late January. GameStop was number one on the short squeeze indicator from January 12th through the 31st. The blue bars below represent the number of squeeze indicator phrases for GameStop by day. Short squeeze indicators are taken 20 minutes before the open. The orange line represents the subsequent open price. As you can see from the graph, we started indicating strongly on the 14th, with a very strong indication on the weekend of the 23rd. This report also indicated the recent Silver short squeeze.

SMA Short Squeeze Data Feed

To identify potential short squeezes for customers we built the report below. Again, this report is sent prior to market open to identify high potential short squeeze stocks for the coming day. SMA’s Short Squeeze data feed is available through a RESTful JSON and XML API or as FTP files. The data can be packaged at different timestamps throughout the day (e.g., pre-open at 9:10am ET, pre-close at 3:40pm ET, etc.).

Short Squeeze Data Feed Field Descriptions

To learn more about this or any SMA product, please email ContactUs@socialmarketanalytics.com.

At Social Market Analytics we use proprietary techniques to return the most accurate Twitter volume for topics of interest. First, we use a topic model, not just $Ticker. Most vendors use the CashTag concept to identify securities. At SMA we believe using only the CashTag excludes a lot of valuable conversation. We return higher volume and cleaner conversations because we use a machine learning, rules-based system to return all conversations about a security, including those that are not tagged with $Ticker. In the diagram below we return about 500 extra Tweets a day for Tesla Motors versus just $TSLA. Our topic model evolves with the conversation over time.

After applying our topic model filter we additionally filter Tweets based on our proprietary account validation metrics. Only Tweets from our SMA approved accounts are included. In the example below, there are about 500 Tweets for Tesla Motors from the certified accounts per day. These accounts pass our multi-step algorithm. One metric we use is weighted accuracy over time. For example, when a Twitter account is bullish on a security, what percentage of the time does that security subsequently move higher?

Below is a visualization of SMA’s Twitter filtering process for the Tesla topic.

Below is a time series of Tesla Topic model versus $TSLA with the additional filter for SMA certified accounts.

As you can see from the charts, SMA’s proprietary technology provides the truest view of each security’s topic model. To learn more about our technology or receive a sample data set, email ContactUs@socialmarketanalytics.com.

Coinmetrics

Coin Metrics and Social Market Analytics (SMA) announced today a partnership to incorporate SMA’s Crypto Currency Data Feed into the Coin Metrics Market Data Platform.

Alternative data such as social media platforms and data feeds have become a vital source of information for traders, particularly in the Crypto Currency Markets. The SMA Crypto Currency Sentiment Feed will offer the Crypto Currency community a tool for including social media sentiment data in their trading and portfolio strategies and expand Coin Metrics’ market-leading Crypto Asset market and network data products.

“As the Crypto Investing market continues to mature, institutional investors are demanding data from trusted partners. These institutions are looking to make data-driven decisions by accessing sources of data that they understand from their legacy investing frameworks. We believe that the power of combining sentiment data with granular network and market data is fundamental to building a deeper understanding of crypto assets. Coin Metrics is excited to partner with SMA, who has a long history of providing sentiment data to traditional capital markets participants and shares Coin Metrics’ principles and values. The ability to provide an all-in-one Crypto Financial Data solution is a huge convenience for institutions,” commented Tim Rice, Co-Founder and CEO of Coin Metrics.

“Artificial intelligence and Natural Language Processing are moving into our everyday lives at light speed, and perhaps into financial markets even faster than that. We feel strongly at SMA that participants in Crypto Currency markets will benefit from our unique process in this emerging field, both in its approach to filtering social media data and in the analytical methodology used to develop our proprietary metrics. We’re excited to partner with the Coin Metrics team to offer this service through a versatile industry leading platform” said Joe Gits, Co-Founder and CEO of SMA.

About Coin Metrics

Coin Metrics was founded in 2017 as an open-source project to provide the public with actionable and transparent network data. Today, Coin Metrics delivers market and network data, analytics and research to its community and wider industry. https://coinmetrics.io/

About Social Market Analytics, Inc.
Social Market Analytics quantifies social media data for traders, portfolio managers, hedge funds and risk managers using patent pending technology to detect abnormally positive or negative changes in investor sentiment. SMA produces a family of quantitative metrics, called S-Factors™, designed to capture the signature of financial market sentiment. SMA applies these metrics to data captured from social media sources to estimate sentiment for indices, sectors, and individual securities. A time series of these measurements is produced daily and on intraday time scales. For more information, including a User Guide to S-Factors™, please visit www.socialmarketanalytics.com

Social Market Analytics aggregates the intentions of professional investors as expressed on Twitter. We identify these professional investors using our proprietary twelve-factor ranking system. One factor is the forward accuracy of Twitter accounts. If a Twitter account is Tweeting bullishly based on our patented NLP process and the security subsequently moves higher over specified periods, that account is deemed to be accurate over that period. Overall accuracy is aggregated across time for each account. We have been tracking account accuracy out-of-sample for the past seven years; it is impossible to recreate this data. SMA is the only provider with out-of-sample account accuracy. We found significant variability in account accuracy for supposed professional investors. Social Market Analytics account scoring algorithms are extremely effective in excluding non-professional professionals.

SMA’s Accurate Account algos aggregate expectations from the most accurate Twitter accounts for individual securities for a specified time period: 1-Day, 2-Day, 1-Week, and 1-Month holding periods. We define ‘Accurate’ as correctly identifying the directional movement of the security’s price. We do not consider the size of the move, only that the account’s sentiment was positive and the security moved higher (or negative and it moved lower).
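
As a toy illustration of this definition only (SMA’s actual account scoring is proprietary and uses many more factors), directional accuracy for one account over one holding period could be computed like this. Skipping neutral calls and unchanged prices is an assumption made here.

```python
def directional_accuracy(calls):
    """Share of an account's calls whose direction matched the price move.

    `calls` is a list of (sentiment_sign, subsequent_return) pairs, where
    sentiment_sign is +1 (bullish), -1 (bearish) or 0 (neutral).
    ASSUMPTION: neutral calls and zero returns are not scored.
    Returns None when there is nothing to score.
    """
    scored = [(s, r) for s, r in calls if s != 0 and r != 0]
    if not scored:
        return None
    hits = sum(1 for s, r in scored if (s > 0) == (r > 0))
    return hits / len(scored)
```

An account that was bullish before a rise, bullish before a fall, and bearish before a fall would score 2 out of 3, regardless of how large each move was.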

We calculate consensus expectations of these accurate accounts on individual securities.  Accurate account universes differ across holding periods. Some accounts are more accurate in the short-term (Day trades), while others are more accurate for longer holding periods (up to one month).

Securities with significant consensus for both long and short are available through our APIs, Widgets and in Reports. Below is a widget identifying securities with the most positive and negative consensus. In this example, SMA’s accurate account universe is currently 100 bullish on MCO over the next 24 hrs. Positive, negative and neutral are identified separately.

accurate accounts

To discuss getting access to these or any other SMA data feed or widget, please email contactus@socialMarketAnalytics.com

Thanks,

Joe

Social Market Analytics aggregates the intentions of professional investors as expressed on Twitter. We apply our patented filtering and natural language processing (NLP) to Tweets to proactively select Twitter accounts to use in our predictive metrics. We track several metrics to gauge the predictive nature of our dataset. For this blog I am going to illustrate one of these metrics.

2018 was a rough year for the SP500, which lost about 9% (rolling one year). Given the market loss and the high volatility, we thought it would be an ideal dataset over which to run an experiment. Two questions we get regularly are: How would your data perform in a bear market? And what is the benefit of your NLP and account ratings systems? This blog will answer both questions from the perspective of 2018 market performance.

The table below illustrates the performance of six theoretical portfolios. These portfolios represent stocks with Social Market Analytics S-Scores of 2 or higher (Long signal) or Social Market Analytics S-Scores of -2 or lower (Short signal). S-Score compares the tone of current Twitter conversations with the average tone of Twitter conversations over the last twenty days. Social Market Analytics has multiple baselines for multiple prediction periods.

Each security in our universe represents a proprietary Topic Model. Each Topic is a collection of rules used to include or exclude specific Tweets from security buckets. For example, if you are looking for Tweets about Ethan Allen furniture (ETH), you do not want to include Tweets about Ethereum Crypto Currency (also symbol ETH) conversations.

We created portfolios with our account filtering algorithms and compared them with portfolios of all Twitter accounts discussing our Equity Topic Models. The purpose of the run was to quantify the ability of our patented account filtering algorithms to identify professional, and hence more accurate, investors. Spoiler alert: Our account filtering improved the long/short return by 50% (18.73 for 2018 versus 12.53 NLP only).

NLP applied only:

The NLP only portfolios illustrate the power of our NLP process to accurately identify and fine grain score Tweets discussing securities and companies.  Our patented process reads each Tweet multiple times to identify if and how strongly someone is voicing a view of expected future performance.  The NLP only portfolios illustrate the predictive power of our NLP in isolation.  When you apply the Account filtering you get a predictive boost.

Account Filtered + NLP applied:

Account Filtered plus NLP portfolios illustrate the benefit of applying our account filtering metrics. Early in the life of Social Market Analytics we learned it’s not just what is being said on Twitter but who is saying it. We developed proprietary metrics to identify investors more likely to be correct about the future direction of a security. When the conversation of these professional investors is significantly more positive than the average conversation over the last 20 days, those securities significantly outperform. When the conversation of these professional investors is significantly more negative than the average conversation over the last 20 days, those securities significantly underperform.

Portfolio Construction

Portfolios are constructed of securities with an S-Score of 2 or higher (long) or -2 or lower (short). All portfolios are equally weighted. A negative value for a short portfolio denotes a positive return to that portfolio. Short portfolios are supposed to move lower. All securities are entered on the Open based on 9:10 am Eastern time S-Scores and exited on the Close. There is no overnight exposure.
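
A minimal sketch of this construction for a single day is shown below. The tickers and function names are hypothetical, and treating the long/short return as the long average minus the short average is an assumption about how the L/S figure is combined.

```python
def build_portfolios(s_scores, threshold=2.0):
    """Split tickers into long (S-Score >= 2) and short (S-Score <= -2)."""
    longs = sorted(t for t, s in s_scores.items() if s >= threshold)
    shorts = sorted(t for t, s in s_scores.items() if s <= -threshold)
    return longs, shorts

def long_short_return(longs, shorts, open_to_close):
    """Equal-weighted long average minus equal-weighted short average.

    ASSUMPTION: a falling short portfolio contributes positively, which
    matches the sign convention described in the text.
    """
    def avg(tickers):
        if not tickers:
            return 0.0
        return sum(open_to_close[t] for t in tickers) / len(tickers)
    return avg(longs) - avg(shorts)
```

Because positions are opened at the open and closed at the close, each day’s portfolios are rebuilt from that morning’s 9:10 am S-Scores.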

Result Analysis

We use the SP500 as our performance benchmark. The SP500 return is calculated from open to close in the same manner as the selected securities. Using open to close performance, the SP500 returned -16.89% for comparison. As you can see from the table, the S-Score > 2 securities outperformed the market and negative S-Score securities significantly underperformed the market (generating positive alpha). The L/S portfolio with NLP only returned +12.54%; NLP plus account filtering improved that performance by 50% to +18.73%. We do not present this as a single-factor model, but even after removing 10% a year for slippage and commissions the portfolio still significantly outperforms.

Please contact us with any questions or to see how SMA’s NLP and filtering capabilities can be used in your investment process. ContactUs@SocialMarketAnalytics.com

Social Market Analytics aggregates the intentions of professional investors as expressed on Twitter. SMA factors are highly predictive over various time frames. In June of 2017 Social Market Analytics launched a weekly re-balanced, sentiment-based large cap index. This index comprises the twenty-five stocks with the highest average Twitter sentiment over the prior week, selected from the CBOE Large Cap 450 Index and re-balanced on Friday afternoons. This index has been published daily since that date and is available on all major feeds.
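
The weekly selection step can be sketched as ranking the universe by average sentiment over the prior week and keeping the top twenty-five. This is an illustration of the rule as described, not the index methodology itself; the input layout and the alphabetical tie-break are assumptions.

```python
from statistics import mean

def select_constituents(weekly_scores, n=25):
    """Pick the n tickers with the highest average sentiment last week.

    `weekly_scores` maps each ticker in the universe to its list of daily
    sentiment readings for the prior week. Tickers with no readings are
    skipped; ties are broken alphabetically (an assumption).
    """
    averages = {t: mean(v) for t, v in weekly_scores.items() if v}
    ranked = sorted(averages, key=lambda t: (-averages[t], t))
    return ranked[:n]
```

Running this each Friday afternoon on the prior week’s scores yields the constituent list for the following week.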

Last year the SP500 Index had a return of -8.4%. The CBOE SMLC Index had a return of +0.87%. Below is a chart of its return over the last year compared to the SP500.

For more information or to license this index please contact us at ContactUs@SocialMarketAnalytics.com

smlcw performance

This year has been tough for most investment strategies. Firms using traditional sources of data are generating the same underwhelming returns. Two years ago, Social Market Analytics, Inc. (SMA) launched the SMLCW index in partnership with the CBOE. This index is re-balanced weekly and comprised of the twenty-five securities selected from the CBOE large cap universe with the highest average S-Score over the prior week. It’s a long-only index of super-cap stocks with unusually positive Twitter conversations.

SMA publishes a family of metrics providing a full representation of the Twitter conversation across equities (US and LSE), commodities, currencies, ETFs and cryptos.

S-Score is a normalized representation of the current Twitter conversation of professional investors as identified by Social Market Analytics’ patented algorithms. SMA has access to the full Twitter feed through our licensed partnership with Twitter and listens in real-time for any mention of topics and securities of interest. These Tweets are scanned in real-time for sentiment and influence of the poster and compared to prior conversations over the look-back period. Securities with higher S-Scores subsequently outperform and securities with negative S-Scores under-perform.

SMA S-Scores are predictive over multiple prediction periods.  With seven years of out-of-sample data we can extend our comparison baselines and predict over longer periods.

Year-To-Date the SMLCW index is up over 7.5% while the SP500 is flat. Even after subtracting a couple percent for commissions/slippage, the index is still significantly positive. This is not a back-test; this index has been live and on your quote screens for nearly two years. The YTD actual performance chart from the CBOE site is below.

SMLCW - YTD

As mentioned, this is a long-only index. During the recent market drawdown this long index has been performing. SMA negative S-Score stocks have been moving lower at a significant rate, generating positive alpha. Below is a chart of the SMLCW index compared to the SP500. For any questions or to learn more please contact us at: ContactUs@SocialMarketAnalytics.com.

Thanks,

Joe