China Charts

#18. EQUITY QUANT: From Consumer Complaints to Investment Alpha

Read Time: 8 min

China Charts
Feb 1

This is EQUITY QUANT, a new experimental column where we add explicit and tangible value to equity investors. Its geographic remit is wider than that of our staple columns on China (i.e. LONG VIEW, THE BRIEF), and may cover both emerging (incl. ex-China) and developed markets.


Contents:

  1. Data Sourcing

  2. Feature Selection

  3. Textual Analysis

  4. Deriving Variables

  5. Quantile Segmentation

  6. Plotting Portfolio Performance


Exec Summary:

Can publicly available data on consumer complaints be reworked to generate discernible alpha in investment returns? In this exercise, it turns out that it can. The portfolio results are summarised below.

Specifically, we use the publicly available Complaints Database of the Consumer Financial Protection Bureau (CFPB). We tie the data to a universe of 25 publicly traded, consumer-facing US financial institutions. We then extract features from the data and derive variables using textual analysis of the consumer complaint narratives.

The derived variable is a year-on-year similarity score (equivalently, a change score) for each company's complaint narratives. The main idea is that a shift in the kind of complaints a company receives from one year to the next carries information. In this case, companies whose complaints changed the most (the "changers") outperformed the non-changers. Inspecting the portfolios visually, Q1 clearly performed worst and Q5 second best, while the 130/30 long-short portfolio performed best.

*Not investment advice. Do your own research.
**change_score = 1 - similarity_score


1. Data Sourcing

We use the Consumer Financial Protection Bureau’s (CFPB) Complaints Database. These are complaints made against financial products of particular companies (e.g. credit scoring, credit cards, loans). The database updates daily.


2. Feature Selection

We’re mostly interested in the “Consumer complaint narrative” column, which contains unstructured text. While the database goes back to 2011, the narratives are only available from 2015 onwards. That gives us 7 years of data (Jan 2015 - Dec 2021) with a total of 841,219 non-empty entries (consumers choose whether to disclose their narrative).
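As a rough illustration of this filtering step, here is a minimal sketch using only the Python standard library. The column names "Date received" and "Company", the ISO date format, and the helper name load_narratives are assumptions made for the example; only the “Consumer complaint narrative” column is named in this post. The sample rows are made up.

```python
import csv
import io
from datetime import date

# Tiny in-memory stand-in for the CFPB export. Column names other than
# "Consumer complaint narrative" and the ISO date format are assumptions.
SAMPLE_CSV = """Date received,Company,Consumer complaint narrative
2014-06-01,Bank A,
2015-03-10,Bank A,My card was charged twice
2021-12-31,Bank B,Account closed without notice
2022-01-05,Bank B,Late fee dispute
2019-07-04,Bank A,
"""

def load_narratives(handle, start=date(2015, 1, 1), end=date(2021, 12, 31)):
    """Keep rows with a non-empty narrative received between start and end."""
    kept = []
    for row in csv.DictReader(handle):
        received = date.fromisoformat(row["Date received"])
        narrative = (row["Consumer complaint narrative"] or "").strip()
        if narrative and start <= received <= end:
            kept.append(
                {"company": row["Company"], "year": received.year, "text": narrative}
            )
    return kept

rows = load_narratives(io.StringIO(SAMPLE_CSV))
print(len(rows))  # 2 of the 5 sample rows survive the filter
```

For the real file, the same function could be given an open file handle instead of the in-memory sample.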

Since not all companies that receive complaints are publicly traded, for the purpose of this exercise we simply take the top 25 from the list and tie them to tickers later in the workflow.
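One way to pick such a universe, assuming "top" means "most complaints" (the post does not spell out the ranking criterion), is a simple frequency count. The function name top_companies and the sample records are hypothetical.

```python
from collections import Counter

def top_companies(rows, n=25):
    """Return the n company names with the most complaint rows."""
    counts = Counter(row["company"] for row in rows)
    return [company for company, _ in counts.most_common(n)]

# Made-up records, only to show the shape of the input.
sample_rows = (
    [{"company": "Bank A"}] * 3
    + [{"company": "Bank B"}] * 1
    + [{"company": "Bank C"}] * 2
)
universe = top_companies(sample_rows, n=2)
print(universe)  # ['Bank A', 'Bank C']
```

Any private companies that make the cut would still need to be dropped by hand when tying names to tickers.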


3. Textual Analysis

Now we run the textual analysis:

  • Convert dataframe to corpus format

  • Make tokens out of the words

  • Remove stop words (e.g. “a”, “the”)

  • Stem the words (e.g. so that “fraudulent” and “fraud” are counted as the same word, since both would reduce to the stem “fraud”)
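The steps above can be sketched in a few lines of plain Python. This is a toy version under loud assumptions: the stop-word list is a tiny hand-written one, and crude_stem is a naive suffix stripper with rules picked so that the example in the list works. A real stemmer from a text-analysis library applies different rules and may produce different stems.

```python
import re

# Tiny illustrative stop-word list; real lists are much longer.
STOP_WORDS = {"a", "an", "the", "and", "of", "to", "was", "my", "on", "is", "in"}

# Toy suffix rules, hand-picked for this example only.
SUFFIXES = ("ulent", "ing", "ed", "s")

def crude_stem(token):
    """Strip the first matching suffix, keeping at least 3 characters."""
    for suffix in SUFFIXES:
        if token.endswith(suffix) and len(token) - len(suffix) >= 3:
            return token[: -len(suffix)]
    return token

def preprocess(text):
    """Tokenise, drop stop words, then stem what is left."""
    tokens = re.findall(r"[a-z]+", text.lower())
    return [crude_stem(t) for t in tokens if t not in STOP_WORDS]

print(preprocess("The fraudulent charges and a fraud"))
# ['fraud', 'charge', 'fraud']
```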

Just for demonstration purposes, after all this data wrangling we can run a word cloud visualisation. In this case we’re comparing across companies for 2021.

And here’s a word cloud across time for the same company, Wells Fargo.

For building intuition, we can also run a lexical dispersion plot for particular keywords that may be relevant in the financial context. In this case we plot “fraud”, “identity”, “violat*” for Wells Fargo by year.
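The data behind a lexical dispersion plot is simply the position of each keyword hit within the text. Here is a minimal sketch of collecting those positions, with "violat*" treated as a shell-style wildcard via the standard library's fnmatch. The function name keyword_positions and the sample sentence are made up for illustration.

```python
import re
from fnmatch import fnmatchcase

def keyword_positions(text, patterns):
    """Return, for each pattern, the token offsets at which it matches."""
    tokens = re.findall(r"[a-z]+", text.lower())
    hits = {pattern: [] for pattern in patterns}
    for offset, token in enumerate(tokens):
        for pattern in patterns:
            if fnmatchcase(token, pattern):
                hits[pattern].append(offset)
    return hits

sample = "Identity theft is fraud. They violated the law, a violation."
positions = keyword_positions(sample, ["fraud", "identity", "violat*"])
print(positions)
# {'fraud': [3], 'identity': [0], 'violat*': [5, 9]}
```

Running this once per year of narratives and plotting the offsets as tick marks along a horizontal axis would give one row of the dispersion plot per keyword.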


4. Deriving Variables

Here’s the key part. We want to measure how similar each Company-Year’s worth of complaints is to the same company's complaints from the previous year. For example, we compare Wells Fargo complaints from 2021 to Wells Fargo complaints from 2020, which yields one similarity score per company per year.

Here’s a visualisation of the similarity score (i.e. the derived variable). A score of 1 means that this year’s complaints are identical in nature to the previous year’s. The closer the score is to zero, the more one Company-Year differs from the other. By definition then: change_score = 1 - similarity_score.
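The post does not say which similarity measure is used, so the sketch below assumes cosine similarity on raw term counts, a common choice that gives 1 for identical word distributions and 0 for texts with no words in common. The function names and token lists are illustrative only.

```python
import math
from collections import Counter

def cosine_similarity(tokens_a, tokens_b):
    """Cosine similarity between the term-count vectors of two token lists."""
    a, b = Counter(tokens_a), Counter(tokens_b)
    dot = sum(a[t] * b[t] for t in a.keys() & b.keys())
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    if norm_a == 0 or norm_b == 0:
        return 0.0  # an empty document is treated as sharing nothing
    return dot / (norm_a * norm_b)

def change_score(tokens_prev, tokens_curr):
    """change_score = 1 - similarity_score, as defined in the post."""
    return 1.0 - cosine_similarity(tokens_prev, tokens_curr)

# Made-up stemmed tokens for two consecutive years of one company.
year_2020 = ["fraud", "fee", "fee", "account"]
year_2021 = ["fraud", "fee", "loan", "loan"]
print(round(cosine_similarity(year_2020, year_2021), 3))
print(round(change_score(year_2020, year_2021), 3))
```

In practice each token list would hold all preprocessed narratives for one Company-Year, so the score compares whole years rather than single complaints.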

Going further, we can plot company migration through time and quantiles like so:


5. Quantile Segmentation

To prepare the similarity score for testing, we need to:

  • Rank companies by their similarity score by year

  • Assign these to quintiles

  • Create portfolio weights

  • Run calculations and plot the portfolios

Here’s a visualisation of the weights. Since we have 25 stocks in our universe, a quintile will be 5 stocks. Within each quintile, each stock will take a 20% weighting.
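A minimal sketch of the ranking and equal-weighting steps follows. It assumes companies are sorted by ascending change score, so that Q1 holds the smallest changers and Q5 the largest; the post does not state the sort direction explicitly. Tickers and scores are placeholders.

```python
def assign_quintiles(scores, n_buckets=5):
    """Map each ticker to a bucket 1..n_buckets by ascending score."""
    ranked = sorted(scores, key=scores.get)
    return {
        ticker: position * n_buckets // len(ranked) + 1
        for position, ticker in enumerate(ranked)
    }

def equal_weights(quintiles, bucket):
    """Give every stock in the chosen bucket the same weight, summing to 1."""
    members = [t for t, q in quintiles.items() if q == bucket]
    return {t: 1.0 / len(members) for t in members}

# 25 placeholder tickers with made-up change scores for one year.
scores = {f"T{i:02d}": i / 100 for i in range(25)}
quintiles = assign_quintiles(scores)
q5_weights = equal_weights(quintiles, 5)
print(q5_weights)  # five tickers at 0.2 each
```

Repeating this for every year gives one set of quintile weights per rebalancing date.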

And here’s a visualisation of the Q5-Q1 portfolio weights. This is a long-short portfolio: long the stocks in the favoured quintile and short the stocks in the disfavoured quintile. The favoured quintile is weighted at 130% and the disfavoured at 30% to give a 130/30 allocation. Notice that the long weights sum to more than 1.00 (1.30), while the short stocks carry negative weights that sum to -0.30.
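The long-short weights can be sketched as below, taking "130/30" literally: long weights summing to 1.30 and short weights summing to -0.30, each split equally across the five stocks in its quintile. The function name and tickers are hypothetical.

```python
def long_short_weights(long_names, short_names, long_gross=1.30, short_gross=0.30):
    """Equal-weight longs to +long_gross in total and shorts to -short_gross."""
    weights = {name: long_gross / len(long_names) for name in long_names}
    for name in short_names:
        weights[name] = -short_gross / len(short_names)
    return weights

q5 = ["L1", "L2", "L3", "L4", "L5"]  # favoured quintile (placeholder tickers)
q1 = ["S1", "S2", "S3", "S4", "S5"]  # disfavoured quintile
weights = long_short_weights(q5, q1)
print(round(weights["L1"], 2), round(weights["S1"], 2))  # 0.26 -0.06
print(round(sum(weights.values()), 2))  # net exposure of 1.0
```

The net exposure stays at 100%, which keeps the long-short portfolio comparable with the long-only quintile portfolios.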


6. Plotting Portfolio Performance

Here are the resulting portfolios for each quintile. Inspecting them visually, Portfolio Q1 clearly performed worst and Q5 second best, while the 130/30 long-short portfolio performed best.
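For readers who want to reproduce this kind of chart, here is a minimal sketch of turning weights and per-period stock returns into a cumulative performance line. It assumes the weights are reset at the start of every period, and the returns are made-up numbers, not results from this post.

```python
def portfolio_returns(weights, returns_by_period):
    """Weighted sum of stock returns for each period."""
    return [
        sum(w * period[ticker] for ticker, w in weights.items())
        for period in returns_by_period
    ]

def cumulative_growth(period_returns, start=1.0):
    """Compound period returns into the value of `start` invested at the outset."""
    curve = []
    value = start
    for r in period_returns:
        value *= 1.0 + r
        curve.append(value)
    return curve

# Made-up two-stock example, purely for illustration.
demo_weights = {"AAA": 0.5, "BBB": 0.5}
demo_returns = [
    {"AAA": 0.10, "BBB": 0.00},
    {"AAA": -0.10, "BBB": 0.10},
]
curve = cumulative_growth(portfolio_returns(demo_weights, demo_returns))
print([round(v, 4) for v in curve])  # [1.05, 1.05]
```

Computing one such curve per quintile, plus one for the long-short weights, gives the lines to plot against each other.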


Next Steps:

  • For our Brazilian subscribers, here’s a company complaints database for Brazil. The analysis done in this post can be replicated regardless of language. Instead of English stop-words, use Portuguese ones. Same goes for other markets, Chinese and Japanese included.

  • For a single company case study using the similarity / change score, have a look at our previous post on AAPL here.


Thank you for reading EQUITY QUANT. If you like what we are doing, please subscribe or share this post. Your support helps a lot!

© 2022 China Charts