#17. EQUITY QUANT: What Can We Glean From Company Filings (Without Actually Reading Them)?
Read Time: 5 min
This is EQUITY QUANT, a new experimental column where we aim to add explicit and tangible value for equity investors. Its geographic remit will be wider than that of our staple columns on China (i.e. LONG VIEW, THE BRIEF), and may cover both emerging (incl. ex-China) and developed markets. We’re still a fairly young newsletter (started Nov 2021), are approaching 600 subscribers, and are still experimenting with the content. Thank you for your continued support! Please share and spread the word; it helps a lot!
Plotting Change in Filings
Top Words by Salience
Company filings (e.g. 10-K, 10-Q, 20-F) can be dense and lengthy, and getting through them is not easy for equity investors. Are there methods we can apply to prioritise which filings to read first and to identify what to keep an eye on, all before actually digging in and reading them? In this issue, we sift through Apple’s 10-K filings with the SEC as a case study.
*Not investment advice. Do your own research.
1. Plotting Change in Filings
We first extract one of the sections of most interest, “Item 7: Management’s Discussion & Analysis” (MD&A), and try to quantify its year-on-year change.
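To make the extraction step concrete, here is a minimal sketch of how the Item 7 section might be pulled out of a filing's plain text. It is not our production code: the function name `extract_item7` and both regular expressions are assumptions for illustration, and real filings vary enough in heading punctuation and layout that the patterns would need tuning. Because a 10-K's table of contents also mentions "Item 7", the sketch keeps the longest candidate span.

```python
import re

# Assumed heading patterns; actual filings differ in spacing and punctuation.
ITEM7_START = re.compile(r"item\s+7\.?\s*management", re.IGNORECASE)
ITEM7_END = re.compile(r"item\s+(?:7a|8)\b", re.IGNORECASE)


def extract_item7(text: str) -> str:
    """Return the longest span running from an 'Item 7' heading
    to the next 'Item 7A' or 'Item 8' heading (hypothetical helper)."""
    best = ""
    for start in ITEM7_START.finditer(text):
        end = ITEM7_END.search(text, start.end())
        stop = end.start() if end else len(text)
        candidate = text[start.start():stop]
        # The table-of-contents entry is short; the real section is long.
        if len(candidate) > len(best):
            best = candidate
    return best.strip()
```

Taking the longest span is a heuristic, not a guarantee; a filing that incorporates the MD&A by reference, for instance, would need separate handling.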
Our change score is a textual (word composition & frequency) comparison of the section in one year against the same section in the prior year. For example, the data point for 2020 is the score of the 2020 filing versus the 2019 filing. A score == 0 means no change (most likely the same document compared with itself). At a glance, the change score answers the question “Did something change?” or “Should this be the first filing I read?”.
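For readers who want to experiment, one simple way to build a score with these properties is the cosine distance between the word-count vectors of two filings. We stress that this is a stand-in chosen for illustration, not necessarily the exact metric behind our chart; the names `word_counts` and `change_score` are hypothetical.

```python
import math
import re
from collections import Counter


def word_counts(text: str) -> Counter:
    """Lower-case the text and count alphabetic tokens."""
    return Counter(re.findall(r"[a-z]+", text.lower()))


def change_score(current: str, prior: str) -> float:
    """Cosine distance between word-count vectors (assumed metric).
    0.0 means identical word composition; 1.0 means no shared words."""
    a, b = word_counts(current), word_counts(prior)
    dot = sum(a[w] * b[w] for w in a.keys() & b.keys())
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    if norm_a == 0 or norm_b == 0:
        # At least one empty document: identical only if both are empty.
        return 0.0 if a == b else 1.0
    # Clamp at zero to absorb floating-point rounding.
    return max(0.0, 1.0 - dot / (norm_a * norm_b))
```

Under this sketch, comparing a section with itself gives a score of (numerically) zero, which matches the "score == 0 means no change" reading above. Note that cosine distance ignores word order, so a reshuffled but otherwise identical section would also score zero.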
In the case of AAPL, it’s clear that something was already afoot in 2005 and again in 2006, all before the spike in 2007. Steve Jobs announced the iPhone only on Jan. 9, 2007, at the Macworld Expo (source here).
2. Top Words by Salience
We extract words from the filings and then score them based on their frequency within a particular 10-K versus their frequency in the company's other 10-Ks. The point is to identify words that are rare (so they don't rank high in raw frequency counts) but which could be significant in the context of a particular filing. The words are stemmed, e.g. "fraud" and "fraudulent" are treated as the same word because their common stem is "fraud"; "restructur" is the stem of "restructure", "restructuring" and "restructured", which are all semantically similar. As one example from the results, wearables become a thing in 2019 and 2020.
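The scoring idea resembles TF-IDF, so here is a small sketch along those lines. Treat it as an approximation under stated assumptions rather than our exact formula: the scale of its scores will not match the numbers we report, `crude_stem` is a naive suffix stripper standing in for a proper stemmer (such as Porter's), and `top_salient` is a hypothetical helper name.

```python
import math
import re
from collections import Counter

# Naive stand-in for a real stemmer; checked in order, first match wins.
SUFFIXES = ("ing", "ed", "es", "s", "e")


def crude_stem(word: str) -> str:
    """Strip one common suffix, keeping at least four characters."""
    for suffix in SUFFIXES:
        if word.endswith(suffix) and len(word) - len(suffix) >= 4:
            return word[: -len(suffix)]
    return word


def top_salient(filings: dict, year: int, k: int = 3) -> list:
    """Rank stems in one year's filing by a TF-IDF style score,
    computed against all of the company's filings (assumed formula)."""
    stems = {
        y: Counter(crude_stem(w) for w in re.findall(r"[a-z0-9]+", t.lower()))
        for y, t in filings.items()
    }
    n_docs = len(stems)
    doc_freq = Counter()
    for counts in stems.values():
        doc_freq.update(counts.keys())  # number of filings containing each stem
    target = stems[year]
    total = sum(target.values())
    scores = {
        w: (c / total) * math.log(n_docs / doc_freq[w])
        for w, c in target.items()
    }
    # Highest score first; ties broken alphabetically for stable output.
    return sorted(scores.items(), key=lambda kv: (-kv[1], kv[0]))[:k]
```

A stem that appears in every filing gets a score of zero under this formula, which is the intended effect: boilerplate vocabulary drops out and year-specific terms rise to the top.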
Here are the salience scores for the words in the table above. To illustrate, the top_1 word in 1999 has a score of 69.90, meaning it doesn’t really show up anywhere else. What’s so special about it? Go back up and you will find that it corresponds to the word “y2k”, i.e. the year 2000, when humanity was kind of panicking about whether computer clocks would blow up at the rollover. Another example is 2007, a very exceptional MD&A section versus Apple’s other filings. Is it mostly about compensation to Jobs for pulling it all off? (The iPhone was announced in January; the fiscal year covered by the filing ended in late September.)
3. Sentiment Analysis
Here we use the Loughran & McDonald Master Dictionary, a finance-specific dictionary with codings for sentiment. Just for flavour, see lower down for a sample list of words for each sentiment coding.
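As a rough sketch of how dictionary-based scoring can work, the snippet below counts the share of a text's words that fall into each sentiment category. The tiny `SAMPLE_LEXICON` is a hand-picked placeholder, not the contents of the actual Master Dictionary, and `sentiment_shares` is a hypothetical name; in practice the word lists would be loaded from the dictionary file itself.

```python
import re
from collections import Counter

# Illustrative placeholder word lists only; NOT the real dictionary contents.
SAMPLE_LEXICON = {
    "negative": {"loss", "decline", "adverse"},
    "uncertainty": {"may", "uncertain", "risk"},
    "litigious": {"litigation", "plaintiff", "court"},
    "strong": {"will", "must", "always"},
}


def sentiment_shares(text: str, lexicon: dict = SAMPLE_LEXICON) -> dict:
    """Return, per category, the fraction of tokens found in that category's list."""
    tokens = re.findall(r"[a-z]+", text.lower())
    if not tokens:
        return {category: 0.0 for category in lexicon}
    counts = Counter(tokens)
    return {
        category: sum(counts[w] for w in words) / len(tokens)
        for category, words in lexicon.items()
    }
```

Dividing by the total token count keeps years comparable even as filings grow longer; plotting each category's share by filing year would give trend lines of the kind discussed in this section.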
Litigious wording increased strongly going into the 2000s, after which the slope of the trend abated somewhat;
Negative wording fell sharply after 2005;
Strong wording surfaces now and then, in periods lasting 1-3 years;
Uncertainty wording for the company bottomed in 2007 and remained remarkably steady thereafter. Again, after the release of the iPhone things became more certain, and this is reflected in the textual discussion in the 10-K.
In EQUITY QUANT’s next issue, we’ll run a case study (different data set) where we actually tie the scoring to investment returns. Fingers crossed it’ll pan out. Stay tuned!