Methods

Our data set consisted of excerpts of 10-K filings relating to “climate risk”, extracted from a subset of the Russell 3000 companies (download the full dataset). We worked with roughly 600 companies across five industry sectors: Oil and Gas, Electric Utilities, Insurance, Food and Agriculture, and Textiles and Apparel.

Corporate 10-K filings typically stretch to 100 pages, and not every company discloses climate-related risks, either because it has not started analyzing them or because it considers them immaterial. Isolating the passages (if any) that address climate risk within these broader reports therefore takes some clever extraction work.

According to the SEC’s interpretive guidance on climate risk, there are various sections of the 10-K report where it is appropriate to make climate-related disclosures. These include:

The description of a public company’s business and that of its subsidiaries (Part 1, Item 1), which should include costs of complying with environmental laws.
The risk factors section (Part 1, Item 1A), which should include a discussion of factors constituting an investment risk particular to the company and its subsidiaries.
A description of pending legal actions considered material by various standards, including environmental litigation, which should be included under Part 1, Item 3: ‘Legal Proceedings’.
Part 2, Item 7, ‘Management’s Discussion and Analysis of Financial Condition and Results of Operations (MD&A)’, which should include further narrative discussion of climate-related considerations that may in the future impact the financial condition or operating performance of the company, positively or negatively.
And finally, companies should also consider disclosure of the impacts of federal and state legislation and regulation, international accords, indirect consequences or opportunities arising out of regulation or business trends, and the likely impacts on the business of physical forces such as severe weather or sea level rise.

Jackie Cook of CookESG Research devised a series of rule-based algorithms that run key-word queries in sequence to find these passages in the larger reports. Each piece of extracted text (ranging from a couple of sentences to a few paragraphs) was compiled into a year-by-year corpus of climate disclosure statements for each company. We then brought these .csv files into the natural language processing platform CorText to parse the texts and identify the most salient terms and phrases used to discuss climate risk in the corporate filings, find their relations to each other through co-occurrence algorithms, and then spatialize and organize these relationships using various visual strategies (network graphs, term histograms and radial diagrams). A more detailed description of the methods and data treatments used for each visualization appears on each visualization page.
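
To give a rough sense of what key-word extraction and co-occurrence counting involve, here is a minimal Python sketch. It is not the CookESG algorithms or the CorText pipeline: the term list, the function names `extract_passages` and `cooccurrence_counts`, and the sample text are all hypothetical. It also collapses the sequence of queries into a single pass over paragraphs.

```python
import re
from collections import Counter
from itertools import combinations

# Hypothetical key-word list for illustration only; the actual queries
# used to build the data set are not reproduced here.
CLIMATE_TERMS = [
    "climate change",
    "greenhouse gas",
    "global warming",
    "sea level",
    "severe weather",
]


def extract_passages(filing_text, terms=CLIMATE_TERMS):
    """Return the paragraphs that mention at least one term (case-insensitive)."""
    pattern = re.compile("|".join(re.escape(t) for t in terms), re.IGNORECASE)
    # Treat blank lines as paragraph boundaries.
    paragraphs = [p.strip() for p in re.split(r"\n\s*\n", filing_text)]
    return [p for p in paragraphs if p and pattern.search(p)]


def cooccurrence_counts(passages, terms=CLIMATE_TERMS):
    """Count how often each pair of terms appears in the same passage."""
    counts = Counter()
    for passage in passages:
        lowered = passage.lower()
        present = sorted(t for t in terms if t in lowered)
        counts.update(combinations(present, 2))
    return counts


# Invented sample text standing in for a filing excerpt.
sample = (
    "We operate refineries in three states.\n\n"
    "Climate change legislation could raise our costs, and greenhouse gas "
    "rules may limit output.\n\n"
    "Severe weather and sea level rise could damage coastal facilities."
)

passages = extract_passages(sample)
pairs = cooccurrence_counts(passages)
```

In this toy example, `passages` holds the two paragraphs that mention a listed term, and `pairs` records which terms appear together in the same passage. Pair counts of this kind are the sort of input a network graph of terms can be built from.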

If you would like more information about the methodology applied in compiling the data set, please contact jcook@cookesg.com. If you are interested in the data set and the digital methods applied to its analysis, please contact ian.gray@sciencespo.fr.