Preliminary Data Analysis*

Torben Juul Andersen (Copenhagen Business School, Denmark)

A Study of Risky Business Outcomes: Adapting to Strategic Disruption

ISBN: 978-1-83797-075-9, eISBN: 978-1-83797-074-2

Publication date: 29 September 2023

Abstract

This chapter first analyzes how the data-cleaning process affects the share of missing values in the extracted European and North American datasets. It then examines how three approaches to treating missing values, Complete Case analysis, Multiple Imputation by Chained Equations (MICE), and K-Nearest Neighbor (KNN) imputation, affect the number of firms and their average lifespan in the datasets relative to the original sample, assessed across Standard Industrial Classification (SIC) industry divisions. The analysis is then extended to the implied effects on the distribution of a key performance indicator, return on assets (ROA), by calculating skewness and kurtosis measures for each treatment method and industry context. The results consistently show highly negatively skewed distributions with high positive excess kurtosis across all industries, with the KNN imputation treatment producing distribution characteristics closest to those of the original untreated data. We further analyze the persistence of the (extreme) left-skewed tails, measured as the share of outliers and extreme outliers. These shares are consistent and rather high, with outliers around 15% of the full sample and extreme outliers around 7.5%, indicating pervasive skewness in the data. Of the three approaches to missing values, KNN imputation generates final datasets that most closely resemble the original data, even though the Complete Case approach remains the norm in mainstream studies. One consequence is that most empirical studies are likely to underestimate the prevalence of extreme negative performance outcomes.
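
As a rough illustration of the distribution measures named in the abstract, the following Python sketch computes skewness, excess kurtosis, and the shares of outliers and extreme outliers for a small ROA sample. It is not the chapter's own code. The sample values are invented for illustration, the function names are hypothetical, and the outlier definition (Tukey fences at 1.5 and 3 times the interquartile range) is an assumption, since the abstract does not state how outliers are defined.

```python
import statistics


def central_moment(values, order):
    """Population central moment of the given order."""
    mean = sum(values) / len(values)
    return sum((v - mean) ** order for v in values) / len(values)


def skewness(values):
    """Moment-based skewness: negative values indicate a longer left tail."""
    return central_moment(values, 3) / central_moment(values, 2) ** 1.5


def excess_kurtosis(values):
    """Moment-based kurtosis minus 3, so a normal distribution scores about 0."""
    return central_moment(values, 4) / central_moment(values, 2) ** 2 - 3.0


def outlier_shares(values):
    """Share of observations outside the 1.5*IQR and 3*IQR fences.

    ASSUMPTION: Tukey-style fences; the chapter may define outliers differently.
    """
    q1, _, q3 = statistics.quantiles(values, n=4)
    iqr = q3 - q1

    def share(multiple):
        low, high = q1 - multiple * iqr, q3 + multiple * iqr
        return sum(1 for v in values if v < low or v > high) / len(values)

    return share(1.5), share(3.0)


# Invented ROA values: nine modestly profitable firms and one severe loss.
ROA_SAMPLE = [0.05, 0.06, 0.04, 0.07, 0.05, 0.03, 0.06, 0.05, 0.04, -0.90]

if __name__ == "__main__":
    outliers, extreme_outliers = outlier_shares(ROA_SAMPLE)
    print("skewness:", skewness(ROA_SAMPLE))
    print("excess kurtosis:", excess_kurtosis(ROA_SAMPLE))
    print("outlier share:", outliers)
    print("extreme outlier share:", extreme_outliers)
```

Applying the same functions to the original sample and to each treated dataset (Complete Case, MICE, KNN) would give one way to compare how closely each treatment preserves the shape of the original distribution.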

Citation

Andersen, T.J. (2023), "Preliminary Data Analysis*", A Study of Risky Business Outcomes: Adapting to Strategic Disruption (Emerald Studies in Global Strategic Responsiveness), Emerald Publishing Limited, Leeds, pp. 29-45. https://doi.org/10.1108/978-1-83797-074-220231003

Publisher

Emerald Publishing Limited

Copyright © 2023 Torben Juul Andersen