The Black Swan by Nassim Nicholas Taleb
A Critic's Meta Review: 4/5
The Black Swan by Nassim Nicholas Taleb is a philosophical treatment of Taleb's research on highly improbable, high-impact events. These events, which Taleb calls “Black Swans,” are so improbable that they are unpredictable. However, pundits and scholars are often inclined to fit such extreme events into a causal narrative after the fact, in order to make history appear more organized.
In fact, predictive models rely on data from the past, and that past data biases them against unprecedented, disruptive, and possibly calamitous events. By definition, such Black Swan events cannot be predicted using these models. For data such as wealth or the sales of creative work, the normal bell-curve distribution is an inadequate model because a single extreme observation can shift the entire distribution. For such data sets, Mandelbrotian fractal distributions are more useful: viewed from a distance they appear smooth even when they include huge outliers like Black Swans, yet they also capture the inequality that reappears within subsets of the data that exclude those outliers.
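To make the contrast concrete, here is a minimal sketch in Python (my illustration, not Taleb's; the parameters, and the choice of a Pareto distribution as the fractal stand-in, are assumptions) of how the single largest observation affects a bell-curve sample versus a fat-tailed one:

```python
import numpy as np

rng = np.random.default_rng(42)

# "Mediocristan": height-like data, mean 170 cm, standard deviation 10 cm
heights = rng.normal(loc=170, scale=10, size=10_000)

# "Extremistan": Pareto-style data with tail index alpha = 1.1, a common
# stand-in for wealth or creative-work sales (my assumption, not the book's)
wealth = (rng.pareto(1.1, size=10_000) + 1) * 10_000

for name, data in [("heights", heights), ("wealth", wealth)]:
    trimmed = np.delete(data, data.argmax())  # drop the single largest value
    shift = abs(data.mean() - trimmed.mean()) / data.mean()
    print(f"{name}: top value is {data.max() / data.sum():.2%} of the total; "
          f"removing it shifts the mean by {shift:.2%}")
```

On a typical run, the tallest person barely moves the mean of the heights, while the single largest "wealth" observation can hold a double-digit share of the total and drag the mean with it.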
Professional forecasters and scholars struggle to grasp the unpredictability of extreme events, and they prefer to invent a narrative to make the unpredictable seem predictable in retrospect. They also think extreme events can be predicted with models of statistical probability, but those models eliminate complexities and externalities that lead to Black Swan outliers. The human mind also has difficulty comprehending the difference between randomness and chaos. Randomness is due entirely to chance, while chaos might obey an unknown organizing principle.
The best way to withstand a Black Swan event is to understand which risks not to take, to utilize redundancy and balance risk on each extreme of a distribution, to be humble about the knowledge available, and to avoid predictions made with unsound methods.
A Black Swan is a high-impact event well beyond expectations set by a normal bell-curve distribution. Humans are inclined to narrate causes for Black Swan outliers after the fact.
Black Swan events are unpredictable in part because they have never happened before, so past events and trends do not suggest their possibility. An unprecedented stock market crash or natural disaster, for instance, will not appear in historical data on market behavior or weather events. If similar events have happened before, their causes may be explained in a way that appears predictable but does not yield a usable model for future predictions, as when a stock market crash is blamed on an event to which it was not really connected.
By definition, Black Swans are impossible to predict, but they can be imagined. It is common knowledge in the scientific community that the San Andreas and New Madrid faults are due for significant movement, although that fact is not always so well known among the residents who live directly on the faults. This is particularly the case with respect to the New Madrid fault, which runs through several states in the Midwest where significant earthquakes occur infrequently. A significant earthquake on either fault would cause untold millions in property damage. The consequences could include loss of life, as well.
Projections for these faults are based on activity from the 1800s or on the long-term absence of significant movement, and they can indicate the likelihood of a quake; however, no model can account for the possible but unknown extreme effects of such a quake. Since many dams, power plants, and other pieces of infrastructure in the Midwest were built without accounting for large earthquakes, the damage could set off an insurance-industry financial crisis with serious repercussions throughout the world.
The normal bell-curve distribution assumes a range of results that does not account for Black Swan outliers, which would shift the distribution noticeably. The scale of such an extreme outlier can be the result of pure chance and the tendency for large outliers to become even larger due to their specific circumstances.
Data such as the heights of people are modeled adequately by bell-curve distributions because they tend toward a mean value and observed events become less frequent the further one gets from that mean. No individual height measurement will significantly move that mean in a sample. However, book sales are poorly modeled with a bell curve because a single author might sell many times as many books as several other authors put together in a year. Such an author’s sales are in “Extremistan,” which is Taleb’s term for scenarios populated by extreme outliers.
Analyzing company returns in a particular industry would also be a poor fit for a normal distribution, because within a single industry one company may hold a functional monopoly while others struggle to turn a profit. The large company's initial success might have been the result of chance or a seemingly minor circumstance, such as having its name listed first in the phone book. This small edge might have allowed it to expand and increase its capacity while competitors struggled for business and name recognition. A company that already has a chance advantage over others in the same market can afford to advertise more, invest in customer service and sales, and build more factories, allowing it to rapidly dominate even if there is no historical precedent for such a large company in the industry.
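Such snowballing is easy to simulate. The following is a hedged sketch, not taken from the book, of a rich-get-richer process in which customers pick among ten otherwise identical firms in proportion to each firm's current size, and firm 0 begins with a single extra customer:

```python
import random

random.seed(7)
sizes = [2] + [1] * 9           # ten firms; firm 0 has one extra customer
for _ in range(100_000):        # 100,000 customers arrive one at a time
    # each customer picks a firm with probability proportional to its size
    winner = random.choices(range(10), weights=sizes)[0]
    sizes[winner] += 1

total = sum(sizes)
shares = sorted((s / total for s in sizes), reverse=True)
print([f"{s:.1%}" for s in shares])
# Typical run: one firm ends up with a wildly disproportionate share, and
# it is often (though not always) firm 0 -- feedback, not merit, dominates.
```

The head start only tilts the odds: across many runs, different firms can win, which is precisely the point about chance rather than merit determining the outlier.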
The human mind creates the illusion of understanding things that are too complex for it to grasp, making chaotic events seem orderly and predictable in retrospect, and overvaluing neat classifications. Taleb calls this Platonifying, after the Greek philosopher Plato.
Platonifying is the practice of assigning simple causes and classifications to complex events. For example, on the day that Iraqi dictator Saddam Hussein was captured, news articles blamed the event for both a rise and a drop in the stock market.
Platonifying may be a comforting practice when trying to identify the causes of disasters, but it can also obstruct efforts to prevent similar ones. For example, knowing that one cause of a home-destroying flood was a hurricane might not stop someone from rebuilding in the same place if that person believes hurricanes happen very rarely. That person's home would be destroyed again if a hurricane arrives during the next hurricane season, or if another event causes a flood, such as extreme rain or an earthquake-driven tsunami. On the other hand, a person who grasps the complexity of factors that destroyed the home, from its location and the management of nearby rivers to its method of construction, could take preventative steps: building on stilts to survive the next flood regardless of cause, and advocating for better civic management of the dams and rivers that could turn an otherwise mild flood into a disaster.
Predictability is relative to knowledge. Complex systems are perceived as random due to a lack of available or comprehensible information.
Whether an event is truly unpredicted is relative, especially if it was planned in secret by one person or a small group. The terrorist attacks of September 11, 2001, were Black Swans to the victims, but not to the plotters. They could not have been predicted by looking at past terrorist attacks or aircraft disasters, yet they were not random events either. Events that are unknown to one observer but known to another are better described as chaotic.
Things that appear random but are actually chaotic include the frequency of transactions on the stock market and the movements of weather systems. To an outside observer, the rapidly changing volume of transactions appears unpredictable, governed only by general trends like time of day or index values. In reality, anyone could predict trade volume who had the capacity to ask every person who will trade that day when, and how much, they will trade. Similarly, as with Taleb's example of predicting the paths of billiard balls, predicting the route of a tornado requires collecting data about the movements of air currents and pressure systems, the consistency of the ground the tornado might cross, the obstacles in its way, the integrity of those obstacles, and so on. The sheer quantity of data required makes such predictions practically impossible.
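A toy example shows why such predictions are practically impossible even for fully deterministic systems. This sketch (my illustration, not Taleb's) iterates the chaotic logistic map from two starting values that differ by one part in a billion:

```python
def logistic(x, r=4.0):
    """One step of the logistic map, a fully deterministic rule."""
    return r * x * (1.0 - x)

a, b = 0.400000000, 0.400000001   # initial values differing by 1e-9
for step in range(1, 51):
    a, b = logistic(a), logistic(b)
    if step % 10 == 0:
        print(f"step {step:2d}: |a - b| = {abs(a - b):.9f}")
# The gap grows from one part in a billion to order 1 within ~30 steps:
# without effectively perfect knowledge of the starting state, even this
# deterministic system is practically unpredictable.
```

Nothing here is random; the system is chaotic in exactly the sense above, organized by a rule the outside observer cannot exploit without impossibly precise data.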
Generating simplified narratives of the causes of Black Swans and demonstrating their probability after the fact prevents people from considering the many complex factors that actually caused the event, which then distorts predictive models.
Before the adoption of modern critical methods, historians tended to craft a narrative of human events taking place over thousands of years by portraying events in sequence, with one event logically leading to the next. This narrative style makes wars and economic crashes seem understandable and predictable, but it also prevents students of history from grasping how genuinely unexpected those events were. Unfortunately, most students and teachers of history still approach the subject in this narrative-predictive way, which encourages learners to apply historically specific lessons to irrelevant or inappropriate contexts and distracts from present complexities that might produce a Black Swan. The way probability and statistics are taught leads to similar misconceptions: the typical statistics education demonstrates everything in terms of dice rolls and coin tosses, confining probability to a simplified context that rarely occurs in reality, where coins are weighted and the probability of one event depends on the probability of many others.
Thus, a student who takes school lessons in probability and history literally might have trouble interpreting the likelihood of real events. For example, a student who learns that the Great Depression was the direct result of the stock market crash of October 29, 1929, might assume that a similar depression is unlikely because of all the legislation enacted to prevent the excessive speculation that led to the crash. This deduction ignores not only the complexity of factors that caused and exacerbated the Depression, but also the fundamentally altered nature of the world economy and the complexity and unpredictability of world events that could generate an economic catastrophe as bad as or worse than the Black Tuesday crash of 1929.
Most predictions about potential catastrophic events are made by extrapolating from the past, which does not always account for actual, current threats faced by systems, companies, communities, or nations.
Predictions modeled on the past will naturally include only events that have not ended civilization as we know it, so those predictions will be unlikely to account for events so disastrous that they could cause the annihilation of certain industries, entire societies, or even the entire planet.
Taleb quotes an aphorism derived from Voltaire's Candide, in which some believe ours is the best of all possible worlds. A philosopher tells Candide that this must be the best possible world because humans have two legs, and pants have two holes for those legs. The joke turns on the common mistake of conflating cause and effect, a mistake that becomes especially common when people attempt to craft a narrative for improbable events: the philosopher reasons as though legs were made to fit pants, when in reality pants were designed to fit legs. An apparently perfect match between an observation and an explanation does not mean the explanation aligns with reality.
Significant serendipitous discoveries and opportunities can also be Black Swans.
Black Swans can be positive as well as negative, and they can result from effort, from chance, or from a combination of the two: the discovery of the universe's background radiation, which provided evidence of the Big Bang, came about when scientists trying to eliminate noise from a radio antenna, even cleaning bird droppings off of it, realized the stubborn signal was real. Such events are sometimes attributed to the serendipitous presence of the right people at the right place and time. Creating more such opportunities requires people to make themselves available to opportunity: to make more connections, explore, and investigate.
Maximizing serendipity may seem like an oxymoron because serendipity is associated with luck, but serendipity is really a matter of how available someone is to luck. Someone with a skill who never leaves home is unlikely to serendipitously encounter someone who needs that skill. A well-known example is Christopher Columbus, who sailed across the Atlantic seeking what he believed to be a shorter passage to the East Indies. Columbus failed to find that passage, but instead initiated lasting European contact with Native peoples in the Americas, an unintended and unforeseen event with Black Swan implications for Columbus, the Spanish Empire, and indeed the entire world.
Most people are naturally overconfident in what they know, especially experts in changing fields. Often, giving someone more information on a topic makes them more confident about an uninformed determination, and unwilling to revise their determinations and predictions.
Even when people are given the opportunity to make the safest possible guess about an unknown, they have a very high error rate. Experts in probability and mathematics fail some simple tests of comprehension in their fields, and experts making predictions in many fields are not more accurate than someone with no expertise. In general, studies indicate that when someone has the chance to make a decision based on very limited evidence, they will not change that decision even if more evidence proves them wrong.
Politicians offer a clear demonstration of the principle that some people will be overconfident in a belief without evidence to support it and will view all evidence as support for that belief. They project this excess of confidence in order to appear capable to their constituents. Rather than admit to being mistaken, they may point to contradictory evidence that could be interpreted as support. For example, during his run for the 2016 Republican presidential nomination, Donald Trump stated that he had witnessed thousands of New Jersey residents celebrating the terrorist attacks on the World Trade Center on September 11, 2001. Despite thorough fact-checking that contradicted the claim, Trump's campaign continued to assert that his memory was infallible on this point.
The best strategy for withstanding a Black Swan event is to be honest about ignorance, to make decisions based on the available information and not long-term predictions, and to use methods such as insurance and redundancy to reduce the risk of possible Black Swan outcomes.
Anyone seeking to mitigate the risk of a Black Swan should not rely on predictions that are based on historical trends or which otherwise do not account for Black Swans. Risks can be partially mitigated with insurance for a particular consequence that might result from a Black Swan, or by building redundancies into systems to protect them from catastrophic failure.
As an example, a power generation company that observes the rising frequency of cyberattacks may imagine that such an attack could destroy its entire power grid and lead to completely unexpected problems. No risk model based on the past can give the company's decision-makers a clear idea of the type and scale of damages to expect, because the historical data provides no trends. If management sufficiently understands this, the company will not waste energy trying to determine when and how a specific attack might occur, and will instead make changes and investments to improve its overall robustness. The company might undertake a thorough audit of its connected systems, identify points of vulnerability, mitigate individual risks that might allow an attacker to gain access to customer data or to building systems such as elevators, and train staff on best practices during a cyberattack. It might then purchase an insurance policy that compensates it for certain damages a cyberattack could cause, such as fires, thereby limiting potential losses. The company might also invest in redundant generators and security systems so that it can continue producing power if a cyberattack knocks out vulnerable core systems.
The Gaussian bell curve assumes a situation where every event is independent of every other event and where the interval between possible observations is fixed. Very few data sets in reality can be effectively analyzed with the normal distribution.
A bell curve assumes that one observation does not depend on another, and that the possible distance between one observation and the next is knowable. The heights of people and the sizes of animals can fit a bell curve, but most other real-life data sets, particularly those relating to human behavior, would fit one only if the researcher removed outlier data from the set.
The reason heights work within a normal distribution is that one person's height does not noticeably affect anyone else's. No one grows taller because another person happened to be shorter. Every person's height is also measured in predictable units, and heights can be grouped to the closest inch by rounding up or down. However, the heights of all living things on the planet would not fit a bell curve, because such a data set ignores differences of scale. A bacterium's height would be meaningless next to that of a visible animal, and bacteria vastly outnumber every other living thing, which further invalidates the bell curve. The analyst would have to choose between arbitrarily ruling out the bacteria as outliers or ruling out the visible animals as outliers.
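A back-of-the-envelope calculation, using invented but plausible numbers, makes the problem concrete:

```python
# Invented, order-of-magnitude numbers: one million bacteria (~2 micrometres
# tall) mixed with one hundred humans (~1.7 metres tall)
n_bact, bact_m = 1_000_000, 2e-6
n_hum, hum_m = 100, 1.7

count = n_bact + n_hum
mean = (n_bact * bact_m + n_hum * hum_m) / count
var = (n_bact * (bact_m - mean) ** 2 + n_hum * (hum_m - mean) ** 2) / count
sd = var ** 0.5

print(f"mean 'height' = {mean:.5f} m, standard deviation = {sd:.4f} m")
print(f"a human sits {(hum_m - mean) / sd:.0f} standard deviations above the mean")
# The mean (~0.17 mm) describes neither organism, and under a Gaussian fit
# every human becomes an "impossible" outlier of roughly 100 sigma.
```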
Fractal distributions can approximate a model of scenarios that involve the possibility of Black Swans. At a distance, they appear uniform, but varying levels of inequality are exposed when subsets are sampled.
Fractals are geometric forms of infinite complexity at every scale. They are present in nature, and they demonstrate important features of scenarios vulnerable to disruption by a Black Swan. As with a fractal, and quite unlike a normal distribution, a data set that includes extreme outliers shows roughly the same pattern when only the highest values are displayed as when the size of the set is expanded exponentially.
When it comes to predictions, a fractal model cannot be used in the same way as a Gaussian bell curve. A normally distributed bell curve is an expression of probability; a fractal model is not. Analyses of fractals are also imprecise, so using a fractal model to predict future events can be of dubious benefit. For example, a bell-curve analysis would assign a small probability to anyone born in the United States living to 97. By contrast, fractal-style analysis lets the analyst update the most likely age to which a person will live based on that person's current age. At 20 years old, the likely age at death is in the 70s: another 50-plus years. But by the time that person is 96, the distribution has been adjusted, and a 96-year-old's expected age at death might be only 97: just one more year. Thus, readers seeking models for data that include outliers, including extreme outliers like Black Swans, may find fractal models more useful than the Gaussian bell curve.
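As a hedged illustration of this conditional updating, the sketch below uses invented parameters to contrast the expected amount beyond a threshold under a thin-tailed, bell-curve model and under a scale-free Pareto model. The shrinking remaining-lifespan pattern matches the thin-tailed behavior shown here; for scalable quantities such as wealth, the fat-tailed column applies and the expected excess grows with the threshold:

```python
import numpy as np

rng = np.random.default_rng(0)
thin = rng.normal(loc=70, scale=10, size=2_000_000)      # bell-curve model
fat = 70 * (rng.pareto(1.5, size=2_000_000) + 1)         # scale-free model

for x in (70, 90, 110):
    thin_excess = thin[thin > x].mean() - x
    fat_excess = fat[fat > x].mean() - x
    print(f"given a value beyond {x}: thin tail expects ~{thin_excess:5.1f} "
          f"more, fat tail expects ~{fat_excess:6.1f} more")
# Thin tail: the further out you already are, the less extra you expect
# (the shrinking remaining-lifespan pattern described above). Fat tail:
# expected excess grows with the threshold -- extremes beget larger extremes.
```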