OpenHazards Forecasts

Forecasts versus predictions

An OpenHazards forecast: earthquakes are forecast as brightly colored spots.
While in common language “forecasting” can be used as a synonym for “predicting,” earthquake science demands precise and distinct definitions of these two terms. A seismic forecast specifies the odds, or probability, of an earthquake occurring at a given location, during a given time window, within a given magnitude range.

An example forecast: "There is a 40% probability that an earthquake having a magnitude between 6.5 and 7.0 will occur within a 20 km radius around location X during the next 3 months."

By contrast, a prediction specifies whether an earthquake either will or will not occur at a given location, during a given time window, within a given magnitude range.

An example prediction: "There will be a magnitude 5.0 or greater earthquake within a 50 km radius around location Y during the next year."

In short, a forecast is a probability (a percent chance), whereas a prediction is a binary statement (yes or no). An individual prediction can be validated by a single observation: either an earthquake did or did not occur when and where it was predicted. By contrast, a forecast cannot be validated by a single observation: if an earthquake occurs, that event neither proves nor disproves that there was a 40% probability it would occur. Forecasts can, however, be tested and validated by analyzing the results of many observations.
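As a rough illustration of validating a forecast against many observations (a generic sketch, not the OpenHazards testing procedure), the following Python snippet checks whether events occur about as often as a batch of hypothetical "40% chance" forecasts says they should:

```python
import random

random.seed(0)

# Hypothetical scenario: a method issues many "40% chance" forecasts.
# A single outcome proves nothing, but across many forecasts the observed
# frequency of events should approach the stated probability.
stated_probability = 0.40
n_forecasts = 1000

# Simulated outcomes: 1 if the forecast earthquake occurred, 0 if not.
# Here we draw them from the stated probability to illustrate agreement.
outcomes = [1 if random.random() < stated_probability else 0
            for _ in range(n_forecasts)]

observed_frequency = sum(outcomes) / n_forecasts
print(f"Stated probability: {stated_probability:.2f}")
print(f"Observed frequency: {observed_frequency:.2f}")
# If the observed frequency were far from 0.40 across many forecasts,
# the method would be unreliable at that probability level.
```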

How forecasts are tested

Forecasts can be validated by two processes, called backtesting and monitoring; OpenHazards uses both.

In backtesting, one point in history is chosen as a hypothetical "present time" for testing purposes. Around this point, the historical data is divided into a training period (prior data) and a testing period (posterior data). Forecasts are made using only the prior data and are evaluated against the events that actually occurred during the testing period. The accuracy of the forecasting method is then scored using a variety of statistical tests that compare the forecast with the actual posterior data. These tests measure resolution, or how well the forecast discriminates between alternative outcomes; reliability, or how closely the predicted frequency of events matches the observed frequency; and sharpness, or how far the forecast deviates from the mean. To be validated, a forecast must achieve a pre-determined level of accuracy for the testing period.
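One standard way to quantify these properties for binary-event forecasts is the Brier score and its Murphy decomposition. The sketch below is a generic illustration under that assumption, not necessarily the specific tests OpenHazards applies:

```python
from collections import defaultdict

def brier_decomposition(probs, outcomes, n_bins=10):
    """Murphy decomposition of the Brier score for binary-event forecasts.

    Forecast probabilities are grouped into bins; the full Brier score is
    reliability - resolution + uncertainty (lower reliability and higher
    resolution are better).
    """
    n = len(probs)
    base_rate = sum(outcomes) / n
    bins = defaultdict(list)
    for p, o in zip(probs, outcomes):
        bins[min(int(p * n_bins), n_bins - 1)].append((p, o))

    reliability = resolution = 0.0
    for members in bins.values():
        n_k = len(members)
        mean_p = sum(p for p, _ in members) / n_k    # mean forecast in bin
        mean_o = sum(o for _, o in members) / n_k    # observed frequency in bin
        reliability += n_k * (mean_p - mean_o) ** 2    # calibration error
        resolution += n_k * (mean_o - base_rate) ** 2  # discrimination skill
    uncertainty = base_rate * (1.0 - base_rate)
    return reliability / n, resolution / n, uncertainty

def sharpness(probs):
    """Variance of the issued probabilities: how far they spread from their mean."""
    mean_p = sum(probs) / len(probs)
    return sum((p - mean_p) ** 2 for p in probs) / len(probs)

# Toy data: six forecasts and whether each forecast event occurred (1) or not (0).
probs = [0.1, 0.4, 0.8, 0.4, 0.2, 0.9]
outcomes = [0, 1, 1, 0, 0, 1]
print(brier_decomposition(probs, outcomes))
print(sharpness(probs))
```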

Backtesting is used for forecast methods that rely on data recorded over a long period of time; OpenHazards' earthquake forecasts, for example, use the historical earthquake record. Backtesting essentially pretends that the forecast is being run at some time in the past, so no data collected after that time may be used. For example, to backtest a forecasting method from the year 2002, only data collected before 2002 is fed into the method. This is the important point: had the researchers already developed their forecast method in 2002, they could have made this exact forecast in 2002, because it uses only the data they would have had then. The resulting retroactive forecast runs forward from the cutoff (from 2002 onward in the example) and can be compared with the actual record from the cutoff to the present day. Backtesting is therefore, at least in theory, no different from having made the forecast in the past and waiting until the present to see what actually happened and how accurate the forecast was.
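In code, the cutoff logic amounts to filtering the catalog before fitting anything. The skeleton below uses a toy catalog and a deliberately crude stand-in for a real forecast model:

```python
from datetime import datetime

# Toy catalog of (time, magnitude) events; a real backtest would read these
# from a catalog such as ANSS.
catalog = [
    (datetime(1995, 3, 1), 5.1),
    (datetime(1999, 7, 12), 6.2),
    (datetime(2003, 1, 20), 5.8),
    (datetime(2007, 9, 5), 6.6),
]

cutoff = datetime(2002, 1, 1)  # the hypothetical "present time"

# Only events before the cutoff may inform the forecast.
training = [(t, m) for t, m in catalog if t < cutoff]
# Events after the cutoff are what the retroactive forecast is scored against.
testing = [(t, m) for t, m in catalog if t >= cutoff]

def fit_and_forecast(events, threshold=6.0):
    """Crude stand-in for a real model: historic annual rate of M >= threshold."""
    years = (events[-1][0] - events[0][0]).days / 365.25
    count = sum(1 for _, m in events if m >= threshold)
    return min(count / years, 1.0)

annual_prob = fit_and_forecast(training)
observed = sum(1 for _, m in testing if m >= 6.0)
print(f"Forecast annual probability of M>=6: {annual_prob:.2f}")
print(f"M>=6 events observed after the cutoff: {observed}")
```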

In monitoring, actual forecasts for future events are computed, and then actual events are observed in real time. The results are scored using the same types of statistical analysis used in backtesting. Many researchers consider monitoring a higher level of validation than backtesting, since the "answer" is not known in advance; such knowledge might influence decisions made while developing the forecast method. However, monitoring can take many years of real-time observation to determine the accuracy of a forecast method, whereas backtesting results can be computed within days, weeks, or at most a few months.

The OpenHazards method and other forecasts

Photograph from a 2001 trenching study near Tule Pond, Fremont, California.
Local earthquake forecasts have been used for highly affected areas like California for several decades. The Working Group on California Earthquake Probabilities, which includes scientists from the US Geological Survey, the California Geological Survey, and many experts from academia and industry, has been developing long-term earthquake forecasts for regions in California since 1988. The calculations are based on historic averages of major earthquakes, as well as data from modern earthquake-detection instruments and from paleoseismic trenching studies (digging trenches across active fault traces to study evidence of ancient earthquakes). The official results of these methods are 30-year probabilities for major earthquakes (typically magnitude 6.7 and larger) on major faults in California. These forecasts are the basis of California's earthquake insurance rates.

The process of computing these probabilities is an extensive consultation and collaboration among more than one hundred scientists. Consensus of expert opinion plays a significant role in deciding on individual probability values. Producing a single new forecast this way requires coordinated effort from the seismology community and takes several years.
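The working group's actual calculations are far more elaborate, but the simplest connection between a historic average and a 30-year probability is a time-independent (Poisson) estimate, sketched here with a hypothetical recurrence interval:

```python
import math

# Bare-bones illustration only: the working group's real models combine fault
# data, paleoseismic evidence, and expert judgment. A time-independent
# estimate converts a mean recurrence interval into a 30-year probability.
mean_recurrence_years = 150.0   # hypothetical historic average for one fault
window_years = 30.0

rate = 1.0 / mean_recurrence_years
prob_30yr = 1.0 - math.exp(-rate * window_years)
print(f"30-year probability: {prob_30yr:.1%}")   # about 18%
```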

This image, from a recent OpenHazards presentation, shows NTW forecast earthquake probabilities for eight major Japanese cities; OpenHazards can make such data-driven forecasts worldwide.
The OpenHazards method is innovative in that it can be applied in a uniform way worldwide and updated daily, adapting to changes in earthquake probabilities implied by the locations and magnitudes of the thousands of tremors detected by seismographs each day. These methods are data-driven, using the online ANSS earthquake catalog together with well-known observational laws such as the Gutenberg-Richter relation and the Omori-Utsu aftershock frequency law. Our methods fit the parameters of these laws to past observations in order to compute future probabilities.
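As an illustration of fitting such a law to catalog data (a textbook example, not the OpenHazards implementation), the snippet below applies Aki's maximum-likelihood estimator for the Gutenberg-Richter b-value to a handful of synthetic magnitudes:

```python
import math

# Synthetic magnitudes standing in for a real catalog such as ANSS.
magnitudes = [3.1, 3.2, 3.4, 3.3, 3.9, 3.2, 3.6, 3.1, 4.4, 3.5]
completeness = 3.0   # assumed magnitude of completeness, Mc

# Aki's maximum-likelihood estimator for b in log10 N(>=M) = a - b*M.
mags = [m for m in magnitudes if m >= completeness]
b_value = math.log10(math.e) / (sum(mags) / len(mags) - completeness)
print(f"Estimated b-value: {b_value:.2f}")   # about 0.9 for these data
```

Tectonically typical b-values are near 1; the Omori-Utsu parameters can be fit to observed aftershock sequences in a similar fashion.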

The OpenHazards forecast also notably has a working "memory": the model considers recent earthquake activity when computing future risk. Statistical forecasting models are based on some type of statistical distribution, which defines the odds that a certain number of "events" (in this case, earthquakes) will happen within a given period of time. Earthquake forecasting models commonly use a Poisson distribution, but the OpenHazards model uses a new kind of Weibull distribution. One important difference between the two is that the Weibull distribution allows the odds of an earthquake happening to change over time based on the occurrence of past earthquakes, while Poisson probabilities never change over time. OpenHazards' "Natural Time Weibull" (NTW) method reflects the changing probabilities for both primary earthquakes and their aftershocks.
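The exact NTW formulation is given in the published papers; the generic comparison below simply illustrates the behavioral difference described above. Under a Weibull waiting-time model with shape parameter greater than 1, the conditional probability of an event in the next interval grows as quiet time accumulates, whereas the Poisson case (shape = 1) never changes:

```python
import math

def conditional_prob(elapsed, window, scale, shape):
    """P(event within `window`, given quiet for `elapsed`), Weibull model.

    Survival S(t) = exp(-(t/scale)**shape); shape == 1 reduces to the
    constant-rate Poisson case. Generic illustration only, not the
    published NTW formulation.
    """
    def survival(t):
        return math.exp(-((t / scale) ** shape))
    return 1.0 - survival(elapsed + window) / survival(elapsed)

scale, window = 100.0, 10.0  # hypothetical units, e.g. years
for elapsed in (0.0, 50.0, 150.0):
    poisson = conditional_prob(elapsed, window, scale, shape=1.0)
    weibull = conditional_prob(elapsed, window, scale, shape=1.5)
    print(f"after {elapsed:5.1f} quiet: Poisson {poisson:.3f}  "
          f"Weibull {weibull:.3f}")
```

The printed Poisson probability is the same (about 0.095) in every row, while the Weibull probability rises from about 0.03 to about 0.17 as elapsed quiet time grows.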

Papers describing the methods underlying the OpenHazards forecast have been published in the peer-reviewed literature for over a decade, and the first of several papers describing details of the NTW forecast has just been accepted for publication as well. Our innovative computational work has made it possible to use these well-documented methods to produce cutting-edge daily forecasts.
