The pros and cons of Latin Hypercube sampling

Apr 07, 2015 1366 Comments

Most risk analysis simulation software products offer Latin Hypercube Sampling (LHS). It is a method for ensuring that each probability distribution in your model is evenly sampled which at first glance seems very appealing.

The technique dates back to 1980[1] (even though the @RISK manual[2] describes LHS as “a new sampling technique”) when computers were very slow, the number of distributions in a model was extremely modest and simulations took hours or days to complete. It was, at the time, an appealing technique because it allowed one to obtain a stable output with a much smaller number of samples than simple Monte Carlo simulation, making simulation more practical with the computing tools available at the time.

However, desktop computers are now at least 1,000 times faster than the early 1980s, and the value of LHS has disappeared as a result. LHS does not deserve a place in modern simulation software. We are often asked why we don’t implement LHS in our ModelRisk software, since nearly all other Monte Carlo simulation applications do, so we thought it would be worthwhile to provide an explanation here.

What is Latin Hypercube sampling?

Latin Hypercube Sampling (LHS) is a type of stratified sampling. It works by controlling the way that random samples are generated for a probability distribution. Probability distributions can be described by a cumulative curve, like the one below. The vertical axis represents the probability that the variable will fall at or below the horizontal axis value. Imagine we want to take 5 samples from this distribution. We can split the vertical scale into 5 equal probability ranges: 0-20%, 20-40%, …, 80-100%. If we take one random sample within each range and calculate the variable value that has this cumulative probability, we have created 5 Latin Hypercube samples for this variable:

Description: https://media.licdn.com/mpr/mpr/p/3/005/071/17f/2da2a6e.jpg

When a model contains just one variable, the distribution can be stratified into the same number of partitions as there are samples: so, if you want 1000 samples you can have 1000 stratifications, and be guaranteed that there will be precisely 1 sample in each 0.1% of the cumulative probability range.

But risk analysis models don’t have just one distribution – they have many. LHS controls the sampling of each distribution separately to provide even coverage for each distribution individually, but does not control the sampling of combinations of distributions. This means that the extra precision offered by LHS over standard Monte Carlo sampling rapidly becomes imperceptible as the number of distributions increases.

What you gain from LHS

You gain a small level of precision, but it is a very small level. To illustrate this, consider a model that is summing nine normal distributions as follows:

Description: https://media.licdn.com/mpr/mpr/p/4/005/071/180/0190829.jpg

I’ve chosen to sum nine distributions because this represents a very small model – nearly all models will have more than this number of variables - and the extra precision from LHS is more apparent with small numbers of variables. I’ve chosen a model that adds Normal distributions because we already know from probability theory that the resultant sum is another Normal distribution with mean = 81, standard deviation = 3.836665, and P95 (95 percentile) = 87.31075.

I ran 500 samples of this simulation model using both Monte Carlo sampling and Latin Hypercube sampling. I chose 500 samples because this is a very modest number (simulation models are typically run for 3,000 – 5,000 samples or iterations) and LHS offers the greatest improvement in precision over Monte Carlo sampling when the number of samples is small. I repeated the exercise 200,000 times – each time recording the resultant distribution’s standard deviation and P95, both measures of spread which is what, as risk analysts, we are really interested in. The following plots present the findings.

Description: https://media.licdn.com/mpr/mpr/p/4/005/071/17e/3fd4923.jpg

Plot 1: This shows that if you run a model for 500 samples, the resultant standard deviation can vary from around 3.5 to 4.2, whereas the theoretical value (i.e. the value you’d achieve with an essentially infinite number of samples) is 3.836665. There is essentially no difference between the LHS and Monte Carlo simulation results.

Description: https://media.licdn.com/mpr/mpr/p/3/005/071/17f/2628cd7.jpg

Plot 2: This shows that if you run a model for 500 samples, the resultant P95 can vary from around 86.4 to 88.2, whereas the theoretical value (i.e. the value you’d achieve with an essentially infinite number of samples) is 87.3108. There is a small increase in precision by using LHS over Monte Carlo simulation but this is drowned out by the imprecision of taking so few (500) samples.

In summary: the increase in precision offered by LHS is extremely modest, even when one applies it to simulations where it offers the greatest benefit (i.e. few distributions and few samples) and this increase in precision is trivial in comparison to the imprecision of the results achieved from running so few samples.

What you lose with LHS

LHS would still be marginally beneficial in simulation if it did not bring with it a number of restrictions and penalties, some of which are very important. The following list describes some of those restrictions:

Memory and speed

With LHS, the stratified sampling has to be done for each distribution prior to starting the simulation. That means, for example, that if you have 100 distributions in your model and you want to run 5,000 samples, LHS will have to generate, shuffle and store 500,000 random numbers prior to starting the simulation. That is why, with most simulation software products, you will experience a long delay before the simulation of large models starts. The Crystal Ball manual[3] explains the memory issue: “The increased accuracy of this method comes at the expense of added memory requirements to hold the full Latin Hypercube sample for each assumption.”

LHS also only works if one generates samples from probability distributions using the inversion method described above (i.e. reading backwards along the cumulative probability curve). There are many excellent algorithms for generating random samples from distributions that are as precise as the inversion method or even better and can be hundreds of times faster than the inversion method, but because they don’t use the inversion method they cannot be implemented with LHS. Here are just two examples:

  • The normal and lognormal distributions

The normal and lognormal probability distributions are among the most commonly used in risk analysis, yet there is no equation for their cumulative probability curves – which means that applying LHS relies on approximations to those curves. However, if you don’t use LHS, there are several much faster and perfectly precise algorithms for generating these variables.

  • The Negative Binomial distribution

The NegBin distribution is discrete, and discrete distributions are generally much harder to sample from than continuous distributions (which is probably why Crystal Ball limits the allowed values for parameters of a lot of its discrete distributions[4]). There does exist a cumulative probability function for the NegBin that looks like this:

Description: https://media.licdn.com/mpr/mpr/p/2/005/071/180/3eb9f30.jpg

The inversion method used by LHS generates a cumulative probability that has to be compared with the above F(x) value. Let’s say the x value that matches the cumulative probability is 200 – the inversion method requires the computer perform 201 summations to be able to find x. That is a lot of calculations! The number of calculations is proportional to x, so if x is very large the simulation time can be very long indeed. In contrast, if one is using Monte Carlo sampling, there is a very clever method for generating NegBin samples that places no restrictions on the parameters and that simulates extremely fast no matter what the value of x.

Correlation

Up until the introduction of ModelRisk, every Monte Carlo Excel add-in had used the rank order correlation method for correlating variables. It is another simulation technique dating back to the early ‘80s[5]. @RISK, Crystal Ball, and nearly all other Monte Carlo simulation add-ins still use this very old and inaccurate method. Using one of these products, when a user specifies in a model that two variables are 90% correlated the software assumes that the correlation looks like this:

Description: https://media.licdn.com/mpr/mpr/p/1/005/071/182/1b34034.jpg

Risk modelers seem to accept having just one correlation pattern available to them, probably because they are unaware that others exist. To illustrate how limiting the rank order correlation method is, the following scatter plots show data generated from different copulas (proper correlation models used in ModelRisk) which all have the same 90% rank order correlation as the plot above:

Description: https://media.licdn.com/mpr/mpr/p/2/005/071/183/26beb20.jpg

The implications of using the wrong correlation structure can be really enormous - in fact it was one of the critical factors in the financial collapse of 2007/8. Copulas can also be extremely flexible - for example, ModelRisk's unique empirical copula will statistically match any correlation pattern exhibited by data with any number of variables no matter how unusual those patterns might be. The following plots show the rather unusual (we made it up for fun) correlation structure between three variables. The ‘observations’ are plotted as red points, the simulated copula values are in blue. This technique, like all the copula methods, would not be possible with LHS:

Description: https://media.licdn.com/mpr/mpr/p/3/005/071/184/0643009.jpg

Precision of results

The more samples (iterations) one performs in a simulation model, the closer the results approach the theoretical distribution that would be given with an infinite number of samples. The proximity of the actual results to the theoretical is called the level of precision. There are statistical tests for determining the level of precision that one has achieved by running a Monte Carlo simulation. For example, one can calculate that, with 90% confidence, the mean of a simulation output is within +/- $100 of the true mean using precision control functions and allows the user to tell the software to keep running the model until one or more precision criteria have been achieved. However, no such statistical tests are available if one uses LHS.

Stopping the simulation run before completion

A simulation run is often stopped before the pre-determined number of samples is complete, either because the results already look fine, we run out of time or patience, or the precision control determines that the required level of precision has been achieved. LHS relies on the model being run for the entire number of samples originally specified, and will provide little if any extra precision if this does not occur.

Extending the number of samples

It often happens that one runs say 5,000 samples of a model and after reviewing the results, we think it worth running another 1,000 or so to get better looking graphs or more precise statistics (with ModelRisk, clicking the Run button again after a simulation will give a pop-up dialog asking if you want to restart or continue the simulation). While this is perfectly possible with both Monte Carlo sampling and LHS, extending a simulation requires that LHS makes a new set of stratifications which will overlap with the old reducing any benefit that LHS might offer.

So when is Latin Hypercube Sampling useful?

LHS is useful in a couple of particular circumstances:

  1. You have only one or two distributions in your model and you need the answers very fast. In this situation, for most models the mean will stabilise more quickly with LHS, though the spread and shape of the output distribution will not stabilise much more quickly than with Monte Carlo sampling, and it is the spread and the tails of the output distribution that we are most concerned about in risk analysis. ModelRisk can simulate using Monte Carlo sampling a model with two distributions 100,000 times in under 11 seconds, or 10,000 times in a second, by which time the precision of the results will be essentially indistinguishable from LHS anyway.
  2. You are using Monte Carlo sampling to perform a numerical integration. A very specific technical situation that mostly applies in scientific and engineering work, but if you need to do a numerical integration there are better methods than simulation (for example the VoseIntegrate function) which will give a far greater precision than is possible to achieve with simulation, and performs the calculation in a fraction of a second.

[1] Iman, R. L., Davenport, J. M., and Zeigler, D. K. 'Latin Hypercube Sampling (A Program User's Guide)': Technical Report SAND79-1473, Sandia Laboratories, Albuquerque (1980).

[2] @RISK Users Manual. Palisade Corporation Page 648.

[3] Oracle Crystal Ball User's Guide. Release 11.1.2.2

[4] Oracle Crystal Ball User's Guide. Release 11.1.2.2

[5] Iman, R. L. and Conover, W. J., (1982). 'A Distribution-Free Approach to Inducing Rank Order Correlation Among Input Variables', Commun Statist-Simula Computa 11(3) 311-334.

David Vose

Source: https://www.linkedin.com/pulse/20140708131747-483951-the-pros-and-cons-of-latin-hypercube-sampling

Tags:

0 Comment on this post