Rejection sampling is a popular method for generating random variates. It’s based on the idea that, if you draw a candidate value from a distribution that is easy to sample and the candidate fails an acceptance test tied to the distribution you actually want, you can just discard it and try again until you find one that passes.
In this article, we will discuss the advantages and disadvantages of rejection sampling in research, along with when and how rejection sampling can be used in a study.
Rejection sampling is a Monte Carlo method for generating samples from a probability distribution. It involves drawing random candidates and rejecting those that fail an acceptance test, until you reach the number of samples you need.
It is a method for creating samples from one distribution (the target) by using an easier distribution (the proposal). For instance, imagine you have a coin that lands on heads 60% of the time.
You want to use this coin to produce fair, 50/50 outcomes, which the coin cannot give you directly. One way is to flip the coin twice: if the two flips differ, keep the result (heads then tails counts as “heads”, tails then heads counts as “tails”); if they match, discard the pair and flip again.
However, discarded flips are wasted work. With this coin, 52% of the pairs are rejected (0.6² + 0.4² = 0.52), so you need a little over four flips on average for each usable outcome.
On the other hand, if you had another coin that already had the desired ratio built in, it would be much easier to work with because you could just flip it and use the results you get. Rejection sampling is for the cases where no such coin exists: it takes an “easy-to-sample” distribution and uses the accept-or-reject step to correct for the difference between it and the target.
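As a toy illustration of the accept-or-reject idea with the 60% coin, the sketch below builds fair 50/50 outcomes from pairs of biased flips, keeping a pair only when its two flips differ. The function names, the seed, and the choice of a fair coin as the target are assumptions made for illustration.

```python
import random

def biased_flip(rng, p_heads=0.6):
    """One flip of the biased coin: True means heads."""
    return rng.random() < p_heads

def fair_flip(rng, p_heads=0.6):
    """Return (outcome, flips_used) for one fair outcome built from biased flips."""
    flips = 0
    while True:
        first = biased_flip(rng, p_heads)
        second = biased_flip(rng, p_heads)
        flips += 2
        if first != second:
            # Accept: heads-tails and tails-heads are equally likely.
            return first, flips
        # Reject: both flips matched, so flip another pair.

rng = random.Random(0)
results = [fair_flip(rng) for _ in range(20000)]
heads_share = sum(outcome for outcome, _ in results) / len(results)
mean_flips = sum(flips for _, flips in results) / len(results)
print(f"share of heads: {heads_share:.3f}")
print(f"average biased flips per outcome: {mean_flips:.2f}")
```

The share of heads should land close to one half, and the average number of flips shows the price paid for rejected pairs.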
Rejection Sampling has several advantages over other methods for sampling, some of which include:
Rejection Sampling has some disadvantages such as:
Rejection sampling is a type of sampling that’s often used when you’re estimating a quantity, but it’s not always the best option. However, you can use rejection sampling in the following cases:
You can also make use of rejection sampling when you want to:
The rejection sampling process consists of two steps. In the first step, a candidate sample is drawn from a proposal distribution that is easy to sample and has a known probability density function (pdf), g(x).
In the second step, the candidate is accepted or rejected at random: it is accepted with probability f(x) / (M g(x)), where f(x) is the pdf of the target distribution and M is a constant chosen so that f(x) ≤ M g(x) for every x. If the sample is accepted, it’s returned to the calling routine; if it’s rejected, you go back to the first step and select another sample.
The process works because the accepted samples follow the target pdf f(x) exactly, as long as M g(x) is at least as large as f(x) everywhere. To understand how this works, consider the following example.
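The goal of the sampling process is to generate random values that follow a normal distribution with mean 0 and standard deviation 1. These values will be generated by starting with samples from an exponential distribution with a mean of 1 and accepting each sample x with probability exp(-(x - 1)²/2), so samples near 1 are the most likely to be kept. Because the exponential distribution only produces non-negative values, each accepted sample is then given a random sign.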
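The exponential distribution has a closed-form inverse cumulative distribution function, which makes it easy to sample, while the normal distribution doesn’t. The same process can be used for any target distribution whose density you can evaluate, as long as you can find a proposal distribution that covers it.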
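The normal-from-exponential example can be sketched as follows. Since exponential samples are never negative, the sketch first produces the positive half of the normal distribution and then attaches a random sign. The acceptance probability exp(-(x - 1)²/2) comes from dividing the half-normal density by sqrt(2e/π) times the exponential density; the function name and seed are assumptions made for illustration.

```python
import math
import random

def standard_normal(rng):
    """Draw one N(0, 1) value by rejection from an Exp(1) proposal."""
    while True:
        x = rng.expovariate(1.0)                  # candidate from Exp(1), x >= 0
        accept_prob = math.exp(-0.5 * (x - 1.0) ** 2)
        if rng.random() < accept_prob:            # keep x as a half-normal draw
            sign = 1.0 if rng.random() < 0.5 else -1.0
            return sign * x

rng = random.Random(42)
draws = [standard_normal(rng) for _ in range(50000)]
mean = sum(draws) / len(draws)
std = math.sqrt(sum((d - mean) ** 2 for d in draws) / len(draws))
print(f"mean = {mean:.3f}, std = {std:.3f}")
```

The printed mean and standard deviation should come out close to 0 and 1. With this proposal, roughly three out of four candidates are accepted.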
Therefore, rejection sampling involves three steps:
In rejection sampling, the PDF gives the relative likelihood of different values, and it is used to work out the probability of accepting each candidate. If the candidate is accepted, you have a sample from your distribution; otherwise, you go back and start over.
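The accept-or-retry loop described above can be sketched generically. This is a minimal sketch, not a definitive implementation: the function `rejection_sample`, its parameters, and the triangular target f(x) = 2x used to exercise it are all hypothetical choices, and the caller is assumed to supply a constant m with target_pdf(x) ≤ m * proposal_pdf(x) everywhere.

```python
import random

def rejection_sample(target_pdf, propose, proposal_pdf, m, n, rng):
    """Return n draws from target_pdf, assuming target_pdf(x) <= m * proposal_pdf(x)."""
    samples = []
    while len(samples) < n:
        x = propose(rng)                                      # draw a candidate
        accept_prob = target_pdf(x) / (m * proposal_pdf(x))   # ratio in [0, 1]
        if rng.random() < accept_prob:                        # accept, else retry
            samples.append(x)
    return samples

# Hypothetical target: triangular density f(x) = 2x on [0, 1].
# Proposal: Uniform(0, 1) with density 1, so m = 2 bounds f everywhere.
rng = random.Random(1)
samples = rejection_sample(
    target_pdf=lambda x: 2.0 * x,
    propose=lambda r: r.random(),
    proposal_pdf=lambda x: 1.0,
    m=2.0,
    n=20000,
    rng=rng,
)
sample_mean = sum(samples) / len(samples)
print(f"sample mean = {sample_mean:.3f} (theory: 2/3)")
```

Passing the densities in as functions keeps the loop itself unchanged when the target or the proposal changes; only the constant m has to be worked out again.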
The rejection sampling algorithm is relatively simple. The process is as follows:
Identify the target density function, f(x). This function should describe the distribution you want your samples to follow. In other words, if you are trying to draw a random sample that is uniformly distributed between 1 and 10, then this function would be f(x) = 1/9 for x between 1 and 10, and 0 elsewhere.
Select a probability density function (p.d.f.), g(x), from which you can sample easily. The p.d.f. should be greater than 0 wherever f(x) is greater than 0, and it should not approach zero faster than f(x) does, so that f(x) ≤ M g(x) holds for some constant M (e.g., g(x) = 1 on the interval [0, 1]).
A constant p.d.f. works well when the target is confined to a bounded interval and has a bounded density; for a target defined on an unbounded range, choose a p.d.f. whose tails are at least as heavy as the target’s (this might seem like a tall order, but such functions do exist). The process of rejection sampling can be illustrated with an example.
Suppose we wish to sample from a uniform distribution in the interval [0, 1], using a normal distribution with a mean of 0 and a standard deviation of 1 as the source of candidates. The example is purely illustrative, since the acceptance step itself relies on a uniform random number; we accept or reject samples from the normal distribution until one of them is accepted.
The procedure is as follows:
Sample x from the normal distribution
Set y = f(x) / (M g(x)), where f is the density function of the target distribution, which in this case is the uniform density on [0, 1], g is the standard normal density, and M = 1/g(1) ≈ 4.13 is the smallest constant with f(x) ≤ M g(x). In other words, y = exp((x² - 1)/2) when x lies in [0, 1], and y = 0 otherwise.
Generate another uniformly distributed random variable u between 0 and 1. If u < y, accept x as a sample; otherwise reject it. Repeat until the desired number of samples is obtained.
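The procedure above can be sketched as follows, assuming the acceptance probability is the target density divided by M times the standard normal density, which simplifies to exp((x² - 1)/2) on [0, 1] and to 0 elsewhere. The function name and seed are assumptions made for illustration.

```python
import math
import random

def uniform_from_normal(rng):
    """Draw one Uniform(0, 1) value using standard normal candidates."""
    while True:
        x = rng.gauss(0.0, 1.0)                 # candidate from N(0, 1)
        if not 0.0 <= x <= 1.0:
            continue                            # target density is 0 here: reject
        y = math.exp(0.5 * (x * x - 1.0))       # f(x) / (M * g(x)), at most 1
        if rng.random() < y:
            return x

rng = random.Random(7)
values = [uniform_from_normal(rng) for _ in range(20000)]
mean = sum(values) / len(values)
var = sum((v - mean) ** 2 for v in values) / len(values)
print(f"mean = {mean:.3f}, variance = {var:.4f}")
```

A uniform distribution on [0, 1] has mean 1/2 and variance 1/12, so the printed values should be close to those. Only about one candidate in four is accepted, because most of the normal distribution’s mass lies outside [0, 1] or is thinned out by the test.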
Example 1
Imagine that you want to generate samples from the distribution shown in the graph below. The distribution has a sharp peak over the interval (0,1) but falls quickly to zero outside this range.
This distribution is difficult to sample directly because of its narrow peak. To sample this distribution using rejection sampling, we first need to choose an envelope distribution that does not drop to zero anywhere the target is positive.
In this example, we will use a uniform distribution that spans the entire range of the target distribution (in this case from 0 to 1), scaled by a constant so that it lies on or above the target everywhere. The envelope and target distributions are shown in the graph below.
Notice that there are no regions where the envelope distribution drops to zero within the region where the target distribution does not drop to zero, so this pair of distributions satisfies our requirements for rejection sampling. The narrower the peak, the larger the share of candidates that will be rejected.
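A sketch of Example 1 is given below. The article does not specify the target’s formula, so the code assumes a hypothetical peaked shape, (x(1 - x))^19 on the interval (0, 1), known only up to a constant factor. The envelope is a uniform distribution on (0, 1), scaled to the height of the peak.

```python
import random

A = 20  # hypothetical shape parameter; larger values give a narrower peak

def target_unnormalized(x):
    """Assumed peaked density on (0, 1), known only up to a constant factor."""
    return (x * (1.0 - x)) ** (A - 1)

PEAK = target_unnormalized(0.5)   # maximum of the target, reached at x = 0.5

def sample_peaked(n, rng):
    """Return (samples, acceptance_rate) using a Uniform(0, 1) envelope."""
    samples = []
    tries = 0
    while len(samples) < n:
        x = rng.random()                          # candidate from the envelope
        tries += 1
        if rng.random() < target_unnormalized(x) / PEAK:
            samples.append(x)                     # under the target curve: accept
    return samples, n / tries

rng = random.Random(3)
samples, rate = sample_peaked(10000, rng)
mean = sum(samples) / len(samples)
print(f"mean = {mean:.3f}, acceptance rate = {rate:.3f}")
```

With this assumed target, roughly one candidate in five is accepted. A flat envelope wastes most of its candidates on the low-density regions on either side of the peak, which is the usual trade-off for its simplicity.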
Example 2
Let’s say you applied to three different jobs at the same company, but were only offered one of them. You could say that you were rejected by the other two jobs and accepted by the third. As a loose analogy, each application is a candidate draw, the hiring decision is the accept-or-reject test, and you keep applying until one application is accepted.
Or suppose you’re trying to get a promotion at work and your boss says no, but then tells you about an opening at another company that could help you advance in your career. You can think of it as getting rejected by your boss, and then trying another candidate because of that rejection. The analogy is only a loose one: in rejection sampling, the decision is made at random, with a probability set by the target and envelope distributions.
Researchers should note that rejection sampling works by taking samples from an “envelope” distribution and accepting each one with a probability equal to the target density divided by the scaled envelope density at that point. If a sample is rejected, another one is drawn, and this repeats until one is accepted.