RLab 2: Probability Distributions
Setup
If you want to use the custom R Markdown templates for this course, install the following packages:
install.packages("remotes") # Allows you to install packages from GitHub
remotes::install_github("inqs909/inqstools") # Installs my R package for the course
Probability
We will discuss several R functions used to work with different probability distributions.
Distributions
To see what distributions R supports, use
?Distributions
or
help(Distributions)
for more information. Every distribution has four functions associated with it. The letter at the beginning of the function name indicates what the function does.
Letter | Functionality |
---|---|
“d” | returns the height of the probability density function (the probability of a value for a discrete distribution) |
“p” | returns the cumulative distribution function value |
“q” | returns the inverse cumulative distribution function value (percentiles) |
“r” | returns randomly generated numbers |
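As a minimal sketch of the four prefixes, the block below applies each one to the standard normal distribution. The standard normal and the seed value are illustrative choices, not part of the lab.

```r
# The four prefixes applied to the normal distribution
# (defaults: mean = 0, sd = 1, i.e. the standard normal).
height <- dnorm(0)      # density height at 0, equal to 1 / sqrt(2 * pi)
cum    <- pnorm(1.96)   # P(Z <= 1.96), approximately 0.975
perc   <- qnorm(0.975)  # 97.5th percentile, approximately 1.96

set.seed(1)             # arbitrary seed so the draws can be reproduced
draws  <- rnorm(3)      # three random values from N(0, 1)

height
cum
perc
draws
```

The same pattern applies to the other distributions listed under `?Distributions`, for example `dexp()`, `pexp()`, `qexp()`, and `rexp()` for the exponential distribution.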
Probabilities
R can compute the probabilities of a distribution given the correct parameters. If you need a cumulative probability, put `p` in front of the distribution name. If you need the probability of a single value from a discrete distribution, put `d` in front of the distribution name. Below are a few examples.
To find \(P(X \leq 3)\) where \(X \sim N(5,2)\), we use the `pnorm()` function and set `q = 3`, `mean = 5`, and `sd = sqrt(2)`. The second parameter of \(N(5,2)\) is the variance, so the standard deviation is \(\sqrt{2}\).
pnorm(3, 5, sqrt(2))
[1] 0.0786496
To find \(P(X \geq 7)\) where \(X \sim N(5, 2)\), we use `pnorm()` and set `q = 7`, `mean = 5`, `sd = sqrt(2)`, and `lower.tail = FALSE`.
pnorm(7, 5, sqrt(2), lower.tail = FALSE)
[1] 0.0786496
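The examples above cover one-sided probabilities for a continuous distribution. As a hedged sketch, the block below shows two related cases: an interval probability computed as a difference of two `pnorm()` values, and a discrete probability computed with the `d` prefix. The interval \([3, 7]\) and the \(Binomial(10, 0.5)\) distribution are assumptions chosen for illustration.

```r
# Interval probability for a continuous distribution:
# P(3 <= X <= 7) = P(X <= 7) - P(X <= 3) for X ~ N(5, 2).
p_interval <- pnorm(7, mean = 5, sd = sqrt(2)) -
  pnorm(3, mean = 5, sd = sqrt(2))

# Discrete distribution (illustrative): Y ~ Binomial(10, 0.5).
p_point <- dbinom(4, size = 10, prob = 0.5)  # P(Y = 4)  = 210 / 1024
p_cum   <- pbinom(4, size = 10, prob = 0.5)  # P(Y <= 4) = 386 / 1024

p_interval
p_point
p_cum
```

For a continuous distribution, \(P(X = a) = 0\), so it does not matter whether the endpoints are included. For a discrete distribution it does matter, which is why `dbinom()` and `pbinom()` give different answers here.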
Percentiles
The \(100p\)th percentile is the value \(x_0\) that satisfies \(P(X \le x_0)=p\).
To find the \(95^{th}\) percentile of a \(N(10, 5)\) distribution, we use `qnorm()` and set `mean = 10`, `sd = sqrt(5)`, and `p = 0.95`.
qnorm(.95, 10, sqrt(5))
[1] 13.678
To find the \(95^{th}\) percentile of a \(Gamma(6,9)\) distribution, we use `qgamma()` and set `shape = 6`, `scale = 9`, and `p = 0.95`.
qgamma(.95, shape = 6, scale = 9)
[1] 94.61731
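As a quick check on the percentile examples, the sketch below confirms that the `q` and `p` functions undo each other, and that `scale = 9` in `qgamma()` is the same as `rate = 1/9`. The variable names are hypothetical.

```r
# q and p functions are inverses: plugging the 95th percentile
# back into pnorm() recovers the probability 0.95.
x95 <- qnorm(0.95, mean = 10, sd = sqrt(5))
p_back <- pnorm(x95, mean = 10, sd = sqrt(5))

# In qgamma(), the third positional argument is the rate, not the scale,
# so the scale must be passed by name (or converted to rate = 1 / scale).
q_scale <- qgamma(0.95, shape = 6, scale = 9)
q_rate  <- qgamma(0.95, shape = 6, rate = 1 / 9)
q_pos   <- qgamma(0.95, 6, 9)  # treats 9 as the rate: a different distribution

p_back
all.equal(q_scale, q_rate)
q_pos
```

Naming the arguments, as the lab does with `shape = 6, scale = 9`, avoids accidentally passing a scale where R expects a rate.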
Random Number Generator
R is capable of generating random numbers. For example, if we want to generate a random sample of size fifty from a normal distribution with mean 2.8 and variance 1.3, we use `rnorm()`. To generate a random sample from any distribution, use the distribution name with `r` in front of it.
Let’s first generate the random sample of fifty from \(X \sim N(2.8, 1.3)\). This is done with `rnorm()`, setting `n = 50`, `mean = 2.8`, and `sd = sqrt(1.3)`.
rnorm(50, 2.8, sqrt(1.3))
[1] 2.3984352 1.4008082 1.2924412 3.4322535 3.6152810 1.9523617 4.1170462
[8] 3.7906858 3.0900068 5.1273407 3.0034423 3.6245478 3.6448887 2.4031711
[15] 2.0416685 4.2866305 2.8720997 3.9598131 2.6707746 3.9459731 3.5395160
[22] 4.1360112 2.6852912 3.4666843 1.2614398 1.2829051 4.2200997 1.7934212
[29] 4.1501066 2.9224111 4.4917564 4.0302493 3.4382405 4.1481276 0.5492955
[36] 1.8696131 4.0539968 4.7393620 0.6735766 2.3967131 2.8062328 2.2085940
[43] 1.4721390 3.0471010 0.9531207 2.6963201 1.9368524 3.2875979 2.2220100
[50] 3.4363829
Now let’s generate a random sample of 100 from \(X \sim Beta(2,4)\). This is done with `rbeta()`, setting `n = 100`, `shape1 = 2`, and `shape2 = 4`.
rbeta(100, 2, 4)
[1] 0.45048094 0.29911046 0.39158462 0.23092726 0.15540259 0.16178897
[7] 0.31284933 0.63717347 0.26597619 0.20518293 0.64686145 0.22807687
[13] 0.49658783 0.53139689 0.59426850 0.36964069 0.21429172 0.27978432
[19] 0.58111280 0.15238081 0.34764626 0.10334779 0.05634440 0.22302732
[25] 0.34684400 0.44708954 0.37575899 0.22659533 0.53716318 0.15734513
[31] 0.17497741 0.19120091 0.56198257 0.56324203 0.33913040 0.65788353
[37] 0.22170993 0.49862582 0.57057629 0.05067749 0.40472211 0.68350947
[43] 0.11824307 0.06236188 0.15753464 0.54591359 0.33642585 0.29199375
[49] 0.05156889 0.27769844 0.08791769 0.54504105 0.07507829 0.36047290
[55] 0.37077282 0.24285707 0.19516605 0.59634502 0.17325232 0.67318238
[61] 0.30216733 0.18851617 0.12069659 0.55605024 0.28275480 0.33518883
[67] 0.18489209 0.01219168 0.19893678 0.20761076 0.38899473 0.13329773
[73] 0.40731814 0.21692830 0.80399415 0.15949113 0.42604846 0.38839914
[79] 0.43795522 0.40032065 0.41024243 0.52763295 0.39468933 0.28112947
[85] 0.54849943 0.17862052 0.29072740 0.21216697 0.46346533 0.54896485
[91] 0.68470050 0.17837589 0.17619992 0.26564852 0.16730040 0.46155221
[97] 0.27720365 0.80356860 0.05962084 0.37511875
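Because the samples are random, running the code above again will give different numbers. A minimal sketch of how to make a sample reproducible with `set.seed()` follows. The seed value 2023 and the sample size of 5 are arbitrary choices for illustration.

```r
# Setting the same seed before each call reproduces the same draws.
set.seed(2023)  # arbitrary seed
a <- rnorm(5, mean = 2.8, sd = sqrt(1.3))

set.seed(2023)  # reset to the same seed
b <- rnorm(5, mean = 2.8, sd = sqrt(1.3))

identical(a, b)  # TRUE: both samples contain exactly the same values
```

Setting a seed at the top of an RMD file means the document produces the same output every time it is knit.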
Histograms
Histograms are used to plot the frequencies of observed values of a random variable. This provides a rough picture of what the distribution looks like. Below, we use the `hist()` function to plot the distribution of a random sample of 1000 generated from an \(Exp(1.3)\) distribution:
x <- rexp(1000, 1.3)
hist(x)
Looking at the plot above, we can see that the most common values are close to 0 and that the frequencies fall off quickly as the values increase. This is to be expected from an exponential distribution with rate 1.3, whose expected value is \(E(X)=1/1.3\approx 0.77\). Another way to roughly estimate the expected value is by taking the mean:
mean(x)
[1] 0.7588649
Notice how close it is to the expected value. If we are interested in computing \(E(X^2)\), we can roughly estimate it by squaring all the values and taking their mean:
mean(x^2)
[1] 1.158253
Now, we know that the variance of an exponential distribution with rate \(\lambda\) is \(1/\lambda^2\), which gives \(Var(X)=1/1.3^2\approx 0.59\) for the above distribution. We can roughly estimate the variance by combining the rough estimates as \(E(X^2)-E(X)^2\):
mean(x^2) - mean(x)^2
[1] 0.5823773
While it is not exact, it can give you a rough idea.
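To see how rough these estimates are, the sketch below places each one next to its theoretical value for an exponential distribution with rate \(\lambda\): \(E(X)=1/\lambda\), \(E(X^2)=2/\lambda^2\), and \(Var(X)=1/\lambda^2\). It also overlays the true density on the histogram. The seed is an arbitrary assumption, so the numbers will differ from the output shown above.

```r
set.seed(1)                   # arbitrary seed, for reproducibility only
rate <- 1.3
x <- rexp(1000, rate = rate)  # same setup as the sample above

# Rough estimates next to the theoretical values.
c(estimate = mean(x),                 theory = 1 / rate)    # E(X)
c(estimate = mean(x^2),               theory = 2 / rate^2)  # E(X^2)
c(estimate = mean(x^2) - mean(x)^2,   theory = 1 / rate^2)  # Var(X)

# Histogram on the density scale with the true Exp(1.3) density on top.
hist(x, freq = FALSE, main = "Sample from Exp(1.3)")
curve(dexp(x, rate = rate), add = TRUE)
```

Note that `var(x)` divides by \(n-1\) rather than \(n\), so it differs slightly from `mean(x^2) - mean(x)^2`; with 1000 observations the difference is negligible.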
Problems
Use an RMD file to answer the following questions:
- Find the following probabilities:
  - \(X\sim N(3, 3.5)\); Find \(P(2 \le X \le 6)\)
  - \(X\sim Beta(6,8)\); Find \(P(0.5 \le X \le 0.7)\)
  - \(X\sim Unif(6,25)\); Find \(P(16 \le X \le 23)\)
- Find the following percentiles:
  - \(X\sim Unif(1,5)\); Find the \(68\)th percentile.
  - \(X\sim N(17, 6)\); Find the \(52\)nd percentile.
  - \(X\sim Exp(1)\); Find the \(43\)rd percentile.
- Generate 50 realizations of the following random variables:
  - \(X\sim Weibull(3,2)\)
  - \(X\sim Laplace(2,1/2)\) (see footnote 1)
  - \(X\sim LogNorm(0.5, \sqrt 2)\)
- Generate 500 realizations of the following random variables and create a histogram:
  - \(X\sim Exp(1.5)\)
  - \(X\sim Exp(2)\)
  - \(X\sim Exp(5)\)
  - \(X\sim Exp(10)\)
- Generate 5000 realizations of \(X\sim Gamma(7, 12)\) and compute the following rough estimates:
  - \(E(X)\)
  - \(E(X^2)\)
  - \(Var(X)\)
Submit your Lab assignment as an RMD file in Canvas on 3/10/2023 by 11:59 PM.
Footnotes
1. You will need to install the VGAM R package.