Marginal Distributions
Conditional Distributions
Independence
Expectations
Covariance
Definition: If \(X\) and \(Y\) take values in countable sets \(\mathcal{X}, \mathcal{Y}\), then
\[ p_{X,Y}(x,y) = \Pr(X=x, Y=y), \quad (x,y)\in \mathcal{X}\times \mathcal{Y}. \]
Properties:
\[ p_{X,Y}(x,y)\ge 0 \quad\text{and}\quad \sum_{x\in\mathcal{X}}\sum_{y\in\mathcal{Y}} p_{X,Y}(x,y)=1. \]
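As a quick numerical illustration, here is a minimal R sketch that stores a made-up joint pmf as a matrix (the probabilities are purely illustrative) and checks both properties:

# Hypothetical joint pmf of X (rows) and Y (columns); values are made up
p_xy <- matrix(c(0.10, 0.20, 0.05,
                 0.15, 0.30, 0.20),
               nrow = 2, byrow = TRUE)
all(p_xy >= 0)   # non-negativity: TRUE
sum(p_xy)        # probabilities sum to 1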
Definition: If \((X,Y)\) is continuous, it is described by a joint density \(f_{X,Y}(x,y)\) such that for any region \(A \subseteq \mathbb{R}^2\):
\[ \Pr((X,Y)\in A) = \iint_A f_{X,Y}(x,y)\, dx\,dy. \]
Properties:
\[ f_{X,Y}(x,y)\ge 0 \quad\text{and}\quad \int_{-\infty}^{\infty}\int_{-\infty}^{\infty} f_{X,Y}(x,y)\,dx\,dy=1. \]
Let \(X_1, X_2, \dots, X_n\) be random variables.
Discrete case:
The joint pmf is
\[
p_{X_1,\dots,X_n}(x_1,\dots,x_n) = \Pr(X_1=x_1, \dots, X_n=x_n)
\]
Continuous case:
The joint pdf is
\[
f_{X_1,\dots,X_n}(x_1,\dots,x_n),
\]
which satisfies \(\Pr\{(X_1,\dots,X_n)\in A\} = \int\cdots\int_A f_{X_1,\dots,X_n}(x_1,\dots,x_n)\,dx_1\cdots dx_n\) for any region \(A\subseteq\mathbb{R}^n\).
library(MASS)
library(ggplot2)
# Parameters
mu <- c(0, 0)   # mean vector
sigma_x <- 2
sigma_y <- 1
rho <- 0.7      # correlation
Sigma <- matrix(c(sigma_x^2,               rho * sigma_x * sigma_y,
                  rho * sigma_x * sigma_y, sigma_y^2), ncol = 2)
# Simulate data
set.seed(123)
n <- 2000
XY <- as.data.frame(mvrnorm(n = n, mu = mu, Sigma = Sigma))
names(XY) <- c("X", "Y")
# Scatter plot with filled density contours
ggplot(XY, aes(x = X, y = Y)) +
  geom_point(alpha = 0.3, color = "steelblue") +
  stat_density_2d(geom = "polygon", bins = 8,
                  alpha = 0.2, fill = "tomato") +
  theme_bw(base_size = 14)
A Marginal Density Function is the density function of a single random variable from a random vector.
Let \(X_1\) and \(X_2\) be two discrete random variables with a joint probability mass function
\[ p_{X_1,X_2}(x_1, x_2) = P(X_1=x_1, X_2 = x_2). \]
The marginal distribution of \(X_1\) is defined as
\[ p_{X_1}(x_1) = \sum_{x_2}p_{X_1,X_2}(x_1,x_2) \]
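For instance, with a joint pmf stored as a matrix in R (again with made-up probabilities), the marginal of \(X_1\) is just the row sums:

# Toy joint pmf of X1 (rows) and X2 (columns); values are illustrative
p_xy <- matrix(c(0.10, 0.20, 0.05,
                 0.15, 0.30, 0.20),
               nrow = 2, byrow = TRUE)
rowSums(p_xy)   # marginal pmf of X1: 0.35 0.65
colSums(p_xy)   # marginal pmf of X2: 0.25 0.50 0.25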
Let \(X_1\) and \(X_2\) be two continuous random variables with a joint density function \(f_{X_1,X_2}(x_1,x_2)\). The marginal density of \(X_1\) is defined as
\[ f_{X_1}(x_1) = \int_{x_2}f_{X_1,X_2}(x_1,x_2)dx_2 \]
Let the joint density function of \(X\) and \(Y\) be defined as
\[ f_{X,Y}(x,y) = \left\{\begin{array}{cc} 2x & 0\le y \le 1;\ 0 \le x\le 1\\ 0 & \mathrm{otherwise} \end{array}\right. \]
Find \(f_X(x)\).
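For reference, integrating the joint density over \(y\) gives
\[ f_X(x) = \int_0^1 2x\,dy = 2x, \quad 0\le x\le 1, \]
and \(f_X(x)=0\) otherwise.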
A conditional distribution gives the distribution of one random variable given that a second random variable takes a fixed value.
Let \(X_1\) and \(X_2\) be two discrete random variables with a joint probability mass function
\[ p_{X_1,X_2}(x_1, x_2) = P(X_1=x_1, X_2 = x_2). \]
The conditional distribution of \(X_1|X_2=x_2\) is defined as
\[ p_{X_1|X_2}(x_1|x_2) = \frac{p_{X_1,X_2}(x_1,x_2)}{p_{X_2}(x_2)}, \]
provided \(p_{X_2}(x_2)>0\).
Let \(X_1\) and \(X_2\) be two continuous random variables with a joint density function \(f_{X_1,X_2}(x_1,x_2)\). The conditional density of \(X_1|X_2=x_2\) is defined as
\[ f_{X_1|X_2}(x_1|x_2) = \frac{f_{X_1,X_2}(x_1,x_2)}{f_{X_2}(x_2)}, \]
provided \(f_{X_2}(x_2)>0\).
Let the joint density function of \(X_1\) and \(X_2\) be defined as
\[ f_{X_1,X_2}(x_1,x_2)=\left\{\begin{array}{cc} 30x_1x_2^2 & x_1-1 \le x_2 \le 1-x_1;\ 0\le x_1\le 1\\ 0 & \mathrm{elsewhere} \end{array}\right. \]
Find the conditional density function of \(X_2|X_1=x_1\).
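For reference, integrating out \(x_2\) gives the marginal
\[ f_{X_1}(x_1) = \int_{x_1-1}^{1-x_1} 30x_1x_2^2\,dx_2 = 20x_1(1-x_1)^3, \quad 0\le x_1\le 1, \]
so that
\[ f_{X_2|X_1}(x_2|x_1) = \frac{30x_1x_2^2}{20x_1(1-x_1)^3} = \frac{3x_2^2}{2(1-x_1)^3}, \quad x_1-1\le x_2\le 1-x_1. \]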
Random variables are independent if the value taken by one does not change the distribution of the other.
Let \(X_1\) and \(X_2\) be two discrete random variables with a joint probability mass function \(p_{X_1,X_2}(x_1,x_2)\). \(X_1\) and \(X_2\) are independent if and only if
\[ p_{X_1,X_2}(x_1,x_2) = p_{X_1}(x_1)p_{X_2}(x_2) \]
for all \((x_1,x_2)\).
Let \(X_1\) and \(X_2\) be two continuous random variables with a joint density function \(f_{X_1,X_2}(x_1,x_2)\). \(X_1\) and \(X_2\) are independent if and only if
\[ f_{X_1,X_2}(x_1,x_2) = f_{X_1}(x_1)f_{X_2}(x_2) \]
for all \((x_1,x_2)\).
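As a small check of the discrete criterion, a joint pmf built in R as the outer product of two marginals (values illustrative) factorizes by construction:

# Under independence the joint pmf is the outer product of the marginals
p_x1 <- c(0.3, 0.7)
p_x2 <- c(0.2, 0.5, 0.3)
p    <- outer(p_x1, p_x2)                     # joint pmf
all.equal(p, outer(rowSums(p), colSums(p)))   # TRUE: factorization holds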
For a diagonal matrix
\[ A = \left(\begin{array}{cc} a_1 & 0\\ 0 & a_2 \end{array}\right), \]
we have
\[ \det(A) = a_1a_2 \]
and
\[ A^{-1}=\left(\begin{array}{cc} 1/a_1 & 0 \\ 0 & 1/a_2 \end{array}\right). \]
\[ \left(\begin{array}{c} X\\ Y \end{array}\right)\sim N \left\{ \left(\begin{array}{c} \mu_x\\ \mu_y \end{array}\right),\left(\begin{array}{cc} \sigma_x^2 & 0\\ 0 & \sigma_y^2 \end{array}\right) \right\} \]
Show that \(X\perp Y\).
The joint density can be written as
\[ f_{X,Y}(x,y)=\det(2\pi\Sigma)^{-1/2}\exp\left\{-\frac{1}{2}(\boldsymbol{w}-\boldsymbol\mu)^T\Sigma^{-1}(\boldsymbol w-\boldsymbol\mu)\right\}, \]
where \(\Sigma=\left(\begin{array}{cc}\sigma_x^2 & 0\\0 & \sigma_y^2\end{array}\right)\), \(\boldsymbol \mu = \left(\begin{array}{c}\mu_x\\ \mu_y \end{array}\right)\), and \(\boldsymbol w = \left(\begin{array}{c} x\\ y \end{array}\right)\).
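Since \(\Sigma\) is diagonal, \(\det(2\pi\Sigma)^{-1/2}=(2\pi\sigma_x\sigma_y)^{-1}\) and the quadratic form separates, so
\[ f_{X,Y}(x,y) = \frac{1}{\sqrt{2\pi}\sigma_x}\exp\left\{-\frac{(x-\mu_x)^2}{2\sigma_x^2}\right\}\cdot\frac{1}{\sqrt{2\pi}\sigma_y}\exp\left\{-\frac{(y-\mu_y)^2}{2\sigma_y^2}\right\} = f_X(x)f_Y(y), \]
which is exactly the factorization that defines independence, hence \(X\perp Y\).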
Let \(X_1, X_2, \ldots,X_n\) be a set of random variables. The expectation of a function \(g(X_1,\ldots, X_n)\) is defined as
\[ E\{g(X_1,\ldots, X_n)\} = \sum_{x_1}\cdots\sum_{x_n}g(x_1,\ldots, x_n)p(x_1,\ldots,x_n) \]
or
\[ E\{g(\boldsymbol X)\} = \int_{x_1}\cdots\int_{x_n}g(\boldsymbol x)f(\boldsymbol x)\,dx_n \cdots dx_1 \]
Let \(X_1,\ldots,X_n\) and \(Y_1,\ldots,Y_m\) be random variables with \(E(X_i)=\mu_i\) and \(E(Y_j)=\tau_j\). Furthermore, let \(U = \sum^n_{i=1}a_iX_i\) and \(V=\sum^m_{j=1}b_jY_j\) where \(\{a_i\}^n_{i=1}\) and \(\{b_j\}_{j=1}^m\) are constants. We have the following properties:
\(E(U)=\sum_{i=1}^na_i\mu_i\)
\(Var(U)=\sum^n_{i=1}a_i^2Var(X_i)+2\underset{i<j}{\sum\sum}a_ia_jCov(X_i,X_j)\)
\(Cov(U,V)=\sum^n_{i=1}\sum^m_{j=1}a_ib_jCov(X_i,Y_j)\)
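These identities can be sanity-checked by simulation; the R sketch below uses arbitrary constants \(a_1=2, a_2=-1\) and a correlated bivariate normal (all choices are illustrative):

library(MASS)
set.seed(1)
a <- c(2, -1)                                 # illustrative constants
Sigma <- matrix(c(1, 0.5, 0.5, 1), ncol = 2)  # Var(Xi) = 1, Cov(X1,X2) = 0.5
X <- mvrnorm(1e5, mu = c(1, 3), Sigma = Sigma)
U <- drop(X %*% a)
mean(U)   # theory: E(U) = 2(1) - 1(3) = -1
var(U)    # theory: 4(1) + 1(1) + 2(2)(-1)(0.5) = 3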
Let \(X_1\) and \(X_2\) be two random variables. The conditional expectation of \(g(X_1)\), given \(X_2=x_2\), is defined as
\[ E\{g(X_1)|X_2=x_2\}=\sum_{x_1}g(x_1)p(x_1|x_2) \]
or
\[ E\{g(X_1)|X_2=x_2\}=\int_{x_1}g(x_1)f(x_1|x_2)dx_1. \]
Furthermore,
\[ E(X_1)=E_{X_2}\{E_{X_1|X_2}(X_1|X_2)\} \]
and
\[ Var(X_1) = E_{X_2}\{Var_{X_1|X_2}(X_1|X_2)\} + Var_{X_2}\{E_{X_1|X_2}(X_1|X_2)\} \]
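Both identities can be checked by simulation; the sketch below assumes an illustrative hierarchical model \(X_2\sim\mathrm{Gamma}(3,1)\) and \(X_1|X_2\sim N(X_2, 4)\):

set.seed(2)
n  <- 1e5
x2 <- rgamma(n, shape = 3, rate = 1)   # X2 with E(X2) = 3, Var(X2) = 3
x1 <- rnorm(n, mean = x2, sd = 2)      # X1 | X2 ~ N(X2, 4)
mean(x1)   # E(X1) = E{E(X1|X2)} = 3
var(x1)    # Var(X1) = E(4) + Var(X2) = 4 + 3 = 7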
Let \(X_1\) and \(X_2\) be two random variables with means \(\mu_1\) and \(\mu_2\), respectively. The covariance of \(X_1\) and \(X_2\) is defined as
\[ \begin{eqnarray*} Cov(X_1,X_2) & = & E\{(X_1-\mu_1)(X_2-\mu_2)\}\\ & =& E(X_1X_2)-\mu_1\mu_2 \end{eqnarray*} \]
If \(X_1\) and \(X_2\) are independent random variables, then
\[ Cov(X_1,X_2)=0 \]
The correlation of \(X_1\) and \(X_2\) is defined as
\[ \rho = Cor(X_1,X_2) = \frac{Cov(X_1,X_2)}{\sqrt{Var(X_1)Var(X_2)}} \]
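Tying back to the bivariate normal simulation earlier, the sample covariance and correlation should be close to the theoretical values \(\rho\sigma_x\sigma_y = 0.7\times 2\times 1 = 1.4\) and \(\rho = 0.7\):

library(MASS)
set.seed(123)
XY <- as.data.frame(mvrnorm(2000, mu = c(0, 0),
                            Sigma = matrix(c(4, 1.4, 1.4, 1), ncol = 2)))
names(XY) <- c("X", "Y")
cov(XY$X, XY$Y)   # close to rho * sigma_x * sigma_y = 1.4
cor(XY$X, XY$Y)   # close to rho = 0.7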