Suppose we model a set of observations as a random sample from an unknown joint probability distribution that is expressed in terms of a set of parameters. The goal of maximum likelihood estimation is to determine the parameters for which the observed data have the highest joint probability.

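We write the parameters governing the joint distribution as a vector $\theta = [\theta_1, \theta_2, \ldots, \theta_k]^{\mathsf{T}}$ so that this distribution falls within a parametric family $\{f(\cdot\,; \theta) \mid \theta \in \Theta\}$, where $\Theta$ is called the parameter space, a finite-dimensional subset of Euclidean space. Evaluating the joint density at the observed data sample $\mathbf{y} = (y_1, y_2, \ldots, y_n)$ gives a real-valued function

$$
\mathcal{L}_n(\theta) = \mathcal{L}_n(\theta; \mathbf{y}) = f_n(\mathbf{y}; \theta),
$$

which is called the likelihood function. For i.i.d. random variables, $f_n(\mathbf{y}; \theta)$ is the product of the univariate (that is, single-variable) density functions:

$$
f_n(\mathbf{y}; \theta) = \prod_{i=1}^{n} f(y_i; \theta).
$$

The goal of maximum likelihood estimation is to find the values of the model parameters that maximize the likelihood function over the parameter space, that is,

$$
\hat{\theta} = \underset{\theta \in \Theta}{\operatorname{arg\,max}}\; \mathcal{L}_n(\theta; \mathbf{y}).
$$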
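
To make the maximization concrete, here is a minimal sketch that estimates the mean of a normal distribution by evaluating the likelihood on a grid of candidate values. The data, the fixed standard deviation of 1, the grid, and the function names (`normal_pdf`, `likelihood`, `grid_mle`) are all assumptions chosen for illustration, not part of the original text.

```python
import math


def normal_pdf(y, mu, sigma):
    """Univariate normal density evaluated at y."""
    z = (y - mu) / sigma
    return math.exp(-0.5 * z * z) / (sigma * math.sqrt(2.0 * math.pi))


def likelihood(sample, mu, sigma=1.0):
    """Likelihood of an i.i.d. sample: the product of univariate densities."""
    value = 1.0
    for y in sample:
        value *= normal_pdf(y, mu, sigma)
    return value


def grid_mle(sample, candidates, sigma=1.0):
    """Return the candidate mean that gives the largest likelihood."""
    return max(candidates, key=lambda mu: likelihood(sample, mu, sigma))


# Hypothetical observations, made up for this example.
sample = [4.8, 5.1, 5.4, 4.9, 5.3]

# Assumed parameter space: means from 4.00 to 6.00 in steps of 0.01.
candidates = [i / 100 for i in range(400, 601)]

mu_hat = grid_mle(sample, candidates)
sample_mean = sum(sample) / len(sample)
print(mu_hat, sample_mean)
```

For a normal model with known variance, the maximum likelihood estimate of the mean is the sample mean, so the grid maximizer should agree with `sample_mean` up to the grid resolution. A grid search is used here only because it mirrors the definition directly; in practice a closed-form solution or a numerical optimizer would usually be preferred.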

Tip

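Sometimes it is more convenient to work with the log-likelihood function, that is,

$$
\ell(\theta; \mathbf{y}) = \ln \mathcal{L}_n(\theta; \mathbf{y}),
$$

where $\mathcal{L}_n$ is the likelihood function. Because the logarithm is strictly increasing, maximizing $\ell$ yields the same estimate as maximizing $\mathcal{L}_n$, and for i.i.d. data the product of densities becomes a sum of log densities.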
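
One practical reason to prefer the log-likelihood is numerical: a product of many densities can underflow to zero in floating point, while the corresponding sum of logs stays finite. The sketch below illustrates this for a normal model with an assumed standard deviation of 1; the data and the function names (`normal_logpdf`, `log_likelihood`, `raw_likelihood`) are hypothetical and chosen only for illustration.

```python
import math


def normal_logpdf(y, mu, sigma=1.0):
    """Log of the univariate normal density evaluated at y."""
    z = (y - mu) / sigma
    return -0.5 * z * z - math.log(sigma * math.sqrt(2.0 * math.pi))


def log_likelihood(sample, mu, sigma=1.0):
    """Log-likelihood of an i.i.d. sample: a sum of log densities."""
    return sum(normal_logpdf(y, mu, sigma) for y in sample)


def raw_likelihood(sample, mu, sigma=1.0):
    """Likelihood as a plain product of densities (prone to underflow)."""
    return math.prod(math.exp(normal_logpdf(y, mu, sigma)) for y in sample)


# Hypothetical data: 2000 observations cycling through five made-up values.
big_sample = [4.8, 5.1, 5.4, 4.9, 5.3] * 400

# Assumed parameter space: means from 4.00 to 6.00 in steps of 0.01.
candidates = [i / 100 for i in range(400, 601)]

raw_at_mean = raw_likelihood(big_sample, 5.1)      # product of 2000 small numbers
log_at_mean = log_likelihood(big_sample, 5.1)      # finite negative number
log_mu_hat = max(candidates, key=lambda mu: log_likelihood(big_sample, mu))
print(raw_at_mean, log_at_mean, log_mu_hat)
```

With this many observations the raw product is too small to represent as a double, so every candidate would tie at zero and the maximizer would be meaningless. The log-likelihood remains well behaved and still identifies the candidate closest to the sample mean.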

Invariant Property