The Direct Maximum Likelihood Approach for Labels is a method to directly map data to class labels. This approach requires labeled data, denoted as $\{(\mathbf{x}_i, y_i)\}_{i=1}^{N}$, where $y_i$ is the class label. It operates under the assumption that the class-conditional probability $p(\mathbf{x} \mid y = k)$ is normally distributed, $p(\mathbf{x} \mid y = k) = \mathcal{N}(\mathbf{x} \mid \boldsymbol{\mu}_k, \boldsymbol{\Sigma}_k)$, and for a two-class problem the labels are $y_i \in \{1, 2\}$.
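With $\pi_k = p(y = k)$ denoting the class prior (the quantity estimated below), the generative model behind this approach factorizes the joint probability of a data point and its label as:

$$p(\mathbf{x}, y = k \mid \theta) = p(y = k)\, p(\mathbf{x} \mid y = k) = \pi_k \, \mathcal{N}(\mathbf{x} \mid \boldsymbol{\mu}_k, \boldsymbol{\Sigma}_k)$$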
Maximizing Likelihood
The objective is to find the model parameters $\theta = (\pi_1, \boldsymbol{\mu}_1, \boldsymbol{\Sigma}_1, \pi_2, \boldsymbol{\mu}_2, \boldsymbol{\Sigma}_2)$ that maximize the label-data-likelihood function, which is given as:

$$L(\theta) = \prod_{i=1}^{N} p(\mathbf{x}_i, y_i \mid \theta) = \prod_{i=1}^{N} \prod_{k=1}^{K} \left[ \pi_k \, \mathcal{N}(\mathbf{x}_i \mid \boldsymbol{\mu}_k, \boldsymbol{\Sigma}_k) \right]^{y_{ik}}$$
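This likelihood can also be evaluated numerically. Below is a minimal NumPy/SciPy sketch of the corresponding log-likelihood; the function name `log_label_data_likelihood` and the array shapes are illustrative choices, not notation from the notes themselves:

```python
import numpy as np
from scipy.stats import multivariate_normal

def log_label_data_likelihood(X, Y, pis, mus, Sigmas):
    """Log of the label-data likelihood L(theta).

    X      : (N, D) data points
    Y      : (N, K) one-of-K (one-hot) class labels
    pis    : (K,)   class priors pi_k
    mus    : (K, D) class means mu_k
    Sigmas : (K, D, D) class covariances Sigma_k
    """
    N, K = Y.shape
    # log[ pi_k * N(x_i | mu_k, Sigma_k) ] for every point i and class k
    log_terms = np.stack(
        [np.log(pis[k]) + multivariate_normal.logpdf(X, mus[k], Sigmas[k])
         for k in range(K)],
        axis=1,
    )  # shape (N, K)
    # the exponent y_ik picks out the term of the true class of each point
    return np.sum(Y * log_terms)
```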
Computational Tricks
Instead of maximizing the likelihood function directly, we maximize its logarithm. We also drop all terms that do not depend on the specific parameter being optimized (valid because their derivatives with respect to that parameter are zero).
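Concretely, taking the logarithm of the likelihood above gives:

$$\ln L(\theta) = \sum_{i=1}^{N} \sum_{k=1}^{K} y_{ik} \left[ \ln \pi_k + \ln \mathcal{N}(\mathbf{x}_i \mid \boldsymbol{\mu}_k, \boldsymbol{\Sigma}_k) \right]$$

so when optimizing the priors $\pi_k$, only the first term matters.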
For instance, to find the optimal prior $\hat{\pi}_1$, we keep only the $\ln \pi_k$ term and maximize it subject to the constraint $\sum_{k} \pi_k = 1$, which yields:

$$\hat{\pi}_1 = \frac{1}{N} \sum_{i=1}^{N} y_{i1} = \frac{N_1}{N}$$
Where,
- $\hat{\pi}_1$ is the proportion of data belonging to the first class (the prior probability for class 1, $p(y = 1)$)
- $N$ is the total number of data points
- $N_1$ is the total count of data points in class 1
- $y_{ik}$ is the label for the $i$th data point, $y_{ik} \in \{0, 1\}$ with $\sum_{k} y_{ik} = 1$ (One-of-$K$ encoding scheme)
Example: if $N_1$ out of the $N$ data points in your training set belong to class 1, the formula gives you $\hat{\pi}_1 = N_1 / N$ (say, 30 of 100 points in class 1 yields $\hat{\pi}_1 = 0.3$).
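Putting the pieces together, here is a minimal sketch of the full maximum-likelihood fit for this label model: priors as class proportions and, under the Gaussian assumption, per-class sample means and covariances. The function name `fit_label_mle` and the array layout are illustrative choices:

```python
import numpy as np

def fit_label_mle(X, Y):
    """Maximum-likelihood estimates for the labeled generative model.

    X : (N, D) data points
    Y : (N, K) one-of-K (one-hot) labels, y_ik in {0, 1}
    Returns class priors pi_k, means mu_k, covariances Sigma_k.
    """
    N, K = Y.shape
    N_k = Y.sum(axis=0)                      # count of points in each class
    pis = N_k / N                            # pi_k = N_k / N
    mus = (Y.T @ X) / N_k[:, None]           # per-class means
    Sigmas = np.empty((K, X.shape[1], X.shape[1]))
    for k in range(K):
        diff = X - mus[k]                    # deviations from the class mean
        # weight each outer product by y_ik so only class-k points contribute
        Sigmas[k] = (Y[:, k, None] * diff).T @ diff / N_k[k]
    return pis, mus, Sigmas

# Hypothetical usage: with 30 of 100 points in class 1, pis[0] comes out as 0.3
# pis, mus, Sigmas = fit_label_mle(X, Y)
```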