Details
Type: Task
Status: Resolved
Priority: Major
Resolution: Fixed
Description
(From the GitHub PR description)
This implementation computes a simple Gaussian classifier, i.e. it outputs the parameters needed for classification.
As input, the function receives a feature matrix and a target vector, plus a small smoothing value that is added to the variances to prevent numerical errors.
The function computes and returns, per class:
- the prior probability
- the mean vector
- the determinant of the covariance matrix
- the inverse covariance matrix
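To make the per-class outputs concrete, here is a minimal sketch in plain Python. It is not the SystemDS implementation: the names `fit_gaussian_classifier`, `invert_with_det`, and `var_smoothing` are hypothetical, and two details are assumptions, namely that the sample covariance divides by n - 1 and that the smoothing value is added to the diagonal of each covariance matrix.

```python
from collections import defaultdict


def invert_with_det(a):
    """Gauss-Jordan elimination with partial pivoting; returns (inverse, determinant)."""
    n = len(a)
    # Augment the matrix with the identity; the right half becomes the inverse.
    aug = [list(row) + [float(i == j) for j in range(n)] for i, row in enumerate(a)]
    det = 1.0
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(aug[r][col]))
        if aug[pivot][col] == 0.0:
            raise ValueError("singular matrix")
        if pivot != col:
            aug[col], aug[pivot] = aug[pivot], aug[col]
            det = -det  # a row swap flips the sign of the determinant
        p = aug[col][col]
        det *= p
        aug[col] = [v / p for v in aug[col]]
        for r in range(n):
            if r != col:
                f = aug[r][col]
                aug[r] = [v - f * w for v, w in zip(aug[r], aug[col])]
    return [row[n:] for row in aug], det


def fit_gaussian_classifier(X, y, var_smoothing=1e-9):
    """Return prior, mean, determinant and inverse covariance for each class."""
    groups = defaultdict(list)
    for row, label in zip(X, y):
        groups[label].append(row)
    params = {}
    for label, rows in groups.items():
        n, d = len(rows), len(rows[0])
        mean = [sum(r[j] for r in rows) / n for j in range(d)]
        # Assumption: sample covariance with an n - 1 denominator.
        cov = [[sum((r[i] - mean[i]) * (r[j] - mean[j]) for r in rows) / (n - 1)
                for j in range(d)] for i in range(d)]
        # Assumption: smoothing is added to the variances (the diagonal).
        for i in range(d):
            cov[i][i] += var_smoothing
        inv_cov, det = invert_with_det(cov)
        params[label] = {"prior": n / len(X), "mean": mean,
                         "det": det, "inv_cov": inv_cov}
    return params


# Toy data: two classes with three 2-D samples each.
X = [[1, 2], [2, 1], [3, 3], [6, 5], [7, 8], [8, 6]]
y = [0, 0, 0, 1, 1, 1]
params = fit_gaussian_classifier(X, y)
```

For class 0 in this toy data the covariance is [[1, 0.5], [0.5, 1]] before smoothing, so the determinant comes out at roughly 0.75.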
To classify a sample x, one can compute the unnormalized posterior p(C=c | x) ∝ p(x | c) * p(c) for each class c and pick the class with the largest value,
where p(x | c) is the (multivariate) Gaussian PDF for class c, and p(c) is the prior probability for class c.
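The classification rule can be sketched as follows, using the four per-class parameters listed above. This is an illustrative sketch, not code from the PR: the function names are hypothetical, the parameter values are hand-picked, and the computation is done in log space (an assumption made here for numerical stability, which does not change the arg max).

```python
import math


def log_posterior_unnormalized(x, prior, mean, det, inv_cov):
    """log p(c) + log p(x | c) for a multivariate Gaussian class model."""
    d = len(x)
    diff = [xi - mi for xi, mi in zip(x, mean)]
    # Squared Mahalanobis distance: (x - mean)^T * inv_cov * (x - mean)
    maha = sum(diff[i] * inv_cov[i][j] * diff[j]
               for i in range(d) for j in range(d))
    log_lik = -0.5 * (d * math.log(2.0 * math.pi) + math.log(det) + maha)
    return math.log(prior) + log_lik


def predict(x, params):
    """Return the class with the largest unnormalized log posterior."""
    return max(params, key=lambda c: log_posterior_unnormalized(x, **params[c]))


# Hand-picked illustrative parameters for two classes.
params = {
    0: {"prior": 0.5, "mean": [2.0, 2.0], "det": 0.75,
        "inv_cov": [[4 / 3, -2 / 3], [-2 / 3, 4 / 3]]},
    1: {"prior": 0.5, "mean": [7.0, 6.0], "det": 1.0,
        "inv_cov": [[1.0, 0.0], [0.0, 1.0]]},
}

label_a = predict([2.5, 1.5], params)  # close to the mean of class 0
label_b = predict([6.5, 6.0], params)  # close to the mean of class 1
```

Only the determinant and the inverse covariance are needed at prediction time, which is presumably why the function returns them rather than the covariance matrices themselves.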
–
One thing I was unsure about was the unit tests. Because computing determinants and inverses of the covariance matrices can introduce floating-point errors, I was not sure how best to compare the results. As suggested on the mailing list, I compared most of them using the average bit distance, with a fairly high maxUnitsOfLeastPrecision.
Although the values of the inverse covariance matrices can differ considerably between SystemDS and R, I am fairly confident that the computation is correct, since multiplying each inverse by its covariance matrix yields the identity matrix (which I tested during development).
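Both checks described above can be illustrated with a short sketch. This is not the SystemDS test utility; the helpers `ulp_distance`, `avg_ulp_distance`, `matmul`, and `is_identity`, as well as the tolerance of 1e-9, are hypothetical choices for illustration.

```python
import struct


def ulp_distance(a, b):
    """Number of representable doubles between two finite floats."""
    def ordered(v):
        bits = struct.unpack("<q", struct.pack("<d", v))[0]
        # Map the sign-magnitude bit pattern to a monotonically ordered integer.
        return bits if bits >= 0 else -(bits & 0x7FFFFFFFFFFFFFFF)
    return abs(ordered(a) - ordered(b))


def avg_ulp_distance(expected, actual):
    """Average ULP distance over all cells of two equally shaped matrices."""
    cells = [(e, a) for er, ar in zip(expected, actual) for e, a in zip(er, ar)]
    return sum(ulp_distance(e, a) for e, a in cells) / len(cells)


def matmul(a, b):
    """Plain triple-loop matrix product."""
    return [[sum(a[i][k] * b[k][j] for k in range(len(b)))
             for j in range(len(b[0]))] for i in range(len(a))]


def is_identity(m, tol=1e-9):
    """True if every cell of the square matrix m is within tol of the identity."""
    n = len(m)
    return all(abs(m[i][j] - (1.0 if i == j else 0.0)) <= tol
               for i in range(n) for j in range(n))


# Sanity check: covariance times its inverse should be the identity.
cov = [[1.0, 0.5], [0.5, 1.0]]
inv_cov = [[4 / 3, -2 / 3], [-2 / 3, 4 / 3]]
identity_ok = is_identity(matmul(cov, inv_cov))

# Adjacent doubles are exactly one ULP apart.
one_ulp = ulp_distance(1.0, 1.0 + 2.0 ** -52)
```

The identity check is independent of the reference implementation, so it remains meaningful even when two systems produce inverses that differ by many ULPs.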
Issue Links
- duplicates SYSTEMDS-1993 Implementation of Gaussian Process Classification (Resolved)
- links to