What exactly can the ROC curve tell us, or what can be inferred from it?

I wrote some code to run a linear discriminant analysis (LDA) based classification:

%%Construct a LDA classifier with selected features and ground truth information
LDAClassifierObject = ClassificationDiscriminant.fit(featureSelcted, groundTruthGroup, 'DiscrimType', 'linear');
LDAClassifierResubError = resubLoss(LDAClassifierObject);

Thus, I can get

Resubstitution Error of LDA (Training Error): 1.7391e-01
Resubstitution Accuracy of LDA: 82.61%
Confusion Matrix of LDA:
14 3
1 5
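A minimal sketch of how the numbers above can be reproduced from the fitted model (assuming the `LDAClassifierObject`, `groundTruthGroup`, and `LDAClassifierResubError` variables from the code above):

```matlab
% Resubstitution predictions from the fitted LDA model
resubLabel   = resubPredict(LDAClassifierObject);
% Confusion matrix: rows are true classes, columns are predicted classes
resubConfMat = confusionmat(groundTruthGroup(:,1), resubLabel);
% Resubstitution accuracy is one minus the resubstitution error
resubAcc     = 1 - LDAClassifierResubError;
```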

Then I run a ROC analysis for the LDA classifier:

% Predict resubstitution response of LDA classifier
[LDALabel, LDAScore] = resubPredict(LDAClassifierObject);
% Compute the ROC curve (groundTruthGroup contains labels, either 'Good' or 'Bad')
[FPR, TPR, Thr, AUC, OPTROCPT] = perfcurve(groundTruthGroup(:,1), LDAScore(:,1), 'Good');

I have got:

OPTROCPT =      0.1250    0.8667

Therefore, we can get:

Accuracy of LDA after ROC analysis: 86.96%
Confusion Matrix of LDA after ROC analysis:
13 1
2 7
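A hypothetical sketch of where the "accuracy after ROC analysis" comes from: recover the threshold at the optimal operating point returned by `perfcurve`, re-assign labels with it, and measure accuracy. It assumes the `FPR`, `TPR`, `Thr`, `OPTROCPT`, `LDAScore`, and `groundTruthGroup` variables from the calls above.

```matlab
% Index of the optimal operating point on the ROC curve
optIdx = find(FPR == OPTROCPT(1) & TPR == OPTROCPT(2), 1);
optThr = Thr(optIdx);

% Re-label: score above the threshold -> 'Good', otherwise 'Bad'
newLabel = repmat({'Bad'}, size(LDAScore, 1), 1);
newLabel(LDAScore(:,1) >= optThr) = {'Good'};

% Accuracy at the new operating point
newAccuracy = mean(strcmp(newLabel, groundTruthGroup(:,1)));
```

The classifier itself is unchanged; only the decision threshold moves along the ROC curve.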

My questions are:

1. After ROC analysis we obtained a better accuracy. When we report the accuracy of the classifier, which value should we use? What exactly can the ROC curve tell us, or what can be inferred from it? Can we say that after ROC analysis we found a better accuracy for the LDA classifier?

2. Why can the ROC analysis produce a better accuracy for the classifier, while the original ClassificationDiscriminant.fit can't?

3. I have also done a cross validation for the LDA classifier, like

cvLDAClassifier = crossval(LDAClassifierObject, 'leaveout', 'on');

Then how can I get the ROC analysis for the cross-validation? The 'resubPredict' method seems to only accept a discriminant object as input, so how can we get the scores?

4. The classperf function in MATLAB is very handy for gathering all the performance information of a classifier, like

%%Get the performance of the classifier
LDAClassifierPerformace = classperf(groundTruthGroup, resubPredict(LDAClassifierObject));

However, does anyone know how to gather this information, such as accuracy, FPR, etc., for the cross-validation results?

Thanks very much. I am really looking forward to seeing replies to the above questions.



1. You can report anything you like, as long as you report an estimate obtained by cross-validation or on an independent test set. You can fine-tune a classifier on the training set, but its accuracy measured on that same set is then biased upward.

2. Sure, you get a different accuracy by using a different threshold for assigning into the positive class.

3. All loss methods for classifiers return by default the classification error, not the mean squared error. This is stated in many places in the doc.

4. You have code in your post to obtain a ROC from resubstitution predictions. Just replace resubPredict with kfoldPredict.
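As a sketch, the substitution looks like this (assuming `cvLDAClassifier` is the partitioned model produced by the crossval call above, and reusing the score-column and positive-class choices from the original perfcurve call):

```matlab
% Cross-validated labels and scores from the partitioned model
[cvLabel, cvScore] = kfoldPredict(cvLDAClassifier);
% ROC curve and AUC from the held-out predictions
[cvFPR, cvTPR, cvThr, cvAUC] = perfcurve(groundTruthGroup(:,1), cvScore(:,1), 'Good');
plot(cvFPR, cvTPR);
xlabel('False positive rate');
ylabel('True positive rate');
```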

5. Any estimate of classification performance should be obtained using data not used for training. Otherwise the estimate is optimistically biased.
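For the classperf-style summary asked about in question 4, one possible sketch is to compute the metrics directly from the cross-validated predictions (assuming `cvLDAClassifier` and `groundTruthGroup` from above, and that confusionmat orders the classes as {'Bad','Good'}, its default sorted order):

```matlab
% Held-out labels from the cross-validated model
cvLabel = kfoldPredict(cvLDAClassifier);
% Confusion matrix: rows are true classes, columns are predicted classes
C = confusionmat(groundTruthGroup(:,1), cvLabel);
% Overall accuracy: correct predictions over all predictions
cvAccuracy = sum(diag(C)) / sum(C(:));
% With 'Good' as the positive class (row/column 2):
cvFPR = C(1,2) / sum(C(1,:));   % false positive rate
cvTPR = C(2,2) / sum(C(2,:));   % true positive rate (sensitivity)
```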
