Nevus AI Research & Development
Nevus AI Technology
The AI Models
Nevus AI uses two AI models, DalNet 1 and ResNet 290, to classify an image of a mole as either low or high risk of melanoma.
The Training Dataset
Both AI models were separately trained on a dataset of over 21,000 images of moles, half of which are classified as benign and the other half as malignant melanoma.
The Testing Dataset
Both AI models were then tested on a separate set of 4,000 images of moles, equally divided between benign and malignant melanoma.
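As an illustration, a balanced hold-out split like the one described above could be produced with scikit-learn. This is a minimal sketch, not Nevus AI's actual pipeline; the names `image_paths` and `labels` are placeholders:

```python
from sklearn.model_selection import train_test_split

# Placeholder data: file paths and binary labels (0 = benign, 1 = malignant).
# In a real pipeline these would come from the curated mole-image dataset.
image_paths = [f"mole_{i:05d}.jpg" for i in range(25_000)]
labels = [i % 2 for i in range(25_000)]  # perfectly balanced for illustration

# Hold out 4,000 images for testing; stratify preserves the 50/50 class balance.
train_paths, test_paths, train_labels, test_labels = train_test_split(
    image_paths, labels, test_size=4_000, stratify=labels, random_state=42
)
print(len(train_paths), len(test_paths))  # 21000 4000
```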
Nevus AI Accuracy
The Confusion Matrix
The results below are averages over multiple test runs of each AI model.
TP (True Positive) is the number of mole images the AI model predicted to be high risk of melanoma that were actually labeled high risk.
DalNet 1 correctly predicted, on average, 1,955.5 mole images as high risk of melanoma.
ResNet 290 correctly predicted, on average, 1,878.5 mole images as high risk of melanoma.
FP (False Positive) is the number of mole images the AI model predicted to be high risk of melanoma but that were actually labeled low risk.
DalNet 1 incorrectly predicted, on average, 33.5 mole images as high risk of melanoma.
ResNet 290 incorrectly predicted, on average, 94.3 mole images as high risk of melanoma.
TN (True Negative) is the number of mole images the AI model predicted to be low risk of melanoma that were actually labeled low risk.
DalNet 1 correctly predicted, on average, 1,966.5 mole images as low risk of melanoma.
ResNet 290 correctly predicted, on average, 1,905.7 mole images as low risk of melanoma.
FN (False Negative) is the number of mole images the AI model predicted to be low risk of melanoma but that were actually labeled high risk.
DalNet 1 incorrectly predicted, on average, 44.5 mole images as low risk of melanoma.
ResNet 290 incorrectly predicted, on average, 121.5 mole images as low risk of melanoma.
These averaged counts fill in the confusion matrix, from which the following metrics are computed to evaluate both AI models.
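For reference, the four counts can be read directly off a confusion matrix computed with scikit-learn. The arrays below are illustrative stand-ins for the real test labels and model predictions:

```python
import numpy as np
from sklearn.metrics import confusion_matrix

# Illustrative stand-ins: 1 = high risk (malignant), 0 = low risk (benign).
y_true = np.array([1, 1, 0, 0, 1, 0, 1, 0])  # ground-truth labels
y_pred = np.array([1, 0, 0, 1, 1, 0, 1, 0])  # model predictions

# For labels [0, 1], sklearn lays the matrix out as [[TN, FP], [FN, TP]].
tn, fp, fn, tp = confusion_matrix(y_true, y_pred, labels=[0, 1]).ravel()
print(f"TP={tp} FP={fp} TN={tn} FN={fn}")  # TP=3 FP=1 TN=3 FN=1
```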
The Evaluation Metrics
What is Accuracy?
The proportion of all correct predictions (both true positives and true negatives) out of all predictions made. Formula: (TP + TN)/(TP + TN + FP + FN)
DalNet 1 has an accuracy score of 98.05%
ResNet 290 has an accuracy score of 94.61%
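As a quick check, plugging DalNet 1's averaged counts into the formula reproduces the reported score:

```python
# DalNet 1 averaged counts from the confusion matrix above.
tp, fp, tn, fn = 1955.5, 33.5, 1966.5, 44.5

accuracy = (tp + tn) / (tp + tn + fp + fn)  # 3922 / 4000
print(f"Accuracy: {accuracy:.2%}")  # Accuracy: 98.05%
```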
What is Precision?
The proportion of true positive predictions among all positive predictions made. In other words, when the model predicts "yes," how often is it correct? Formula: TP/(TP + FP)
DalNet 1 has a precision score of 98.32%
ResNet 290 has a precision score of 95.22%
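The same check for ResNet 290's reported precision:

```python
# ResNet 290 averaged counts from the confusion matrix above.
tp, fp = 1878.5, 94.3

precision = tp / (tp + fp)  # 1878.5 / 1972.8
print(f"Precision: {precision:.2%}")  # Precision: 95.22%
```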
What is Recall?
Recall, also known as Sensitivity or True Positive Rate, is the proportion of actual positive cases that the model correctly identified as positive. In other words, out of all the cases that are actually positive, how many did the model correctly catch? Formula: TP/(TP + FN)
DalNet 1 has a recall score of 97.78%
ResNet 290 has a recall score of 93.93%
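Again with DalNet 1's counts; the exact value is 0.97775, which the section reports rounded to 97.78%:

```python
# DalNet 1 averaged counts from the confusion matrix above.
tp, fn = 1955.5, 44.5

recall = tp / (tp + fn)  # 1955.5 / 2000
print(recall)  # 0.97775, reported above as 97.78%
```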
What is the F1 Score?
The harmonic mean of precision and recall, providing a single score that balances both metrics. Formula: 2 × (Precision × Recall)/(Precision + Recall)
DalNet 1 has an F1 score of 98.04%
ResNet 290 has an F1 score of 94.57%
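Computing F1 from DalNet 1's counts (rather than from the already-rounded precision and recall) reproduces the reported 98.04%:

```python
# DalNet 1 averaged counts from the confusion matrix above.
tp, fp, fn = 1955.5, 33.5, 44.5

precision = tp / (tp + fp)
recall = tp / (tp + fn)
f1 = 2 * (precision * recall) / (precision + recall)
print(f"F1: {f1:.2%}")  # F1: 98.04%
```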
What is MCC (Matthews Correlation Coefficient)?
A balanced measure that works well even with imbalanced classes. It considers all four confusion matrix categories and returns a value between -1 and +1. Formula: (TP × TN - FP × FN)/√((TP + FP)(TP + FN)(TN + FP)(TN + FN))
DalNet 1 has an MCC score of 0.96
ResNet 290 has an MCC score of 0.89
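And the MCC for ResNet 290, computed directly from its averaged counts:

```python
from math import sqrt

# ResNet 290 averaged counts from the confusion matrix above.
tp, fp, tn, fn = 1878.5, 94.3, 1905.7, 121.5

mcc = (tp * tn - fp * fn) / sqrt((tp + fp) * (tp + fn) * (tn + fp) * (tn + fn))
print(f"MCC: {mcc:.2f}")  # MCC: 0.89
```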