The accuracy formula quantifies the error in a measured value. If the measured value equals the actual value, the measurement is highly accurate and has low error. Accuracy and error rate are inversely related: high accuracy corresponds to a low error rate, and a high error rate corresponds to low accuracy. The accuracy formula expresses accuracy as a percentage, and accuracy and error rate always sum to 100 percent.
What is Accuracy Formula?
The accuracy formula gives accuracy as 100% minus the error rate. To find the accuracy, we first calculate the error rate: the absolute difference between the observed value and the actual value, divided by the actual value and expressed as a percentage.
Accuracy = 100% – Error Rate
Error Rate = |Observed Value – Actual Value|/Actual Value × 100
Solved Examples on Accuracy Formula
Example 1: The actual length of a rectangular box is 1.20 meters, but when measured with a tape the length came out as 1.22 meters. Find the accuracy of the measurement.
Given the length of the rectangular box = 1.20 meters
The measured length of the rectangular box = 1.22 meters
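Solution: applying the two formulas above gives an error rate of |1.22 – 1.20| / 1.20 × 100 ≈ 1.67%, and so an accuracy of 100% – 1.67% ≈ 98.33%. The same arithmetic can be sketched in Python. The function names `error_rate` and `accuracy` are illustrative choices, not something defined in the text, and the sketch assumes a non-zero actual value.

```python
def error_rate(observed, actual):
    """Percent error of an observed value relative to the actual value."""
    # Assumes actual != 0; the formula is undefined otherwise.
    return abs(observed - actual) / actual * 100


def accuracy(observed, actual):
    """Accuracy as a percentage: 100% minus the error rate."""
    return 100 - error_rate(observed, actual)


# Example 1: actual length 1.20 m, measured length 1.22 m
err = error_rate(1.22, 1.20)
acc = accuracy(1.22, 1.20)
print(f"Error rate: {err:.2f}%")  # Error rate: 1.67%
print(f"Accuracy: {acc:.2f}%")    # Accuracy: 98.33%
```

Answer: The accuracy of the measurement is approximately 98.33%.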
Example 2: A measuring tape can measure with an accuracy of 99.8%. What is the possible range of length which can be obtained by using this measuring tape, to measure a cloth of length 2 meters?
The given accuracy of the measuring tape = 99.8%
The error rate for the measurement = 100% – 99.8% = 0.2%
The length of the cloth = 2 meters
The new measurement using this measuring tape = 2 m ± (0.2% × 2 m) = 2 m ± 0.004 m
Maximum value of the measurement would be 2 m + 0.004 m = 2.004 m
Minimum value of the measurement would be 2 m – 0.004 m = 1.996 m
Answer: Hence the range of measures that can be obtained is from 1.996m to 2.004m.
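The steps of Example 2 can also be checked with a short Python sketch. The helper name `measurement_range` is hypothetical, and the sketch assumes the error is symmetric around the true length, as the example does.

```python
def measurement_range(true_length, accuracy_percent):
    """Return the (min, max) readings possible for a given accuracy."""
    # Error rate is whatever is left after subtracting accuracy from 100%.
    error_fraction = (100 - accuracy_percent) / 100
    max_error = true_length * error_fraction
    return true_length - max_error, true_length + max_error


# Example 2: a 2 m cloth measured with a tape that is 99.8% accurate
low, high = measurement_range(2.0, 99.8)
print(f"{low:.3f} m to {high:.3f} m")  # 1.996 m to 2.004 m
```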
Accuracy, Precision, Recall or F1?
Often when I talk to organizations that are looking to implement data science in their processes, they ask the question, “How do I get the most accurate model?” I then ask in return, “What business challenge are you trying to solve using the model?” and I get a puzzled look, because the question I posed does not really answer theirs. I then need to explain why I asked it before we start exploring whether Accuracy is the be-all and end-all model metric from which we should choose our “best” model.
So I thought I would explain in this blog post that Accuracy need not necessarily be the one-and-only model metric data scientists chase, and include simple explanations of other metrics as well.
Firstly, let us look at the following confusion matrix. What is the accuracy for the model?
Very easily, you will notice that the accuracy for this model is very very high, at 99.9%!! Wow! You have hit the jackpot and holy grail (*scream and run around the room, pumping the fist in the air several times*)!
But… (well, you knew this was coming, right?) what if I mentioned that the positive here is actually someone who is sick and carrying a virus that can spread very quickly? Or that the positive here represents a fraud case? Or that the positive here represents a terrorist whom the model labels a non-terrorist? Well, you get the idea. The cost of a misclassified actual positive (a false negative) is very high in all three of the circumstances I posed.
OK, so now you realize that accuracy is not the be-all and end-all model metric to use when selecting the best model… now what?
Precision and Recall
Let me introduce two new metrics (if you have already heard of them, perhaps just humor me a bit and continue reading? 😀)
So if you look at Wikipedia, you will see the formulas for calculating Precision and Recall.
Let me put them here for further explanation:
Precision = True Positive / (True Positive + False Positive)
Recall = True Positive / (True Positive + False Negative)
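Let me put in the confusion matrix and its parts here:
- True Positive (TP): actual positive, predicted positive
- False Negative (FN): actual positive, predicted negative
- False Positive (FP): actual negative, predicted positive
- True Negative (TN): actual negative, predicted negative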
Great! Now let us look at Precision first.
What do you notice about the denominator? The denominator is actually the Total Predicted Positive! So the formula becomes
Precision = True Positive / (True Positive + False Positive) = True Positive / Total Predicted Positive
Immediately, you can see that Precision tells you how precise your model is: out of everything it predicted as positive, how many are actually positive.
Precision is a good measure to use when the cost of a False Positive is high. Take email spam detection, for instance. There, a false positive means that a non-spam email (actual negative) has been identified as spam (predicted positive). The email user might lose important emails if the precision of the spam detection model is not high.
So let us apply the same logic for Recall. Recall how Recall is calculated.
Recall = True Positive / (True Positive + False Negative) = True Positive / Total Actual Positive
There you go! So Recall calculates how many of the Actual Positives our model captures by labeling them as Positive (True Positive). Applying the same understanding, we know that Recall should be the metric we use to select our best model when there is a high cost associated with a False Negative.
For instance, take fraud detection or sick patient detection. If a fraudulent transaction (Actual Positive) is predicted as non-fraudulent (Predicted Negative), the consequence can be very bad for the bank.
Similarly, in sick patient detection, if a sick patient (Actual Positive) goes through the test and is predicted as not sick (Predicted Negative), the cost associated with the False Negative will be extremely high if the sickness is contagious.
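To make the two definitions concrete, here is a minimal Python sketch. The function names are my own, returning 0.0 when a denominator is zero is just one possible convention, and the counts are borrowed from the tumor-classifier example that appears later in this text (1 TP, 1 FP, 8 FN).

```python
def precision(tp, fp):
    """Share of predicted positives that are actually positive."""
    predicted_positive = tp + fp
    # Convention chosen for this sketch: 0.0 when nothing was predicted positive.
    return tp / predicted_positive if predicted_positive else 0.0


def recall(tp, fn):
    """Share of actual positives that the model labeled positive."""
    actual_positive = tp + fn
    # Convention chosen for this sketch: 0.0 when there are no actual positives.
    return tp / actual_positive if actual_positive else 0.0


# Counts from the tumor-classifier example: TP = 1, FP = 1, FN = 8
p = precision(tp=1, fp=1)
r = recall(tp=1, fn=8)
print(f"Precision: {p:.2f}")  # Precision: 0.50
print(f"Recall: {r:.2f}")     # Recall: 0.11
```

Half of the positive predictions are right, but the model finds only one of the nine actual positives, which is exactly the kind of gap Recall is meant to expose.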
Now if you read a lot of other literature on Precision and Recall, you cannot avoid the other measure, F1, which is a function of Precision and Recall:
F1 = 2 × (Precision × Recall) / (Precision + Recall)
F1 Score is needed when you want to seek a balance between Precision and Recall. Right… so what is the difference between F1 Score and Accuracy then? We have previously seen that accuracy can be driven largely by a large number of True Negatives, which in most business circumstances we do not focus on much, whereas False Negatives and False Positives usually have business costs (tangible and intangible). Thus F1 Score might be a better measure to use if we need to seek a balance between Precision and Recall AND there is an uneven class distribution (a large number of Actual Negatives).
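As a rough illustration, F1 is the harmonic mean of Precision and Recall, F1 = 2 × (Precision × Recall) / (Precision + Recall). The sketch below uses a hypothetical helper `f1_score_from` and the counts from the tumor-classifier example later in this text (1 TP, 1 FP, 8 FN, 90 TN) to contrast F1 with Accuracy.

```python
def f1_score_from(precision, recall):
    """Harmonic mean of precision and recall."""
    if precision + recall == 0:
        # Convention chosen for this sketch: F1 is 0.0 when both inputs are 0.
        return 0.0
    return 2 * precision * recall / (precision + recall)


tp, fp, fn, tn = 1, 1, 8, 90  # tumor-classifier counts
precision = tp / (tp + fp)
recall = tp / (tp + fn)

accuracy = (tp + tn) / (tp + fp + fn + tn)
f1 = f1_score_from(precision, recall)
print(f"Accuracy: {accuracy:.2f}")  # Accuracy: 0.91
print(f"F1 score: {f1:.2f}")        # F1 score: 0.18
```

The large number of True Negatives lifts Accuracy to 0.91, while F1, which ignores True Negatives entirely, stays low.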
I hope this explanation helps those starting out in Data Science and working on Classification problems to see that Accuracy will not always be the metric from which to select the best model.
Our accuracy calculator is a simple tool that allows you to compute accuracy using three different methods. While the first two methods are widely used in the evaluation of diagnostic tests, the third one can be applied to a wide range of sciences ⚗️
A few minutes spent on the article below will teach you the use of accuracy in statistics, the fundamental differences between accuracy and precision, as well as all the formulas used in accuracy calculations.
How to use the accuracy calculator?
Calculating accuracy requires different solutions for different problems. Carefully read the instructions below and decide which method is the best for your situation:
Take a look at your data:
- Are you calculating the accuracy of a diagnostic test?
  - If the ratio of patients with the disease and patients without the disease does reflect the prevalence of the illness, use the standard method #1.
  - If the ratio of patients with the disease and patients without the disease does not reflect the prevalence of the illness, use the prevalence method #2.
- Are you trying to find accuracy using simple percent error?
  - Use the percent error method #3.
💡 Our calculator will automatically calculate the sensitivity and specificity of a test needed for method #2 once you enter all the values of true negative/positive and false negative/positive for method #1.
How to calculate accuracy percentage?
Follow our simple tutorial to learn how to measure accuracy in all possible situations:
Standard accuracy equation for a diagnostic test, used in method #1
Used if the ratio of patients with the disease (true positive, false negative) and patients without the disease (true negative, false positive) reflects the prevalence of the illness.
Accuracy = (TP + TN) / (TP + TN + FP + FN)
- TP – true positive;
- TN – true negative;
- FP – false positive; and
- FN – false negative.
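A minimal Python sketch of method #1 follows. The function name `standard_accuracy` and the counts used are hypothetical, chosen only to illustrate the formula.

```python
def standard_accuracy(tp, tn, fp, fn):
    """Accuracy = (TP + TN) / (TP + TN + FP + FN)."""
    total = tp + tn + fp + fn
    if total == 0:
        raise ValueError("At least one test result is required.")
    return (tp + tn) / total


# Hypothetical diagnostic test results (illustrative numbers only)
acc = standard_accuracy(tp=45, tn=40, fp=10, fn=5)
print(f"Accuracy: {acc:.0%}")  # Accuracy: 85%
```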
Formula for calculating accuracy based on prevalence – method #2
Accuracy = ((Sensitivity)* (Prevalence)) + ((Specificity)* (1 - Prevalence))
where:
- Sensitivity = TP / (TP + FN), given in %;
- Specificity = TN / (FP + TN), given in %; and
- Prevalence – the proportion of the population that has the disease at a specific time, given in %.
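Method #2 can be sketched in Python as below. The sketch assumes sensitivity, specificity and prevalence are passed as fractions between 0 and 1 (so 10% is written as 0.1), and both the function names and the counts are hypothetical.

```python
def sensitivity(tp, fn):
    """Fraction of diseased patients the test detects."""
    return tp / (tp + fn)


def specificity(tn, fp):
    """Fraction of healthy patients the test clears."""
    return tn / (tn + fp)


def prevalence_accuracy(sens, spec, prevalence):
    """Accuracy weighted by how common the disease is (all inputs in 0..1)."""
    return sens * prevalence + spec * (1 - prevalence)


# Hypothetical study: 50 diseased and 50 healthy patients
tp, fn, tn, fp = 45, 5, 40, 10
sens = sensitivity(tp, fn)  # 0.9
spec = specificity(tn, fp)  # 0.8

# Prevalence equal to the sample ratio (50%) reproduces method #1's result.
same_as_sample = prevalence_accuracy(sens, spec, 0.5)
# A rarer disease (10% prevalence) shifts the weight toward specificity.
rare_disease = prevalence_accuracy(sens, spec, 0.1)
print(f"{same_as_sample:.2f}")  # 0.85
print(f"{rare_disease:.2f}")    # 0.81
```

When the prevalence matches the ratio of diseased to healthy patients in the sample, the two methods agree. They diverge as soon as the real-world prevalence differs from the study sample.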
Percent error/ percent accuracy formula – method #3
Percent error = (|Vₒ - Vₐ| / Vₐ) * 100
where:
- Vₒ – observed value;
- Vₐ – value accepted as truth; and
- |Vₒ - Vₐ| – the absolute (non-negative) value of the difference.
This informs us about the accuracy of a reading: how much the observed value deviates from the truth.
The greater the error, the lower the accuracy.
Accuracy calculation example:
We’re trying out our new thermometer. Our measured temperature is equal to 95°F. We know that our average temperature is equal to 97.8°F.
Let’s use method #3:
- Observed value: 95
- Accepted value: 97.8
Percent error = (|95 - 97.8| / 97.8) * 100 = (2.8 / 97.8) * 100 = 0.0286 * 100 = 2.86%
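The thermometer example can be reproduced with a few lines of Python. This is only a sketch: the function name `percent_error` is illustrative, and the last line assumes percent accuracy is taken as 100% minus the percent error.

```python
def percent_error(observed, accepted):
    """Percent error of an observed value against the accepted value."""
    # Assumes accepted != 0; the formula is undefined otherwise.
    return abs(observed - accepted) / abs(accepted) * 100


# Thermometer example: observed 95 °F, accepted 97.8 °F
error = percent_error(95, 97.8)
percent_accuracy = 100 - error
print(f"Percent error: {error:.2f}%")                # Percent error: 2.86%
print(f"Percent accuracy: {percent_accuracy:.2f}%")  # Percent accuracy: 97.14%
```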
Accuracy vs. precision
Accuracy measures how close a given value is to the truth (or the value agreed on and confirmed by many scientists).
Precision measures how close the given measurements are to each other. In other words, it describes how much a given result repeats – its reproducibility.
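One way to see the difference is to compare two sets of repeated readings. In the sketch below the readings, the true value and the instrument names are all made up for illustration. Accuracy is approximated by how far the mean reading lies from the true value, and precision by the sample standard deviation of the readings.

```python
from statistics import mean, stdev

TRUE_VALUE = 100.0  # hypothetical accepted value

# Hypothetical repeated readings from two instruments
instrument_a = [100.1, 99.9, 100.2, 99.8]   # centred on the truth, more scatter
instrument_b = [102.0, 102.1, 101.9, 102.0]  # tightly grouped, but offset


def bias(readings, true_value):
    """Distance of the mean reading from the true value (lower = more accurate)."""
    return abs(mean(readings) - true_value)


def spread(readings):
    """Sample standard deviation of the readings (lower = more precise)."""
    return stdev(readings)


for name, readings in [("A", instrument_a), ("B", instrument_b)]:
    print(f"Instrument {name}: bias={bias(readings, TRUE_VALUE):.3f}, "
          f"spread={spread(readings):.3f}")
```

With these numbers, instrument A is the more accurate of the two, while instrument B is the more precise.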
What is accuracy in chemistry?
Accuracy in chemistry requires calibration. The analytic method has to be first compared against a known standard.
The standard of a given substance must be pure, must not contain any water molecules, and must be stable.
Calibration is the process of comparing the results obtained with our device against those of a device of known and confirmed quality. Titration can be a nice example of the calibration process.
Accuracy is one metric for evaluating classification models. Informally, accuracy is the fraction of predictions our model got right. Formally, accuracy has the following definition:
Accuracy = Number of correct predictions / Total number of predictions
For binary classification, accuracy can also be calculated in terms of positives and negatives as follows:
Accuracy = (TP + TN) / (TP + TN + FP + FN)
Where TP = True Positives, TN = True Negatives, FP = False Positives, and FN = False Negatives.
Let’s try calculating accuracy for the following model that classified 100 tumors as malignant (the positive class) or benign (the negative class):
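- True Positive (TP): malignant tumors correctly predicted as malignant: 1
- False Positive (FP): benign tumors incorrectly predicted as malignant: 1
- False Negative (FN): malignant tumors incorrectly predicted as benign: 8
- True Negative (TN): benign tumors correctly predicted as benign: 90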
Accuracy comes out to 0.91, or 91% (91 correct predictions out of 100 total examples). That means our tumor classifier is doing a great job of identifying malignancies, right?
Actually, let’s do a closer analysis of positives and negatives to gain more insight into our model’s performance.
Of the 100 tumor examples, 91 are benign (90 TNs and 1 FP) and 9 are malignant (1 TP and 8 FNs).
Of the 91 benign tumors, the model correctly identifies 90 as benign. That’s good. However, of the 9 malignant tumors, the model only correctly identifies 1 as malignant—a terrible outcome, as 8 out of 9 malignancies go undiagnosed!
While 91% accuracy may seem good at first glance, another tumor-classifier model that always predicts benign would achieve the exact same accuracy (91/100 correct predictions) on our examples. In other words, our model is no better than one that has zero predictive ability to distinguish malignant tumors from benign tumors.
Accuracy alone doesn’t tell the full story when you’re working with a class-imbalanced data set, like this one, where there is a significant disparity between the number of positive and negative labels.
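The comparison with an always-benign model can be checked numerically. The sketch below uses the counts from the tumor example (1 TP, 1 FP, 8 FN, 90 TN). The variable names are illustrative, and the fraction of malignant tumors found is included only as a supporting figure.

```python
tp, fp, fn, tn = 1, 1, 8, 90
total = tp + fp + fn + tn

# Our model: correct whenever it produces a true positive or a true negative.
model_accuracy = (tp + tn) / total

# Always-benign baseline: correct on every benign tumor (TN + FP), wrong on
# every malignant one (TP + FN).
baseline_accuracy = (tn + fp) / total

# Fraction of the malignant tumors that each classifier actually finds.
model_found = tp / (tp + fn)
baseline_found = 0 / (tp + fn)

print(f"Model accuracy:    {model_accuracy:.2f}")     # 0.91
print(f"Baseline accuracy: {baseline_accuracy:.2f}")  # 0.91
print(f"Malignancies found by model:    {model_found:.2f}")     # 0.11
print(f"Malignancies found by baseline: {baseline_found:.2f}")  # 0.00
```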
In the next section, we’ll look at two better metrics for evaluating class-imbalanced problems: precision and recall.