Different F1 score for sequence tagging

May 18, 2022 1 minute read

The F1 score is the most common evaluation metric for sequence tagging. It is the harmonic mean of precision and recall. In practice, however, several variants of the F1 score are used in sequence tagging (sequence labeling) tasks, such as:

Micro F1 score
Macro F1 score
Weighted F1 score

In this blog post, I will explain what each of these F1 scores is and how to calculate it.

Micro F1 score

To compute the micro F1 score, we sum the true positives (TP), false positives (FP), and false negatives (FN) over all classes and then calculate a single global F1 score from those totals.

Suppose we have three classes A, B, and C.

Class	TP	FP	FN
A	10	0	0
B	5	10	0
C	0	0	10
Total	15	10	10

So, the micro F1 calculation equation is:

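$$P_{micro} = \frac{TP}{TP + FP} = \frac{15}{15 + 10} = 0.6, \qquad R_{micro} = \frac{TP}{TP + FN} = \frac{15}{15 + 10} = 0.6$$

$$F1_{micro} = \frac{2 \cdot P_{micro} \cdot R_{micro}}{P_{micro} + R_{micro}} = \frac{2 \cdot 0.6 \cdot 0.6}{0.6 + 0.6} = 0.6$$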
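As a rough illustration, the pooled calculation can be sketched in plain Python. The `counts` dictionary and the `micro_f1` helper are names made up for this example; the numbers are the ones from the table above.

```python
# Per-class counts copied from the table above: (TP, FP, FN).
counts = {
    "A": (10, 0, 0),
    "B": (5, 10, 0),
    "C": (0, 0, 10),
}

def micro_f1(counts):
    """Pool TP, FP and FN over all classes, then compute one global F1."""
    tp = sum(c[0] for c in counts.values())
    fp = sum(c[1] for c in counts.values())
    fn = sum(c[2] for c in counts.values())
    # Guard against empty denominators so the sketch never divides by zero.
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    if precision + recall == 0:
        return 0.0
    return 2 * precision * recall / (precision + recall)

print(round(micro_f1(counts), 4))  # 0.6
```

Because the counts are pooled before the score is computed, a class with many instances influences the result more than a rare one.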

Macro F1 score

Suppose we have three classes, A, B, and C. To calculate the macro F1 score, we need to calculate the F1 score for each class and then average them.

Class	Per-class F1 score
A	0.8
B	0.5
C	0.7

So, the macro F1 calculation equation is:

$$F1_{macro} = \frac{F1_A + F1_B + F1_C}{3} = \frac{0.8 + 0.5 + 0.7}{3} \approx 0.667$$
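The same average can be checked with a few lines of Python. This is only a sketch: `per_class_f1` and `macro_f1` are illustrative names, and the per-class scores are taken from the table above.

```python
# Per-class F1 scores copied from the table above.
per_class_f1 = {"A": 0.8, "B": 0.5, "C": 0.7}

def macro_f1(per_class_f1):
    """Unweighted mean of the per-class F1 scores."""
    return sum(per_class_f1.values()) / len(per_class_f1)

print(round(macro_f1(per_class_f1), 4))  # 0.6667
```

Every class contributes equally here, no matter how often it occurs in the data.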

Weighted F1 score

The weighted F1 score depends on two quantities:

Each class F1 score
Support of each class

Support: the number of actual occurrences of a class in the dataset. For example, a support value of 10 for class A means that the label A occurs exactly 10 times in the dataset.
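Support can be counted directly from the gold labels. The sketch below uses a hypothetical label sequence, invented only for illustration, and the standard library `collections.Counter`.

```python
from collections import Counter

# Hypothetical gold label sequence, invented for illustration only.
gold_labels = ["A", "C", "C", "A", "B", "C", "C", "A", "C", "C"]

# Support of a class = how many times it appears among the gold labels.
support = Counter(gold_labels)

print(support["A"], support["B"], support["C"])  # 3 1 6
```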

Class	Per-class F1 score	Support	Support fraction (support / total support)
A	0.8	3	0.3 (3/10)
B	0.5	1	0.1 (1/10)
C	0.7	6	0.6 (6/10)
Total	-	10	1.0

So, the weighted F1 calculation equation is:

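$$F1_{weighted} = \sum_{c \in \{A, B, C\}} \frac{support_c}{total\ support} \cdot F1_c = 0.3 \cdot 0.8 + 0.1 \cdot 0.5 + 0.6 \cdot 0.7 = 0.71$$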
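A minimal Python sketch of the support-weighted average, using the values from the table above. The names `per_class_f1`, `support`, and `weighted_f1` are chosen for this example only.

```python
# Per-class F1 scores and supports copied from the table above.
per_class_f1 = {"A": 0.8, "B": 0.5, "C": 0.7}
support = {"A": 3, "B": 1, "C": 6}

def weighted_f1(per_class_f1, support):
    """Average the per-class F1 scores, weighting each by its share of the support."""
    total = sum(support.values())
    return sum(per_class_f1[c] * support[c] / total for c in per_class_f1)

print(round(weighted_f1(per_class_f1, support), 4))  # 0.71
```

If every class had the same support, this would give the same value as the macro F1 score.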


Sagor Sarker
