2b.3 Identifying and mitigating misclassification_Canada.pdf (application/pdf, 998.45 KB)
The application of supervised machine learning (ML) to automate the classification of alternative data for official price indices has been widely demonstrated. However, the impact of misclassification across the ML lifecycle, from the initial annotation of training data to the retraining of models in response to data drift, remains understudied in the literature. To help National Statistical Offices apply ML to at-scale production needs, our research provides an empirical case study of how misclassification can arise at the major stages of an ML lifecycle, how it affects elementary price indices, and how it can be mitigated through model retraining or validation processes.