Francesco Stranieri

Student

Programmer

Machine Learning aided Record Linkage

  • Date: 02/2020
See Demo

Record Linkage is the process of finding records in one or more datasets that refer to the same entity. Traditionally, it is done by applying comparison rules to pairs of attributes from each dataset. In this project we investigate possible Machine Learning applications to Record Linkage (and data deduplication) through two experiments. In the first, we compare two indexing methods (Full vs Block) to determine which works better for our models in terms of running time and quality of results. In the second, we investigate the reusability of the models we create, training on one dataset and testing on the other, and their viability in real-world scenarios.
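To make the Full vs Block comparison concrete, here is a minimal sketch of the two indexing strategies in plain Python. It is not the project's actual code: the helper names `full_index` and `block_index`, the sample records, and the choice of `city` as blocking key are all assumptions made for illustration. Full indexing produces every possible candidate pair, while Block indexing keeps only the pairs that agree on the blocking key.

```python
from collections import defaultdict
from itertools import product


def full_index(left, right):
    """Full indexing: pair every record in `left` with every record in `right`."""
    return list(product(range(len(left)), range(len(right))))


def block_index(left, right, key):
    """Block indexing: pair only records that share the same blocking key."""
    # Group the positions of the right-hand records by their blocking key.
    blocks = defaultdict(list)
    for j, record in enumerate(right):
        blocks[key(record)].append(j)
    # For each left-hand record, keep only the right-hand records in its block.
    return [
        (i, j)
        for i, record in enumerate(left)
        for j in blocks.get(key(record), [])
    ]


# Hypothetical sample data, invented for this example.
left = [
    {"name": "Anna Rossi", "city": "Milano"},
    {"name": "Luca Bianchi", "city": "Roma"},
    {"name": "Marta Verdi", "city": "Torino"},
]
right = [
    {"name": "A. Rossi", "city": "Milano"},
    {"name": "Luca Bianchi", "city": "Roma"},
    {"name": "Paolo Neri", "city": "Roma"},
]

full_pairs = full_index(left, right)
block_pairs = block_index(left, right, key=lambda r: r["city"])

print(len(full_pairs), "candidate pairs with Full indexing")
print(len(block_pairs), "candidate pairs with Block indexing")
```

The trade-off the first experiment measures is visible even at this scale: Full indexing grows with the product of the dataset sizes, while Block indexing compares far fewer pairs but would miss a true match whose blocking key differs between the two datasets (for example, a misspelled city).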
