1 project tagged "Record linkage"

Duke (updated 16 Feb 2014)

Pop 236.98
Vit 12.57

Duke is a fast and flexible record linkage engine. It does not use the traditional blocking (sort by key) approach, but instead relies on Lucene. This makes it high-performance (able to process 1,000,000 records in ~10 minutes). Duke can be run from the command line, but also has an API allowing incremental linking applications to be built easily. It supports reading data from CSV, JDBC, SPARQL, and NTriples, and also supports a number of string comparators and string normalizers.
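The general idea described above (normalize strings, retrieve candidates through an index rather than sorted blocking keys, then score them with a string comparator) can be sketched roughly as follows. This is not Duke's API: every name here is hypothetical, the in-memory token index is only a simplified stand-in for a Lucene index, and `difflib.SequenceMatcher` stands in for whichever comparator a real configuration would use.

```python
from collections import defaultdict
from difflib import SequenceMatcher


def normalize(value):
    """Lowercase and collapse whitespace (stand-in for a string normalizer)."""
    return " ".join(value.lower().split())


def similarity(a, b):
    """Return a score in [0, 1] (stand-in for a string comparator)."""
    return SequenceMatcher(None, a, b).ratio()


def build_index(records):
    """Map each token of the normalized name to the ids of records containing it."""
    index = defaultdict(set)
    for rec_id, rec in records.items():
        for token in normalize(rec["name"]).split():
            index[token].add(rec_id)
    return index


def find_matches(query, records, index, threshold=0.85):
    """Fetch candidates that share a token with the query, then score each one."""
    name = normalize(query["name"])
    candidates = set()
    for token in name.split():
        candidates |= index.get(token, set())
    scored = [
        (rec_id, similarity(name, normalize(records[rec_id]["name"])))
        for rec_id in candidates
    ]
    # Keep only candidates above the threshold, best score first.
    return sorted(
        ((rec_id, score) for rec_id, score in scored if score >= threshold),
        key=lambda pair: -pair[1],
    )


# Hypothetical sample data, invented for illustration only.
records = {
    1: {"name": "Acme Corporation"},
    2: {"name": "Globex  Inc."},
    3: {"name": "ACME Corporaton"},
}
index = build_index(records)
matches = find_matches({"name": "acme corporation"}, records, index)
```

Here records 1 and 3 are retrieved as candidates and scored, while record 2 shares no token with the query and is never compared. Adding a record only requires adding its tokens to the index, which is what makes an incremental linking setup straightforward in this kind of design.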
