Duke is a fast and flexible record linkage engine. It does not use the traditional blocking (sort by key) approach, but instead relies on Lucene. This makes it fast: it can process 1,000,000 records in about 10 minutes. Duke can be run from the command line, and it also has an API that makes it easy to build incremental linking applications. It reads data from CSV, JDBC, SPARQL, and N-Triples, and supports a number of string comparators and string normalizers.
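To illustrate what a string normalizer and a string comparator do in record linkage, here is a minimal Java sketch. It is not Duke's API: the class `NameMatchSketch` and its methods are hypothetical names chosen for illustration, and edit distance is just one common choice of comparator. The normalizer cleans up case and whitespace, and the comparator turns Levenshtein distance into a similarity score between 0 and 1.

```java
import java.util.Locale;

public class NameMatchSketch {

    // Hypothetical normalizer: trim, lowercase, and collapse runs of whitespace.
    static String normalize(String s) {
        return s.trim().toLowerCase(Locale.ROOT).replaceAll("\\s+", " ");
    }

    // Classic Levenshtein edit distance, computed with two rolling rows.
    static int levenshtein(String a, String b) {
        int[] prev = new int[b.length() + 1];
        int[] curr = new int[b.length() + 1];
        for (int j = 0; j <= b.length(); j++) {
            prev[j] = j;
        }
        for (int i = 1; i <= a.length(); i++) {
            curr[0] = i;
            for (int j = 1; j <= b.length(); j++) {
                int cost = a.charAt(i - 1) == b.charAt(j - 1) ? 0 : 1;
                curr[j] = Math.min(Math.min(curr[j - 1] + 1, prev[j] + 1),
                                   prev[j - 1] + cost);
            }
            int[] tmp = prev;
            prev = curr;
            curr = tmp;
        }
        return prev[b.length()];
    }

    // Hypothetical comparator: 1.0 for identical normalized strings,
    // falling toward 0.0 as the edit distance approaches the longer length.
    static double similarity(String a, String b) {
        String x = normalize(a);
        String y = normalize(b);
        int longest = Math.max(x.length(), y.length());
        if (longest == 0) {
            return 1.0;
        }
        return 1.0 - (double) levenshtein(x, y) / longest;
    }

    public static void main(String[] args) {
        System.out.println(similarity("  ACME  Corp ", "acme corp"));
        System.out.println(similarity("Jon Smith", "John Smith"));
    }
}
```

A linkage engine would typically combine scores like this across several fields before deciding whether two records refer to the same entity.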
Querydsl is a framework that enables the construction of statically typed SQL-like queries. Instead of being written as inline strings or externalized into XML files, queries are constructed via a fluent DSL/API. It supports JPA, JDO, Java Collections, SQL via JDBC, Lucene, and Hibernate Search.
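The idea of a statically typed, fluent query can be sketched in plain Java over an in-memory list. This is not Querydsl's API: the `Query` class and its `from`, `where`, `orderBy`, and `list` methods are hypothetical and exist only to show how a query built from method calls is checked by the compiler, whereas a query held in a string is not.

```java
import java.util.ArrayList;
import java.util.Comparator;
import java.util.List;
import java.util.function.Predicate;

public class FluentQuerySketch {

    // Simple domain type used by the example.
    static final class Customer {
        final String lastName;
        final int age;

        Customer(String lastName, int age) {
            this.lastName = lastName;
            this.age = age;
        }
    }

    // Hypothetical fluent query builder over a list; each call returns
    // the builder so that calls can be chained.
    static final class Query<T> {
        private final List<T> source;
        private Predicate<T> filter = t -> true;
        private Comparator<T> order = null;

        private Query(List<T> source) {
            this.source = source;
        }

        static <T> Query<T> from(List<T> source) {
            return new Query<>(source);
        }

        // Multiple where() calls are combined with logical AND.
        Query<T> where(Predicate<T> condition) {
            filter = filter.and(condition);
            return this;
        }

        Query<T> orderBy(Comparator<T> comparator) {
            order = comparator;
            return this;
        }

        // Runs the query and returns a new list; the source is not modified.
        List<T> list() {
            List<T> result = new ArrayList<>();
            for (T item : source) {
                if (filter.test(item)) {
                    result.add(item);
                }
            }
            if (order != null) {
                result.sort(order);
            }
            return result;
        }
    }

    // A misspelled field name here is a compile error, not a runtime failure.
    static List<Customer> adultsByName(List<Customer> customers) {
        return Query.from(customers)
                .where(c -> c.age >= 18)
                .orderBy(Comparator.comparing((Customer c) -> c.lastName))
                .list();
    }

    public static void main(String[] args) {
        List<Customer> customers = new ArrayList<>();
        customers.add(new Customer("Smith", 42));
        customers.add(new Customer("Jones", 17));
        customers.add(new Customer("Adams", 30));
        for (Customer c : adultsByName(customers)) {
            System.out.println(c.lastName + " " + c.age);
        }
    }
}
```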
jobhuntin is a recruitment portal engine. For job candidates, it features simple file upload, automatic parsing of resume text, a rich resume gallery, anonymous job searching, weekly job match reports, and advanced job filtering. For consultants, it features powerful searches, email alerts for new matches, the ability to save candidate lists, mass mailing, and the ability to post jobs.