Version 2.0.2 of DataCleaner

Release Notes: First-time ease of use was improved by disabling all buttons before source data is selected. Where possible, filters can now optimize the query of a job. This was implemented for the "Max rows", "Equals", and "Not null" filters. The visualization of execution flow now allows removing column items and filter outcome items, making the graph more comprehensible, especially for very large jobs. A bug was fixed when passing null values to the email standardizer. "Mixed" tokens are now properly presented in the Pattern finder.

Other releases

Release Notes: This release adds a new filter for performing Change Data Capture, queues the execution of jobs to avoid concurrent execution issues, and includes several minor bugfixes and improvements.

Release Notes: A major milestone for the data quality monitoring Web application: the addition of connectivity to Salesforce and SugarCRM. Addition of wizards and other user experience improvements. Enables clustered execution of jobs. New data visualization extension and a national identifier validation extension. Adds Pentaho Data Integration job scheduling and execution.

Release Notes: A Web service was added to the monitoring application for getting a (list of) metric values. The 'Table lookup' component has been improved by adding join semantics as a configurable property. The EasyDQ components have been upgraded, adding further configuration options and a richer deduplication result interface. Performance improvements have been a specific focus of this release. Improvements have been made in the engine of DataCleaner to further utilize a streaming processing approach in certain corner cases that were not covered previously.

  •  04 Jan 2013 21:50

Release Notes: The date and time related analysis options have been expanded, adding distribution analyzers for week numbers, months, and years. An optional "descriptive statistics" option has been added to the Number analyzer and the Date/time analyzer. The lines in the timeline charts of the monitoring Web application now have small dots in them. Two new transformers have been added for generating UUIDs and for generating timestamps. Ad hoc queries can now contain DISTINCT clauses, *-wildcards, and subqueries, and are fault-tolerant towards text-case issues.

  •  18 Dec 2012 03:20

    Release Notes: Data Quality KPIs can now be defined as formulas (mathematical expressions), not just raw metrics. It is now possible to run ad hoc SQL queries against all datastores (DB, CSV, Excel, and more). A new analysis option, the Value matcher, was added. With this analysis, it's easy to identify unexpected values in a field. Management of jobs, including copying and deleting jobs, has been made a lot easier by exposing the functionality directly in the UI. It is now possible to change historic data quality metrics in order to reposition results in the timeline.
