
All releases of DataCleaner

  •  16 May 2011 10:09

    Release Notes: Quick filtering of datastores was added. Reference data for countries is now provided. Minor UI improvements were made. Support was added for adding extension packages. A command line interface for executing jobs was added. Number formatting options were added in the "Convert to Number" transformer.

    •  04 Apr 2011 08:31

      Release Notes: Window management was simplified by making most operations available through the single job builder window. Jobs are now stoppable before they have finished. Bar and line charts have been added to many analyzer results. Preview data now contains paging controls to browse further into the data. Most common database drivers are included by default. Various minor improvements and bugfixes were made.

      •  07 Mar 2011 13:10

        Release Notes: First-time ease of use was improved by disabling all buttons before source data is selected. When possible, filters can now optimize the query of a job. This was implemented for the "Max rows", "Equals", and "Not null" filters. The visualization of execution flow now allows removing column items and filter outcome items, making the graph more comprehensible, especially for very large jobs. A bug was fixed when passing null values to the email standardizer. "Mixed" tokens are properly presented in the Pattern finder.

        •  21 Feb 2011 12:04

          Release Notes: Minor bugfixes and improvements were made. Filter outcomes were added to the flow visualization. A bug was fixed in the widget for selecting the tokenizer's tokens. The "Equals" filter can now have multiple values with which to compare. Some minor cosmetic improvements were made.

          •  14 Feb 2011 08:29

            Release Notes: Data transformations can be used to preprocess, extract, refine, combine, and calculate data items in jobs. Filtering, sampling, and subflow management allow you to define criteria that exclude and include particular items of data. Reporting was enriched with charts, graphs, navigation trees, etc. New DQ functions were added for date gap analysis, phonetic similarity finding, synonym lookups, etc. More options and DQ measures were added for existing data quality functions like the pattern finder, string analyzer, and more. Profiling jobs can be reused, so you can define your processing flow once and run it on any data. Support for MS Excel 2007+ spreadsheets was added.

            •  15 May 2010 11:27

              Release Notes: The MetaModel version was updated to 1.2, which adds support for two new datastores: dBase databases (.dbf files) and MS Access databases (.mdb files). A bug pertaining to text-file dictionary "file not found" errors was fixed. A lot of the other underlying libraries have been updated, providing improvements to performance and stability.

              •  18 Oct 2009 15:52

                Release Notes: Excel spreadsheet support, SQL Server support, and performance for CSV files were improved. A bug that caused certain database connection errors to go unreported to the user was fixed. A bug that caused re-opening of database dictionaries to throw an NPE was fixed. A bug related to dictionary lookups of null values was fixed. Support for Teradata databases was added. Connection templates for SQL Server connections were added. The file encoding can now be selected when reading CSV files. A minor bug relating to reading files on the classpath when running in Java WebStart mode was fixed.

                •  14 Jul 2009 08:47

                  Release Notes: Memory use of the Value Distribution Profile was improved. It now does on-disk caching with Berkeley DB when necessary. The app is now a single JAR file that can be served through Java WebStart. The app automatically downloads regexes from the RegexSwap. A bug in matching number columns in dictionaries was fixed. A bug with invalid characters in XML formats was fixed. Suffix case is now ignored so that both .CSV and .csv files can be opened. The number of columns shown in the preview window is automatically restricted if there are too many to show on the screen.

                  •  20 Apr 2009 17:55

                    Release Notes: An additional HTML export format has been added to the built-in export formats (usable when exporting Profiler results in the desktop app and when executing the runjob command-line tool). The export format can be chosen directly from the desktop app. Four new measures were added to the String Analysis profile: average characters and maximum/minimum/average white spaces.

                    •  15 Mar 2009 18:55

                      Release Notes: The license was changed to LGPL. The profiler and validator can be executed using multiple threads. DataCleaner tasks can be executed from the command line for batch operation. More elaborate status information is given during profiler and validator execution. Date mask matcher and regex matcher profiles were added. A regex is loaded from the online RegexSwap repository. Popular database drivers are automatically downloaded and installed. More file types are supported, such as .dat and .txt. XML file support was improved. Memory improvements were made in the Time analysis profile. Logging when running profiling and validation was improved. An information schema is provided for file-based datastores. Columns in the datastore-tree are lazy-loaded.
