tblutils is a collection of utilities for working with tabular text files: data written in plain text, with one row per line and columns separated by a common character (usually TAB or semicolon). It complements the usual Unix tools like cut and paste by providing enhanced versions that support column labels throughout, so that you can extract columns by name (tblcut), filter data using a mathematical expression (tblfilter), re-order columns without caring about the column index (tblcsort), join multiple files on a common index without having to pre-sort them (tblmerge), and much more.
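To make the idea of label-aware tools concrete, here is a minimal Python sketch of name-based column extraction and a join on a shared key that needs no pre-sorting. It is not tblutils code and does not reproduce the tblcut or tblmerge command lines; the function names `cut_by_name` and `merge_on_key` and the sample tables are hypothetical, chosen only for illustration.

```python
import csv
import io


def cut_by_name(text, names, delimiter="\t"):
    """Return each data row restricted to the named columns, in the order requested."""
    reader = csv.DictReader(io.StringIO(text), delimiter=delimiter)
    return [[row[name] for name in names] for row in reader]


def merge_on_key(left_text, right_text, key, delimiter="\t"):
    """Inner-join two labelled tables on a shared column.

    The right table is indexed in a dictionary first, so neither input
    has to be sorted on the key beforehand.
    """
    right_reader = csv.DictReader(io.StringIO(right_text), delimiter=delimiter)
    right = {row[key]: row for row in right_reader}
    merged = []
    for row in csv.DictReader(io.StringIO(left_text), delimiter=delimiter):
        match = right.get(row[key])
        if match is not None:
            merged.append({**row, **match})
    return merged


# Hypothetical sample data: two TAB-separated tables with a header row each.
people = "id\tname\tage\n2\tBob\t41\n1\tAda\t36\n"
cities = "id\tcity\n1\tLondon\n2\tParis\n"

selected = cut_by_name(people, ["name", "id"])
joined = merge_on_key(people, cities, "id")
```

Because columns are addressed by label, the caller never needs to know that `name` is the second field, and the join works even though the two tables list their ids in different orders.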
Concerted is a highly scalable multivariate in-memory data storage library. It has built-in support for multivariate queries and performs well across many kinds of workloads. Data is read into and held in several in-memory data structures, and the contents of those structures are written to disk. It is fully ACID compliant, has APIs that abstract implementation details from the user, and provides an easy-to-use interface. Several types of indexes are provided, along with facilities that give access to the data for analytics.
RecDB is a recommendation engine built entirely inside PostgreSQL 9.2. It allows application developers to build recommendation applications using a wide variety of built-in recommendation algorithms such as user-user collaborative filtering, item-item collaborative filtering, and singular value decomposition. Applications powered by RecDB can produce online and flexible personalized recommendations for end users. It is easy to use and configure, and allows novice developers to define a variety of recommenders that fit their application's needs in a few lines of SQL. It can seamlessly integrate recommendation functionality with traditional database operations.
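For readers unfamiliar with item-item collaborative filtering, the following is a small Python sketch of the general technique: items are compared by the cosine similarity of their rating vectors, and a user's unknown rating is predicted as a similarity-weighted average of that user's known ratings. It illustrates the algorithm family only. It is not RecDB's implementation or its SQL interface, and the rating data and function names are made up for the example.

```python
from math import sqrt

# Hypothetical ratings: user -> {item: score}.
ratings = {
    "alice": {"A": 5.0, "B": 3.0},
    "bob": {"A": 4.0, "B": 2.0, "C": 5.0},
    "carol": {"A": 1.0, "C": 2.0},
}


def item_vectors(ratings):
    """Invert the user -> item mapping into item -> {user: score}."""
    items = {}
    for user, scores in ratings.items():
        for item, score in scores.items():
            items.setdefault(item, {})[user] = score
    return items


def cosine(a, b):
    """Cosine similarity of two sparse rating vectors keyed by user."""
    shared = set(a) & set(b)
    if not shared:
        return 0.0
    dot = sum(a[u] * b[u] for u in shared)
    norm_a = sqrt(sum(v * v for v in a.values()))
    norm_b = sqrt(sum(v * v for v in b.values()))
    if norm_a == 0 or norm_b == 0:
        return 0.0
    return dot / (norm_a * norm_b)


def predict(ratings, user, item):
    """Predict a rating as the similarity-weighted mean of the user's own ratings."""
    items = item_vectors(ratings)
    numerator = denominator = 0.0
    for other, score in ratings[user].items():
        sim = cosine(items[item], items[other])
        numerator += sim * score
        denominator += abs(sim)
    return numerator / denominator if denominator else None


predicted = predict(ratings, "alice", "C")
```

Production systems typically precompute and store the item similarity table rather than recomputing it per query, which is the kind of work a database-resident engine can take over.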
statsmodels is a Python package which provides a complement to scipy for statistical computations, including descriptive statistics and estimation of statistical models. The main included model categories are linear, discrete, generalized linear, and robust linear models, and, in time series analysis, AR, ARMA, and VAR. It also includes statistical tests, mainly for regression diagnostics. statsmodels was renamed from scikits.statsmodels.
scikits.statsmodels is a Python package which provides a complement to scipy for statistical computations, including descriptive statistics and estimation of statistical models. The main included model categories are linear, discrete, generalized linear, and robust linear models, and, in time series analysis, AR, ARMA, and VAR. It also includes statistical tests mainly for regression diagnostics.
Kst is a fast real-time large-dataset viewing and plotting tool with built-in data analysis functionality. It contains many powerful built-in features and is expandable with plugins and extensions. It features powerful keyboard and mouse plot manipulation, a large selection of built-in plotting and data manipulation functions (such as histograms, equations, and power spectra), built-in filtering and curve fitting capabilities, a convenient command-line interface, a powerful graphical user interface with non-modal dialogs for an optimized workflow, support for several popular data formats, extended annotation objects similar to vector graphics applications, and high-quality export to bitmap or vector formats.
The CloverETL Profiler is data profiling software that examines data in existing data sources and collects statistical information about it. This removes the guesswork from analysis by showing what data actually exists and what needs to be improved. The Profiler is part of the CloverETL Data Integration family and extends its toolset. Whether run as a standalone job or as part of a larger project, it operates on the same engine as CloverETL, offering high performance, speed, and easy deployment in a data environment.
Cubes is a Python framework for online analytical processing (OLAP), multidimensional analysis, star and snowflake schema denormalization, and cube computation. It features a logical model that describes how data are analyzed and reported, independent of the physical data implementation; hierarchical dimensions (attributes that have hierarchical dependencies, such as category-subcategory or country-region); and localizable metadata and data localization.
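The notion of a hierarchical dimension can be sketched in a few lines of plain Python: the same facts are aggregated at different depths of a country-region hierarchy. This does not use the Cubes API; the fact rows, the `GEOGRAPHY` hierarchy, and the `aggregate` helper are hypothetical and serve only to show what rolling up along a hierarchy means.

```python
from collections import defaultdict

# Hypothetical fact rows with one measure ("amount") and a two-level dimension.
facts = [
    {"country": "SK", "region": "Bratislava", "amount": 10},
    {"country": "SK", "region": "Kosice", "amount": 5},
    {"country": "CZ", "region": "Prague", "amount": 7},
]

# Levels of the dimension, from coarsest to finest.
GEOGRAPHY = ["country", "region"]


def aggregate(facts, hierarchy, depth, measure="amount"):
    """Sum a measure grouped by the first `depth` levels of a hierarchy.

    depth=0 gives the grand total, depth=1 groups by the top level, and so on.
    """
    levels = hierarchy[:depth]
    totals = defaultdict(int)
    for fact in facts:
        key = tuple(fact[level] for level in levels)
        totals[key] += fact[measure]
    return dict(totals)


grand_total = aggregate(facts, GEOGRAPHY, 0)
by_country = aggregate(facts, GEOGRAPHY, 1)
by_region = aggregate(facts, GEOGRAPHY, 2)
```

Moving from `by_country` to `by_region` corresponds to a drill-down, and the reverse to a roll-up; a logical model lets reports ask for these levels by name without knowing how the underlying tables are laid out.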