Opendedup is a deduplication-based filesystem and block device designed to provide inline deduplication and flexibility for applications. It benefits services such as backup, archiving, NAS storage, and virtual machine primary and secondary storage, and can be deployed in either a standalone or a distributed, multi-node configuration. Standalone, it provides inline deduplication, replication, and unlimited snapshot capabilities. In a multi-node configuration, it adds global intra-volume deduplication, block storage redundancy, and block storage expandability. Volumes in a multi-node configuration store and share unique data blocks with other volumes within the cluster. These volumes can also specify a level of redundancy for data stored in the cluster.
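The idea behind inline deduplication can be illustrated with a minimal sketch. It is not Opendedup's implementation: the fixed 4096-byte block size, the use of SHA-256, and the `DedupStore` class are all assumptions made for illustration. Each block is stored once under its content hash, and files hold only lists of hashes.

```python
import hashlib

BLOCK_SIZE = 4096  # hypothetical block size, chosen only for illustration


class DedupStore:
    """Toy in-memory store that keeps each unique block exactly once."""

    def __init__(self):
        self.blocks = {}  # content hash -> block bytes
        self.files = {}   # file name -> ordered list of content hashes

    def write(self, name, data):
        digests = []
        for offset in range(0, len(data), BLOCK_SIZE):
            block = data[offset:offset + BLOCK_SIZE]
            digest = hashlib.sha256(block).hexdigest()
            # Store the block only if this content has not been seen before.
            self.blocks.setdefault(digest, block)
            digests.append(digest)
        self.files[name] = digests

    def read(self, name):
        # Reassemble the file from its block references.
        return b"".join(self.blocks[d] for d in self.files[name])


store = DedupStore()
store.write("a", b"x" * 8192)                 # two identical blocks
store.write("b", b"x" * 4096 + b"y" * 4096)   # one shared block, one new block
unique_blocks = len(store.blocks)
```

In this sketch the two files reference four blocks in total, but only the distinct block contents are kept in `store.blocks`.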
GlusterFS is a clustered filesystem capable of scaling to several petabytes. It aggregates various storage bricks over an InfiniBand RDMA or TCP/IP interconnect into one large parallel network file system. GlusterFS is based on a stackable user-space design without compromising performance. It allows access via Swift API, SMB, NFSv3, QEMU/KVM, OpenStack Compute, OpenStack Block Storage, Xen, CloudStack, HDFS API, oVirt, and more, all in a unified backend which enables multiple, simultaneous access points to the same data stores.
S3QL is a file system that stores all its data online. It supports Amazon S3, Google Storage, and OpenStack and effectively provides you with a hard disk of dynamic, infinite capacity that can be accessed from any computer with Internet access. S3QL provides a standard, full-featured Unix file system that is conceptually indistinguishable from any local file system. Additional features include compression, encryption, data de-duplication, immutable trees, and snapshotting, which make it especially suitable for online backup and archiving. The design favors simplicity and elegance over performance and feature creep. Care has been taken to make the source code as readable and serviceable as possible. Solid error detection, error handling, and extensive automated test cases are provided.
io-util is a small, scalable Java library for slicing and dicing fixed width tables on disk. The objective is to provide reusable blocks of code for building efficient, custom binary data stores. It allows you to build, search, and maintain a large, externally stored, fixed width, sorted table. The library user specifies the row width (in bytes), a row comparison function (which implicitly defines any given row's key), and an optional delete codec.
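A search over a fixed-width sorted table can be sketched as follows. This is an illustrative Python sketch, not io-util's Java API: the function names, the 8-byte row layout, and the 4-byte big-endian key are hypothetical. It shows how a row width plus a comparison function is enough to binary-search rows directly in a file.

```python
import io
import struct

ROW_WIDTH = 8  # hypothetical layout: 4-byte big-endian key + 4-byte payload


def make_row(key, payload):
    """Pack one fixed-width row (illustrative layout only)."""
    return struct.pack(">I4s", key, payload)


def compare_by_key(row, probe):
    """Compare two rows by their first 4 bytes; returns -1, 0, or 1."""
    a, b = row[:4], probe[:4]
    return (a > b) - (a < b)


def find_row(table, row_width, probe, compare):
    """Binary-search a sorted fixed-width table; return row index or -1."""
    table.seek(0, io.SEEK_END)
    count = table.tell() // row_width
    lo, hi = 0, count - 1
    while lo <= hi:
        mid = (lo + hi) // 2
        table.seek(mid * row_width)  # row offset is index * width
        row = table.read(row_width)
        order = compare(row, probe)
        if order == 0:
            return mid
        if order < 0:
            lo = mid + 1
        else:
            hi = mid - 1
    return -1


# Build a small sorted table in memory; a file opened in "rb" mode works the same way.
rows = [make_row(k, b"data") for k in (1, 5, 9, 200)]
table = io.BytesIO(b"".join(rows))
hit = find_row(table, ROW_WIDTH, make_row(9, b"\0" * 4), compare_by_key)
miss = find_row(table, ROW_WIDTH, make_row(7, b"\0" * 4), compare_by_key)
```

Because every row has the same width, the position of row `i` is simply `i * row_width`, so no separate index is needed for lookups.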
btier is a Linux kernel module that creates an auto-tiering block device. It can be used to aggregate various types of storage into a virtual block device. btier will automatically optimize data placement on the underlying devices according to a policy that can be set. The size of a btier device is equal to the combined size of all block devices that were assigned to it. Only a small amount of space is used to store the metadata of the device. A btier device can contain up to 16 physical devices or files. In addition to the built-in data migration engine, btier also provides a user-space API that allows users to write custom data migration engines. Python, C, and bash example code is included. btier can use raw devices or (sparse) files (even on hard-mounted NFS) as part of the tiering device. The last tier can therefore reside on a deduplicating or compressing filesystem when needed. The devices that are used with btier should be redundant, since a btier device will lose all data when one of the underlying devices is lost. The performance of btier is determined by the devices that are used for the first tier. It is known to scale up to 130k IOPS with a RAID1 consisting of modern PCIe SSDs. btier has support for SSD trim / discard, and can be configured in writeback or writethrough mode.
Concerted is a highly scalable multivariate in-memory data storage library. It has built-in support for multivariate queries and performs well over many styles of workloads. The data is read and stored in various data structures, and the data from those structures is written to disk. It is fully ACID compliant, has APIs that abstract details from the user, and provides an easy-to-use interface. Many different types of indexes are provided along with various systems that provide access for data analytics.
EJDB is an embedded JSON database engine. It aims to be a fast MongoDB-like NoSQL library that can be embedded into C/C++/Node.js/Python3/Lua applications. It features collection-level write locking, collection-level transactions, string token matching queries, and a Node.js binding.
UnQLite is an in-process software library which implements a self-contained, serverless, zero-configuration, transactional NoSQL database engine. It is a document store database similar to MongoDB, Redis, CouchDB, etc. as well as a standard key/value store similar to BerkeleyDB, LevelDB, etc. It reads and writes directly to ordinary disk files. A complete database with multiple collections is contained in a single disk file. The database file format is cross-platform, and you can freely copy a database between 32-bit and 64-bit systems or between big-endian and little-endian architectures.
TomP2P is a P2P-based high-performance key-value pair storage library. Each peer has a table (either disk-based or memory-based) to store its values. A single value can be queried or updated with a secondary key. The underlying communication framework uses Java NIO to handle many concurrent connections.
Zfswatcher is a ZFS storage pool monitoring and notification daemon. It periodically inspects the zpool status and sends configurable notifications on status changes such as disk failures. It also controls the disk enclosure LEDs. There is an embedded Web interface for displaying status and logs.