8 projects tagged "hadoop"

Gfarm (updated 14 May 2013; website available, no download)

Pop 98.47
Vit 5.32

Gfarm is a distributed filesystem, generally used for large-scale cluster computing. It is implemented in userland and can be mounted via FUSE. It uses file locality when choosing which data node to access, and supports Globus GSI for wide-area networks. Users can explicitly control where file replicas are placed. Gfarm can be used as an alternative to HDFS as the storage system for Hadoop, and as storage for Samba, MPI-IO, and GridFTP. Monitoring via ZABBIX and Ganglia is also supported.

Beanstalker (updated 14 Jul 2011; no website, no download)

Pop 20.42
Vit 31.77

Beanstalker is a set of Maven plugins for Amazon Web Services (AWS) Elastic Beanstalk and Elastic MapReduce. The plugin Mojos are suitable for continuous integration as well as command-line use.

Hypertable (updated 17 Mar 2014; download available, no website)

Pop 347.03
Vit 27.53

Hypertable is a high-performance, scalable database modeled after Google's Bigtable. It is designed to manage the storage and processing of information on a large cluster of commodity servers, and to remain resilient to machine and component failures.

Syoncloud Logs (updated 18 Jun 2012; download available, no website)

Pop 70.55
Vit 1.46

Syoncloud Logs processes log files from various applications across many servers. It can capture business-relevant information from the everyday log files generated by Web servers, business applications, and back-office applications. It uses Flume sinks that run on the machines that produce the log files. The data is filtered, and relevant events are channeled to HBase, a NoSQL database that is used for the actual data analysis. The number of HBase nodes depends on the volume of log files processed. Syoncloud Logs has an easy-to-use installer that includes all necessary components, such as Hadoop, Flume, HBase, and ZooKeeper.

MapReduce-BitDew (updated 04 Jun 2012; website and download available)

Pop 57.31
Vit 26.13

MapReduce-BitDew is an implementation, for Internet desktop grids, of the MapReduce programming model proposed by Google. Using MapReduce-BitDew, you can execute MapReduce applications on resources such as desktop PCs distributed across the Internet. MapReduce-BitDew features a firewall-friendly protocol, fault tolerance, result certification, two-level schedulers, and more.

dispy (updated 25 Oct 2012; website available, no download)

Pop 68.23
Vit 2.19

dispy is a Python framework for parallel execution of computations by distributing them across multiple processors in a single machine (SMP), or among many machines in a cluster or grid. The computations can be standalone programs or Python functions. dispy is well suited to the data-parallel (SIMD) paradigm, in which the same computation is evaluated independently on different (large) datasets (similar to Hadoop, MapReduce, and Parallel Python). dispy's features include automatic distribution of dependencies (files, Python functions, classes, modules), client-side and server-side fault recovery, scheduling of computations to specific nodes, encryption for security, optional sharing of computation resources, and more.
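The data-parallel pattern described above can be sketched with the Python standard library alone. This is not dispy's API: `compute` and `run_data_parallel` are hypothetical names, and a local thread pool stands in for the cluster nodes, so the sketch only illustrates the idea of applying one computation to several independent datasets.

```python
from concurrent.futures import ThreadPoolExecutor


def compute(dataset):
    """The single computation that is applied to each dataset independently."""
    return sum(x * x for x in dataset)


def run_data_parallel(datasets, workers=4):
    """Evaluate compute() on every dataset; results are returned in input order.

    A thread pool is an assumption made for illustration; a framework such
    as dispy would dispatch the same jobs to other processors or machines.
    """
    with ThreadPoolExecutor(max_workers=workers) as pool:
        return list(pool.map(compute, datasets))


# Three independent datasets, one job each.
datasets = [range(10), range(100), range(1000)]
results = run_data_parallel(datasets)
print(results)
```

Because the jobs share no state, the same structure carries over unchanged when the workers are separate machines rather than threads.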

Infovore (updated 14 Apr 2014; download available, no website)

Pop 614.83
Vit 43.40

Infovore is a map/reduce framework for processing large RDF data sets such as Freebase and DBpedia. It is based on Hadoop.

Telepath (updated 18 Feb 2014; no website, no download)

Pop 234.72
Vit 2.61

Telepath provides map/reduce code for processing Wikipedia Pagecounts. The Pagecounts files contain hourly usage data for all Wikipedia pages in all languages. Derived from the bakemono toolkit, this project can process this 3 TB data set with ease.
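As a rough illustration of what map/reduce over page-view data looks like, the sketch below sums hourly view counts per page. It is not Telepath's code: the function names are hypothetical, and the input layout (four space-separated fields: project code, page title, view count, bytes transferred) is an assumption about the Pagecounts format.

```python
from collections import defaultdict


def map_line(line):
    """Map step: turn one input line into ((project, title), views) pairs.

    Assumes four whitespace-separated fields; anything else is skipped.
    """
    parts = line.split()
    if len(parts) != 4 or not parts[2].isdigit():
        return []
    project, title, views, _size = parts
    return [((project, title), int(views))]


def reduce_counts(pairs):
    """Reduce step: sum the view counts that share the same key."""
    totals = defaultdict(int)
    for key, views in pairs:
        totals[key] += views
    return dict(totals)


# Two made-up hourly files, including one malformed line.
hour_1 = ["en Main_Page 120 4096", "de Hadoop 7 2048", "malformed line"]
hour_2 = ["en Main_Page 80 4096", "en Hadoop 15 1024"]

pairs = [pair for line in hour_1 + hour_2 for pair in map_line(line)]
totals = reduce_counts(pairs)
print(totals)
```

On a Hadoop cluster the framework would group the mapped pairs by key before the reduce step; here a dictionary plays that role in a single process.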
