6 projects tagged "mapreduce"

jug (updated 15 Apr 2013)

Pop 191.05
Vit 24.72

Jug is a task-based parallelism framework. Jug allows you to write code that is broken up into tasks and run different tasks on different processors. It uses the filesystem to communicate between processes and works correctly over NFS, so you can coordinate processes on different machines. Jug is a pure Python implementation and should work on any platform that can run Python.
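The idea of coordinating tasks through a shared directory can be sketched with the standard library alone. This is a minimal illustration, not Jug's actual implementation or API: the function `run_task_once`, the `.lock`/`.result` file naming, and the pickle-based storage are all assumptions made for the example, and it makes no claim about behaving correctly on every network filesystem.

```python
import os
import pickle
import tempfile


def run_task_once(store_dir, name, func, *args):
    """Run func(*args) unless the task is already finished or claimed.

    Returns (True, value) when a result is available and (False, None)
    when another process currently holds the task's lock file.
    Hypothetical helper for illustration only.
    """
    result_path = os.path.join(store_dir, name + ".result")
    lock_path = os.path.join(store_dir, name + ".lock")

    # A finished task leaves its pickled result behind; reuse it.
    if os.path.exists(result_path):
        with open(result_path, "rb") as fh:
            return True, pickle.load(fh)

    try:
        # O_CREAT | O_EXCL fails if the file exists, so only one claimant wins.
        fd = os.open(lock_path, os.O_CREAT | os.O_EXCL | os.O_WRONLY)
    except FileExistsError:
        return False, None
    os.close(fd)

    try:
        value = func(*args)
        # Write to a temporary name, then rename, so readers never see
        # a half-written result file.
        tmp_path = result_path + ".tmp"
        with open(tmp_path, "wb") as fh:
            pickle.dump(value, fh)
        os.replace(tmp_path, result_path)
        return True, value
    finally:
        os.remove(lock_path)


if __name__ == "__main__":
    with tempfile.TemporaryDirectory() as store:
        print(run_task_once(store, "square-7", pow, 7, 2))
```

Several worker processes pointed at the same directory would each skip tasks that are locked or already have a result, which is the general shape of filesystem-based coordination described above.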

Hadoop Studio (updated 09 Apr 2010; no download)

Pop 127.22
Vit 2.72

Hadoop Studio is a map-reduce integrated development environment (IDE) based on NetBeans. It makes it easy to create, understand, and debug map-reduce applications based on Hadoop, without requiring development-time access to a map-reduce cluster. The studio provides a workflow view of a map-reduce job, which displays the individual inputs, outputs, and interactions between the phases of the job and updates in real time as the developer changes the code. The studio then generates Java sources and compiles them into a binary jar file, which can be run on a normal Hadoop cluster.

dispy (updated 25 Oct 2012; no download)

Pop 75.76
Vit 2.47

dispy is a Python framework for parallel execution of computations by distributing them across multiple processors in a single machine (SMP), or among many machines in a cluster or grid. The computations can be standalone programs or Python functions. dispy is well suited for the data parallel (SIMD) paradigm where a computation is evaluated with different (large) datasets independently (similar to Hadoop, MapReduce, Parallel Python). dispy features include automatic distribution of dependencies (files, Python functions, classes, modules), client-side and server-side fault recovery, scheduling of computations to specific nodes, encryption for security, sharing of computation resources if desired, and more.
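The data-parallel pattern described here, one computation evaluated independently over several datasets, can be sketched on a single machine with Python's standard `concurrent.futures` module. This does not use dispy's own API; `column_mean` and `evaluate_all` are hypothetical names chosen for the example.

```python
from concurrent.futures import ProcessPoolExecutor, ThreadPoolExecutor


def column_mean(dataset):
    """The computation applied to each dataset independently."""
    return sum(dataset) / len(dataset)


def evaluate_all(compute, datasets, executor):
    """Apply compute to every dataset through the given executor.

    Results come back in the same order as the input datasets.
    """
    return list(executor.map(compute, datasets))


if __name__ == "__main__":
    datasets = [[1, 2, 3], [10, 20], [5]]
    # A process pool spreads the work across the local processors.
    with ProcessPoolExecutor() as pool:
        print(evaluate_all(column_mean, datasets, pool))
```

Passing the executor in as an argument keeps the sketch independent of where the work runs; a cluster framework would replace the local pool with remote nodes.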

Plasma (updated 03 Nov 2011; no download)

Pop 64.05
Vit 1.94

Plasma implements the map/reduce framework on a compute cluster. It has its own distributed filesystem, PlasmaFS, which is transactional (ACID), reliable, and fast, and which provides a complete set of file operations. PlasmaFS can be accessed via an RPC protocol or via NFS (i.e., it is mountable). Additionally, there is a key/value database on top of PlasmaFS.
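For readers unfamiliar with the map/reduce model itself, the following is a minimal single-process sketch of its three phases (map, group by key, reduce) applied to word counting. It is unrelated to Plasma's code or to PlasmaFS; all function names are illustrative assumptions.

```python
from collections import defaultdict


def map_phase(records, mapper):
    """Apply the mapper to each record and yield its (key, value) pairs."""
    for record in records:
        yield from mapper(record)


def shuffle(pairs):
    """Group values by key, as a framework does between map and reduce."""
    groups = defaultdict(list)
    for key, value in pairs:
        groups[key].append(value)
    return groups


def reduce_phase(groups, reducer):
    """Collapse each key's list of values into a single result."""
    return {key: reducer(key, values) for key, values in groups.items()}


def word_mapper(line):
    """Emit (word, 1) for every whitespace-separated word in the line."""
    for word in line.split():
        yield word.lower(), 1


def count_reducer(key, values):
    """Sum the counts collected for one word."""
    return sum(values)


def word_count(lines):
    """Run the full map, shuffle, reduce pipeline over the input lines."""
    return reduce_phase(shuffle(map_phase(lines, word_mapper)), count_reducer)


if __name__ == "__main__":
    print(word_count(["map reduce map", "Reduce cluster"]))
```

In a cluster implementation the map and reduce calls run on different machines and the grouped intermediate data lives in a distributed filesystem; the sketch only shows the data flow.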

PyMW (updated 30 Jun 2010)

Pop 47.75
Vit 1.00

PyMW is a Python module for parallel master-worker computing in a variety of environments. With the PyMW module, users can write a single program that scales from multicore machines to global computing platforms.

Beanstalker (updated 14 Jul 2011; no download, no website)

Pop 21.95
Vit 26.06

Beanstalker is a set of Maven plugins for Amazon Web Services (AWS) Elastic Beanstalk and Elastic MapReduce. The plugin Mojos are suitable not only for command-line use but also for continuous integration.
