Python MapReduce library written in Cython.

Related tags

Miscellaneoushadoopy
Overview
Brandyn White 
  
Andrew Miller 
  
   

Source  
   https://github.com/bwhite/hadoopy/
Issues  
   https://github.com/bwhite/hadoopy/issues
Docs    
   http://bwhite.github.com/hadoopy/

IRC: #hadoopy @ freenode.net

Requirements
python development headers (python-dev), build tools (build-essential)

Optional
cython (>=.13) (without this it falls back to the pregenerated .c files)

Features
- oozie support
- Automated job parallelization 'auto-oozie' available in the hadoopy_flow project (maintained out of branch)
- typedbytes support (very fast)
- Local execution of unmodified MapReduce job with launch_local
- Read/write sequence files of TypedBytes directly to HDFS from python (readtb, writetb)
- Works on OS X
- Allows printing to stdout and stderr in Hadoop tasks without causing problems (uses the 'pipe hopping' technique, both are available in the task's stderr)
- critical path is in Cython
- works on clusters without any extra installation, Python, or any Python libraries (uses Pyinstaller that is included in this source tree)
- Simple HDFS access (readtb and ls) inside Python, even inside running jobs
- Unit test interface
- Reporting using status and counters (and print statements! no need to be scared of them in Hadoopy)
- Supports design patterns in the Lin/Dyer book (
   http://www.umiacs.umd.edu/~jimmylin/book.html)

Limitations
- Hadoop Local currently unsupported due to a bug in Hadoop's handling of the distributed cache in this mode.  Use psuedo-distributed instead for now.  (
   https://github.com/bwhite/hadoopy/issues/40)

Used in
- A Case for Query by Image and Text Content: Searching Computer Help using Screenshots and Keywords (to appear in WWW'11)
- Web-Scale Computer Vision using MapReduce for Multimedia Data Mining (at KDD'10)
- Vitrieve: Visual Search engine
- Picarus: Hadoop computer vision toolbox

Ubuntu Install (others are similar)
sudo apt-get install python-dev build-essential
sudo python setup.py install
  
 
Owner
Brandyn White
Brandyn White
A Python library to simulate a Zoom H6 recorder remote control

H6 A Python library to emulate a Zoom H6 recorder remote control Introduction This library allows you to control your Zoom H6 recorder from your compu

Matias Godoy 68 Nov 02, 2022
Modelling and Implementation of Cable Driven Parallel Manipulator System with Tension Control

Cable Driven Parallel Robots (CDPR) is also known as Cable-Suspended Robots are the emerging and flexible end effector manipulation system. Cable-driven parallel robots (CDPRs) are categorized as a t

Siddharth U 0 Jul 19, 2022
Open Source Management System for Botanic Garden Collections.

BotGard 3.0 Open Source Management System for Botanic Garden Collections built and maintained by netzkolchose.de in cooperation with the Botanical Gar

netzkolchose.de 1 Dec 15, 2021
Powerful Assistant

Delta-Assistant Hi I'm Phoenix This project is a smart assistant This is the 1.0 version of this project I am currently working on the third version o

1 Nov 17, 2021
A conda-smithy repository for boost-histogram.

The official Boost.Histogram Python bindings. Provides fast, efficient histogramming with a variety of different storages combined with dozens of composable axes. Part of the Scikit-HEP family.

conda-forge 0 Dec 17, 2021
Automated GitHub profile content using the USGS API, Plotly and GitHub Actions.

Top 20 Largest Earthquakes in the Past 24 Hours Location Mag Date and Time (UTC) 92 km SW of Sechura, Peru 5.2 11-05-2021 23:19:50 113 km NNE of Lobuj

Mr. Phantom 28 Oct 31, 2022
The third home of the bare Programming Language (1st there's my heart, the forest came second and then there's Github :)

The third home of the bare Programming Language (1st there's my heart, the forest came second and then there's Github :)

Garren Souza 7 Dec 24, 2022
Online HackerRank problem solving challenges

LinkedListHackerRank Online HackerRank problem solving challenges This challenge is part of a tutorial track by MyCodeSchool You are given the pointer

Sefineh Tesfa 1 Nov 21, 2021
Script to check if your Bistromatic handle everything as it should.

Bistromatic Checker Script to check if your Bistromatic handle everything as it should. The bistromatic is the project marking the end of the CPool at

Mathias 1 Dec 27, 2021
CHIP-8 interpreter written in Python

chip8py CHIP-8 interpreter written in Python Contents About Installation Usage License About CHIP-8 is an interpreted language developed during the 19

Robert Olaru 1 Nov 09, 2021
Account Manager / Nuker with GUI.

Account Manager / Nuker Remove all friends Block all friends Leave all servers Mass create servers Close all dms Mass dm Exit Setup git clone https://

Lodi#0001 1 Oct 23, 2021
A notebook explaining the principle of adversarial attacks and their defences

TL;DR: A notebook explaining the principle of adversarial attacks and their defences Abstract: Deep neural networks models have been wildly successful

1 Jan 22, 2022
Flask html response minifier

Flask-HTMLmin Minify flask text/html mime type responses. Just add MINIFY_HTML = True to your deployment config to minify HTML and text responses of y

Hamid Feizabadi 85 Dec 07, 2022
This repository can help you made a PocketMine-MP Server with Termux apps!

Hello This GitHub repository can made you a Server PocketMine-MP On development! How to Install Open Termux Type "pkg install git && python" If python

1 Mar 04, 2022
A collection of some leetcode challenges in python and JavaScript

Python and Javascript Coding Challenges Some leetcode questions I'm currently working on to open up my mind to better ways of problem solving. Impleme

Ted Ngeene 1 Dec 20, 2021
A simple script that shows important photography times. written in python.

A simple script that shows important photography times. written in python.

John Evans 13 Oct 16, 2022
Chemical Analysis Calculator, with full solution display.

Chemicology Chemical Analysis Calculator, to solve problems efficiently by displaying whole solution. Go to releases for downloading .exe, .dmg, Linux

Muhammad Moazzam 2 Aug 06, 2022
This Open-Source project is great for sensor capture and storage solutions.

Phase 1 This project helps developers in the creation of extended realities that communicate with Arduino and require the security of blockchain stora

Wolfberry, LLC 10 Dec 28, 2022
Python package that mirrors the original Nodejs ReplAPI-It.

Python-ReplAPI-It Python package that mirrors the original Nodejs ReplAPI-It. Contributing First fork the repo: $ git clone https://github.com/ReplAPI

The ReplAPI.it Project 10 Jun 05, 2022
A tool to replace all osu beatmap backgrounds at once.

OsuBgTool A tool to replace all osu beatmap backgrounds at once. Requirements You need to have python 3.6 or newer installed. That's it. How to Use Ju

Aditya Gupta 1 Oct 24, 2021