Brandyn WhiteAndrew Miller Source https://github.com/bwhite/hadoopy/ Issues https://github.com/bwhite/hadoopy/issues Docs http://bwhite.github.com/hadoopy/ IRC: #hadoopy @ freenode.net Requirements python development headers (python-dev), build tools (build-essential) Optional cython (>=.13) (without this it falls back to the pregenerated .c files) Features - oozie support - Automated job parallelization 'auto-oozie' available in the hadoopy_flow project (maintained out of branch) - typedbytes support (very fast) - Local execution of unmodified MapReduce job with launch_local - Read/write sequence files of TypedBytes directly to HDFS from python (readtb, writetb) - Works on OS X - Allows printing to stdout and stderr in Hadoop tasks without causing problems (uses the 'pipe hopping' technique, both are available in the task's stderr) - critical path is in Cython - works on clusters without any extra installation, Python, or any Python libraries (uses Pyinstaller that is included in this source tree) - Simple HDFS access (readtb and ls) inside Python, even inside running jobs - Unit test interface - Reporting using status and counters (and print statements! no need to be scared of them in Hadoopy) - Supports design patterns in the Lin/Dyer book ( http://www.umiacs.umd.edu/~jimmylin/book.html) Limitations - Hadoop Local currently unsupported due to a bug in Hadoop's handling of the distributed cache in this mode. Use psuedo-distributed instead for now. ( https://github.com/bwhite/hadoopy/issues/40) Used in - A Case for Query by Image and Text Content: Searching Computer Help using Screenshots and Keywords (to appear in WWW'11) - Web-Scale Computer Vision using MapReduce for Multimedia Data Mining (at KDD'10) - Vitrieve: Visual Search engine - Picarus: Hadoop computer vision toolbox Ubuntu Install (others are similar) sudo apt-get install python-dev build-essential sudo python setup.py install
Python MapReduce library written in Cython.
Overview
A tool for checking if the external data used in Flatpak manifests is still up to date
Flatpak External Data Checker This is a tool for checking for outdated or broken links of external data in Flatpak manifests. Motivation Flatpak apps
Python library to decorate and beautify strings
outputformater Python library to decorate and beautify your standard output 💖 I
I³ Tracker for Essential Open Innovation Datasets
I³ Tracker for Essential Open Innovation Datasets This repository is set up to track, version, and contribute updates to the I³ Essential Open Innovat
Sudoku solver using backtracking
Sudoku solver Sudoku solver using backtracking Basically in sudoku, we want to be able to solve a sudoku puzzle given an input like this, which repres
ARK sõidueksami Matrixi bot
ARK Sõidueksami bot Küsib ARK-i lehelt uusimad eksami ajad ja saadab sõnumi Matrixi kanali Dev setup Linux python3 -m venv venv source venv/bin/activa
A conda-smithy repository for boost-histogram.
The official Boost.Histogram Python bindings. Provides fast, efficient histogramming with a variety of different storages combined with dozens of composable axes. Part of the Scikit-HEP family.
A collection of tips for using MISP.
MISP Tip of the Week A collection of tips for using MISP. Published via BelgoMISP (todo) and this repository. Available in MD and JSON. Do you want to
Automated moth pictures for biodiversity research
Automated moth pictures for biodiversity research
Add-In for Blender to automatically save files when rendering
Autosave - Render: Automatically save .blend, .png and readme.txt files when rendering with Blender Purpose This Blender Add-On provides an easy way t
万能通用对象池,可以池化任意自定义类型的对象。
pip install universal_object_pool 此包能够将一切任意类型的python对象池化,是万能池,适用范围远大于单一用途的mysql连接池 http连接池等。 框架使用对象池包,自带实现了4个对象池。可以直接开箱用这四个对象池,也可以作为例子学习对象池用法。
LiteX-Acorn-Baseboard is a baseboard developed around the SQRL's Acorn board (or Nite/LiteFury) expanding their possibilities
LiteX-Acorn-Baseboard is a baseboard developed around the SQRL's Acorn board (or Nite/LiteFury) expanding their possibilities
Update your Nintendo Switch cheats with one click, or a bit more~
Interactive-ASM-Cheats-Updater This updater unlocks your ability of updating most of the ASM cheats for Nintendo Switch. Table of Contents Functions Q
Darkflame Universe Account Manager
Darkflame Universe Account Manager This is a quick and simple web application intended for account creation and management for a DLU instance created
Animations made using manim-ce
ManimCE Animations Animations made using manim-ce The code turned out to be a bit complicated than expected.. It can be greatly simplified, will work
🍞 Create dynamic spreadsheets with arbitrary layouts using Python
🍞 tartine What this is Installation Usage example Fetching some data Getting started Adding a header Linking more cells Cell formatting API reference
The third home of the bare Programming Language (1st there's my heart, the forest came second and then there's Github :)
The third home of the bare Programming Language (1st there's my heart, the forest came second and then there's Github :)
Code and yara rules to detect and analyze Cobalt Strike
Cobalt Strike Resources This repository contains: analyze.py: a script to analyze a Cobalt Strike beacon (python analyze.py BEACON) extract.py; extrac
Here You will Find CodeChef Challenge Solutions
Here You will Find CodeChef Challenge Solutions
Unzip Japanese Shift-JIS zip archives on non-Japanese systems.
Unzip JP GUI Unzip Japanese Shift-JIS zip archives on non-Japanese systems. This script unzips the file while converting the file names from Shift-JIS
Python package for reference counting native pointers
refcount master: testing: This package is primarily for managing resources in native libraries, written for instance in C++, from Python. While it boi