Astyanax -> Cassandra PoolTimeoutException during Authentication failure?

Recently I was working on implementing a custom IAuthenticator and IAuthority for Cassandra 1.1.1, because there is little to no security out of the box. For those of you familiar with Cassandra, its distribution used to include a simple property-file-based implementation of IAuthenticator and IAuthority that you could reference in your cassandra.yaml file …

USPS AIS bulk data loading with Hadoop mapreduce

Today I pushed some source to GitHub for a utility I was previously working on that loads data from USPS AIS data files into HBase/MySQL using Hadoop MapReduce and simpler data loaders (source: https://github.com/bitsofinfo/usps-ais-data-loader). This project was originally started to create a framework for loading data files from the USPS AIS suite of …

How to access your OpenShift MongoDB database remotely on OS-X

I recently started playing around with Red Hat's OpenShift PaaS and installed the MongoDB and RockMongo cartridges on my application. My use case was just to leverage the OpenShift platform to run my MongoDB instance for me, and I really was not ready (nor needing) to push an actual application out to the application running on OpenShift; …

Reading fixed length/width input records with Hadoop mapreduce

While working on a project where I needed to quickly import 50-100 million records, I ended up using Hadoop for the job. Unfortunately, the input files I was dealing with contained fixed width/length records: they had no delimiters between records, nor any CR/LFs to separate them. Each record was exactly …
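
To make the fixed-length idea concrete, here is a minimal sketch in plain Python (not the Hadoop code from the post). With no delimiters or line endings, the only way to find record boundaries is to count bytes. The function name `read_fixed_length_records` and the 10-byte record length are hypothetical, chosen purely for illustration; the real record length is not stated in this excerpt.

```python
import io
from typing import BinaryIO, Iterator


def read_fixed_length_records(stream: BinaryIO, record_length: int) -> Iterator[bytes]:
    """Yield consecutive records of exactly record_length bytes from a buffered stream."""
    if record_length <= 0:
        raise ValueError("record_length must be positive")
    while True:
        record = stream.read(record_length)
        if not record:
            # Clean end of input: the previous record ended exactly at EOF.
            return
        if len(record) != record_length:
            # A short final chunk means the input is not a whole number of records.
            raise ValueError(
                f"truncated record: got {len(record)} bytes, expected {record_length}"
            )
        yield record


# Hypothetical sample: three 10-byte records with no separators between them.
data = b"AAAAAAAAAA" + b"BBBBBBBBBB" + b"CCCCCCCCCC"
records = list(read_fixed_length_records(io.BytesIO(data), 10))
```

The same byte-counting rule is what any splitter for such files has to respect: a chunk of the input can only start and end at a multiple of the record length, otherwise a record would be cut in half.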