Archive for the ‘NOSQL’ Category

Reading fixed length/width input records with Hadoop mapreduce

While working on a project where I needed to quickly import 50-100 million records I ended up using Hadoop for the job. Unfortunately the input files I was dealing with were fixed width/length records, hence they had no delimiters which separated records, nor did they have any CR/LFs to separate records. Each record was exactly [...]

Continue reading »

HBase examples on OS-X and Maven

Ok, so today I needed to get HBase 0.20.0 running on my local os-x box, simply in standalone mode. I am starting a project where I need to manage 50-100 million records and I wanted to try out HBase. Here are the steps I took, the steps below are a consolidation of some pointers found [...]

Continue reading »

Follow

Get every new post delivered to your Inbox.