USPS AIS bulk data loading with Hadoop mapreduce

Today I pushed some source to GitHub for a utility I was previously working on that loads data from USPS AIS data files into HBase/MySQL using Hadoop MapReduce and simpler data loaders. The source is at https://github.com/bitsofinfo/usps-ais-data-loader. This project was originally started to create a framework for loading data files from the USPS AIS suite of … Continue reading USPS AIS bulk data loading with Hadoop mapreduce

Reading fixed length/width input records with Hadoop mapreduce

While working on a project where I needed to quickly import 50-100 million records, I ended up using Hadoop for the job. Unfortunately, the input files I was dealing with contained fixed width/length records: they had no delimiters between records, nor any CR/LFs to separate them. Each record was exactly … Continue reading Reading fixed length/width input records with Hadoop mapreduce
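
To make the fixed-width problem concrete, here is a minimal sketch in plain Python (not Hadoop code, and not the approach from the linked post) of splitting an undelimited byte stream into records of a known length. The function name `read_fixed_width_records`, the 10-byte record length, and the sample field layout are all assumptions made up for illustration; the real record length is not given in this excerpt.

```python
import io


def read_fixed_width_records(stream, record_length):
    """Yield consecutive records of exactly record_length bytes.

    The stream has no delimiters or CR/LFs, so record boundaries are
    determined purely by byte offset. Assumes a buffered binary stream
    whose read(n) returns n bytes unless end of file is reached.
    """
    if record_length <= 0:
        raise ValueError("record_length must be positive")
    while True:
        record = stream.read(record_length)
        if not record:
            return  # clean end of input
        if len(record) != record_length:
            # Leftover bytes that do not form a whole record
            raise ValueError(
                "trailing partial record of %d bytes" % len(record)
            )
        yield record


# Hypothetical layout: 10-byte records, a 5-byte ZIP code followed by
# a 5-byte, space-padded city key. Not the actual USPS AIS layout.
data = b"94105SANFR10001NYC  "
records = list(read_fixed_width_records(io.BytesIO(data), 10))

for rec in records:
    zip_code = rec[:5].decode("ascii")
    city_key = rec[5:].decode("ascii").rstrip()
    print(zip_code, city_key)
```

In a distributed job the same idea has an extra constraint: each input split would need to start and end on a multiple of the record length, otherwise a record could straddle two splits.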