Fix: HDP, YARN, Spark “check your cluster UI to ensure that workers are registered and have sufficient resources”

Are you trying to submit a Spark job over YARN on an HDP Hadoop cluster and encountering errors like the ones below? If so, just add the following 2 lines to your [spark-home]/conf/spark-defaults.conf file: ERRORS The errors you will see below stem from the root issue that occurs on a Spark Executor node where …

Fix: HDP “Unauthorized connection for super-user: oozie from IP 127.0.0.1”

I have recently been playing with Hortonworks HDP 2.2. I was starting to configure some Oozie workflows, and when I submitted the job, the first step's Hive script failed with this error and stack trace. To fix this, SSH into your HDP instance VM, edit /etc/hadoop/conf/core-site.xml, and change the following config to add "localhost". Save and restart the relevant services …

USPS AIS bulk data loading with Hadoop mapreduce

Today I pushed some source up to GitHub for a utility I was previously working on, which loads data from USPS AIS data files into HBase/MySQL using Hadoop MapReduce and simpler data loaders. Source @ https://github.com/bitsofinfo/usps-ais-data-loader This project was originally started to create a framework for loading data files from the USPS AIS suite of …

Reading fixed length/width input records with Hadoop mapreduce

While working on a project where I needed to quickly import 50-100 million records, I ended up using Hadoop for the job. Unfortunately, the input files I was dealing with contained fixed width/length records, so they had no delimiters between records, nor any CR/LFs to separate them. Each record was exactly …
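
To illustrate the fixed-width problem described in this excerpt, here is a minimal sketch in plain Python (not the Hadoop input format the post itself goes on to discuss). It splits an undelimited byte stream into records of a known length. The function name `read_fixed_length_records`, the `record_length` parameter, and the 8-byte sample data are all hypothetical, chosen only for illustration; the actual record length from the post is not shown here.

```python
import io
from typing import BinaryIO, Iterator


def read_fixed_length_records(stream: BinaryIO, record_length: int) -> Iterator[bytes]:
    """Yield consecutive records of exactly record_length bytes.

    Assumes a buffered binary stream, where read(n) returns fewer than
    n bytes only at end of input.
    """
    if record_length <= 0:
        raise ValueError("record_length must be positive")
    while True:
        record = stream.read(record_length)
        if not record:
            return  # clean end of input on a record boundary
        if len(record) != record_length:
            # Input ended in the middle of a record.
            raise ValueError(
                f"truncated record: expected {record_length} bytes, got {len(record)}"
            )
        yield record


# Hypothetical sample: three 8-byte records with no delimiters or CR/LFs.
sample = io.BytesIO(b"AAAA0001BBBB0002CCCC0003")
records = list(read_fixed_length_records(sample, 8))
```

Because there are no separators to scan for, the only way to find a record boundary is by byte offset, which is why a reader like this has to fail loudly when the input length is not a multiple of the record length.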