Reacting to filesystem IO events with Node.js

I was recently working on a larger ETL process that started with the reception of various data files via SFTP, delivered on varying schedules. The requirement was that as files were received we would generate a unique event in a database, then execute a sequence of commands to archive the files out of the delivery directory and offline to a central, immutable, annotated file repository.

This new functionality had to integrate with an existing legacy SFTP server, and would likely have other uses outside of this initial use-case.

Looking around for simple solutions based on a scripting language, I really could not find any that would work or be extensible enough for my needs. Hence, I ended up writing io-event-reactor.

The basic concept is this: you have a monitor that listens for IO events on particular paths on the filesystem. As these IO events occur, they are passed on to one or more evaluators to decide whether or not the IoEvent should be reacted to by one or more configured reactors. This entire sequence is encapsulated in an IoReactor instance that manages the flow between the three described components.

[Diagram: IoReactor flow from monitor to evaluators to reactors]

With this module, you construct and configure a single IoReactorService, which can manage and contain as many IoReactor instances as you wish, providing a lot of flexibility for reacting to filesystem events.

When you configure the IoReactorService and its IoReactor instances, you specify which plugins you would like to use to fulfill the monitor and reactor roles. For evaluators, you simply provide one or more functions that evaluate whether or not an IoEvent should be passed on to one or more reactors.

https://github.com/bitsofinfo/io-event-reactor

The default monitoring plugin is implemented using the great Chokidar library at: https://github.com/paulmillr/chokidar

For reactor plugins, I developed two based on my initial needs.

For a real-world example of the kind of application you could build on top of this, check out io-overwatch (albeit a simple utility) at: https://github.com/bitsofinfo/io-overwatch

Docker container IP and port discovery with Consul and Registrator

Do you use Docker?

Does your containerized app have the need to discover both its own IP and one or more mapped ports?

How can another container access my exposed ports, and how can I do the same for my peers?

As it stands today, simple self-discovery of your container’s accessible IP and one or more of its mapped ports is not exposed to your Docker container process as a native feature of the engine itself.

If you’ve attempted to containerize an app that needs to discover its peers in order to form its own peer-level cluster, you’ve likely run into this challenge.

That said, there are several tools out there which can help you with this issue. One of them is Registrator, a special container that listens for events from a Docker host and acts as a service discovery bridge, relaying this info into other tooling such as Consul or etcd. In short, when your container is launched, the Registrator container collects the info about the Docker host it is running on and your container’s exposed/mapped ports, and registers this under a named service in one of the aforementioned backends.

This is all fine and great; however, this still puts a lot of work on you, the container developer, who needs to collect this info and then act upon it in order to form a higher-level cluster between your containers.

I had this exact problem for a Java-based service that needed to form a Hazelcast cluster dynamically. Out of that use-case I came up with a generic library that you can drop into your Java container application, called docker-discovery-registrator-consul, which is available at: https://github.com/bitsofinfo/docker-discovery-registrator-consul

The purpose of this library is “self-discovery” from within your JVM-based Docker application, where you need to discover what your accessible docker-host-bound IP and mapped port(s) are, as well as your peers within the same service. As noted above, this is critical if your container has to do further peer discovery for other services it provides or clustering groups it must form.
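
To make that concrete, here is a minimal sketch of the kind of Consul lookup the library performs under the hood: query Consul’s catalog for everything registered under a service name (as Registrator created it) and inspect the docker-host address and mapped port of each entry. This is not the library’s actual API; the service name and Consul address below are assumptions for illustration.

import java.io.BufferedReader;
import java.io.InputStreamReader;
import java.net.HttpURLConnection;
import java.net.URL;

// Minimal sketch: query Consul's catalog HTTP API for every instance
// registered under a service name (e.g. what Registrator created for your
// container) and dump the raw entries. Names/addresses are illustrative only.
public class ConsulServiceLookup {

    public static void main(String[] args) throws Exception {
        String consul = "http://consul.example.com:8500";   // assumption
        String serviceName = "my-hazelcast-service";        // assumption

        URL url = new URL(consul + "/v1/catalog/service/" + serviceName);
        HttpURLConnection conn = (HttpURLConnection) url.openConnection();
        conn.setRequestMethod("GET");

        try (BufferedReader reader = new BufferedReader(
                new InputStreamReader(conn.getInputStream()))) {
            StringBuilder json = new StringBuilder();
            String line;
            while ((line = reader.readLine()) != null) {
                json.append(line);
            }
            // Each catalog entry contains (among other fields) "Address"
            // (the docker host) and "ServicePort" (the mapped port). A real
            // implementation would parse the JSON and match its own node
            // against the entries to "self-discover".
            System.out.println(json);
        }
    }
}

The actual library wraps this kind of lookup plus the matching logic to figure out which entry corresponds to the local container; see the project README for its real API.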

You can read all the details of how it works and how to use it here: https://github.com/bitsofinfo/docker-discovery-registrator-consul

Hopefully it will be of use to you as well.

Reactive re-engineering with Akka

Every once in a while during the life cycle of any given piece of software comes that time when you have the opportunity to improve it in a major way… that is, if it’s lucky enough to still be in production.

One particular system I’ve been involved with is responsible for processing a lot of data and keeping that data in sync across many systems. For the purposes of this little case study I’ve dumbed down the overall use-case, concept, architecture and implementation details to this simple idea: we need to synchronize data.

Use-Case

Something in the environment (i.e. a user or other process) makes a request for some operation to be done that generates a change operation against a “DataEntry”. This DataEntry is manipulated in the primary database, and then the change needs to be synchronized across numerous other systems. The changes could be described as “create DataEntry item number XYZ”, “mutate DataEntry XYZ in fashion Z” or simply “delete DataEntry item XYZ”.

Each target system that a DataEntry is to be synchronized to is called a DataStore, and each involves its own complicated process of mutating our representation of a DataEntry into the target DataStore’s representation; the means to do it can vary wildly, i.e. web-service calls, RDBMS DML, NoSQL operations, etc. Not to mention, as with any integration, each of these DataStore sync calls has the possibility of being fast, very slow, not working at all, or experiencing random transient failures.
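
To make that model concrete, here is a minimal, purely illustrative Java sketch of the shapes described above (the names follow the post; the actual system’s types and signatures certainly differ):

// Illustrative sketch only; the real system's types and signatures differ.
class DataEntry { /* the canonical representation of an entry to synchronize */ }

class Result { /* the outcome of a DataStore sync attempt */ }

interface DataStore {
    // Apply a DataEntry mutation to this target system; could be a web-service
    // call, RDBMS DML, a NoSQL operation, etc. May be slow or fail transiently.
    Result sync(DataEntry entry);
}

interface DataStoreLocator {
    // Determine which DataStores a given DataEntry must be synchronized to.
    java.util.Collection<DataStore> locate(DataEntry entry);
}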

Version 1

For most of its life the system functioned as follows: each DataEntry mutated in the system was placed on a queue, and then processed by a consumer node’s DataSyncProvider, whose responsibility is to determine all the DataStores the DataEntry should be synchronized to (by interrogating a DataStoreLocator) and then make sure that happens. It worked similarly to the diagrams below (highly simplified!); note the bottleneck.

[Diagram: Version 1 synchronization flow, within one node]

[Diagram: Version 1, overall cluster view]

Version 1 issues

Version 1 functioned fine for most of its life; however, its biggest issues were simply its lack of efficiency and speed in synchronizing any given DataEntry across all of the DataStores it was applicable for. More often than not, any given DataEntry mutation would result in dozens of target DataStores that it needed to be synchronized against. Due to the sequential processing of each DataStore (accommodating for retries and waiting for the overall result before moving on to the next one), this resulted in a sizable delay until the mutation materialized in all target DataStores, not to mention a lack of good core utilization across the cluster. What did this mean? Well, an opportunity for improvement.

Version 2

Obviously, the choice here was to move to asynchronous, parallel DataStore execution, decoupled from the main DataEntry mutation consumer thread(s), and there are many ways you could go about doing that. Fortunately the overall modeling of the synchronization engine enabled considerable flexibility in swapping out the implementation with a little refactoring. The key changes were introducing the concept of a DataEntry logic execution engine, aptly named LogicExecutionEngine, and adding a new implementation of our DataStoreLocator that could decouple any given DataStore’s location from any dependency on its actual residency within the local JVM.

Great. Now that the modeling is adjusted, what about the implementation? For one, there was no interest in writing a multi-threaded execution engine, even though one could with the modeling in place; any implementation could have been developed and plugged in. That said, after looking around for a good framework that provided location transparency, parallel execution management, clustering and good resiliency, it was decided that Akka, and moving to an Actor model for the new engine, would be a good fit.

[Diagram: Version 2, Actor-based DataStores and LogicExecutionEngine]

As shown above, the DataStores are now implemented via an ActorRef version which is then passed to the LogicExecutionEngine, whose new Actor-based implementation injects them into yet another Actor for the DataEntry logic processing, awaiting a Future<Result>. This model reduced overall time to completion by roughly 80%, as everything now executed in parallel.
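
A rough sketch of that hand-off (Akka 2.4-era Java API, reusing the illustrative DataStore/DataEntry/Result shapes sketched earlier; none of this is the system’s actual code):

import akka.actor.ActorRef;
import akka.actor.UntypedActor;
import akka.pattern.Patterns;
import scala.concurrent.Future;

// Illustrative only: a DataStore wrapped in an actor. The real system's
// types and actor hierarchy will differ.
class DataStoreActor extends UntypedActor {

    private final DataStore dataStore;   // the underlying sync implementation

    public DataStoreActor(DataStore dataStore) {
        this.dataStore = dataStore;
    }

    @Override
    public void onReceive(Object message) {
        if (message instanceof DataEntry) {
            // Perform the (possibly slow or flaky) sync and reply with the Result.
            Result result = dataStore.sync((DataEntry) message);
            getSender().tell(result, getSelf());
        } else {
            unhandled(message);
        }
    }
}

// Sketch of the hand-off inside the Actor-based LogicExecutionEngine:
// send the DataEntry and get a Future back instead of blocking the consumer thread.
class LogicExecutionEngineSketch {
    Future<Object> dispatch(ActorRef dataStoreActor, DataEntry entry) {
        return Patterns.ask(dataStoreActor, entry, 30000); // effectively a Future<Result>
    }
}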

Another benefit was additional resiliency and distribution of load due to the location transparency of the actual DataStore itself. Utilizing Akka’s various routers, in this case the ClusterRouterGroup Actor, we were able to further redistribute the processing of any given DataStore workload across the cluster and react appropriately as nodes came online and went offline. See the exploded view below.

[Diagram: Version 2, exploded view of DataStore location transparency]

Lastly, the diagram below shows how execution of these DataEntry tasks is now more evenly distributed across the entire set of available nodes in the cluster; all nodes can now potentially be involved in processing any DataEntry workload. Also, by feeding dynamic configuration into the construction of each ClusterRouterGroup Actor, the system can fine-tune the distribution and number of Actors in the cluster that are available to process entries targeted at any given DataStore. This permits custom down-scaling based on the limitations or load ceilings that any given downstream target DataStore may present; in other words, it permits throttling of loads.

 

[Diagram: Version 2, better utilization of core resources across the cluster]
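
As a rough illustration of the ClusterRouterGroup-based routing described above (Akka 2.4-era Java API; the actor path, role name, router type and counts are made-up placeholders, not the system’s actual configuration), the totalInstances cap and routee paths can be fed in from dynamic configuration to throttle how much of the cluster works on any given DataStore:

import java.util.Arrays;
import java.util.List;
import akka.actor.ActorRef;
import akka.actor.ActorSystem;
import akka.cluster.routing.ClusterRouterGroup;
import akka.cluster.routing.ClusterRouterGroupSettings;
import akka.routing.RoundRobinGroup;

// Illustrative only: routes DataEntry messages to "dataStoreX" actors running
// anywhere in the cluster on nodes carrying the "sync-worker" role.
public class DataStoreRouterFactory {

    public static ActorRef createRouter(ActorSystem system, int totalInstances) {
        List<String> routeePaths = Arrays.asList("/user/dataStoreX"); // assumed path
        return system.actorOf(
            new ClusterRouterGroup(
                new RoundRobinGroup(routeePaths),
                new ClusterRouterGroupSettings(
                    totalInstances,     // cap: how many routees cluster-wide
                    routeePaths,
                    true,               // allow local routees on this node
                    "sync-worker"))     // only nodes with this role
                .props(),
            "dataStoreXRouter");
    }
}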

Overall my experience with Akka was positive. After working some of the bugs out, this solution has so far been quite stable in production, as has Akka’s clustering protocol. If you are considering moving to a more reactive design approach for the back end of a system, I highly recommend giving Akka consideration.

Lastly, as always, I highly recommend going with a pure interface-oriented design in any system you build. In this use-case, the fact that the system’s entire platform was designed from the ground up using interfaces extensively, with different “providers” (i.e. things like Camel or Akka) plugged in for each aspect of the implementation, has proven to be very important as it has evolved over time. This gives the system tremendous flexibility as it matures, and additional longevity.

Hazelcast discovery with Etcd

I’ve used Hazelcast for years and have generally relied upon the availability of multicast for Hazelcast cluster discovery and formation (within a single data-center). Recently I was faced with two things: expanding the footprint into a non-multicast-enabled data-center, and pre-prepping the service for containerization, where nodes will come and go as scaling policies dictate. Hardwired Hazelcast clustering via an XML configuration and/or reliance on multicast is a no-go.

With Hazelcast 3.6, there is now support for a pluggable cluster discovery mechanism called the Discovery SPI (Discovery Strategy). Perfect timing: given we were already playing with etcd as part of our Docker container strategy, this was an opportunity to let our application’s native clustering mechanism (coded on top of Hazelcast) leverage etcd to discover/remove peers both within, and potentially across, data-centers.

So I coded up hazelcast-etcd-discovery-spi, available on GitHub.

[Diagram: Hazelcast node registration and peer discovery via etcd]

This works with Hazelcast 3.6-EA+ and etcd to provide (optional) automatic registration of your Hazelcast nodes as etcd services, and automatic peer discovery of the Hazelcast cluster.

Note that the automatic registration of each Hazelcast instance as an etcd service is OPTIONAL if you prefer to manually maintain these key-paths in etcd. I added it simply because I think it will be convenient for folks, especially when containerizing a Hazelcast-enabled app (such as via Docker), where the fewer “dependencies” and manual steps (i.e. registering your Hazelcast nodes manually) the better. You can embed this functionality entirely within this discovery-strategy SPI.
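
For reference, wiring a Discovery SPI strategy in programmatically looks roughly like the sketch below (Hazelcast 3.6 Java config; the strategy class name and property keys are placeholders, so check the project’s README for the real ones):

import com.hazelcast.config.Config;
import com.hazelcast.config.DiscoveryStrategyConfig;
import com.hazelcast.config.JoinConfig;
import com.hazelcast.core.Hazelcast;
import com.hazelcast.core.HazelcastInstance;

public class EtcdDiscoveryExample {
    public static void main(String[] args) {
        Config config = new Config();
        // Enable the Discovery SPI and disable the other join mechanisms.
        config.setProperty("hazelcast.discovery.enabled", "true");
        JoinConfig join = config.getNetworkConfig().getJoin();
        join.getMulticastConfig().setEnabled(false);
        join.getTcpIpConfig().setEnabled(false);

        // Placeholder class/property names; consult the
        // hazelcast-etcd-discovery-spi README for the actual ones.
        DiscoveryStrategyConfig strategy = new DiscoveryStrategyConfig(
                "org.bitsofinfo.hazelcast.discovery.etcd.EtcdDiscoveryStrategy");
        strategy.addProperty("etcd-uris", "http://etcd.example.com:2379");   // assumption
        strategy.addProperty("etcd-service-name", "my-hazelcast-cluster");   // assumption
        join.getDiscoveryConfig().addDiscoveryStrategyConfig(strategy);

        HazelcastInstance hz = Hazelcast.newHazelcastInstance(config);
    }
}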

I hope others find this helpful, and please leave your feedback, pull-requests or issues on the project!

NOTE: if you are running your app in Docker, you have a separate issue where you need to determine your own externally accessible IP/PORT that the Docker host has mapped for you on 5701. How can you determine that, so that you can publish the correct IP/PORT info to etcd? Check out: https://github.com/bitsofinfo/docker-discovery-registrator-consul

NOTE: interested in Consul? There is a separate project built around Consul for your discovery strategy, located here: https://github.com/bitsofinfo/hazelcast-consul-discovery-spi

 

Hazelcast discovery with Consul

I’ve used Hazelcast for years and have generally relied upon the availability of multicast for Hazelcast cluster discovery and formation (within a single data-center). Recently I was faced with two things: expanding the footprint into a non-multicast-enabled data-center, and pre-prepping the service for containerization, where nodes will come and go as scaling policies dictate. Hardwired Hazelcast clustering via an XML configuration and/or reliance on multicast is a no-go.

With Hazelcast 3.6, there is now support for a pluggable cluster discovery mechanism called the Discovery SPI (Discovery Strategy). Perfect timing: given we were already playing with Consul as part of our Docker container strategy, this was an opportunity to let our application’s native clustering mechanism (coded on top of Hazelcast) leverage Consul to discover/remove peers both within, and potentially across, data-centers.

So I coded up hazelcast-consul-discovery-spi, available on GitHub.

[Diagram: Hazelcast node registration and peer discovery via Consul]

This works with Hazelcast 3.6-EA+ and Consul to provide automatic registration of your Hazelcast nodes as Consul services (without having to run a local Consul agent), and automatic peer discovery of the Hazelcast cluster.

Note that the automatic registration of each Hazelcast instance as a Consul service is OPTIONAL if you already have Consul agents running that define your Hazelcast service nodes. I added it simply because I think it will be convenient for folks, especially when containerizing a Hazelcast-enabled app (such as via Docker), where the fewer “dependencies” (like a Consul agent available on the host, in the container, or in another container) the better. You can embed this functionality entirely within this discovery-strategy SPI.
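
Programmatic wiring follows the same Discovery SPI pattern shown in the etcd post above; only the strategy class and its properties change. Again, the class name and property keys below are placeholders, so consult the project’s README for the real ones:

import com.hazelcast.config.Config;
import com.hazelcast.config.DiscoveryStrategyConfig;
import com.hazelcast.config.JoinConfig;
import com.hazelcast.core.Hazelcast;

public class ConsulDiscoveryExample {
    public static void main(String[] args) {
        Config config = new Config();
        // Enable the Discovery SPI and disable the other join mechanisms.
        config.setProperty("hazelcast.discovery.enabled", "true");
        JoinConfig join = config.getNetworkConfig().getJoin();
        join.getMulticastConfig().setEnabled(false);
        join.getTcpIpConfig().setEnabled(false);

        // Placeholder class/property names; see the hazelcast-consul-discovery-spi README.
        DiscoveryStrategyConfig strategy = new DiscoveryStrategyConfig(
                "org.bitsofinfo.hazelcast.discovery.consul.ConsulDiscoveryStrategy");
        strategy.addProperty("consul-host", "consul.example.com");            // assumption
        strategy.addProperty("consul-port", "8500");                          // assumption
        strategy.addProperty("consul-service-name", "my-hazelcast-cluster");  // assumption
        join.getDiscoveryConfig().addDiscoveryStrategyConfig(strategy);

        Hazelcast.newHazelcastInstance(config);
    }
}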

I hope others find this helpful, and please leave your feedback, pull-requests or issues on the project!

NOTE: if you are running your app in Docker, you have a separate issue where you need to determine your own externally accessible IP/PORT that the Docker host has mapped for you on 5701. How can you determine that, so that you can publish the correct IP/PORT info to Consul? Check out: https://github.com/bitsofinfo/docker-discovery-registrator-consul

NOTE: interested in etcd? There is a separate project built around etcd for your discovery strategy, located here: https://github.com/bitsofinfo/hazelcast-etcd-discovery-spi

 

Aggregate and back up elasticsearch fs snapshots across a widely distributed cluster

One of the Elasticsearch clusters I’ve worked on spans multiple data-centers around the world and stores some very large indexes. Sometimes, but not often, we have the need to get a backup of one of these indexes off of the cluster for restoration onto another cluster, but due to the sheer size of these indexes it’s not practical for us to snapshot it to S3 or even a shared NFS mount (as the cluster spans multiple data-centers). Therefore the local filesystem “fs” snapshot type is the only one really usable for us in this scenario… but what you end up with is parts of the snapshot distributed across individual nodes all over the world.

So there was a need for a tool to automate the task of collecting all of the individual snapshot “parts” and downloading them to a central machine. If you’ve ever looked into the actual format of an elasticsearch snapshot, it’s a little tedious; i.e. you just can’t blindly copy over the snapshot shard directory contents, as ES smartly does snapshots via diffs, keeping track of what files are relevant for each snapshot in metadata files. See here for an excellent overview: https://www.found.no/foundation/elasticsearch-snapshot-and-restore/

So in the end I came up with elasticsearch-snapshot-manager (Scala) as a tool for handling all of this (analyzing, aggregating, downloading).

This tool is intended to aid with the following scenario:

  1. You have a large elasticsearch cluster that spans multiple data-centers
  2. You have a “shared filesystem snapshot repository” whose physical location is local to each node and actually NOT on a “shared device” or logical mountpoint (i.e. due to (1) above); the snapshots reside on local disk only
  3. You need a way to execute the snapshot, then easily collect all the different parts of that snapshot, which are located across N nodes in your cluster (a sketch of triggering such a snapshot follows this list)
  4. This tool is intended to automate that process…
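
For reference on step 3, triggering such a snapshot boils down to two REST calls against the cluster: register an “fs” repository (whose location path must exist locally on every data node and, on newer ES versions, be whitelisted via path.repo), then create the snapshot. A minimal sketch follows; the host, repository/snapshot names and path are assumptions for illustration.

import java.io.OutputStream;
import java.net.HttpURLConnection;
import java.net.URL;
import java.nio.charset.StandardCharsets;

// Minimal sketch of triggering a local-filesystem ("fs") snapshot via the
// Elasticsearch REST API. Names, host and paths below are illustrative only.
public class FsSnapshotTrigger {

    public static void main(String[] args) throws Exception {
        String es = "http://es-node.example.com:9200";   // assumption

        // 1) Register the "fs" repository; this path must exist locally on
        //    every data node (and be whitelisted via path.repo where required).
        put(es + "/_snapshot/my_backup",
            "{\"type\":\"fs\",\"settings\":{\"location\":\"/data/es-snapshots\"}}");

        // 2) Create the snapshot; each node writes only its local shard parts,
        //    which is exactly what then needs to be collected and aggregated.
        put(es + "/_snapshot/my_backup/snapshot_1?wait_for_completion=true", "{}");
    }

    private static void put(String url, String body) throws Exception {
        HttpURLConnection conn = (HttpURLConnection) new URL(url).openConnection();
        conn.setRequestMethod("PUT");
        conn.setDoOutput(true);
        conn.setRequestProperty("Content-Type", "application/json");
        try (OutputStream os = conn.getOutputStream()) {
            os.write(body.getBytes(StandardCharsets.UTF_8));
        }
        System.out.println(url + " -> HTTP " + conn.getResponseCode());
    }
}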

Please see the GitHub project for all the details at https://github.com/bitsofinfo/elasticsearch-snapshot-manager (feedback appreciated).

Book review: Building Microservices

Recently I read Sam Newman’s “Building Microservices”; at ~280 pages it’s a fairly quick read. The reviews on this book overall are mixed, and I can see where readers are coming from. Based on the title, one might expect some coverage of the microservices frameworks out there, concrete examples, maybe some actual code… but you won’t really find that here. Instead you will find a pretty good overview of various architectural approaches to modern application design in today’s world, covering general topics such as proper separation of concerns, unit-testing, continuous integration, automation, infrastructure management, service discovery, fault tolerance, high-availability, security, etc.

In reality, none of the principles covered in this book are the exclusive domain of “microservice” application architectures; rather, they can (and should) be applied to any application you are considering deploying, whether it’s a “monolithic” application or a suite of microservices interacting as parts of a larger functioning application.

In that regard I think this book is definitely a good read and worth a look, if for nothing more than to ensure your team gets a refresher on good design principles and how they can be realized with some of the newer frameworks and toolsets that have come out of our community in recent years. The material presented is sound.

Fix: HDP, YARN, Spark “check your cluster UI to ensure that workers are registered and have sufficient resources”

Are you trying to submit a Spark job over YARN on an HDP Hadoop cluster and encountering the kinds of errors shown below?

If so, just add the following two lines to your [spark-home]/conf/spark-defaults.conf file:

# customize for your HDP version...

spark.driver.extraJavaOptions -Dhdp.version=2.2.0.0-2041
spark.yarn.am.extraJavaOptions -Dhdp.version=2.2.0.0-2041

 

ERRORS

The errors you will see below stem from a root issue that occurs on a Spark executor node, where it tries to do a substitution for ${hdp.version} but no variable definition exists; the configuration above fixes that.

Initial job has not accepted any resources; check your cluster UI to ensure that workers are registered and have sufficient resources

15/04/03 13:40:30 INFO yarn.YarnAllocator: Completed container container_1428072484378_0004_01_000003 (state: COMPLETE, exit status: 1)
15/04/03 13:40:30 INFO yarn.YarnAllocator: Container marked as failed: container_1428072484378_0004_01_000003. Exit status: 1. Diagnostics: Exception from container-launch.
Container id: container_1428072484378_0004_01_000003
Exit code: 1
Exception message: /hadoop/yarn/local/usercache/admin/appcache/application_1428072484378_0004/container_1428072484378_0004_01_000003/launch_container.sh: line 26: $PWD:$PWD/__spark__.jar:$HADOOP_CONF_DIR:/usr/hdp/current/hadoop-client/*:/usr/hdp/current/hadoop-client/lib/*:/usr/hdp/current/hadoop-hdfs-client/*:/usr/hdp/current/hadoop-hdfs-client/lib/*:/usr/hdp/current/hadoop-yarn-client/*:/usr/hdp/current/hadoop-yarn-client/lib/*:$PWD/mr-framework/hadoop/share/hadoop/mapreduce/*:$PWD/mr-framework/hadoop/share/hadoop/mapreduce/lib/*:$PWD/mr-framework/hadoop/share/hadoop/common/*:$PWD/mr-framework/hadoop/share/hadoop/common/lib/*:$PWD/mr-framework/hadoop/share/hadoop/yarn/*:$PWD/mr-framework/hadoop/share/hadoop/yarn/lib/*:$PWD/mr-framework/hadoop/share/hadoop/hdfs/*:$PWD/mr-framework/hadoop/share/hadoop/hdfs/lib/*:/usr/hdp/${hdp.version}/hadoop/lib/hadoop-lzo-0.6.0.${hdp.version}.jar:/etc/hadoop/conf/secure: bad substitution

Stack trace: ExitCodeException exitCode=1: /hadoop/yarn/local/usercache/admin/appcache/application_1428072484378_0004/container_1428072484378_0004_01_000003/launch_container.sh: line 26: $PWD:$PWD/__spark__.jar:$HADOOP_CONF_DIR:/usr/hdp/current/hadoop-client/*:/usr/hdp/current/hadoop-client/lib/*:/usr/hdp/current/hadoop-hdfs-client/*:/usr/hdp/current/hadoop-hdfs-client/lib/*:/usr/hdp/current/hadoop-yarn-client/*:/usr/hdp/current/hadoop-yarn-client/lib/*:$PWD/mr-framework/hadoop/share/hadoop/mapreduce/*:$PWD/mr-framework/hadoop/share/hadoop/mapreduce/lib/*:$PWD/mr-framework/hadoop/share/hadoop/common/*:$PWD/mr-framework/hadoop/share/hadoop/common/lib/*:$PWD/mr-framework/hadoop/share/hadoop/yarn/*:$PWD/mr-framework/hadoop/share/hadoop/yarn/lib/*:$PWD/mr-framework/hadoop/share/hadoop/hdfs/*:$PWD/mr-framework/hadoop/share/hadoop/hdfs/lib/*:/usr/hdp/${hdp.version}/hadoop/lib/hadoop-lzo-0.6.0.${hdp.version}.jar:/etc/hadoop/conf/secure: bad substitution

	at org.apache.hadoop.util.Shell.runCommand(Shell.java:538)
	at org.apache.hadoop.util.Shell.run(Shell.java:455)
	at org.apache.hadoop.util.Shell$ShellCommandExecutor.execute(Shell.java:715)
	at org.apache.hadoop.yarn.server.nodemanager.DefaultContainerExecutor.launchContainer(DefaultContainerExecutor.java:211)
	at org.apache.hadoop.yarn.server.nodemanager.containermanager.launcher.ContainerLaunch.call(ContainerLaunch.java:302)
	at org.apache.hadoop.yarn.server.nodemanager.containermanager.launcher.ContainerLaunch.call(ContainerLaunch.java:82)
	at java.util.concurrent.FutureTask.run(FutureTask.java:262)
	at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1145)
	at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:615)
	at java.lang.Thread.run(Thread.java:745)


Container exited with a non-zero exit code 1


2015-04-03 13:41:27,055 INFO  container.Container (ContainerImpl.java:handle(999)) - Container container_1428072484378_0004_02_000013 transitioned from LOCALIZED to RUNNING
2015-04-03 13:41:27,068 INFO  nodemanager.DefaultContainerExecutor (DefaultContainerExecutor.java:buildCommandExecutor(267)) - launchContainer: [bash, /hadoop/yarn/local/usercache/admin/appcache/application_1428072484378_0004/container_1428072484378_0004_02_000013/default_container_executor.sh]
2015-04-03 13:41:27,614 WARN  nodemanager.DefaultContainerExecutor (DefaultContainerExecutor.java:launchContainer(223)) - Exit code from container container_1428072484378_0004_02_000013 is : 1
2015-04-03 13:41:27,614 WARN  nodemanager.DefaultContainerExecutor (DefaultContainerExecutor.java:launchContainer(229)) - Exception from container-launch with container ID: container_1428072484378_0004_02_000013 and exit code: 1
ExitCodeException exitCode=1: /hadoop/yarn/local/usercache/admin/appcache/application_1428072484378_0004/container_1428072484378_0004_02_000013/launch_container.sh: line 26: $PWD:$PWD/__spark__.jar:$HADOOP_CONF_DIR:/usr/hdp/current/hadoop-client/*:/usr/hdp/current/hadoop-client/lib/*:/usr/hdp/current/hadoop-hdfs-client/*:/usr/hdp/current/hadoop-hdfs-client/lib/*:/usr/hdp/current/hadoop-yarn-client/*:/usr/hdp/current/hadoop-yarn-client/lib/*:$PWD/mr-framework/hadoop/share/hadoop/mapreduce/*:$PWD/mr-framework/hadoop/share/hadoop/mapreduce/lib/*:$PWD/mr-framework/hadoop/share/hadoop/common/*:$PWD/mr-framework/hadoop/share/hadoop/common/lib/*:$PWD/mr-framework/hadoop/share/hadoop/yarn/*:$PWD/mr-framework/hadoop/share/hadoop/yarn/lib/*:$PWD/mr-framework/hadoop/share/hadoop/hdfs/*:$PWD/mr-framework/hadoop/share/hadoop/hdfs/lib/*:/usr/hdp/${hdp.version}/hadoop/lib/hadoop-lzo-0.6.0.${hdp.version}.jar:/etc/hadoop/conf/secure: bad substitution

        at org.apache.hadoop.util.Shell.runCommand(Shell.java:538)
        at org.apache.hadoop.util.Shell.run(Shell.java:455)
        at org.apache.hadoop.util.Shell$ShellCommandExecutor.execute(Shell.java:715)
        at org.apache.hadoop.yarn.server.nodemanager.DefaultContainerExecutor.launchContainer(DefaultContainerExecutor.java:211)
        at org.apache.hadoop.yarn.server.nodemanager.containermanager.launcher.ContainerLaunch.call(ContainerLaunch.java:302)
        at org.apache.hadoop.yarn.server.nodemanager.containermanager.launcher.ContainerLaunch.call(ContainerLaunch.java:82)
        at java.util.concurrent.FutureTask.run(FutureTask.java:262)
        at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1145)
        at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:615)
        at java.lang.Thread.run(Thread.java:745)
2015-04-03 13:41:27,614 INFO  nodemanager.ContainerExecutor (ContainerExecutor.java:logOutput(283)) - Exception from container-launch.
2015-04-03 13:41:27,615 INFO  nodemanager.ContainerExecutor (ContainerExecutor.java:logOutput(283)) - Container id: container_1428072484378_0004_02_000013
2015-04-03 13:41:27,615 INFO  nodemanager.ContainerExecutor (ContainerExecutor.java:logOutput(283)) - Exit code: 1

Fix: HDP “Unauthorized connection for super-user: oozie from IP 127.0.0.1”

Recently I have been playing with Hortonworks HDP 2.2. I was starting to configure some Oozie workflows, and when submitting the job, the first step’s Hive script failed with the error and stack below.


JA002: Unauthorized connection for super-user: oozie from IP 127.0.0.1

Caused by: org.apache.hadoop.ipc.RemoteException(org.apache.hadoop.security.authorize.AuthorizationException): Unauthorized connection for super-user: oozie from IP 127.0.0.1
at org.apache.hadoop.ipc.Client.call(Client.java:1468)
at org.apache.hadoop.ipc.Client.call(Client.java:1399)
at org.apache.hadoop.ipc.ProtobufRpcEngine$Invoker.invoke(ProtobufRpcEngine.java:232)
at com.sun.proxy.$Proxy39.getDelegationToken(Unknown Source)
at org.apache.hadoop.yarn.api.impl.pb.client.ApplicationClientProtocolPBClientImpl.getDelegationToken(ApplicationClientProtocolPBClientImpl.java:306)
... 30 more

To fix this, SSH into your HDP instance VM, edit /etc/hadoop/conf/core-site.xml, and change the following config to add “localhost”. Save and restart the relevant services, or just reboot your HDP VM instance.


<property>
<name>hadoop.proxyuser.oozie.hosts</name>
<value>sandbox.hortonworks.com,127.0.0.1,localhost</value>
</property>