I recently was working on a larger ETL process that started with the reception of various data files via SFTP that were delivered on varying schedules. The requirement was that as files are received we generate a unique event in a database, then execute a sequence of commands to archive the files out of the delivery directory and offline to a central immutable annotated file repository.
This new functionality had to integrate with an existing SFTP legacy server, and would likely have other uses outside of this initial use-case.
Looking around for simple solutions based on a scripting language, I really could not find any that would work or be extensible enough for the need. Hence I ended up writing io-event-reactor.
The basic concept is this; you have a monitor that listens for IO events for particular paths on the filesystem. As these IO events occur, they are passed on to one or more evaluators to decide whether or not the IoEvent should be reacted to by one or more configured reactors. The entire latter sequence is encapsulated in an IoReactor instance that manages the flow between the three described components.
With this module, you construct and configure a single IoReactorService which can manage and contain one or more IoReactor instances, as many as you wish, providing for lots of flexibility for reacting to filesystem events.
When you configure the IoReactorService and its IoReactor instances, you specify which plugins you would like to use to fulfill the monitor and reactor roles. For evaluators you simply provide one or more functions which evaluate whether or not an IoEvent should be passed on to one or more reactors.
For reactor plugins, I developed two based on my initial needs.
- One for inserting records into a MySql database via node-mysql, available at: https://github.com/bitsofinfo/io-event-reactor-plugin-mysql
- The other for executing shell commands based on stateful-process-command-proxy, available at: https://github.com/bitsofinfo/io-event-reactor-plugin-shell-exec
For an real-world example of the kind of application you could build on top of this, check out io-overwatch (albiet a simple utility) at: https://github.com/bitsofinfo/io-overwatch
Hoping this will be useful for others out there, I’ve posted some code that could to serve as a lower level component/building block in a node.js application who has a need to mediate interaction with command line programs on the back-end. (i.e. bash shells, powershell etc.)
This is node.js module for executing os commands against a pool of stateful child processes such as bash or powershell via stdout and stderr streams. It is important to note, that despite the use-case described below for this project’s origination, this node module can be used for proxying long-lived bash process (or any shell really) in addition to powershell etc. It works and has been tested on both *nix, osx and windows hosts running the latest version of node.
This project originated out of the need to execute various Powershell commands (at fairly high volume and frequency) against services within Office365/Azure bridged via a custom node.js implemented REST API; this was due to the lack of certain features in the REST GraphAPI for Azure/o365, that are available only in Powershell (and can maintain persistent connections over remote sessions)
If you have done any work with Powershell and o365, then you know that there is considerable overhead in both establishing a remote session and importing and downloading various needed cmdlets. This is an expensive operation and there is a lot of value in being able to keep this remote session open for longer periods of time rather than repeating this entire process for every single command that needs to be executed and then tearing everything down.
Simply doing an child_process.exec per command to launch an external process, run the command, and then killing the process is not really an option under such scenarios, as it is expensive and very singular in nature; no state can be maintained if need be. We also tried using edge.js with powershell and this simply would not work with o365 exchange commands and heavy session cmdlet imports (the entire node.js process would crash). Using this module gives you full un-fettered access to the externally connected child_process, with no restrictions other than what uid/gid (permissions) the spawned process is running under (which you really have to consider from security standpoint!)
The diagram below should conceptually give you an idea of what this module does: process pooling, custom init/destroy commands, process auto-invalidation configuration and command history retention etc. See here for full details: https://github.com/bitsofinfo/stateful-process-command-proxy
Obviously this module can expose you to some insecure situations depending on how you use it… you are providing a gateway to an external process via Node on your host os! (likely a shell in most use-cases). Here are some tips; ultimately its your responsibility to secure your system.
- Ensure that the node process is running as a user with very limited rights
- Make use of the uid/gid configuration appropriately to further limit the processes
- Never expose calls to this module directly, instead you should write a wrapper layer around StatefulProcessCommandProxy that protects, analyzes and sanitizes external input that can materialize in a
- All commands you pass to
executeshould be sanitized to protect from injection attacks
- Make use of the whitelist and blacklist command features of this module
- WRAP this service via additional code that sanitizes all arguments to protect from command injection
Hopefully this will help others out there who have a similar need: https://github.com/bitsofinfo/stateful-process-command-proxy