Node modules provide a way to re-use code in your Node application. In some ways, they're similar to a class in other languages, like C# or Java. In many ways, they're completely different from a class. If you've written a Node application, you've used modules; you just may not have known it. At its core, a module is a piece of re-usable code with a defined interface. You bring a module into scope using require() like this:
const path = require('path');
console.info(path.extname('index.js'));
This code example loads the core module path and places a reference to it in the variable path. Using the const keyword for the variable declaration prevents the variable from being reassigned in the calling code. Bringing the path module into scope allows use of its functionality by calling the extname method on the path constant.
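As a small illustration of the const point, the sketch below tries to reassign the variable that holds the module. The try/catch and the reassignError variable are only there for demonstration and are not part of the original example.

```javascript
const path = require('path');

let reassignError = null;
try {
  // Reassigning a const binding throws at runtime.
  path = null;
} catch (err) {
  reassignError = err;
}

console.log(reassignError instanceof TypeError); // true
console.log(path.extname('index.js'));           // .js
```

Note that const only protects the binding. It doesn't freeze the module object that the variable points to.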
How Does require() Really Work
The core type Module exposes the require() function. The require() function is an abstraction around an internal function, _load(), which does the heavy lifting of loading a module. The _load() function follows these steps:
1. Check Module._cache for a cached copy.
2. Create a new Module instance if not found in cache.
3. Save it to the cache.
4. Load contents of the module.
5. Compile the contents into a closure.
6. If there was an error, delete from the cache.
7. Return module.exports.
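The steps above can be sketched as a toy loader. This is a simplified illustration, not Node's actual implementation: miniLoad, the cache object, and the greeter.js file are hypothetical names, new Function stands in for Node's real compile step, and the caller's own require is passed through where Node would supply a module-specific one.

```javascript
const fs = require('fs');
const os = require('os');
const path = require('path');

// Hypothetical cache keyed by absolute file name (stands in for Module._cache).
const cache = {};

function miniLoad(filename) {
  // Step 1: return the cached copy if there is one.
  if (cache[filename]) {
    return cache[filename].exports;
  }
  // Steps 2 and 3: create a module record and cache it before running it.
  const mod = { id: filename, exports: {}, loaded: false };
  cache[filename] = mod;
  try {
    // Step 4: load the contents of the module.
    const source = fs.readFileSync(filename, 'utf8');
    // Step 5: compile the contents into a function and execute it.
    const compiled = new Function(
      'exports', 'require', 'module', '__filename', '__dirname', source);
    compiled(mod.exports, require, mod, filename, path.dirname(filename));
    mod.loaded = true;
  } catch (err) {
    // Step 6: on error, remove the half-loaded module from the cache.
    delete cache[filename];
    throw err;
  }
  // Step 7: return whatever the module exported.
  return mod.exports;
}

// Usage: write a tiny module to a temp directory and load it twice.
const dir = fs.mkdtempSync(path.join(os.tmpdir(), 'miniload-'));
const file = path.join(dir, 'greeter.js');
fs.writeFileSync(file, "module.exports.greet = function() { return 'hello'; };");

const first = miniLoad(file);
const second = miniLoad(file);
console.log(first.greet());    // hello
console.log(first === second); // true, the second call hit the cache
```

Caching the module record before executing it (Step 3 ahead of Steps 4 and 5) is what lets circular dependencies resolve to a partially loaded module instead of recursing forever.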
Let's focus on Steps 4 through 7 first. Step 4 performs a complicated dance to find a module that matches the identifier passed to require(). If the identifier begins with '/', '../' or './', Node treats it as a file path ('/' is absolute; './' and '../' are relative to the calling module) and attempts to load the module at that path. If the identifier isn't a path, the following locations are searched:
- Node core modules
- node_modules directories, starting in the current directory and checking each parent directory up to the root directory
- Path pointed to by NODE_PATH
- $HOME/.node_modules
- $HOME/.node_libraries
- $PREFIX/lib/node_modules
The Node core modules include modules like path, stream, etc. Core modules are installed when you install Node itself. If the identifier doesn't match a core module, Node assumes it refers to a Node package. When attempting to locate a Node package, require() looks for a node_modules directory in the current directory and checks it for the package. If a package isn't found, it moves up a directory and repeats the check. Node continues all the way up to the root of the volume. Node also provides a way for packages to be installed globally. The NODE_PATH environment variable points to the global package location. The environment variable contains a delimited string similar in format to the PATH environment variable of the target operating system. Node searches each location in NODE_PATH next. The Node docs recommend against using NODE_PATH and suggest using either locally or globally installed packages instead.
The locations $HOME/.node_modules and $HOME/.node_libraries are historical in nature and not recommended. Globally installed packages are placed in $PREFIX/lib/node_modules. $PREFIX contains Node's configured node_prefix, which points at the path containing the active version of Node. You'll find that you frequently use a relative identifier with require().
If you've used JavaScript in the browser, you may be wondering how a module defines an interface. Step 5 defines the interface when compiling the contents of the module. Node doesn't define an interface in the C# or Java style. A module's interface is defined by those variables and functions accessible to the calling code. During compilation, this is accomplished by wrapping the contents of the loaded module with a JavaScript closure that looks something like this:
(function(exports, require, module, __filename, __dirname) {
  <module content>
})(module.exports, module.require, module, filename, dirname);
This wrapper hides all of the internal implementation of your module inside the closure. This wrapper also exposes a number of variables to your module that look like globals but are actually scoped to the module. You can find a list of all the global variables here: https://nodejs.org/api/globals.html#globals_module. To define the module's interface, use module.exports. The module.exports property defines what the require() call returns. You can set module.exports to anything you can normally assign to a variable. Calling this step a compile is a bit of a misnomer. It does compile the JavaScript but it also executes the closure as though it were inline JavaScript in a Web page.
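One way to observe the wrapper is to have a module report on the function arguments it receives. The sketch below writes a throwaway module, wrapped.js (a hypothetical name), to a temp directory and loads it. In current Node versions the wrapper receives five arguments: exports, require, module, __filename, and __dirname.

```javascript
const fs = require('fs');
const os = require('os');
const path = require('path');

const dir = fs.mkdtempSync(path.join(os.tmpdir(), 'wrapper-'));
const file = path.join(dir, 'wrapped.js');

// Hypothetical module body. "hidden" stays private to the module's closure.
fs.writeFileSync(file, [
  "var hidden = 'internal detail';",
  "module.exports = {",
  "  argCount: arguments.length,",
  "  firstArgIsExports: arguments[0] === exports,",
  "  file: require('path').basename(__filename)",
  "};"
].join('\n'));

const wrapped = require(file);
console.log(wrapped.argCount);          // 5
console.log(wrapped.firstArgIsExports); // true
console.log(wrapped.file);              // wrapped.js
console.log(typeof hidden);             // undefined
```

Because the module body runs inside a function, arguments is available in it, and the top-level var never leaks into the code that called require().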
Creating Your First Module
Now that you know how require() loads a module, let's create a simple module. Ultimately, a module simply returns an object. A simple module, myfirstmodule.js, might look like:
module.exports.name = function() {
  return "Chris";
};
This returns an object with a single property, called name, whose value is a function. You can load the module and call name() with code like this:
const firstModule = require('./myfirstmodule');
console.log('Results: ' + firstModule.name());
I want to point out a couple of things in this last snippet. First, notice that my require started with './', which indicates a relative path and tells require() to look in the same directory as the currently executing module. Second, notice that I didn't specify the .js suffix on the end of the module. Not specifying the .js suffix is a common convention in Node. It does create some ambiguity as to what will be loaded, though. It could refer to:
- myfirstmodule as a file
- myfirstmodule.js as a file
- myfirstmodule.json as a file
- myfirstmodule/index.js as a directory with a file in it
Node looks at all of the above options when attempting to load a file. For an example of loading a folder as a module, look in the downloadable samples directory, ModuleAsAFolder (available on the CODE Magazine website).
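The following sketch exercises three of those options without relying on the downloadable samples. The file and folder names (single.js, settings.json, folder/index.js) are made up for the illustration and are created in a temp directory.

```javascript
const fs = require('fs');
const os = require('os');
const path = require('path');

const dir = fs.mkdtempSync(path.join(os.tmpdir(), 'resolve-'));

// A plain .js module, a .json file, and a folder containing index.js.
fs.writeFileSync(path.join(dir, 'single.js'),
  "module.exports = 'from single.js';");
fs.writeFileSync(path.join(dir, 'settings.json'),
  JSON.stringify({ port: 8080 }));
fs.mkdirSync(path.join(dir, 'folder'));
fs.writeFileSync(path.join(dir, 'folder', 'index.js'),
  "module.exports = 'from folder/index.js';");

// None of the identifiers below carries a suffix.
const single = require(path.join(dir, 'single'));
const settings = require(path.join(dir, 'settings'));
const folder = require(path.join(dir, 'folder'));

console.log(single);        // from single.js
console.log(settings.port); // 8080
console.log(folder);        // from folder/index.js
```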
Module Caching
Now let's go back and look at a couple of steps in loading a module that I skipped over. The first step when loading a module is to check the cache to see if it's already been loaded. A reference to the loaded module is returned if the module was already loaded, so each module normally has only a single instance in memory. The cache lookup is based on the identifier passed to require(), however, so identifiers that differ only in case can defeat it. Like this:
const bara = require('FOO');
const barb = require('foo');
const barc = require('fOo');
On a case-insensitive file system, this code can return a new instance of foo in each line because the identifier differs for each call to require(). Be consistent in the identifiers that you pass to require(). I go a step further. I use all lower-case file names for modules and always use all lower-case identifiers. This technique ensures that I don't accidentally load multiple copies of a module. It also avoids a more subtle bug that I've encountered relative to case-sensitive file systems. Node runs on a variety of systems from macOS to Windows to Linux. Some of these systems have case-insensitive file systems (macOS, by default) and others have case-sensitive file systems (Ubuntu Linux). Although I typically develop on a Mac, all of my deployments target Ubuntu. More than once, I've had a build or deployment fail due to mismatched require() identifiers and filenames. Avoid the issue entirely by using all lower-case file names and require() identifiers.
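The caching behavior can be checked directly. This sketch uses a hypothetical counter.js module written to a temp directory, and inspects require.cache, which is keyed by the resolved file name.

```javascript
const fs = require('fs');
const os = require('os');
const path = require('path');

const dir = fs.mkdtempSync(path.join(os.tmpdir(), 'cache-'));
const file = path.join(dir, 'counter.js');
fs.writeFileSync(file, "module.exports = { hits: 0 };");

const first = require(file);
const second = require(file);
first.hits++;

console.log(first === second); // true, both point at the cached instance
console.log(second.hits);      // 1

// require.cache is keyed by the resolved file name.
const cacheKey = require.resolve(file);
console.log(cacheKey in require.cache); // true

// Removing the entry forces a fresh load on the next require().
delete require.cache[cacheKey];
const third = require(file);
console.log(third === first); // false
console.log(third.hits);      // 0
```

Deleting cache entries is shown here only to make the mechanism visible. In application code, it's usually simpler to leave the cache alone.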
Circular Dependencies
Circular dependency issues can arise with modules that reference each other. Circular dependencies can occur when Module A requires Module B and Module B requires Module A. Here's the A module, a.js, that includes the following code:
console.log('a starting');
exports.done = false;
var b = require('./b');
console.log('in a, b.done = %j', b.done);
exports.done = true;
console.log('a done');
And the B module, b.js, that includes the following code:
console.log('b starting');
exports.done = false;
var a = require('./a');
console.log('in b, a.done = %j', a.done);
exports.done = true;
console.log('b done');
The a.js module starts running top to bottom. When it hits the require('./b') statement, execution is paused while it loads the b.js module. When the b.js module hits the require('./a') statement, it gets the paused, partially loaded copy of the a.js module as a reference. The value of a.done inside module B at load time is false because module A hasn't finished loading. Once module A has finished loading, the value of done is true. Be careful with circular dependencies and module-level state, as they may not produce what you expect. I like to refactor modules with circular dependencies to avoid this problem.
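To watch the partial load happen, the sketch below writes trimmed-down versions of a.js and b.js to a temp directory. Instead of logging, each module records what it saw on its own exports; the sawADone and sawBDone properties are hypothetical additions for this illustration.

```javascript
const fs = require('fs');
const os = require('os');
const path = require('path');

const dir = fs.mkdtempSync(path.join(os.tmpdir(), 'circular-'));

// Each module records the other's "done" flag at the moment it required it.
fs.writeFileSync(path.join(dir, 'a.js'), [
  "exports.done = false;",
  "var b = require('./b');",
  "exports.sawBDone = b.done;",
  "exports.done = true;"
].join('\n'));
fs.writeFileSync(path.join(dir, 'b.js'), [
  "exports.done = false;",
  "var a = require('./a');",
  "exports.sawADone = a.done;",
  "exports.done = true;"
].join('\n'));

const a = require(path.join(dir, 'a'));
const b = require(path.join(dir, 'b'));

console.log(b.sawADone);     // false, a.js was still paused at require('./b')
console.log(a.sawBDone);     // true, b.js had already finished
console.log(a.done, b.done); // true true
```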
There's no simple pattern that works 100% of the time for eliminating circular dependencies. That being said, one approach that works frequently is to extract the dependent logic in each module to a third common shared module. If you do this right, the above example would have module A depending on a new module C and module B depending on a new module C. Module C should not depend on either module A or B.
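Here is one possible shape of that refactoring, as a sketch only. The shared formatName helper and the greet and farewell functions are invented for the example; the point is that a.js and b.js each depend on c.js and no longer on each other.

```javascript
const fs = require('fs');
const os = require('os');
const path = require('path');

const dir = fs.mkdtempSync(path.join(os.tmpdir(), 'shared-'));

// c.js holds the logic both modules need and depends on neither of them.
fs.writeFileSync(path.join(dir, 'c.js'),
  "exports.formatName = function(name) { return name.trim().toUpperCase(); };");
fs.writeFileSync(path.join(dir, 'a.js'), [
  "var c = require('./c');",
  "exports.greet = function(name) { return 'Hello, ' + c.formatName(name); };"
].join('\n'));
fs.writeFileSync(path.join(dir, 'b.js'), [
  "var c = require('./c');",
  "exports.farewell = function(name) { return 'Goodbye, ' + c.formatName(name); };"
].join('\n'));

const a = require(path.join(dir, 'a'));
const b = require(path.join(dir, 'b'));

console.log(a.greet(' chris '));    // Hello, CHRIS
console.log(b.farewell(' chris ')); // Goodbye, CHRIS
```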
Module Style Tips
I like to follow a few style guidelines when consuming modules.
- I put all of my require statements at the top of each module. Any developer who comes after me can check the top of the module to see all of the required dependencies and the modules/packages they map to.
- I also place my require statements into three groups: core modules, npm modules, and internal modules. Grouping the require statements makes it easy to scan the code and quickly identify the source of each module.
- I skip the .js suffix when requiring a module. Skipping the suffix lets me interchange the module as a folder pattern with the single module pattern.
- I define variables that hold modules as const so they never change. And I treat a module reference like a class or static type in another language; you don't expect it to change what it points to part way through execution.
Module Design Patterns
Modules can be exposed in a variety of ways in Node. In some cases, it's merely a style choice and you'll have to choose your preferred style. In other cases, which pattern you choose is based on whether you want static or instanced module behavior.
Static
The simplest module pattern consists of exporting an object. Most developers start with this. Module caching means that a single copy of each module exists, making it effectively a singleton. When you attach a variable or function to module.exports as you have seen in previous examples, you're following this pattern. Although simple, this pattern can have some unexpected side effects. Beware of module-level variables like this:
var msgCount = 0;
var SharedModule = {};
SharedModule.log = function log(message) {
  msgCount++;
  console.log(`Message ${msgCount}: ${message}`);
};
module.exports = SharedModule;
If you want msgCount to track the number of messages logged by each consumer of the module, the above snippet won't work. All consumers of the module share the same msgCount and you'll get a global count of messages by all consumers of the module. If you want a per-instance count of messages, look at the next couple of patterns.
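The shared counter is easy to demonstrate. In this sketch, a variant of the module above is written to a temp file as shared.js (a hypothetical name) and required by two "consumers". To make the result easy to inspect, log() returns the formatted message instead of printing it, which is a change from the original snippet.

```javascript
const fs = require('fs');
const os = require('os');
const path = require('path');

const dir = fs.mkdtempSync(path.join(os.tmpdir(), 'static-'));
const file = path.join(dir, 'shared.js');
fs.writeFileSync(file, [
  "var msgCount = 0;",
  "module.exports.log = function log(message) {",
  "  msgCount++;",
  "  return 'Message ' + msgCount + ': ' + message;",
  "};"
].join('\n'));

// Two consumers, one cached module, one counter.
const consumerA = require(file);
const consumerB = require(file);

const fromA = consumerA.log('from A');
const fromB = consumerB.log('from B');
console.log(fromA); // Message 1: from A
console.log(fromB); // Message 2: from B
```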
Instance
There are a few instance-based patterns that you may encounter. Which one you use largely depends on which version of Node you need to support. In Node versions < 6 you'll see two main patterns: Revealing Module, and Revealing Prototype. You may be familiar with the Revealing Module pattern from client-side JavaScript. It provides a way to encapsulate private members while returning a public interface. Returning an object literal that exposes the interface and accessing internally scoped members through closure defines this pattern. The logger module provided next illustrates this technique:
module.exports = function(options) {
  options = options || {};
  var loggingLvl = options.logLevel || 1;
  function logMessage(logLevel, message) {
    if (logLevel <= loggingLvl) {
      console.log(message);
    }
  }
  return {
    log: logMessage
  };
};
The loggingLvl variable can't be accessed outside the module. The logMessage function has access to loggingLvl through the closure. The object literal returned by the function renames logMessage to log. To consume this function, you use code like this:
var logger = require('./logger')();
var logger2 = require('./logger')({logLevel: 3});
logger.log(1, "Logger level 1");
logger2.log(1, "Logger2 level 1");
Notice the () after the require statement. Because the module exports a function, you have to call the function to get an object instance. The revealing module pattern does have a downside. Each instance of the module contains the complete implementation, including all of the method implementations. If you have an object with quite a bit of method code used in a way that creates thousands of instances, this overhead can really add up.
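The per-instance overhead can be seen by comparing the functions on two instances. In this sketch, the factory from the logger module is inlined as createLogger (a hypothetical name) so the example runs as a single file.

```javascript
function createLogger(options) {
  options = options || {};
  var loggingLvl = options.logLevel || 1;
  function logMessage(logLevel, message) {
    if (logLevel <= loggingLvl) {
      console.log(message);
    }
  }
  return { log: logMessage };
}

const loggerA = createLogger();
const loggerB = createLogger({ logLevel: 3 });

// Each call builds its own logMessage function object.
console.log(loggerA.log === loggerB.log); // false
// The private level is not reachable from the outside.
console.log(loggerA.loggingLvl);          // undefined
```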
The second instance technique uses prototypal inheritance to eliminate this overhead. A prototype-based implementation looks like:
function Logger(options) {
  options = options || {};
  this.loggingLevel = options.logLevel || 1;
}
Logger.prototype.log = function log(logLevel, message) {
  if (logLevel <= this.loggingLevel) {
    console.log(message);
  }
};
module.exports = Logger;
In this example, you export the function Logger. The Logger function serves as the constructor. In JavaScript, functions are objects and they all have a property called prototype that's already defined. You then add the implementation code to this prototype and consume the module like this:
var Logger = require('./logger');
var logger = new Logger();
var logger2 = new Logger({logLevel: 4});
logger.log(1, "Logger1 level 1");
logger2.log(1, "logger2 level 1");
Notice that there are no () after the require statement. The require statement gives you what many other languages call the base type. In the next two lines, you create instances of the Logger object. In JavaScript, using new like this creates a new object with a reference (__proto__) to the constructor's .prototype property, calls the function with the new object bound to this, and returns the object. The returned object is an instance that shares any methods defined on the .prototype property with every other instance. When you call logger.log, JavaScript looks for a log property defined on the logger object itself. Because that property doesn't exist, JavaScript looks up the prototype chain (__proto__) for a definition and calls that implementation.
I could add another method to the Logger .prototype property AFTER creating logger and logger2 and it could still be called from logger and logger2. The ability to alter behavior after construction is what sets prototypal inheritance apart from the class-based inheritance of C# or Java. Behavior doesn't have to be defined on the base object at construction time; it only needs to be available at invocation. Behavior like this may seem odd to a developer coming from object-oriented languages like C# or Java.
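The prototype lookup and the late-added method can both be verified in a few lines. The Logger constructor is repeated here so the sketch runs as a single file; the describe method is a hypothetical addition used only to show a method attached after construction.

```javascript
function Logger(options) {
  options = options || {};
  this.loggingLevel = options.logLevel || 1;
}
Logger.prototype.log = function log(logLevel, message) {
  if (logLevel <= this.loggingLevel) {
    console.log(message);
  }
};

const logger = new Logger();
const logger2 = new Logger({ logLevel: 4 });

console.log(Object.getPrototypeOf(logger) === Logger.prototype); // true
console.log(logger.hasOwnProperty('log'));                       // false
console.log(logger.log === logger2.log);                         // true

// Hypothetical method added after both instances were constructed.
Logger.prototype.describe = function describe() {
  return 'level ' + this.loggingLevel;
};
console.log(logger.describe());  // level 1
console.log(logger2.describe()); // level 4
```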
If you're using Node 6 or later, you have an additional option for instance-based modules. With Node 6 adopting ECMAScript 6, you can use the new class keyword to define instance-based modules. The class keyword provides syntactic sugar over the prototypal inheritance discussed previously. It also provides a syntax that looks familiar to developers coming to JavaScript from traditional object-oriented languages.
Reimplementing the Logger module from the previous examples looks like this:
class Logger {
  constructor(options) {
    options = options || {};
    this.loggingLevel = options.logLevel || 1;
  }
  log(logLevel, message) {
    if (logLevel <= this.loggingLevel) {
      console.log(message);
    }
  }
}
module.exports = Logger;
You start with the new class keyword that defines the class. Logger now includes a clearly defined constructor that handles initialization, and method implementations are defined directly in the body of the class.
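A quick check shows that the class form still rests on the prototype mechanism. The class is repeated so the sketch runs as a single file, and the callError variable exists only for the demonstration.

```javascript
class Logger {
  constructor(options) {
    options = options || {};
    this.loggingLevel = options.logLevel || 1;
  }
  log(logLevel, message) {
    if (logLevel <= this.loggingLevel) {
      console.log(message);
    }
  }
}

// A class is still a function, and its methods live on the prototype.
console.log(typeof Logger);               // function
console.log(typeof Logger.prototype.log); // function

const logger = new Logger({ logLevel: 2 });
console.log(Object.getPrototypeOf(logger) === Logger.prototype); // true

// Unlike a plain constructor function, a class can't be called without new.
let callError = null;
try {
  Logger();
} catch (err) {
  callError = err;
}
console.log(callError instanceof TypeError); // true
```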
In my own practice, I'm moving to this style of instance module in code where I don't have to be backward compatible with Node versions prior to 6. For Node 4 and earlier, I use the technique shown in the prototypal examples.
Summary
If you're looking to re-use code in your Node projects, the patterns outlined in this article will be very helpful. Start with static modules and introduce instance-based modules as you find a need for instance-level differentiation. I recommend using lower-case filenames for all modules to prevent issues on case-sensitive file systems. Have fun with Node!