MongoDB at a conceptual level
After my first look at MongoDB, I dug deeper into the main elements of MongoDB's ecosystem. This post collects the notes I took to help me conceptualize some of the key ideas behind MongoDB.
Network Communication
Servers
MongoDB servers are processes that manage data for us. These processes are created from an executable named "mongod", which can be found in the bin folder of the MongoDB package. When starting these processes, we feed them configuration values, either via command-line parameters or via a configuration file.
When started successfully, a MongoDB server listens on a port number (27017 by default).
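As a rough illustration of what "listening on a port" means, the Python sketch below checks whether anything accepts TCP connections on a given host and port. It uses only the standard library. The function name is_port_open is made up for this example, and port 27017 is assumed because it is mongod's default. A successful check only shows that some process is listening there, not that it is a MongoDB server.

```python
import socket


def is_port_open(host: str, port: int, timeout: float = 0.5) -> bool:
    """Return True if a TCP connection to (host, port) can be opened."""
    try:
        # create_connection performs the TCP handshake; success means
        # some process is listening on that port.
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Connection refused, timed out, or host unreachable.
        return False


if __name__ == "__main__":
    # Assumption: a mongod on this machine using its default port, 27017.
    print(is_port_open("localhost", 27017))
```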
Clients and Drivers
Clients are entities which connect to the servers via the relevant port number(s). Clients query, insert, update or delete data using the Mongo Wire Protocol. The MongoDB interactive shell is one example of a client. It goes by the name of "mongo" and is included in the same folder as the MongoDB server.
Although it is possible to create clients that speak the Mongo Wire Protocol to the servers directly, it is easier and less error-prone to use one of the many available drivers. With the driver written for the programming language that we are using, we can create connections to one or more servers to manipulate data in MongoDB.
Connections
We create connections via the driver so that our own applications can communicate with the servers as clients. In this respect, connections are programming abstractions of the communication link between the client and the server. Obtaining a connection to the server(s) is the first step towards giving our application code access to the data in MongoDB.
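Drivers commonly describe the server(s) to connect to with a connection string of the form mongodb://host1:port1,host2:port2/database. The Python sketch below only assembles such a string. The helper name build_mongo_uri and the host names are hypothetical, and no driver is imported, so it runs without a server. With the Python driver (PyMongo), a string like this would typically be handed to MongoClient.

```python
def build_mongo_uri(hosts, database=""):
    """Build a mongodb:// connection string from (host, port) pairs."""
    if not hosts:
        raise ValueError("at least one (host, port) pair is required")
    # Several servers are listed comma-separated in a single string.
    host_list = ",".join(f"{host}:{port}" for host, port in hosts)
    return f"mongodb://{host_list}/{database}"


if __name__ == "__main__":
    # One server on the default port.
    print(build_mongo_uri([("localhost", 27017)], "test"))
    # Two hypothetical servers.
    print(build_mongo_uri([("db1.example.com", 27017),
                           ("db2.example.com", 27018)], "test"))
```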
Data Organization
Documents
MongoDB is a document-oriented store, and the smallest unit into which data is organized is called a document. This kind of document is different from the word-processor documents that we write for our reports. Documents in MongoDB are JavaScript-style objects/arrays with one or more key/value pairs. They are represented in a format called BSON, which stands for Binary JSON.
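To make the "Binary JSON" idea a little more concrete, here is a minimal Python sketch that hand-encodes a flat document whose values are all strings. It is an illustration only: the function name encode_string_document is made up, real drivers handle many more value types, and nobody would normally write this by hand. In BSON, a document is a little-endian int32 length, followed by its elements, followed by a zero byte. A string element is the type tag 0x02, the field name as a NUL-terminated string, and then a length-prefixed, NUL-terminated value.

```python
import struct


def encode_string_document(doc: dict) -> bytes:
    """Encode a flat dict of str -> str as a BSON document."""
    body = b""
    for key, value in doc.items():
        encoded = value.encode("utf-8") + b"\x00"
        body += b"\x02"                          # type tag: UTF-8 string
        body += key.encode("utf-8") + b"\x00"    # field name as a C string
        body += struct.pack("<i", len(encoded))  # value length, incl. NUL
        body += encoded
    body += b"\x00"  # document terminator
    # The leading int32 is the total size and counts its own four bytes.
    return struct.pack("<i", len(body) + 4) + body


if __name__ == "__main__":
    print(encode_string_document({"hello": "world"}))
```

For {"hello": "world"} this yields 22 bytes, beginning with the length 0x16 and ending with the terminating zero byte.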
A MongoDB document that contains information about a person could look as follows:
{
"_id" : ObjectId("50152889103d76bde4053fe4"),
"firstName" : "Clivant",
"lastName" : "Yeo"
}
In this document, the _id attribute contains a unique id which was automatically generated when I added an object with the firstName attribute having the value of Clivant and the lastName attribute having the value of Yeo.
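A generated ObjectId is more than a random value: it is 12 bytes long (24 hexadecimal characters), and its first 4 bytes record the creation time in seconds since the Unix epoch. The Python sketch below reads that timestamp back out of an id string using only the standard library. The function name object_id_timestamp is hypothetical.

```python
from datetime import datetime, timezone


def object_id_timestamp(object_id_hex: str) -> datetime:
    """Return the creation time embedded in a 24-character ObjectId."""
    if len(object_id_hex) != 24:
        raise ValueError("an ObjectId is 12 bytes, i.e. 24 hex characters")
    # The first 4 bytes (8 hex characters) are seconds since the Unix epoch.
    seconds = int(object_id_hex[:8], 16)
    return datetime.fromtimestamp(seconds, tz=timezone.utc)


if __name__ == "__main__":
    print(object_id_timestamp("50152889103d76bde4053fe4").isoformat())
```

For the id in the Person document, the embedded time falls in late July 2012.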
Documents that follow a structure similar to the example above are organized into collections.
Collections
Collections are named containers that hold documents which usually share the same structure. However, this practice is not mandatory, as MongoDB is designed to be schema-less: a collection can also hold documents with varying key/value pairs.
There are two kinds of collections:
- The standard collection can hold as many documents as storage permits and is created at the time when the first document is inserted into it.
- The capped collection can only hold a limited number of documents and is created explicitly using the createCollection command. In return for limiting the number of documents in the collection, we get fast performance and the assurance that the documents it contains are kept in insertion order.
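A capped collection behaves much like a fixed-size circular buffer: once it is full, the oldest documents make room for new ones, and insertion order is preserved. The Python sketch below imitates that behaviour in memory with a deque. It is only an analogy under a simplifying assumption: a real capped collection is sized in bytes (optionally with a maximum document count), whereas the limit of 3 documents here is arbitrary.

```python
from collections import deque

# Hypothetical stand-in for a capped collection limited to 3 documents.
capped = deque(maxlen=3)

# Insert five documents; the two oldest are discarded automatically.
for n in range(1, 6):
    capped.append({"seq": n})

# The remaining documents are still in insertion order.
print(list(capped))
```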
I would name a collection after the kind of documents that it stores. In addition, the name can be made up of at most 128 characters, and the folks at MongoDB recommend keeping it to 80 to 90 characters. The collection name should also begin with a letter or an underscore and may include numbers. The $ character, however, should not be used in a collection name.
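The naming rules in these notes can be written down as a small check. The Python sketch below encodes only the rules as stated here (length, first character, no $). It is not the server's own validation logic, and the function name is_valid_collection_name is made up for illustration.

```python
MAX_NAME_LENGTH = 128  # upper limit quoted in these notes


def is_valid_collection_name(name: str) -> bool:
    """Check a name against the rules listed in these notes."""
    if not name or len(name) > MAX_NAME_LENGTH:
        return False
    first = name[0]
    # Must begin with a letter or an underscore.
    if not (first.isalpha() or first == "_"):
        return False
    # The $ character should not appear anywhere in the name.
    return "$" not in name


if __name__ == "__main__":
    for candidate in ("people", "_audit2012", "2012logs", "pay$ments"):
        print(candidate, is_valid_collection_name(candidate))
```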
Collections are stored within databases.
Databases
A MongoDB server can support many databases. Each database is independent, with its underlying data stored separately. Apart from storing collections, a database also contains security access information that controls access to the data stored within it.