Administration
mongod
A daemon is a program or process that is meant to run in the background rather than be interacted with directly.
- mongod is the main daemon process for MongoDB.
- It is the core server of the database, handling connections, requests, and most importantly, persisting data.
default configuration
- port: 27017 -> mongod --port <port number>
- dbpath: /data/db -> mongod --dbpath <directory path>
- auth: false -> mongod --auth
- bound to localhost (127.0.0.1) -> mongod --bind_ip <ip address>[,<ip address>...]
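For instance, the defaults above could be overridden together in a single invocation (the port, dbpath, and extra IP address below are illustrative placeholders):

```shell
# start mongod on a non-default port, with auth enabled,
# a custom data directory, and an extra bound interface
mongod --port 27018 --dbpath /var/mongodb/db --auth --bind_ip localhost,192.168.103.100
```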
commands group
- db.<method>() - database-level methods
- db.<collection>.<method>() - collection-level methods
- rs.<method>() - replica set deployment and management methods
- sh.<method>() - sharded cluster deployment and management methods
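As a rough illustration, one call from each group, run in the mongo shell against a live deployment (the products collection name is made up):

```javascript
db.serverStatus()        // db.<method>(): database/server-level helper
db.products.findOne()    // db.<collection>.<method>(): collection-level helper
rs.status()              // rs.<method>(): replica set helper
sh.status()              // sh.<method>(): sharding helper (via mongos)
```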
Two ways to configure mongod:
- command line options
- configuration file (YAML)
🔗mongod configuration file options
# mongod.conf
storage:
  dbPath: "/data/db"
systemLog:
  path: "/data/log/mongod.log"
  destination: "file"
replication:
  replSetName: M103
net:
  bindIp: "127.0.0.1,192.168.103.100"
  tls:
    mode: "requireTLS"
    certificateKeyFile: "/etc/tls/tls.pem"
    CAFile: "/etc/tls/TLSCA.pem"
security:
  keyFile: "/data/keyfile"
processManagement:
  fork: true
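A mongod picks up such a file when started with the --config option (short form -f); the path below is illustrative:

```shell
mongod --config /etc/mongod.conf
# equivalent short form:
mongod -f /etc/mongod.conf
```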
logging
Log Verbosity Levels:
- -1: Inherit verbosity from the parent component
- 0: Default verbosity; includes informational messages
- 1-5: Increased verbosity; also includes debug messages
Log Message Severity Levels:
- F - Fatal
- E - Error
- W - Warning
- I - Informational (Verbosity Level 0)
- D - Debug (Verbosity Levels 1-5)
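Verbosity can be inspected and adjusted at runtime from the mongo shell; a minimal sketch, assuming a running mongod:

```javascript
db.getLogComponents()                 // current verbosity per component
db.setLogLevel(1)                     // raise global verbosity to 1 (debug)
db.setLogLevel(0, "index")            // set verbosity for a single component
db.adminCommand({ getLog: "global" }) // fetch recent in-memory log lines
```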
Basic Security
Authentication:
- SCRAM and X.509 are always available
- LDAP and Kerberos are Enterprise-only
Authorization: Role Based Access Control
- Each user has a set of roles.
- Each role has a set of privileges.
- A privilege is a permission to perform a specific operation on a specific resource.
Built-in Roles
- Database User
- read
- readWrite
- Database Administration
- dbAdmin
- userAdmin
- dbOwner
- Cluster Administration
- clusterAdmin
- clusterManager
- clusterMonitor
- hostManager
- Backup/Restore
- backup
- restore
- Super User
- root
Create user
Grant role to user
Show role privileges
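The three operations above might look like the following in the mongo shell (user, password, and database names are hypothetical):

```javascript
use admin
// Create user
db.createUser({
  user: "appUser",        // hypothetical user
  pwd: "appPass",         // hypothetical password
  roles: [ { role: "read", db: "applicationData" } ]
})
// Grant role to user
db.grantRolesToUser("appUser", [ { role: "readWrite", db: "applicationData" } ])
// Show role privileges
db.runCommand({
  rolesInfo: { role: "readWrite", db: "applicationData" },
  showPrivileges: true
})
```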
Server Tools
List mongodb binaries: find /usr/bin/ -name "mongo*"
mongostat
get a quick snapshot of statistics (inserts, queries, memory, connections) on a running mongod or mongos
mongostat --port <port>
mongorestore
restore a MongoDB collection from a BSON dump file
mongorestore --host <host> --port <port> --db <db> --drop --dir <directory>
# example:
mongorestore --drop --port 30000 dump/
mongodump
get a BSON dump of a MongoDB collection
mongodump --host <host> --port <port> --db <db> --out <directory>
# example:
mongodump --help
mongodump --port 30000 --db applicationData --collection products
ls dump/applicationData/
cat dump/applicationData/products.metadata.json
mongoexport
export a MongoDB collection to JSON or CSV (or stdout!)
mongoexport --host <host> --port <port> --db <db> --collection <collection> --out <file>
# example:
mongoexport --help
mongoexport --port 30000 --db applicationData --collection products
mongoexport --port 30000 --db applicationData --collection products -o products.json
Differences between mongoexport and mongodump:
- mongodump captures the entire collection (documents plus metadata), whereas mongoexport exports only the documents.
- By default, mongoexport sends output to standard output, but mongodump writes to a file.
- mongodump can create a data file and a metadata file, but mongoexport just creates a data file.
- mongodump outputs BSON, but mongoexport outputs JSON.
mongoimport
create a MongoDB collection from a JSON or CSV file
mongoimport --host <host> --port <port> --db <db> --collection <collection> --file <file>
# example:
mongoimport --port 27000 -u m103-application-user -p m103-application-pass --db applicationData --collection products --file /dataset/products.json --authenticationDatabase admin
Replication
Instructions to set up a replica set
- The configuration file for the first node (node1.conf):
storage:
  dbPath: /var/mongodb/db/node1
net:
  bindIp: 192.168.103.100,localhost
  port: 27011
security:
  authorization: enabled
  keyFile: /var/mongodb/pki/m103-keyfile
systemLog:
  destination: file
  path: /var/mongodb/db/node1/mongod.log
  logAppend: true
processManagement:
  fork: true
replication:
  replSetName: m103-example
- Creating the keyfile and setting permissions on it:
sudo mkdir -p /var/mongodb/pki/
sudo chown vagrant:vagrant /var/mongodb/pki/
openssl rand -base64 741 > /var/mongodb/pki/m103-keyfile
chmod 400 /var/mongodb/pki/m103-keyfile
- Creating the dbpath for node1:
mkdir -p /var/mongodb/db/node1
- Starting a mongod with node1.conf:
mongod -f node1.conf
- Copying node1.conf to node2.conf and node3.conf:
cp node1.conf node2.conf
cp node1.conf node3.conf
- Editing node2.conf and node3.conf:
node2.conf:
storage:
  dbPath: /var/mongodb/db/node2 # edited
net:
  bindIp: 192.168.103.100,localhost
  port: 27012 # edited
security:
  keyFile: /var/mongodb/pki/m103-keyfile
systemLog:
  destination: file
  path: /var/mongodb/db/node2/mongod.log # edited
  logAppend: true
processManagement:
  fork: true
replication:
  replSetName: m103-example
node3.conf:
storage:
  dbPath: /var/mongodb/db/node3 # edited
net:
  bindIp: 192.168.103.100,localhost
  port: 27013 # edited
security:
  keyFile: /var/mongodb/pki/m103-keyfile
systemLog:
  destination: file
  path: /var/mongodb/db/node3/mongod.log # edited
  logAppend: true
processManagement:
  fork: true
replication:
  replSetName: m103-example
- Configuration for a fourth data-bearing node (node4.conf):
storage:
  dbPath: /var/mongodb/db/node4
net:
  bindIp: 192.168.103.100,localhost
  port: 27014
systemLog:
  destination: file
  path: /var/mongodb/db/node4/mongod.log
  logAppend: true
processManagement:
  fork: true
replication:
  replSetName: m103-example
- Configuration for an arbiter (arbiter.conf):
storage:
  dbPath: /var/mongodb/db/arbiter
net:
  bindIp: 192.168.103.100,localhost
  port: 28000
systemLog:
  destination: file
  path: /var/mongodb/db/arbiter/mongod.log
  logAppend: true
processManagement:
  fork: true
replication:
  replSetName: m103-example
- Creating the data directories for node2 and node3:
mkdir -p /var/mongodb/db/node2
mkdir -p /var/mongodb/db/node3
- Starting mongod processes with node2.conf and node3.conf:
mongod -f node2.conf
mongod -f node3.conf
- Connecting to node1:
mongo --port 27011
- Initiating the replica set:
rs.initiate()
- Creating a user:
use admin
db.createUser({
  user: "m103-admin",
  pwd: "m103-pass",
  roles: [
    {role: "root", db: "admin"}
  ]
})
- Exiting out of the Mongo shell and connecting to the entire replica set:
exit
mongo --host "m103-example/192.168.103.100:27011" -u "m103-admin" -p "m103-pass" --authenticationDatabase "admin"
- Getting replica set status:
rs.status()
- Adding other members to replica set:
rs.add("m103:27012") # m103 stands for hostname
rs.add("m103:27013")
rs.add("m103:27014")
rs.addArb("m103:28000")
- Getting an overview of the replica set topology:
rs.isMaster()
- Stepping down the current primary:
rs.stepDown()
- Checking replica set overview after election:
rs.isMaster()
- Assigning the current configuration to a shell variable we can edit, in order to reconfigure the replica set:
cfg = rs.conf()
- Editing our new variable cfg to change topology - specifically, by modifying cfg.members (here, making the fourth member non-voting):
cfg.members[3].votes = 0
cfg.members[3].priority = 0
- Updating our replica set to use the new configuration cfg:
rs.reconfig(cfg)
Replication Commands
rs.status()
- Report health on replica set nodes
- Uses data from heartbeats
rs.isMaster()
- Describes a node's role in the replica set
- Shorter output than rs.status()
db.serverStatus()['repl']
- Section of db.serverStatus() that describes replica set status
- Similar to the output of rs.isMaster()
rs.printReplicationInfo()
- Only returns oplog data relative to current node
- Contains timestamps for first and last oplog events
Local DB
- Display collections from the local database (this displays more collections from a replica set than from a standalone node):
use local
show collections
- Query the oplog after connecting to a replica set member:
db.oplog.rs.find()
- Store oplog stats as a variable called stats:
var stats = db.oplog.rs.stats()
- Verify that this collection is capped (it will grow to a pre-configured size before it starts to overwrite the oldest entries with newer ones):
stats.capped
- Get current size of the oplog:
stats.size
- Get size limit of the oplog:
stats.maxSize
- Get current oplog data (including first and last event times, and configured oplog size):
rs.printReplicationInfo()
Sharding
We should consider sharding in these situations:
- Our organization outgrows the most powerful servers available, limiting our vertical scaling options.
- Generally, when our deployment reaches 2-5TB per server, we should consider sharding.
- Government regulations require data to be located in a specific geography.
In a sharded cluster, collection metadata is stored on the configuration servers.
mongos is just a router: to determine which shard(s) should receive a request, it consults the collection's metadata on the config servers.
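To make the routing concrete, sharding a collection from a mongos might look like this sketch (database, collection, and shard key are hypothetical):

```javascript
sh.enableSharding("applicationData")                       // enable sharding on the database
db.products.createIndex({ sku: 1 })                        // shard key needs a supporting index
sh.shardCollection("applicationData.products", { sku: 1 })
sh.status()                                                // inspect cluster topology
```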