Introduction
Effectively managing data, services, and information is crucial to a company’s success. As data volumes keep growing, choosing the right data management solution is more important than ever. MongoDB is an open source NoSQL database that you can use to manage very large data sets. It provides high performance, high availability, and easy scalability. A critical element of data management is performing regular backups to prevent data loss, and this is not something you should do manually. There are several ways to perform backups in MongoDB; this article shows you how to configure and run backup scripts using mongodump and mongorestore.
Set up a MongoDB instance
Use the following steps to design MongoDB backup scripts that run automatically with a cron job. You can customize the backup script based on your business requirements.
Note: To follow the steps in this tutorial, create a separate user (named mongo, for example) to perform the administrative operations in MongoDB.
- Install MongoDB from the MongoDB Download Center, using the root credentials. In this example, the MongoDB binaries are located in the /opt/mongodb/mongodb/bin/ directory.
Log in as root, and then create a group and user:
# groupadd mongogrp
# vi /etc/group
mongogrp:x:1005:
Add the user mongo to the group mongogrp that you just created:
# useradd mongo -d /home/mongo -f -1 -g mongogrp -m
Verify that the user was created:
# cd /home
# ls -l
drwxr-xr-x 2 mongo mongogrp 4096 Oct 18 07:25 mongo
# vi /etc/passwd
mongo:x:1005:1005::/home/mongo:
Set the password for the user mongo:
# passwd mongo
Invoke a MongoDB shell
Open a new session and log in as the user mongo:
$ sudo su - mongo
Locate the bin directory of the MongoDB installation:
$ cd /opt/mongodb/mongodb/bin/
$ ls
bsondump mongo mongod mongodump mongoexport mongofiles mongoimport mongooplog mongoperf mongorestore mongos mongosniff mongostat mongotop
- To run these utilities, the user mongo must have proper access to the bin directory. Use the root credentials to grant the mongo user that access.
- Verify that the utilities listed above are under the bin directory.
You have two options when invoking a MongoDB shell: One uses SSL and one does not. The instructions for both are below.
To invoke a MongoDB shell without SSL, run the mongo command with these options:
$ ./mongo --host pre-mongo01.ibmcloud.com --port 27017
To invoke a MongoDB shell using SSL, run the mongo command with these options:
$ ./mongo --ssl --sslCAFile ../cert/mongo.server.trust-certs.pem --sslPEMKeyPassword password123 pre-mongo01.ibmcloud.com:27017
Where:
Certification file: /opt/mongodb/mongodb/cert/mongo.server.trust-certs.pem
sslPEMKeyPassword: password123
Port number: 27017
Hostname: pre-mongo01.ibmcloud.com
Note: In this example, the command opens MongoDB shell version 3.2. You can now perform various operations.
This article executes MongoDB commands and utilities using SSL.
> show dbs
testdb1 0.800GB
local 0.000GB
mydb 0.300GB
MongoDB backup and restore functions
To back up a MongoDB database, use the mongodump utility, which is located in the bin directory. Unless you specify another location with --out, mongodump writes all data into a dump folder under the directory you run it from (here, the bin directory). MongoDB uses port 27017 by default. You can use mongodump for both hot (online) and cold (offline) backups.
Offline MongoDB backup
The MongoDB server has a primary daemon process, called mongod, which manages data access, data requests, and background operations. To perform an offline backup, you first stop the mongod service, which stops the MongoDB instance. Then, perform the backup and start the MongoDB instance again.
To create an offline MongoDB backup:
Create a backup directory. Based on the size of the MongoDB database, create the backup directory at a suitable backup location. Here, the backup directory is /mongo_data/backup:
# mkdir /mongo_data/backup
# ls -l
Change the owner to the user mongo:
# chown -R mongo:mongogrp /mongo_data/backup/
# ls -l
Stop the mongod instance:
# service mongod stop
Perform the offline backup. First, log in as the mongo user and locate the bin directory:
$ pwd
/opt/mongodb/mongodb/bin
$ ./mongodump --ssl --sslCAFile ../cert/mongo.server.trust-certs.pem --sslPEMKeyPassword password123 --host pre-mongo01.ibmcloud.com:27017 --out /mongo_data/backup/
Start the mongod instance:
# service mongod start
Online MongoDB backup
To perform an online backup, run the mongodump command without stopping and restarting the mongod instance. You can run this command with or without SSL. Both commands are shown below.
To run mongodump using SSL, your command looks something like this:
$ ./mongodump --ssl --sslCAFile ../cert/mongo.server.trust-certs.pem --sslPEMKeyPassword password123 --host pre-mongo01.ibmcloud.com:27017 --out /mongo_data/backup/
To run mongodump without SSL, your command looks like this:
$ ./mongodump --host pre-mongo01.ibmcloud.com --port 27017 --out /mongo_data/backup/
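Because the SSL and non-SSL commands differ only in three options, a small helper can assemble either form. This is an illustrative sketch: the function name build_mongodump_cmd and the environment variable MONGO_PEM_PASSWORD are assumptions introduced here, and the function only prints the command so you can review it before running it.

```shell
#!/bin/bash
# Print the mongodump command line for an online backup.
# Arguments: host, port, output directory, and 1 to add the SSL options.
# MONGO_PEM_PASSWORD is a hypothetical environment variable that keeps the
# key password out of the script text.
build_mongodump_cmd() {
    local host="$1" port="$2" out_dir="$3" use_ssl="${4:-0}"
    local cmd=(./mongodump)
    if [ "$use_ssl" = "1" ]; then
        cmd+=(--ssl --sslCAFile ../cert/mongo.server.trust-certs.pem
              --sslPEMKeyPassword "${MONGO_PEM_PASSWORD:-}")
    fi
    cmd+=(--host "$host" --port "$port" --out "$out_dir")
    # Join the array with single spaces and print it on one line.
    printf '%s\n' "${cmd[*]}"
}

# Example call:
#   MONGO_PEM_PASSWORD=password123 \
#     build_mongodump_cmd pre-mongo01.ibmcloud.com 27017 /mongo_data/backup/ 1
```

To execute the command instead of printing it, replace the printf line with "${cmd[@]}".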
Small sharded cluster backup with mongodump
If your sharded cluster holds a small set of data, you can connect to a mongos using mongodump.
In a MongoDB sharded cluster, mongos is a routing service that processes queries from the application layer. To perform an operation, it determines the location of the requested data in the sharded cluster.
You can back up the MongoDB cluster this way if the infrastructure can perform the entire backup in a reasonable amount of time and the storage system can hold the complete MongoDB data set. By default, mongodump issues its queries to the non-primary nodes.
To back up a sharded cluster, use mongodump as shown in the following command:
$ ./mongodump --host pre-mongo01.ibmcloud.com --port 27017
Note: Applications can continue to modify data while mongodump captures the output. For replica sets, mongodump provides the --oplog option to include in its output the oplog entries that occur during the mongodump operation. This allows the corresponding mongorestore operation to replay the captured oplog. To restore a backup created with --oplog, use mongorestore with the --oplogReplay option. However, for replica sets, consider MongoDB Cloud Manager or Ops Manager.
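The --oplog and --oplogReplay pairing from the note can be sketched as two helper functions. The names run, dump_with_oplog, and restore_with_oplog, and the DRY_RUN variable, are hypothetical; by default the sketch only prints the commands. Note that --oplog applies to a dump of the whole instance, so no --db option is passed.

```shell
#!/bin/bash
# Print the command when DRY_RUN is 1 (the default here); otherwise run it.
run() {
    if [ "${DRY_RUN:-1}" = "1" ]; then
        echo "$*"
    else
        "$@"
    fi
}

# Dump a replica set member together with the oplog entries that are
# written while the dump is in progress.
# Arguments: host:port, output directory.
dump_with_oplog() {
    run ./mongodump --host "$1" --oplog --out "$2"
}

# Restore that dump and replay the captured oplog entries.
# Arguments: host:port, directory written by dump_with_oplog.
restore_with_oplog() {
    run ./mongorestore --host "$1" --oplogReplay "$2"
}

# Example calls:
#   dump_with_oplog pre-mongo01.ibmcloud.com:27017 /mongo_data/backup/
#   restore_with_oplog pre-mongo01.ibmcloud.com:27017 /mongo_data/backup/
```

Set DRY_RUN=0 to execute the commands, and add the SSL options shown earlier if your instance requires them.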
Recovery scenarios
You can better plan for and avoid failure scenarios if you understand how failures occur and how you can recover from them. The following sections simulate different types of failures and present a series of steps you can follow if one of them occurs in your environment.
Scenario 1. An entire database is accidentally dropped or becomes corrupted
A manual error or hardware failure can damage or corrupt an entire database. If this happens, you can recover the complete database by restoring the last full mongodump backup with the mongorestore utility.
This scenario uses a database named testdb1 whose collections (users and student) contain some records, as shown below.
Step 1. Verify the database and the collections
Log in as the mongo user:
$ sudo su - mongo
Locate the bin directory:
$ cd /opt/mongodb/mongodb/bin/
Invoke the MongoDB shell:
$ ./mongo --ssl --sslCAFile ../cert/mongo.server.trust-certs.pem --sslPEMKeyPassword password123 pre-mongo01.ibmcloud.com:27017
Verify the database and collections:
> show dbs
testdb1 0.800GB
local 0.000GB
mydb 0.300GB
> use testdb1
switched to db testdb1
> show collections
student
users
> db.student.find({},{_id:0})
{ "rollno" : 1, "name" : "amol", "subject" : "english", "marks" : 90 }
{ "rollno" : 2, "name" : "rachna", "subject" : "english", "marks" : 85 }
{ "rollno" : 3, "name" : "Bob", "subject" : "english", "marks" : 75 }
> db.users.find({},{_id:0})
{ "name" : "Amol", "age" : 39 }
{ "name" : "Bob", "age" : 30 }
{ "name" : "Rachna", "age" : 36 }
{ "name" : "Aadya", "age" : 3 }
Step 2. Back up the entire database
Use the following command to back up the entire database:
$ ./mongodump --ssl --sslCAFile ../cert/mongo.server.trust-certs.pem --sslPEMKeyPassword password123 --host pre-mongo01.ibmcloud.com:27017 --db testdb1 --out /mongo_data/backup/
2017-01-24T04:14:07.252-0500 writing testdb1.student to
2017-01-24T04:14:07.253-0500 writing testdb1.users to
2017-01-24T04:14:07.254-0500 done dumping testdb1.student (3 documents)
2017-01-24T04:14:07.254-0500 done dumping testdb1.users (3 documents)
Step 3. Simulate a failure
To simulate a failure scenario, you need to completely drop the database.
Connect to the database:
> use testdb1
switched to db testdb1
Verify the collections:
> show collections
student
users
Verify the current database:
> db
testdb1
Drop the database:
> db.dropDatabase()
{ "dropped" : "testdb1", "ok" : 1 }
Step 4. Restore database testdb1
Restore the backup image using the mongorestore utility as demonstrated below. In this example, you restore the latest backup image available at the backup location /mongo_data/backup/testdb1.
$ ./mongorestore --ssl --sslCAFile ../cert/mongo.server.trust-certs.pem --sslPEMKeyPassword password123 --host pre-mongo01.ibmcloud.com:27017 --db testdb1 /mongo_data/backup/testdb1
Step 5. Verify the database and collections
To verify that the database is restored, connect to it and query the collections:
> show dbs
testdb1 0.800GB
local 0.000GB
mydb 0.300GB
> use testdb1
switched to db testdb1
> show collections
student
users
> db.student.find({},{_id:0})
> db.users.find({},{_id:0})
Scenario 2. A collection is accidentally dropped
Sometimes, especially when there are thousands of collections in a database, a collection is dropped by mistake. To recover a collection that’s accidentally dropped, you need the latest backup of the collection (for instance, collection.name.bson) at the backup location. This scenario uses the collection student, which is dropped and then recovered using the mongorestore utility.
Step 1. Verify the collection
Use the following code to verify the collection:
> use testdb1
switched to db testdb1
> show collections
student
users
> db.student.find({},{_id:0})
{ "rollno" : 1, "name" : "amol", "subject" : "english", "marks" : 90 }
{ "rollno" : 2, "name" : "rachna", "subject" : "english", "marks" : 85 }
{ "rollno" : 3, "name" : "Bob", "subject" : "english", "marks" : 75 }
Step 2. Back up the collection
Perform the backup at the collection level only:
$ ./mongodump --ssl --sslCAFile ../cert/mongo.server.trust-certs.pem --sslPEMKeyPassword password123 --host pre-mongo01.ibmcloud.com:27017 --db testdb1 --collection student --out /mongo_data/backup
Note: Make sure that a file named student.bson is created at the backup location (in this example, under /mongo_data/backup/testdb1).
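The check in the note can be automated. The sketch below uses a hypothetical function name, verify_collection_dump, and assumes the layout mongodump produces with --out: one subdirectory per database that contains one .bson file per collection. It treats an empty file as a failure, which suits a collection that is known to hold records, such as student.

```shell
#!/bin/bash
# Succeed only if <backup_dir>/<db>/<collection>.bson exists and is not empty.
# Arguments: backup directory, database name, collection name.
verify_collection_dump() {
    local backup_dir="$1" db="$2" collection="$3"
    local bson="$backup_dir/$db/$collection.bson"
    if [ -s "$bson" ]; then
        echo "ok: $bson"
    else
        echo "missing or empty: $bson"
        return 1
    fi
}

# Example call:
#   verify_collection_dump /mongo_data/backup testdb1 student
```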
Step 3. Simulate the failure
To simulate this failure, drop the collection completely.
Connect to the database:
> use testdb1
switched to db testdb1
Drop the collection:
> db.student.drop()
true
Verify that the contents are dropped:
> db.student.find({},{name:1,age:1,_id:0})
Step 4. Restore the collection
Restore the latest backup image of the collection (for example, student.bson) from the /mongo_data/backup/testdb1 location as shown:
$ ./mongorestore --ssl --sslCAFile ../cert/mongo.server.trust-certs.pem --sslPEMKeyPassword password123 --host pre-mongo01.ibmcloud.com:27017 --db testdb1 --collection student /mongo_data/backup/testdb1/student.bson
Note: You can also perform a database-level backup using mongodump and then restore only the required collection from the backup directory.
Step 5. Verify the collections
To verify that the collection is restored, connect to the database and query the collection:
> use testdb1
switched to db testdb1
> show collections
student
users
> db.student.find({},{_id:0})
{ "rollno" : 1, "name" : "amol", "subject" : "english", "marks" : 90 }
{ "rollno" : 2, "name" : "rachna", "subject" : "english", "marks" : 85 }
{ "rollno" : 3, "name" : "Bob", "subject" : "english", "marks" : 75 }
Create and run a backup script using a cron job
To implement a backup strategy that matches your business requirements, set up a customized backup script, run_backup.sh. You also need to set up a cron job to run this script.
Create the backup script
The following sample backup script, run_backup.sh, is based on the following conditions.
- The script first deletes the backup images older than 30 days from the backup location. It then performs the daily backup of all the databases at the backup location.
- The script writes entries to the log file backup.log that record what the backup run deleted and when it was performed.
By default, mongodump does not capture the contents of the local database. You should add it separately if required. To uniquely identify each backup with the application name, you must define some parameters in the backup script, as the sample script later in this section shows.
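The 30-day cleanup condition can be isolated into one function that is easy to test on a scratch directory before it touches real backups. This is a sketch under assumptions: the function name prune_old_backups is hypothetical, and -mindepth and -maxdepth are extensions available in GNU and BSD find rather than part of POSIX.

```shell
#!/bin/bash
# Print, then delete, backup directories directly under backup_root whose
# names start with "<prefix>-" and whose modification time is more than
# the given number of days in the past.
# Arguments: backup root, application prefix, retention in days.
prune_old_backups() {
    local backup_root="$1" prefix="$2" days="$3"
    find "$backup_root" -mindepth 1 -maxdepth 1 -type d \
        -name "${prefix}-*" -mtime +"$days" -print -exec rm -rf {} +
}

# Example call, appending the deleted paths to the backup log:
#   prune_old_backups /mongo_data/backup app1 30 >> /mongo_data/backup/backup.log
```

Limiting the search to one level keeps find from matching directories nested inside a backup image.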
Create a backup directory
Log in as the root user and create the backup directory:
# mkdir /mongo_data/backup
Change the owner to the mongo user:
# chown -R mongo:mongogrp /mongo_data/backup
Log in as the mongo user and confirm that you are in the backup directory:
mongo@pre-mongo01:/mongo_data/backup$ pwd
/mongo_data/backup
Create a backup script:
mongo@pre-mongo01:/mongo_data/backup$ vi run_backup.sh
The following code shows sample content of the backup script:
#!/bin/bash
cd /opt/mongodb/mongodb/bin/
echo `date` >> /mongo_data/backup/backup.log
APP_NAME="app1"
MONGO_HOST="pre-mongo01.ibmcloud.com"
MONGO_PORT="27017"
TIMESTAMP=`date +%F-%H%M`
MONGODUMP_PATH="/opt/mongodb/mongodb/bin/mongodump"
BACKUPS_DIR="/mongo_data/backup/$APP_NAME-$TIMESTAMP"
BACKUP_NAME="/mongo_data/backup/$APP_NAME-$TIMESTAMP"
mkdir -p "$BACKUPS_DIR"
#Delete all backups older than 30 days from /mongo_data/backup
echo "Deleting following backup files older than 30 days:" >> /mongo_data/backup/backup.log
find /mongo_data/backup/ -type d -name 'app1-*' -mtime +30 >> /mongo_data/backup/backup.log
find /mongo_data/backup/ -type d -name 'app1-*' -mtime +30 -exec rm -rf {} +
#mongodump writes its progress messages to stderr, so both streams go to the log.
#Run the daily backup 'local' database only.
for databaseName in local
do
  echo "Starting daily backup of $databaseName ...." >> /mongo_data/backup/backup.log
  ./mongodump --ssl --sslCAFile ../cert/mongo.server.trust-certs.pem --sslPEMKeyPassword password123 --host pre-mongo01.ibmcloud.com:27017 --db $databaseName >> /mongo_data/backup/backup.log 2>&1
  #Run the daily backup of remaining databases.
  echo "Starting daily backup of all databases...." >> /mongo_data/backup/backup.log
  ./mongodump --ssl --sslCAFile ../cert/mongo.server.trust-certs.pem --sslPEMKeyPassword password123 --host pre-mongo01.ibmcloud.com:27017 >> /mongo_data/backup/backup.log 2>&1
  if [ $? != 0 ]; then
    echo "Failed to make backup of $databaseName on `date +%F_%T`" | mailx -s "MongoDB backup failed" amolbarsagade@in.ibm.com
  fi
done
#Move the dump folder into the timestamped backup directory.
mv /opt/mongodb/mongodb/bin/dump "$BACKUP_NAME"
echo `date` >> /mongo_data/backup/backup.log
echo "End of backup run" >> /mongo_data/backup/backup.log
echo "----------------------------------" >> /mongo_data/backup/backup.log
Grant permission
Save the file and grant it 755 permission:
mongo@pre-mongo01:/mongo_data/backup$ chmod 755 run_backup.sh
Create a cron job to run the backup script
Create a new crontab file, named mycron.txt, at the /mongo_data/backup location and schedule the job according to the backup strategy.
Log in as the mongo user:
mongo@pre-mongo01:/mongo_data/backup$ pwd
/mongo_data/backup
Check the existing cron jobs for the mongo user:
mongo@pre-mongo01:/mongo_data/backup$ crontab -l
no crontab for mongo
Create a new cron job:
mongo@pre-mongo01:/mongo_data/backup$ vi mycron.txt
Add the backup schedule to run this cron job according to your backup strategy. For example:
30 02 * * * /mongo_data/backup/run_backup.sh >> /mongo_data/backup/run_backup.sh.out
Set the cron job and verify it:
mongo@pre-mongo01:/mongo_data/backup$ crontab mycron.txt
mongo@pre-mongo01:/mongo_data/backup$ crontab -l
30 02 * * * /mongo_data/backup/run_backup.sh >> /mongo_data/backup/run_backup.sh.out
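Running crontab with a file replaces the user's whole crontab, which is fine here because the mongo user had none. If the user already has other jobs, one possible approach is to merge the backup entry into the existing table instead. The sketch below uses a hypothetical function name, merge_cron_entry, and also redirects stderr to the output file, which the entry above does not do.

```shell
#!/bin/bash
# Read an existing crontab on stdin and print it with the backup entry
# added exactly once. Any earlier line that mentions the script is dropped
# first, so repeated runs do not create duplicates.
# Arguments: full path of the backup script, cron schedule fields.
merge_cron_entry() {
    local script="$1" schedule="$2"
    # grep exits with status 1 when it prints nothing; that is not an error here.
    grep -vF "$script" || true
    echo "$schedule $script >> $script.out 2>&1"
}

# Example call (installs the merged table for the current user):
#   crontab -l 2>/dev/null \
#     | merge_cron_entry /mongo_data/backup/run_backup.sh "30 02 * * *" \
#     | crontab -
```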
Conclusion
You’ve configured and run the backup scripts for MongoDB database servers, and you have a better understanding of how to use the mongodump and mongorestore utilities for backup and restore purposes. You scheduled and ran the backup script using a cron job. You can use the backup script to schedule, maintain, and manage the backup of your MongoDB database servers.