Compound indexing with MongoDB

We have our own reasons to go for NoSQL databases. But that doesn’t mean you can’t enjoy blazing-fast indexing features of old school relational databases. It is provided out of the box with smart solutions & even MongoDB.

So, I have data in MongoDB that looks like this.

{
    "_id" : ObjectId("54f612b6029b47909a90ce8d"),
    "Posting Company" : "Zeta",
    "Posting Title" : "Backend Engineer",
    "Division" : "Software developer",
    .....
}
{
    "_id" : ObjectId("2hs7s7s6029b47909a90ce8d"),
    "Posting Company" : "Flock",
    "Posting Title" : "Frontend Engineer",
    "Division" : "Software developer",
    .....
}
.....

Now, under normal fetch operations like below, the web app had a bad time.

db.jobs.find(
    "Posting Company" : "Flock",
    "Posting Title" : "Frontend Engineer",
    "Division" : "Software developer",
);

It took more than 10 seconds to fetch data (even 20 – 30 seconds when the match was large in number). I mean, this is simple stuff right. Let’s do some indexing to nail it.

db.jobs.createIndex( { "Posting Company": -1 } );

But that did not work great. Even for a single record match, it took quite a time. It was still sorting through "Posting Title" and "Division" by going through all records of "Flock". Hard work! I started recording time now to benchmark. I wrote a python code that spat time taken at me like this.

5156 records
db: 3.685041904449463
tat@mmpbacula$ python3 test10.py
5156 records
db: 3.6494805812835693
tat@mmpbacula$ python3 test10.py
5156 records
db: 3.5948400497436523

But it still took 3.5 seconds approx. (I would say 4 seconds). If you ask the old sage relational guys they will tell you to stay away from NoSQL because of these reasons. “Man NoSQL will slow you down”.

But no. Hear something called Compound Index of MongoDB. To quote from the docs…

A single index structure holds references to multiple fields within a collection’s documents .

MongoDB docs on compound index

So, I had three criteria to match at a time & I applied Compound indexing.

db.jobs.createIndex( { 
    "Posting Company": -1,
    "Posting Title": -1,
    "Division": -1 
} );

The measured time is like this now. ( ~ 0.3 seconds)

6245 records
db: 0.343005895614624
tat@mmpbacula$ python3 test10.py
6245 records
db: 0.34383082389831543
tat@mmpbacula$ python3 test10.py
6245 records
db: 0.40097570419311523
tat@mmpbacula$ python3 test10.py
6245 records
db: 0.4061148166656494

It works like lightning and loads under a second. I am impressed. So, refer to manuals of your stuff. They tell you a lot about how to use them efficiently.

Design a site like this with WordPress.com
Get started