How to Find Duplicates in MongoDB

by Tutor Aspire

You can use the following syntax to find documents with duplicate values in MongoDB:

db.collection.aggregate([
    {"$group" : { "_id": "$field1", "count": { "$sum": 1 } } },
    {"$match": {"_id" :{ "$ne" : null } , "count" : {"$gt": 1} } }, 
    {"$project": {"name" : "$_id", "_id" : 0} }
])

Here’s what this syntax does:

  • Group all documents that have the same value in field1 and count the documents in each group
  • Match the groups whose value is not null and that contain more than one document
  • Project only the duplicated value of each remaining group, renamed to name

This particular query finds duplicate values in the field1 field. To check a different field, replace $field1 with that field's name.
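
Because only the field name changes between queries, the pipeline can be wrapped in a small helper. The following is a minimal sketch in plain JavaScript (the language of the MongoDB shell); the function name duplicatesPipeline is hypothetical and not part of MongoDB.

```javascript
// Hypothetical helper (not part of MongoDB): builds the duplicate-finding
// pipeline shown above for a given field name.
function duplicatesPipeline(field) {
  return [
    // Group documents by the field's value and count each group
    { $group: { _id: "$" + field, count: { $sum: 1 } } },
    // Keep groups with a non-null value and more than one document
    { $match: { _id: { $ne: null }, count: { $gt: 1 } } },
    // Return only the duplicated value, renamed to "name"
    { $project: { name: "$_id", _id: 0 } }
  ];
}

// Usage in the shell, assuming a collection named "collection":
// db.collection.aggregate(duplicatesPipeline("field1"))
```

Passing a different field name to the helper produces the same three stages with only the grouping key changed.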

The following example shows how to use this syntax with a collection named teams that contains the following documents:

db.teams.insertOne({team: "Mavs", position: "Guard", points: 31})
db.teams.insertOne({team: "Mavs", position: "Guard", points: 22})
db.teams.insertOne({team: "Rockets", position: "Center", points: 19})
db.teams.insertOne({team: "Rockets", position: "Forward", points: 26})
db.teams.insertOne({team: "Cavs", position: "Guard", points: 33})

Example: Find Documents with Duplicate Values

We can use the following code to find all of the duplicate values in the ‘team’ field:

db.teams.aggregate([
    {"$group" : { "_id": "$team", "count": { "$sum": 1 } } },
    {"$match": {"_id" :{ "$ne" : null } , "count" : {"$gt": 1} } }, 
    {"$project": {"name" : "$_id", "_id" : 0} }
])

This query returns the following results:

{ name: 'Rockets' }
{ name: 'Mavs' }

This tells us that the values ‘Rockets’ and ‘Mavs’ each occur multiple times in the ‘team’ field. The order of the results is not guaranteed, since $group does not sort its output.
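
To see why these two values are returned, the three stages can be traced by hand. The sketch below imitates them in plain JavaScript on an in-memory copy of the five documents. The function findDuplicates is a hypothetical illustration only, not how MongoDB runs the pipeline internally.

```javascript
// Same documents as inserted into the teams collection above
const teams = [
  { team: "Mavs", position: "Guard", points: 31 },
  { team: "Mavs", position: "Guard", points: 22 },
  { team: "Rockets", position: "Center", points: 19 },
  { team: "Rockets", position: "Forward", points: 26 },
  { team: "Cavs", position: "Guard", points: 33 }
];

// Plain-JavaScript imitation of the pipeline (illustration only)
function findDuplicates(docs, field) {
  // $group: count documents per distinct value;
  // documents missing the field are grouped under null
  const counts = new Map();
  for (const doc of docs) {
    const key = doc[field] === undefined ? null : doc[field];
    counts.set(key, (counts.get(key) || 0) + 1);
  }
  // $match: keep non-null values that occur more than once
  // $project: output each value under "name"
  return [...counts]
    .filter(([key, count]) => key !== null && count > 1)
    .map(([key]) => ({ name: key }));
}

const duplicateTeams = findDuplicates(teams, "team");
console.log(duplicateTeams);
```

Here ‘Mavs’ and ‘Rockets’ each have a count of 2, while ‘Cavs’ has a count of 1 and is filtered out. The sketch lists values in the order they are first seen, which may differ from the order MongoDB returns.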

Note that we can simply change $team to $position to instead search for duplicate values in the ‘position’ field:

db.teams.aggregate([
    {"$group" : { "_id": "$position", "count": { "$sum": 1 } } },
    {"$match": {"_id" :{ "$ne" : null } , "count" : {"$gt": 1} } }, 
    {"$project": {"name" : "$_id", "_id" : 0} }
])

This query returns the following results:

{ name: 'Guard' }

This tells us that ‘Guard’ occurs multiple times in the ‘position’ field.
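
The queries above report only the duplicated values. If the number of occurrences and the _id of each duplicated document are also needed, one possible extension collects them with the $push accumulator. This is a sketch under assumptions: the function name duplicateDetailsPipeline and the output field names count and ids are illustrative choices, not part of the original queries.

```javascript
// Hypothetical helper: like the pipeline above, but also keeps the count
// and the _id of every document that shares the duplicated value.
function duplicateDetailsPipeline(field) {
  return [
    {
      $group: {
        _id: "$" + field,
        count: { $sum: 1 },      // number of documents with this value
        ids: { $push: "$_id" }   // _id of each document in the group
      }
    },
    // Keep groups with a non-null value and more than one document
    { $match: { _id: { $ne: null }, count: { $gt: 1 } } },
    // Rename the grouping key to "name" and keep count and ids
    { $project: { _id: 0, name: "$_id", count: 1, ids: 1 } }
  ];
}

// Usage in the shell with the teams collection:
// db.teams.aggregate(duplicateDetailsPipeline("position"))
```

With the teams collection, this would be expected to return a single document for ‘Guard’ with a count of 3 and the three matching _id values.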

Additional Resources

The following tutorials explain how to perform other common operations in MongoDB:

MongoDB: How to Add a New Field in a Collection
MongoDB: How to Group By and Count
MongoDB: How to Group By Multiple Fields
