DEV Community

Cover image for $graphLookup (Connect By / Recursive Query)
Franck Pachot for MongoDB

Posted on • Edited on

$graphLookup (Connect By / Recursive Query)

In this series, I present various access patterns for a specific document model. These patterns are supported by a limited set of secondary indexes designed to make queries efficient, without modifying the document schema.

This article explores recursive searches through graph-like relationships between documents (imported in the forst post of this series), with each video in this collection showcasing related content with an array of related videos:

[
  {
    _id: '---U8lzusKE',
    category: 'Entertainment',
    relatedVideos: [
      'x9LRHlMdZmA', '5P5nxdJAFdE',
      'jdg8Sp1HUKM', 'xdxVBiJe8Co',
      'qLSA0gQ9z28', 'WHZPEkZCqwA',
      'y3VMhFCLxRc', 'hHjGtBnSv50',
      '_vx1OVLX5Rc', 'V4LnorVVxfw',
      'l56K8eAtCig', 'dHpCoFyMCHU',
      'XO5BYR39te8', 'yWy0cuxNWDw',
      '4SiXdhL7wxU', '5EaZTxQeQMQ',
      'mOvmBNLQIi4', 'fa2CvFa2xY8',
      'CpbYBZKdi3s', 'lBxzoqTSILc',
      'RBumgq5yVrA', 'EoN8RKubbO0',
      'zIHQPgz_Iwg', '7PCkvCPvDXk',
      't1NVJlm5THo'
    ],
...
Enter fullscreen mode Exit fullscreen mode

With this structure, I can easily navigate from one video to its related ones, and from there to further related content, effectively building a graph of interconnected videos. There’s no need for an additional index since each video references the "_id" of its related videos, which is always indexed.

Access Patterns: forward traversal of related documents

The following query identifies a video and explores down to three levels of related videos, constructing a graph of connections based on the associated video array. It filters these connections by daily views and restructures the output for improved readability:

db.youstats.aggregate([  
  {  
    $match: { _id: 'YoB8t0B4jx4' }   
  },  
  {  
    $graphLookup: {  
      from: "youstats",  
      startWith: "$relatedVideos", 
      connectFromField: "relatedVideos", 
      connectToField: "_id", 
      as: "allRelatedVideos", 
      maxDepth: 3,  
      restrictSearchWithMatch: {  
        "views.daily.data": { $gt: 1e6 } 
      },  
      depthField: "level" 
    }  
  },  
  {  
    $project: {  
      _id: 1,  
      title: 1,  
      author: 1,  
      allRelatedVideos: {  
        $map: {  
          input: "$allRelatedVideos",  
          as: "video",  
          in: {  
            number: { $add: [ { $indexOfArray: ["$allRelatedVideos", "$$video"] }, 1 ] },  
            _id: "$$video._id",  
            title: "$$video.title",  
            author: "$$video.author",  
            level: "$$video.level"  
          }  
        }  
      }  
    }  
  }  
])
Enter fullscreen mode Exit fullscreen mode

The execution plan shows the IXSCAN only during the $match stage, but the subsequent iterations utilize the same method.

    stage: 'EXPRESS_IXSCAN',
    keyPattern: '{ _id: 1 }',
Enter fullscreen mode Exit fullscreen mode

With $graphLookup, you need to have an index on connectToField.

Access Patterns: backward traversal of related documents

To navigate the graph in the opposite direction and find the parent with the _id in its related videos array, an index on that field is essential for quick access. In MongoDB, indexes are created similarly for both scalar fields and arrays:

db.youstats.createIndex({ 
 relatedVideos: 1, _id: 1 
});  
Enter fullscreen mode Exit fullscreen mode

The following query, where the connectToField is the related videos array, is fast:

db.youstats.aggregate([  
  {      
    $match: { _id: 'x9LRHlMdZmA' }       
  },      
  {      
    $graphLookup: {      
      from: "youstats",      
      startWith: "$_id",      
      connectFromField: "_id",      
      connectToField: "relatedVideos",      
      as: "parentVideos",      
      maxDepth: 3,       
      depthField: "level",  
      restrictSearchWithMatch: {      
        "views.daily.data": { $gt: 1e6 }      
      }      
    }      
  }      
]);  
Enter fullscreen mode Exit fullscreen mode

Using $graphLookup in an aggregation pipeline effectively retrieves a limited number of documents, as long as the work area remains within the 100MB memory limit and results do not exceed the BSON limit of 16MB. For utilizing MongoDB as a document database, consider PuppyGraph (Querying MongoDB Atlas Data as a Graph). The same indexes allow fast recursive search and can be on a scalar identifier or an array or of child, depending if you implemented the one-to-many in the one-side or many-side.

Top comments (0)