In this series, I present various access patterns for a specific document model. These patterns are supported by a limited set of secondary indexes designed to make queries efficient, without modifying the document schema.
This article explores recursive searches through graph-like relationships between documents (imported in the forst post of this series), with each video in this collection showcasing related content with an array of related videos:
[
{
_id: '---U8lzusKE',
category: 'Entertainment',
relatedVideos: [
'x9LRHlMdZmA', '5P5nxdJAFdE',
'jdg8Sp1HUKM', 'xdxVBiJe8Co',
'qLSA0gQ9z28', 'WHZPEkZCqwA',
'y3VMhFCLxRc', 'hHjGtBnSv50',
'_vx1OVLX5Rc', 'V4LnorVVxfw',
'l56K8eAtCig', 'dHpCoFyMCHU',
'XO5BYR39te8', 'yWy0cuxNWDw',
'4SiXdhL7wxU', '5EaZTxQeQMQ',
'mOvmBNLQIi4', 'fa2CvFa2xY8',
'CpbYBZKdi3s', 'lBxzoqTSILc',
'RBumgq5yVrA', 'EoN8RKubbO0',
'zIHQPgz_Iwg', '7PCkvCPvDXk',
't1NVJlm5THo'
],
...
With this structure, I can easily navigate from one video to its related ones, and from there to further related content, effectively building a graph of interconnected videos. There’s no need for an additional index since each video references the "_id" of its related videos, which is always indexed.
Access Patterns: forward traversal of related documents
The following query identifies a video and explores down to three levels of related videos, constructing a graph of connections based on the associated video array. It filters these connections by daily views and restructures the output for improved readability:
db.youstats.aggregate([
{
$match: { _id: 'YoB8t0B4jx4' }
},
{
$graphLookup: {
from: "youstats",
startWith: "$relatedVideos",
connectFromField: "relatedVideos",
connectToField: "_id",
as: "allRelatedVideos",
maxDepth: 3,
restrictSearchWithMatch: {
"views.daily.data": { $gt: 1e6 }
},
depthField: "level"
}
},
{
$project: {
_id: 1,
title: 1,
author: 1,
allRelatedVideos: {
$map: {
input: "$allRelatedVideos",
as: "video",
in: {
number: { $add: [ { $indexOfArray: ["$allRelatedVideos", "$$video"] }, 1 ] },
_id: "$$video._id",
title: "$$video.title",
author: "$$video.author",
level: "$$video.level"
}
}
}
}
}
])
The execution plan shows the IXSCAN only during the $match stage, but the subsequent iterations utilize the same method.
stage: 'EXPRESS_IXSCAN',
keyPattern: '{ _id: 1 }',
With $graphLookup, you need to have an index on connectToField
.
Access Patterns: backward traversal of related documents
To navigate the graph in the opposite direction and find the parent with the _id in its related videos array, an index on that field is essential for quick access. In MongoDB, indexes are created similarly for both scalar fields and arrays:
db.youstats.createIndex({
relatedVideos: 1, _id: 1
});
The following query, where the connectToField
is the related videos array, is fast:
db.youstats.aggregate([
{
$match: { _id: 'x9LRHlMdZmA' }
},
{
$graphLookup: {
from: "youstats",
startWith: "$_id",
connectFromField: "_id",
connectToField: "relatedVideos",
as: "parentVideos",
maxDepth: 3,
depthField: "level",
restrictSearchWithMatch: {
"views.daily.data": { $gt: 1e6 }
}
}
}
]);
Using $graphLookup in an aggregation pipeline effectively retrieves a limited number of documents, as long as the work area remains within the 100MB memory limit and results do not exceed the BSON limit of 16MB. For utilizing MongoDB as a document database, consider PuppyGraph (Querying MongoDB Atlas Data as a Graph). The same indexes allow fast recursive search and can be on a scalar identifier or an array or of child, depending if you implemented the one-to-many in the one-side or many-side.
Top comments (0)