⚙️ Customize Spring MongoDB DBRef Resolution
It is no secret that many developers love Spring for its ability to make sometimes complex things relatively simple and elegant. A huge community of developers, with their questions and answers, helps you quickly find a solution to a particular problem.
Today I would like to share one solution that allowed me to significantly improve performance when reading data from MongoDB.
In my scenario, I actively use DBRefs to refer from one object to another instead of embedding objects within a document. You could argue that they should not be used, given that the MongoDB documentation says:
Unless you have a compelling reason to use DBRefs, use manual references instead.
But my point is: if MongoDB has such a type, where I can explicitly see the collection and the ID, why not use it? Another point is that my data layer is built on top of Spring Data MongoDB, which simplifies many things; one of them is how the Spring Data MongoDB mapping framework works with DBRefs:
The mapping framework does not have to store child objects embedded within the document. You can also store them separately and use a DBRef to refer to that document. When the object is loaded from MongoDB, those references are eagerly resolved so that you get back a mapped object that looks the same as if it had been stored embedded within your top-level document.
Source: https://docs.spring.io/spring-data/mongodb/docs/3.1.3/reference/html/#mapping-usage-references
And everything is good while you have relatively small objects without deep nesting. When nesting becomes quite deep, when you need to load more data, or when the DB grows, you might notice performance degradation, as happened in my scenario: every time the mapping framework resolves references it fetches them from the DB again, ignoring the fact that the same portion of data has already been fetched.
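To make the scenario concrete, here is a minimal sketch of the kind of model I mean (the Order and User names are hypothetical):

import org.springframework.data.annotation.Id;
import org.springframework.data.mongodb.core.mapping.DBRef;
import org.springframework.data.mongodb.core.mapping.Document;

// Order lives in its own collection and points to User via @DBRef instead of
// embedding it; on every load of an Order the mapping framework eagerly
// resolves the referenced User document.
@Document("orders")
public class Order {

    @Id
    private String id;

    @DBRef
    private User user;
}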
By default, Spring resolves DBRefs with DefaultDbRefResolver. Enabling logging for org.springframework.data.mongodb.core.convert.DefaultDbRefResolver can be very useful to see what happens.
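With Spring Boot, for example, that is a single line in application.properties (assuming Spring Boot's standard logging configuration):

logging.level.org.springframework.data.mongodb.core.convert.DefaultDbRefResolver=DEBUG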
In my application, I noticed far too many Fetching DBRef... and Bulk fetching DBRefs... log messages, which pushed me in that direction and made me come up with a solution to avoid unnecessary fetching. I decided to create a custom DbRefResolver and use Spring caching. Below is my CachedDbRefResolver:
import java.util.ArrayList;
import java.util.List;

import org.bson.Document;
import org.jetbrains.annotations.NotNull; // assumption: could also be another @NotNull flavor
import org.springframework.cache.Cache;
import org.springframework.cache.CacheManager;
import org.springframework.data.mongodb.MongoDatabaseFactory;
import org.springframework.data.mongodb.core.convert.DefaultDbRefResolver;

import com.mongodb.DBRef;

import lombok.extern.slf4j.Slf4j;

// CacheConstants and MongoUtils are my own helper classes (see below)
@Slf4j
public class CachedDbRefResolver extends DefaultDbRefResolver {

    private final Cache cache;

    public CachedDbRefResolver(MongoDatabaseFactory mongoDbFactory, CacheManager cacheManager) {
        super(mongoDbFactory);
        this.cache = cacheManager.getCache(CacheConstants.DBREF_CACHE_NAME);
    }

    @Override
    public Document fetch(@NotNull DBRef dbRef) {
        // Check the cache first and hit the DB only on a miss.
        // Note: the key is the raw id, so ids are assumed unique across collections.
        Document document = cache.get(dbRef.getId().toString(), Document.class);
        if (document == null) {
            document = super.fetch(dbRef);
            cache.put(dbRef.getId().toString(), document);
        }
        return document;
    }

    @Override
    public @NotNull List<Document> bulkFetch(List<DBRef> dbRefs) {
        // Serve whatever is already cached and collect the misses...
        List<DBRef> missingDbRefs = new ArrayList<>();
        List<Document> result = new ArrayList<>();
        for (DBRef dbRef : dbRefs) {
            Document document = cache.get(dbRef.getId().toString(), Document.class);
            if (document == null) {
                missingDbRefs.add(dbRef);
            } else {
                result.add(document);
            }
        }
        // ...then bulk-fetch only the missing references and cache them.
        // Cached hits come first, so the result order can differ from the dbRefs order.
        if (!missingDbRefs.isEmpty()) {
            List<Document> missingDocuments = super.bulkFetch(missingDbRefs);
            for (Document missingDocument : missingDocuments) {
                cache.put(missingDocument.get(MongoUtils.UNDERSCORE_ID).toString(), missingDocument);
                result.add(missingDocument);
            }
        }
        return result;
    }
}
As you can see, I use DefaultDbRefResolver as the base and override two methods:
- fetch to fetch individual Documents
- bulkFetch to bulk-fetch only the items that are not already in the cache
In the code above I use a few constants:
- CacheConstants.DBREF_CACHE_NAME is public static final String DBREF_CACHE_NAME = "DBREF";
- MongoUtils.UNDERSCORE_ID is public static final String UNDERSCORE_ID = "_id";
My CacheManager is a ConcurrentMapCacheManager, registered as:
@Configuration
public class CacheBeans {

    @Bean
    public CacheManager cacheManager() {
        // Simple in-memory cache manager backed by ConcurrentHashMap
        return new ConcurrentMapCacheManager(CacheConstants.DBREF_CACHE_NAME);
    }
}
For more details, see Supported Cache Providers.
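Keep in mind that ConcurrentMapCacheManager is a simple, unbounded in-memory cache with no TTL or size limit, so for production you may want one of the other providers. Also, to actually make Spring Data MongoDB use the custom resolver, the MappingMongoConverter has to be built with it. Below is a minimal sketch, assuming Spring Boot's auto-configured MongoDatabaseFactory and MongoMappingContext beans (the MongoConverterConfig name is mine):

import org.springframework.cache.CacheManager;
import org.springframework.context.annotation.Bean;
import org.springframework.context.annotation.Configuration;
import org.springframework.data.mongodb.MongoDatabaseFactory;
import org.springframework.data.mongodb.core.convert.MappingMongoConverter;
import org.springframework.data.mongodb.core.mapping.MongoMappingContext;

@Configuration
public class MongoConverterConfig {

    @Bean
    public MappingMongoConverter mappingMongoConverter(MongoDatabaseFactory factory,
                                                       MongoMappingContext mappingContext,
                                                       CacheManager cacheManager) {
        // Plug in the caching resolver instead of the default DefaultDbRefResolver
        return new MappingMongoConverter(new CachedDbRefResolver(factory, cacheManager), mappingContext);
    }
}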
Another important point with caching is cache eviction.
There are only two hard things in Computer Science: cache invalidation and naming things.
– Phil Karlton
Fortunately, I know when data should be evicted from the cache: in my scenario, I have a custom repository for accessing the data, and any update means the corresponding item needs to be evicted. Considering all of the above, I see two options:
- remove the documents by ID from the cache directly (sketched below)
- use the @CacheEvict annotation, see the docs here
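The first option boils down to a direct call on Spring's Cache API; a sketch, assuming the same cache and the same string id keys as in CachedDbRefResolver above:

// evict the cached document of an updated entity by its id
Cache cache = cacheManager.getCache(CacheConstants.DBREF_CACHE_NAME);
cache.evict(user.getId().toString());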
For simplicity's sake, I decided to use the second option because, at the moment, it is more than enough for me. Below is how it looks:
@Slf4j
public class MongoUserRepository implements UserRepository {

    @Override
    @CacheEvict(value = CacheConstants.DBREF_CACHE_NAME, key = "#user.id", beforeInvocation = true)
    public User update(User user) {
        // update user in the DB
        return user;
    }
}
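Note that @CacheEvict only kicks in if the repository is a Spring-managed bean and caching is enabled, e.g. with @EnableCaching on a configuration class. I set beforeInvocation = true so that the entry is evicted before the update runs: even if the update fails, the possibly stale document leaves the cache and the next read fetches a fresh copy from MongoDB.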
Hopefully, this information will be useful!