
Advanced MongoDB Interview Questions for Senior Developers in 2026

A deep dive into senior-level MongoDB interview questions covering performance, replication, sharding, transactions, aggregation, and production debugging.

A practical guide for senior engineers, or interviewers screening for them. The questions here go past syntax and into the parts of MongoDB that only show up at scale: WiredTiger's cache behavior, sharding decisions, replication trade-offs, multi-document transactions, and the production-debugging patterns you only learn from being on call.

If you can answer the fundamentals (BSON, indexing, basic aggregation) but want to handle senior-level questions, this is the layer above. The fundamentals article, MongoDB Interview Questions for Beginners, covers the prerequisite material.


1. Storage Engine & Performance Internals

Q1. What's the storage engine? What changed with WiredTiger?

WiredTiger has been the default since 3.2 and, with MMAPv1 removed in 4.2, is now the only general-purpose storage engine. The relevant properties:

  • Document-level locking instead of collection-level (MMAPv1's bottleneck).
  • Snapshot-isolation MVCC: readers see a consistent snapshot, writers don't block readers.
  • Compression: Snappy by default for documents, zlib/zstd optional. Indexes use prefix compression.
  • Configurable cache (storage.wiredTiger.engineConfig.cacheSizeGB). The default is the larger of 50% of (RAM minus 1 GB) and 256 MB, leaving the rest for the OS page cache and other processes.
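
As a rough sketch of the cache-sizing rule, the helper below computes the default cache size from total RAM. The function name is hypothetical; the formula assumes MongoDB's documented default, the larger of 50% of (RAM minus 1 GB) and 256 MB.

```javascript
// Hypothetical helper: default WiredTiger cache size for a host, in GB.
// Assumes the documented default: max(50% of (RAM - 1 GB), 256 MB).
function defaultCacheSizeGB(totalRamGB) {
  return Math.max(0.5 * (totalRamGB - 1), 0.25);
}

console.log(defaultCacheSizeGB(16)); // 7.5
console.log(defaultCacheSizeGB(1));  // 0.25 (the 256 MB floor)
```

On a 16 GB host that leaves roughly half the memory for the OS page cache, connections, and in-memory sorts.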

Q2. How does the WiredTiger cache interact with the OS page cache?

WiredTiger holds uncompressed working-set data in its own LRU-style cache. The OS holds the compressed on-disk blocks in the page cache. So a hot working set ends up cached twice in different forms, which is why memory sizing is non-trivial. The raw counters live in db.serverStatus().wiredTiger.cache; there is no ready-made hit ratio, so you derive one from "pages read into cache" against "pages requested from the cache".

The implication: tuning cacheSizeGB is a balance, not a maximum. Too high and you starve the OS page cache, defeating the second-level caching. Too low and WiredTiger evicts hot pages it could have kept.
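
To make the cache counters concrete, here is a minimal sketch that turns a serverStatus() result into a few ratios. The function name and the choice of ratios are assumptions for illustration; the counter names are the ones WiredTiger reports under wiredTiger.cache. The counters are cumulative since startup, so in practice you would compare two samples taken some time apart.

```javascript
// Summarise cache pressure from a serverStatus() document.
// mongosh usage: summarizeCache(db.serverStatus())
function summarizeCache(serverStatus) {
  const c = serverStatus.wiredTiger.cache;
  // Number() also handles counters that come back as 64-bit Long values.
  const requested = Number(c["pages requested from the cache"]);
  const readIn = Number(c["pages read into cache"]);
  const maxBytes = Number(c["maximum bytes configured"]);
  return {
    // Share of page requests served without reading the page into the cache.
    hitRatio: requested > 0 ? 1 - readIn / requested : null,
    // How full the cache is relative to its configured maximum.
    fillRatio: Number(c["bytes currently in the cache"]) / maxBytes,
    // Share of the cache holding modified pages not yet written out.
    dirtyRatio: Number(c["tracked dirty bytes in the cache"]) / maxBytes,
  };
}
```

A falling hitRatio alongside a fillRatio pinned near its ceiling is the usual sign that the working set has outgrown the cache.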

Q3. Why is my query slow even with an index?

A short checklist that resolves 90% of real-world cases:

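  1. Not actually using the index: explain() will say COLLSCAN. Common causes: a type mismatch between the query value and the stored value, or a query shape the index cannot serve. Unanchored regexes and negation operators ($ne, $nin) can still show IXSCAN but end up scanning most of the index.
  2. Index isn't selective: e.g., indexing gender in a dataset that's 50/50. The index is used but still touches half the collection, so it barely beats a COLLSCAN.
  3. Sort spills to disk: a blocking sort may use at most 100 MB of memory; past that, servers before 6.0 fail the query unless you set allowDiskUse: true, while 6.0+ spill to disk by default and get slower.
  4. Working set exceeds RAM: every query becomes I/O-bound. Watch the "bytes read into cache" counter in serverStatus over time.
  5. Index doesn't cover the sort: even with IXSCAN, the explain shows a blocking SORT stage afterwards (SORT_KEY_GENERATOR on older versions).
  6. Compound index in the wrong order: a violation of the ESR rule (Equality fields first, then Sort, then Range).
  7. $lookup joining without an index on the foreign collection's join field: the join becomes O(N×M).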
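
Several items on the checklist above can be read straight off an explain('executionStats') result. The sketch below is one way to do that; the function names and the 10:1 examined-to-returned threshold are assumptions, and it expects the classic plan shape where stages nest via inputStage / inputStages (with the slot-based engine the same tree sits under winningPlan.queryPlan, which the code also checks).

```javascript
// Collect every stage name in a plan tree.
function collectStages(plan, out = []) {
  if (!plan) return out;
  out.push(plan.stage);
  if (plan.inputStage) collectStages(plan.inputStage, out);
  for (const child of plan.inputStages || []) collectStages(child, out);
  return out;
}

// Turn an explain("executionStats") result into a list of red flags.
function diagnose(explain) {
  const winning = explain.queryPlanner.winningPlan;
  const stages = collectStages(winning.queryPlan || winning);
  const stats = explain.executionStats;
  const flags = [];
  if (stages.includes("COLLSCAN")) flags.push("collection scan: no usable index");
  if (stages.includes("SORT")) flags.push("blocking sort: index does not provide the sort order");
  // Arbitrary illustrative threshold: more than 10 documents examined per result.
  if (stats.nReturned > 0 && stats.totalDocsExamined / stats.nReturned > 10) {
    flags.push("poor selectivity: many documents examined per document returned");
  }
  return flags;
}
```

In mongosh this would be called as diagnose(db.orders.find(filter).sort(order).explain("executionStats")), where orders, filter, and order are placeholders.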

Q4. What's hint() and when do you reach for it?

hint() forces the query planner to use a specific index. The planner usually picks well, but it occasionally picks poorly under shifting data distributions: it caches the winning plan per query shape and replans only when that plan performs much worse than it did when it was cached, or when the cache entry is dropped (index changes, server restart). hint() is a band-aid, useful for pinning a known-good plan in production while you fix the underlying index design.

Q5. What does db.collection.stats() tell you, and what do you look at?

The metrics that matter day-to-day:

  • size / storageSize: uncompressed vs on-disk. The ratio is your effective compression.
  • count: document count.
  • avgObjSize: a large avgObjSize is often a smell (oversized embedded arrays).
  • nindexes and totalIndexSize: index count and total index bytes.
  • wiredTiger.cache["bytes currently in the cache"]: how much of this collection is hot.
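
The metrics above are easier to compare across collections once reduced to a few ratios. A minimal sketch, with a hypothetical function name, that works on the document returned by db.collection.stats():

```javascript
// Reduce a collStats document to the ratios worth eyeballing.
// mongosh usage: summarizeCollStats(db.orders.stats())  ("orders" is a placeholder)
function summarizeCollStats(stats) {
  return {
    // Uncompressed bytes per on-disk byte; higher means better compression.
    compressionRatio: stats.size / stats.storageSize,
    // Index bytes relative to uncompressed data bytes.
    indexToDataRatio: stats.totalIndexSize / stats.size,
    avgObjSizeKB: stats.avgObjSize / 1024,
  };
}
```

An indexToDataRatio approaching 1 usually means there are indexes nobody queries, which is worth checking with $indexStats.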

2. Replication

Q6. Explain a replica set.

A group of mongod processes maintaining the same data. One primary accepts writes; the rest are secondaries replicating asynchronously via the oplog (a capped collection). If the primary fails, the secondaries hold an election (a Raft-like protocol since 3.2) and elect a new primary, usually within about 10 to 12 seconds with default settings.

Roles you might add:

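  • Arbiter: votes in elections but holds no data. Use sparingly: it breaks election ties but adds no fault tolerance for data, and in a primary-secondary-arbiter set, losing one data-bearing node leaves w: 'majority' writes unable to be acknowledged.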
  • Hidden: replicates but is invisible to clients. Used for backups.
  • Delayed: replicates with a lag, used as a "human-error" recovery target.
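
The election arithmetic behind these roles is worth having at your fingertips. A small sketch (the function name is hypothetical) of how many votes are needed and how many voting members can be lost:

```javascript
// Votes needed to elect a primary, and how many voting members may fail.
function electionMath(votingMembers) {
  const majority = Math.floor(votingMembers / 2) + 1;
  return { majority, faultTolerance: votingMembers - majority };
}

console.log(electionMath(3)); // { majority: 2, faultTolerance: 1 }
console.log(electionMath(4)); // { majority: 3, faultTolerance: 1 }
console.log(electionMath(5)); // { majority: 3, faultTolerance: 2 }
```

Going from three to four members buys no extra fault tolerance, which is why replica sets are normally sized with an odd number of voters.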

Q7. Read preferences and write concerns: what do they actually mean?

  • Write concern (w): how many nodes must acknowledge a write before it's "done."
    - w: 1 (primary only; the default before 5.0, while since 5.0 the implicit default is w: 'majority' in most deployments).
    - w: 'majority' (survives failover once acknowledged).
    - w: <number> (that many members must acknowledge).
    - Plus j: true to wait for the journal flush, and wtimeout as an upper bound on the wait.
  • Read concern: what level of isolation reads see.
    - local: most recent data on the node, may be rolled back.
    - majority: only data acknowledged by a majority.
    - linearizable / snapshot: reads that reflect every majority-acknowledged write completed before the read began, and transactional snapshots, respectively.
  • Read preference: which member to read from.
    - primary (default), primaryPreferred, secondary, secondaryPreferred, nearest.

Read from secondaries only if you can tolerate stale data, because replication is asynchronous.
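
As an illustration of where these settings go, here is a mongosh-style sketch. The two wrapper functions are hypothetical; insertOne's writeConcern option and the cursor's readPref() method are the mongosh shell API (the Node.js driver spells the latter differently).

```javascript
// Durable write: wait for a majority and the journal, give up after 5 seconds.
function insertDurably(coll, doc) {
  return coll.insertOne(doc, {
    writeConcern: { w: "majority", j: true, wtimeout: 5000 },
  });
}

// Read that tolerates staleness: prefer a secondary, fall back to the primary.
function findPossiblyStale(coll, filter) {
  return coll.find(filter).readPref("secondaryPreferred");
}
```

In mongosh this would be insertDurably(db.payments, doc) for data you cannot lose and findPossiblyStale(db.reports, {}) for dashboards, with payments and reports as placeholder collections.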

Q8. What's a rollback?

If a primary accepts writes that aren't replicated to a majority, then loses its primary status (network partition, crash), the new primary won't have those writes. When the old primary rejoins, the unreplicated writes are removed from its data and saved to a rollback directory. Writes acknowledged with w: 'majority' are not rolled back, which is why that write concern protects against rollback losses.

The interview signal: if a candidate proposes "always w: 1 for speed," ask them how they'd recover if the just-acknowledged write was lost to a rollback. The good answer admits that w: 1 trades durability for latency, and that the right choice depends on the data.


3. Sharding

Q9. When do you shard, and what's a shard key?

Shard when (in order of importance):

  1. The working set no longer fits in any reasonable single node's RAM.
  2. Write throughput exceeds what a single primary can handle.
  3. Storage exceeds practical single-node limits.

A shard key is one or more fields MongoDB uses to partition documents into chunks. Properties of a good shard key:

  • High cardinality: many distinct values.
  • Even distribution of writes: avoid monotonically increasing keys (ObjectId _id, timestamps) as ranged shard keys, or hash them so inserts spread across shards.
  • Aligned with common queries: queries that include the shard key are targeted (sent only to the shards that own the matching ranges); queries without it are scatter-gather (sent to all shards).

Q10. Hashed vs ranged sharding?

  • Ranged: chunks span contiguous shard-key ranges. Range queries on the shard key hit only the necessary chunks. Risk: hot spots if the key is monotonic.
  • Hashed: MongoDB hashes the shard key before assigning chunks. Writes distribute evenly even with a monotonic key. Range queries become scatter-gather.

Real systems often use compound shard keys that combine an even-distribution prefix with a query-aligned suffix.
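
The hot-spot effect is easy to see in a toy simulation. Everything here is a simplification for illustration: the FNV-1a hash stands in for MongoDB's real hashed-index function, "hash modulo shard count" stands in for chunk ranges over the hash space, and the chunk boundaries are made up.

```javascript
// Toy 32-bit FNV-1a hash; a stand-in, not MongoDB's actual hash function.
function toyHash(key) {
  let h = 0x811c9dc5;
  for (const ch of String(key)) {
    h ^= ch.charCodeAt(0);
    h = Math.imul(h, 0x01000193);
  }
  return h >>> 0;
}

// Ranged: chunk i holds keys below bounds[i]; the last chunk holds the rest.
function rangedShard(key, bounds) {
  const i = bounds.findIndex((upper) => key < upper);
  return i === -1 ? bounds.length : i;
}

// Hashed: simplified to hash modulo shard count.
function hashedShard(key, shardCount) {
  return toyHash(key) % shardCount;
}

function countWrites(keys, pick, shardCount) {
  const counts = new Array(shardCount).fill(0);
  for (const k of keys) counts[pick(k)] += 1;
  return counts;
}

// 1000 new inserts with a monotonically increasing key (e.g. a sequence number).
const newKeys = Array.from({ length: 1000 }, (_, i) => 1000 + i);
const ranged = countWrites(newKeys, (k) => rangedShard(k, [250, 500, 750]), 4);
const hashed = countWrites(newKeys, (k) => hashedShard(k, 4), 4);
console.log(ranged); // [0, 0, 0, 1000]: every write lands on the last chunk
console.log(hashed); // spread across all four shards
```

The ranged layout keeps neighbouring keys together, which is exactly what makes range queries cheap and monotonic inserts painful.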

Q11. What's a mongos? What's a config server?

  • mongos is the routing process the client connects to. It knows the chunk → shard map, routes queries, and merges results.
  • The config server replica set stores cluster metadata (shard map, chunk locations). It's not in the data path of queries; it's queried by mongos and the shards themselves.

A senior follow-up: what happens when the config servers go down? The cluster keeps serving reads and writes from cached metadata, but chunk migrations and metadata changes stop until the config servers come back.


4. Transactions & Consistency

Q12. Does MongoDB support ACID transactions?

Yes: multi-document transactions since 4.0 (replica sets) and 4.2 (sharded clusters). They're real ACID with snapshot isolation. The catch:

  • Default 60-second transaction limit. Tune transactionLifetimeLimitSeconds if needed.
  • They cost more than single-doc operations because of the snapshot-tracking overhead.
  • Best practice: structure your schema so most operations stay single-document (atomic by default), and reach for transactions only when you genuinely need multi-document atomicity.

The senior take: transactions are an escape hatch. If a candidate proposes them as the default solution for multi-collection workflows, they're treating MongoDB like Postgres. The first question to ask is "can this be modeled as a single document?"
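
For the cases where you do need one, this is roughly what a multi-document transaction looks like in mongosh. It is a sketch under assumptions: the accounts collection and balance field are hypothetical, and it omits the retry loop that production code needs for transient errors.

```javascript
// mongosh-style sketch: move `amount` between two account documents atomically.
function transfer(db, fromId, toId, amount) {
  const session = db.getMongo().startSession();
  const accounts = session.getDatabase(db.getName()).getCollection("accounts");
  session.startTransaction({
    readConcern: { level: "snapshot" },
    writeConcern: { w: "majority" },
  });
  try {
    // The filter doubles as a guard: no match means not enough funds.
    const debit = accounts.updateOne(
      { _id: fromId, balance: { $gte: amount } },
      { $inc: { balance: -amount } }
    );
    if (debit.modifiedCount !== 1) throw new Error("insufficient funds");
    accounts.updateOne({ _id: toId }, { $inc: { balance: amount } });
    session.commitTransaction();
    return true;
  } catch (err) {
    session.abortTransaction(); // discard both updates
    throw err;
  } finally {
    session.endSession();
  }
}
```

If both balances lived in one document, the same change would be a single updateOne with no session at all, which is the point of the schema-first advice.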

Q13. How does single-document atomicity work?

Every write to a single document is atomic, even across embedded subdocuments and arrays. That's the most underused feature in MongoDB schema design: if you embed related data, you get transactional updates for free. $set, $inc, $push, etc., on the same document in one update are atomic.

The follow-up: when does this break? Cross-document operations. As soon as you need atomicity across two documents (or two collections), single-doc atomicity is no help and you're either reaching for a transaction or restructuring the schema to embed.
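
A minimal sketch of leaning on single-document atomicity: one update that checks stock, decrements it, and records the reservation on the same document. The products collection and its field names are hypothetical.

```javascript
// Build one atomic update: guard, decrement, and append in a single operation.
function reserveStock(productId, orderId, qty) {
  return {
    // The guard lives in the filter, so the check and the write cannot interleave.
    filter: { _id: productId, stock: { $gte: qty } },
    update: {
      $inc: { stock: -qty },
      $push: { reservations: { orderId, qty } },
      $set: { updatedAt: new Date() },
    },
  };
}
```

In mongosh: const { filter, update } = reserveStock(id, orderId, 2); db.products.updateOne(filter, update). A modifiedCount of 0 means the guard failed and nothing changed.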


5. Aggregation Deep Dive

Q14. What is $merge vs $out?

Both write aggregation results to a collection:

  • $out replaces the target collection entirely.
  • $merge does an upsert-style merge: insert new, update existing (you control the matching field and conflict resolution).

$merge is what you use for materialized views and ETL pipelines that need to be re-runnable without dropping the target. $out is fine for one-shot exports but risky in any pipeline whose target other processes also use: each run swaps in a completely new collection, so anything written to the target between runs is discarded.
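
A re-runnable rollup with $merge might look like the sketch below. The orders source, the daily_revenue target, and the createdAt / amount fields are all hypothetical.

```javascript
// Pipeline that rolls orders up per day and upserts the totals into daily_revenue.
function dailyRevenuePipeline(since) {
  return [
    { $match: { createdAt: { $gte: since } } },
    {
      $group: {
        _id: { $dateToString: { format: "%Y-%m-%d", date: "$createdAt" } },
        revenue: { $sum: "$amount" },
        orders: { $sum: 1 },
      },
    },
    {
      $merge: {
        into: "daily_revenue",
        on: "_id",
        whenMatched: "replace",   // re-running overwrites that day's total
        whenNotMatched: "insert", // new days are added
      },
    },
  ];
}
```

In mongosh: db.orders.aggregate(dailyRevenuePipeline(ISODate("2026-01-01"))). Days outside the $match window are left untouched in the target, which is what makes the job safe to re-run.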

Q15. How does $graphLookup work?

Recursive graph traversal inside aggregation: find ancestors/descendants by following a chain of references. Useful for org charts, comment trees, manager chains. The relevant fields: from, startWith, connectFromField, connectToField, as, plus optional maxDepth and depthField.

Caveats: it can be slow without an index on connectToField in the from collection, it must stay within the 100 MB memory limit (the stage ignores allowDiskUse: true), and before MongoDB 5.1 the from collection could not be sharded.
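
As a sketch of how the fields fit together, the pipeline below walks a manager chain. It assumes a hypothetical employees collection where each document has a name and a reportsTo field holding the manager's name.

```javascript
// Find every manager above one employee, with the distance recorded in `level`.
function managerChainPipeline(employeeName, maxDepth) {
  return [
    { $match: { name: employeeName } },
    {
      $graphLookup: {
        from: "employees",
        startWith: "$reportsTo",        // first hop: this employee's manager
        connectFromField: "reportsTo",  // follow each manager's own manager...
        connectToField: "name",         // ...by matching it against name
        as: "managers",
        maxDepth,
        depthField: "level",
      },
    },
  ];
}
```

In mongosh: db.employees.aggregate(managerChainPipeline("Ada", 5)), ideally with db.employees.createIndex({ name: 1 }) in place so each hop is an index lookup.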

Q16. How do you model a hierarchy / tree?

Five common patterns, pick based on access:

Pattern              What you store              Good for
Parent reference     { parent: ObjectId }        Frequent inserts, infrequent traversal
Child references     { children: [ObjectId] }    Reading immediate children fast
Array of ancestors   { ancestors: [a, b, c] }    Subtree lookups by matching on the array
Materialized path    { path: ',a,b,c,' }         Path-prefix regex queries
Nested set           { left, right }             Read-heavy, write-rare hierarchies

The first three are the only ones I'd reach for in production with MongoDB. Materialized paths still come up in legacy schemas; nested sets almost never make sense in a document database.


6. Production Tactics

Q17. How would you implement soft-delete?

A deletedAt: Date | null field on every document, plus a partial index that covers only live documents so your common queries never touch soft-deleted ones:

db.users.createIndex(
  { email: 1 },
  { partialFilterExpression: { deletedAt: null } }
)

The catch: every read-side query must include deletedAt: null, both to hide deleted documents and so the planner can use the partial index. Better, route reads through a view that wraps that filter so a future developer can't forget.
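
A view-based version could be sketched like this in mongosh. The view name activeUsers and both helper functions are hypothetical; db.createView(view, source, pipeline) is the shell API.

```javascript
// Create a read-only view that hides soft-deleted users.
function createActiveUsersView(db) {
  return db.createView("activeUsers", "users", [{ $match: { deletedAt: null } }]);
}

// Soft-delete: stamp the document instead of removing it.
function softDelete(db, userId) {
  return db.users.updateOne(
    { _id: userId, deletedAt: null }, // already-deleted documents are left alone
    { $set: { deletedAt: new Date() } }
  );
}
```

Reads then go to db.activeUsers.find(...) while writes keep targeting db.users. Note that { deletedAt: null } matches documents where the field is null or missing.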

Q18. How do you migrate a schema without downtime?

The standard pattern: read both shapes, write the new shape, backfill, retire the old shape.

  1. Deploy code that reads both old and new field layouts.
  2. Deploy code that writes the new layout going forward (still reading both).
  3. Backfill old docs to the new layout in chunks (use cursor-based pagination by _id to avoid duplicate visits).
  4. Once all docs are migrated, deploy code that reads only the new layout.
  5. Drop the old field/index.

Each step is independently deployable and rollback-safe. The interview signal: a candidate who proposes "stop the writers, run the migration, restart" hasn't shipped one of these.
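
The steps above can be sketched for a concrete, made-up migration: old documents store { name: "Ada Lovelace" } and new documents store { firstName, lastName }. The layouts and function names are hypothetical, and the loop is written in mongosh style, where toArray() and bulkWrite() return synchronously.

```javascript
// Steps 1-2: a reader that accepts both shapes and returns the new one.
function readName(doc) {
  if (doc.firstName !== undefined) {
    return { firstName: doc.firstName, lastName: doc.lastName };
  }
  const [firstName, ...rest] = doc.name.split(" ");
  return { firstName, lastName: rest.join(" ") };
}

// Step 3: backfill in _id order, resuming after the last _id seen,
// so no document is visited twice and the job can restart safely.
function backfill(coll, batchSize) {
  let lastId = null;
  let migrated = 0;
  for (;;) {
    const filter = { firstName: { $exists: false } };
    if (lastId !== null) filter._id = { $gt: lastId };
    const batch = coll.find(filter).sort({ _id: 1 }).limit(batchSize).toArray();
    if (batch.length === 0) return migrated;
    coll.bulkWrite(
      batch.map((doc) => ({
        updateOne: { filter: { _id: doc._id }, update: { $set: readName(doc) } },
      })),
      { ordered: false }
    );
    migrated += batch.length;
    lastId = batch[batch.length - 1]._id;
  }
}
```

Small batches with a pause between them keep the backfill from competing with production traffic; the old name field is only dropped in step 5, after the readers stop using it.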

Q19. How do you investigate a slow query in production?

  1. Turn on the profiler at a sample rate: db.setProfilingLevel(1, { slowms: 100, sampleRate: 0.1 }).
  2. Read from db.system.profile, sort by millis, group by command.filter shape.
  3. Pull the top offenders, run explain('executionStats').
  4. Decide between: add/modify an index, rewrite the query, or change the schema.

Sample rate matters: turning the profiler all the way on in production tanks throughput on busy collections.
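
Step 2 can be done with an aggregation over system.profile. This is a sketch under assumptions: it groups by the queryHash field that profiler entries carry on recent server versions (the field name has changed across releases, so check what your version records), and the function name is hypothetical.

```javascript
// Rank query shapes in system.profile by average latency.
function slowShapesPipeline(limit) {
  return [
    { $match: { op: { $in: ["query", "command", "update", "remove"] } } },
    {
      $group: {
        _id: { ns: "$ns", shape: "$queryHash" },
        count: { $sum: 1 },
        avgMillis: { $avg: "$millis" },
        maxMillis: { $max: "$millis" },
        example: { $first: "$command" }, // one sample to feed into explain()
      },
    },
    { $sort: { avgMillis: -1 } },
    { $limit: limit },
  ];
}
```

In mongosh: db.system.profile.aggregate(slowShapesPipeline(10)). Sorting by count times avgMillis instead is a reasonable variation when many cheap-but-frequent queries dominate the load.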

Q20. How do you index a field that's sometimes a string and sometimes an integer?

Two routes:

  • Coerce on write: pick one type and enforce it via $jsonSchema. Best long-term answer.
  • Live with it: MongoDB will index both, but find({field: "123"}) won't match field: 123 (no implicit coercion). So queries become harder, not easier. Avoid.

The interview signal: the candidate who answers "use $expr with $convert" understands the operator surface but is solving the symptom. The candidate who answers "fix it on write and add validation" understands the cost of letting bad data in.
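
The "fix it on write" route can be sketched as a collMod command that attaches a $jsonSchema validator. The helper name and the example collection and field are hypothetical; the command fields (validator, validationLevel, validationAction) are the standard collMod options.

```javascript
// Build a collMod command that only accepts integer values for one field.
function intOnlyValidatorCommand(collName, field) {
  return {
    collMod: collName,
    validator: {
      $jsonSchema: {
        bsonType: "object",
        required: [field],
        properties: { [field]: { bsonType: ["int", "long"] } },
      },
    },
    validationLevel: "strict",  // check every insert and update
    validationAction: "error",  // reject rather than just log
  };
}
```

In mongosh: db.runCommand(intOnlyValidatorCommand("orders", "customerId")). While existing string values are still being backfilled, validationLevel "moderate" is the gentler setting, since it skips updates to documents that were already invalid.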


How to Practice Advanced MongoDB Questions

At this level, reading the answers is not enough. You need to test queries, inspect execution plans, and understand why MongoDB chooses one plan over another.

Start with a real collection and run explain('executionStats') on different queries. Compare IXSCAN, COLLSCAN, sort stages, documents examined, keys examined, and returned results. Then change the index and check what improves.

You can also practice with aggregation pipelines, $lookup, $merge, and schema design patterns by building small examples and watching how the data changes at each step.

For more practical workflows, you can explore the VisuaLeaf documentation, where MongoDB schema design, aggregation pipelines, indexes, and explain plans are explained visually.


How to Use This List

For interviewers: pick 5 to 7 across categories. Include one schema-design question, one indexing/performance question, one replication or sharding question, and at least one tactical "have you actually shipped this" question. That mix surfaces the difference between a candidate who knows the docs and one who has been on call for a Mongo cluster at 3 AM.

For candidates: the lowest-leverage thing to study at this level is more operators. The highest-leverage is reading explain() output until every stage name makes sense, then doing the same with serverStatus() and db.collection.stats(). The senior interview signal is debugging vocabulary: you don't need to remember every flag, but you need to be able to say "I'd check the cache hit rate and the index size relative to RAM" and have someone nod.