The development cycle for 3.0 has been the most eventful of my entire career. As originally planned, it would have been great, but still incremental in nature. Instead, we wound up acquiring our first company, integrating their next-gen storage engine, and by capitalizing on that unlooked-for opportunity, delivering a release so beyond its original conception that we revved its version number.
Renaming a release in-flight is out of the ordinary, so I wrote about our reasoning when we announced the change. We had originally planned to deliver document-level locking built into MMAPv1, and a storage engine API as an investment in the future, not part of a fully developed integration. That would have been our incremental improvement, in line with our storage engine efforts throughout the 2.x release series. We had already added database-level locking, iterated over many improvements to yielding and scheduling behavior, and refactored a ton of code to decouple components.
At the outset of this development cycle, we did several things in parallel. We carved out the code layers to support our storage engine API, started building collection-level locking into MMAPv1, and started designing document-level locking. At the same time, we worked with storage engine builders to put our API through its paces. By the summer of 2014, we had an MMAPv1 prototype for document-level locking, which we demonstrated at MongoDB World. While this was not going to make our use of disks more efficient or solve other MMAPv1 problems, it was nonetheless a huge improvement, and exactly what we were aiming for.
Then the WiredTiger team called us and demonstrated a working integration with MongoDB’s storage engine API. Before long, we realized we had before us an opportunity to shoot the moon. We would have to scale back our plans for MMAPv1 to just collection-level locking, but by doing so, we could completely leapfrog our roadmap and supercharge our team. By delivering MongoDB with WiredTiger, we could offer our users everything we had promised, along with performance MMAPv1 will never match, and features it would take years more to build in. After all, WiredTiger was developed with laser focus on the raw fundamentals of data storage in a modern environment, allowing it to support massive concurrency and other great features like compression.
For all its magnificence, WiredTiger is not yet the default storage engine. We have every confidence in its ability: it is a shipping product in its own right, and has proven its mettle to customers with the most demanding production environments, such as Amazon. We are using it ourselves in production to back MMS. However, the use cases for MongoDB are so broad and varied that we need to gather a wide range of feedback. With that data, we’ll be able to optimize and tune the integration and provide robust guidance on the role of specific metrics in capacity planning, leading to better, more predictive monitoring, and a healthy collection of best practices.
The acquisition of WiredTiger marks an important transition for me as well. Storage engines are incredibly interesting components of a database, but as much as I might like to dig further into them, our goal of making MongoDB the go-to database requires me to be more pragmatic. With a team of world-renowned experts available, who know more about (for example) how to implement MVCC than I ever will, it makes sense to leave storage engines in their capable hands so I can focus on other areas.
MongoDB 3.0 is a great release. I am very proud of the massive team effort that produced it. We will not be resting on our laurels, though. There is still a long list of features and improvements our users need to be successful, and with MongoDB 3.0, we expect MongoDB to be used in even more demanding and mission-critical projects. Many of those projects will surprise us, and these surprises will create new demands. We are excited to get started on these challenges, further optimizing MongoDB, and extending its capabilities so the pioneers can continue to surprise us.