Two weeks ago I submitted the second version of the Server Side Public License (SSPL) to the OSI for review. The revision was based on feedback from the OSI license-review mailing list, which highlighted areas in the first version that would benefit from clarification. We think open source is important, which is why we chose to remain open source with the SSPL, as opposed to going proprietary or source-available. I am hopeful that the SSPL will also be declared an OSI-approved license, because the business conditions that prompted MongoDB to issue the SSPL are not unique to us and I believe the SSPL can lead to a new era of open source investment.
A couple of weeks ago I did a great fireside chat with Matt Turck at Data Driven NYC. I’ve always found that the fireside chat is a format with a lot of potential to be boring, but Matt is a great interviewer, and interacting with him on stage definitely adds to the event. For example, when I was talking about the headline features of our 3.2 release, I omitted a significant pair – the BI connector and Compass – and he reminded me to talk about them.
Last week I talked about Parse shutting down and how unfortunate that was, but also how outstanding a job they have done providing a transition path for their current users. MongoDB also published a very detailed post on how to migrate a Parse app onto MongoDB Cloud Manager and AWS Elastic Beanstalk. Since that day, the amount of activity on the open source Parse Server has been phenomenal, and many have suggested, as did one commenter on my last post, that this means it’s time for MongoDB and Parse to work even better together.
Partial indexes allow you to create an index that only includes documents in a collection that conform to a filter expression. These indexes can be much smaller, cutting down index overhead in storage space and update time, and by matching against the filter criteria, queries can use this slimmed-down index and run much faster. This is one of the new lightweight “schema where you need it” features we’re bringing to MongoDB in 3.
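As a rough illustration of why a filtered index is smaller, here is a minimal pure-Python sketch. It is a toy model, not MongoDB's implementation: `build_index`, the `orders` data, and the "open orders" filter are all hypothetical. In MongoDB itself the filter is passed to `createIndex` through the `partialFilterExpression` option.

```python
from collections import defaultdict


def build_index(docs, field, partial_filter=None):
    """Map each value of `field` to the positions of the documents holding it.

    When `partial_filter` is given, only documents for which it returns
    True are indexed, which is the idea behind a partial index.
    """
    index = defaultdict(list)
    for pos, doc in enumerate(docs):
        if field not in doc:
            continue
        if partial_filter is not None and not partial_filter(doc):
            continue
        index[doc[field]].append(pos)
    return dict(index)


def entries(index):
    """Count the index entries (one per indexed document)."""
    return sum(len(positions) for positions in index.values())


# Hypothetical collection: only some orders are still open.
orders = [
    {"_id": 1, "customer": "ann", "status": "open"},
    {"_id": 2, "customer": "bob", "status": "shipped"},
    {"_id": 3, "customer": "ann", "status": "shipped"},
    {"_id": 4, "customer": "cy", "status": "open"},
]

full_index = build_index(orders, "customer")
open_index = build_index(orders, "customer", lambda d: d["status"] == "open")

# A query whose criteria match the filter can be served by the smaller index.
ann_open = [orders[pos]["_id"] for pos in open_index.get("ann", [])]

print(entries(full_index), entries(open_index), ann_open)
```

The partial index holds entries only for open orders, so it costs less to store and to maintain on writes, while a query for a customer's open orders can still use it.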
On August 25th I will be delivering a talk at the AWS Pop-Up Loft in NYC. The talk is entitled: “Behind the Scenes with MongoDB: Lessons from the CTO and Cofounder on Deploying MongoDB with AWS.” The AWS lofts combine hack days, talk series, bootcamps, and “ask an architect” opportunities, and mainly target engineers working on startup projects that are built on AWS, although other people do attend the talks.
The aggregation framework is one of my favorite tools in MongoDB. It's a clean way to take a set of data and run it through a pipeline of steps to modify, analyze, and process data. At MongoDB World, one of the features we talked about that is coming in MongoDB 3.2 is $lookup. $lookup is an aggregation stage that lets you run a query on a different collection and put the results into a document in your pipeline.
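To make the behavior concrete, here is a minimal sketch of what a $lookup stage does, modeled in plain Python over lists of dicts. The `lookup` helper and the `orders` and `customers` data are hypothetical, and the sketch only covers simple equality on scalar fields. The `stage` dict shows the shape of the stage as it would appear in a pipeline.

```python
def lookup(docs, foreign, local_field, foreign_field, as_field):
    """Attach to each input document the list of matching foreign documents.

    Behaves like a left outer join: documents with no match are kept and
    get an empty list.
    """
    result = []
    for doc in docs:
        key = doc.get(local_field)
        matched = [f for f in foreign if f.get(foreign_field) == key]
        result.append({**doc, as_field: matched})
    return result


# Shape of the stage inside an aggregation pipeline.
stage = {
    "$lookup": {
        "from": "customers",          # the other collection
        "localField": "customer_id",  # field in the pipeline documents
        "foreignField": "_id",        # field in the other collection
        "as": "customer",             # output array field
    }
}

# Hypothetical data standing in for two collections.
orders = [
    {"_id": 100, "customer_id": 1, "total": 30},
    {"_id": 101, "customer_id": 9, "total": 12},
]
customers = [
    {"_id": 1, "name": "Ann"},
    {"_id": 2, "name": "Bob"},
]

spec = stage["$lookup"]
joined = lookup(orders, customers, spec["localField"],
                spec["foreignField"], spec["as"])
print(joined)
```

Each order leaves the stage with a `customer` array holding the matching documents from the other collection. The second order has no matching customer, so its array is empty.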
MongoDB 3.0 has landed. The development cycle for 3.0 has been the most eventful of my entire career. As originally planned, it would have been great, but still incremental in nature. Instead, we wound up acquiring our first company, integrating their next-gen storage engine, and by capitalizing on that unlooked-for opportunity, delivering a release so beyond its original conception that we revved its version number. Renaming a release in-flight is out of the ordinary, so I wrote about our reasoning when we announced the change.
Today our team made public our first release candidate of MongoDB 2.8, rc0. Since June, beginning with MongoDB World 2014, I’ve been speaking publicly about MongoDB 2.8 and its headline features: document-level locking and pluggable storage engines. What I haven’t said until now is just how related these two features are. We’ve been working on our storage API for roughly a year, and with MongoDB 2.8 rc0, we’re rolling out the first fully supported and working storage engine integration: WiredTiger.
“MongoDB is as easy to operate at scale as it is to develop with.” From the very beginning of MongoDB, I’ve envisioned making that bold claim. Until today, it’s been a dream. We just brought it firmly into the realm of the realistic. Today we rolled out a completely revamped MMS built atop Automation, our cloud service for deploying and running MongoDB. Automation works with any infrastructure, from AWS to private cloud to bare metal.
MongoDB 2.6 has been released. For my thoughts on many of the features of the release, please see my blog post on mongodb.org. Beyond the features, this release means a lot to me. In five years, we’ve gone from four people trying to figure out if a document database was a viable concept, to the fifth most popular database in the world. MongoDB 2.4 and all previous releases proved that the document model can transform how modern applications are developed and deployed.
MongoNYC 2013 is on Friday, 6/21, and I’m really looking forward to it. This is our 4th conference in New York City, and we’re expecting over a thousand attendees. I’m delivering two talks: one on Data Safety, and another on Full Text Search, which we added in 2.4. I’ll also be presenting the MongoDB Roadmap at the end of the day, during which I’ll both preview the short-term aims of the upcoming 2.
MongoDB 2.5.0 (an unstable dev build) has a new implementation of the “Matcher”. The old Matcher is the bit of code in Mongo that takes a query and decides if a document matches a query expression. It also has to understand indexes so that it can do things like create subsets of queries suitable for index covering. However, the structure of the Matcher code hasn’t changed significantly in more than four years and until this release, it lacked the ability to be easily extended.
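For readers unfamiliar with the idea, here is a toy matcher in Python: it takes a query and decides whether a document matches it. It is only a sketch of the concept and says nothing about how the real Matcher is structured. The `OPERATORS` table and `matches` function are hypothetical names, and the sketch ignores indexes, nested fields, and arrays.

```python
# Each operator maps (document value, query argument) to True or False.
OPERATORS = {
    "$eq": lambda value, arg: value == arg,
    "$gt": lambda value, arg: value is not None and value > arg,
    "$lt": lambda value, arg: value is not None and value < arg,
}


def matches(doc, query):
    """Return True if `doc` satisfies every condition in `query`.

    A condition is either a plain value (equality) or a dict of operators.
    Simplification: every dict is treated as an operator expression.
    """
    for field, cond in query.items():
        value = doc.get(field)
        if isinstance(cond, dict):
            if not all(OPERATORS[op](value, arg) for op, arg in cond.items()):
                return False
        elif value != cond:
            return False
    return True


# Adding an operator is one table entry, with no change to `matches`.
OPERATORS["$in"] = lambda value, arg: value in arg

doc = {"name": "ann", "age": 31}
print(matches(doc, {"age": {"$gt": 18, "$lt": 65}}))
print(matches(doc, {"name": {"$in": ["bob", "cy"]}}))
```

Keeping the operators in a table is one simple way to make a matcher extensible: new expressions plug in without touching the traversal logic.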
curl http://stream.twitter.com/1/statuses/sample.json -u<user>:<pass> | mongoimport -c twitter_live

One thing you can do with MongoDB is run one streaming master and one read/write master:

server A: ./mongod --master --dbpath /tmp/a
server B: ./mongod --dbpath /tmp/b --master --slave --source localhost:27017 --port 9999

You can then pipe the stream into server A, which will only process the live stream. Server B will replicate all changes, and you can also write to it, query it, and so on. This way you can run operations that block writes on server B, while server A never backlogs.