In light of the recent COVID-19 pandemic, the last several weeks have been undoubtedly difficult for all in different ways. For me, one challenge I’ve encountered has been wanting to help, but not being sure how to do it. There are many amazing organizations that need financial assistance (such as City Meals and Support Kind, among others), but I wanted to find a way to do more. As some of you may know, my parents are both physicians in two of the largest NYC hospitals, although neither works in the ER or ICU.
Two weeks ago I submitted the second version of the Server Side Public License (SSPL) to the OSI for review. The revision was based on feedback from the OSI license-review mailing list, which highlighted areas in the first version that would benefit from clarification. We think open source is important, which is why we chose to remain open source with the SSPL, as opposed to going proprietary or source-available. I am hopeful that the SSPL will also be declared an OSI-approved license, because the business conditions that prompted MongoDB to issue the SSPL are not unique to us and I believe the SSPL can lead to a new era of open source investment.
I wrote an email to my team addressing the anti-diversity-effort memo.
One of the most important things an organization can do as it grows out of startup-hood to maturity is to learn to run parallel engineering efforts. A parallel set of projects might implement multiple takes on an overall idea, testing out different approaches. Or it might implement the same idea at multiple points on a time/quality tradeoff continuum: a low-effort, easily delivered prototype, and a more fleshed-out version that takes longer to deliver.
On November 1st, 2016, I gave a talk at the CTO Summit series hosted by NASDAQ. It was a 20-minute talk, updating and expanding on a topic I’ve both blogged and written articles about – how important it is for engineering managers to keep their hands in the codebase they make decisions about. Here it is:
Last week I was in Israel for the MongoDBeer meetup and an enterprise event, both hosted by Matrix, one of our partners, and a few really great client meetings. One of the things that I don’t get to do often enough these days is work directly with customers on interesting technical challenges, so those client meetings were really quite invigorating. I was reminded of this recently when I was doing a fireside chat with Albert Wenger at NYCode, an event hosted by NextView Ventures.
Back on April 25th I spoke at dotScale in Paris; I gave a talk called “The Case for Cross-Service Joins,” as in queries that join data across multiple 3rd party services. For example, analytics over data that comes from both Salesforce and Google Analytics. I’ve been thinking a lot about this topic, because MongoDB sits at the middle of a lot of apps that utilize 3rd-party services, and the benefits of building your app on top of such services come at the cost of that data being siloed away and difficult to analyze in a holistic way.
One theme I kept harping on at MongoDB World a few weeks ago was knowing when to innovate around new ideas and when to just reuse what already works well for products that have been successful. This comes up continuously at MongoDB, because having a good understanding of it is a significant competitive advantage. I attribute a large extent of MongoDB’s success to our unbending adherence to this discipline. When we started MongoDB, we had a clear goal - make data and databases easier for developers and operators, so that data and databases serve their users, not the other way around.
Almost all modern applications are composed of presentation layers, services executing business logic, and backing stores where the data resides. Developers could be more productive and agile if they could work more directly with the backing data without having to build specific APIs for every access type, but this is quite a challenging problem. An emerging class of solution known as Backend as a Service (BaaS) has tried to address this problem over the last few years, but hasn’t become the norm yet.
Last Thursday (4/7/2016) I spoke at the 2016 NYC CS Fair. Their number one goal is to encourage public high school students who study CS to stick with it, by showcasing all the great opportunities that await them should they pursue a career there. I talked about being a hacker, how to negotiate CS studies in higher education, the difference between CS and software engineering, and the importance of a good mentor.
A couple of weeks ago I did a great fireside chat with Matt Turck at Data Driven NYC. I’ve always found that the fireside chat is a format with a lot of potential to be boring, but Matt is a great interviewer, and interacting with him on stage definitely adds to the event. For example, when I was talking about the headline features of our 3.2 release, I omitted a significant pair – the BI connector and Compass – and he reminded me to talk about them.
Last week I talked about Parse shutting down and how unfortunate that was, but also how outstanding a job they have done providing a transition path for their current users. MongoDB also published a very detailed post on how to migrate a Parse app onto MongoDB Cloud Manager and AWS Elastic Beanstalk. Since that day, the amount of activity on the open source Parse Server has been phenomenal, and many have suggested, as did one commenter on my last post, that this means it’s time for MongoDB and Parse to work even better together.
Updated 2/3/2015 to reflect the publication of MongoDB’s migration guide. I was sad to hear about Parse shutting down last week. Parse made a big push towards serverless architectures, which I think is a great goal. Serverless architectures are the ultimate in letting developers focus on making great products for their users and letting other people make the plumbing work. In the early days of web and mobile application development, backends were a thing that every team had to write themselves from scratch.
For the past 6 months, I’ve been participating in the NYC Tech Talent Pipeline Advisory Board, a partnership between New York City and technology companies in New York. From the press release announcing this board’s formation: Mayor Bill de Blasio today announced 14 initial industry commitments to support the delivery of technology education, training, and job opportunities to thousands of New Yorkers as part of the Administration’s NYC Tech Talent Pipeline initiative.
When we first published a mongodb.org homepage, we sloppily described MongoDB as “schema free”. That description over-emphasizes the baggage MongoDB left behind, at the expense of true clarity. At the time, however, document databases were brand new, and it was simple to describe them in terms of what they were not (witness the prevalence of the terms “non-relational” and “nosql”). This over-simplification was much more than an oversight. As you can see by reviewing this old blog post, it reflects an immaturity in our thinking.
Partial indexes allow you to create an index that only includes documents in a collection that conform to a filter expression. These indexes can be much smaller, cutting down index overhead in storage space and update time, and by matching against the filter criteria, queries can use this slimmed-down index and run much faster. This is one of the new lightweight “schema where you need it” features we’re bringing to MongoDB in 3.
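To make the idea concrete, here is a minimal in-memory sketch in Python. It does not talk to MongoDB; the `orders` collection, the `is_active` filter, and the helper names `build_partial_index` and `lookup` are all hypothetical, chosen only to show why an index restricted by a filter is smaller and can still answer queries that match that filter.

```python
# In-memory illustration only; this is not MongoDB's implementation.
from bisect import bisect_left, bisect_right


def build_partial_index(docs, key, predicate):
    """Return sorted (key value, position) entries for docs that pass the filter."""
    return sorted(
        (doc[key], pos)
        for pos, doc in enumerate(docs)
        if key in doc and predicate(doc)
    )


def lookup(index, docs, value):
    """Find indexed documents whose key equals value, using binary search."""
    lo = bisect_left(index, (value, -1))
    hi = bisect_right(index, (value, len(docs)))
    return [docs[pos] for _, pos in index[lo:hi]]


# Hypothetical collection: only "active" orders are worth indexing.
orders = [
    {"_id": 1, "customer": "ann", "status": "active"},
    {"_id": 2, "customer": "bob", "status": "archived"},
    {"_id": 3, "customer": "ann", "status": "archived"},
    {"_id": 4, "customer": "cy", "status": "active"},
]


def is_active(doc):
    return doc.get("status") == "active"


# The partial index holds only documents matching the filter.
active_by_customer = build_partial_index(orders, "customer", is_active)
# A full index over the same key, for size comparison.
full_by_customer = build_partial_index(orders, "customer", lambda doc: True)
```

A query for active orders by customer only needs `active_by_customer`, which has half as many entries as the full index in this toy data. In MongoDB itself the filter is supplied when the index is created, through the `partialFilterExpression` option of `createIndex`.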
On August 25th I will be delivering a talk at the AWS Pop-Up Loft in NYC. The talk is entitled: “Behind the Scenes with MongoDB: Lessons from the CTO and Cofounder on Deploying MongoDB with AWS.” The AWS lofts combine hack days, talk series, bootcamps, and “ask an architect” opportunities, and mainly target engineers working on startup projects that are built on AWS, although other people do attend the talks.
The aggregation framework is one of my favorite tools in MongoDB. It’s a clean way to take a set of data and run it through a pipeline of steps to modify, analyze, and process data. At MongoDB World, one of the features we talked about that is coming in MongoDB 3.2 is $lookup. $lookup is an aggregation stage that lets you run a query on a different collection and put the results into a document in your pipeline.
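As a rough illustration of what such a stage does, here is a small in-memory sketch in Python. It is not MongoDB code: the `lookup_stage` function and the sample `orders` and `customers` data are hypothetical, and the parameter names simply mirror the `localField`, `foreignField`, and `as` fields of the released stage. It assumes the joined values are hashable scalars and ignores array-valued fields.

```python
# In-memory illustration only; this does not run against MongoDB.
def lookup_stage(docs, foreign_docs, local_field, foreign_field, as_field):
    """For each input doc, attach the list of foreign docs with a matching key."""
    by_key = {}
    for foreign in foreign_docs:
        # A missing field is treated as None, so it groups with explicit None values.
        by_key.setdefault(foreign.get(foreign_field), []).append(foreign)
    # Every input doc is kept (left outer join); unmatched docs get an empty list.
    return [
        {**doc, as_field: by_key.get(doc.get(local_field), [])}
        for doc in docs
    ]


# Hypothetical collections.
orders = [
    {"_id": 1, "customer_id": "a", "total": 30},
    {"_id": 2, "customer_id": "z", "total": 12},
]
customers = [{"_id": "a", "name": "Ann"}]

joined = lookup_stage(orders, customers, "customer_id", "_id", "customer")
```

Each order comes out of the stage with a new `customer` array: the first holds Ann's document, and the second is empty because no customer has `_id` equal to `"z"`.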
A lot of people I talk to are unsure about the Apple Watch, and the category in general. Me, I’m counting down the days till I get my Apple Watch. In fact, at this point my impatience is so great, the prospect of having to wait another month to get one almost makes me want to go out and buy a Pebble. So, score one for the Apple marketing team, I guess.
As discussed in other posts, I spend a lot of time in email, and much of the email I get is related to MongoDB’s Jira. I’ve written before about my Jira summarizer, which maintains a single message in your inbox with a summary of recent activity in projects you watch. In my continuing quest to make Jira email easier to deal with, I wrote a tool to make it easier to quickly assess the email notifications about individual issues.
Last week I went to Las Vegas for MongoDB’s sales kickoff. The night before I left, Sunday, I came down with a decently high fever. I got a bit nervous, as it came on strong and fast, but I took some Advil, went to bed, and the next morning felt ok to get on a plane. That whole Monday was pretty good with the help of some more Advil. On Tuesday morning the Advil was giving ground, on Tuesday evening it was in full retreat, and Wednesday at 5am I found a helpful MongoDB employee in the hotel to take me to the ER.
MongoDB 3.0 has landed. The development cycle for 3.0 has been the most eventful of my entire career. As originally planned, it would have been great, but still incremental in nature. Instead, we wound up acquiring our first company, integrating their next-gen storage engine, and by capitalizing on that unlooked-for opportunity, delivering a release so beyond its original conception that we revved its version number. Renaming a release in-flight is out of the ordinary, so I wrote about our reasoning when we announced the change.
In my first post on this topic, I said I’d post an update in a week or so. Ok, so that was about 7 weeks ago. I abandoned the trial of both of these techniques because 2.8.0 is, frankly, more important than my experiments in productivity. I’m going to get back to it, but this is actually an opportunity to say something important about getting derailed from productivity projects by urgent items.
Today our team made public our first release candidate of MongoDB 2.8, rc0. Since June, beginning with MongoDB World 2014, I’ve been speaking publicly about MongoDB 2.8, and its headline features: document level locking and pluggable storage engines. What I haven’t said until now is just how related these two features are. We’ve been working on our storage API for roughly a year, and with MongoDB 2.8 rc0, we’re rolling out the first fully supported and working storage engine integration: WiredTiger.
On November 6th, I’ll be delivering the keynote address at MongoDB London 2014. I’ll be talking about the upcoming 2.8 release, the future of storage engines in MongoDB, and Automation. Since our last conference (MongoDB Boston 2014), the revamped MMS with Automation has gone from soft launch to wide release, and the response from the MongoDB community has been fantastic. We’re seeing tons of adoption and getting lots of great feedback.
“MongoDB is as easy to operate at scale as it is to develop with.” From the very beginning of MongoDB, I’ve envisioned making that bold claim. Until today, it’s been a dream. We just brought it firmly into the realm of the realistic. Today we rolled out a completely revamped MMS built atop Automation, our cloud service for deploying and running MongoDB. Automation works with any infrastructure, from AWS to private cloud to bare metal.
I’ve been using this new toy. Well, it’s for work, but until the novelty wears off, it’s definitely also a toy. I like taking notes in meetings on paper as much as possible. It’s less distracting, and more friendly. I’ve tried various ways of doing this, but nothing has stuck yet. The closest has been a regular notebook. The biggest problem is that I don’t like carrying things to and from work, or to different places.
Everyone with a staff knows they need a staff meeting on a recurring basis, often weekly. And those who don’t have staff are themselves in other people’s staff meetings, making it one of the most common meeting types for anyone to attend. Sadly, there is often ambiguity around what they are for, making them annoying and inefficient.
What I Want out of Staff Meetings
The purpose of these meetings is twofold: 1) status updates, and 2) key decision making or the precursor conversations for decision making.
MongoDB 2.6 has been released. For my thoughts on many of the features of the release, please see my blog post on mongodb.org. Beyond the features, this release means a lot to me. In five years, we’ve gone from four people trying to figure out if a document database was a viable concept, to the fifth most popular database in the world. MongoDB 2.4 and all previous releases proved that the document model can transform how modern applications are developed and deployed.
Like The Superhero, The Martyr does their team’s work to make up for not managing. However, whereas The Superhero insists on hogging all the interesting work, The Martyr does work that no one wants to do. When a deadline is looming and things are looking down, they will pull all-nighters to finish it themselves rather than do what a manager should do, such as motivating their team, or fixing the deadline.
Like The Martyr, The Superhero does their team’s work to make up for not managing. They are super smart, super capable, and they can often do most or all of the jobs that their reports do better than their reports. They also care deeply about the quality of the product their team works on. Unfortunately, they are not inclined to delegate any of the interesting work, because they want it all for themselves.
The Politician’s main concern is making their bosses and peers think they are doing a great job, and are responsible for every success they can claim, regardless of reality. They are cousin to the Glory Hog, but are far less destructive than them, because their goal is to create a successful environment for themselves. Also, their behavior is driven by confidence, not under-confidence. They are not threatened by their reports’ accomplishments, because they intend to take credit for them.
I’m intrigued by the idea of using Google Glass during a presentation to avoid ever having to look at or touch a computer. I’ve taken a cursory look over the apps that are currently available, and tried out Your Show and Glassentation. I’m concerned about two things – one, pulling it off at all, meaning making sure that my audience is still focused on my talk and not my gadget, and two, being able to continuously engage the audience while referring to my notes quickly enough to not break the flow.
The Isolationist manager takes their job as a “crap umbrella” to a dysfunctional extreme. They try to limit interactions between their team members and other people in the organization. They take their responsibility toward their team very seriously, and their isolation is a misguided attempt to make them more productive. Behavior in meetings: The Isolationist isn’t so much identified by behavior in meetings, as much as by the influence they have on organizing meetings.
I’ve written a Python program to do something fancy with JIRA that I couldn’t get using built-in facilities. You already get notifications from Jira about the tickets you personally care about, based on your notification settings. My tool will give you, additionally, an hourly email in your inbox summarizing all the changes in projects you care about, skipping the ones you already got direct notifications of. Not only that, but it will make sure that you only ever have one of these summaries in your inbox, by consolidating them when a new summary is generated.
Are video tracking algorithms good enough to take the feed from a crappy digital camera, and tell me how fast I threw a baseball or how far I hit a golf ball? When I last toyed with these over a decade ago, probably not, but now they might be.
The True Democrat never makes decisions, they only operate by total consensus. This approach will lead to just as much unhappiness in a team as ignoring their input. Egalitarianism is a good foundation for seating charts, opportunities, compensation, and promotions; it is not for strategic decision making. Put another way: fairness is not the same principle as equality. Behavior in meetings: The True Democrat is more concerned with equal speaking time than guiding the meeting towards the best outcome.
The Glory Hog is really bad news. They want to take credit for the work their team does, and are more interested in their own advancement than their team members’ performance and growth. This boss bug is my first example of a cloaking bug, that is, this person will often use subterfuge to prevent their trait from being identified. As such it is very important to be able to distinguish glory hogging from innocuous behavior, and other bugs where the observed behavior can be similar.
The Best Friend manager wants to be friends with their team members more than they want to manage them. Their team members enjoy work and usually get along, but have a tendency to miss deadlines, do not drive the product forward, and over time, lose pride in their work as they accomplish less. A good manager has a team that respects them, and can also make tough decisions that may not make everyone on their team happy.
Recap
Last week I started this Debugging the Boss series, to highlight specific traits that lead to management problems. I expect most people will see these traits in themselves or their bosses (I certainly do). I think everyone has some of these traits… in fact every manager should have many of these traits. What’s important is having them in moderation. I’m going to post one of these every week or so until I run out of traits to write about.
Managing in a technology company is one of the charter topics of this blog. I cannot think of any single thing that represents a greater risk to a growing tech firm than the damage that can be done by bad management. A really bad employee can waste resources and time, and lower the morale of those around them; a really bad manager can do more serious harm, in a wider range, that lingers on even after they have been removed or corrected.
Email is my preferred method of communication (besides in person). At the same time, I get a lot of it, so making it better via tooling is very important to me. A large proportion of the emails I (and many other people, I’m sure – especially those in the tech world) receive are generated automatically, from LinkedIn notifications to Jira updates to monitoring alerts. Email is good at receiving information, but then acting on the information is encumbered by the need to link out to a browser.
Problem: I’m at a playground in Central Park and I need to print something at home. I have a new printer that is online, but hasn’t been set up with Google Cloud Print yet. Solution: Set up an SSH tunnel from my phone, through my desktop to the printer (using iSSH on iPhone). Configure Google Cloud Print from the browser on my phone. Print from Chrome. P.S. I was at a playground with my kids, not just to print something.
Update: I’ve written a much more in-depth article on this topic that was published in Dr. Dobb’s Journal on 1/7/2014. The typical expected path of an engineer goes something like this: Individual Coder, Project Lead, Team Lead, Manager, Director. At each step, their expected coding time looks something like: 90% (Individual Coder), 80% (Project Lead), 50% (Team Lead), 1% (Manager), 0% (Director). This seems wrong. It is not shocking that engineering management is often found to be out of touch with the tech when they aren’t working on it themselves.
I get a lot of email. I used to think I got a lot of email, but that was before 10gen. Maybe one day I’ll remember writing this and laugh because comparatively today’s load is light. I hope not, because that thought is frankly scary. There are a number of programs I’ve written to help me deal with email. One of them is less about helping me, and more about letting the people around me know that I don’t have a special desire to ignore them.
Doing a technical phone screen has always been a challenge for me. My preferred in-person technical interview, especially for more junior engineers, is to take a relatively simple programming task, and dive deeply into it. The code will be simple enough to commit completely to a whiteboard, or a piece of paper, but I’ll lead the conversation to edge cases, performance, and how to test that code, for example. The coding task is really just the framework within which the interview happens.
MongoNYC 2013 is on Friday, 6/21, and I’m really looking forward to it. This is our 4th conference in New York City, and we’re expecting over a thousand attendees. I’m delivering two talks: one on Data Safety, and another on Full Text Search, which we added in 2.4. I’ll also be presenting the MongoDB Roadmap at the end of the day, during which I’ll both preview the short-term aims of the upcoming 2.
I’m trying a relatively new thing these days: working through huge lists of open MongoDB JIRA tickets using a pencil and a big printout. This turns out to be a better way for me to handle this workload than sitting at a browser and doing it interactively. To explain this, I suppose I have to explain why I’m reading all these JIRA tickets. I’m reading all these JIRA tickets because I don’t want to lose touch with the needs of MongoDB users, in spite of the ever increasing volume of related articles, blog posts, and yes, JIRA tickets.
Emacs is the only editor I can use effectively at this point. It doesn’t matter if there are better choices (there aren’t ;-), it’s the one I’ve invested all of my muscle memory into. When working on files locally, I use normal emacs, and things are grand. Life, however, dictates that a great deal of my coding is done on remote machines. I had tried a variety of solutions to edit remote files (emacs in a shell, emacs of x, samba, nfs, etc…), none working terribly well for me.
MongoDB 2.5.0 (an unstable dev build) has a new implementation of the “Matcher”. The old Matcher is the bit of code in Mongo that takes a query and decides if a document matches a query expression. It also has to understand indexes so that it can do things like create subsets of queries suitable for index covering. However, the structure of the Matcher code hasn’t changed significantly in more than four years and until this release, it lacked the ability to be easily extended.
I visited London a few weeks ago to attend and speak at MongoDB London. The event was very successful, and I enjoyed many conversations with attendees and staff during the event. But having the opportunity to spend time with our 10gen London team makes the value of the trips far exceed my contribution to the conference. Although my time with the team was relatively short since my entire trip to the UK lasted only two days, it provided yet another example of “no substitute for in-person collaboration”.
Monday was a big day for 10gen in New York; we moved into our new offices on West 43rd Street. The last time we moved (about 16 months ago), our then-new office seemed quite spacious and impressions were that it would last quite a while. That turned out to be a bit shortsighted. By January of this year we were bursting at the seams, with every desk full, expansion space taken, and competition for conference rooms straining everyone’s patience.
curl http://stream.twitter.com/1/statuses/sample.json -u<user>:<pass> | mongoimport -c twitter_live
One thing that you can do with mongodb is have 1 streaming master and 1 read/write master:
server A: ./mongod --master --dbpath /tmp/a
server B: ./mongod --dbpath /tmp/b --master --slave --source localhost:27017 --port 9999
You can then pipe the stream into server A, and it will only process the live stream. Server B will replicate all changes. You can also write to it, query on it, etc… This way you can do operations that block writing on server B, but server A will never backlog.