The fundamental lesson of the forces governing scaling startups

 

Idealistic founders believe they will break the mold when they scale, and not turn into a “typical big company.” By which they mean: Without stupid rules that assume employees are dumb or evil, without everything taking ten times longer than it should, without wall-to-wall meetings, without resorting to hiring anything less than the top 1% of the talent pool, and so on.

That is, they expect to keep the positive characteristics of a tiny organization and avoid the common problems of a larger one by preserving their existing values and processes, just doing it with more people, and figuring it out as they go along, exactly as they always have.

Why do they never succeed? Why is this impossible when you have 500 employees? What are the fundamental forces that transform organizations at scale?

From Brittle to Robust

A “team of one” is the fastest, most efficient team, as measured by “output per person.”  Communication and decision-making occupy the minimum possible time. And maybe the person working on that thing is a “hero” — working extended hours and experienced with the problem space. Small companies operate this way by necessity, and it works!  It’s a big reason why they move quickly.

But, an illness takes the velocity of the product or quality of support from heroic to zero. And if that person leaves, you’ve just lost six months to hire and get back up to speed on that thing.  Or nine months because there weren’t any processes and documentation in place — again because it was just one person, who didn’t need that stuff, because after all we’re moving so quickly!

Or it’s fatal because that was a co-founder. “Founder trouble” is a leading cause of startup death (though data also show that companies with only one founder are more likely to fail, so the conclusion is just that startups are always likely to fail!).

A team of one is brittle, but fast.  When you’re small, this is a good trade-off, because speed is critical for combating the things that are constantly about to kill the company.  When you’re large, and you might have 15-25% annual employee turnover, not to mention illness, vacation, and family, the same structure would sink you immediately.

So, no project can have fewer than, say, three people dedicated to it, plus people management and possibly some form of Product or Project Management. But that team of four will not be 4x more productive than the one-person team; per-person productivity goes down in exchange for robustness and continuity.
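As a back-of-the-envelope sketch of that trade-off, the snippet below estimates how likely a project is to lose everyone who understands it within a year. It assumes a 20% annual chance that any given person leaves (the midpoint of the 15-25% range above) and, unrealistically, that departures are independent of each other. The function name is made up for illustration.

```python
def chance_everyone_leaves(team_size: int, annual_turnover: float = 0.20) -> float:
    """Probability that every member of a team departs within a year.

    Assumes each person leaves independently with the same probability,
    which is a simplification, not a claim about real teams.
    """
    return annual_turnover ** team_size


for size in (1, 3):
    chance = chance_everyone_leaves(size)
    print(f"team of {size}: {chance:.1%} chance of losing all context in a year")
```

Under those assumptions a team of one has a 20% chance of a total loss each year, while a team of three is under 1%. That drop is what the lower per-person productivity is buying.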

On the other hand, while the small company loses 9 months to the loss of a key employee, or even implodes, the big company is the steady turtle that adds thousands of customers per month like clockwork and wins the race.

Predictability

When you’re small there’s no need to predict when the feature will ship. Marketing isn’t scheduling a launch and recruiting isn’t timing the start-dates of the next 50 hires in customer service and sales. This means you can — and should! — optimize myopically for speed-to-market.

Small companies brag about their speed as an advantage, but it’s easy to see why the larger company actually has a massive advantage. Sure, when WP Engine launches a new product, the marketing department needs predictability for the launch date, but that’s because it’s a highly-skilled, well-funded group, which explodes with press, events, campaigns, social media, and newsletters, grabbing more attention in a single week than a smaller company might garner in a year. There are also armed, globally-dispersed Sales and Support teams, so we’re selling to our 70,000 existing customers as well as thousands of new customers per month, which means we’ll end up adding more new revenue in one month than a small company will take in over a whole year.

The tradeoff, however, is predictability. We didn’t line up that press and have those sales materials and ensure code-quality high enough to scale on day one, without predictability. Predictability means going slower. Predictability requires more estimation (takes time), coordination (takes time), planning (takes time), documentation (takes time), and adjusting the plan when it inevitably unfolds differently from the prediction (takes time).

Predictability is also required for healthy team-growth. Consider the timeline of adding a technical support team member. First, Recruiting is casting about for potential candidates. Then scheduling and performing interviews. Then waiting for them to quit their job and take a week off. Then new-employee-orientation. Then classroom training. Then paired up with senior folks on the floor as they ramp up their skills and comfort. Then finally, after (say) four months, they’re up to speed.

Since that takes four months, we have to be able to predict the demand for technical support at least four months in advance, because we have to be hiring for that future demand right now. If we under-estimate, our support folks get overwhelmed with too much work, their quality of life suffers, and service to each customer suffers; if we over-estimate, we have too many people which is a cost penalty. Of course, the latter is a better failure mode than the former, but both are sub-optimal, and the solution is predictability.
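To see why the four-month lead time forces a forecast, here is a minimal sketch in Python. Everything except the four-month lead time is a made-up assumption for illustration: the ticket forecast, the tickets one agent can handle per month, and the current headcount. It also ignores attrition and hires already in the pipeline.

```python
import math

LEAD_TIME_MONTHS = 4  # from the text: about four months until a new hire is up to speed


def agents_needed(tickets_per_month: float, tickets_per_agent: float) -> int:
    """Headcount required to handle a monthly ticket volume, rounded up."""
    return math.ceil(tickets_per_month / tickets_per_agent)


def hires_to_start_now(forecast, current_agents, tickets_per_agent,
                       lead_time=LEAD_TIME_MONTHS):
    """How many people to begin recruiting today.

    forecast[i] is the predicted ticket volume i months from now, so the
    number that matters is forecast[lead_time], not this month's volume.
    """
    target = agents_needed(forecast[lead_time], tickets_per_agent)
    return max(0, target - current_agents)


# Hypothetical numbers: volume now, then 1 to 4 months out.
forecast = [10_000, 10_500, 11_000, 11_600, 12_200]
print(hires_to_start_now(forecast, current_agents=25, tickets_per_agent=400))
```

With these invented figures the team is exactly staffed for today (25 agents for 10,000 tickets) yet still needs to start six hires now. If the forecast for month four is wrong, the staffing will be wrong too, in one of the two directions described above.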

“The future is inherently unpredictable,” insists the small company, spurred on by Lean and Agile mindsets. Indeed, blue-sky invention and execution are hard to predict. But this is also a self-fulfilling prophecy; to insist the future is unpredictable is to ignore the work that could make it more predictable, which of course makes it in fact unpredictable to that person.

Small companies don’t have the data, customers, institutional knowledge, expertise, and often the personal experience and skillset to predict the future, so they are usually correct in saying it’s impossible. But it’s not impossible in principle, it’s impossible for them. At scale, it becomes required. Not because Wall Street demands it, or investors demand it, or any other throw-away derogatory excuse made by unpredictable organizations, but because it’s critical for healthy scaling.

Materiality Threshold

If Google launches a new product that generates $10,000,000/year in revenue, is that good? No, it’s a colossal failure. They could have taken the tens of millions of dollars that the product cost to develop, and made their existing operation just 0.01% more effective, and made the same amount of money.

At nearly $100B/year in revenue, Google can only consider products which have the potential to generate $1B/year in revenue as an absolute floor, with the potential to grow to $10B/year if things go better than expected. Things like YouTube, Cloud, and self-driving cars.

This principle is called the “Materiality Threshold,” i.e. what is the minimum contribution a project must deliver for it to be material to the business.
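The arithmetic behind that example can be sketched in a few lines of Python. The dollar figures are the ones used above; the 1% floor is simply what the $1B-on-$100B example implies, treated here as an assumed rule of thumb rather than an established standard, and the function names and the $200,000 small-company revenue are invented for illustration.

```python
def share_of_revenue(product_revenue: float, company_revenue: float) -> float:
    """Fraction of total company revenue that one product contributes."""
    return product_revenue / company_revenue


def minimum_material_revenue(company_revenue: float, floor_share: float = 0.01) -> float:
    """Smallest product revenue that clears the (assumed) materiality floor."""
    return company_revenue * floor_share


def is_material(product_revenue: float, company_revenue: float,
                floor_share: float = 0.01) -> bool:
    return product_revenue >= minimum_material_revenue(company_revenue, floor_share)


BIG_COMPANY = 100_000_000_000  # roughly $100B/year, as in the text
SMALL_COMPANY = 200_000        # hypothetical small business

print(f"$10M product share: {share_of_revenue(10_000_000, BIG_COMPANY):.2%}")
print("material at $100B?", is_material(10_000_000, BIG_COMPANY))
print(f"floor for the small company: ${minimum_material_revenue(SMALL_COMPANY):,.0f}")
```

The same $10M product that is a rounding error at $100B would be transformative for the small company, whose floor under this assumption is only a couple of thousand dollars a year.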

With a small business, the materiality threshold is near $0. A new feature that helps you land just a few new customers this month is worth doing. A marketing campaign that adds two sign-ups/week is a success. Almost anything you do, counts. That’s easy, and it feels good to be moving forward. But it’s only easy because the bar is so low.

The financial success of the larger company dictates a non-trivial materiality threshold. This is difficult. Even a modest-sized company will need millions in revenue from new products, maybe tens of millions in the optimistic case. Very few products can generate that sort of revenue, whether invented by nimble, innovative startups or stately mature companies. As proof, consider that the vast majority of startups never reach a $10M/year run-rate, even with decent products and extraordinarily dedicated and capable teams.

Yet, it’s the job of a Product Manager at that mid-sized company to invent, discover, design, implement, and nurture those products — something that most entrepreneurs will never succeed at. Tough job!

Recruiting

Employee #2 will join a startup for the experience. Even at a significant salary cut, and even if the company fails — the most likely outcome. It’s worth it for the stories, the influence, the potential, the thrill, the control, the camaraderie, the cocktail-party-talk.

Employee #200 won’t join for those reasons. Employee #200 will have a different risk-profile regarding their life and career. Employee #200 will be interested in different sorts of problems to solve, like the ones listed in this document instead of the ones where you’re trying to understand why 7 people bought the software but the next 3 didn’t. Employee #200 will not work for a pay-cut.

Small companies could view this as an advantage, and certainly it’s advantageous to recruit amazing people at sub-market rates. But there are dozens if not hundreds of employees at WP Engine today who are much more skilled in their area of expertise than I’ve ever met at a small startup. Why? Because after developing that expertise, they find it’s only possible and enjoyable to apply their skills within a larger environment.

For example, there are advanced marketing techniques that would never make sense with a smaller company, that are fascinating, challenging, and impactful to the top line at a larger company. There are talented people who love that challenge and would hate going “back to the Kindergarten of marketing” scratching out an AdWords campaign with a $2000/mo budget or assembling the rudiments of SEO or just trying to get a single marketing channel to work or being called a “growth hacker” because they finagled a one-time bump in traffic.

But, this has implications around compensation, how you find that talent, and why that person wants to work at your company instead of the one down the block who can pay a little bit more. Therefore, it’s critical to have a mission that is genuinely important, have meaningful and interesting work to do, connect everyone’s work to something bigger than any of us. These matter even more at scale, because they’re the anchor and the primary reason why talent will join and stay.

 

Communication

With four people in a company, any information that needs to be shared can be told to just three other people. Everyone can know everything. If there’s a 5% chance of significant misunderstanding, that event doesn’t happen often.

With four hundred people, it’s never true that a piece of information can be reliably communicated, in a short period of time. A 5% chance of misunderstanding means twenty people are confused. In software terminology, communication challenges scale as O(n²).
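A small Python sketch makes the quadratic growth concrete. It counts the possible one-to-one conversations in a group, n(n-1)/2, and the expected number of confused people when each listener independently misunderstands with 5% probability. The function names are made up, and treating misunderstandings as independent is a simplifying assumption.

```python
def pairwise_channels(people: int) -> int:
    """Number of distinct pairs who might need to talk: n * (n - 1) / 2."""
    return people * (people - 1) // 2


def expected_confused(audience: int, p_misunderstand: float = 0.05) -> float:
    """Expected number of listeners who get the message wrong."""
    return audience * p_misunderstand


for people in (4, 400):
    print(f"{people} people: {pairwise_channels(people)} possible pairs, "
          f"about {expected_confused(people):.1f} confused per announcement")
```

Headcount grows 100x between the two cases, while the number of pairs grows from 6 to 79,800, which is more than 10,000x.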

“Slack” is not the answer. “Email” is not the answer. (Your emails are probably misinterpreted 40% of the time, by the way.) Repetition is the answer, in different formats, at different times, by many leaders, and even still it’s never 100%.

Technology & Infrastructure

Managing 10,000 virtual servers in the Cloud Era sounds easy. Automate everything, then any process that works for 100 servers, will work for 10,000 servers just by doing the same thing repeatedly — exactly the thing computers are excellent at.

It never works like that. Reddit took 18 months to get the “number of likes” to work at scale. StackOverflow took 4 years to get everything converted to HTTPS. Wired did that conversion in a “mere” 18 months. Everything is hard at scale.

What are the patterns in those stories?

One is that scale makes rare things common. Rare things are hard to predict and can be hard to prevent. Often they’re hard to even identify and sometimes impossible to reproduce. This is fundamentally difficult.
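One way to see this is to compute the chance that a rare event happens at least once across a fleet, which is 1 - (1 - p)^n. The sketch below uses the server counts mentioned earlier (100 and 10,000); the per-server probability of one in ten thousand per day is a purely hypothetical figure, and independence between servers is assumed for simplicity.

```python
def chance_of_at_least_one(p_single: float, n: int) -> float:
    """Probability that an event with per-unit probability p_single
    occurs on at least one of n independent units."""
    return 1.0 - (1.0 - p_single) ** n


P_RARE = 1 / 10_000  # hypothetical: a one-in-ten-thousand glitch per server per day

for servers in (100, 10_000):
    chance = chance_of_at_least_one(P_RARE, servers)
    expected = P_RARE * servers
    print(f"{servers:>6} servers: {chance:.1%} chance per day, "
          f"{expected:.2f} expected events per day")
```

Under these assumptions, something you would see about once every hundred days with 100 servers shows up most days with 10,000.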

Another is continuity or compatibility with existing technology. New companies get to start from scratch, but at-scale companies must transform. New companies like to make fun of large companies for how hard it is to transform, neglecting that the cause of the difficulty might also be generating $100,000,000 in revenue.

Another is bottlenecking. All hardware and software systems have bottlenecks. At small scale, you don’t run into any bottlenecks, or at least the ones you do can be solved with simple techniques like increasing capacity. Eventually something difficult breaks and you have to rearchitect the stack to solve it. Even something simple like converting HTTP links to HTTPS or updating “number of likes” in real-time, becomes a monumental architectural challenge.

Not only does this slow down development, it adds investment. There will be entire teams who focus on infrastructure, scaling, deploys, cost-management, development processes, and so forth, none of which are directly visible to or driven by the customer, but which are necessary to manage the complexities of scale.

Risk-mitigation

For a small company, the most likely cause of death is suicide. Usually it’s starvation — can’t get enough customers (distribution) to pay enough money for long enough (product/market fit). But also things like founders splitting up, not getting enough traction to self-fund or to secure the next round of financing, having to go back to a day job, and so on.

At scale, the risks are completely different. There is very low risk that WP Engine will not sign up thousands of new customers this month. Other risks, however, are not only possible, but likely. Addressing those risks head-on, is required for a healthy and sustainable business that can last for many years.

Take the risk of business continuity during a disaster scenario. What if all availability zones of Amazon in Virginia are disabled for a week? How quickly could we get all our customers back up and running? Would that be true even though thousands of other businesses are also trying to spin up servers in other Amazon data centers at the same time? Could we communicate all this with our customers quickly and simply, so that our support team isn’t overwhelmed by repeating the same message to nearly a hundred thousand justifiably-worried customers?

Risk-mitigation can even result in growth. Serious customers want to see that their vendors understand and mitigate risk; this maturity becomes a selling point. That’s why enterprise suppliers are constantly flaunting their compliance with SOC 2 and ISO 27001 and all the rest. Small companies make fun of those things as being unnecessary at best or a false sense of security at worst, but while they’re busy making that point, the larger companies are busy signing three-year multi-million dollar clients.

Early on, you do not need a disaster-recovery plan. That won’t be the thing that will kill the business, and your customers will understand if a young business is subject to that sort of risk. Later on, this becomes critical, and worth investing in.

The fundamental challenge of scaling: Embracing and implementing the shift from Small to Large

These forces cause larger companies to be fundamentally different than small ones. This isn’t a bad thing or a good thing. It’s a different thing.

Some idealistic founders believe the root cause of scaling issues is the “command-and-control” organizational structure. But none of the examples above make reference to any organizational structure. It’s universal. This is why Holacracy and Teal Organizations do not solve these problems in practice. It could be a fantastic idea to experiment with organizational structure, but the fundamental forces above will not be eliminated through recombination of roles and organization.

Scaling is hard, the road is foggy and bendy, it lasts for years, the set of people you need might be different, and no one emerges unscathed. So, it is not a sign of disaster if you have difficulty wrestling with these forces. Everyone does.

Disaster is when a company is scaling, but the leaders don’t appreciate these forces, don’t work constantly to morph the organization accordingly, don’t bring in experienced talent, and decide they can figure it all out as they go along without help. Scaling should instead mean new people, new roles, new values, new processes, new recruiting, new stories, new constraints, new opportunities.

Too many founders and leaders want to believe that “What got us here is what’s important and unique about us, and thus we should preserve all of it. Other companies fail because they ‘act like big companies,’ but we’ll avoid all that because we’re smarter than they were. As evidence of our acuity, just look at our success thus far. We will continue to succeed in the future as we have in the past.”

But they’re wrong.

There should be a few values that are kept constant, that’s true. Otherwise none of it means anything. But the details must change.

Many founders and leaders can’t make the shift. This always hurts the company, and sometimes kills the company. The world is full of those horror stories. It’s sad, because it’s an avoidable waste of opportunity and sometimes hundreds of person-years of effort.

Don’t become one of those cautionary tales.

  • David Wright

    So you’ve done both. Which is easier to hire for, small or large? Does it matter? Do you reject “small company” people at large companies?

    • I’m not sure what a “small company” person means. I’ve seen people grow as the company grows, and people who only like to operate in one mode or another. Sometimes you don’t know what sort of person you are until you’re in the thick of it! So no, we would never hire only people from a large-company background or only people from a small-company one.

      I’m not sure one is easier. Different *aspects* of hiring are easier and harder. For example, it’s often hard for a brand new company to pay high salaries, whereas it can be “easily” budgeted at larger companies. But for example the allure of being employee #2 somewhere is awesome; that’s automatic for a startup but impossible for a large company to offer.

      It’s certainly *different*, which is the main point. A few things are constant, like how to identify great talent in a certain professional domain, or selling the vision and culture of the company. But many other details are different.

      • David Wright

        What about the common theme of small companies requiring more willingness to cross functional lines and pitch in on non-core tasks, the functions that don’t really make it onto the job description?

        • That is true, but a person who says “not my job; not my problem” at any company is not a culture fit at WP Engine. Of course it’s OK if it’s “I’m not going to be great at this, but I’ll pitch in.”

          • David Wright

            Of course a blatantly negative attitude is bad anywhere, but what I’m getting at is whether there is some downside here anywhere. Some folks want a feeling of control and the ability to help on any problem that is critical to the company, and at small companies they can. Larger organizations have “this is mine and that is yours” delineations for good reason: departments are much more efficient, and that works best when you don’t constantly introduce outside distractions, so someone who feels they can contribute to a problem (assume it’s genuinely true that they can) has to go through an enormous political game to execute on that idea. Is a well-intentioned but meddlesome person not a problem at a large company while a potential boon at a small one? Wouldn’t you want to weed that quality out of large company hiring but encourage it at a small one?

            I feel like there is a greater need to manage a large web of relationships at larger firms and that consumes time and energy. They are more efficient because of specialization and that’s the cost/benefit. Some may not appreciate this and find large companies slow and ponderous places with “politics” (relationships) everywhere when all they want is to execute. Small companies can optimize for people who are better at execution but maybe not as willing to work on internal relationships.

            Do you agree?

            • I definitely agree that managing a web of relationships is important for success at a larger company. For some positions that might not be true, but the majority require it, even when not in management.

              I think words like “more efficient” or “more productive” are difficult to use. In the sense that a single person can ship more code faster at a small company, for example, that person is more efficient or more productive. But the larger company can ship more code to more people and make more money per unit time, which is itself also a direct form of being more efficient and more productive.

              Perhaps it’s most accurate to say that larger companies (can) get more accomplished as an organization, but in smaller companies individuals can feel like they’re doing more by themselves.

              I do agree that you want to hire personality types or professional proclivities which match those two expectations. This also explains why many people don’t survive the transition from tiny team to 500-person organization: if it requires different skills as well as personality types, that’s a bridge too far.

  • Damian

    Interesting write-up, Mr. Cohen. I enjoyed reading your take on the topics as captain of WP Engine. I couldn’t help but notice you didn’t have a topic covering training. I’m not talking about new hire training. I’m referring to the best kind of training: ongoing staff development, especially for leadership. I’ve witnessed firsthand that the lack of training for VPs, Directors and front line managers can lead to poor performance, increased retention risk and a slow chipping away at company culture. It’s like a cancer and can only affect a certain area of your business if you’re not careful to keep your ear to the ground. I’m curious to hear your take.

    • I completely agree! Aside from new-hire training, we also have on-going training. It’s our vision that everyone should be able to have a career that includes changing titles and jobs — even outside of your department — without leaving WP Engine. (A *vision* not a mandate!) We invest in a full L&D department and we continue to find ways of making that vision closer to true.