Scaling as a CEO with Tony Xu

Guests:
Tony Xu

1. Evaluating Leaders

Always Bias For Clock Speed and Emotional Maturity

  • Always default to two attributes above all others: clock speed (raw intellectual horsepower) and emotional maturity (the self-awareness to know what they are weak at) 
  • If someone has both of these, Tony takes the bet 90% of the time

Tactic: The Six Attributes Tony Looks For In Leaders

Below are the six attributes Tony looks for in every leadership candidate, in rough order of importance at the Series B/C stage:

  • Bias for action. There are no great interview questions for this. The best signal is what a candidate does naturally, without being asked. Christopher Payne showed up on a Friday, spent hours discussing the logistics algorithm, then drove deliveries that night with his son and wrote a detailed email the next morning about everything that was broken. That kind of unsolicited action is a far stronger tell than any structured case.
  • Ability to operate at the lowest level of detail. An outsider knows roughly 1/5,000th of what you know about your company. The question is how fast they can close that gap. The best tell is someone who naturally dives into granular specifics. DoorDash’s current president showed up to a casual 30-minute coffee with a 10-megabyte financial model projecting seven years of DoorDash financials. Nobody asked him to build it - that is just how his brain works. The coffee turned into a three-hour conversation about every line item on the P&L.
  • Openness to change (tested as: holding two opposing ideas, or rate of learning). Both are proxies for the same underlying trait - how willing is this person to abandon something they believe in when the evidence changes? Ask the candidate to articulate something they believe in deeply, then argue the exact opposite case with equal conviction. Defensiveness in this exercise is one of the strongest predictors that an executive will not scale.
    • In the AI era, this matters even more - the question becomes how fast they can learn and unlearn, not just whether they can hold two ideas at once.
  • Ability to recruit great teams. Instead of asking interview questions about recruiting, ask for the names of people they have recruited and developed, then look at those people’s career trajectories.
  • Strength of followership. Ask: “Give me three names of people who would join this company if you joined today.” Take down the names and contact information, call each one, and confirm. This is a direct, falsifiable test.
  • Getting 1% better every day. This can be about anything - running marathons, learning cloud infrastructure, restoring cars. The point is to test whether the person is intentionally, consistently improving at something. The specific domain does not matter. What you are testing is the habit of deliberate improvement.

Early Leaders Should Be Player-Coaches

  • At the Series B/C stage, every leadership hire should be a player-coach, and in today’s AI world, more player than coach.
  • For engineering specifically, look for a tech lead with product sense and maturity. The best ones talk to customers without being told to. If you masked their face, they would read as late-30s in maturity even if they are in their twenties.

A Candidate’s Natural Behavior Beats Interview Questions

  • The biggest mistake in executive hiring is getting swindled by great talkers. This accounts for the majority of mis-hires.
  • Structured case interviews can work for ICs and tech leads, but they do not reliably test VP-level candidates. A great exec will outperform you in any structured setting because they are paid to talk and present.
  • Spending 20-25 hours with a candidate across multiple settings will surface more signal than any structured process.

Tony’s Backchannel Process

  • Tony tells candidates from the first meeting that he will be back-channeling throughout the process.
    • Commit to not jeopardizing their current job. Focus on former employers and previous colleagues.
  • The goal is to develop a true 360-degree view. The most valuable perspectives come from direct reports, skip-levels, and peers. A manager’s opinion is usually less useful than the views of the people doing the work.
    • Skip levels are especially telling when backchanneling leaders
  • Tony expects to conduct ~20 conversations to get 8 genuinely signal-rich ones. The other 12 are either dead ends or referrals to better sources. 
    • Twenty conversations is a lot of time to spend on references, but it reflects how uneven backchannel quality is. Early calls surface the names that actually matter: references who give polite non-answers are effectively telling you they don't want to talk, so you ask them for another name and follow the thread.
    • It’s hard to shortcut to the 8 good ones because you don't know who they are until you've worked through the others.
  • Be transparent and reciprocal: offer references on yourself, including people who think critically of you. This builds trust and usually gets more candor in return.

Mixed Backchannel Scorecards Are Normal - Hire for the Superpower, Cover the Gap

  • The best hires tend to have mixed scorecards, not straight-A reviews. 
    • People who do great work rub some people the wrong way because they are opinionated, pushy, or hold high standards.
  • No one will hit all six attributes. The questions are: are their weaknesses damaging to the team? Does their superpower match the #1 skill needed for this job? And if so, can someone else on the team cover their weakness?
    • DoorDash had one of the best operators Tony had ever seen, but that person wasn’t great at recruiting. So Tony recruited 100% of his team and gave him the right of first refusal to sit out of interview panels. The leader ran and developed the teams because he was far better at that.
    • This pairing model only works if the person has the humility to let you help on their weakness. If they block you, the partnership breaks down.

2. Calibrations

The Two Sources of Error in Exec Hiring

  • There are two main reasons why leaders don’t work out:
    • Not knowing what the job actually requires. This accounts for ~70% of mis-hires. Without understanding what great looks like in a function, you end up chasing resumes or taking investor recommendations at face value.
    • Not trusting your instincts. This accounts for ~30%. The pattern: your gut says this person is not right, but the role has been open for 16-20 months, so you give in to desperation. Every single time this has happened at DoorDash, the leader hasn’t worked out

Shadowing Leaders to Calibrate on What Great Looks Like

  • Between 2015 and 2020, while DoorDash was roughly 100-250 people, Tony asked friends of the company to name two great executives they admired (in any function), then asked to shadow those executives for as long as they would allow.
    • Tony shadowed ~500 execs this way. Though many said no, you’d be surprised by just how generous the Valley is if you ask. 
  • This is a non-competitive exercise. Make an explicit or informal agreement that you will not recruit these people. The purpose is purely calibration.
    • Example: Dave Wehner, one of Meta’s first CFOs, allowed a four-hour shadow session in a conference room watching him run different meetings. This built intuition on what a great CFO actually does day-to-day, which is impossible to get from interviews or descriptions.
  • Through shadowing he learned what makes someone successful at Meta is very different from what makes someone successful at Google, Apple, or Amazon. There is no universal “great executive” archetype. You need to understand what type of leader fits your company’s specific environment.
  • This eventually gave Tony a way to recruit mentors for his leaders. In functions where Tony himself wasn't particularly strong and the leader wanted to learn from a “great” in their function, he could draw on the rolodex he had built through these calibration sessions.

Example: How Tony Used His Calibration Network to Develop Leaders

  • Through shadowing, Tony built a rolodex that became useful beyond his own calibration - he could deploy it to provide targeted mentors for his own leaders.
  • In functions where Tony wasn't particularly strong, or where a leader wanted to learn from someone great in their specific function, he could tap 2-3 people from that network rather than having each leader spend time networking broadly from scratch.
    • For DoorDash's VP of Eng, for example, Tony sourced three mentors: Rick Dalzell (first CTO at Amazon - the only person Amazon ever named a building after), Bill Coughran (ran a Google engineering team of ~25,000), and Mike Schroepfer from Facebook engineering leadership.
    • Each spent roughly an hour a month with the leader, tapering over time.
  • The key to making these arrangements work is to be fully transparent. Let the relevant companies or investors know what you're doing. People are supportive when you're open about it; they just don't want to feel like you're hiding the ball.

A Leader’s “Vintage” Matters 

  • Not all experience at a given company is equivalent. A Facebook hire from 2011-2016 is a very different profile from a Facebook hire from 2018 onward.
    • The 2011-2016 vintage at Facebook/Meta is particularly valuable because there was still significant building happening - the business model was not figured out, the mobile transition was underway, and there were meaningful people management challenges. That combination produces builder-mentality leaders.
  • After a certain vintage at the same company, the talent profile shifts to people who are great at running huge teams, optimizing existing systems, and presenting on stage. These are different skills from what early-stage companies need.
  • DoorDash had specific vintages they recruited from at each stage (and specific vintages they auto-rejected) but that filter came from meeting people, not from rules invented in a vacuum. You cannot shortcut this. Spend time with people from different eras at the same company and the pattern will become obvious.
    • The right calibration target for your stage: companies and executives one or two stages ahead of you

3. Scaling Hiring Quality

Tony’s Offer Approval Process up to 5,000 Employees

  • Every offer letter required CEO approval, up to 5,000 employees. The alias was offers@doordash.com, with a 24-hour SLA, 365 days a year.
  • Each email included the candidate packet, the case study (code review or business exercise), everyone's scorecard, and all required back-channels (3 for below Director; 6 for Director to VP).
  • The intent was not to veto hires - only ~12-13% were rejected. The intent was to create productive friction that raised the bar at the team level. Knowing an email back-and-forth with Tony was coming forced hiring managers to prepare more carefully.
    • In the packet, Tony would look for: was the hiring manager a strong yes? (If not, it is a no.) Did anyone on the panel have a track record of good hires? Were the back-channels substantive?
  • The friction itself was the mechanism through which DoorDash reduced the number of bad hires and moved the hiring average upwards 

Hire for Slope, Not Just Pedigree

  • Expect a ~70% hit rate on hiring - a mentor once told Tony that 60% great is Michael Jordan-level recruiting. You are always guessing. There will be flame-outs. That's the reality of hiring fast at an early stage, and pretending otherwise sets the wrong expectations.
    • The first 10 engineers at DoorDash were effectively all friends. After that, the strategy was to find people with more talent and potential than the market had yet recognized and lock them up before larger companies noticed. 
    • By the time Meta got to them, it was too late - they'd offer 5x the comp and 2x the scope. So you find them earlier, when the market hasn't priced them correctly yet.
  • The attributes that matter - drive, curiosity, willingness to do whatever it takes - are more predictive than pedigree. 
    • Tony's youngest hire during the early years was 14 years old

DoorDash’s Final Customer Delivery Interview

  • For engineering hires at DoorDash, the final interview consisted of doing deliveries for an hour and a half. Candidates were walked through the entire end-to-end experience and assessed for whether they showed a genuine interest.
    • 30-40% of engineers found it boring. DoorDash passed on those candidates immediately.
  • The test was not whether they could do deliveries. It was whether they had the operating mindset to care about what the product was doing in the real world. 
    • At your stage, the equivalent test is whatever puts candidates in direct contact with your customer's actual experience

4. Internal Promotions vs. External Hires

Default to Internal Promotions When Possible

  • An external hire - even with AI assistance helping them ramp - is working with roughly 1/5,000th of the institutional knowledge you and your team carry. That gap takes years to close, not months.
    • DoorDash had no management layer in engineering until ~150 engineers, and no formal head-level hires in most functions until 300-400 employees. In retrospect, waiting that long on finance and people was a mistake - but the default toward internal was correct.
  • Bias against external hires is especially warranted when the team is executing well. If things are working, the bar to disrupt it must be extremely high. If things are broken, the bar is low - staying the course when nothing is working is not a strategy.

The Bar for an External Hire Is: Clearly Better, And Everyone Knows It

  • This is a precise standard: "clearly better" means there is one dimension where the external candidate is so obviously superior that you could not ignore it even if you tried. "Everyone knows it" means it is visible - not just to you, but to the people who will work with them.
    • At DoorDash's scale, this clarity typically emerges within 30-60 days. At your scale, you will know within a week.
  • The test for yourself is: do you have the humility to admit that someone is better than you at something? If yes, and if that "something" is the exact thing the role requires, it is a clear call. If the advantage is marginal or unclear, do not pull the trigger. 
    • The damage from a wrong executive hire is disproportionately large at 50 people.

5. What to Look For Function by Function

As a general principle, when in doubt about any function: bias toward clock speed and emotional maturity. Horsepower plus self-awareness will cover almost any gap.

Head of Engineering

  • At your stage: a tech lead with product sense and emotional maturity, not a people manager. Emphasis is on "player" far more than "coach." Do they naturally seek customer contact to validate what they are shipping? Do they do it without being told?
  • In the AI era, the most important skill for a head of eng is building verification into the development process. Generating code is becoming easy. Knowing when the AI made a mistake - and knowing how to fix code you did not write yourself - is the hard problem.
  • Measure candidates on two things: shipping velocity (are high-quality things being shipped quickly?) and talent density (is the team getting better?)
    • Get informal signal from your best engineers to answer these. They are closer to the work and tend to be honest and opinionated if you ask. Tony does this through low-pressure direct messages on Slack to get fast, honest feedback without making it seem like a big deal

Head of Product

  • At your stage, you probably do not need a formal product leader. Your best product person is likely you.
  • If you are going to hire one, what you want is someone with exceptional taste who is closest to the customer and the market.

Head of Sales

  • Hire the best salespeople from mediocre product companies. Selling a mediocre product is orders of magnitude harder than selling Google AdWords or ChatGPT ads. 
  • The best sales leaders are mathematical, systemic, and love money. If they do not love money, they are not going to drive revenue.
  • Testing a sales leader is slightly more methodical than other functions. Look at their actual quota attainment history: not just whether they hit it, but the context around it.
    • Did they hit quota because they had a great territory and a great product (i.e., anyone would have hit it)? Or did they hit quota despite a mediocre product or a tough territory?
  • At this stage, you probably need a sales lead, not a big-function sales VP.

CFO / Legal / Comms

  • For all three: skew on IQ. At your stage, find the best single generalist athlete for each function. You do not need a specialist head yet.
  • Finance is the one function where hiring sooner can sometimes pay off. A strong strategic finance person would have helped DoorDash raise money faster - the lack of one made that harder than it needed to be.

People / HR

  • DoorDash hired no one with formal HR experience for the first six to seven years, up to roughly 1,000-1,500 employees.
  • The two skills that actually matter in people: (1) understanding how the business wins and (2) knowing how to attract and keep the people who will make it win. Everything else is teachable.
    • Most HR people cannot articulate either. Look for culture carriers who are respected across multiple functions, not just operators of HR systems.
    • The DoorDash chief people officer started her career as an English teacher, then ran factory operations at Amazon, then was persuaded to lead HR. She thinks like a business builder, not an HR administrator.

6. Operating Cadence

To set operating cadence, DoorDash starts with the jobs to be done, not archetypes like “staff meeting” or “1:1.” They still only have two recurring meetings at the entire company level:

Weekly Business Review

  • The north star question in this meeting is: what can we do this week to improve outcomes for customers?
  • The WBR is a tactical meeting. Nothing strategic. They review the main goals for the year, make 3-4 decisions about capital allocation to double down on what’s working, fix what’s broken, and take action items on what’s sideways.

Staff Meetings Are for Existential Questions, Not Updates

  • Do not use staff meetings for business updates - that belongs in the WBR with the right operational people in the room.
  • Staff meetings are for topics that take longer: existential questions about the company’s 2-3 year direction, recruiting gaps, interpersonal or team friction that needs to be addressed early, and capital strategy (with a smaller subset of people).
  • Assign the most important company problems to specific people. The founder takes the bulk, but one or two problems go to others

When You Do and Don’t Need 1:1s 

  • 1:1s are important when the team is new: you don’t know what it’s like to work with them, they don’t know what it’s like to work with you, and things are changing quickly.
  • After working with someone for nearly a decade, scheduled 1:1s become less necessary. The relationship already has the bandwidth to surface issues without a standing meeting

How to Set Goals

  • Good goals are predictive of the outcomes that matter. They have both a positive metric and a constraint (e.g., improve conversion but with a ceiling on ad spam).
    • Every goal has a cost - free wins are almost never free.
  • Good goals can be moved within the time period given. For example, retention is a bad weekly goal because it moves too slowly to provide signal within the period of a week
  • Good goals have one name bolded next to them. One DRI - the full team is listed, but one person is accountable.
  • Use input metrics wherever possible. Break the top-level outcome into the input levers that actually drive it (selection of restaurants, delivery time, delivery cost, on-time percentage, accuracy). Then run experiments to figure out which inputs move the outcome and set goals on those inputs.

Running New Products vs. Core Products

  • At DoorDash today, roughly 30-50% of resources in each business area are allocated to zero-to-one projects. At your stage, this is maybe 3-5 people.
  • Allocate resources before setting goals: decide up front that 70% goes to the core business, 30% goes to new bets. If you do not make this split explicit, every debate becomes “which is more important, the main thing or the new thing?” and the main thing always wins, which means you never build future S-curves.
  • The failure mode at scale is that everyone migrates to the new thing and the core product stalls. The best companies (Meta, Google) are remarkably disciplined about continuing to compound the core business even while pursuing new bets. You need both, and different management systems for each

Why DoorDash Broke the Traditional GM Model

  • The traditional GM model (one person owns business, product, and engineering for a vertical) sounds clean. In practice, it requires someone who is simultaneously business-minded, product-minded, and technically-minded. Very few people are actually all three.
    • DoorDash's approach was to break the functions apart. Put a soft asterisk next to one person who breaks ties - but without a hard reporting relationship. Play people to their strengths rather than forcing them into a tidy org chart.
  • This only works under two conditions: (1) same goal on everybody, so there is no "that's the business team's problem" dynamic, and (2) genuine humility on the team, so people serve the company goal before their own agenda.
    • On incentives: every function (product, engineering, business) shares the same goal. An engineer gets a “does not meet” rating if the business misses its number, even if their code was excellent. You cannot say “that’s the business guy’s problem.”
    • On people: this requires genuine humility - putting the company first, then the team, then yourself. If someone needs to “get what is theirs,” the partnership will eventually break down.
  • The caveat is that this model is genuinely hard to scale and not a universal recommendation. Get it right at 50 before assuming it will hold at 500

Collapsing Roles into Planners and Builders in the Age of AI

  • In the world of AI, the functional structure may collapse into two job families: 
    • A Planner (determines the roadmap - talks to customers, synthesizes evidence, decides what to ship) and a Builder (executes - builds, tests, deploys). Traditional distinctions between product, strategy, and operations start to blur.
  • This is dramatically easier to architect at 50 people than at 500. Start collapsing redundant functions now. DoorDash is actively trying to do this at 16,000 employees and it is hard.

7. Scaling as a CEO

When to Stop Managing Every Function

  • There is no clean “kill switch.” The transition happens when you start realizing there are people better than you at something. 
    • One of the earliest PMs told Tony directly: “Your job is to tell me the goals, give me the principles and constraints, and then get out of my way. Otherwise you don’t need me.” That mental switch took about six months.
  • You would much rather be by yourself than be with the wrong person. Do not hire just because a function has been open for a long time. The damage a bad executive does to morale, business outcomes, and customer outcomes is enormous.

Build Culture From Real Stories

  • Values should be 80% who you are and 20% who you aspire to be. If it is more aspirational than real, you are chasing ghosts.
  • Wait 2-3 years before writing down values, write them based on the behaviors you actually observe in the company, then let people self-select. Some people will come and say “that’s not me” - that is a much better outcome than forcing people to morph into something they are not.
  • Every value should have 10-15 real stories behind it. If you cannot point to stories, it is an aspiration, not a value
    • “Customer obsessed” came from the time DoorDash was late on every order, refunded every customer (costing 40% of the bank account with two weeks of cash left), baked cookies, and hand-delivered them before customers woke up. That is a real story, not an aspiration.

Managing Psychology 

  • Three things that may help when you are in survival mode:
    • Share the stress broadly. During DoorDash's three-year fundraising drought (2016-2018), the declining bank balance was shown at every all-hands on Friday. 
      • It was not motivating but it made the problem collective. A group of ~20-30 trusted people self-organized around building a plan to get out of it. Being productive on what you can control is more useful than worrying about what you cannot.
    • Have something outside of work that is non-negotiable. In a period of no control, you need one thing that is stable and yours. It does not matter what it is - running, date nights, cooking, whatever. Its function is to be a psychological anchor, not recreation.
    • Optimism is a choice. This is a deliberate decision not to be defined by what is not working.
      • DoorDash often says "Choose optimism and have a plan." Every week, they had a slightly different plan to achieve the same objective functions. The plan changed constantly but the direction did not.

Confidential & Proprietary