David Henke - On building a culture of "Site Up" at LinkedIn and Yahoo! - #3 | Transcript
You can see the show notes for this episode here.
This transcript was generated by an automated transcription service and hasn’t been fully proofread by a human. So expect some inaccuracies in the text.
Ronak Nathani: [00:00:00] David, it’s such an honor to have you with us today. Welcome to the show.
David Henke: [00:00:05] My pleasure. Good to be here.
Ronak Nathani: [00:00:07] So while preparing for this conversation, I was reading about you. And when I went on Google and searched for David Henke, one of the first hits I got was an image of you pointing at a screen with some metrics and graphs. You were pointing at the screen, but you were also looking at the camera, and it appeared that you were screaming at the camera. So I thought, why not start by asking you about the story behind that picture?
David Henke: [00:00:34] Okay. We’re referring to a poster that said, “It’s my freaking site.” Yes. You can substitute freaking for whatever word you would like, but I can assure you it’s nastier than freaking, by the way, just so you know. And in India they made us change it to “It’s my precious site,” because they were offended by it. What happened was Site Up, which is something that’s important, especially to a company the size of Yahoo with so many properties and so many important pages, was not going well. And they wanted to bring attention to the site. At the time I was running production operations, which is really the data centers and the networks: the 28 POPs and the 35 data centers and also the 1 million computers. But the sites were actually run by different groups like Messenger and Search and Mail and so forth, and things were not going well. And they wanted me to bring attention to the site. So I pointed to something that matters to me greatly, which is sponsored search, which was my first job at Yahoo and which is how we made half of our money. And if you know anything about calculus, the area under the curve is money. Yeah, right. And we don’t want to lose money. And we were losing money because we had some sponsored search problems, and I was trying to remind everybody it’s your freaking site and Site Up does matter, because that’s the business we are in.
Ronak Nathani: [00:01:58] Site Up does matter, and it is something that we are going to dig into a lot. But remind me, was this poster also stuck on walls in various buildings of the company?
David Henke: [00:02:09] Unfortunately, I’m not a very photogenic person and I’m not a very good speaker, but I will tell you my ugly mug was on every floor of every building at Yahoo. And it was in every data center, and people who did not know me would go, “Is that you? Man, you are one ugly son of a gun.”
Ronak Nathani: [00:02:29] I would, I would disagree that you’re not a good speaker. I think you’re a great speaker as we are seeing here today.
David Henke: [00:02:35] Well, thank you for that. I appreciate it.
Ronak Nathani: [00:02:36] So you had an amazing career in tech and you had a huge impact on companies like LinkedIn, Yahoo, AltaVista, and Silicon Graphics. One thing I learned while researching is that you retired three times throughout your career. Can you tell us more about these times?
David Henke: [00:02:53] I can. I’m trying to think how to do this nicely. So I left Silicon Graphics after eight years, and Silicon Graphics was arguably the greatest company I ever worked for. Jeff Weiner sometimes gets mad at me because LinkedIn really is a great company, but Silicon Graphics, just the fact that I was there for eight years, I thought we had the best engineers on the planet. And they left to go to companies like Google, like Netscape. But when I left after eight years, it was because we had just bought Cray computers, and that was becoming the largest player in a dying market, because Google had shown us that a large number of small computers was better than this mammoth supercomputer. So I just quit, and I had enough money. I always made money. I was good at that. And then I just said, I’m done. And I sat on the sidelines for a while, and a friend of mine who I had worked with at Silicon Graphics called me up. And he said, I’m at an Elon Musk company. It’s his first company. It’s called Zip2, and it does door-to-door directions. And we did newspaper sites for the New York Times and the San Jose Mercury. And he wanted me to run operations, and I had never run operations before. I’d always been an engineer. So I said, what the heck, I’m sitting on my butt, might as well go get another job. So I went to work for this little company called Zip2, Elon Musk’s first company. We sold it for $310 million, which got him into the business. But I also made some dough. We sold it to Compaq, which at the time owned AltaVista, and that’s how I got into AltaVista. And then after four years of that, I retired. That’s retirement one. I retired because my children were at the age of high schoolers, freshman and sophomore in high school, and that’s the one time your kids do not want to be with you. You young people don’t know this yet, but trust me, they don’t want to be with you, but I wanted to be with them.
So I retired for three and a half years and I did not work. I took karate, I took Spanish, I took real estate courses, but I also spent a lot of time with my kids. And then I got another call, and this was from Yahoo. And that’s when Yahoo asked me to come back in and help them with their sponsored search, down at what was Overture, a company they had bought, and that was responsible for half their dough. And it was a hell of a thing. And we can talk about that later, but I came down and did that for two years. And then I did production operations for Yahoo for two years. And at that point, Yahoo capitulated on search and search monetization, which I was very disappointed with. And so I retired again, and then within a few months somebody I had worked with at Yahoo, a guy named Jeff Weiner, who was the CEO of LinkedIn and now is chairman of the board of LinkedIn, called me and said, I want you to hear about this company. And I said, you know, Jeff, I think I’m not going to be working anymore. Yahoo kind of left a bad taste in my mouth. And he brought me in, and I talked to Reid Hoffman. I talked to all the engineers. But the most important guy I talked to was the CFO. He explained to me this business that we have, based on this incredible set of data that we have, and what we’re trying to do, and it just blew me away. So I thought, what the heck, I might as well try it one more time. And that was arguably the best business decision I ever made. And I think, looking back, LinkedIn is the best company I’ve worked for.
Ronak Nathani: [00:06:16] Well, we are extremely grateful that you chose to work at LinkedIn. I think Austin Ouyang and I here have a job probably because of you, because you built that Site Up culture at LinkedIn when you came in. And before we go there, you mentioned that you were always on the engineering side and operations is not something that you had led before. So when you went to Silicon Graphics, or sorry, Zip2, at the time, did you know you would enjoy this? Were you passionate about operations engineering, or did you just take a chance?
David Henke: [00:06:46] I actually didn’t know if I would enjoy it. And in fact, I did not enjoy it. I don’t like it, as many of you don’t, when things do not work, and I will tell you things did not work at Zip2. It was one of those Microsoft shops. It didn’t scale. I was a good enough programmer and coder that I actually reviewed changes before they got on the site. Because at that point I was like, this isn’t getting onto my site. I remember Elon Musk himself trying to write code to get on the site. A very great engineer, but not a good programmer. And I will say, I didn’t know a lot about data centers. I didn’t know a lot about networks and networking. I didn’t know a lot about the scale and the issues. The beauty of AltaVista when we took that over was that it was a large problem. It was a large search engine before Google, but they used the big Alpha-chip DEC machines, and that wouldn’t scale because of cost. And so you learn a lot real fast, and I realized that engineering and operations really need to work very closely together if you’re going to scale on the internet. And that’s a lesson a lot of little companies need to learn, including LinkedIn when I joined them.
Ronak Nathani: [00:07:57] Yeah. It’s pretty amazing that you learned all of that on the job. Also, you mentioned that for the first many years of your career you were a programmer, but then you went from the IC track to the management track. How did that happen?
David Henke: [00:08:12] Well, that’s a long story. So I was a founder in my two startups and I was the principal programmer. At the first startup, I wrote 73% of the code, and it was all C programming back then, if you’re interested in that. When I got to Silicon Graphics, I thought I was a real hotshot programmer. And then I would say, out of the 200 programmers that were there, I would rank myself 197, and it was a little humbling. And I thought, well, I can look at this one of two ways. I can feel bad about that, or I can feel really good that I can learn from these other 196 really smart people that are way better programmers than I am. And that’s what I chose to do. I worked for the tools group and also compilers. So I was involved with the C++ compiler. I was the one who brought Java over from Sun. What a piece of shit Java was at that time. A hundred times slower than C. We brought Purify into the company, from Pure Software. But at the time I was still an individual contributor, and we were supposed to get to a 64-bit computing model. We were the first ones to do it among the major computer makers, and everything worked except the tools and the compilers, and our group was choking on the tools. I wasn’t part of that exercise. So they came to me as an individual contributor and they said, we need you to work on this problem. I said, well, what if I don’t want to work on it? And they said, then you’re fired. And that’s the way Silicon Graphics was. So they made me manage a group of people for the first time in my life. And I put them into two shifts, working round the clock to get our debuggers and our performance analysis tools and our C++ front ends and all these systems to work with 64-bit computing. You would think this would be an easy problem, going from 32 bits to 64 bits. It’s a really hard problem. And we worked two shifts around the clock.
Day and night for three months, we created a minimum viable product. And all of a sudden Silicon Graphics could ship their 64-bit computers. And I realized: as an individual contributor, I can do this amount of work, but I just got these gentlemen and these ladies to work really hard on this problem. And we were heroes, and Silicon Graphics was really good about taking care of their heroes. They sent me to Hawaii for five days in a five-star hotel, with presents on my desk every day. Plus bonuses, of course. Very good company to work for that way. But anyway, that’s why I became a manager. I decided that I could do more with that. Now, if you ask a person like me, and especially Kevin Scott, who’s the CTO of Microsoft and who I hired for LinkedIn, he would rather be programming in Python right now than doing anything else. So once you’re an engineer, you’re an engineer, and you have to give up something to be a manager or leader. So don’t forget that when you’re making that transition.
Ronak Nathani: [00:11:03] So as you got into management, I’m sure at some point you would also miss writing code, but what aspects got you to continue on the management route?
David Henke: [00:11:14] Well, let’s get back to the code part. I still coded. So for example, when we did the transition to 64 bits, I wrote all the test cases. When I was the manager of the group that moved to Purify, to get Purify to work, I wrote the Purify torture test: everything you could do wrong in C programming. And it was so good that the Pure Software people made me an honorary member of their engineering team. Now, I was the manager at the time, but I was writing the test code. Even when I was at LinkedIn, we had hack day, and I wrote a hack that basically scraped all the members of LinkedIn. At the time, there were 140 million, and I could tell you their names and their titles and their companies. And I scraped it without having access to LinkedIn directly. And then I handed it as the hack to my security team, who worked for me, of course. And they were quite embarrassed. And then they wrote a thing called Sentinel, which fixed this problem, so dummies like me couldn’t scrape LinkedIn. So I kept my hand in, but at the end of the day, you have to decide, you know, you’re going to trust your team. And I always trusted the team. And now, you know, at this point, even when I left LinkedIn, everybody was smarter than I was, which is great. That’s the way it should be. Yeah. Well,
Ronak Nathani: [00:12:28] so going from the code part, what got you going on the management track? Like, what did you like about it?
David Henke: [00:12:33] What I liked about it is the ability to handle and do many more things. I could get a lot more work done and accomplish a lot more if I could direct traffic. And at that point I became like a coach. I don’t know if your audience knows this, but I’m a big fan of the Los Angeles Lakers. Sorry about that, Warriors fans. But I grew up here and I have season tickets. How do you make a championship team? Not by just having LeBron and AD on the team. You make it by having rebounders and defensive specialists. And I assembled those teams at LinkedIn. When they did the going-away for me, there were 22 LinkedIn employees in the room. Out of those 22, five of them were there before I got there. The other 17 I hired personally, including Kevin Scott and Bruno, and that was my team. And that’s what I realized: with that kind of a team, you’re going to win championships. You’re going to win. I love that
Ronak Nathani: [00:13:29] makes sense. So I’m very eager to jump into some of your time at LinkedIn, but before I do that, you mentioned that when you went to AltaVista, operations was something that you learned on the job, like learning about networks and data centers. Well, first of all, it sounds challenging. And throughout your career, you were still writing code and staying close to the ground while managing a big organization. What did that learning on the job about data centers and networks look like? If I imagine it, it just sounds hard to do all of that.
David Henke: [00:14:03] Well, you realize that at the end of the day, it’s either hardware or software, and this is where I have friends in this business. So at my first startup, we wrote software to design integrated circuits and PC boards. So I have some background in hardware engineering, and the people that know about hardware and software combined make for good people. The old way of doing operations, where you had engineers and they would hand it over a fence to the operations team and the operations team would run it, doesn’t work on the internet. It doesn’t scale. It’s not fast enough. It doesn’t deal with issues fast enough. And so I was actually a good candidate to be somebody to learn about operations. I actually find the best operations personnel are people that were engineers first and then became operations personnel. If you’re strictly operations and you don’t know how the code works, then you don’t know how the internet works. Oh,
Ronak Nathani: [00:14:58] yeah, I used the last one to that. Well, so talking about your time at LinkedIn, you mentioned the third time you came out of retirement was when you came to LinkedIn and led engineering and operations. Now, every SRE at LinkedIn has heard this: that the number one priority after talent is Site Up.
David Henke: [00:15:19] Yeah. It took a while to get that, right.
Ronak Nathani: [00:15:24] Yeah. And every SRE who joins LinkedIn, from the get-go, from the bootcamp, whether they go to a post-mortem after an outage or are speaking with a team about a new design, hears that Site Up is a culture. It’s talked about a lot, and it’s something that’s attributed to you. Many people at LinkedIn who are still there since your time talk about Site Up: “Yeah, that comes from David Henke.” And I was reading about you, and it seems like that’s true for your time at Yahoo as well. I realize we probably can’t go into all the details, but I would love to know: what did it take to bring that culture to a company like LinkedIn? You mentioned it took a while to get there. What major changes needed to happen?
David Henke: [00:16:11] You used the right word: culture. So let me tell you what happened. I come from Yahoo, and Yahoo and Google know how to run at scale on the internet. Good for them. And that was a good experience. I come to LinkedIn. The fucking site is down every day. Every day. Okay. They had a word that I had never heard of called throttling. You know what throttling means where I come from? It means you’re on your motorcycle and you’re pulling back and you’re gassing it. Right. That’s what throttling means to me. That’s not what it means to these guys. It means you’re dropping bits on the floor because you can’t handle this many requests. Never heard of this term before. Okay. This happened every day, and I’m like, this is not good. This is not a good outcome. Why is this happening? And then you start exploring it. So the first thing we did at LinkedIn was we had a daily operational meeting. And the beauty of the daily operational meeting is I get to hear everything that’s wrong. And unfortunately, there are many things that are wrong at this point. But unlike Jeff Weiner, who was probably the greatest QA person that LinkedIn ever had, all 30 of the things that he reported that day aren’t going to kill me. But these three things are, and let’s go over these three things and figure out what we’re doing. At that point, you’re bringing attention to the problem. Then you’re getting people to realize, you know, if we weren’t having so many problems, maybe we could spend more time working on things that are of more interest to us, so we can get out of some of these problems. The other problem was just the sheer culture of the organization. LinkedIn was a product company. Reid Hoffman is a wonderful guy, and he’s a product guy. Jeff Weiner is a wonderful CEO, a product guy. Deep Nishar, a very technical guy, but he was the head of product at LinkedIn, from Google.
And so you have three of the most powerful guys at LinkedIn, and they’re all product guys. Okay. There’s an old adage: quality, schedule, features. Pick two. Yeah. Okay. Engineers typically pick quality and schedule. Not always, but typically. Product people typically pick schedule and features. So there’s this natural tension between quality, schedule, and features: pick two. And of course, when I asked Jeff this question, what do you think he said? “I want all three.” That’s not the deal, Jeff. That’s not the problem we’re trying to solve for here. Bottom line, LinkedIn didn’t treat Site Up as important. And now we did. We had to, because if you can’t keep the site running and the service running, who gives a damn about this next feature? Okay. I’m a big believer in growing fast. And I think that’s a lesson I learned from Mr. Hoffman, Reid Hoffman: go as fast as you can, grow as fast as you can, but the site’s got to work. The other thing that was important to me was security. You know, when 500,000 of your users have the password 123456, that’s not a very secure system. And you want to start building that out as well. Now, we weren’t moving money at LinkedIn like a bank, but we still wanted to make it as secure as possible. So I wanted Site Up and security to be on top, after talent. Talent always comes first, because without the people you can’t do anything. Then that, and then everything else. Now, that doesn’t mean we spent all our time working on Site Up and security. What it means is when push came to shove, that took priority. And that was a difficult and time-consuming argument with Jeff and with Deep and with Reid and with the rest of LinkedIn, because LinkedIn wasn’t used to that, but we figured it out and it was in our best interest. And at the end of the day, it really, really paid off.
Ronak Nathani: [00:19:59] So I want to touch on what you said: quality, schedule, and features, pick two. It’s a challenge, especially during high-growth times. If you’re spending too much time getting the perfect technological solution, but not for the right product, you won’t survive. On the other hand, if you build the right product but, like you said, the site is not working, that doesn’t work either. So while you want to grow as fast as you can, you still want to keep the site up. How have you seen successful teams manage this?
David Henke: [00:20:29] Well, what you look for are the things that are killing you. So if you’re constantly fighting the site, then you really don’t have time to add new features, right? You’re just not going to do that. The other thing that was killing LinkedIn was the release process. Again, this was before you guys’ time, but we used to release every two weeks. Very badly. It was a huge job, a monolith. Remember, I’m the guy that ported Java from Sun to Silicon Graphics when it first came out. What a piece of shit. The memory management model was horrific. It’s much better now. Bottom line is that we shipped this big piece of junk every two weeks and we had to fix it forward. Sometimes we were there till the next morning, sometimes till the next afternoon, just trying to get our site to work again. And it was horrific, right? So we finally went to the product people and to the rest of LinkedIn, the engineers, and said, we want to redo this. We want to rearchitect how we deploy. And we want to do this in a way that at the end of the day, we will all be better off. And we had four principles. The project was called Inversion. Everything at LinkedIn is “in” something. Yeah, it’s kind of a nice name, Inversion. So we had four principles. One, trunk development: everybody checks into the trunk. This is the way it should be. This is the way we did it at SGI. By the way, if you broke the build at Silicon Graphics, you’re fired. Okay. So anyway, you don’t break the build, because we’re all checking into trunk. Two, there’s got to be 24/7 testing against the trunk. We’re constantly testing against it because we’re all using it as the common base. Three, canary-in-the-coal-mine deployment: instead of deploying to all 100 nodes, I’m deploying to one. If that works, three. If that works, five.
If that works, 10. If that works, 30. If that works, 70. If that works, all 100 nodes get deployed to. Four, at any moment, if it does not work, undo. You have to be able to undo. And with those four principles and the machinery behind that, called Inversion, we changed how code was delivered, deployed, tested, and undone, and the pace at which this was all done at LinkedIn. People could release anything at any moment. And if it didn’t work, we’re going to undo it. This makes the engineers happy: they can go as fast as they can. This makes the ops people happy: if it doesn’t work, we’re undoing it. This makes the product people happy: we’re shipping more features than we were before. This makes the salespeople happy: we’re making more money. Everybody’s happy. And I will tell you that was a very good thing for us to do.
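The canary and undo principles described above can be sketched in a few lines of Python. This is only an illustration under stated assumptions, not LinkedIn's actual Inversion tooling: the function names (`canary_rollout`, `simulate`), the callback signatures, and the in-memory "cluster" are all hypothetical. Only the stage sizes (1, 3, 5, 10, 30, 70, 100) come from the conversation.

```python
from typing import Callable, Optional, Sequence, Set, Tuple

# Cumulative node counts per stage, as described in the conversation.
STAGES = (1, 3, 5, 10, 30, 70, 100)


def canary_rollout(
    nodes: Sequence[int],
    deploy: Callable[[int], None],
    healthy: Callable[[int], bool],
    undo: Callable[[int], None],
    stages: Sequence[int] = STAGES,
) -> bool:
    """Deploy in growing stages; undo every deployed node on the first failure.

    Returns True if every stage passed its health check, False if rolled back.
    """
    deployed = []
    for target in stages:
        target = min(target, len(nodes))
        # Extend the rollout from the nodes already deployed up to this stage's size.
        for node in nodes[len(deployed):target]:
            deploy(node)
            deployed.append(node)
        # Check every node deployed so far before widening the rollout.
        if not all(healthy(node) for node in deployed):
            for node in reversed(deployed):
                undo(node)
            return False
    return True


def simulate(bad_node: Optional[int] = None, node_count: int = 100) -> Tuple[bool, int]:
    """Hypothetical in-memory cluster: one node (if given) fails its health check."""
    running: Set[int] = set()
    ok = canary_rollout(
        nodes=list(range(node_count)),
        deploy=running.add,
        healthy=lambda node: node != bad_node,
        undo=running.discard,
    )
    return ok, len(running)
```

In this toy setup, `simulate()` walks all seven stages and ends with the new version on every node, while `simulate(bad_node=7)` stops at the 10-node stage and undoes the ten nodes it touched. A real system would also need health checks that take time to settle and an undo that restores the previous version rather than just forgetting the new one.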
Austin Ouyang: [00:23:21] Yeah. That’s really interesting to see how project Inversion came about. And I can imagine that this was the right thing to do for LinkedIn at that time. And it’s not easy to make that call and say, hey, this is what we have to do. I can imagine a lot of engineers coming in and saying, well, I thought I was going to work on, you know, the latest and greatest technology. Why am I being put to do these other tasks, which I thought would be done already? And I can imagine someone in your position has to be able to keep such a large group of engineers motivated.
David Henke: [00:23:56] Okay. So to your point, think about not just the engineers, but think about the product people, right? Their whole, their whole world is new features, new product development. And they were effectively put on hold largely for six months.
Austin Ouyang: [00:24:11] Right. So what I’m curious about is how you were able to influence them, to convince them and keep them motivated, to be like, hey, we have to do this, and this is how that’s going to pay out. Like, in the short term, I understand this is going to really, really suck, but you’re going to really enjoy it, you know, after this time
David Henke: [00:24:31] that’s right. And first you go to the leadership, right? Make no mistake. We went first to Deep Nishar, the VP of product. And the good news is Deep’s a pretty good engineer. And so he understood the mess we were in. Then we went to Jeff Weiner. And the nice thing about Weiner is he spoke every two weeks in front of the whole crew. Every two weeks. And I always asked him why, and he says, because you have to say something 42 times before anybody will remember it. That’s one. And two, he said, we’ve got a lot of new people. They’ve never heard any of this shit. We’ve got to do it over and over again. Well, in this case, he’s going to speak to the whole company about Inversion, because we need to get buy-in from the product people, from the finance people, from the salespeople, as well as from the engineers. And we did.
Austin Ouyang: [00:25:21] Nice. So I want to pivot now. We love discussing stories about production outages on the show, and also the lessons learned. I imagine leading up to project Inversion, and probably during it, there were many, many of those. So you’ve seen many production outages, not just at LinkedIn but also elsewhere. Could you share maybe one or two of these war stories from your experience?
David Henke: [00:25:45] Yeah. Some of these actually bring back very bad memories for me. So if I start crying at some point, you’ll understand. So probably the greatest outage in the history of Yahoo that I was on board for was what I called the 10g massacre. So 10g is Oracle, Oracle 10g. And at the time, again, we had a legacy system that was responsible for half of Yahoo’s money. And the massacre went like this: we were going to change all of our databases. We were going to go from SPARC to Intel, single computer to RAC, 32 bits to 64 bits, Solaris OS to Linux OS, big endian to little endian in the byte representation, EMC to NetApp storage, and finally Oracle 9i to Oracle 10g. This was our migration path. Not
Austin Ouyang: [00:26:47] in all one step though.
David Henke: [00:26:48] Right. I remember I had been at Yahoo for one month, just trying to keep the site going. And this had all been tested. I was assured by everyone this was all tested, and we had certifications and all this stuff. And like anything else, when you make that many changes all at once, it’s probably not going to work. So I called it, at the end of the day, the 10g massacre. We loaded it up on a weekend, which was typically our non-traffic time. And by the morning, when Ajax started coming online on Monday morning, things turned to shit in a hurry and it was bad. And effectively what we learned was that this version of Oracle on Linux in this environment was beta at best. And it was crashing constantly. I now know what an ORA-600 is. Never knew that before, but you’ve got to know that, because that’s basically “you’re screwed and you don’t know why, but I’m going to preserve the data if I can.” That’s an Oracle fatal error. Anyway, it took us two months to get out of this nightmare. Two months. During this time, I would get up at seven in the morning, walk to work, and I would leave every morning at two. So I got five hours of sleep, max. And I was living in Pasadena at the time, because that’s where Overture was. At 2:00 AM in Pasadena, there’s only two groups of people that are out there: the drunks and the whores. And they would ask me, can I have a cigarette? Because I used to smoke at the time. Sure. And then they would ask me, why are you out here so late? And I said, well, it’s hard to explain, but I’m responsible for half of the money at a company, and we’ve just shot ourselves in the foot, if not the head. And then they started asking me, you know, personal questions, like what’s my name, because they would see me every night. After about three weeks, they said, David, how’s it going today? Any better? And I said, well, hey, you know, we stayed up for most of the day, and so forth.
After five weeks, they said, we have a good feeling about this, David. At this point, I’m buying them beer and cigarettes and shit. And remember, I see the same crew every night at 2:00 AM. After two months, we sorted all this out with a lot of help from Oracle and experts from a lot of places, including a lot of people flown down from Yahoo headquarters in Sunnyvale. And at the end of it, I went to a liquor store, bought five cartons of cigarettes and as many bags of booze as I could carry. And I took it out there at 2:00 AM to my friends. I said, I will never see you again, I hope. And I never did. That’s the 10g massacre.
Austin Ouyang: [00:29:43] Yeah, that sounds fairly horrifying. And I’m glad that you guys are able to get up at at some point. So you wrote a series of blog posts related probably quite to this on the LinkedIn blog titled every day is Monday in operations. As I was reading through one of those posts, you were talking something about the Panama project or you wrote one of the axioms that go like. Go to work every day, willing to be fired. Can you elaborate more on this or share any related stories
David Henke: [00:30:11] I can. So Panama is as you know, the Panama canal is probably one of the greatest construction efforts in the 20th century. It’s an amazing story. I recommend to all your listeners to read it. It’s like the a thousand page book in one of my bibliographies on this topic, but we called our project to rebuild, sponsored search for overture and Yahoo Panama. And it was very similar. And in terms of many things, one in the Panama canal, they had to keep the workforce alive because of yellow fever and because of malaria and they didn’t know the cause of it. In our case, we just had to build a team. When I got to Yahoo, there were 27 people working on Panama. When I ended up there were 500 people working directly on Panama. The second problem was how to engineer the solution. And we had to create brand new engineering and solutions to, to make this work and infrastructure as well, just like they did on the Panama canal. For those of you who don’t know it, the Suez canal connected to C’s, but the Panama canal connects to oceans and they literally had to build a Lake at the top of Panama, get the water in there and use that to, to float the boats and to lower the boats in the locks. And that’s the engineering solution that works. But the other thing to remember is it’s a long-term project and the Panama canal was a long-term project. The French had started it, they dug up one third of it and quit and an almost bankrupted France. And then the Americans took over because of military reasons. And Teddy Roosevelt was smart enough to know that we needed this, nothing else for our military. And but you can’t give up. So there’s a, there’s a famous part where they’re cutting. What’s called the co-labor cut in the Panama canal. And it’s a very mountainous area. And it, once again, because it’s a tropical rainforest, it fills in with mud and water and the engineer goes to the chief engineer. He says, what do I do now? 
“We just filled up the trench, the Culebra Cut.” By the way, culebra means snake in Spanish. “We filled it up again. What are you going to do?” And the chief engineer says, “What do you think you’re going to do? Dig.” Well, that’s what we had to do on the Panama project. It took us one and a half years to do this. Now, getting back to “go to work every day willing to be fired”: right near the end of the Panama project, the boss shows up, the CTO of Yahoo. He shall remain nameless at this time, but he’s my boss. And I got all my lieutenants in there, and I had written this 25-page spec for what the acceptance criteria for Panama meant. Right. And he says, “We’ve got to ship now.” I said, “But we haven’t checked off everything on the 25-page spec.” So he starts taking things off my list and I start getting angry, and this is not good. And I will say, yeah, I didn’t behave well. I took my badge and I threw it at the big boss and I walked out of the room and I quit, because he was trying to undo my list in front of my staff. At the end of it, he came and we talked, and we both apologized to my staff, because neither of us handled that one very well. Bottom line, we didn’t relax the criteria for releasing it. And we did a hell of a job, and I’m still proud of that project and very proud of the people that worked on it.
Austin Ouyang: [00:33:28] Yeah. That’s really great to hear. You mentioned you worked with many engineers even at that time, and I imagine a big part of your role has been to grow them as well, and to actually see key qualities where you’re like, this is what I want to see in a really good engineer. What kind of key qualities have you discovered, from a variety of engineers, that make you feel like they’re going to go places and do great things in the future?
David Henke: [00:33:54] Well, again, obviously you’ve got to have the smarts, but that’s not enough: necessary, not sufficient. So believe it or not, culture once again comes into play, right? You could be the smartest guy or gal on the planet, and if you can’t get along, or figure out how to get along, with this team, you’re off the team. Some engineers also have a certain knack for exploring or thinking about what could be. And you’re always looking to them, because those are the ones that are going to take it to the next level. That’s another trait that I look for. By the way, I interviewed almost every engineer we hired in the old days. I spent 35% of my time at LinkedIn hiring. Imagine that: 35%. Remember, I was there all day, all night, but still, 35% of my time hiring people. And I would talk to junior interns as well as senior leaders, it didn’t matter to me, because for everybody that joins this company, that is the number one priority for us: hiring the best and the brightest, but also the cultural fit. That does matter.
Ronak Nathani: [00:35:02] Since you talked about your boss a little bit in the previous conversation, in what Austin was talking about during the Panama project: I was looking at your LinkedIn profile, and Kevin Scott, CTO of Microsoft, actually has a recommendation for you. And one thing that he says is that you are the best boss he’s ever had.
David Henke: [00:35:20] Yeah. That actually pisses off all his other bosses. That’s okay. I’ll take it. I have only one other recommendation that I post, by the way, and that’s from Qi Lu.
Ronak Nathani: [00:35:33] Yeah. And I have a question on that one too. Go ahead.
David Henke: [00:35:36] Go ahead. I just want to make sure we mention him, because everybody’s got a hero in this business, and my hero is Qi Lu.
Ronak Nathani: [00:35:37] Why don’t you tell us why?
David Henke: [00:35:38] Yahoo was what I call a loose confederation of warring tribes. You’ve got a lot of really smart people working on a lot of very different things, not necessarily together. Qi just wanted to do search and search monetization, and he was really good at it, and he attracted people to do it. So his team was very loyal to him. And I eventually worked for him, at the end of the day, in search and search monetization. He’s the smartest, most humble, hardest-working person I have ever met in my life. Period. You can literally call him up every hour of the day. And I tried this. I even wrote a cron job to do this, because I was too lazy to stay up, just to see when he would respond. All but four hours of the day, he responded.
Ronak Nathani: [00:36:42] So I’ll get back to my question about Kevin Scott’s recommendation about you being the best boss. But since you mentioned hard work, and on these recommendations from both Kevin Scott and Qi Lu, I actually want to read out just a couple of lines from both of them. So this is from Kevin Scott: “I knew within a day of working with him that David’s passion and commitment to his work and to his employees were almost superhuman.” And Qi Lu says something along similar lines: “Ferocious intensity is another hallmark of Henke. I still remember the days where Henke can fight off a major site outage with his enormous willpower.” So you mentioned Qi Lu was one of the hardest-working folks, the hardest-working guy you worked with. My question for you is: these two people that you respect a lot are saying that the intensity you came in with and the passion you have are amazing. So how did you develop this passion and intensity that you brought to work every day?
David Henke: [00:37:42] Well, that’s a good question. By the way, I’m not like Qi Lu and Kevin Scott. They’re both introverts. They’re both a lot smarter than I’ll ever be. But I don’t like it when shit doesn’t work. It really bugs me, and I don’t like to fail. You know, I always say this to people: there are two kinds of people. Do you love winning, or do you hate losing? Ask yourself that question sometime. I hate losing more than I love winning. Doesn’t mean I don’t love to win; I just hate losing. I hate it when things don’t work. And you know, the nice thing about LinkedIn is there was no end of it there. And Yahoo and Overture, no end to it there either. So let’s go fix that. And I was often brought in to fix things like that. And I know what to do. In terms of, you know, willpower to overcome it, that’s not me. That’s the team. What I had to do was get the team in place and make sure everybody knew what they were going to do. And once you get enough people marching in the right direction, you’ll solve any problem.
Ronak Nathani: [00:38:46] Well said. And coming back to my question about Kevin Scott’s recommendation for you being the best boss. I know I’ve repeated that now three times, right?
David Henke: [00:38:54] I don’t know all the reasons he might’ve said that. You can interview Kevin someday and maybe ask him that question.
Ronak Nathani: [00:39:00] Yeah, we would love to have Kevin on the show as well.
David Henke: [00:39:04] He’s quite a character. We’ll just leave it at that. Okay.
Ronak Nathani: [00:39:07] So my question actually is what do you think makes for a good boss?
David Henke: [00:39:12] A good boss is, first of all, somebody that listens to you. Listening is a very underrated skill. You want somebody that actually hears what you’re saying, right, but also can speak it back to you. So this gets back to one of my definitions of communication. Email’s not communication. Texting is not communication. IRC chat is not communication. Communication is when you and I are talking to each other: I say something to you, you can say what I said back to me, and I have to agree that you got it right. That’s communication. I had to learn that the hard way in marriage counseling, and that didn’t work, but anyway. Bottom line is communication’s really hard, and believe it or not, I would use this trick with some of my teams: okay, person A, you need to talk to person B, say what he said, and make sure he agrees that you’ve got it right. You would be amazed how hard this is. And that’s something we need to focus on. The other thing that I used to do as the boss was, I wanted to know what my employees wanted to do. Look, what is it you want to be when you grow up? What is it you want to do next? What is it you want to work on? So I don’t wait every six months or one year to have a talk like that with them. I wait maybe, at the max, a week, and we’ll hash it out. And by the way, there’s usually three things. No one wants to hear more than three things to work on. Trust me, four things and their eyes are glazing over and they couldn’t give a shit. So pick three things. And instead of saying, you suck at this, say, what if you did this instead? Maybe we can talk about that. What if this, for the three things that we’re trying to work on? Because there’s always something to work on, and that’s true for all of us. Yeah.
Ronak Nathani: [00:40:58] That’s absolutely true.
Austin Ouyang: [00:41:00] Yep. So you’ve been in a lot of leadership roles at this point, and worked with many, many leaders as well. I’m pretty sure there’s a ton of parallels with, you know, being a good boss, but I imagine there are other individual contributors that also have the leader attribute, just not in a more managerial position. So what, to you, makes a great leader beyond, like, I know
David Henke: [00:41:27] Let’s talk about individual contributors for a second. I’m going to turn your question a little bit, because it’s a good LinkedIn story as well. So when I got to LinkedIn, on day two I asked the VP of operations at the time: show me the DR site. Tell me about it. How long will it take me to cut over if I lose this data center? He says, “I don’t think you understand.” And I said, “I don’t think you understand. It’s a simple question. Is it going to take four hours, eight hours, 12 hours, a day?” And he says, “I don’t think you understand.” I said, “I don’t think you understand.” He said, “Well, we have 90% of the computing we need in the second data center, the DR site.” I said, “That’s fine. We can buy the other 10%. What else?” He says, “Well, we don’t have any of our software on those computers. We don’t have any data replicated on those computers, and we don’t have any configuration parameters set up. Basically they’re just machines clicking in the machine room.” I said, “So basically what you’re telling me is we don’t have a DR site. Is that what you’re telling me?” And he goes, “That’s right.” So by the way, that’s not good. I had to go right to the CEO and right to the board of directors and say, if we lose this data center, we are out of business. Everyone nod your head. Everyone understands this. Right. Okay. So fast forward, we built a DR site, thanks a lot to Neil Pinto and a whole bunch of people. We built it from scratch and it worked. We know it worked because we cut over to it in about eight hours. Pain in the ass, but we cut over to it. Then we wanted to build multi-colo. Now, multi-colo is a lot harder, because you’ve got to flip traffic, but the hardest part for LinkedIn was that all the data sources had to be replicated so that they were consistent. And that was a really hard problem.
So I took the best engineer I could think of at LinkedIn that we hired from Yahoo, a guy named Swee Lim. Individual contributor, not a manager, not a director, blah, blah, blah, but a real smart engineer. And I said, “Swee, we’ve got to make multi-colo work, just like we used to do at Yahoo, just like they do at Google. Flip traffic; who gives a shit if we lose a data center, right?” Everybody in the company at that point worked for Swee Lim, the individual contributor. He became the leader. Jeff Weiner got in front of the entire company: “If this gentleman comes into your office and asks for something, do what he says.” And we built multi-colo in one year, roughly. Now it runs in many, many data centers. And I remember the last time I was in the NOC at LinkedIn, it was green, green, green, red. Red was one data center down; the other three data centers were up. And they said, “Mr. Henke, isn’t this a beautiful thing? We don’t give a shit. It works perfectly.” And I said, that is a beautiful thing. But it took an individual contributor, to your point, to lead all of LinkedIn on a cross-functional project, which was a real pain in the ass, but we did it.
Austin Ouyang: [00:44:33] And I imagine, even with Swee, like you said, being extremely smart, I’m assuming that another big part was that he was able to bring other people along, which I think is something that a lot of engineers, you know, over time slowly start to develop, which made him so useful in that particular position.
David Henke: [00:44:51] That’s right. People want to be part of something. They want to follow people that are very good. They want to learn from people that are very good. Shoot, I learned a tremendous amount from Swee. How does this work? How does that work? What’s the biggest problem we are faced with to get this replication problem to work? And by the way, I think he just left LinkedIn and he’s over at Databricks now. Good for them. A very good guy. But it just shows you that you don’t have to be a manager or director or a VP; you can lead as an individual contributor. You need the backing of the other leadership, but the nice thing is, Swee had the backing of the CEO of the company. Again, Jeff Weiner would stand up every two weeks: “This is Swee Lim. If he walks into your office, do what he says.”
Ronak Nathani: [00:45:38] Oh, I have a follow-up on that. It’s amazing that Jeff and the entire company supported Swee on this project of multi-colo. And I mean, today, all we see is that LinkedIn runs out of multiple data centers. We do traffic shifts and it’s just amazing. That’s one of the first things, when I came to LinkedIn, that I was extremely impressed by: oh, we just click a button and it all happens. It’s like magic. What does it take for an IC to build that level of trust with the leadership?
David Henke: [00:46:10] Well, part of it is I have to talk to, once again, the CEO and the product people and the salespeople. You know, the nice thing about talking with Mr. Weiner is he likes philosophy. So we talk about things like existential issues. Existential is basically death or life, right? So he gets the fact that if we really can’t make this work, we’re just going to be in a bad spot. And the nice thing is we also came from a company that understood how to do this. So people know what can be done. Google, it’s not a problem, they do it all the time. Facebook, they do it all the time. But LinkedIn did not. And Yahoo knew how to do it. So it’s important to say it can be done. Once it’s done, everyone’s going to breathe easier. Believe it or not, of my four years there, until we had multi-colo, I did not breathe easy.
Ronak Nathani: [00:47:06] I would ask: for a lot of startups who are on this accelerated path of growth, and who probably are still working towards developing that site-up culture, which they want but it’s not there yet, what advice would you have for them? Either teams or early-stage companies?
David Henke: [00:47:25] Well, one thing is you’re in a better spot now, right? So when I started at LinkedIn, AWS was just coming out, right, with Amazon. Google didn’t have the cloud. Microsoft didn’t really have Azure. But AWS was starting, and we knew some of the first users of it, because they were ex-Yahoo guys that went over to Netflix, and Netflix was going to go all in with this. And believe it or not, at that time, this was years and years ago, it wasn’t reliable. It wasn’t elastic. It wasn’t cost-effective. The system administration tools sucked, the security system sucked, and much as we wanted to use it, we wouldn’t use it, for all of those reasons. But now it’s great. You can go to AWS, you can go to Google Cloud, you can go to Azure, and you can be pretty sure that you don’t have to deal with data centers and networks and the computing. That doesn’t mean your software doesn’t have to be resilient, and your monitoring systems don’t have to be excellent, and your scaling systems don’t have to be knowledgeable. So you have to build that in, or at least architect for that up front. That’s one of my suggestions. I know you want to go as fast as you can, and I’m a big believer in Mr. Hoffman’s blitzscaling. That’s something I had to learn the hard way: the faster, the better. But that’s why you invest in things that help you go faster, help you scale, because you don’t want to be in a position of what I call the going-out-of-business business. Let’s say your success is so good, but you can’t keep up with the demand because you can’t scale it fast enough. Even if you throw computing at it, you can’t scale it fast enough; then you’re in a bad spot. And if you’re moving money, that’s what I like to talk about. You know, like the encryption guys and the Bitcoin guys and the banks, versus LinkedIn. LinkedIn had to run at three nines.
We did not at first, but my goal was three nines, 99.9% uptime. The money guys can’t do that. They have to run at four nines or better, because they’re moving people’s money, and if they screw it up, they’re out of business. And I think it’s important to grasp that as you’re running as fast as you can, because it seems like they’re at odds with each other. They don’t have to be. You just have to engineer and architect your solutions to scale. Always think 10x.
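For reference, the “three nines” and “four nines” targets mentioned here translate into yearly downtime budgets. The sketch below is illustrative arithmetic only, not something from the conversation; the function name `downtime_minutes_per_year` and the 365-day year are assumptions made for the example.

```python
# Downtime budget implied by an availability target (illustrative arithmetic only).
MINUTES_PER_YEAR = 365 * 24 * 60  # assumes a 365-day year


def downtime_minutes_per_year(nines: int) -> float:
    """Allowed downtime per year for `nines` nines of availability (3 means 99.9%)."""
    unavailability = 10 ** (-nines)  # e.g. 3 nines leaves 0.1% of the year
    return MINUTES_PER_YEAR * unavailability


if __name__ == "__main__":
    for n in (3, 4):
        print(f"{n} nines: {downtime_minutes_per_year(n):.1f} minutes of downtime per year")
```

Under these assumptions, three nines allows roughly 8.8 hours of downtime a year, while four nines allows under an hour.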
Ronak Nathani: [00:49:52] Yeah. Makes sense. So we’re starting to wrap up and have a few more questions for you. We want to make sure we respect your time, but before I go on to these questions, are there any other war stories you would like to share with our listeners?
David Henke: [00:50:05] I have many. If people like them, I literally could go on forever. So I will not do that. I will go to your questions.
Ronak Nathani: [00:50:15] Okay. Well, we’ll save that for another time. I heard that you read a lot and you also like gifting books to your staff, to the people who work with you. What are some of the books that you have gifted the most?
David Henke: [00:50:29] So I not only gift books, but it’s required reading. Okay. So if you worked for me directly, you have to read these books. Actually, I’ll just list some. I was going to look for a list, but I can’t find it. So here’s some of the books that I really recommend, but I have a bibliography, and your readers could go find it on the internet. It’s a leadership talk that I’ve given maybe 30 times. It was recorded by UC Santa Barbara, and I managed not to use the F word in that thing because it was my alma mater. But at the end of the deck for this leadership speech, there’s a bibliography of all these books, and definitely allow your readers just to look at the bibliography, because they’re good reads. I have many, many good reads. Some of them go way back, you know, like The Mythical Man-Month. It’s a software book. Six Hats is kind of a fun read. But some of the more important books: one is a philosophical book, written by the Toltecs, who predated the Aztecs, and it’s called The Four Agreements. And it’s very important to me. The four agreements are: do your best, which you guys all do, which is great. Be impeccable with your word, that’s number two; impeccable is Latin for “without sin.” Don’t assume anything; well, that’s ops 101, right? And the hardest one: don’t take anything personally. I love this one, because that may be the hardest one for most people to appreciate. But I’ll give you an example. David Filo, the founder of Yahoo, smart guy, engineer. I worked for him at one point, when the CTO left and we were looking for a new CTO, and he called me an idiot about 500 times while I worked for him. And I said, “David, I have the largest group in the company and I have the largest budget, because I run the data centers and the computing and the networks. You’ve given me all this responsibility, and I’m an idiot. What does that make you?”
And I’m thinking he must have just kicked his dog that day or something. I don’t know. Bottom line is, don’t take it personally. I don’t think he really thought I was an idiot. Maybe he did, maybe he didn’t, but I don’t really worry about it. So that’s a very good book for me, because you can use it in your life as well as in work. Go to the bibliography, because, you know, it talks about things like how they built the Panama Canal, if you want to see how something’s done, how a big project’s done, because not everything is a quick and dirty project anymore. Sometimes things are much harder to build than others. The graph at LinkedIn, the network: we did it three times when I was there. That’s a hell of a thing, and it’s a hell of a thing to get right. And it may be the most interesting data structure we have.
Ronak Nathani: [00:53:12] Yeah. And we just released a new database to store that graph.
David Henke: [00:53:14] You did? Well, I didn’t know that. So you’re on iteration number N, where N is greater than I remember. Yeah.
Ronak Nathani: [00:53:22] Well, thank you for that recommendation. And David, we could go on and on with you and ask you a lot more questions, but we’ll probably save that for another time. It’s been such a pleasure and an honor to speak with you. Thank you so much for taking the time.
David Henke: [00:53:37] Anytime. And if you work at LinkedIn, you’re probably part of the best SRE group in the universe, at least in our universe. And the reason I know that is ex-Googler Kevin Scott, now CTO of Microsoft, knows how good Google is at this. And he said, “Bruno, your team’s better.” So there you go.
Ronak Nathani: [00:53:58] We’ll take that. Thank you so much again. We really appreciate it.
David Henke: [00:54:02] Take care.