Julie Gunderson: We cover leading practices used in the software industry to improve both system reliability and the lives of the people supporting those systems. I’m your host, Julie Gunderson, @julie_gund on Twitter. Today we’re going to be talking about security. We’re joined by Bea Hughes and Sarai Rosenberg, security engineers from PagerDuty. To get us started, I’d like our engineers to give us a little bit of background on themselves. Bea, would you kick us off?
Bea Hughes: Hi, I’m Bea. I’ve been doing the security lark for about 20 years now, starting off in Merry Old England and then flying my way around the world at great carbon footprint expense. I originally got interested in security from the old BBS scene, which used to incur horrendous phone bills because in England we didn’t have free local calling. That led to some interesting family conversations. From there, I fell in with a good bunch of miscreants who started going to things like 2600 meetups, because this was in the ’90s. I hear they still happen. From there I really started to get into Linux and Unix. My first Linux machine was running a one-point-something kernel. I am that old. I also had OpenBSD laptops, ones so good that if you ever accidentally touched the trackpad, you had to plug in an external keyboard before the keyboard continued working. In the end, I just unplugged the trackpad internally in the laptop, which is why the year of Linux on the desktop is something I celebrate every year. From that security scene, it naturally progressed into exploring the security of other people’s machines on a pro rata, not entirely upfront basis, because that was the internet in the ’90s, and I was able to deliver many security reports by way of root shells. I eventually realized that this could become a career because the body piercing industry was probably not going to pay as well. I managed to get a job at a local ISP, because again, early 2000s, and deal with modems and mail servers and news servers and all that kind of good stuff. Out of that, I just slowly built a career of reading Phrack, occasionally finding old Solaris exploits to break into machines that people had forgotten the passwords for, running too much nmap and taking out networks, and that kind of thing. Here I am today.
Julie Gunderson: Well, thank you for joining us. Sarai, how about you?
Sarai Rosenberg: In 2015, the CEO that I worked for told me to research security standards and compliance and give them some recommendations. I did that. It turned out I was pretty good at that. I got really into making recommendations and I kept doing it. They kept letting me do it, so it just built and it grew and it grew, and it was a lot of fun. I decided to keep doing that and pursue security roles at other places. And I still continue to do that. I find it a really good opportunity to connect with a wide variety of different people, learn about the things that they care about, and help them improve the things that they do.
Julie Gunderson: Then can you tell us, how would you describe security for anybody that’s new to security?
Sarai Rosenberg: To me, security is about assessing risk. It’s about finding the things that could go wrong, the things that could be threats to your system; deciding what you care about in your system, the things that you want to protect; and then deciding what you could do about that. Where could you cut off potential risks? Whether it’s an attack that’s in progress or securing your infrastructure, anything that you do can be improved to help prevent threats to it.
Julie Gunderson: Thank you. What about you, Bea? What are your thoughts? What would you describe security as to somebody who’s new to it?
Bea Hughes: The point Sarai made is really good because it’s all about managing risk. You can’t eliminate risk, despite how much the RSA conference says otherwise. And it is about harm reduction, in the sense that you can’t secure everything. Not everything in your company is the most important thing and you aren’t defending from everyone equally. I think those are really salient points. If you ask at a conference, security is about [inaudible 00:05:00] and having the biggest exploit and breaking into the most things. If you ask somebody who actually does, I guess, the defensive blue team side, then it’s just trying to make your business be as safe as it can while still functioning as a business, and that kind of gets lost, I think, in the beautiful security view of the world where we can make this secure. Yes, but no one can log in. We’re kind of going to go out of business, something like that.
Julie Gunderson: Thank you. And I feel like you were headed down a really interesting path there. If you were to debunk a myth with security, what would the biggest one be? What would you say? This is the thing that I hear over and over and it’s just not true.
Sarai Rosenberg: That there’s one solution that works for everyone. That what is a big risk for me, for my product, for my software, is going to be a risk for every piece of software. The threats that I have to care about and the things that I want to protect are completely different from the threats that you may want to protect against and the things that you care about.
Julie Gunderson: Thank you. Bea and Sarai, I’d love to hear what are the values of the security team at PagerDuty and beyond what those values are, why are they your values?
Sarai Rosenberg: Many of us had an experience with security culture that we didn’t like. We gathered at PagerDuty and built something better. We worked for companies that would name and shame employees that failed phishing tests. And we decided that we wanted to do something better. We want to make it easy to do the right thing, to build something secure. We try to make it easy to bring in other people, never hesitate to escalate. We try to achieve the maximum value.
Julie Gunderson: Tell me more about these phishing tests and naming and shaming. What practices do you see out there with that?
Sarai Rosenberg: Phishing is inevitable. It’s all over the place. Everyone is going to get phished at some point. Every company is going to have some employee who clicks on a link and opens something. Phishing is very good in 2020. We can’t prevent phishing. But one of the things that we try to do is teach people to reduce it. We teach people the things that they can look for to spot a phishing attempt, and to use their email and the ways that they log in and authenticate with services more securely. And some companies will send out internal phishing tests, where they’ll hire a company or do it internally. It’s a fake email out to all of your employees: “Hey, can you fill out this form?” You put in your username and password. You try to trick your employees and you always catch a couple of them. And a lot of companies will name those employees, will bring it out, will call them out. And even if they don’t broadcast it across the company, they’ll bring it to that employee and say, “Hey, you failed this phishing test.” And it feels bad. It’s not a good experience to fail a phishing test and go, “Well, shit. What do I do? I failed this. How do I do better?” We can help reduce phishing without having to shame people and make people feel bad for failing something. It’s okay if you fail. It’s okay if you click on something.
Julie Gunderson: Thank you. It’s almost like embracing failure, right? I like that. Not shaming people and not setting them up to fail. Bea, how about you?
Bea Hughes: I was just going to add to that. You have whole teams of people whose job it is to open PDFs like recruiting. Their job is to accept anonymous PDFs off the internet of people who they want to employ and thus open them. So telling people to not open dumb in quotes attachments is the opposite of like… It is not their fault. Their job involves this risk and they are not at fault for opening PDFs. You should give them a way to open PDFs or whatever, present this information in a secure way not chastise them for doing their job because they will do their job or they will not have a job. So they will just not listen to you and they are doing completely the right thing by that. Anyone who’s like, “Oh, users shouldn’t click on links or open PDFs.” No, they’re doing the right thing. You need to update what you’re doing.
Julie Gunderson: With a team that has strong values like yours, how do you impart those values or give those to the rest of the company. For myself, after going through the security training at PagerDuty, I learned really valuable real life things that applied beyond my badging in and out at work. How did you all design that training to get people excited?
Sarai Rosenberg: We try to make the security training fun because we want it to be relatable to people’s work. We want it to be interesting and to have some laughs. Not everyone picks locks in their work, but beyond that, there are fun, interesting ideas that we can extract from little lessons like lock picking: that a lot of these things are surmountable, but that even if someone can surmount this level or that level, we still have additional layers of security practices that keep us safe. Even if someone can pick a lock and get in through our doors, we still check their IDs when they come in, and we want to make sure that we only have familiar employees walking around the place. We try to find things that people can use so that they understand that security is about defense in depth. It’s about layers of security. Maybe one layer can be bypassed, but there are other layers going on. And in our trainings, we have little tidbits and examples that are quick, easy, and understandable, things that you can directly apply to your work, or lessons that you can extract and security ideas that you can use to improve the ways that you do things, the way that you understand risks, and the way that you understand the relationship of our team to everyone at the company. If someone is going to get phished and they end up doing something that exposes some information or a financial exchange through the phishing attempt, we’re not going to shame them. We’re going to help them fix it. And it is not their responsibility. It is our collective responsibility to help improve that.
Julie Gunderson: I feel that is one of the things that you have done well, is you have made your team very approachable. Security is not the enemy. We want to work collaboratively with you. And it started with that training. But Sarai, you talked about lock picking and I feel like some of our listeners may not understand why we learn about lock picking here at PagerDuty as part of security. Bea, do you want to expand on that a little bit?
Bea Hughes: Sure. Locks are a wonderful analogy for cybersecurity, and locks are actually pretty terrible. Picking locks is actually not that hard. The majority of consumer locks, especially in North America, are not particularly challenging to pick. You would think, why aren’t houses broken into by lock picking all the time? And I’m sure someone will cite one example where this happened once that proves me wrong. But by and large, it’s much easier to just force a door open with a crowbar, which doesn’t require a ton of additional skills and tools, or break a window, or find a door that’s unlocked. Yet we still have locks on our doors because they provide some security while not providing perfect security. And I’m confident there are people out there who spend lots of time researching to get the best locks from Europe that are unpickable other than by three people in Belgium. But that’s threat modeling. The majority of homes are not going to be lock picked into if they are going to be burgled in some way. A lock adds a layer of security. It is not a 100% layer of security. And picking locks is fun because it makes you realize that we’ve all been living with this myth that locks are impenetrable and you must have the key, that lock picking is some dark art, and that locksmiths are definitely worth the hundreds of dollars they charge at 2:00 AM when you’re locked out of your apartment. But actually it’s the adage that they’re designed to keep honest people honest, which I agree with. But also, what is it designed to do? Stop 99% of people just marching into your house and acquiring a new television. And that works well enough that it is a good addition.
Julie Gunderson: And I like that it really is a physical representation of what we’re talking about with security and technology. And thank you for all of this. Shifting into some more operational methodologies of the team. Let’s talk a little bit about incident response. How does the security team tackle incident response?
Sarai Rosenberg: When I joined PagerDuty, I was new to incident response. And one thing that they clarified for me is that PagerDuty’s security incident response is a little bit different from security incident response in general, because we have a mixture. There is traditional security incident response, which is responding to active attackers or breaches by figuring out what went on, doing the forensic investigation, and closing off attack routes. And we also have more traditional DevOps incident response, where we maintain the services that we run for security, like secret management. So we have a collection of incident response, and we have to make sure that we’re maintaining our own services. As I started, I watched the other people on our team reduce the amount of noise and the number of alerts that we got. They would go through this process of noticing that we had way too many alerts that were not actionable and getting rid of those. Then there would be alerts that were useful, but we were getting too many of them. So we had to do something to fix those services, or improve the way we were handling that, or secure infrastructure in some different way so we wouldn’t get these invalid-login-attempt kinds of alerts. And we kept iterating on that. Every time we tuned down one alert, we’d open up another route and go on to something else and say, we’ve also noticed that we’re getting some fraud, so now we’re going to monitor the fraud that’s coming in. Or we’re going to have this new source of data about potential threats to our systems, and we’re going to monitor this too. We keep adding new things and then reducing them. So it’s this cycle of tuning our alerts as we add more and more monitors.
Julie Gunderson: And tuning is so important because, like you said, eliminating the noise, reducing the noise, helps with alert fatigue. It makes life better for your engineers. Bea, you and I have talked in the past a little bit about reducing the noise and, well, getting rid of stuff. Do you want to give us some practical advice on how to get rid of stuff? How to identify what you don’t need?
Bea Hughes: Does the service bring you joy? The easiest way to secure a system is not to have it in the first place, because then you don’t need to secure it. I remember Facebook having a bug bounty drama over some service that someone found. They then went in and found AWS credentials that then led to this, blah, blah, blah. By having management of your enterprise things, whatever, you can have fewer things you need to secure, which is ideal. Also with alerts, as Sarai spoke about: the Target breach, I believe, did actually have IDS alerts for the things that happened. But the NOC or SOC (there we go, that’s not an overloaded term) received thousands upon thousands, so it was lost in the noise. Step one in buying an IDS: don’t. Step two, if you do by mistake, delete all the rules and then slowly add them in, because every rule set has way too many things in it to actually be useful. I strongly advocate, often too much, for having high-value, high-signal alerts over things that might be useful but go off a lot. Because I think as humans, we’re generally lazy pattern matching machines. After a while of seeing the same or a similar alert, we will just assume that all alerts like that are of limited value. But with ones that go off infrequently, you’re like, okay, I will actually investigate this. And alerts should have outcomes, such as with risk in the security world. If this alert wasn’t useful, then we will change it to make it useful. We will filter more things. We will tweak the alert. We will change the window in which it panics, or we will add more things in, whatever. But having alerts that are high quality is so much better than just having lots of alerts. I’m keen on getting rid of alerts or systems or things that generate email that don’t need to generate email. Please don’t send me cron email ever. It doesn’t need to happen, because you’re wasting time by getting humans to do things that computers can. And that is the whole point of these amazing labor-saving devices. By trying to get humans to do bad pattern matching on things, we do ourselves a disservice and we waste Amazon’s CPUs.
Julie Gunderson: Bea, thank you for that. I really enjoy the focus around the high quality and high value alerts. Let’s talk about some quick wins that our listeners can employ right off the bat.
Sarai Rosenberg: Quick wins are something that I learned on PagerDuty security, because it’s about reducing our risks. We have so many risks out there. We’re never going to be able to eliminate all of them. And so if we look at one particular topic and say, “Oh, well, I want to reduce this risk or eliminate this risk,” there are a thousand things that we could do. Well, let’s have access controls for AWS IAM. At first glance, we have this grand idea of, well, we could build a service and have a user interface and then store all this config information, and we’ll have a database of the current permissions. Then we’ll store the diffs. And then every time someone makes a request, we can decide which things… All of that is really complicated, so let’s start with basic things. Let’s start with the things that are easiest to implement. Is there something that we could implement that would take a few hours or a couple of days, rather than something that’s going to take weeks of work from a team with a whole project built around it? We start with something that’s quick and easy and we get that quick win. And we build up from there, and we look across our stack and we say, “Well, here are the risks that we want to reduce. And here are quick wins that we can take.” And then we look from there: how do we build this up into something? Where do we want to spend our time on bigger projects beyond those quick wins?
Julie Gunderson: Fantastic. Bea, do you have anything to add?
Bea Hughes: Further to that, security has always been obsessed with the kind of ridiculous notion of a binary of security, that things must be 100% secure. And it will push for something going from 98% secure to 99% secure rather than trying to go from 0% secure to 50% secure. The industry is slowly realizing that that might not be the best approach, and that having a security win not be perfect is better than pretending that you can get to 100%. Certainly there’s the battle for everyone having impossible-to-guess passwords, which kind of ignores phishing. But then you just turn on 2FA by something… Like turning on SMS 2FA, and suddenly now they need to compromise your phone or your phone provider, which is doable, rather than just get your password. It’s a step up. And people have often said that storing 2FA codes in 1Password means it’s not 2FA, but it changes the threat model from “you must not compromise one 1Password account” to “you must steal their password off the wire”, because it still has a notion of a second factor. It is not as secure as another device, but it is a step up. And if it means that we’re now having 2FA, I will have that every time.
Julie Gunderson: Fantastic. Now, there are two things that we ask every guest on this show. Starting with Bea, what’s one thing you wish you would’ve known sooner when it comes to running software in production?
Bea Hughes: I think most things, because I seem to take very strong opinions and then six months down the line find that they’re entirely wrong and then spend years trying to backtrack on them. Maybe don’t listen to me on anything. I had a whole QueryCon talk about why I made the wrong decision with osquery and building our own intrusion detection system in house. That could have saved me years of work. Yeah, I guess there are many documented times I have been wrong, and I continue to be.
Julie Gunderson: Sarai, how about you? One thing you wish you would’ve known sooner.
Sarai Rosenberg: As a mathematician and insecurity professional, I have extremely low risk tolerance and I wish I had realized that I have very much lower risk tolerance than most security people or most software people in general. And I should just go for it, go try something, see if it works, see what happens if I break something, whatever I can fix it. As long as I have a plan to go backwards, then might as well try it.
Julie Gunderson: Now, is there anything about running software in production that you’re glad we did not ask you about today?
Sarai Rosenberg: How do we delete encryption keys or any data for that matter? The little hidden secret is we can’t. We really can’t.
Julie Gunderson: How about you, Bea?
Bea Hughes: The beauty is it largely doesn’t matter. And as a mathematician that will drive you crazy, and as a security person you will be like, fine, I’ll eventually maybe concede some of that. But don’t buy an Intel CPU. I think my word of wisdom is, by and large, stop worrying about your crypto, because that’s probably not where you’re going to get owned in a security sense. If you use libraries, you’ll probably be fine, versus there will be a new bug in OpenSSL.
Sarai Rosenberg: Or as my partner, Sophie Smith, likes to say, cryptography problems are actually social problems. If you want to be able to encrypt anything, if you need to be able to protect your data, if you’re a journalist, it depends on your use case. For software, it doesn’t matter what kind of cryptography we use. But if you need to protect yourself from governments and nation-state attackers, if you’re a journalist, please find good encryption techniques, please find good encryption tools, and use them.
Julie Gunderson: Bea and Sarai, I want to thank you for being part of our security episode today. And for those of you out there listening, to view the training and learn a little bit about picking locks, you can go to sudo.pagerduty.com. To learn more about our incident response process here, you can go to response.pagerduty.com, and as always, join the community and be part of the conversation at community.pagerduty.com. Thank you for your time and for listening. This is Julie Gunderson. That does it for another installment of Page it to the Limit. We’d like to thank our sponsor PagerDuty for making this podcast possible. Remember to subscribe to this podcast if you like what you’ve heard. You can find our show notes at pageittothelimit.com and you can reach us on Twitter at @Pageit2theLimit, using the number two. Let us know what you think of this show. Thank you so much for joining us. And remember, uneventful days are beautiful days.
Sarai talks to us about how security is about assessing risk and deciding what you care about in your systems, what you want to protect, and what you want to do about these risks. Bea hops in to describe security as being about managing risk, and harm reduction as you can’t defend everything equally.
“You can’t secure everything, not everything in your company is the most important thing”
In debunking a common myth in security, Sarai talks to us about how there isn’t one solution that works for everyone:
“That there’s one solution that works for everyone, that what is a big risk for me for my product, for my software is going to be a risk for every software, the threats that I have to care about the things that I want to protect are completely different than the threats that you may want to protect and the things that you care about.”
Bea and Sarai talk about the values on the PagerDuty security team and why the team has those values.
Sarai talks about how at previous companies, there was naming and shaming of folks who failed things like phishing tests, and how at PagerDuty “we try to make it easy to bring in other people, never hesitate to escalate”.
Sarai talks to us about how while you can’t prevent phishing, there are ways to help people use their emails in a secure way without trying to trick employees so you can catch them and call them out.
Instead of creating a negative experience Sarai tells us that it’s ok to fail.
Bea talks to us about how people whose jobs involve risk, such as opening PDFs from strangers, are not at fault for accepting it. Instead, as a security team, you should “present this information in a secure way, not chastise them for doing their job”, and update what you’re doing to enable folks to do their jobs.
Sarai and Bea talk about how it is important to make security training fun for the participants, and how through the use of lockpicking you can “find things that people can use so they understand that security is about defense in depth”.
They go on to share with us how Security has a collective responsibility to be collaborative.
Bea expands on why lockpicking is a great example of the vulnerability of security but is part of a “layer of security”.
We talk about how PagerDuty’s Security team has a slightly different take on incident response because it is a mixture of traditional security response, which is responding to attackers or breaches, and more traditional DevOps incident response for the team’s own services, and how tuning alerting also comes into play.
Sarai: “It’s this cycle of tuning our alerts as we add more and more monitors”. Julie: “…and tuning is so important because eliminating the noise, reducing the noise; helps with alert fatigue, it makes life better for your engineers”.
Bea talks to us about how to reduce the noise, and how if you can get rid of unnecessary services you have less to secure.
Bea: “Step one, in buying an IDS, don’t; if you do by mistake, delete all the rules and then slowly add them in because every IDS rule set has way too many things to actually be useful”. Bea goes on to say how alerts should have outcomes and be high quality.
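As a rough illustration of the alert-tuning cycle discussed in the episode, here is a minimal Python sketch that flags rules which fire often but rarely lead to action. Everything in it is an assumption made for illustration: the `AlertEvent` record, the rule names, and the thresholds are hypothetical and do not describe PagerDuty’s tooling or any particular IDS.

```python
from collections import Counter
from dataclasses import dataclass


@dataclass(frozen=True)
class AlertEvent:
    rule: str         # hypothetical rule name
    actionable: bool  # did a responder actually need to do something?


def noisy_rules(events, min_fired=5, max_actionable_ratio=0.2):
    """Return (rule, times_fired, actionable_ratio) for rules that fire
    at least `min_fired` times but are rarely actionable, noisiest first.
    The default thresholds are arbitrary starting points, not advice."""
    fired = Counter(e.rule for e in events)
    acted = Counter(e.rule for e in events if e.actionable)
    report = []
    for rule, count in fired.items():
        ratio = acted[rule] / count  # Counter yields 0 for unseen rules
        if count >= min_fired and ratio <= max_actionable_ratio:
            report.append((rule, count, ratio))
    return sorted(report, key=lambda row: row[1], reverse=True)


# Made-up sample data: two noisy rules and one rare, high-signal rule.
events = (
    [AlertEvent("invalid_login_attempt", False)] * 40
    + [AlertEvent("invalid_login_attempt", True)] * 2
    + [AlertEvent("secret_store_unreachable", True)] * 3
    + [AlertEvent("cron_email", False)] * 12
)

for rule, count, ratio in noisy_rules(events):
    print(f"{rule}: fired {count}x, actionable {ratio:.0%}")
```

Rules that show up in this report are candidates for deleting, filtering, or re-windowing; a rule that fires rarely and is usually actionable never appears, which is the kind of high-signal alert Bea argues for keeping.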
Sarai discusses how quick wins are great: there are so many risks out there, so let’s look at one particular topic. “Start with basic things and the easiest things to implement”. By starting with something quick and easy, we get that quick win.
Bea talks about how having a security win not be perfect is better than pretending that you can get to 100%.
Sarai tells us how she wished she realized that she has a lower risk tolerance than other security and software people in general and that she should: “Just go for it, go try something, see if it works, see what happens if I break something. As long as I have a plan to go backwards then might as well try it”.
Bea brings us the closing wisdom of “stop worrying about your crypto because that’s probably not where you are going to get owned in a security sense”, and Sarai says: “Cryptography problems are actually social problems… it depends on your use case”.
Bea has been frustrated at Linux’s IP blocking tools for over 20 years now, and is just waiting to see what nftables is replaced by.
Bea likes shouting about threat models a lot, and trying to convince people that their primary concern is probably not the NSA and that DNSSEC should be put out to pasture.
She is more opinionated about coffee.
Sarai Rosenberg is an Insecurity Princess working in Security Engineering at PagerDuty. She is a professional mathematician who has been working in security since 2015. Sarai has spoken on threat modeling, recommendation algorithms, and psychological safety, as well as being an advocate for compassion, mentorship, and elliptic curve cryptography.