Episode 104: Cryptography Demystified
This week, the Wordfence team discusses cryptography in depth, including the basics, a brief history, hashing, and the Crypto Wars. We also go over current news, including two new findings by the Wordfence Threat Intelligence team, a new milestone for WordPress, and a recent attack on a Florida town’s water supply.
Here are timestamps and links in case you’d like to jump around, and a transcript is below.
0:17 New findings by the Wordfence Threat Intelligence team
1:08 New Milestone for WordPress
2:40 An attack on a Florida Town’s water supply
5:52 Introduction to Cryptography
7:49 History of Cryptography
13:45 The Crypto Wars
24:30 Hashing
37:26 Symmetric Cryptography
39:26 Asymmetric Cryptography
Find us on your favorite app or platform including iTunes, Google Podcasts, Spotify, YouTube, SoundCloud and Overcast.
Episode 104 Transcript
Ram:
Hello, and welcome to Think Like a Hacker, the podcast about WordPress security and innovation. I’m Ram Gall, threat analyst and QA engineer at Wordfence, and with me is our CEO Mark Maunder. How’s it going, Mark?
Mark:
Pretty good, Ram. How are you doing?
Ram:
Not too bad. It’s been a long day, but I feel like it’s been pretty productive.
Mark:
Yeah. Didn’t you just find a bug?
Ram:
Actually, Chloe and I both published a couple of critical vulnerabilities in some fairly popular plugins. Chloe found a couple of file upload vulnerabilities that could be used for remote code execution in the Responsive Menu plugin, and a CSRF settings update. I also found a couple of CSRF… That stands for cross-site request forgery. It’s where you trick someone into clicking a link, and then the attacker can basically make them do whatever they want on their own site.
Mark:
Very cool. Vulnerabilities, which are really just celebrity bugs, right?
Ram:
Pretty much. Yes. I mean, they’ve gotten a decent amount of coverage. Mine was in NextGEN Gallery, which is a fairly popular image plugin for WordPress. It also allowed file upload, but yeah, I think we’re doing some good research. We’re keeping up the pace in the new year.
Mark:
Yeah. Most definitely.
Ram:
Something else that came up is that WordPress now officially accounts for 40% of the top 10 million sites on the internet, according to a new study by W3 Technology Surveys.
Mark:
I remember, I think it was earlier today, our team was chatting about that. I was just remembering what that 40% number means. I guess it’s the top 10 million as defined by… Was it Alexa combined with some other lists or cross-referenced?
Ram:
Yeah. Yeah. It was mostly the Alexa top 10 million. They’ve been keeping track of internet statistics for quite some time now, so.
Mark:
Yeah, but I mean, Alexa, not to go and excavate this from ancient history, but they used to have really accurate traffic on sites based on a browser tool bar that was in… I think it was like back in IE6 or something, and then those toolbars went away, and somehow they still have data. I’m just curious how accurate it is.
Ram:
Apparently, they’re fairly well-trusted by most of the organizations that do internet research, but it would be fun to look into at least.
Mark:
Yeah, for sure. One of my favorite resources is BuiltWith that we’ve been playing with, and I think you might’ve as well.
Ram:
Yeah. Yeah. I’ve played with BuiltWith a little bit. It has given us some interesting data on some topics that we’ve been researching, like nulled plugins.
Mark:
Definitely.
Ram:
But we don’t want to get too far off track, but we’re going to be discussing one of the most important technologies to the internet’s continued existence today. We’re going to be following up on our Wordfence Live show this week. Just wanted to cover one of the biggest things in the news this week, which is that an attacker gained access to a computer controlling the water treatment plant for Oldsmar, Florida. I guess they tried to dump a bunch of lye into the water supply.
Mark:
Yeah, I commented about that on Twitter, but I don’t know about you, Ram, but I don’t remember a kinetic attack that targeted civilians that has had this much impact. When I say civilians, I’m thinking of Natanz, the Iranian uranium refinement plant that was targeted by the U.S. military cooperating with the Israelis. I also think about the Israeli F-15s that targeted a facility after the Israeli cybersecurity team disabled the radar system that would have detected the attack.
Mark:
Those things have happened, and they’re kinetic and they’re exciting and that kind of thing, but this is hackers going after a civilian facility dumping orders of magnitude more lye in the water supply than should be there and an operator seeing it actually happen and catching it just in time, which is very lucky, I think.
Ram:
Yeah, well, they were using TeamViewer, weren’t they? Apparently, this was a fairly low-sophistication attack. They believe that it was credential stuffing that led to it.
Mark:
Oh, wow. No, I didn’t read the details on this. What is TeamViewer exactly?
Ram:
TeamViewer is basically a remote desktop management application, something where if you want to help someone out on their computer, you can get an invite to take actions on their computer using TeamViewer. It’s a lot like the Remote Desktop that Microsoft bundles with Windows, or with Windows Server.
Mark:
Some attacker… and you said it was credential reuse?
Ram:
That’s the running theory. Yes.
Mark:
Oh, wow. Oh, man. I think that’s by far the most common vector for just about everything these days is password reuse, something gets breached, the list of hashed passwords gets out on the internet, someone cracks the list or as many of the hashes as they can, they make it available to everyone else, and you’ve got the username/password combination, and the username’s usually an email address, and so once that’s out there, they just reuse that credential somewhere else and get in, and Bob’s your uncle, huh?
Ram:
Pretty much. Yeah. That’s one of the reasons why cryptographic salts are so important, which is something we are actually going to get to a little bit. Apparently, the operator caught it because the attacker moved the mouse. That’s kind of eerie if you see your mouse moving on your screen and you’re not the one moving it. I feel like that speaks to the relatively low sophistication level of the attacker.
Mark:
I just hope it wasn’t some kid. Was this done in Florida where this happened?
Ram:
Yeah, this was in Florida.
Mark:
Oh, man. There’s something about Florida. We have a mutual friend who you know of, and we’ll chat about that after the podcast, but he got prosecuted when he was much younger for making silly choices and going after government websites. I hope this isn’t the same thing, some teenager who’s gone exploring and made some really bad choices because it’s not going to turn out well for them.
Ram:
Definitely not, especially not with something that could have impacted that many civilians.
Mark:
Yeah. Yeah. For sure. All right. Should we talk crypto?
Ram:
Yeah, let’s talk crypto. I guess the question is what is cryptography?
Mark:
Just some background here, as Ram was saying, we had a really nice chat on Wordfence Live, which feels like it was last week for both of us because so much has happened since then, but it was actually yesterday morning on Tuesday morning. We’re recording this on a Wednesday evening, February 10th. Wordfence Live is every Tuesday around… Is it 9:00 a.m. Pacific, Ram?
Ram:
I know that it’s noon Eastern, so.
Mark:
Yeah. Yeah. All right, so 9:00 a.m. Pacific, we did a segment yesterday on cryptography, and we wanted to do that on the podcast as well. I’m going to condense this so that it’s a little bit less in-depth and kind of like a compact introduction to cryptography for most folks. If you’re a little more technical, I suspect there’ll be some fun items in here for you as well.
Mark:
But my goal here is really to get our listeners thinking about cryptography in a way where they have an understanding of some of the basics. By the basics, I really mean where cryptography comes from, why it’s useful, the history of the Crypto Wars, what is a hash, what is symmetric cryptography, what is asymmetric cryptography, what is key escrow and why is it problematic? Then we’ll talk about the future.
Mark:
If you are less technical and you’re listening to this, I’m going to make this as accessible as I possibly can. Some of those words that I threw out might sound a little bit intimidating, but they’re not. They’re actually fairly easy concepts, and you don’t need to be a mathematician to have some fun with this stuff. I think once you’ve got a basic understanding of what these various things are, you can then listen to news reports about the evolution of policy with regards to privacy and cryptography and even things like blockchain with more of an educated ear. That’s my goal here. I suspect that’s yours too, Ram.
Ram:
That is. Should we dive into the history a little bit?
Mark:
Yeah. Why not? I guess we can go back to the Caesar cipher, which is one of the earlier forms of cryptography, and chat about that. The Caesar cipher, it’s a really basic form of crypto that is just shifting letters by a fixed number, so-
Ram:
ROT13, right? The thing-
Mark:
Exactly.
Ram:
… that people always used to post spoilers on the internet.
Mark:
Yeah, exactly. Now, in the case of ROT13, what they’ve done is they’ve said every letter becomes the letter that is 13 places away. I think A is probably… Is it M or N? B as the next letter on after that. Maybe A becomes M, B becomes N, and so on. When you get the message, all you do is you reverse the process. You shift it 13 letters back or two letters back or whatever you’ve agreed on. What’s interesting about that kind of algorithm is that it’s cryptography where you have a shared algorithm, and all an algorithm means is just a piece of logic.
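The letter shifting Mark describes can be sketched in a few lines of Python. This is a minimal illustration, not code from the episode; the function name `caesar_shift` and the sample message are assumptions made for the example. Note that with a shift of 13, A maps to N and B maps to O, and because the alphabet has 26 letters, applying ROT13 a second time gets you back to the original text.

```python
import string


def caesar_shift(text: str, shift: int) -> str:
    """Shift each ASCII letter by `shift` places, wrapping around the alphabet.

    Non-letters (spaces, digits, punctuation) are passed through unchanged.
    """
    result = []
    for ch in text:
        if ch in string.ascii_uppercase:
            result.append(chr((ord(ch) - ord("A") + shift) % 26 + ord("A")))
        elif ch in string.ascii_lowercase:
            result.append(chr((ord(ch) - ord("a") + shift) % 26 + ord("a")))
        else:
            result.append(ch)
    return "".join(result)


if __name__ == "__main__":
    secret = caesar_shift("Attack at dawn", 13)   # sender shifts forward by 13
    print(secret)
    print(caesar_shift(secret, -13))              # receiver shifts back by 13
```

Both sides only need to agree on the shift. That is also the weakness: with 25 useful shifts, anyone can try them all by hand.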
Ram:
It’s just a way of doing stuff. It’s the way of performing operations, right?
Mark:
I like that. I actually like that better than a piece of logic. It’s a way of doing something. In early cryptography, it was a shared algorithm. I’m shifting it forward by 13 spaces, and you’re shifting all the letters back by 13, for example. Then you had other early forms of cryptography, like the scytale where you had a… It’s kind of like a belt, if you can imagine like a belt that you would use to hold your jeans up when you’ve lost too much weight. You wrap a belt around a cylinder so that it covers the whole cylinder, and then you write your message on that. When you unwrap it, what you’ll see is a bunch of letters when you’re looking at the belt, and it doesn’t really make much sense. The way you decode it is you just take the belt and you wrap it around a cylinder of the same diameter, and you can end up actually seeing the message.
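The belt-and-cylinder idea can be modeled as a simple transposition. The sketch below is a simplified model, not a historical reconstruction: it assumes the cylinder's diameter can be represented by `faces`, the number of letters that fit around it, and it pads short messages with underscores. The function names are made up for the example.

```python
def scytale_encrypt(message: str, faces: int) -> str:
    """Write the message along the rod, then read it off the unwrapped belt.

    `faces` (an assumption of this model) is how many letters fit around the
    rod. The message is written in `faces` rows along the rod; unwrapping the
    belt reads the letters column by column.
    """
    padded = message + "_" * ((-len(message)) % faces)  # fill the last row
    cols = len(padded) // faces
    return "".join(padded[r * cols + c] for c in range(cols) for r in range(faces))


def scytale_decrypt(belt: str, faces: int) -> str:
    """Re-wrap the belt around a rod with the same `faces` and read the rows."""
    cols = len(belt) // faces
    rows = "".join(belt[c * faces + r] for r in range(faces) for c in range(cols))
    return rows.rstrip("_")  # drops padding (and any real trailing underscores)


if __name__ == "__main__":
    belt = scytale_encrypt("HELPMEIAMUNDERATTACK", 4)
    print(belt)                       # scrambled letters on the unwrapped belt
    print(scytale_decrypt(belt, 4))   # same rod size recovers the message
    print(scytale_decrypt(belt, 5))   # wrong rod size gives gibberish
```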
Mark:
Again, it is a shared algorithm that the two people have. I know that I need to wrap the belt around a cylinder of 100 millimeters in diameter and you receive the belt that the courier has transported across the country on their horse avoiding the enemy, and you know that you need to wrap it around a cylinder of 100 millimeters in diameter. We share that logic, and there you go. One of the first kinds of cryptography where it was no longer a shared algorithm or a shared piece of logic, but they actually separated the logic or the algorithm from the key was… What the heck’s it called, Ram?
Ram:
Vigenère cipher. Yeah. That was the one where they basically used every Caesar cipher in a sort of table, right?
Mark:
Exactly. Yeah. Yeah. Vigenère was the first cryptographic algorithm or method that had logic. It also had a shared key between the sender and the receiver. The key is you plug that into the table, as Ram was saying, and you can then decode the message. What’s interesting about that is it begins to introduce this concept of Kerckhoffs’s principle, which is that for any good cryptographic algorithm, that algorithm should… You should be able to make that completely public and make it available to your adversary, your enemy, or the other team, and it doesn’t matter if they have the algorithm as long as they don’t have the key.
Mark:
That’s the importance in cryptography of separating the algorithm or the logic from the key. Kerckhoffs’s principle is used still today by the NSA, and the inside joke at NSA is that if we produce a device that does cryptography, the very first device, the one with serial number one is sent to the Kremlin. What they’re really doing is just illustrating this idea that you can have the device or the algorithm or the logic as long as you don’t have the key. If you have the device or you understand the logic or the algorithm or the math, it doesn’t help you as long as you don’t have the key to unlock the cryptography.
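To make the split between algorithm and key concrete, here is a minimal Vigenère sketch. The code itself is the public "algorithm" and can be handed to anyone; only the `key` argument is secret. The function name and the restriction to an alphabetic key are assumptions for this example, and Vigenère is shown only as a teaching cipher, since it is easily broken today.

```python
def vigenere(text: str, key: str, decrypt: bool = False) -> str:
    """Apply a Vigenère cipher: each key letter selects a Caesar shift.

    The key is assumed to contain only the letters A-Z. Text is upper-cased;
    characters outside A-Z pass through and do not consume a key letter.
    """
    shifts = [ord(k) - ord("A") for k in key.upper()]
    out = []
    used = 0  # how many key letters have been consumed so far
    for ch in text.upper():
        if "A" <= ch <= "Z":
            shift = shifts[used % len(shifts)]
            if decrypt:
                shift = -shift
            out.append(chr((ord(ch) - ord("A") + shift) % 26 + ord("A")))
            used += 1
        else:
            out.append(ch)
    return "".join(out)


if __name__ == "__main__":
    ciphertext = vigenere("ATTACKATDAWN", "LEMON")
    print(ciphertext)
    print(vigenere(ciphertext, "LEMON", decrypt=True))   # right key
    print(vigenere(ciphertext, "MELON", decrypt=True))   # wrong key
```

Knowing this function does not help an eavesdropper who lacks the key, which is the point of Kerckhoffs's principle.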
Ram:
Most of the really strong ciphers we use today are very public, aren’t they?
Mark:
Yeah.
Ram:
Like I’m sure the NSA has a few in their back pocket that they’re not sharing, but most of the ones that are considered extremely secure, the algorithms are all public.
Mark:
Exactly, exactly. I mean, it used to be that they tried to keep these things trade secret. You know, RC4 was actually a trade secret until it was leaked on a cypherpunk forum. Cypherpunks are hackers that are interested in cryptography. But RC4 was actually leaked and it eventually became, you know, public knowledge and so on.
Mark:
But these days when cryptographic algorithms are developed, they’re generally made very public based on Kerckhoffs’s principle and debated to death among mathematicians and pounded upon until they’re like, okay, we’re at a good place where it doesn’t matter who knows the algorithm. As long as they don’t have the key, it’s as secure as we need it to be. I’m not using the word unbreakable on purpose for reasons that I’m- I’m, Ram, I’m sure you understand.
Ram:
Schneier said that usually when the NSA tries to break these things, they don’t try to actually break the math. They just try to break the implementation. Right? And that’s, that’s-
Mark:
Yeah.
Ram:
… where it’s usually easiest to go wrong. It’s much easier to find some sort of side channel of like, oh, hey, this chip leaks way more power when, you know, the first few bytes of the key are zero-
Mark:
Yeah.
Ram:
That kind of thing.
Mark:
Yeah, exactly. I think sometimes the math is vulnerable. But most often you’re able to target cryptographic algorithms using side channel attacks much more successfully.
Mark:
All right. So we’ve chatted about the, the sort of brief history of cryptography. I’m going to bring us into roughly the seventies.
Ram:
The height of the Cold War.
Mark:
Right? So around the 1970s, in my opinion, and other folks will disagree with me, what we call the Crypto Wars started. And that’s where the US government, and the NSA in particular, were doing their best to make cryptography unavailable to the general public and to their adversaries, like other countries and so on. And, you know, we’re obviously at the height of the Cold War there. And so the big adversary for the US was Russia.
Mark:
And so one of the events that I think clearly illustrates the Crypto Wars is that IBM in the 1970s was developing DES, or the Data Encryption Standard, and they wanted to use a 128-bit key. Now in cryptography, the longer the key you use, generally, the more secure the cryptography is, the harder it is to break. And so IBM wanted to use a 128-bit key.
Mark:
And when I say 128 bits, if you don’t know binary, it just defines the length of a number. And you can have numbers in binary or hex or octal or decimal. In most cases, decimal is what we’re used to. But in binary, a 128-bit key is just a number that is 128 binary digits long.
Mark:
And so IBM wanted to use a 128-bit key. And the NSA started lobbying them to only use a 48-bit key because they wanted the Data Encryption Standard that IBM was developing to be more easily breakable. They didn’t want secure cryptography to be available to the general public and to IBM customers, including customers that were overseas. And what they settled on is a key length of 56 bits, which was vulnerable at the time to brute force attack, and to being broken. And so that’s one of the earlier events.
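The gap between these key lengths is easier to appreciate as arithmetic: every extra bit doubles the number of keys an attacker has to try. The sketch below compares the key sizes mentioned in the episode. The guess rate of one trillion keys per second is purely an assumption chosen for illustration, not a measurement of any real hardware.

```python
def keyspace(bits: int) -> int:
    """Number of possible keys for a key that is `bits` binary digits long."""
    return 2 ** bits


def seconds_to_exhaust(bits: int, guesses_per_second: float) -> float:
    """Worst-case time to try every key at a given (assumed) guess rate."""
    return keyspace(bits) / guesses_per_second


if __name__ == "__main__":
    ASSUMED_RATE = 1e12  # hypothetical attacker: one trillion guesses/second
    SECONDS_PER_YEAR = 365 * 24 * 3600
    for bits in (40, 48, 56, 128):
        seconds = seconds_to_exhaust(bits, ASSUMED_RATE)
        print(f"{bits:>3}-bit key: {keyspace(bits):.3e} keys, "
              f"{seconds:.3e} s (~{seconds / SECONDS_PER_YEAR:.3e} years)")
```

Going from 56 bits to 128 bits does not make the search a bit more than twice as hard; it multiplies the work by 2 to the power of 72.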
Mark:
And if we fast forward to 1991, you saw something similar happen where Phil Zimmermann released Pretty Good Privacy, which was an implementation of public key encryption. And I’m going to explain what that is in a few minutes. But what that did is it made very strong public key encryption available to the general public. And he actually used 128-bit encryption. If you remember, the longer the key, the more secure it is.
Mark:
And they actually went after Phil and they tried to prosecute him. And eventually they kind of dropped the case. But what happened was Phil was not able to legally export Pretty Good Privacy because it fell under the ITAR regulations at the time, which were the regulations that govern the export of munitions. And strong cryptography was considered a munition by the US government, the same as missiles and so on.
Mark:
And so he could only export a weak version of PGP. But the way they got around it eventually was Phil partnered with the MIT Press. And they actually printed out the entire source code of PGP as a hardcover book. And you could buy that book and you could take it overseas. And the reason you could do that is because it was protected speech under the First Amendment of the US Constitution. And so they actually managed to export it that way. Someone took the book overseas, they yanked off the cover and they used optical character recognition to scan the code back in and they compiled it. And they had strong encryption that had legally been exported from the US. And that was around ’91.
Mark:
And then just a little, little later that decade, around the mid nineties, Netscape implemented SSL in their browser. And they, because of the ITAR regulations, they had to provide weak cryptography to international users of their browser.
Ram:
That seems like a terrible idea for e-commerce.
Mark:
Right? So Netscape only allowed 40 bit encryption internationally. And you could have 128 bit encryption if you’re in the US. And what the- what happened is Verisign, at the time, was selling all of the SSL certificates. And they were selling 128 bit certificates to US customers and 40 bit for the rest of the market. And a little company in South Africa, where I’m- where I come from called Thawte, which was started by Mark Shuttleworth, started selling 128 bit SSL certificates. Because they could, because they were not in the US, and they were not governed by the ITAR restrictions.
Mark:
And so Mark and, and Thawte cornered the, the other half of the global market in SSL certificates and Thawte ended up owning 50% of the market and Verisign owned the other 50%, which was in the US. And eventually the ITAR restrictions were, were lowered. Verisign bought Thawte for just under 600 million US dollars. And Mark became a very wealthy person and became the second space tourist ever. And then used his hundreds of millions to launch Ubuntu, which is a Linux variant. That is, I think, Ram, the most popular Linux flavor these days?
Ram:
Uh, by quite a bit, I believe. Yes.
Mark:
Yeah. And Ubuntu’s actually brilliant. I mean, everyone uses it, who, who is, uh- everyone sensible uses it. (laughs) No, I’m just kidding. I’m going to get, I’m going to get killed by some of our listeners. Cause there’s a, there’s a lot of other great Linux flavors out there, including Kali, which is used by us penetration testers and security researchers. So lots to choose from. But Ubuntu is, is huge and has made a huge contribution.
Mark:
And that’s the, that is how Ubuntu came about. And you can thank the, the US government and their restrictions on exporting strong cryptography for helping bootstrap Ubuntu Linux into existence.
Mark:
And then, you know, we’re still chatting about the Crypto Wars, right? I mentioned IBM in the seventies, you know, Netscape and the story of Mark Shuttleworth and Thawte and so on. Well, the Clipper chip was developed by the National Security Agency in the 1990s. And that included the, Ram, I think you told me about the Skipjack algorithm.
Ram:
Yeah. I remember there being a controversy about the Clipper chip as well.
Mark:
Yeah. So Clipper included what they call a key escrow, which is this absurd idea that the- a government, in this case, the US government, should hold a key that is a back door to a certain kind of cryptography. And so Clipper was supposed to be used in all phones, cell phones, landlines, that kind of thing. And it would give you the illusion of secure communications. When in fact, someone out there has a key. In this case, it’s the US government. And the idea is that they’ll be able to keep that key secure.
Ram:
They would never ever lose it. Right? That has never happened.
Mark:
Right.
Ram:
Right?
Mark:
Right. Well, so if you know anything about the OPM breach, the Office of Personnel Management, which is a division of the US government, holds files on everyone in the country who has clearance.
Mark:
Clearance, if you’re not a, a US person means that you can look at secret government stuff. And there’s various levels. There’s secret, top secret, top secret SCI and so on. And so if you work on a military base in this country, you generally have clearance of some kind, and there’s a process that you go through. They attach a polygraph to you and they ask you a bunch of awkward questions. And there’s a process called adjudication where you’re supposed to tell them that you’re, you know, an alcoholic who dances every midnight when it’s a full moon around the streets naked. And you’ve done all this other naughty stuff. And that way they’ve got all your naughty stuff on file. And no one can blackmail you saying, well, I’m going to tell everyone about the naughty stuff, unless you tell me the government secrets.
Mark:
And so OPM had that data on file along with biometric data for all of these folks with clearance, like fingerprints and so on. And that was breached. And it’s an absolute disaster. In my opinion, it’s one of the most important breaches in the history of this country, anyway, because you don’t get a lot of that data back. And you can’t change a lot of that data. You know, if it was passwords that were breached, sure. You know, change your passwords, no big deal. Biometric data is a password that you can never change unless you somehow are able to change your fingerprints or your, your iris. Uh-
Ram:
It should be a username really.
Mark:
(laughs) Right. And yeah, that’s an interesting idea actually. And then, you know, the adjudication data is obviously very, very sensitive data and, and that’s now out there. And so if, if they couldn’t protect the OPM’s data, the idea that they would be able to protect a key that has- that they’re storing under key escrow is utterly absurd.
Mark:
And it just really highlights the need for strong cryptography that isn’t back-doored in my humble opinion. Ram, what do you think?
Ram:
I definitely agree. I feel like back door crypto is a disaster waiting to happen because as soon as that happens, that escrow key is going to become the number one target of literally every adversary and every cybercrime syndicate, and everyone who has good hackers who-
Mark:
Right.
Ram:
… aren’t interested in the public good.
Mark:
Yep. That’s right. Giant bullseye. And so, you know, the Crypto Wars are still going on. We have a, an act that is the EARN IT Act of 2020, and it provides for a 19 member national commission, which will develop a set of “best practice guidelines” to which technology providers will have to conform in order to “earn immunity” to liability for child sexual abuse material that’s been posted by their users on their platforms.
Mark:
And the thing about that is that earning immunity probably means providing back doors into things like end to end encryption if you’re providing a service like WhatsApp or Signal. And WhatsApp is relevant because it’s owned by Facebook. And so if Facebook wants to “earn immunity” from people posting illegal material to Facebook, they would have to conform to the EARN IT act and potentially back door WhatsApp.
Mark:
And just to be clear, traditionally, content providers, social media platforms, that kind of thing have had automatic immunity thanks to Section 230 of the Communications Decency Act. And so they’re kind of like rolling that back and providing a way for the US government to put tremendous pressure on platforms and social networks and so on to back door their cryptography with the threat of well, we’ll prosecute you for child pornography content that is posted onto your platform.
Mark:
And, and so the way that they framed this is this kind of choice between the safety of children or strong cryptography. And it’s a false, it’s a false choice in my opinion.
Ram:
There are very, very, very many legitimate uses for strong cryptography. Honestly, I feel like it’s a much stronger argument, even than the old VCR debate. If you recall, the main reason VCRs were allowed to have a record functionality is that there were legitimate use cases that weren’t copying copyrighted material, even though that’s what most people use them for.
Mark:
Yeah, yeah-
Ram:
Whereas with cryptography, almost all of the use cases are legitimate, and the illegitimate uses are edge cases.
Mark:
Yeah, exactly. I would say it’s the inverse of that; I totally agree. All right, so let’s dive into some educational stuff over here-
Ram:
Yeah, you were going to talk about hashing, right? Which is pretty big on the blockchain. It’s kind of the thing that makes that work, but-
Mark:
Yeah, so let’s start off with hashing as you suggested Ram. It’s one of the easier concepts. A hash is simply a machine that you can feed data into of any length, it can be just a single byte or it can be a petabyte, which is a lot. And at the other end, it’ll spit out a fixed-length number that’ll uniquely identify that data. If you feed the same data in again, you get exactly the same number, and that’s it. Now you know what hashing is.
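Those two properties, fixed-length output and the same output for the same input, are easy to see with Python's standard `hashlib` module. This is a small sketch; SHA-256 is our choice for the example rather than something named at this point in the episode, and the helper name `digest` is made up. One caveat worth keeping in mind: because the output has a fixed length, "uniquely identifies" is true in practice for a good hash rather than in a strict mathematical sense.

```python
import hashlib


def digest(data: bytes) -> str:
    """Return the SHA-256 hash of `data` as a hexadecimal string."""
    return hashlib.sha256(data).hexdigest()


if __name__ == "__main__":
    tiny = b"a"                  # a single byte
    large = b"a" * 10_000_000    # roughly ten megabytes

    # Whatever the input size, the output length is the same.
    print(len(digest(tiny)), len(digest(large)))

    # Same input, same hash. A one-character change gives a different hash.
    print(digest(b"hello") == digest(b"hello"))
    print(digest(b"hello") == digest(b"hellp"))
```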
Mark:
And hashing is super useful for all kinds of things. I was teaching my colleague in our film department the other day about hashing, because what he can do is take a huge film, a complete film that is multiple gigabytes, run it through a hashing algorithm on his Apple computer, which includes the MD5 hashing algorithm, he can get a hash that represents the data. He can then send that huge file via courier, or via network, or via pigeon to the visual effects company that he’s going to be collaborating with, and when they get it, he can also just email them the hash, they’ll run it through the same algorithm, and if it doesn’t generate the same hash, it means the data has been changed in transit; it’s been corrupted, or someone has added a scene into the film that we don’t want-
Ram:
Or the NSA has tampered with the film.
Mark:
Right. For some reason, I thought of Fight Club. If they had hashing then Tyler Durden wouldn’t have been able to put his little hidden frame into the films, but let’s not go there. This is a family podcast, and we probably shouldn’t go into that territory.
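The film hand-off Mark describes can be sketched as follows. This is an illustration under a few assumptions: the file name `film.bin` is hypothetical, a small temporary file stands in for a multi-gigabyte film, and SHA-256 is used in place of the MD5 mentioned in the episode. The file is read in chunks so that a very large file never has to fit in memory.

```python
import hashlib
import os
import tempfile


def file_digest(path: str, chunk_size: int = 1024 * 1024) -> str:
    """Hash a file of any size by feeding it to SHA-256 one chunk at a time."""
    hasher = hashlib.sha256()
    with open(path, "rb") as handle:
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            hasher.update(chunk)
    return hasher.hexdigest()


def demo() -> tuple:
    """Simulate sending a file, then receiving it intact and tampered with."""
    with tempfile.TemporaryDirectory() as tmp:
        path = os.path.join(tmp, "film.bin")      # hypothetical stand-in file
        with open(path, "wb") as handle:
            handle.write(b"frame" * 1000)
        sent_hash = file_digest(path)             # sender emails this value

        arrived_intact = file_digest(path) == sent_hash

        with open(path, "ab") as handle:          # someone splices in a frame
            handle.write(b"extra frame")
        tampering_detected = file_digest(path) != sent_hash
        return arrived_intact, tampering_detected


if __name__ == "__main__":
    print(demo())
```

The check is only as trustworthy as the channel the hash travels over: if an attacker can change both the file and the emailed hash, the comparison still passes.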
Mark:
But that’s the basics of what hashing is. And it’s used in cryptographic algorithms or crypto systems all over the place to uniquely identify data. And hashing is also used for authentication. So when you register on a website, what it does is you enter your password, it creates a hash of that password to uniquely identify it, and it stores that hash. And the next time you sign in it just hashes the password that you enter again-
Ram:
But not with MD5 anymore?
Mark:
Well hold on, you’re getting ahead of me here. So it hashes your password again, and compares that output with the hash that it’s stored, and if the two match, then it lets you in. And the benefit of doing that is you don’t have to store a plain text password in the database anymore. You can now just store a representation of that password.
Mark:
And the thing about hashes is it’s very difficult, and in some cases impossible, to reverse that hash back into the original data. And so that’s the basic idea there.
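Here is a deliberately simplified sketch of the register-and-sign-in flow described above. The in-memory `users` dictionary and the function names are hypothetical, and a single unsalted SHA-256 is used only to keep the idea visible; as the conversation goes on to explain, real systems should use a salted, deliberately slow password hash instead.

```python
import hashlib
import hmac

# Hypothetical stand-in for a database table: username -> stored hash.
users = {}


def hash_password(password: str) -> str:
    """Illustration only: a fast, unsalted hash is NOT suitable for real use."""
    return hashlib.sha256(password.encode("utf-8")).hexdigest()


def register(username: str, password: str) -> None:
    """Store only the hash; the plain text password is never saved."""
    users[username] = hash_password(password)


def login(username: str, password: str) -> bool:
    """Hash the submitted password and compare it with the stored hash."""
    stored = users.get(username)
    if stored is None:
        return False
    # compare_digest avoids leaking information through comparison timing.
    return hmac.compare_digest(stored, hash_password(password))


if __name__ == "__main__":
    register("alice", "correct horse battery staple")
    print(login("alice", "correct horse battery staple"))
    print(login("alice", "wrong guess"))
    print(users)  # only the hash is stored
```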
Mark:
Now, if you wanted to crack passwords, because you’ve managed to, let’s say hack into some WordPress website and you’ve downloaded all the hashes, and you want to turn them into passwords, well the way you do it is you just take a whole bunch of words, turn them into hashes, compare the list of hashes you came up with against the list of hashes that you stole, and for the ones that match, well now you know what the password is because you used that password to create that hash, and it matches a hash in the database, and so it must be the same word that they’ve used.
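The matching process Mark describes is simple enough to show directly, which is exactly why weak passwords fall so quickly. The sketch below assumes unsalted SHA-256 hashes and a tiny made-up word list; real attackers use far larger lists, and this is not a model of how WordPress itself stores passwords.

```python
import hashlib


def sha256_hex(word: str) -> str:
    """Hash a candidate password the same way the (assumed) site did."""
    return hashlib.sha256(word.encode("utf-8")).hexdigest()


def dictionary_attack(stolen_hashes, word_list):
    """Return {hash: password} for every stolen hash found in the word list."""
    lookup = {sha256_hex(word): word for word in word_list}
    return {h: lookup[h] for h in stolen_hashes if h in lookup}


if __name__ == "__main__":
    # Pretend these two hashes came from a breached database.
    stolen = [sha256_hex("sunshine"), sha256_hex("t7#Qz!mV0p^Lw2@xK9rB")]
    words = ["password", "letmein", "sunshine", "qwerty"]  # made-up word list

    cracked = dictionary_attack(stolen, words)
    print(f"cracked {len(cracked)} of {len(stolen)}:", list(cracked.values()))
```

The dictionary word is recovered immediately, while the long random password never appears in any word list and stays unknown.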
Mark:
And so this begins to help you to understand why you want to choose a long password that is not an English word, or made up of English words, and that contains random characters, and that those characters should be upper and lower case, and numbers, and some symbols. Because if I’m trying to crack your password by throwing a dictionary at a hashing algorithm-
Ram:
You’re going to try the easy ones first, right?
Mark:
I’m going to try the easy ones first. And I’m going to try dictionary words first. If I have to do, let’s say 20 character passwords that are completely random, it’ll take me the rest of my natural life multiplied by a thousand to actually crack your password.
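A rough calculation shows why long random passwords are out of reach. The figures below are assumptions for illustration: 94 is the number of printable ASCII characters excluding the space, and ten billion guesses per second is a hypothetical attacker speed, not a benchmark.

```python
def search_space(alphabet_size: int, length: int) -> int:
    """Number of possible passwords of exactly `length` characters."""
    return alphabet_size ** length


def years_to_search(alphabet_size: int, length: int,
                    guesses_per_second: float) -> float:
    """Worst-case years to try every password at an assumed guess rate."""
    seconds = search_space(alphabet_size, length) / guesses_per_second
    return seconds / (365 * 24 * 3600)


if __name__ == "__main__":
    ASSUMED_RATE = 1e10  # hypothetical: ten billion guesses per second
    print(f"8 lowercase letters: {years_to_search(26, 8, ASSUMED_RATE):.2e} years")
    print(f"20 random printable: {years_to_search(94, 20, ASSUMED_RATE):.2e} years")
```

Under these assumptions an eight-letter lowercase password falls in well under a minute, while the 20-character random one takes vastly longer than any human lifetime.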
Mark:
Some hashing algorithms are a little more computationally-intensive than others, and those are better for this particular application, which is storing hashed passwords.
Mark:
Now, crazy story, right?
Ram:
Yeah, yeah.
Mark:
I’m going to go on a tangent over here Ram just for fun. This is not something we chatted about on Live yesterday, and I wish we had.
Mark:
Adler-32, does that sound familiar?
Ram:
Yeah, that’s the CRC, cyclic redundancy check algorithm, right?
Mark:
Yeah. And Adler-32 now, the reason I want to go on this tangent is I was saying that if you’re using a hashing algorithm for storing representations of passwords, you want to use something that does actually consume quite a few CPU cycles when you’re generating the hash, because it’s harder to crack. If you do a thousand guesses with that algorithm, it’s going to take a lot longer than with, say MD5, which is incredibly fast.
Mark:
But Adler-32 is funky. It was designed by Mark Adler. He worked at either JPL or NASA at the time, and it was designed for spacecraft. And that is an algorithm that’s designed to not be computationally intensive, because it’s running on a spacecraft that has a limited amount of power available. It’s running off solar cells, and you don’t want to consume a huge number of CPU cycles, and therefore watts when you’re using this algorithm.
Mark:
So it’s actually a very computationally-efficient algorithm. And Matt Barry, who’s our lead developer, our most senior developer here at Wordfence, he discovered a weakness in the wordpress.org infrastructure that would have potentially allowed an attacker to compromise the servers where you download WordPress from, and that send out the plugin updates and all that stuff. And the mistake that they had made is they were using Adler-32 for a certain, I think it was authentication or authorization step.
Ram:
Yeah. They let you choose your own algorithm, right?
Mark:
Right, that was it!
Ram:
And you could even choose Adler-32.
Mark:
Yes!
Ram:
And at that point you can just generate something else, it’s called a collision where you have two values that generate the same identical hash, which can only happen in really small, short hashes, right?
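A collision in Adler-32 is easy to demonstrate with Python's standard `zlib` module. The two inputs below were worked out by hand from the way the checksum is defined (two running sums over the bytes), and they are only a toy example; this is not the technique used in the WordPress.org research, which the episode does not describe in detail. Strictly speaking, any fixed-length hash has collisions somewhere; the difference is that for a strong, long hash nobody can feasibly find one, while for a 32-bit checksum it is trivial.

```python
import hashlib
import zlib

# Two different three-byte inputs chosen so that both of Adler-32's running
# sums come out the same: the middle byte drops by 2 while the outer bytes
# each rise by 1.
FIRST = b"aca"
SECOND = b"bab"


def adler32_hex(data: bytes) -> str:
    """Adler-32 checksum of `data` as eight hexadecimal digits."""
    return format(zlib.adler32(data), "08x")


def sha256_hex(data: bytes) -> str:
    """SHA-256 hash of `data`, for comparison with the short checksum."""
    return hashlib.sha256(data).hexdigest()


if __name__ == "__main__":
    print(adler32_hex(FIRST), adler32_hex(SECOND))   # identical checksums
    print(sha256_hex(FIRST) == sha256_hex(SECOND))   # SHA-256 tells them apart
```

If a system accepts "the checksum matches" as proof that a value is correct, an attacker who can pick the algorithm only needs one such colliding input.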
Mark:
Yeah. So that was an absolutely brilliant piece of research by Matt. And you can find that on the Wordfence blog, it’s a few years old now. The WordPress security team managed to fix that quite quickly, working confidentially with Matt. And then we went ahead and published the research when the all clear was given.
Mark:
But that’s an example of why it is very important to choose the appropriate hashing algorithm for whatever your application is. If you’re launching spacecraft, choose something that’s computationally not that intensive to save some power with your solar cells; if you’re doing password hashing, you want to choose something like bcrypt that’s a little bit more computationally intensive than, let’s say, MD5. How are we doing so far, Ram? Making sense?
Ram:
I think we’re pretty good. Yeah, I think we probably want to cover salts, but I’m not going to spend too much time on them. When Mark was discussing how you can basically just precompute the hashes of a bunch of passwords, that doesn’t work so well if, when you’re storing the password, you append a random bit of data to it, called a salt, because then that turns it into a completely different hash. Unless the person who’s attacking your database knows the salt, they’re not going to be able to generate a hash that actually matches your password no matter what list they run.
Mark:
Okay, so I’m going to unpack that a little bit because I’m assuming that we’re dealing with folks who are not programmers here and not necessarily mathematicians.
Mark:
So if you are a bad person, and you’re going around regularly stealing databases of hashed passwords, and you want to crack those, and you want to do it in a way that’s computationally more efficient, what you’ll do is you’ll take, let’s say a bunch of dictionaries of words in English and various other languages, and you’ll use that as the beginning of your word list. And then you’ll take some other sources of passwords, commonly used passwords, you’ll take passwords that have been breached, and you’ll dump that all into a long word list. And let’s say you’ve got around a billion words, well you got two choices when you want to crack your breached password database, you can turn those words into hashes, and then compare those hashes against the hashes in the database and where they match you know that you’ve cracked it, and you know what word it was.
Mark:
And you can do that for every single breached password database that you want to crack. Or you can just do the computation once, and store the hashes alongside what the original plain text was, and then just compare the hashes to each hash in your various breached password databases. And that allows you to only do those computations once, and then just do that comparison, which is much faster than computing a hash every time.
Mark:
And so that attack is called a rainbow table, and a rainbow table is simply a long list of precomputed hashes and the words that created those hashes. And there are big lists of rainbow tables that you can download that are precomputed hashes, and they massively speed up the process of cracking hashes.
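The precompute-once idea Mark describes can be sketched in a few lines of Python. The word list, the usernames, and the breached database below are all hypothetical, and this is the plain lookup-table form described in the conversation; real rainbow tables use chains of hashes to save storage space.

```python
import hashlib

def md5_hex(text: str) -> str:
    """Unsalted MD5 of a string, as a hex digest."""
    return hashlib.md5(text.encode("utf-8")).hexdigest()

# Hypothetical word list; a real one could hold around a billion entries.
word_list = ["password", "letmein", "dragon", "123456"]

# Computed once, then reused against every stolen database.
precomputed = {md5_hex(word): word for word in word_list}

# Hypothetical breached database of unsalted MD5 password hashes.
breached = {
    "alice": md5_hex("dragon"),
    "bob": md5_hex("correct horse battery staple"),
}

# Cracking is now just a dictionary lookup per stolen hash.
cracked = {
    user: precomputed[stored_hash]
    for user, stored_hash in breached.items()
    if stored_hash in precomputed
}
print(cracked)
```

Only the password that appears in the word list is recovered; the lookup itself requires no hashing at attack time.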
Mark:
And so back in the seventies already, they came up with this idea of using salts. And what a salt is, is when you take a user’s password and you’re going to turn it into a hash for the first time, let’s say they’re registering for your service, you want to turn that password that they just entered into a hash, you don’t just run it through the hashing algorithm. You actually append, or prepend, a random piece of text to that password, and then you compute your hash. And what you store is the hash, as well as the little random piece of text.
Mark:
And what that means is that whenever someone signs in now, instead of just taking the password that they enter and running it through the hashing algorithm and doing the comparison, you have to take the salt, prepend or append it to the password that they entered, run it through the hashing algorithm, and then compare it to what was stored.
Mark:
And what that means is it defeats the rainbow tables attack. Because the hacker can no longer use their rainbow table of precomputed hashes, because they’re forced to take that little piece of text and prepend it to every single word that they’re guessing you might’ve used as your password, create the hash, and then compare that. So you’re forcing them to compute the hashes when you use salts. That’s why salts are useful, is because it defeats a rainbow tables attack.
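A minimal sketch of salting at registration and sign-in, assuming Python's standard library. The function names are made up for this example, and a single SHA-256 pass is used only to keep the salt step visible; as discussed earlier, a real system would pair the salt with a deliberately slow algorithm such as bcrypt.

```python
import hashlib
import hmac
import os
from typing import Optional, Tuple

def hash_password(password: str, salt: Optional[bytes] = None) -> Tuple[bytes, bytes]:
    """Return (salt, digest). A fresh random salt is generated at registration."""
    if salt is None:
        salt = os.urandom(16)
    # Prepend the salt to the password before hashing.
    digest = hashlib.sha256(salt + password.encode("utf-8")).digest()
    return salt, digest

def verify_password(password: str, salt: bytes, stored_digest: bytes) -> bool:
    """At sign-in, reuse the stored salt and compare the resulting hash."""
    _, candidate = hash_password(password, salt)
    return hmac.compare_digest(candidate, stored_digest)

# Registration: store both the salt and the digest.
salt, digest = hash_password("hunter2")

# Sign-in attempts.
print(verify_password("hunter2", salt, digest))
print(verify_password("hunter3", salt, digest))

# The same password with a different salt produces a different hash,
# so a table precomputed without the salt no longer matches.
other_salt, other_digest = hash_password("hunter2")
print(digest != other_digest)
```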
Ram:
That sounds incredibly useful, and WordPress uses them too, right?
Mark:
Yeah. WordPress, when I started coding was using MD5, and MD5 was a fairly computationally-fast hashing algorithm, therefore it’s easy to crack. And so what WordPress did is they, instead of just using straight MD5, they did 8,000 rounds of MD5. In other words, you hash the password the user entered, and then you create a hash, of a hash, of a hash, of a hash 8,000 times, and then look at that output.
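The hash-of-a-hash idea can be sketched like this in Python. This is not WordPress's actual implementation; the function name is hypothetical and the round count is simply the figure quoted in the conversation.

```python
import hashlib

ROUNDS = 8000  # figure quoted in the conversation; illustrative only

def stretched_md5(password: str, rounds: int = ROUNDS) -> str:
    """Hash the password, then hash the result again until `rounds` passes are done."""
    digest = hashlib.md5(password.encode("utf-8")).digest()
    for _ in range(rounds - 1):
        digest = hashlib.md5(digest).digest()
    return digest.hex()

# Every guess an attacker makes now costs thousands of MD5 computations
# instead of one, while a legitimate sign-in only pays that cost once.
print(stretched_md5("hunter2"))
```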
Mark:
And so when they sign in and they enter their password, you just do the same process. Again, hash, of a hash, of a hash, of a hash 8,000 times; sorry, I can’t say that very fast. But WordPress also incorporated a salt, so that they could defeat rainbow tables attacks. And that was back in the olden times of 2011, 2012 when I started diving into WordPress security. Now they’ve moved over to bcrypt. And if you have old MD5 hashes in your database, there’s a migration process right Ram?
Ram:
Yep. If you put an MD5 hash password in a database, and that user logs in, then the password will be changed over to bcrypt, I believe, next time they log in. I mean, you have to log in, but yeah.
Mark:
Yeah, for sure. And the reason they have to do the migration when you log in is because they need to actually know your plain text password to be able to do the migration. So they’re authenticating you by taking your plain text password, using the old algorithm, comparing the output with what’s stored in the database, which is an MD5 hash. And then if they authenticate you, they’ll say, “Okay, now let’s migrate him to bcrypt,” and they’ll take that password that you entered, which is in memory and use bcrypt to hash it along with a salt, and then replace the MD5 password in the database with a bcrypt password. And that’s how the migration works.
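A rough sketch of that migrate-on-login flow, under stated assumptions: Python's standard library has no bcrypt, so PBKDF2 stands in for it here, and the storage format, function names, and in-memory user table are all hypothetical rather than anything WordPress actually uses.

```python
import hashlib
import hmac
import os
from typing import Dict, Optional

def legacy_hash(password: str) -> str:
    """Old scheme: plain MD5, tagged so it can be recognized later."""
    return "md5$" + hashlib.md5(password.encode("utf-8")).hexdigest()

def modern_hash(password: str, salt: Optional[bytes] = None) -> str:
    """New scheme: salted PBKDF2 (a stand-in for bcrypt in this sketch)."""
    salt = salt or os.urandom(16)
    digest = hashlib.pbkdf2_hmac("sha256", password.encode("utf-8"), salt, 100_000)
    return "pbkdf2$" + salt.hex() + "$" + digest.hex()

def check_modern(password: str, stored: str) -> bool:
    _, salt_hex, _ = stored.split("$")
    return hmac.compare_digest(modern_hash(password, bytes.fromhex(salt_hex)), stored)

def login(users: Dict[str, str], username: str, password: str) -> bool:
    stored = users.get(username)
    if stored is None:
        return False
    if stored.startswith("md5$"):
        # Authenticate with the old algorithm first.
        if not hmac.compare_digest(legacy_hash(password), stored):
            return False
        # The plain text password is in memory right now, so re-hash and replace.
        users[username] = modern_hash(password)
        return True
    return check_modern(password, stored)

users = {"mark": legacy_hash("s3cret")}
print(login(users, "mark", "s3cret"), users["mark"].split("$")[0])
print(login(users, "mark", "s3cret"))
```

The stored value can only be upgraded during a successful sign-in, which is why accounts that never log in keep their old hash.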
Mark:
So chatting about hashing, and hashing is just used with all kinds of things. It is an integral part of what is called a blockchain. You may have heard of a blockchain, and hashing is an integral part of that. The chain element of a blockchain is actually created with hashes where you have a series or a sequence of events, and maybe those are transactions in the case of Bitcoin or maybe they are interactions with a file or a journey of a piece of data or whatever, but it’s essentially a kind of ledger or a sequence of events that are tied together using a hashing algorithm. I’m not going to dive into it any more than that because that is beyond the scope of what we’re trying to do on this podcast. But let’s chat about symmetric crypto.
Ram:
Symmetric crypto, that’s just where if I want to send a message to you, then we have to establish a single secret key. You were discussing in the history where the algorithm can be public, but as long as there’s shared secret that we both have, we can use it to encrypt and decrypt data very quickly.
Mark:
Exactly. Symmetric cryptography has been around for quite a while. It’s been around since… I keep wanting to say Wassenaar, but of course, that’s the Wassenaar Arrangement. What is the algorithm called again?
Ram:
Vigenère
Mark:
Vigenère, there we go. So with Vigenère, you had a shared key, and essentially that was a kind of symmetric cryptography where I have the same key that you have, maybe we’ve decided on the name of my dog, and when you receive the message from me, you use that same key to decrypt the message. Now that’s the basis of symmetric cryptography. Symmetric cryptography is very fast, but there’s a major, major problem with it. And that is that if you, audience member, and I want to communicate, and Ram is listening in. And we’re not able to get together, I’m never going to meet you. I don’t know what your name is, maybe it’s John or Mary or whatever your name is, but you and I know each other via this podcast, we chat via DM on Twitter. Ram has managed to hack into a fiber optic cable outside my house and is monitoring everything that we say. We’re on an insecure channel and we want to establish a secure channel.
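The shared-key idea can be sketched with a small Vigenère implementation in Python. The key "REX" is a made-up dog's name for illustration, and the cipher itself is a historical example that is easy to break; it is shown here only because both sides use the same key.

```python
from itertools import cycle

ALPHABET = "ABCDEFGHIJKLMNOPQRSTUVWXYZ"

def vigenere(text: str, key: str, decrypt: bool = False) -> str:
    """Shift each letter by the matching key letter; the same key reverses it."""
    direction = -1 if decrypt else 1
    key_shifts = cycle([ALPHABET.index(k) for k in key.upper()])
    result = []
    for ch in text.upper():
        if ch in ALPHABET:
            shift = next(key_shifts)
            result.append(ALPHABET[(ALPHABET.index(ch) + direction * shift) % 26])
        else:
            # Spaces and punctuation pass through and do not consume key letters.
            result.append(ch)
    return "".join(result)

shared_key = "REX"  # hypothetical dog's name known to both parties
ciphertext = vigenere("MEET AT NOON", shared_key)
print(ciphertext)
print(vigenere(ciphertext, shared_key, decrypt=True))
```

Anyone who learns the shared key can run the same decrypt call, which is exactly the weakness the conversation turns to next.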
Mark:
Well, we can’t use symmetric cryptography because I would have to send you my dog’s name as the key over that insecure channel and Ram will get the key, and he’ll just decrypt anything that I send you and vice versa. And that’s the trouble with symmetric cryptography. This challenge had baffled cryptographers for… I don’t know if I’m just making any assumptions about historical cryptographers, but I suspect it baffled mathematicians for a very, very long time. And around the 1970s, there was a major breakthrough in cryptography by RSA, which is the initials of three famous cryptographers.
Ram:
Rivest, Shamir, and Adleman.
Mark:
And what they developed was a way to separate the key that you use to encrypt data from the key that you use to decrypt that data. This is a massive breakthrough, and the reason why is because, again, let’s use our analogy where Ram is sitting outside my house in a van, and he’s got the fiber optic cable running into the van and running back out again, and he’s listening into everything that we say.
Ram:
For some reason no one has spotted me yet.
Mark:
Right. Even though it’s a white van with blacked out windows and all kinds of weird bumper stickers on the back. But okay, moving swiftly on. So I want to communicate with you dear listener, and Ram’s listening in, and the way I do that with asymmetric cryptography, which is this major breakthrough, is that I send you my public key and you receive it. The only thing you can do with that public key is encrypt the data that you want to send me. And the only thing Ram can do with it sitting in his van is he can also encrypt messages and send it to me, but he’s not interested in encrypting messages and sending them to me. He wants to decrypt our stuff, right?
Mark:
So you get the key, you get my public key, you encrypt a message, you send it to me. And I use my secret key, my private key to decrypt that message, and I’ve never sent that private key across the wire. It’s safe in my house. I’m behind locked doors. I know he’s sitting out there in his van, but all my doors are locked. Sorry, Ram. Ram’s actually a really nice guy by the way, but-
Ram:
I promise I’m not actually in a van outside of Mark’s house, listening into, hacking into his fiber optic cable.
Mark:
Or are you?
Ram:
Or am I?
Mark:
All right. So you can use my public key to encrypt data and send that to me and I decrypt it with my private key. And I want to send you some data, I want to send you a reply. So what you do is you send me your public key. I use your public key to encrypt information. I send that back to you across this insecure channel that Ram’s listening in on, and you use your secret key or your private key, I use those two terms interchangeably, to decrypt the data that I’ve sent you. And again, you’ve never sent your secret key or your private key across the wire. Ram doesn’t have it. The only thing he now has is your public key and my public key, and the only thing he can do with those is send me encrypted messages and send you encrypted messages. He cannot use those to decrypt messages.
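To make the public key and private key split concrete, here is a toy textbook RSA sketch in Python. The primes are tiny and chosen purely for illustration, there is no padding, and nothing here is secure; real keys are 2048 bits or longer and come from a vetted library.

```python
# Toy textbook RSA with tiny primes (illustrative only, not secure).
p, q = 61, 53
n = p * q                      # modulus, part of both keys
phi = (p - 1) * (q - 1)
e = 17                         # public exponent
d = pow(e, -1, phi)            # private exponent (modular inverse, Python 3.8+)

public_key = (e, n)            # safe to send over the monitored channel
private_key = (d, n)           # never leaves the owner's machine

def apply_key(value: int, key) -> int:
    """RSA encryption and decryption are both modular exponentiation."""
    exponent, modulus = key
    return pow(value, exponent, modulus)

message = 65                   # messages are numbers smaller than n
ciphertext = apply_key(message, public_key)
recovered = apply_key(ciphertext, private_key)

print("ciphertext:", ciphertext)
print("recovered with private key:", recovered)

# An eavesdropper holding only the public key cannot undo the encryption
# by applying that same public key again.
print("public key decrypts it:", apply_key(ciphertext, public_key) == message)
```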
Mark:
And that is the amazing, amazing thing about asymmetric cryptography: it provided a way to establish a secure communications channel over a communications medium that’s being monitored by the adversary or the enemy or some hacker. It was a major breakthrough. And so asymmetric cryptography is used extremely widely. It’s used in SSL, now called TLS, which is how we communicate with websites securely. When you’re buying something on Amazon, you are using TLS. And the way that works is that TLS will establish a secure communications channel using asymmetric cryptography, exchanging those public keys, and then decrypting with the private keys that never crossed the wire. Once that secure communications channel has been established, TLS will switch to symmetric cryptography, because as I mentioned, it is more efficient. It’s more computationally efficient and uses fewer resources. It’s faster, in other words. And that is what symmetric cryptography and asymmetric cryptography are, and why they matter.
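The hand-off Mark describes, asymmetric crypto to agree on a key and then symmetric crypto for the actual traffic, can be sketched roughly as follows in Python. This is not how TLS is really implemented: the key pair is toy-sized textbook RSA, the symmetric step is a simple hash-based XOR stream invented for this example, and every name here is hypothetical.

```python
import hashlib
import secrets

# Server's toy RSA key pair (tiny primes, illustrative only).
p, q, e = 61, 53, 17
n = p * q
d = pow(e, -1, (p - 1) * (q - 1))   # private exponent, stays on the server

def keystream_xor(session_key: int, data: bytes) -> bytes:
    """Symmetric step: XOR data with SHA-256(key + counter) blocks.
    Running it twice with the same key returns the original data."""
    out = bytearray()
    for start in range(0, len(data), 32):
        counter = start // 32
        pad = hashlib.sha256(
            session_key.to_bytes(2, "big") + counter.to_bytes(4, "big")
        ).digest()
        out.extend(b ^ k for b, k in zip(data[start:start + 32], pad))
    return bytes(out)

# 1. Client picks a random session key and encrypts it with the public key (e, n).
session_key = secrets.randbelow(n - 2) + 2
wrapped_key = pow(session_key, e, n)        # this is what crosses the wire

# 2. Server recovers the session key with its private key.
server_session_key = pow(wrapped_key, d, n)

# 3. Both sides now share a secret and switch to the faster symmetric cipher.
message = b"order: one book, ship to the usual address"
ciphertext = keystream_xor(session_key, message)
plaintext = keystream_xor(server_session_key, ciphertext)

print(server_session_key == session_key)
print(plaintext)
```

The slow public-key operation happens once per connection, and everything after that uses the shared session key.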
Ram:
Asymmetric cryptography is kind of one of those world changing things, and it’s kind of something that’s enabled the internet as we know it to sort of flourish, right?
Mark:
Yeah. I mean, for me, I just go back to the wonderful story of Turing breaking the Enigma machine. And of course it was a team at Bletchley Park in the UK and so on, but the Enigma machine was an encryption machine developed by the Germans and used by the Axis powers to communicate with their ships and submarines over an insecure channel, which was HF radio, in other words, shortwave radio. And they would have to set up the shared key before those submarines and before their ships left the harbor. Now I just think about how excited those, whether it was the Germans, the Axis powers, or the Allies, either party, how excited they would have been about a way to establish a secure communications channel over an insecure link. I mean, it would have blown their minds, and that was only invented around the 1970s.
Ram:
It’s a very good thing it wasn’t around back then.
Mark:
Well, it’s interesting, right? Because it really brings up that debate because it is around now. The United States has adversaries around the world. Those adversaries in a lot of cases would consider the United States an adversary. We still have a fair amount of tension floating around the world and the potential for war and actual wars going on. And yet this is now in a world where public key cryptography does exist. You can use key lengths of 2048 bits or 4096 bits which are way longer than 128 bits. That stuff may or may not be crackable by NSA based on the amount of noise that they’ve been making. We think that the stronger cryptographic algorithms with longer keys are actually not crackable by them.
Ram:
Considering they still really, really want that backdoor. Yeah, probably.
Mark:
Yeah. And so, now that everyone out there knows what cryptography is and a little bit of the history, and the crypto wars, and what symmetric cryptography is and what asymmetric cryptography is and why that’s such a breakthrough, and why key length matters, and how NSA in particular has been trying to lobby for smaller key lengths and so on, now, hopefully our audience can think about the cryptography debates and the privacy debate in an informed way. And you can kind of make your own decisions about where you land on it and go from there. And Ram, I think that’s probably a good place to leave it. What do you think?
Ram:
Yeah, I think we’ve covered a lot of ground today. As always, it’s been a pleasure having you on the show. Come and subscribe to us, and listen to us on your favorite podcasting app, whether that’s iTunes or Spotify, or I don’t know what else we have for podcasting.
Mark:
Yeah, absolutely. So Ram-
Ram:
Kathy usually does this part.
Mark:
I know, right. We hijacked her podcast because she’s got the last two days of the week off.
Ram:
Yeah.
Mark:
So hopefully she has a good break there, but Ram, if folks want to follow you on Twitter, what’s your username there?
Ram:
That is Ramuel Gall. I’m pretty boring on Twitter. Mostly all I do is talk about vulnerabilities I or Chloe found.
Mark:
That is definitely not boring. You guys do some amazing research. If you folks want to follow me, I try to stay on message and talk about things Wordfence and security related. My Twitter handle is mmaunder. So you can follow me there. Definitely check out the Wordfence blog at wordfence.com/blog. That is where you’ll find all of Ram and Chloe’s research. And I think you can scroll to the bottom of the page and subscribe to the WordPress Security mailing list, which we run, which has a huge number of subscribers. It’s extremely popular among WordPress site owners. So if you haven’t already subscribed, definitely do that. It’s a very, very high signal to noise ratio mailing list. We don’t spam you with all kinds of product pitches. It’s really just hardcore WordPress Security research made accessible, courtesy of Ram and Chloe. Right, Ram?
Ram:
Yep. Anytime we find a new vulnerability or find out about a new attack, you’re going to be the first people to hear about it, except for the plugin’s developer who actually has to fix the plugin.
Mark:
For sure. All right. And then of course you can follow Wordfence on Twitter, @Wordfence. All right, everybody, thanks so much for listening. It’s been an absolute pleasure. Next week, I think we’ll be back to our regularly scheduled programming with Kathy and Ram. Have a wonderful weekend. Bye, everyone.
Ram:
Bye.
You can find Wordfence on Twitter, Facebook, Instagram. You can also find us on YouTube, where we have our weekly Wordfence Live on Tuesdays at noon Eastern, 9:00 AM Pacific.