If you use MongoDB, then you may be feeling ecstatic right now. Why? Amazon Web Services (AWS) just released DocumentDB with MongoDB compatibility. Users who switch from MongoDB to DocumentDB can expect improved speed, scalability, and availability.
Today, we’re talking to Shawn Bice, vice president of non-relational databases at AWS, and Rahul Pathak, general manager of big data, data lakes, and blockchain at AWS. They share AWS’s overall database strategy and how to choose the best tool for what you want to build.
Some of the highlights of the show include:
Database Categories: Relational, key value, document, graph, in memory, ledger, and time series
AWS database strategy is to have the most popular and best APIs to sustain functionality, performance, and scale
Many database tools are available; pick based on use case and access pattern
Product recommendations feature highly connected data - who do you know who bought what and when?
Analytics Architecture: Use S3 as your data lake, store data in open data formats, and run multiple analyses with your preferred tools at the same time on the same data
AWS offers Quantum Ledger Database (QLDB) and Managed Blockchain to address use case and need for blockchain
Authenticity of data is a concern with traditional databases; consider a database tool or service that does not allow data to be changed
Lake Formation lets customers set up, build, and secure data lakes in less time
DocumentDB: Made as simple as possible to improve customer experience
AWS Culture: Awareness and recognition that it takes many to conceive, build, launch, and grow a product - acknowledge every participant, including customers
Full Episode Transcript:
Hello and welcome to Screaming In The Cloud with your host, cloud economist Corey Quinn. This weekly show features conversations with people doing interesting work in the world of cloud, thoughtful commentary on the state of the technical world, and ridiculous titles for which Corey refuses to apologize. This is Screaming In The Cloud.
Corey: This episode of Screaming In The Cloud has been sponsored by CHAOSSEARCH. CHAOSSEARCH is a cloud-native SaaS offering that extends the power of Elasticsearch’s API on top of your data that already lives in Amazon’s S3. CHAOSSEARCH essentially turns your data in S3 into a warm Elasticsearch cluster, which finally gives you the ability to search, query, and visualize months’ or years’ worth of log and event data without the onerous cost of running a hot ELK cluster for legacy data retention. Don’t move your data out of S3. Just connect the CHAOSSEARCH platform to your S3 buckets and in minutes the data is indexed into a highly compressed data format and written back into your S3 buckets, so it keeps the data under your control. You can then use tools like Kibana on top of that to search and visualize your data on S3, querying across terabytes of data within seconds. Reduce the size of your hot ELK clusters and waterfall your data to CHAOSSEARCH to get access to an unlimited amount of log and event data. Access more data, run fewer servers, spend less money. CHAOSSEARCH. To learn more, visit chaossearch.io and sign up for a trial. Thanks to CHAOSSEARCH for their support of this episode.
Corey: Welcome to Screaming in the Cloud, I'm Corey Quinn. I'm joined today by Shawn Bice, AWS's VP of non-relational databases, and Rahul Pathak, GM for big data, data lakes, and blockchain, which is a whole bunch of words that don't really go together, but I imagine there's a common thread in there. Welcome to the show, folks.
Shawn: Hey Corey, thanks for having us.
Shawn: Yeah, it's a great question. As you know, our strategy is in partnership with and driven by our customers. Frankly, when we sit down with customers and talk about their data strategy, one of two things almost always comes up. One, odds are customers have plenty of relational applications on premises. Why? Because relational databases have been around since the 70s. When they start thinking about the cloud because they want to free up their resources from that operational burden, they'll start thinking, "Hey, I need to lift and shift those relational apps into something like Aurora with PostgreSQL," or they could take a commercial workload like SQL Server or Oracle and move it into RDS. That's a motion that's in play.
I think the question you're really getting to is the second part we hear from customers, which is, "Hey, if I'm building a new application, what tool should I pick?" Frankly, the way we think about it is, for these new modern apps, think of the biggest ride-share app, or a media service, or something like Snap, just think of these big scale apps. Most developers building these super large scale apps do what they do best: they break the app into smaller parts and then they pick the right tool for the right job.
If you just use that as a backdrop and in your mind you're like, "Okay, apps today could have millions of users. They could be geographically distributed anywhere," and everybody wants everything to run even faster. Well, that puts even more pressure on making sure that you really are picking the right tool for the right job so you don't overburden a single database. One easy way to conceptualize this: years ago, when you thought about a database category, it was just relational. Today, there are about six categories that developers think about. Relational is a category, key value is a category, and DynamoDB is a great product inside the key value category. Document is a category.
Just yesterday, we introduced Amazon DocumentDB. Graph is a category where we have Neptune, and then of course you have in memory as a category with ElastiCache. We introduced a brand new category at re:Invent with the Quantum Ledger Database, and then time series is a category. If you think of those categories of data, our database strategy is super simple: we want to have the most popular and best APIs in each of those categories so a developer never has to trade off on functionality, performance, or scale.
Corey: Let me caveat this conversation: anyone who's ever seen my terrifying development practices understands why I generally tend to work with stateless things you can reconstruct without destroying a company. With a database, you generally don't have that luxury. If you lose data, your company is having a bad day. As a result, based on my own limitations and experience, I'm not particularly up to speed on the nuances of database design, database selection, and the rest.
Right now, I think the big question is when a developer is starting out with a new project something they want to build, one of the largest questions that looms in their mind is what tool do I use to do the job? As someone starts looking through the increasingly lengthy list of database offerings, what are the considerations that shape that choice?
Shawn: Great question, and the answer is actually really straightforward. It has to do with the use case and access pattern. Let's say you and I are building an online commerce application; think of a shopping cart. We don't know if we're going to have 100 users or 100 million. Think of something like Black Friday, where something could go on sale and you could have millions of customers all of a sudden needing to shop and make a purchase. If we're going to build a shopping cart, just think of that shopping cart access pattern, where you're quickly adding things to it and you haven't done a transaction yet.
Those are simple puts and gets, and key value is an awesome solution for an access pattern of simple puts and gets, because it performs: it's very fast, it's very efficient, you and I don't have to model anything in it, and it can scale to as many users as we have. Meaning, if it's Thursday and not Black Friday, we have 10 users, and on Black Friday, when a big thing goes on sale, we've got a million people shopping. Key value can handle that access pattern, great.
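To make that access pattern concrete, here is a minimal sketch of the puts and gets behind a shopping cart. A plain Python dict stands in for a key value store like DynamoDB; the item shapes and function names are invented for illustration, not an actual API.

```python
# Minimal sketch of the put/get access pattern behind a shopping cart.
# A plain dict stands in for a key value store like DynamoDB; in the real
# service these would be PutItem/GetItem calls keyed on a partition key.

cart_store = {}  # partition key (user_id) -> list of cart items

def put_cart_item(user_id, item):
    """Simple put: append an item under the user's key. No joins, no schema."""
    cart_store.setdefault(user_id, []).append(item)

def get_cart(user_id):
    """Simple get: fetch everything under one key in a single lookup."""
    return cart_store.get(user_id, [])

put_cart_item("corey", {"sku": "tv-55in", "qty": 1})
put_cart_item("corey", {"sku": "hdmi-cable", "qty": 2})
print(get_cart("corey"))
```

Because every operation touches exactly one key, this pattern shards naturally, which is why a key value store handles the Thursday-versus-Black-Friday swing without remodeling anything.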
Now, imagine in that shopping experience, you know when you're buying something and you see a product recommendation? Well, product recommendations are really about highly connected data, like who bought what and when, so if you and I wanted to make a product recommendation, a graph database would be an excellent solution for it. For example, imagine a scenario where somebody's shopping for something and instead of just saying, "Hey, here's what others bought in the sports category," what if it was a little more personalized and said, "Here are some of the items your friends bought," or people you know, in a certain category. A graph database can help with that in a really big way.
That's kind of what developers do. They break these apps into smaller parts, they think of the access pattern use case, and then they pick the right tool for the right job.
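The "items people you know bought" traversal above can be sketched in a few lines. This is a toy version of what a graph database like Neptune does at scale; the edges and sample users are invented for illustration.

```python
# Toy version of the "what did people you know buy" recommendation traversal.
# A graph database like Neptune runs this kind of query over billions of
# edges; here two dicts of sets stand in for "knows" and "bought" edges.

knows = {
    "alice": {"bob", "carol"},
    "bob": {"alice"},
    "carol": {"alice"},
}
bought = {
    "alice": {"running shoes"},
    "bob": {"tennis racket", "running shoes"},
    "carol": {"yoga mat"},
}

def recommend(user):
    """Items purchased by people the user knows, minus what they already own."""
    owned = bought.get(user, set())
    suggestions = set()
    for friend in knows.get(user, set()):
        suggestions |= bought.get(friend, set())
    return sorted(suggestions - owned)

print(recommend("alice"))  # ['tennis racket', 'yoga mat']
```

The same one-hop traversal extends naturally to friends-of-friends or category filters, which is exactly where relational joins start to hurt and graph stores shine.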
Corey: So as you wind up going through a project of building out something that requires a data store, one of the common stories we see around this has to do with someone building out some form of analytics architecture. How does that wind up manifesting in your universe?
Rahul: That’s a great question. When it comes to analytics, what we recommend to customers is to think about S3, which is our core storage service, as your data lake. We recommend that customers put data into their data lake in S3 in open data formats, so CSV, JSON, or query-optimized formats like Parquet or ORC. That open data gives them portability. They can take that data wherever they want. They can use it with whatever technology they want.
We've engineered our analytics services so that they can all run directly against open data in S3. So if you want to run the latest in Spark, you can run that through EMR. If you just want to run SQL on your data in S3, you can use Athena. If you're doing data warehousing at scale, Redshift is a great choice, and that also works with data in S3. What you get is the ability to run multiple types of analyses using your preferred tools at the same time on the same data, without one party interfering with another. So you maximize your flexibility and portability while getting all of the scale and durability benefits of S3.
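The "many tools, one copy of the data" idea can be sketched with nothing but the standard library: two independent readers analyze the same open-format bytes without coordinating. In AWS terms the object would live in S3 and the readers could be Athena, Redshift Spectrum, or Spark on EMR; the file contents and column names here are invented.

```python
# Sketch of the open-format data lake idea: two independent "tools" run
# different analyses over the very same CSV bytes. An in-memory buffer
# stands in for an object in an S3 data lake.

import csv
import io

lake_object = io.StringIO()
writer = csv.writer(lake_object)
writer.writerow(["order_id", "region", "amount"])
writer.writerows([[1, "us-east-1", 120], [2, "eu-west-1", 80], [3, "us-east-1", 40]])

def total_revenue(data):
    """One 'tool': an aggregate over the whole dataset."""
    rows = csv.DictReader(io.StringIO(data))
    return sum(int(r["amount"]) for r in rows)

def orders_in_region(data, region):
    """Another 'tool': a filter query over the same bytes, run independently."""
    rows = csv.DictReader(io.StringIO(data))
    return [r["order_id"] for r in rows if r["region"] == region]

data = lake_object.getvalue()
print(total_revenue(data))                  # 240
print(orders_in_region(data, "us-east-1"))  # ['1', '3']
```

Because the format is open, neither reader needs the other's engine installed, which is the portability point Rahul is making.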
Corey: One of the increasing challenges is that, to some extent, it feels like you are this technical generation's Perl. Famously, that programming language had the motto "there's more than one way to do it." Increasingly, it feels like you are heading down the road where, no matter what you pick for any solution, the easiest thing in the world for someone to do is come along and say, "You made the wrong decision. You should instead use X, Y and Z," and even when they're right, that's never a particularly helpful statement. But people love arguing about things, either for semantic reasons or for purposes of religion. That brings us to blockchain.
As far as blockchain goes, from my perspective as someone who stays as far away from this as humanly possible, there's been a lot of hype around it, and there have been a tremendous number of use cases that make varying degrees of sense. But largely, the value of blockchain for many of us has been drowned out by the hype, and to be very direct, some of the worst people in the world are actively pushing some of these things, to the point where it's almost a punch line more than anything else. That was my perspective a couple of months ago. Then at re:Invent, there were a few blockchain announcements: to my understanding, QLDB, the Quantum Ledger Database, and then Managed Blockchain, at which point my immediate response personally was, "Well, crap," because now I have to take this seriously and I can't just dismissively hand-wave it. Can you explain to me, please, what is the actual blockchain use case, and what are the customer needs driving it?
Rahul: Absolutely. When we talk about blockchain in the AWS context, it's important to really separate out the cryptocurrency world, because that's not what we're focused on. When we spent a lot of time with customers trying to understand what these use cases were, what we learned was that there were really a couple of use cases at play. Typically, one of them was where there was a centralized entity that customers trusted. Think of it as a major manufacturer with a network of suppliers.
Everybody in that ecosystem trusts that manufacturer, and they're fine with that manufacturer maintaining a centralized record, but they wanted an immutable ledger. They wanted to be able to trace every element that flowed through, but they were fine having the manufacturer control that central record of what happened. That centralized-trust case is what we built QLDB for. QLDB is actually based on ledger technology that we've had for a while at Amazon, but the intent is to provide a cryptographically verifiable, immutable record of what's happened.
It's typically owned by a centralized entity. Others can connect to it and verify the transaction history. The centralization there is key. If you need that immutability but no distributed trust, then QLDB is the database of choice, and it frees customers from building those audit trails in relational databases, where DBAs could modify things, or from using blockchain frameworks like Fabric or Ethereum, which have a bunch of additional complexity related to distributed trust and smart contracts that isn't really needed for this immutable-record case.
The second type of use case we found was where there was more of a group of peers engaged in commercial transactions who didn't want any single party to completely control the record of what took place. In this distributed world, they wanted the immutability of a ledger, but they wanted multiple participants to agree on and validate what would go into that ledger. That's where the blockchain frameworks come in, not the cryptocurrency technologies, but frameworks like Hyperledger Fabric and Ethereum, which allow multiple parties to agree on what truth is and then write that truth out to multiple copies of the data, each owned by one of the participants.
A great example of this is online advertising networks. What you've got is an exchange, you've got multiple parties bidding on ad slots, and you've got publishers that display ads. What they would like from that exchange is a record of, "Hey, there was an auction, 50 people bid, one ad won. This was the ad that was served on the site," but they don't want to actually own that infrastructure; they trust the exchange. The exchange would use QLDB to maintain a record of what happened for each auction.
But there's also a scenario where you have multiple exchanges sharing information, because they're routing traffic to each other, and they don't actually want to give all of their data to any one exchange. They'd want to use blockchain for that scenario, and they can use Amazon Managed Blockchain to have distributed data, but within themselves, they might use QLDB to have an immutable record.
Corey: When you start talking about immutable records, transaction ledgers and the rest, because of my own prejudices, and more accurately a regulated background, my immediate thought is compliance. Is this something that you could use, for example, to fulfill Sarbanes-Oxley or WORM, write once, read many, requirements? Or is this the sort of thing where, at least today, you show it to an auditor and they stare at you, and now you have more questions; you've effectively opened a Pandora's box of explaining complex concepts to people who are generally hoping to check a box. How does that manifest in the regulated world today?
Rahul: I am not deeply familiar with Sarbanes-Oxley or WORM, but what we have found is that there is a lot of interest in both ledgers and blockchain, so QLDB and Managed Blockchain, for the audit and compliance use case. The reason is that you can independently verify that what was written has not changed. That's sort of the central building block of audit and compliance.
We see scenarios like Guardian Life, which is an insurance company that has multiple providers, customers, and payers. The ability to say yes, every single party that looks at this can agree that what was written here is what was originally written and it hasn't changed since, that's really powerful for the audit and compliance use case.
Corey: It'll be fascinating to see how this one manifests in a few years. I get the sense that people on the bleeding edge of the compliance story are going to sort of pave that road for the rest of us. As a general best practice, if you wind up having to bring a mathematician into an audit to validate what you're saying, it's not going to be an easy conversation. It's always interesting to watch people forge that road ahead a little bit.
One of the interesting stories when I first saw the announcement of QLDB was, again because of my own biases, when I don't understand something, the easiest thing in the world to do is make fun of it. I made a joke to someone in passing who worked at AWS: "Well, if this one doesn't work out, it might very well be the first service that they wind up turning off and deprecating," and the very serious answer I got in return was, "We're using this internally, and if you turn this off, there aren't too many services left that will work."
To my understanding, it winds up becoming a foundational part of building higher-level services, and it solves a need that, at significant scale with distributed systems, doesn't have easy answers. Is that an accurate assessment, or is that effectively someone shining me on? At this point, I don't know enough about this space to opine intelligently on it.
Rahul: The technologies behind QLDB are absolutely critical to how AWS and Amazon build a lot of our key internal technologies. With distributed systems, having a high-throughput way to understand what the state of the system is, and the ability to replicate that state from point A to point B so you can use it for different things, is crucial.
Shawn: Yeah, think of all the activity that goes through the EC2 control plane. Just imagine how many events are coming through there. If you and I were operating in a world like that, and we had to go to each and every place across the environment to see what kind of events were happening, that would be quite difficult. You could imagine us saying, "Gosh, I wish there was a way to have sort of a ledger of all the transactions that were occurring, because it could help us troubleshoot and operate the environment better," and that's kind of the essence of where QLDB started, this notion of a ledger many years ago, but you have to have a really big scale thing like that to drive that kind of demand.
The interesting thing, to your question, is that as we've been on this journey with QLDB, we'd sit down with customers and they'd say things like, "I wish you had an immutable database." They weren't really asking for a ledger: "Do you guys have something that's immutable and cryptographically verifiable?" Because for them it's, "Hey, there are certain transactions happening in my environment. I wish there was a way to simply record them somewhere, know that they can't be changed, and cryptographically verify them if an audit occurred." That's when we had this moment of: we've got the essence of a technology that's supporting some of the largest services in AWS, and we have this new requirement coming in from customers, so maybe we can put those two things together. Those are the ingredients that led to QLDB.
Corey: It seems like a fascinating foundational technology. I'm still having trouble bringing it into, I guess, mental focus for me: where I'm going to build something user-facing that rides as a relatively thin layer on top of this. I don't have that problem, for example, with DynamoDB or RDS. I'm building a shopping cart, here's what you do. It's more challenging to think, I'm using an Instagram equivalent or something like that; oh yes, that's backed by a ledger. That's almost assuredly an imagination failure on my part.
Shawn: It's funny, it's not a concept that you just kind of hear once like, "Oh, haha. I now know what a ledger database is," it takes a little bit but here's what I found, imagine a DMV scenario. So you've registered a car at some point I'm assuming, right?
Corey: Yes, I still have scars from it.
Shawn: Do you ever see these commercials sometimes where a company will say, "Before you buy this car, know that there have been five registered owners for it"? You see those kinds of things, and I've always wondered, how do you know that there have been five registered owners? Is that real data or something you cooked up? I don't know the authenticity of it.
So let's use the DMV. You've got all these people coming in and registering cars, and that's going to get recorded in some database somewhere. In a traditional database, one of the troubles is that if you have access to that database, you could change that data however you wanted, and whether auditing was turned on or off, you could manipulate that data, and it could be really difficult for somebody to know that that change was made.
On the flip side, if people are saying, "Hey, let's turn auditing on," auditing done the right way can sometimes slow databases down. The reality here is that the DMV, imagine it as a government agency, is saying, "Hey, when somebody comes in and registers a vehicle, that's a transaction." So there's a VIN, an identity for the vehicle, and you, for example, as the registered owner. Let's write that once into a ledger, and then that's it. Once it's written, it cannot be changed.
Let's say you trade the car in and then somebody else buys that car. Effectively, that's a change event: there's a new registered owner for that vehicle. That would just be the next transaction about that VIN. You can imagine, as time goes on, each time that car is sold, there's a record of it, just stored in this ledger database. And because that database has the property that the data is immutable, it can't be changed, and it's cryptographically verifiable, if anybody came back and said, "Hey, is it really true that that car has had five registered owners?" the DMV would have a very easy way to demonstrate that.
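The DMV example can be sketched with a toy hash-chained ledger: each registration records the hash of the previous entry, so any later edit to history breaks the chain. QLDB's actual journal and digest mechanism is more sophisticated than this; the sketch only demonstrates the immutability-plus-verifiability property, and the VINs and owners are invented.

```python
# Toy hash-chained ledger for the DMV example. Each entry embeds the hash
# of the previous one; recomputing the chain detects any tampering.

import hashlib
import json

def entry_hash(fields):
    """Deterministic SHA-256 over an entry's content."""
    return hashlib.sha256(json.dumps(fields, sort_keys=True).encode()).hexdigest()

def append(ledger, vin, owner):
    """Append-only write: link the new entry to the previous one's hash."""
    prev = ledger[-1]["hash"] if ledger else "genesis"
    fields = {"vin": vin, "owner": owner, "prev": prev}
    ledger.append({**fields, "hash": entry_hash(fields)})

def verify(ledger):
    """Recompute every hash and link; a tampered record fails verification."""
    prev = "genesis"
    for e in ledger:
        fields = {"vin": e["vin"], "owner": e["owner"], "prev": e["prev"]}
        if e["prev"] != prev or e["hash"] != entry_hash(fields):
            return False
        prev = e["hash"]
    return True

ledger = []
append(ledger, "1HGCM82633A", "first owner")
append(ledger, "1HGCM82633A", "second owner")
print(verify(ledger))                 # True: history checks out
ledger[0]["owner"] = "someone else"   # quietly rewrite history...
print(verify(ledger))                 # False: the chain no longer verifies
```

This is why the DMV (or an auditor) can answer "has this car really had five owners?" without trusting whoever operates the database: anyone holding the chain can recompute it.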
Corey: Not only is that a terrific example of applying this in a way that someone of my limited understanding can grasp, but it's also, I think, one of the first ledger explanations I've ever heard that wasn't condescending. One of the biggest challenges you see in many cases when a new and exciting technology gets launched is that you ask someone to explain it to you, and suddenly it winds up being, "Well, actually, it's very simple," and they launch into the most condescending explanation.
About three sentences in, it's, "I don't even care what this technology is, but I hate this person." I'm continually amazed by the fact that AWS is able to explain these complex concepts in a number of different ways, in such a way that I don't feel like a moron for having asked the question in the first place. So first, thank you for that. I think that's something a lot of people could wind up learning a fair bit from. I know it's something I struggle with myself.
One other service that was announced at re:Invent that I want a little help contextualizing while I have you here. Surprise, this entire podcast is a sham. It's all for my own education, because opening support tickets just seems too pedestrian. But Lake Formation was announced. That is one of those interesting services from a few different perspectives. First, it's an awesome name; it's evocative of largely what it does. But to begin, what does Lake Formation do?
Rahul: Lake Formation is designed to be a way to allow customers to build and secure data lakes in days, versus what might have taken them months in the past. The reason setting up data lakes can be challenging is that not only do you have to figure out how to lay out your data, where it lives, and how to get it into your data lake, but actually protecting your data is a huge part of making data lakes broadly available. You don't want everyone in the organization to have access to everything; you want to be able to define access policies that live with the data, so that customers can use any service they want to query it, in a way that's controlled and governed.
The third piece is just data hygiene. How do you make sure people aren't dumping vast amounts of data that don't make any sense into your data lake? You need some way to organize, curate, and manage all of those things. One of the things you talked about earlier in the podcast was how this range of services can be confusing, and all we wanted to do with Lake Formation was provide a very prescriptive, repeatable way for customers to set these things up easily.
The key components of Lake Formation are, one, blueprints that make it really easy to set up your initial data lake. Two, centralized security mechanisms that let you define a data access policy, so one user can see these tables and these columns while another can see those tables but not these ones. That stays with your data definition. Whether you use Athena or Redshift or EMR to get at that data, Lake Formation will make sure that you're only ever able to see what you've been allowed to see, regardless of the service that you choose.
Because you have that central point of control, someone else can also then verify that yes, what we intended to happen is actually what happened. Then the third piece is a data deduplication and cleanup activity that's driven by ML. Customers can say, "This is what my data should look like; these two things are actually related," and that will train a model that can then go through and clean all the data sets up. That comes out of technology we've been using to dedupe addresses and catalogs at amazon.com for a long time.
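The centralized, column-level policy Rahul describes can be sketched as a single policy table consulted on every read, no matter which engine issues the query. This is only an illustration of the idea, not Lake Formation's actual permission model or API; the principals, table, and columns are all invented.

```python
# Sketch of a central column-level access policy, the kind Lake Formation
# enforces uniformly whether the query comes from Athena, Redshift, or EMR.

policy = {
    "analyst": {"orders": {"order_id", "region"}},           # no amounts
    "finance": {"orders": {"order_id", "region", "amount"}}, # full access
}

def query(principal, table, rows):
    """Return rows containing only the columns this principal may see."""
    allowed = policy.get(principal, {}).get(table, set())
    return [{k: v for k, v in row.items() if k in allowed} for row in rows]

orders = [{"order_id": 1, "region": "us-east-1", "amount": 120}]
print(query("analyst", "orders", orders))  # amount column filtered out
print(query("finance", "orders", orders))  # full row
```

Because the policy lives in one place rather than in each engine, an auditor can check a single definition to verify that what was intended to happen is what happened.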
Corey: Hearing you describe it that way sends me back in time to various previous engagements and jobs where I wound up effectively having to build the foundation of a data lake. My approach then was: step one, I'm going to throw everything into an S3 bucket. Step two, we'll figure this one out later. Step three, we have a data lake. Just hearing you describe this, and all the things I didn't do and didn't conceive of when I was building that out, tells me: congratulations, Corey, you built a data swamp.
That is in many cases where a lot of people tend to wind up getting stuck. Is Lake Formation envisioned as something best suited for greenfield data lake projects, or is it something that can be applied to an existing corpus of data?
Rahul: Great question. It's really designed for both. The intent was to make it easier for people operating on greenfield projects, but to the extent that you already have a data lake, or a former data swamp, Lake Formation can actually crawl through it, discover what you have, catalog it, and then give you a starting point from which you can curate and clean up. It's really designed for both.
Corey: Do you find that there's any other relationship and/or confusion in the name of Lake Formation to Cloud Formation?
Rahul: We haven't come across really any significant confusion. I think there might be times when customers are using both of them that it might get a little tricky to keep track of which thing they're forming but for the most part, it's been pretty clear. I think customers understand it's tied to their data lakes.
Corey: At launch, did Lake Formation have CloudFormation support? Because if it didn't, and you have to launch that later in time, that is going to be one of the most confusing headlines to read out loud.
Rahul: So at preview it does not, but it will eventually.
Corey: Wonderful. This is the problem with having a fire hose of release announcements. It's very easy to lose sight of what's available that I can use today, versus what's in developer preview, versus, yes, announcing this service that we've been running that you've never heard about for the last five years. It's always interesting to wind up seeing how this stuff plays out. We're all since past the point where I think any one person can hold an exhaustive list of everything that's running in their own head.
I still wind up getting faked out from time to time by services that don't actually exist. Which brings us to the announcement, roughly a day ago now, of Amazon DocumentDB. Its formal name is Amazon DocumentDB with MongoDB compatibility, which sets a new record for the largest number of syllables in a formal AWS product name. First, let me congratulate you on that. It takes the crown from AWS Systems Manager Session Manager and Parameter Store, which tie at a couple of syllables fewer.
One challenge when I look at that is, first, I know almost nothing about it, but we're about to fix that. My first instinct on seeing the name, when everyone wound up chiming in with, "So, what do you think of the name?", was that my honest answer is, well, its biggest competitor is named Mongo, so you can name it pretty much whatever you want and get a pass on it. I don't have too much to say there, but the concern I have, and what I'm wondering about here, is that I've always abbreviated DynamoDB as DDB. There is now a namespace collision where it's about to become confusing which database people are talking about.
Shawn: I think you're asking a pretty reasonable question. I was just thinking, as you were talking, about all the talks I've done at re:Invent, where I'm often the one speaking about our family of databases. I really haven't seen too many name collisions per se, and I'll tell you why. It's because once you get an understanding of those categories, and I'm just talking about data categories: relational is a category, document's a category, key value is a category, graph, time series, ledger, and so on and so forth.
Once people understand those categories, they kind of have a light bulb moment. In fact, just yesterday I was in San Francisco with a customer from re:Invent, and he said, "We've never thought of data that way." So people really start thinking about categories of data first and then the API inside that category. In that context, there's not a lot of overlap, because Dynamo is a flagship key value store and DocDB, or DocumentDB as we referred to it yesterday, fits nicely into that document category.
Corey: What I find fascinating about it, just from the early returns of people who have looked at this and played with it to some extent, is that it tends to offer, and please correct me if I'm wrong on this, a very approachable new-user experience when you're just getting started with something like this. I'm told the documentation is terrific. There are a bunch of use cases. It's a lot less go poke around a bunch of various forums across the internet and try to piece together a half-baked understanding of it.
By all reports, the onboarding has had significant time and attention paid to it. First, not having played with it myself yet, is that accurate? And second, if it is, was that a driving consideration pre-launch?
Shawn: We're always trying to improve the customer experience, pretty much any Amazonian you talk to is going to tell you that more than once. But it's true, and developer experience starts with documentation. Sometimes you might think, "Hey, it's just about the API or making it approachable," but most developers [...] say the same thing. They really appreciate a low bar to entry. They appreciate it when they can get up and running with very little cost, very little friction and simplicity typically wins. At least on that first day one experience so to speak. I think every team here that's trying to provide any customer experience is always trying to lower that bar and make it as easy as possible to get up and running. In the context of DocumentDB, we definitely wanted to make that as simple as possible.
Rahul: Yeah. One of my favorite memories from launching Athena, which lets you run serverless SQL queries against S3, at re:Invent in 2016 was that 10 minutes after Andy announced it in his keynote, someone tweeted that they were using it in production to analyze their CloudTrail logs. That was a big one.
Corey: That’s a great example.
Corey: It's always nice to see a service launch that doesn't feel like you're getting onto a carnival ride, with the sign with the cartoon character: you must be at least this smart to ride. It winds up being appreciated when, regardless of the power and capability of a service, the onboarding isn't a trial by fire or a running of the gauntlet.
Shawn: Yeah, to that point, if you take DocumentDB, a lot of customers have actually been running MongoDB on AWS for quite some time. You'll see that manifest by way of self-managed MongoDB on EC2, or a managed MongoDB service running on AWS. But in the end, most of these customers came back with the same thing. They say things like, "Hey, I really like the Mongo API and I like the flexibility of the document model, but what I'm struggling with is making it run in an efficient, performant, highly available way. Could you help us with that?"
Our mental model there is: okay, we're going to remove all that operational burden from you, so we have to make that bar to entry super simple. From your point of view, it's just an API that you connect to, so that you do the dev and we do the ops. That's the simple mental model that goes with it, and it reinforces your example of somebody getting into production in 10 minutes, because they're not having to deal with the ops; they just dev.
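To make that "you do the dev, we do the ops" model concrete: for an existing MongoDB application, the switch is largely a connection-string change, since DocumentDB speaks the MongoDB API. A minimal sketch in Python; the host, credentials, and helper name here are illustrative placeholders, not a real endpoint:

```python
# Hypothetical sketch: pointing an existing MongoDB app at a managed
# DocumentDB cluster is mostly a matter of swapping the connection URI.
# All values below (user, password, host) are made-up placeholders.

def documentdb_uri(user, password, host,
                   tls_ca_file="rds-combined-ca-bundle.pem"):
    """Build a MongoDB-style connection URI for a DocumentDB cluster,
    with the TLS and retryWrites options DocumentDB expects."""
    return (
        f"mongodb://{user}:{password}@{host}:27017/"
        f"?tls=true&tlsCAFile={tls_ca_file}"
        f"&replicaSet=rs0&readPreference=secondaryPreferred"
        f"&retryWrites=false"
    )

uri = documentdb_uri(
    "appuser", "s3cret",
    "mycluster.cluster-abc123.us-east-1.docdb.amazonaws.com",
)
# An existing pymongo app would then connect with MongoClient(uri) and
# keep its document-model code unchanged, while AWS runs the ops side.
```

The query code, drivers, and document model stay the same; only the endpoint and a few connection options change.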
Corey: I come from an ops background myself. My conception of what makes things easy and understandable is diametrically opposed to what someone with a development background tends to see. We're seeing a melding of the two as the world continues to evolve. Increasingly, we're seeing the divide break down, where it's no longer the case that an operations person just looks like a crappy developer, or a developer looks like an ops person with no sense of responsibility.
We're seeing a synthesis of those two things, known as DevOps or whatever you want to call the movement; don't @ me. There is an increasing awareness that there are people on both sides of that historical divide who need to be able to use a new product without having to go through an 18-step process to get things set up. Anytime there's a launch that makes something accessible and easy to use, I'm fully in support of it.
I've never found that making things difficult to get started with has paid dividends. It's clear from what I've seen so far that there's been significant effort put into that across the board at AWS. Some of the recent launches have been a night-and-day difference compared to some of the early services. It turns out things don't get worse with time. I want to thank you both for taking the time to speak with me today. It's incredibly gratifying to be able to talk to some of the people who are behind the services that get built out.
It's easy to lose sight sometimes of the fact that when a service gets announced, the service team has spent 18 months or two years building it. People wind up doing a bunch of work, blood, sweat, and tears. They finally get through the final reviews, the documentation gets done, in many cases a ridiculous name gets slapped onto it, and then it launches. Someone writes a blog post, and people thank the person who wrote the blog post. That's great, but that person didn't build the product; there are a lot of people behind the scenes who build these things and get them out the door.
I'm curious, from the perspective of having just spent time building and launching a number of services over the past few years, how is that seen internally? Is there a sense from the service teams and product teams that build this that their work is unheralded, or do they understand the level of appreciation the community has for the incredible amount of work that goes into these things?
I always tend to look at this even beyond the pure engineering effort: the product managers, the marketing people, the folks who work on pricing. There are an awful lot of moving parts to launch anything at this sort of scale, and "Oh, it's a few engineers sitting in a room. I'm sure they can feed them all with two pizzas" doesn't generally capture how it works. There are a lot of moving parts at this level of complexity.
Shawn: Yeah. Maybe both of us can share a thought and be brief. The one thing that I see is, if we walked out of here and just walked around the hallways, you're going to bump into people who are just naturally focused on customers. That is not a set of words we toss around lightly. We could walk into a meeting next door, and if we were talking about any product we wanted to do, it is always working backwards from customers.
So you could be a marketer, you could be a seller, you could be an engineer, you could be in PR; whatever function you're in at this company, no matter what building you and I walk into, any problem we're talking about is always working backwards from the customer. If you use that as a mental model, there's just full engagement with what customers are doing across the board, in every discipline. The nice thing there is you don't end up in a situation where one function acts as the customer interface and then there's everybody else.
As you pointed out, it's not just one single function that gets a product done. It's a whole team effort, and everybody on that team is equally curious about, interested in, and committed to improving that customer experience. That's why I think a lot of people do appreciate all that goes into what gets built.
Rahul: Yeah, I'd agree with that. I think there's a lot of awareness and recognition that it really does take a small army to conceive of, build, launch, and then successfully grow a product, and we try to acknowledge every participant in that.
Corey: What's always been amazing to me is talking to some of the people on the backend who build some of these things, who are generally explicitly not customer facing. But this attitude of solving customer problems tends to permeate the entire arena. It's easy to look at leadership principles, for example, and discount them as marketing or sales speak. I have to admit, I did, when I first started getting into the AWS ecosystem. Yes, every company has a mission statement. Terrific. Great. But then you start talking to people and you see it manifest itself in ways that were not intuitively obvious at first.
It really does tend to lend itself to a cohesive, consistent culture. There are certain through-lines regardless of what AWS group I'm talking to, things that bleed through. It's nice to see the fruits of some of that as you build things out. It's not the sort of thing where you can just pick apart piecemeal and drop onto some other company and expect to have the same results. It's something that I think was built in from the beginning here.
I don't think I've seen anything remotely like it at any other company I've ever spoken with, needless to say.
Rahul: Thank you.
Corey: Thank you both for your time; I appreciate this. It's been a hectic few days for you folks with the launch of this, and I don't get the sense there's a whole lot of sitting around and resting at AWS. It's always on to the next thing, on to improving incrementally.
Rahul: Thanks very much.
Shawn: Thank you.
Corey: Shawn Bice, VP of non-relational databases, and Rahul Pathak, GM for big data, data lakes, and blockchain. I'm Corey Quinn. This is Screaming in the Cloud.