Do you deal with a lot of data? Do you need to analyze and interpret data? Veritone’s platform is designed to ingest audio, video, and other data through batch processes to process the media and attach output, such as transcripts or facial recognition data.
Today, we’re talking to Christopher Stobie, a DevOps professional with more than seven years of experience building and managing applications. Currently, he is the director of site reliability engineering at Veritone in Costa Mesa, Calif. Veritone positions itself as a provider of artificial intelligence (AI) tools designed to help other companies analyze and organize unstructured data. Previously, Christopher was a technical account manager (TAM) at Amazon Web Services (AWS); lead DevOps engineer at Clear Capital; lead DevOps engineer at ESI; cloud consultant at Credera; and worked in Patriot/THAAD missile fire control in the U.S. Army. Besides staying busy with DevOps and missiles, he enjoys playing racquetball in short shorts and drinking good (not great) wine.
Some of the highlights of the show include:
Various problems can be solved with AI; companies are spending time and money on AI
Tasks that are too complex for simple software can still be automated
Machine learning (ML) models are applicable for many purposes; real people with real problems, not just academics, can use ML
Fargate is instant-on Docker containers as a service; handles infrastructure scaling, but involves management expense
Instant-on works with numerous containers, but there will probably be a time when it no longer delivers reasonable fleet performance on demand
Decision to use Kafka was based on workload, stream-based ingestion
Veritone writes code that tries to avoid provider lock-in; wants to make each integration as decoupled as possible
People spend too much time and energy being agnostic to their technology and giving up benefits
If you dream about seeing your name up in lights, Christopher describes the process of writing a post for AWS
Pain Points: Newness of Fargate and unfamiliarity with it; limit issues; unable to handle large containers
Full Episode Transcript:
Corey: This week’s episode of Screaming In The Cloud is generously sponsored by DigitalOcean. I’m going to argue that every cloud platform out there biases for different things. Some bias for having every feature you could possibly want offered as an added service at varying degrees of maturity. Others bias for, “Hey, we heard there’s some money to be made in the cloud space. Can you give us some of it?”
DigitalOcean biases for neither. To me, they optimize for simplicity. I polled some friends of mine who are avid DigitalOcean supporters about why they’re using it for various things, and they all said more or less the same thing. Other offerings have a bunch of shenanigans around root access and IP addresses. DigitalOcean makes it all simple: “In 60 seconds, you have root access to a Linux box with an IP.” That’s a direct quote, albeit with profanity about other providers taken out.
DigitalOcean also offers fixed-price offerings. You always know what you’re going to wind up paying this month, so you don’t wind up having a minor heart issue when the bill comes in. Their services are also understandable without spending three months going to cloud school. You don’t have to worry about going very deep to understand what you’re doing. It’s click a button or make an API call, and you receive a cloud resource. They also include very understandable monitoring and alerting.
Lastly, they’re not exactly what I would call small-time. Over 150,000 businesses are using them today. Go ahead and give them a try. Visit do.co/screaming and they’ll give you a free $100 credit to try it out. That’s do.co/screaming. Thanks again to DigitalOcean for their support of Screaming In The Cloud.
Corey: Hello and welcome to Screaming in the Cloud. I'm Corey Quinn. I'm joined this week by Christopher Stobie, who's the director of SRE at Veritone. He's also a former TAM at AWS, but that's not really what I wanted to invite him here to talk about. Instead, a blog post went out somewhat recently about architecture that he's been working on. First, welcome to the show, Christopher.
Christopher: Hey, Corey. It's good to be here. Thanks for having me.
Corey: No, thanks for being so generous with your time. Let's start at the very beginning. I first became aware that you folks existed with a post that was put up on the Amazon Official Architecture Blog. It was titled Building Real Time AI with AWS Fargate. I read that five or six times and, eventually, I had a vague idea of what you were talking about and did a little more digging. For those who are starting off in the same place that I was, Veritone is a company that likes to position itself as a provider of artificial intelligence tools designed to help other companies analyze and organize unstructured data such as audio, video and images. What does that mean using small words?
Christopher: The description there is a little bit of a mouthful. I think the best way I would describe it is actually more of an angular story. With normal AI, if you want to do, say, something like image recognition or speech-to-text or this or that, any of these different capabilities that exist, you'd have to go write a service that connects to that engine and you have to write an API layer that's very specific and very singular. Veritone abstracts all that and says, "Hey, you can learn the Veritone API and you can get access to any engine that we have in our ecosystem and make a single call, describe what you want and get results against any of the engines that we support." I like to look at Veritone as a unification layer, a single API for lots of different AI.
Corey: It's easy to fall into the trap that I did when I started researching into what it is that you had actually built, that, "Oh, you're talking about AI and machine learning. It's probably a few people who are sitting in a garage somewhere. They've gotten to seed round, maybe a Series A and, holy crap, you're publicly traded on the NASDAQ." This is no longer the sort of thing that's just the remit of hobbyists or focused on, "What if our future technology…" This is something that the market believes in strongly. This is something that's here today albeit one that's still being built into a clear-cut use case. As things stand today, what problems might I have that look like something that AI might be able to help me with?
Christopher: I think there's a lot of different things that can be solved with AI, and I think a lot of really big companies are pouring a lot of time and money into building the AI that changes the world. I think, in the meantime, before 20 years passes, there's a lot of menial stuff–or maybe 'menial' isn't the right word–but there's a lot of tasks that can be automated that are a little bit too intelligent to just write simple software around. Things like analyzing court case documents, ingesting them and transcribing them from text into an indexed, searchable object in a database is something that, traditionally, was done by humans and took a lot of time and energy. Instead, you can scan a document, run it through a transcription engine and you have your results indexed and searchable in a few hours or even faster, depending on whether or not you use Veritone.
Corey: One of the interesting challenges about this entire space from my perspective is just the sheer applicability of machine-learning models to different things. A while back when SageMaker first came out, I gave it a few months and then asked on my ridiculous newsletter, "Who's using SageMaker and for what?" because, personally, I'm not a data scientist; I'm not someone who has the wherewithal or the expertise to have intelligent conversations around these things. What amazed me was, first, the sheer volume of replies I got, secondly, the fact that everyone was doing something different with it and, lastly, that they all started with some form of the sentence, "I'm not a data scientist, but…" This is rapidly turning into something that real people with real problems who are not, themselves, academics are able to touch, use and get exposure to.
Christopher: I'm not a data scientist, but I definitely agree. I think that AI is expanding and we're growing into a field that demands accuracy and results at a much faster pace than humans can deliver.
Corey: Absolutely. You wound up mentioning in your post that this entire system that you described is built around Fargate. For those who aren't aware, this is effectively instant-on Docker containers as a service. Picture serverless Docker and, in addition to starting a war with that phrase, you're effectively not that far from what this looks like. You throw a Docker container in AWS, it handles all of the infrastructure scaling for you. The downside to this, of course, is that, first, there is some management expense tied into that. On a one-to-one compute level, you will wind up spending more per container second than you would for a similar amount of compute on EC2. Do you find that the value that you get from having something managed entirely for you offsets that economic cost or is there a tipping point where, "Okay, we're now large enough on these workloads, moving to EKS, ECS or something else," eventually becomes a foregone conclusion?
Christopher: I think that when we went out and we started doing this, we looked at it twofold. We looked at it, first, with the assumption that, eventually, Fargate would have some sort of a pre-provisioned billing, kind of like pre-buying DynamoDB throughput or purchasing reserved instances. Fargate is young so I think we assumed that, eventually, it would have a better billing model than it currently does. Given that it doesn't today, part of the architecture actually includes mixing and matching Fargate and EC2. We have very bursty traffic, so we designed for the mean of our traffic: the average load runs on reserved instances on EC2, and the bursts scale in Fargate. Fargate is very expensive, and we're very conscious of the economic impact our company would see internally if we went entirely Fargate.
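The base-and-burst split Christopher describes can be sketched in a few lines. This is an illustrative model, not Veritone's actual code; the function name and the baseline figure are made up for the example:

```python
def split_capacity(desired_tasks: int, reserved_baseline: int) -> dict:
    """Split a desired task count between a reserved EC2 baseline
    (sized for the mean load) and Fargate (absorbing the bursts)."""
    on_ec2 = min(desired_tasks, reserved_baseline)
    on_fargate = max(0, desired_tasks - reserved_baseline)
    return {"EC2": on_ec2, "FARGATE": on_fargate}

# A burst above the baseline spills into Fargate; quiet periods stay
# entirely on the cheaper reserved instances.
print(split_capacity(150, reserved_baseline=100))  # {'EC2': 100, 'FARGATE': 50}
print(split_capacity(60, reserved_baseline=100))   # {'EC2': 60, 'FARGATE': 0}
```

ECS capacity provider strategies express the same idea natively today with a `base` and `weight` per provider, but the economics are the point: you buy the mean, you rent the peak.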
Corey: I think that there's a definite story around when Fargate becomes acceptable and when using other things begins to make more sense, and one thing I'm starting to see more and more of as I talk to people about this is the idea of, traditionally, what you would see with on-prem versus cloud: you own the base and rent the peak. I'm starting to see people with EKS or ECS clusters, or running kops or whatever it is to run Kubernetes, but then having a burstability story that goes to Fargate, since it's instantly available, it scales effectively forever and the only real downside is a bit of cost at stupendous scale.
Christopher: I think with Fargate, the flexibility that you get from being able to scale quickly just outweighs any cost impact in my opinion. With EC2, even with optimized ECS AMIs, you're still looking at a minute or a minute and a half just for an instance to be available and ready for traffic, and that doesn't even include starting a container whereas, with a lot of benchmarking with Fargate, we were benchmarking 5-second start times, if not less. Having a container not exist and be ready in 5 seconds, to me, given our workload, outweighs the financial impact.
Corey: Do you find that instant-on experience works just as well for one or two containers as it does for dozens, hundreds, thousands, et cetera, or are there certain tipping points where it no longer is able to deliver reasonable fleet performance on demand? I'm not talking about service limits; I'm just talking about raw capacity. As the old joke goes, "The cloud is not infinitely scalable. Source: tried it." At what point do you wind up seeing inflections, if any, or aren't they really manifesting in the service?
Christopher: They haven't manifested for us yet. I assume, given all of the things in AWS, that it will eventually manifest. Luckily, we haven't hit that problem yet.
Corey: Yeah, it turns out that all things are finite at a large-enough scale. This is not, incidentally, intended to sound, in any way, shape or form, like a ding on Fargate. When someone approaches you with a new service and says, "Here you go. It's awesome," I have an ops background. My immediate question is, "Terrific. Where's it going to break?" If you don't know and understand what the failure modes look like, you're in for a bad time when your customers discover them, and they will discover them.
Christopher: I absolutely agree. We ran our Fargate deployments through a lot of load tests, just trying to break it, basically trying to see when we started seeing issues. All things considered and the amount of time we wanted to put into it, we were not actually able to break it from an error that was AWS-related.
Corey: Right, and I think that there's a lot of challenge as far as trying to understand, "Okay, is this something that's local to my account? Is it local to this particular availability zone? Is it local to the service itself?" Were you in the pre-announced beta period where it was just limited to a few customers? Were you using it just from Day One where it went GA, was there something else or am I not allowed to ask you that question?
Christopher: I don't actually know exactly, but we were in the beta period, though only maybe a couple of weeks before it went GA.
Corey: By every account that I've been able to get, Fargate is awesome. My single complaint with it is that its name is absolutely terrible. It's almost like a codename that snuck out into the real world. If I tell someone I'm using Fargate, everyone looks at me blankly unless they know exactly what it is. There's no good way to infer a name from it such as Simple Storage Service. Well, if I've never heard of S3, I can probably ferret out what that means. With Fargate, give up. There's no good way to get there from first principles.
Christopher: I always go immediately to Stargate, which I assume most other nerds will recall.
Corey: Oh, thank God, it's not just me. Something else that was of note in your blog post was that the queues that exist between your components use Kafka for communication between all these different pieces. Now, let me qualify this. I am not at all interested in starting a religious war over what is the chosen queue and what is awful and only used by heretics, but I will ask this: Was it a difficult decision arriving at, "Will we use Kafka?" or was it a relatively straightforward shot?
Christopher: It was relatively straightforward. We try and be fairly agnostic and also not incite our own holy wars. We use a number of other queue services internally. I think Kafka just made the most sense for this specific workload–stream-based ingestion–so it wasn't too difficult of a decision.
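One reason Kafka suits stream-based media ingestion is key-based partitioning: keying every chunk by its recording ID keeps all of one file's chunks on one partition, so downstream engines see them in order. A minimal sketch of that idea (the function and IDs are illustrative; Kafka's actual default partitioner uses murmur2, not CRC32):

```python
import zlib

def partition_for(recording_id: str, num_partitions: int) -> int:
    """Map a recording ID to a partition with a stable hash, so every
    chunk of that recording lands on the same partition and stays ordered.
    (Python's built-in hash() is salted per process, hence zlib.crc32.)"""
    return zlib.crc32(recording_id.encode("utf-8")) % num_partitions

# All chunks of "recording-42" share one partition; other recordings
# spread across the remaining partitions for parallelism.
p = partition_for("recording-42", num_partitions=8)
assert p == partition_for("recording-42", num_partitions=8)
```

With a real producer library you'd get this for free by passing the recording ID as the message key; the sketch just shows why keying matters for ordered, per-file stream processing.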
Corey: There's a lot of noise these days about picking only things that you can pick up and move as they are to another cloud provider. Looking at what you've built, I'm not entirely sure what that would even begin to look like. Was avoiding provider lock-in in any way, shape or form on your strategic roadmap or was it, "Well, if we ever have to move, we'll deal with it then," or, "Did I just cause a whole bunch of executives to go completely white?" as they realized, "Oh, my word, we're locked in."?
Christopher: Veritone is very cognizant of vendor lock-in. We actually have an offering of our product that you can ship and run in your own datacenter. We're very cognizant of making sure that when we write code that's specific to a technology like Fargate, we write it very small and use shims and make the actual integration as decoupled as possible. For example, after we did the Fargate deployment, we reworked a lot of the APIs that use Fargate to use other things like Kubernetes or Docker Swarm as well.
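The shim approach Christopher describes is essentially the adapter pattern: application code depends on a small interface, and each scheduler gets its own thin implementation behind it. A hypothetical sketch (class names and spec shapes are illustrative, not Veritone's API):

```python
from abc import ABC, abstractmethod

class ContainerLauncher(ABC):
    """Thin shim: callers depend on this interface, never on one scheduler."""
    @abstractmethod
    def launch_spec(self, image: str, cpu: int, memory_mb: int) -> dict: ...

class FargateLauncher(ContainerLauncher):
    def launch_spec(self, image, cpu, memory_mb):
        # Shape loosely modeled on an ECS RunTask request (illustrative only).
        return {"launchType": "FARGATE", "image": image,
                "cpu": str(cpu), "memory": str(memory_mb)}

class KubernetesLauncher(ContainerLauncher):
    def launch_spec(self, image, cpu, memory_mb):
        # Shape loosely modeled on a Kubernetes Pod spec (illustrative only).
        return {"kind": "Pod", "spec": {"containers": [{
            "image": image,
            "resources": {"limits": {"cpu": str(cpu),
                                     "memory": f"{memory_mb}Mi"}}}]}}

def launch(backend: ContainerLauncher, image: str) -> dict:
    # Application code only touches the shim; swapping Fargate for
    # Kubernetes or Docker Swarm is a one-line change at the call site.
    return backend.launch_spec(image, cpu=256, memory_mb=512)
```

This is the "write it very small" idea: the provider-specific surface area is one class, so the first-generation Fargate integration can later grow Kubernetes or Swarm siblings without reworking callers.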
Corey: I like the model because you're starting off with something that embraces whatever the providers are offering and then you go back and add shim layers that wind up making it portable if you need to. If you're going to be targeting the idea of being provider-agnostic, especially as you need to be as you're meeting your customers where they are in your use case, it makes perfect sense. That's why it's the best practice, not, "You must always do this." I think that's a terrific architectural model. First-generation, embrace whatever it is the provider gives you. Generation Two, "Let's see what we can do to decouple this in some areas where it makes sense."
Christopher: I think a lot of people get lost spending too much time and energy on being agnostic to their technology, and I think it's important but I also think that you get to a certain point where you're giving up all the benefits that you might have gained by using that technology just to be agnostic. At that point, to me, it doesn't make a lot of sense so I like to try and design things around best use case for the workload.
Corey: That's absolutely the right move. "Wait, what do you mean you're not doing something that's architecturally perfect in favor of chasing down something you'll never need to implement?" People like to focus on the wrong part of the story. Your blog post originally appeared on the Veritone corporate blog and, as someone who writes an awful lot of blog posts myself, this is of personal interest to me. Your blog was invited to have a guest spot on the AWS Architecture Blog. My blog posts generally get threatened with cease and desist letters if I go too far. How did you wind up getting your post featured on something that is an AWS property?
Christopher: We're a pretty large customer for AWS. We are large in the sense that we give them money, not large in the sense of comparing to other AWS customers. We're big enough that they pay attention to us, so I think that they noticed when we started using Fargate. Our solutions architects and TAMs all reached out and were like, "Hey, we see you guys are using this new technology. What are you using it for? We're really interested in your use case." It just set up some conversations with their product managers and lead architects around Fargate where we kind of walked through what we're building and they asked us if we'd be interested in co-writing a blog for the AWS Architecture series.
Corey: What was that process like? Was it essentially, "Here you go. Here's a blog post that we wrote," and they said, "Cool," and published it as is and it surprised you? Was there a 15-round revision process? Sorry. For those of us who dream of, one day, seeing our name up in lights, it's interesting to understand what it is to go through that process.
Christopher: It actually was surprising to me because our internal processes took quite a lot longer than theirs. We did a lot of review internally with our marketing and legal teams before we sent it to AWS just to make sure that we had all of our bases covered and we were talking about things in the right way. By the time we sent it to AWS, they actually had no revisions for us. It was just a waiting period for them to find the right time and blog series to send it out with. From the time we gave it to them, there was not really a lot of back-and-forth until they told us, "Hey, your blog's being published."
Corey: It's nice to wind up having it just sail through like that. I generally tend to not write in a style that lends itself to that.
Christopher: I think we just got lucky.
Corey: To that end, anytime I've given a talk or written a blog post about a technical solution or an architecture I was proud of, if I've then taken that post and I go and show it to some of my coworkers who worked with me on building that thing in the first place, their response is, "Yeah, it's a great piece of fiction you wrote there but that's not the project that I remember," and they're right. I'm of the personality type where I will block out some of the negative issues mostly to keep myself from waking up in the night, screaming, but it's always sort of a glossy, polished, final version. If you follow a lot of other blogs that discuss similar things, this is a common pattern.
There's generally some form of wishful thinking, and polishing it up, and, "Oh, it's easy. We just sat down at our computers one day and that was 9 o'clock in the morning. By lunchtime, we had this architecture that appears in this blog post." I don't care if you're writing, "Hello, World." It's never that simple or easy to pull off. Can you talk a little bit about behind-the-scenes? What were the pain points as you were building this out? What didn't go according to plan? What could have worked but didn't and needed to be worked around in some way?
Christopher: Sure. I think the biggest issues we really had came down to how new Fargate is and our own unfamiliarity with it–or laziness, at that point, about reading the documentation–and running into things like limit issues. We were basically requesting too many containers to be launched concurrently, and AWS had to slap us on the wrist and tell us to stop. Luckily, we had some really good conversations with the engineering team and the service team for Fargate and we were able to get a lot of these limits increased, but that back-and-forth and kind of not knowing what was going on or why things were breaking was definitely a pain point.
I think another pain point that we ran into with Fargate specifically was it doesn't handle large containers very well. With a lot of AI engines, you have these really, really big flat files that are 6 gigs. Trying to launch a 6-gig container in Fargate–if anyone figures out how to do that, please reach out to me and let me hear about it. For us, comparing a regular Go container that's 5 to 10 megs to a 6-gig container was like 5 seconds compared to 15 minutes to launch containers. It was very, very slow and painful, and we actually ended up not being able to use Fargate for some of our larger containers.
Corey: This is far from an isolated occurrence, incidentally. I've spoken with other clients of mine who were in similar situations, and their question is, "Great, so how do I go about effectively launching a 10-gigabyte container using–" I don't even need to listen to the rest of that sentence, because the only answer, almost regardless of technology provider, is, "You don't launch a container that's that large unless you have no other option, because it's not going to be performant; getting it out to where it needs to go takes forever," and a whole host of things that arise from the idea that containers are envisioned, for better or worse, as relatively lightweight, thin things that wind up being tossed around to do a bunch of things at the same time–not, "Well, okay, it's easier for us to deploy our container via one of those Amazon Snowmobile trucks that has 100 petabytes of storage in the back, because it takes too long to get it out there over the network." At some tipping point, this is, in some ways, the wrong tool for the job as it's currently being imagined.
Christopher: I agree. One of the reasons we use containers for everything including these engines that maybe it doesn't make sense for–Veritone actually has another service called Veritone Developer Application or VDA, and this allows anyone in the world, you and me, to go write an AI engine or any type of engine that you'd like and upload it into the Veritone system. If you want an engine that can tell you, "Hotdog or not hotdog?" very specifically, you can write one and upload it to Veritone and then use that engine later to go compare hotdogs. Given the nature of all the different developers that would be submitting code into our platform, we needed some sort of common technology that would allow us to ingest and deploy the engines that they submit to us in a predictable and similar fashion. Docker was the obvious choice.
Corey: I think that you're probably right based upon this. The challenge, of course, is always trying to disambiguate the hype from what people are actually doing and how they're approaching things. It's, "Everyone says I should be using this particular technology," and that technology incidentally changes from week to week. It's virtualization; it's cloud; it's containers; it's Kubernetes; it's serverless; it's, "Wait 20 minutes. We'll have another one of these." Making sure the problem you have looks an awful lot like the one that the tool is aimed at is usually the step that some people tend to gloss over. To that end, what advice would you give someone who read your blog post, was entranced by it and is determined to follow in your architectural footsteps?
Christopher: Don't be afraid to say, "Help," because we failed numerous times trying to build this, the first couple of go-arounds. Just go in, dive into it and you'll be surprised with how powerful technology can be. Fargate is a pretty cool tool and I expect it to evolve to be one of the base services over the next few years.
Corey: I suspect you're probably not going to be wrong about that. Thank you so much for being so generous with your time. Christopher Stobie of Veritone, I'm Corey Quinn and this is Screaming in the Cloud.