Unveiling Sai Computing: The Future of GPU-Powered Cloud Solutions by Applied Digital
Promoted content from Applied Digital on MarketScale.
Jason Zang, co-founder of Applied Digital, introduced Sai Computing, the company's newest subsidiary, at a recent event. Launched in May 2023, Sai Computing specializes in GPU cloud computing and caters predominantly to the high-performance computing and AI sectors. Its offerings range from long-term, large-scale GPU deployments under the “Reserve Compute” category to more flexible, short-term options under “Burst and Short-term Compute”. The company has also updated its GPU portfolio, which now includes both older and the latest GPU models. This expansion illustrates Applied Digital’s commitment to supporting the rapid advancements in the AI and computing industry.
Video Transcript
Hi, everyone. My name is Jason Zang. I'm one of the co-founders of Applied Digital. I'll be sharing an update and an overview of Sai Computing, which is our new cloud services business. We started incubating and building the fundamental parts of this business in 2022 and officially launched this business model in May of 2023. So Sai Computing is a wholly owned subsidiary of Applied Digital, and we offer specialized computing around GPUs. We offer GPU cloud computing to end markets in high-performance computing and artificial intelligence, and the variety of other use cases that our end users are using the cloud computing for. But it's mainly focused on GPU compute resources and helping the ever-growing need in HPC.
The cloud services offerings are focused on three major areas. So we have reserve compute, which is focused on much longer durations and much larger quantities of GPUs. Typically what we do here is a six-month minimum contract length, all the way up to five to six years. And it's typically a large training cluster that we deploy on behalf of our customers, and they're using these for large-scale language model training or other types of model training workloads. We also offer other types of compute contracts: we have burst compute and also short-term compute. Again, you can think of these as much shorter in terms of contract duration, and also some on-demand capacity through our ecosystem partners that allows users to test and run shorter-term compute workloads using our hardware and our infrastructure.
There are a variety of GPUs that are part of our portfolio offerings.
We have the typical older-generation A40s, A6000s, and A100s, which were kind of the quintessential GPUs in 2022 and preceding years, but 2023 has been really focused on H100s, and our deployments of H100s have really ramped up since June of this year. For future offerings, we're now working closely with Nvidia to explore the deployments of the Grace Hopper, which is the GH200, and Hopper Next, which is the generation beyond GH200. We're also currently in the midst of deploying the L40S, which is a replatform of the L40, and it's going to be much more benchmarked to A100 performance, but specifically used for inference workloads. So we're working with our customers and Nvidia to do test unit deployments of these right now so that we can scale these out for inference workloads.
So the reason why we have these different types of contract lengths and durations and different types of deployments is because there's never a one-size-fits-all for all the end users. There are some customers who would much rather have a very large cluster that they build out. And in order for us to commit to such a large CapEx expenditure on the equipment side, of course it warrants a larger contract value and a longer contract duration. So those types of consumers of compute are of course going to be different than your on-demand type of needs, right, where sometimes you need burst capacity for certain workloads that are a couple of hours or even just a couple of days. Instead of committing to a reserve compute that is multi-year, sometimes you can just be in the market and absorb the burst capacity that is available in the market.
So the on-demand, burst, and shorter-term capacity is also very important because, one, it allows customers to get a glimpse of our offerings, but it also allows them to have not as much upfront commitment, where they might not be as deep-pocketed as some of the large reserve customers. So again, it's a good mix so that we can better serve our end users. On the short-term type contracts, we've partnered with a variety of different platform creators and software developers who are developing platforms to better execute and increase the utilization of these GPUs when they're in idle mode. So in these instances, we can increase the utilization of existing equipment but also offer attractive pricing and attractive deals for smaller customers who are just getting ramped up and exposed to GPUs.
So in the GPU cloud revolution, or this dynamic that has really taken off in the last twelve months, we've seen that location-specific workloads are less and less common, because these types of very compute-intensive workloads tend to be location agnostic. So it allows us, as a company that was previously very focused on finding power and then building out computational resources, really to play to our strengths. Right? We can go out and find locations where the power availability and the cost of delivering that compute is much more attractive than your typical computational epicenters, like in the Bay Area or on the East Coast around Virginia. So we have really put together a great offering from a geographical perspective, having locations around the Midwest and also the Mountain US regions, where we can take advantage of ample amounts of power and deliver that power into computational resources in a more effective and cost-effective way for our customers. And because these workloads are a lot less location specific, we can do that and take advantage of these opportunities.
And as we are building out this cloud services offering, a very big component of the cost model is, of course, the equipment and the facility. Right? We've partnered very closely with the largest equipment manufacturers and, of course, Nvidia themselves, which provides the GPUs and the networking equipment to build out these clusters and these deployments. So we have a very close partnership with Supermicro. We also have very large orders in place with HPE and Dell, in addition to Supermicro.
The key areas of differentiation for our GPU cloud services are in the following aspects. One, we've been one of the few cloud providers that have deployed H100s with InfiniBand networking at scale. We are deploying a couple of clusters that range from 3,000 to even 8,000 H100 GPUs in one location. And these clusters are some of the first clusters in the world that are being deployed by Nvidia customers. So we're very fortunate to be one of the first to deploy these, but also to be working closely with our customers to work through a lot of the things that come with deploying cutting-edge technology. We're also one of the only cloud providers that offers bare metal. In a bare metal instance, we hand over access to the actual servers to our end users and allow them to really control as much access as they would like to have when it comes to provisioning and utilizing the equipment. We also have a team of very experienced HPC engineers and storage and networking experts that help support our end users in these deployments. Again, we're doing something that has been done by very few companies in the world. And we need to be at the bleeding edge, helping our customers figure out a lot of the early kinks that need to be figured out when you're deploying this cutting-edge equipment. The last point is on vertical integration.
I'd like to touch on how Applied Digital's core business, which is building data centers, fits with the fact that Sai Computing is now building one of the largest and fastest-growing GPU-cloud-specific operators in the world: we can use Applied Digital to build GPU-specific facilities for Sai Computing. This allows us to remedy a very important constraint in the market, which is data center capacity. As we grow Sai Computing, we can deploy those GPUs in our own facilities. That allows us to, again, be a lot more flexible on how we deploy, what size we deploy, and on what timelines we deploy these types of clusters for our end users, and we're not beholden to a third party that we work with or a third party that we have to contract capacity with.
On the product road map, we are working today, again, in the bare metal offerings, where we provide the facility, provide the equipment, and then hand over access to those machines to our end users. But as we scale out our business and our services offering, we'll start to have a lot more virtualization, containerization, and orchestration tools that we will be building on top of the bare metal offerings. Again, these are additional offerings that continue to refine and improve the product offering, but it's not anything that's holding us back today. A lot of our deployments today are bare metal deployments, and our customers are very satisfied with that because we are deploying with end users that typically are a lot more sophisticated and also have internal infrastructure teams. So bare metal access is what they prefer and what they work well with.
So as I mentioned before, we started the business and started incubating the idea in 2022, but didn't really launch it until May of 2023. And the catalyst for that was, of course, our signing of our first large contract with Character.AI.
And that relationship has blossomed and expanded from that initial contract that we signed for 5,000 H100 GPUs. So Character.AI, right off the bat, was backed by some of the largest companies and VCs in the world, such as Google and a16z. They raised $150 million at a billion-dollar valuation pre-product. And it has been absolutely amazing working with them to deploy one of the largest training clusters focused on Nvidia H100 technology in the last couple of months.
Here's a recap of what has unfolded in the last couple of months since we started working with Character.AI. We initially signed the first compute contract with them at the end of May, and we started deploying that first cluster for them in June. This is unheard of in terms of the turnaround and the speed. We worked very closely with Supermicro and Nvidia to deploy, as Character.AI was a very key strategic account for Nvidia. And Noam, of course, has connections all throughout Nvidia's leadership. We were able to deploy that first 1,000-GPU cluster for them within the first month of us signing that contract. And since then, we've scaled up our commitment from Character.AI all the way to 10,000 GPUs, and we're now expanding to 16,000 GPUs and beyond for 2024. So again, a very good example of how we've landed a key account, deployed and executed for them, and then expanded that relationship over time.
So in the last twelve months, we've seen HPC really grow from a very niche offering to something that is top of mind for everyone. With this AI boom and generative AI really taking over everything that we've seen, from business applications to consumer applications, we've really seen the demand for the fundamental layer that powers a lot of that really explode. Right?
Because all of these applications and all of these new models and new technology are based fundamentally on the equipment, the facilities, and the computational resources that power them. We're lucky to be at the ground floor of all of this. Having built a GPU cloud business in a matter of a couple of months, where usually this takes many years, if not decades, to build, has been quite humbling for me to see, and I've been super thrilled with the team that we've assembled to help pull these offerings together and deliver them to the market. And we've been overwhelmed by the amount of demand and interest from generative AI companies, large tech companies, research institutions, and all types of different end users, so much so that we've seen our demand forecast to Nvidia really skyrocket from a couple thousand GPUs to now, where we're deploying 30,000-plus GPUs before mid-2024.
We started Applied Digital two and a half years ago, and we've seen that business really skyrocket and grow into something that is hardly reminiscent of where we started on day one. Sai Computing is no different. We've only been at it for four or five months, but we've already seen a lot of traction in the market. We've basically built up a whole new business segment within Applied Digital in a matter of months, and we're super excited to see what the future holds for this business segment, but also for Applied Digital broadly.