Welcome, Together AI!

Rob Keith and Emily Zhao
March 13, 2024
  • Founders: Vipul Ved Prakash, Ce Zhang, Chris Ré, Percy Liang
  • Sector: Artificial Intelligence
  • Location: San Francisco, CA

The Opportunity

2023 was a year of incredible growth and innovation in open-source AI. The year started with a meaningful gap in performance between open-source and closed-source models, but the paradigm quickly shifted when Meta introduced Llama in February, catalyzing the emergence of a host of more powerful open-source models like Falcon and Mistral/Mixtral. At Salesforce Ventures we’re big believers in open source, having worked with and invested in numerous companies focused on open source, including Hugging Face, Airbyte, Astronomer, dbt, Starburst, and MuleSoft.

We believe the AI world will be hybrid in the future, and that companies will utilize both closed-source and open-source technologies. Furthermore, having a powerful model is no longer enough. To fully unlock adoption, we need an entire stack built around the models to enable the right balance between performance and cost. With this context in mind, we searched for the right team with a bold vision that could build the best infrastructure for training and running open-source models in production, which is why we’re beyond excited to announce that we’re leading a new round in Together AI. 

The Solution

Together AI is capturing the market at exactly the right moment: open source is taking off, and the need for performant, cost-effective GPU infrastructure is more acute than ever. While the GPU hardware supply shortage is a prominent factor, it’s not the only one driving the “why now.”

The GPU supply crunch makes it difficult for companies from all market segments to get their hands on GPUs. This shortage is exacerbated for companies that don’t have long-term relationships or big contracts with hyperscalers or cloud providers, given that’s where most GPU supply currently resides. 

Even for companies that do have these relationships, training and serving models well in production presents a different set of challenges. Running training and inference workloads involves many layers of complexity to meet the criteria users care about: low latency, high throughput, low failure rates during training, and efficient training cycles. As such, there’s a growing desire for an abstraction layer that automates training, fine-tuning, and deployment so customers don’t have to spin up their own infrastructure or hire teams with backgrounds in AI or systems engineering.

Together AI captures customers’ end-to-end needs for AI workloads through its full-stack approach. The company offers both compute and software, packaged in products that vary in the degree of software abstraction layered on top of the hardware. For customers that want dedicated clusters for training (or inference), Together AI offers GPU Clusters composed of high-end compute and networking, plus a software layer that incorporates the optimizations users need to run AI workloads efficiently without managing infrastructure. The Together AI team has also led the development of efficient kernels like FlashAttention for training, along with a number of techniques that speed up inference of transformer models, extracting 2-8 times more work from GPUs for generative AI workloads.

For customers that want an additional layer of abstraction, typically for model deployments, Together AI supports an extensive list of open-source models (e.g., Llama-2, Code Llama, Falcon, Vicuna, Mistral/Mixtral, SDXL) as well as its own models (e.g., RedPajama, StripedHyena, Evo) that are at the frontier of model architecture research. Together AI will also soon let customers bring their own models. Customers can host these models using dedicated API instances built on top of the Together Inference Engine, which is optimized with both open-source and proprietary techniques. What’s more, Together AI will soon be able to deploy in customers’ own cloud environments. The ability to bring their own models and deploy in their own environments will open up numerous use cases for enterprise customers.
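To make the hosted-model experience concrete, here is a minimal sketch of calling one of these open-source models through an OpenAI-compatible chat-completions request. The endpoint URL and model identifier below are illustrative assumptions, not taken from this post, and may differ from Together AI’s current API:

```python
import json
import os
import urllib.request

# Assumption: Together AI exposes an OpenAI-compatible chat-completions
# endpoint at this URL; both URL and model name are illustrative.
API_URL = "https://api.together.xyz/v1/chat/completions"


def build_chat_request(model: str, prompt: str, max_tokens: int = 256) -> dict:
    """Build an OpenAI-compatible chat-completions payload."""
    return {
        "model": model,
        "messages": [{"role": "user", "content": prompt}],
        "max_tokens": max_tokens,
    }


payload = build_chat_request(
    model="mistralai/Mixtral-8x7B-Instruct-v0.1",  # illustrative hosted OSS model
    prompt="Summarize FlashAttention in one sentence.",
)

# Only hit the network if an API key is configured; otherwise just
# print the payload so the sketch runs without credentials.
api_key = os.environ.get("TOGETHER_API_KEY")
if api_key:
    req = urllib.request.Request(
        API_URL,
        data=json.dumps(payload).encode(),
        headers={
            "Authorization": f"Bearer {api_key}",
            "Content-Type": "application/json",
        },
    )
    with urllib.request.urlopen(req) as resp:
        print(json.load(resp)["choices"][0]["message"]["content"])
else:
    print(json.dumps(payload, indent=2))
```

The same request shape would apply whether the model is served from a shared serverless endpoint or a dedicated API instance; only the capacity behind it changes.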

Together AI also offers fine-tuning as part of this product, an attractive value-add for open-source use cases. Customers with specific needs can choose to build their own Custom Models, leveraging the Together AI software and hardware stack as well as support from Together AI’s technical team in finding the optimal data design, architecture, and training recipe.

Why We’re Backing Together AI

We first met the CEO and co-founder of Together AI, Vipul Ved Prakash, back in the summer of 2023. It didn’t take long to realize Vipul had built something special, starting with an outstanding founding team. Both Vipul and co-founder Chris Ré are serial entrepreneurs with previous successful exits. Chris and fellow co-founder Percy Liang are also among the leading researchers in AI model architectures and infrastructure. Ce Zhang (CTO and co-founder) is an expert in decentralized and distributed training/learning as well as data-centric MLOps. The team’s combined expertise and experience suggest they can build proprietary IP around optimizing the AI infrastructure stack, as well as scale talent and functional teams.

We believe Together AI’s “full-stack” product approach is the right one for the open-source AI market. By offering a suite of products that vary in how close they are to the underlying hardware, from dedicated clusters to serverless endpoints, Together AI captures demand from all types of workloads: training, fine-tuning, and inference. There is no inference without training or fine-tuning, and we think customers will increasingly want to deploy where they fine-tune and train their models, especially given some of the features the Together AI team is working on to increase reliability, privacy, and security.

As training and inference workloads ebb and flow, Together AI can shift freed-up GPU resources across different types of workloads, allowing the company to scale efficiently. In this sense, Together AI straddles two distinct categories of companies that have emerged in this new AI era: specialized GPU cloud providers and serverless endpoint / LLM Ops platforms. We like this approach: it provides differentiated software on top of hardware compared to pure GPU cloud providers, and more control over hardware utilization and optimizations compared to pure serverless endpoint / LLM Ops companies.

Together AI is also, at its core, an AI research company. Two of Together AI’s co-founders are bona fide luminaries in the AI research community. In addition to having Ce Zhang as the CTO, Together AI also brought onboard Tri Dao, the creator of FlashAttention and FlashAttention-2, as Chief Scientist in summer 2023. The power behind Together AI’s products lies in the company’s ability to rapidly bring research innovations to production.

In the spirit of open source, Together AI has been prolific in publishing optimization techniques (FlashDecoding, Medusa, CocktailSGD) and model research (RedPajama, StripedHyena) that are garnering a lot of attention from the broader AI community. The research and models are great tools to drive demand to Together AI’s platform, creating a flywheel for growth. Together AI’s continuous research capability differentiates its platform from other GPU cloud / serverless endpoint platforms and hyperscalers, cementing a longer-term technical moat.

What’s ahead?

Amid the litany of 2024 AI predictions circulating online, one common thread we keep seeing is that 2024 will be the year customers go from training models to deploying them in production. While we believe this is directionally correct, we suspect the process will be much slower and more gradual than people anticipate. There’s still customer education to be done, and infrastructure for model deployment needs to continue to improve.

Together AI is not the only company that has realized this: the field of GPU cloud / serverless endpoint platforms keeps expanding to fill the need for better infrastructure. This entrepreneurial enthusiasm speaks to the immense market opportunity founders see in this space. Vipul and his team have a bold long-term vision that requires precise and vigorous execution. We are confident they will realize their vision, and Salesforce Ventures will be there to support them every step of the way.

We’re thrilled to partner with Vipul, Ce, Chris, Percy, and the rest of the Together AI team. Together, we will bring the fastest and most performant cloud in generative AI to more users who are pushing the frontier of AI development. Welcome to the Salesforce Ventures family!