Over the past seven months, our team has met with more than 50 researchers and founders in robotics — from those building robotics foundation models (RFMs), to full-stack robots, and the tooling that supports them. Robotics is by far one of the hardest technical disciplines we’ve ever encountered, and there’s more to learn every day.
Through our conversations and independent research, one thing has become clear: unlike in the world of AI, there are no widely agreed-upon benchmarks for evaluating robotic capabilities and performance yet. This lack of standardization is a challenge echoed by nearly every founder and robotics investor we’ve spoken with. AI and robotics luminary Fei-Fei Li told us she’s waiting for just one investor to write the definitive guide on how to assess robotics startups.
Well, Fei-Fei — challenge accepted.
In the first article in our robotics series, we explained why the timing is finally right for robotics to break into the mainstream. This second piece lays out how we intend to support that breakthrough: our investment framework for evaluating robotics companies.
As we’ve said before, robotics is a long game. Building truly capable, safe, and scalable robotics systems will require patient capital, a long-term view, and deep conviction. By sharing our robotics investment framework, we aim to help equip our fellow investors with strategies and principles to ensure the best robotics companies get funded, and the entire robotics ecosystem thrives.
Without further ado, here are the 7 criteria we believe every investor should consider when evaluating robotics companies.
1) The Right Talent in Each Discipline
One reason robotics is such a technically complex domain is that so many variables and disciplines need to come together for it to work. Robotics is an AI problem, a model/robot control policy problem, a hardware design problem, a manufacturing problem, a supply chain problem, and an engineering problem. Decisions around strategy and hiring need to be made early for many of these disciplines, and pivoting could result in a lot of wasted time and effort.
For example, if you’re a robotics foundation model company building a generalist model that can tackle both dexterous manipulation and robust locomotion, you would need to decide early on:
Your data collection strategy (which we discussed at length in the previous article) given manipulation and locomotion often require different types of data,
Your main training strategy — are you focusing on adaptive reinforcement learning or imitation learning, or both?
Your hardware strategy — are you going to build your own custom hardware to collect data or partner with hardware providers?
What are the first skills and use cases you are going to build for, and how will you sequence them?
The list goes on, but just based on these few questions, this specific company might need to hire early for:
An AI research leader with foundation model training experience (and perhaps an RL background),
An experienced roboticist/robotics researcher who has locomotion experience (and maybe also someone who knows how to use simulation data),
An experienced roboticist/robotics researcher who has manipulation experience (and likely someone who has built a teleoperation team before),
A hardware expert who can do robotics hardware design, and
An operations leader who has managed and scaled supply chain and manufacturing before.
Further, looking at the pure pedigree of the team often isn’t enough. As investors, we should also consider the philosophy and long-term vision these leaders and key hires bring to the table. Given how quickly the industry is moving (as is every industry touched by AI), leaders may come from a prestigious academic institution or a well-known robotics company, but still possess an outdated view of the industry. After all, robotics as an industry has been around since the 1960s, but the discipline has evolved countless times since then. Investors should assess whether company leaders are embracing the latest innovations in robotics and AI. Are they building the model architecture to be modern and generalizable? Are they using the latest training methodologies? Are they thinking 5-10 years ahead?
Given the technical ambiguity in robotics, first principles thinking is a must.
2) Demo Videos
Demos are usually the best way to assess a robot’s capability. However, robotics companies have a long, notorious history of producing misleading demo videos of their product. To discern true capability — especially for non-technical individuals — it helps to ask a few critical questions about the demo you’re watching:
Is it staged? Staged demos can hide the robot’s limitations. Ensure that the demo reflects a realistic and dynamic environment, not a controlled and static one. Also be wary of quick cuts or camera skips in the video, as this could be further evidence the demo is staged. One way to discern if a video is staged is if items are cleverly placed in the environment (e.g., evenly spaced out, or all facing a certain direction). Items in a demo can be positioned in a way that makes it easier for the robot to pick them up or navigate around them (i.e., the robot has previously mapped them out). This can give a false impression of the robot’s capabilities. Look for scenarios where items are randomly placed or where elements in an environment are constantly moving to see how the robot handles real-world conditions. This will give you a sense of how robust the robot’s capabilities are.
Is the video sped up? Many videos will be sped up to some extent because achieving low latency is quite challenging given constraints on hardware, compute, and the AI model/software. Some companies will be candid and clearly label when a video is sped up; others will not.
Is it partially teleoperated? In video demos, the robots can also be teleoperated — sometimes only partially — with the teleoperator off-camera, leading the unsuspecting viewer to think the robot is autonomously doing something complex.
When possible, see the robots in person — it’s one thing to view a demo video but another to witness in-person how robust the robot’s capabilities are. Can the robot recover when interfered with? Does it only work in specific environments or can it navigate unseen landscapes and scenarios? It helps to see with your own eyes vs. watching it in a demo video.
3) Performance, Quality, and Robustness
Unlike LLMs, where standardized benchmarks like MMLU, HumanEval, or SWE-Bench (for coding) provide a common yardstick, robotics has no universally accepted framework for comparing models across companies. This makes it harder to assess how “good” a model really is. As an investor or evaluator, it’s important to understand how a company defines success and measures performance, both internally and against peers. If a company is focused on teaching their robots a certain task (e.g., pick and place), you might ask:
How many objects can the robot pick and place in an hour? How has this improved over time?
Is human intervention required?
How long can the robot operate autonomously before human intervention is needed (the question behind the question is, can the robot do long-horizon tasks autonomously)?
What is the task success rate or, conversely, the failure rate?
When we discussed model benchmarking with various robotics teams, most of them categorized the above questions into three pillars:
Quality: How accurately can the robot do a certain task? (e.g., if a robot is folding linens for a hotel chain, how accurately is the robot following the guidelines laid out by the hotel?).
Throughput: In the same use case, how many linens can the robot fold in a given shift? Is this similar to human speed or faster?
Robustness: Does the robot require human intervention? Can it perform long-horizon tasks without going out of distribution (and thus requiring human intervention)? Note that a human shift is 8 hours and involves breaks, whereas a sufficiently robust robot can work continuously for two or three shifts (i.e., around the clock) provided it has the requisite battery power or energy access.
These kinds of metrics not only reveal the robot’s current capabilities, but also give insight into the pace of model and systems improvements over time — a key signal of a robotics team’s research velocity and feedback loop maturity.
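To make these pillars concrete, here is a minimal sketch (in Python) of how a diligence team might roll a robot’s raw task logs up into the three metrics above. The log schema, field names, and numbers are our own illustrative assumptions, not any company’s actual telemetry format.

```python
from dataclasses import dataclass

@dataclass
class TaskAttempt:
    # One logged attempt at a task by a deployed robot (hypothetical schema)
    succeeded: bool            # did the attempt meet the customer's quality spec?
    duration_s: float          # wall-clock seconds for the attempt
    needed_intervention: bool  # did a human have to step in?

def summarize(attempts: list[TaskAttempt]) -> dict:
    """Roll raw task logs up into the three pillars: quality, throughput, robustness."""
    n = len(attempts)
    total_hours = sum(a.duration_s for a in attempts) / 3600
    return {
        "quality": sum(a.succeeded for a in attempts) / n,                  # task success rate
        "throughput_per_hour": n / total_hours,                             # completed attempts per hour
        "robustness": 1 - sum(a.needed_intervention for a in attempts) / n, # share needing no human help
    }

# Illustrative example: 100 linen folds, 95 successful, 12 needing help, ~45 s each
logs = [TaskAttempt(i < 95, 45.0, i < 12) for i in range(100)]
print(summarize(logs))  # {'quality': 0.95, 'throughput_per_hour': 80.0, 'robustness': 0.88}
```

Tracking numbers like these across releases is also a simple way to see whether quality, throughput, and robustness are actually improving over time.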
4) Training and Deployments
The time and effort required to train a robot for specific tasks can significantly impact deployment cycles and time to market. Here are a few key points to consider when evaluating training and robotic deployments:
How long does it take to train the robot for a specific task? How long does it take to train for incremental tasks (e.g., a similar task vs. a net new task)? For example, how quickly can a robot that has been trained to fold napkins learn to start folding laundry? What about something that has nothing to do with clothes, like preparing food, or laying bricks? A robot that requires extensive training to get to the next incremental task is not as cost-effective or scalable, and is also indicative of a lack of generalizability. Conversely, a robot that can quickly learn incremental tasks exemplifies a strong robotics foundation model.
Has the robot been trained in this environment before? Repeated exposure to an environment can improve a robot’s performance. Understanding the training process and the robot’s history in different environments is crucial for assessing its capabilities. If a robot is doing a known task in a new environment that it hasn’t seen before, can the robot adapt, or is additional training required?
How long will it take to fully deploy the robot and what is the process involved? Many robotic solutions fail due to unsuccessful or unscalable deployment (we discussed this matter in detail in the previous article). Proper deployment requires a lot of coordination, either from the company’s in-house team or a system integrator. In both cases, deployment demands a meaningful investment of time and resources from both the vendor and the customer.
Ask about the company’s deployment process: How long does it take from pilot to production? How much customization is required? What support infrastructure is in place? Most importantly, make sure the company has a scalable deployment model so they’re not trapped in a cycle of one-off integrations that can’t scale beyond a handful of robots per customer, or a handful of customers.
And once again, if deployments take too long, it could be that the model is not adaptable or generalizable.
One of the founders we spoke to aptly summed up the relevant training and deployment metrics as “time-to-first-useful-task” at a new site and “config time” per new environment.
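As a rough illustration, both of those metrics can be computed from nothing more than site timestamps. The helper below and the dates in the example are hypothetical, not any company’s deployment data.

```python
from datetime import datetime

def deployment_metrics(site_arrival: datetime,
                       config_complete: datetime,
                       first_useful_task: datetime) -> dict:
    """Compute the two deployment metrics described above, in hours."""
    return {
        "config_time_hours": (config_complete - site_arrival).total_seconds() / 3600,
        "time_to_first_useful_task_hours": (first_useful_task - site_arrival).total_seconds() / 3600,
    }

# Hypothetical site visit: arrive Monday 9am, configured by Wednesday noon, first useful task Thursday 3pm
print(deployment_metrics(datetime(2025, 6, 2, 9), datetime(2025, 6, 4, 12), datetime(2025, 6, 5, 15)))
# {'config_time_hours': 51.0, 'time_to_first_useful_task_hours': 78.0}
```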
5) Understanding What’s Under the Hood
As robotics research evolves rapidly, it’s crucial to understand what teams have built under the hood — and whether they’re incorporating the latest advancements to strengthen their software and models. Consider the following:
What is the underlying model architecture? What are the pros and cons of the design choices? We find it beneficial to dig deeper into the architecture behind a company’s robotic models. Some may rely on more classical control systems or legacy architectures to guarantee safety, reliability, and performance, especially in production environments. While this can help with early deployment and risk mitigation, it may come at the cost of flexibility and long-term scalability. If this is the case, try to understand the team’s long-term plan to scale their model and make sure they aren’t too dependent on outdated architecture.
Conversely, end-to-end learning systems or foundation models may offer greater generalization and long-term upside, but they’re harder to train (and to scale up the data needed to train them), and can be harder to validate, interpret, and deploy safely in the real world. Understanding the architectural tradeoffs gives you a window into the company’s strategy and the technical bets they’re making.
What is the data collection strategy, and how are the models improving? For companies developing their own models — whether large foundation models or more task-specific architectures — it’s critical to understand the nature and source of their data. What’s the mix of real-world vs. simulation data? Are they collecting demonstrations, using teleoperation, or leveraging simulation environments?
More importantly: Is their data strategy creating a durable advantage? Proprietary, high-quality, or uniquely structured datasets can become a long-term moat, especially in robotics where edge cases are abundant and data is expensive to generate. The best companies generate a strong data feedback loop between deployment and training.
How much data was required to achieve the capabilities of the model today? Robotics is inherently capital-intensive, and these models are extremely data-hungry. Ask the team how much data it took to reach their current level of performance (e.g., a few hundred hours or sessions, or more), and more importantly, how they plan to scale that data going forward. This gives you a sense of both the efficiency of their learning pipeline and the capital requirements ahead. Understanding the team’s data strategy can help you gauge future burn and whether they’re building toward a self-improving system or simply brute-forcing their way with expensive human-labeled data.
6) Market Validation and Customer Traction
Investing in robotics presents an interesting challenge to investors: How do we properly assess companies that don’t necessarily have the customer traction/validation that we can use for diligence? Robotics companies in their early phases will spend a massive amount on R&D, pushing back the milestones/timelines VCs typically expect. At Series A, a robotics company may only have pilot customer agreements. When valuations are high and commercial progress is minimal, how do we wrap our heads around investing in these startups? Here are a few questions to ask that can provide insight:
What problem is the company solving for customers, and what is the primary ROI customers achieve through this solution? This question is key for any investment. But when we’re looking at robotics in particular, understanding the ROI/pain point is crucial, as robots typically require either heavy capex investment or significant resources for proper deployment. Oftentimes, the ROI will equate to the labor the robot is replacing. For example, a robot may be automating the work of two full-time employees in a warehouse who earn $60k annually, equating to $120k in savings per year.
In areas with labor shortages and aging populations, we’re seeing robotics companies do meaningful work in industries where it’s difficult to hire and retain talent (the “dull, dirty, dangerous jobs”). Integrating robotics solutions into a company’s operations may even unlock greater productivity through further process optimization.
When investing in companies that don’t yet have live customers, ask customers in the pipeline: What does this problem cost their company annually? How much would they save by addressing this pain point? How much are they willing to pay for this robotic solution? This will help you triangulate the actual value the robotics company is bringing to customers.
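To make the ROI conversation concrete, here is a simple payback-period sketch using the warehouse example above. The robot’s capex and annual service cost are placeholder assumptions we made up for illustration, not market data.

```python
def payback_months(robot_capex: float, annual_service_cost: float,
                   workers_replaced: int, fully_loaded_annual_wage: float) -> float:
    """Months until cumulative labor savings cover the upfront robot cost."""
    annual_savings = workers_replaced * fully_loaded_annual_wage - annual_service_cost
    if annual_savings <= 0:
        return float("inf")  # the robot never pays for itself at these numbers
    return 12 * robot_capex / annual_savings

# Warehouse example from above: two $60k workers; the $150k capex and $20k/yr service fee are assumptions
print(payback_months(robot_capex=150_000, annual_service_cost=20_000,
                     workers_replaced=2, fully_loaded_annual_wage=60_000))  # 18.0 months
```

The same arithmetic run with a pipeline customer’s own numbers is often the fastest way to sanity-check whether the claimed ROI holds up.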
How large is the market? Again, the TAMs in robotics are massive, and robotics companies can command high valuations because of their potential to scale. For more vertical- or use-case-specific approaches, the market may be limited; companies focused on robots that can handle multiple tasks in an environment, and that are increasingly generalizable, expand their TAM dramatically.
If possible, talk to customers and partners of the robotics provider. Ask them: What does a pilot phase look like? What does success look like? What’s the long-term vision for these robots — how many do they expect to deploy in their environment? Having clear answers to these questions will help you more deeply understand the procurement and evaluation process the robotics company will face from customers.
7) Nuances Around Robotics Companies’ Hardware Approach
We believe it’s important for robotics companies to work closely with hardware, whether that’s building custom hardware in-house or working with third-party hardware providers. The feedback loop between the robot model/control policy and hardware needs to be tight to build a great robotic system. Models need to learn the morphology of the robot hardware so they can translate sensory input into action outputs, which are then mapped to joint-space commands.
Given that companies building robotics foundation models are primarily focused on model training, it’s smart to assess how closely they’re working with the hardware. Are they collecting data directly on the hardware they plan to deploy the model with? Is there someone on the team who understands hardware?
For full-stack companies building both the model/control policy and the hardware — why have they decided to build their own hardware? Are they using off-the-shelf parts? Is their approach more margin-accretive, or does it target a specialized use case where their hardware outperforms existing options in the market? Is the hardware design optimal for the use case and environment they’re building for? For example:
Cobot chose a wheeled base for its robot, Proxie, rather than a legged one because logistics, manufacturing, and healthcare environments typically have flat floors, making wheels the better choice. The team says this approach increases uptime, reduces deployment and maintenance cost, and accelerates scale across similar sites.
Figure, by contrast, has designed its humanoid with legs, with the stated aspiration of entering consumers’ homes.
As we mentioned in the previous article, there are existing companies that are building great hardware at very affordable prices, so startups who decide to build their own should have a good reason for pursuing this strategy.
_
If you found this list helpful, download our Robotics Company Evaluation Checklist: a one-page guide to help assess the technical depth, scalability, and long-term viability of robotics investments.
Now that we’ve developed a framework for how to assess these companies, we can zoom out and ask: what areas of robotics are most exciting to us, and where do we see the greatest potential for outsized impact and returns?
We believe robotics is one of the largest emerging markets to gain exposure to. We’re excited to invest in companies building robotics foundation models (RFMs), full-stack robots, as well as the tooling that supports robots and the application layer aimed at addressing specific use cases.
Within the RFM realm, we’re energized by the progress Physical Intelligence is making in dexterous manipulation, and by the work Skild is doing to automate high-impact tasks across services and industrial sectors using both locomotion and manipulation skills. Generalist AI is also pushing the frontier of manipulation, recently showcasing abilities to quickly pick and sort small, thin objects from clutter (e.g., fasteners) and handle articulated or deformable objects over a long horizon with precision (e.g., folding a paper box). These abilities come from a model that transfers across different robot arms with different degrees of freedom, and generalizes well to new environments.
While each of these companies may take a different approach to model architecture and data strategy (as we discussed at length in the previous article), we believe the future of robotics will likely involve a combination of models working together to perform complex tasks in real-world environments.
Companies taking a more vertically integrated, full-stack approach — building their own AI models and software alongside custom hardware — also play a crucial role. Companies like Cobot and Dyna have focused on specific applications of robotics to automate tasks. As previously mentioned, Cobot’s Proxie is designed for material transport in logistics, manufacturing, and healthcare settings. Dyna is building general-purpose robots designed to master high-throughput manipulation skills that generalize across environments and industries. The company is launching its own Dyna-designed robot platform to deliver production-grade performance at customer sites, combining frontier model research with a hardware team experienced in design and mass production.
These companies are all building their own hardware for specific (and different) reasons, but all represent a significant step toward robots that can support multiple use cases in dynamic, semi-structured environments involving humans — moving beyond the narrow, use-case-specific robotics that rely on classical autonomy techniques.
Given how complex the robotics space is, we also want to recognize the startups that are contributing their innovation to open source. Our portfolio company, Hugging Face, has long been an advocate for open source, and that extends into the realm of robotics. They recently acquired French startup Pollen Robotics to open-source its humanoid robot, Reachy 2, with both the code and hardware designs now freely available. Pollen Robotics has introduced two new open-source humanoid robots since then — Reachy Mini (a desktop version) and HopeJR (full-size) — with designs, software, and assembly instructions open to developers. Hugging Face also released the LeRobot library in 2024, hosting robotics models, datasets, and tools on its Hub. K-Scale Labs is another project worth mentioning. They’re developing K-Bot, an open-source humanoid robot, striving to make robotics affordable and accessible and to accelerate widespread adoption, just as Unitree has in China. Their roadmap includes open-source stacks — from ML and vision-language-action frameworks to a Rust-based robot OS and whole-body control — and ambitious autonomy targets. Overall, it wouldn’t surprise us if open source becomes a key lever for growth in robotics, purely because of how many distinct and sophisticated disciplines have to work well together to deploy AI and software into the physical world. Robotics benefits from the virtuous cycle of innovating together.
The Robotics Imperative
It’s important to mention that we feel an imperative to invest in this category. As digital AI advances quickly, the physical world risks being left behind technologically. While AI models augment white-collar workers across software engineering, customer support, data analysis, and countless other jobs, physical labor remains a largely untapped area for automation. We need to invest in this category to enable more innovation and unlock greater human potential.
As AI models become more advanced, we see technical moats becoming less and less of a differentiator in the software world — while remaining a moat in the physical world. Given Claude Opus 4.1’s current capabilities in coding tasks, how much more advanced can we expect the model to become in 2-3 years’ time? Will there be true product differentiation when widely available coding agents can help competitors reach feature parity relatively quickly? It’s a mind-bending scenario in some ways — we’d need a separate blog post to explore it. But the second-order effect of that paradigm is that implementing AI in the physical world will become a harder problem than implementing AI in the digital world.
There are multiple entry points into the physical AI market. Companies may have a foundation model-only approach or they may build proprietary hardware alongside their AI model. The tooling layer — data providers, simulation platforms, etc. — is equally interesting, as it’s a harbinger of what’s working and what’s not. We think there will be multiple winners in each category and there’s a massive amount of value to be created at each layer of the robotics stack.
The promise of a unified foundation model in robotics is in its ability to serve the widest possible market — powering any form factor and any use case. Such a model could support both the thousands of robots deployed today and the more complex, multi-purpose systems of tomorrow, including humanoids. In a world where robots become mainstream, a generalized robot policy will be critical to unlocking mass automation.
Building Our Robotic Future
Robotics is no longer science fiction — it’s a frontier of real commercial value, defensibility, and societal impact.
From companies automating high-friction tasks in manufacturing and logistics to ambitious generalist approaches powered by foundation models to emerging humanoid platforms adapting to human-designed environments, we’re witnessing a new phase of intelligence that doesn’t just think, but moves.
The promise of this market lies not only in automating labor but in creating systems that continuously learn and improve through real-world deployment. These are long-arc, technical businesses, but the compounding data advantage and depth of integration with physical environments offer real moats that many software-only businesses will increasingly struggle to maintain.
As investors, we have both an imperative and a market opportunity to accelerate this future. AI is already augmenting the digital economy — now, it’s time to bring that same momentum to the physical world.
_
We welcome all feedback on our views above. And if you’re building in robotics today, we’d love to connect and learn more about your business. Email Emily Zhao at emily@salesforceventures.com and Pascha Hao at pascha@salesforceventures.com.
This article is part of our series on robotics.