Lessons in Product Strategy from Tesla’s approach to autonomous driving
Tesla’s history with their Full Self Driving (FSD) autonomous driving solution is one marked by grand claims and proclamations. Since its inception, Elon Musk has repeatedly stuck his neck out, stating and restating the imminent arrival of fully autonomous cars on the road. Tesla has staked a lot on this capability, recognising the value it could provide to the world: a robotaxi network of millions of cars running without human intervention at a fraction of today’s cost, delivering cheap, abundant, and immensely profitable travel. Yet for all the promise, nine years later we are still waiting for results - so much so that many doubt it is even possible.
However, recent footage of their new solution, due for public release this autumn, shows something genuinely worthy of excitement: 45 minutes of driving with just a single intervention, handling complex manoeuvres through previously unseen situations with human-like calm assurance. And it is achieved using end-to-end machine learning, trained on a dataset of driving footage more than 10 billion miles strong.
The route they took may have been scenic, but it was by no means leisurely. Along the way there is a story of failure, self-reflection and pivots in approach: moments when success seemed assured, only for a wrong turn to lead to another dead end. Yet despite all the bumps in the road, they might actually be nearing their destination, and behind it there are lessons in product strategy from which we can all learn.
A brief history of Full Self Driving
I will quickly recap the story to date.
2014-2018
Tesla’s first step was to equip cars with the hardware necessary to gather driving data. At a time when the competition in this new domain opted for Lidar sensors, Tesla chose an inexpensive array of small cameras fitted to the exterior of their vehicles, plus hardware capable of processing large volumes of images in real time. Back then, they relied heavily on pre-planned routes and local mapping - a system that was very much “running on rails”, operating only in small areas of the United States, and capable of only certain tasks without driver intervention.
It was here that they hit their first roadblock: they had built a system which worked only in very limited circumstances and was in no way scalable. Basic scenarios like construction work would trip it up. To progress to a general solution that works across an entire country, they would have to end their reliance on local mapping and create a system that could think on its feet, perceive the world around it, and improvise based on what it sees.
2019-2021
By now, however, they had accrued something close to 1 billion miles of driving data, and had seen the value machine learning could provide when working with large datasets. So they set about taking the real-world driving data they had collected over the previous couple of years, and employed a massive team of data labellers to help train a model to look at images and recognise lane markings, stop signs, and other cars on the road.
To help it make correct decisions, a team of software engineers created and maintained a codebase which powered a real-time decision-making engine, and rolled it out to highways, which are substantially less complex than city streets. Data they collected and later shared showed that crash rates were substantially lower for drivers using the feature than for those who did not. This was a big success as a driver-assistance feature - people were able to travel hundreds of miles with minimal intervention - but it was still a far cry from a full ‘Level 5’ autonomous vehicle, and it only functioned on highways, which make up little of a typical taxi journey.
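To make that style of engine concrete, here is a minimal, purely illustrative sketch of a hand-written, rule-based decision engine of the if -> then variety - the rules, thresholds, and names are invented for illustration, not taken from Tesla’s codebase.

```python
# Illustrative sketch only: a toy, hand-written decision engine in the
# if -> then style. All names and thresholds are invented; not Tesla's code.
from dataclasses import dataclass


@dataclass
class Perception:
    lead_car_distance_m: float   # distance to the car ahead, metres
    lead_car_speed_mps: float    # its speed, metres per second
    ego_speed_mps: float         # our own speed
    lane_clear_left: bool        # is the left lane free to move into?


def highway_decision(p: Perception) -> str:
    """Return a high-level action from a fixed set of hand-written rules."""
    # Rule 1: if we are closing in on a slower car, react.
    if p.lead_car_distance_m < 30 and p.lead_car_speed_mps < p.ego_speed_mps:
        # Rule 2: overtake if the left lane is clear, otherwise brake.
        if p.lane_clear_left:
            return "change_lane_left"
        return "brake"
    # Rule 3: otherwise hold speed and lane.
    return "keep_lane"


print(highway_decision(Perception(25.0, 20.0, 30.0, lane_clear_left=True)))
# -> change_lane_left
```

Every new scenario means another hand-written rule, and another chance for rules to interact badly - which is why this style of engine struggles as the long tail of edge cases grows.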
They were acutely aware that to fully realise their strategy, they would have to deliver a system capable of handling the complex and intricate streets of inner cities. They also knew Rome was not built in a day. So they released a small beta program for city streets, with heavy caveats: only drivers with a safety score of 100 were invited to participate; drivers had to keep their hands on the wheel at all times and be ready to take over; and they were asked to share the data from any manual interventions with the development team, to help train the model further.
The product improved incrementally over time as drivers across the country tested it - having to intervene constantly at first, then perhaps a handful of times per drive a year or so later. Slowly, access to the beta software grew to those with a safety score of 99, then 95, then 90, and so on, until it was available to anyone who paid for it.
But Tesla found themselves at another roadblock. As long as the system relied on manual labelling of raw images, there would always be a limit on the number of scenarios they could hope to cater for. The real world is messy, with an effectively infinite number of cases and edge cases, for which a finite team of humans could never hope to cater. Yet by this point their dataset had grown to around 5 billion miles of driving data, and if they could find a way to fully utilise it, they could remove the reliance on human labellers altogether and unlock the power of their training data.
2022
So they embarked on a major rewrite, converting raw images into a 3D vector space and training a machine learning model to perceive road markings, lanes, traffic signs, and other cars from scratch, with no manual input from humans as to what it was seeing. This massive internal project was completed in November 2022, and the difference in performance was stark. Cars were now able to perceive roads across the United States, even roads for which Tesla had no data. In some sense of the word, the cars were able to see the world around them. Perhaps controversially, Tesla swiftly laid off the entire labelling team. Andrej Karpathy, Tesla’s head of AI at the time, also left the business shortly after, feeling the problem had largely been solved.
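As a rough illustration of what perceiving the world as a 3D vector space might mean in practice, here is a hypothetical sketch of the kind of structured output such a perception model could produce for a single instant. The field names and types are assumptions made for the example, not Tesla’s actual schema.

```python
# Hypothetical sketch of a 3D "vector space" scene description: the output
# of a learned perception model, rather than human-labelled pixels.
from dataclasses import dataclass
from typing import List, Tuple


@dataclass
class TrackedObject:
    kind: str                        # "car", "pedestrian", "traffic_light", ...
    position_m: Tuple[float, float, float]   # (x, y, z) relative to the ego vehicle
    velocity_mps: Tuple[float, float, float]  # (vx, vy, vz)


@dataclass
class LaneEdge:
    points_m: List[Tuple[float, float, float]]  # polyline marking a lane boundary


@dataclass
class VectorSpaceFrame:
    timestamp_s: float
    objects: List[TrackedObject]
    lane_edges: List[LaneEdge]


# One invented frame, purely to show the shape of the data a planner
# would consume downstream; in reality a neural network emits this.
frame = VectorSpaceFrame(
    timestamp_s=0.0,
    objects=[TrackedObject("car", (12.0, -1.5, 0.0), (8.0, 0.0, 0.0))],
    lane_edges=[LaneEdge([(0.0, 1.8, 0.0), (50.0, 1.8, 0.0)])],
)
print(len(frame.objects), "tracked object(s)")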
To date, there are over 300,000 cars on the road in the United States participating in the beta program, with zero recorded crashes in the two years since its inception. Yet if you watch any YouTube video of an FSD beta tester running the software, you will know it is liable to make some strange decisions at times. The decision-making engine - currently around 300,000 lines of code - is still liable to fall over given the right edge-case scenario. As mentioned earlier, the real world is messy, and it is hard to make cars interact with it properly using a series of if -> then statements. The number of edge cases is simply too large, and robotaxis will not have a human present, ready to take over when one comes along that hasn’t been catered for.
2023
So over the course of 2023, they have curated videos from their vast training data as illustrations of good driving, and fed them into the model, having it teach itself what to do in those scenarios. This process is ongoing, but when they ship to the public in the autumn, customers’ cars will take in raw images of the road, perceive them in three dimensions plus time, and make sensible decisions to get passengers from A to B based on what they see. No crutches, no expensive sensors, no reliance on maps. The car doesn’t even have to be connected to the internet.
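To make the end-to-end idea concrete, here is a minimal behaviour-cloning sketch: images in, driving controls out, with the model trained to imitate the controls a human driver applied in curated clips. The architecture, shapes, and data below are toy placeholders written against PyTorch, not a description of Tesla’s system.

```python
# Minimal behaviour-cloning sketch of the "end-to-end" idea: camera frames
# in, driving controls out, trained to imitate curated human driving.
import torch
import torch.nn as nn


class TinyDrivingPolicy(nn.Module):
    def __init__(self):
        super().__init__()
        self.backbone = nn.Sequential(                 # toy image encoder
            nn.Conv2d(3, 16, kernel_size=5, stride=2), nn.ReLU(),
            nn.Conv2d(16, 32, kernel_size=5, stride=2), nn.ReLU(),
            nn.AdaptiveAvgPool2d(1), nn.Flatten(),
        )
        self.head = nn.Linear(32, 2)                   # outputs: steering, acceleration

    def forward(self, frames):
        return self.head(self.backbone(frames))


policy = TinyDrivingPolicy()
optimiser = torch.optim.Adam(policy.parameters(), lr=1e-4)

# Fake batch standing in for curated clips: images plus the controls the
# human driver actually applied in those moments.
frames = torch.randn(8, 3, 96, 96)
human_controls = torch.randn(8, 2)

for _ in range(10):                                    # tiny training loop
    predicted = policy(frames)
    loss = nn.functional.mse_loss(predicted, human_controls)  # imitate the human
    optimiser.zero_grad()
    loss.backward()
    optimiser.step()
```

The crucial property is that nothing in this loop encodes hand-written driving rules; better behaviour comes from better and more plentiful training examples.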
And when they release it to the public, it will come without the word ‘beta’ attached, as FSD V12. It will not be perfect, but it will be capable of learning from its mistakes, and of propagating those learnings to every other car running the software. With millions of Tesla vehicles on the roads worldwide, you can expect it to learn fast. What will be deemed safe enough to satisfy regulators that no human supervision is needed is yet to be determined. It will no doubt need to be safer than a human driver - but how much safer? Ten times? That remains to be seen. But for the first time, it seems that with enough training data and computing power, their solution might be capable of clearing whatever bar is set for it.
So what can we learn from this strategy?
1 - Fail fast and repeatedly
Tesla’s history with autonomous driving is littered with failed approaches, dead ends, and local maxima. Yet each failure gave rise to something that worked better than before; each local maximum gave an insight into the next mountain to climb. They had a grand vision for the future, but their strategy over the years took the form of smaller, more incremental challenges to overcome and problems to solve. It was through action, failure, and self-reflection that they were able to perceive those next challenges, and apply the lessons of the past to overcome them.
2 - Set up your data pipeline
Without data, none of these improvements would have been possible. As product managers we become obsessed with delivering on customer and business outcomes, and for any given outcome there is always a metric we can try to move to push us towards it. It is imperative when building products to measure your outcomes with tangible metrics, grounded in real-world data. Find scalable ways to gather that data, and continuously observe it in the real world, hypothesising and acting on ways to improve it.
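As a hypothetical example of what that looks like in practice, the snippet below turns one outcome - drives should need fewer manual interventions - into a single observable metric. The metric, data structures, and figures are invented purely to show the calculation.

```python
# Hypothetical example: turning an outcome ("drives need fewer manual
# interventions") into a metric you can observe over time.
from dataclasses import dataclass
from typing import List


@dataclass
class Drive:
    miles: float
    interventions: int


def interventions_per_1000_miles(drives: List[Drive]) -> float:
    total_miles = sum(d.miles for d in drives)
    total_interventions = sum(d.interventions for d in drives)
    return 1000 * total_interventions / total_miles if total_miles else 0.0


# Invented figures, purely to show the calculation.
this_release = [Drive(12.4, 2), Drive(30.1, 1), Drive(8.7, 0)]
print(round(interventions_per_1000_miles(this_release), 1))  # 58.6
```

Tracked release over release, a figure like this tells you whether the product is actually moving towards the outcome, rather than just shipping features.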
3 - Meet outcomes as best you can now, but continuously investigate and invest in the future
Stephen Wilson wrote in 2016 about how companies can ready themselves for emergent disruptive technologies, outlining a variation on the Three Horizons model first set out in The Alchemy Of Growth in 1999.
In this model, businesses are encouraged to keep working with the core technologies they use on a day-to-day basis. But at the same time, they should spend time and resources investigating potential opportunities for growth, and invest in those which have been validated as viable for a particular outcome. In other words, work with what is possible now for your customers, while probing what might be possible in the future.
Tesla’s early attempts at an autonomous driving solution can be seen as meeting horizon one - leaning heavily on manual processes and limited scope, doing enough to make the driving experience better than it was before. But in parallel they investigated and invested in the once-nascent machine learning technology, seeing its potential for this very application. Now, as the power of machine learning increases exponentially, both in terms of the effectiveness of its software and the computing power of its hardware, they are in prime position to capitalise.
We can all learn something from this approach. It is better to focus on a small number of outcomes and deliver on them as best you can, never resting on your laurels, than to treat each outcome as one item in a long list of features your product needs. Outcomes rarely change drastically over time; what changes is how you solve for them, and how that in turn impacts your customers’ experience. Those who don’t constantly evolve get left behind.