Inside Tesla’s FSD: Patent Explains How FSD Works

By Karan Singh
Not a Tesla App

Thanks to a Tesla patent published last year, we have a great look into how FSD operates and the various systems it uses. SETI Park, who examines and writes about patents, also highlighted this one on X.

This patent breaks down the core technology used in Tesla’s FSD and gives us a great understanding of how FSD processes and analyzes data.

To make this easily understandable, we’ll divide it up into sections and break down how each section impacts FSD.

Vision-Based

First, the patent describes a vision-only system, in line with Tesla’s camera-only approach, that enables vehicles to see, understand, and interact with the world around them. It describes multiple cameras, some with overlapping coverage, that capture a 360-degree view around the vehicle, mimicking human vision while exceeding it in coverage.

What’s most interesting is that the system rapidly adapts to the various focal lengths and perspectives of the different cameras around the vehicle. It then combines all this into a cohesive picture, but we’ll get to that part shortly.

Branching

The system is divided into two branches: one for Vulnerable Road Users, or VRUs, and the other for everything that doesn’t fall into that category. The divide is pretty simple: VRUs are pedestrians, cyclists, baby carriages, skateboarders, animals; essentially anything that can get hurt. The non-VRU branch covers everything else: cars, emergency vehicles, traffic cones, debris, etc.

Splitting it into two branches enables FSD to look for, analyze, and then prioritize certain things. Essentially, VRUs are prioritized over other objects throughout the Virtual Camera system.
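As a purely illustrative sketch (not Tesla’s actual code; the class names and labels are invented), the VRU/non-VRU split can be thought of as routing each detection into one of two branches:

```python
# Hypothetical labels for the VRU branch, per the patent's description.
VRU_CLASSES = {"pedestrian", "cyclist", "skateboarder", "animal", "stroller"}

def split_detections(detections):
    """Route (label, bbox) detections into the VRU and non-VRU branches."""
    vru, non_vru = [], []
    for label, bbox in detections:
        (vru if label in VRU_CLASSES else non_vru).append((label, bbox))
    return vru, non_vru

detections = [
    ("pedestrian", (10, 20, 30, 60)),
    ("car", (100, 40, 220, 120)),
    ("traffic_cone", (80, 90, 95, 110)),
    ("cyclist", (200, 30, 240, 90)),
]
vru, non_vru = split_detections(detections)
# The pedestrian and cyclist land in the VRU branch; the rest go to the other.
```

Each branch can then apply its own priorities and processing downstream, which is the point of the split.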

The many data streams and how they're processed.

Virtual Camera

Tesla processes all of that raw imagery, feeds it into the VRU and non-VRU branches, and picks out only the key and essential information, which is used for object detection and classification.

The system then draws these objects on a 3D plane and creates “virtual cameras” at varying heights. Think of a virtual camera as a real camera you’d use to shoot a movie. It allows you to see the scene from a certain perspective.

The VRU branch places its virtual camera at human height, which enables a better understanding of VRU behavior, likely because there’s far more training data at human height than from above or any other angle. Meanwhile, the non-VRU branch raises its camera above that height, enabling it to see over and around obstacles and allowing for a wider view of traffic.

This effectively provides two forms of input for FSD to analyze—one at the pedestrian level and one from a wider view of the road around it.

3D Mapping

Now, all this data has to be combined. These two virtual cameras are synced - and all their information and understanding are fed back into the system to keep an accurate 3D map of what’s happening around the vehicle. 

And it's not just the cameras. The Virtual Camera system and 3D mapping work together with the car’s other sensors to incorporate movement data—speed and acceleration—into the analysis and production of the 3D map.

This system is best understood by the FSD visualization displayed on the screen. It picks up and tracks many moving cars and pedestrians at once, but what we see is only a fraction of all the information it’s tracking. Think of each object as having a list of properties that isn’t displayed on the screen. For example, a pedestrian may have properties that can be accessed by the system that state how far away it is, which direction it’s moving, and how fast it’s going.

Other moving objects, such as vehicles, may have additional properties, such as width, height, speed, direction, planned path, and more. Even non-VRU elements have properties: the road, for example, has its width, speed limit, and more determined based on AI and map data.

The vehicle itself has its own set of properties, such as speed, width, length, planned path, etc. When you combine everything, you end up with a great understanding of the surrounding environment and how best to navigate it.
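To make the idea of per-object property lists concrete, here is a minimal sketch; the field names (`distance_m`, `speed_mps`, and so on) are hypothetical, not from the patent:

```python
from dataclasses import dataclass

@dataclass
class TrackedObject:
    """Illustrative property record for one tracked object."""
    kind: str          # e.g. "pedestrian", "vehicle"
    distance_m: float  # distance from our vehicle, meters
    heading_deg: float # direction of travel, degrees
    speed_mps: float   # speed, meters per second
    width_m: float = 0.0   # extra properties some objects carry
    length_m: float = 0.0

# A pedestrian carries fewer properties than a vehicle.
ped = TrackedObject(kind="pedestrian", distance_m=12.5,
                    heading_deg=90.0, speed_mps=1.4)
car = TrackedObject(kind="vehicle", distance_m=40.0, heading_deg=0.0,
                    speed_mps=13.0, width_m=1.9, length_m=4.7)
```

The on-screen visualization shows only the object’s position and type; records like these represent the rest of the state the system keeps per object.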

The Virtual Mapping of the VRU branch.

Temporal Indexing

Tesla calls this feature Temporal Indexing. In layman’s terms, it’s how the vision system analyzes images over time and keeps track of them. Rather than working from a single snapshot, FSD works from a series of them, which lets it understand how objects are moving. This enables object path prediction and allows FSD to infer where vehicles or objects might be, even without a direct line of sight to them.

This temporal indexing is done through “Video Modules”, which are the actual “brains” that analyze the sequences of images, tracking them over time and estimating their velocities and future paths.
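As a toy illustration of what this kind of tracking can boil down to, here is a constant-velocity estimate over a short position history; the function names, units, and numbers are assumptions for the example, not anything from the patent:

```python
def estimate_velocity(track, dt):
    """Velocity (vx, vy) from the last two positions in a track, sampled every dt seconds."""
    (x0, y0), (x1, y1) = track[-2], track[-1]
    return (x1 - x0) / dt, (y1 - y0) / dt

def predict_position(track, dt, horizon):
    """Constant-velocity extrapolation `horizon` seconds ahead of the last observation."""
    vx, vy = estimate_velocity(track, dt)
    x, y = track[-1]
    return x + vx * horizon, y + vy * horizon

# Positions (meters) observed every 0.1 s; the object drifts right and slightly up.
track = [(0.0, 0.0), (0.5, 0.1), (1.0, 0.2)]
print(predict_position(track, dt=0.1, horizon=1.0))  # approximately (6.0, 1.2)
```

A real video module learns far richer motion models from image sequences, but the output is the same in spirit: a velocity estimate and a predicted future path per object.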

Once again, heavy traffic and the FSD visualization, which keeps track of many vehicles in lanes around you—even those not in your direct line of sight—are excellent examples.

End-to-End

Finally, the patent also mentions that the entire system, from front to back, can be - and is - trained together. This training approach, which now includes end-to-end AI, optimizes overall system performance by letting each individual component learn how to interact with other components in the system.

How everything comes together.

Summary

Essentially, Tesla sees FSD as a brain, and the cameras are its eyes. It has a memory, and that memory enables it to categorize and analyze what it sees. It can keep track of a wide array of objects and their properties to predict their movements and determine a path around them. This is a lot like how humans operate, except FSD can track many more objects at once and estimate properties like speed and size far more accurately. On top of that, it can do it faster than a human, and in all directions at once.

FSD and its vision-based camera system essentially create a 3D live map of the road that is constantly and consistently updated and used to make decisions.

Tesla Holiday Update Wishlist - Charging & Safety Edition

By Karan Singh
Not a Tesla App

As December approaches, Tesla’s highly anticipated Holiday update draws closer. Each year, this eagerly awaited software release transforms Tesla vehicles with new features and festive flair. If you’re not familiar with Tesla’s holiday updates, take a look at what Tesla has launched in the Holiday update the past few years.

While leaked features like Blind Spot Monitoring While Parked hint at thoughtful improvements, the real magic lies in the unexpected. From potential features such as the Apple Watch app to a smart assistant, the possibilities are endless.

For this chapter in our series, we’re dreaming up ways Tesla could improve the charging experience and even add some additional safety features. So let’s take a look.

Destination State of Charge

Today, navigating to a destination is pretty straightforward on your Tesla. Your vehicle will automatically let you know when and where to charge, as well as for how long. However, you’ll likely arrive at your destination at a low state of charge.

Being able to set your destination state of charge would be an absolute game-changer for ease of road-tripping. After all, the best EV to road trip in is a Tesla due to the Supercharger network. It looks like Tesla may be listening. Last week, Tesla updated their app and hinted at such a feature coming to the Tesla app. A Christmas present, maybe?

Battery Precondition Options

While Tesla automatically preconditions your battery when needed for fast charging, there are various situations where manually preconditioning the battery would be beneficial.

Currently, there is no way to precondition for third-party chargers unless you “navigate” to a nearby Supercharger. Even then, the short distance between your location and the Supercharger often doesn’t allow enough time to warm up the battery, resulting in slower charging speeds.

In Europe, you can navigate to and precondition for Qualified Third Party Chargers, but not for unlabelled ones.

Live Activities

While we already mentioned Live Activities in the Tesla app wishlist, they’d be especially useful while Supercharging. Live Activities are useful for short-term information you want to monitor, especially if it changes often — which makes them perfect for Supercharging, especially if you want to avoid idle fees.

Vehicle-to-Load / Vehicle-to-Home Functionality

The Cybertruck introduced Tesla Power Share, Tesla’s name for Vehicle-to-Home functionality (V2H). V2H allows an EV to supply power directly to a home. By leveraging the vehicle’s battery, V2H can provide backup power during outages and reduce energy costs by using stored energy during peak rates.

Tesla Power Share integrates seamlessly with Tesla Energy products and the Tesla app. We’d love to see this functionality across the entire Tesla lineup. Recently a third party demonstrated that bidirectional charging does work on current Tesla vehicles – namely on a 2022 Model Y.

Adaptive Headlights for North America

While Europe and China have had access to the Adaptive Headlights since earlier this year, North America is still waiting. The good news is that Lars Moravy, VP of Vehicle Engineering, said that these are on their way soon.

Blind Spot Indication with Ambient Lighting

Both the 2024 Highland Model 3 Refresh and the Cybertruck already have ambient lighting features, but they don’t currently serve a practical purpose besides some eye candy. So why not integrate that ambient lighting into the Blind Spot Warning system, so that the left or right side of the vehicle lights up when there’s a vehicle in your blind spot? Currently, only a simple red dot lights up in the front speaker grille, and the on-screen camera feed also appears with a red border when signaling.

Having the ambient lighting change colors when a vehicle is in your blind spot would be a cool use of the technology, especially since the Model Y Juniper Refresh and Models S and X are supposed to get ambient lighting as well.

Tesla’s Holiday update is expected to arrive with update 2024.44.25 in just a few short weeks. We’ll have extensive coverage of its features when it finally arrives, but in the meantime, be sure to check out our other wishlist articles.

How Tesla’s “Universal Translator” Will Streamline FSD for Any Platform

By Karan Singh
Not a Tesla App

It’s time for another dive into how Tesla intends to implement FSD. Once again, a shout out to SETI Park over on X for their excellent coverage of Tesla’s patents.

This time, it's about how Tesla is building a “universal translator” for AI, allowing its FSD or other neural networks to adapt seamlessly to different hardware platforms.

That translation layer can allow a complex neural net, like FSD, to run on pretty much any platform that meets its minimum requirements. It would drastically reduce training time, adapt the network to platform-specific constraints, and speed up both decision-making and learning.

We’ll break down the key points of the patent and make them as understandable as possible. This new patent is likely how Tesla will implement FSD on non-Tesla vehicles, Optimus, and other devices.

Decision Making

Imagine a neural network as a decision-making machine. But building one also requires making a series of decisions about its structure and data processing methods. Think of it like choosing the right ingredients and cooking techniques for a complex recipe. These choices, called "decision points," play a crucial role in how well the neural network performs on a given hardware platform.

To make these decisions automatically, Tesla has developed a system that acts like a "run-while-training" neural net. This ingenious system analyzes the hardware's capabilities and adapts the neural network on the fly, ensuring optimal performance regardless of the platform.

Constraints

Every hardware platform has its limitations – processing power, memory capacity, supported instructions, and so on. These limitations act as "constraints" that dictate how the neural network can be configured. Think of it like trying to bake a cake in a kitchen with a small oven and limited counter space. You need to adjust your recipe and techniques to fit the constraints of your kitchen or tools.

Tesla's system automatically identifies these constraints, ensuring the neural network can operate within the boundaries of the hardware. This means FSD could potentially be transferred from one vehicle to another and adapt quickly to the new environment.

Let’s break down some of the key decision points and constraints involved:

  • Data Layout: Neural networks process vast amounts of data. How this data is organized in memory (the "data layout") significantly impacts performance. Different hardware platforms may favor different layouts. For example, some might be more efficient with data organized in the NCHW format (batch, channels, height, width), while others might prefer NHWC (batch, height, width, channels). Tesla's system automatically selects the optimal layout for the target hardware.

  • Algorithm Selection: Many algorithms can be used for operations within a neural network, such as convolution, which is essential for image processing. Some algorithms, like the Winograd convolution, are faster but may require specific hardware support. Others, like Fast Fourier Transform (FFT) convolution, are more versatile but might be slower. Tesla's system intelligently chooses the best algorithm based on the hardware's capabilities.

  • Hardware Acceleration: Modern hardware often includes specialized processors designed to accelerate neural network operations. These include Graphics Processing Units (GPUs) and Tensor Processing Units (TPUs). Tesla's system identifies and utilizes these accelerators, maximizing performance on the given platform.
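The data-layout point is easy to demonstrate: converting between NCHW and NHWC is just a permutation of axes, shown here with NumPy purely for illustration (the shapes are arbitrary):

```python
import numpy as np

# The same image batch in two memory layouts:
# NCHW = (batch, channels, height, width); NHWC = (batch, height, width, channels).
x_nchw = np.zeros((8, 3, 224, 224), dtype=np.float32)

# Convert NCHW -> NHWC by permuting axes; no data changes, only its ordering in memory.
x_nhwc = np.transpose(x_nchw, (0, 2, 3, 1))

print(x_nchw.shape)  # (8, 3, 224, 224)
print(x_nhwc.shape)  # (8, 224, 224, 3)
```

Which layout is faster depends on how the target hardware walks memory, which is exactly why the patent treats it as a per-platform decision point.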

Satisfiability

To find the best configuration for a given platform, Tesla employs a "satisfiability solver." This powerful tool, specifically a Satisfiability Modulo Theories (SMT) solver, acts like a sophisticated puzzle-solving engine. It takes the neural network's requirements and the hardware's limitations, expressed as logical formulas, and searches for a solution that satisfies all constraints. Try thinking of it as putting the puzzle pieces together after the borders (constraints) have been established.

Here's how it works, step-by-step:

  1. Define the Problem: The system translates the neural network's needs and the hardware's constraints into a set of logical statements. For example, "the data layout must be NHWC" or "the convolution algorithm must be supported by the GPU."

  2. Search for Solutions: The SMT solver explores the vast space of possible configurations, using logical deduction to eliminate invalid options. It systematically tries different combinations of settings, like adjusting the data layout, selecting algorithms, and enabling hardware acceleration.

  3. Find Valid Configurations: The solver identifies configurations that satisfy all the constraints. These are potential solutions to the "puzzle" of running the neural network efficiently on the given hardware.
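A real SMT solver works by logical deduction rather than enumeration, but a brute-force toy version conveys the three steps above; the configuration options and constraints below are invented for illustration:

```python
from itertools import product

# Step 1: define the decision points as option sets (illustrative names).
layouts = ["NCHW", "NHWC"]
conv_algorithms = ["winograd", "fft", "direct"]
accelerators = ["gpu", "cpu"]

def satisfies(config):
    """Encode the hardware constraints as checks on a candidate configuration."""
    layout, algo, accel = config
    if layout != "NHWC":                       # constraint: this hardware prefers NHWC
        return False
    if algo == "winograd" and accel != "gpu":  # constraint: Winograd needs the GPU
        return False
    return True

# Steps 2 and 3: search the configuration space and keep the valid solutions.
valid = [c for c in product(layouts, conv_algorithms, accelerators) if satisfies(c)]
```

An actual solver such as Z3 scales to constraint sets far too large to enumerate, which is why the patent reaches for SMT rather than brute force.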

Optimization

Finding a working configuration is one thing, but finding the best configuration is the real challenge. This involves optimizing for various performance metrics, such as:

  • Inference Speed: How quickly the network processes data and makes decisions. This is crucial for real-time applications like FSD.

  • Power Consumption: The amount of energy used by the network. Optimizing power consumption is essential for extending battery life in electric vehicles and robots.

  • Memory Usage: The amount of memory required to store the network and its data. Minimizing memory usage is especially important for resource-constrained devices.

  • Accuracy: Ensuring the network maintains or improves its accuracy on the new platform is paramount for safety and reliability.

Tesla's system evaluates candidate configurations based on these metrics, selecting the one that delivers the best overall performance.
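As a hedged sketch of how such a selection step might rank candidates against those metrics (the weights, metric values, and configuration names are all invented), consider:

```python
# Hypothetical valid configurations with their measured metrics.
candidates = [
    {"name": "A", "latency_ms": 20, "power_w": 15, "memory_mb": 900,  "accuracy": 0.97},
    {"name": "B", "latency_ms": 12, "power_w": 22, "memory_mb": 1400, "accuracy": 0.97},
    {"name": "C", "latency_ms": 35, "power_w": 9,  "memory_mb": 600,  "accuracy": 0.95},
]

def score(c):
    # Toy weighted objective: reward accuracy, penalize latency, power, and memory.
    # Real weights would depend on the deployment target.
    return (c["accuracy"] * 100
            - 0.5 * c["latency_ms"]
            - 0.3 * c["power_w"]
            - 0.005 * c["memory_mb"])

best = max(candidates, key=score)
print(best["name"])  # A
```

Under these made-up weights, configuration A wins by balancing all four metrics; shifting the weights toward latency would flip the choice to B, which is the essence of per-platform optimization.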

Translation Layer vs Satisfiability Solver

It's important to distinguish between the "translation layer" and the satisfiability solver. The translation layer is the overarching system that manages the entire adaptation process. It includes components that analyze the hardware, define the constraints, and invoke the SMT solver. The solver is a specific tool used by the translation layer to find valid configurations. Think of the translation layer as the conductor of an orchestra and the SMT solver as one of the instruments playing a crucial role in the symphony of AI adaptation.

Simple Terms

Imagine you have a complex recipe (the neural network) and want to cook it in different kitchens (hardware platforms). Some kitchens have a gas stove, others electric; some have a large oven, others a small one. Tesla's system acts like a master chef, adjusting the recipe and techniques to work best in each kitchen, ensuring a delicious meal (efficient AI) no matter the cooking environment.

What Does This Mean?

Now, let’s wrap this all up and put it into context—what does it mean for Tesla? There’s quite a lot, in fact. It means that Tesla is building a translation layer that will be able to adapt FSD for any platform, as long as it meets the minimum constraints.

That means Tesla will be able to rapidly accelerate the deployment of FSD on new platforms while also finding the ideal configurations to maximize both decision-making speed and power efficiency across that range of platforms. 

Putting it all together, Tesla appears to be preparing to license FSD, which is an exciting prospect. And not just for vehicles: remember that Tesla’s humanoid robot, Optimus, also runs on FSD. FSD itself may become an extremely adaptable vision-based AI.
