House Flipping with AI Predictions - Case from Zillow
At the core of any successful online platform is the provision of relevant, accurate, and timely information. For e-commerce sites, this includes product details, pricing, seller information, and customer reviews—all of which reduce uncertainty for online shoppers. Similarly, platforms like Zillow provide potential buyers with essential property data, such as prices, features, images, and neighborhood insights, empowering users to make informed decisions. The importance of data has long been recognized, and with the rise of machine learning, online platforms are now able to offer highly sophisticated features. For instance, Zillow launched Zillow Offers in 2019, a program utilizing machine learning-based predictive modeling to enhance the home buying and selling experience, further solidifying the role of technology in real estate transactions.
Zillow Offers
How does the program work?
- Homeowners interested in the program first enter their property details on the platform.
- Zillow uses its predictive model to generate a preliminary offer.
- Sellers are not required to make any binding commitments until a Zillow representative conducts an in-person evaluation.
- Once the transaction is finalized, Zillow handles the repairs, and after completing these repairs, the home is relisted on the platform for resale.
The concept behind Zillow Offers mirrors the role of a market maker in financial markets. A market maker is a company that provides liquidity by continuously quoting buy and sell prices for an asset, profiting from the difference between the two prices, known as the bid-ask spread. Market makers thrive by offering immediate trade execution to less patient investors, typically managing short-term positions that are often squared off within a single trading day. This rapid turnover minimizes risk and enhances market efficiency.
Zillow sought to adapt this market-making model for its house-flipping program, aiming to generate profits in a similar manner. However, a key distinction lies in how long Zillow holds its inventory. Unlike financial assets, which are typically traded within hours or days, houses may remain in Zillow’s possession for months. This extended holding period introduces a significant risk that financial market makers generally do not face, as Zillow must account for both market fluctuations and the unpredictability of when a home will sell.
The longer holding period increased the uncertainty around the eventual sale price, making the program’s profitability extremely sensitive to errors in Zillow’s house price prediction model. The platform expected its machine learning model to be highly precise, and it relied on that precision to make the program profitable.
While the assumption that machine learning models can deliver highly accurate predictions may seem reasonable, especially given the success of house price prediction challenges on platforms like Kaggle, real-world forecasting is more complex. Many machine learning models excel at predicting outcomes based on historical data (some Kaggle participants predict house prices perfectly; see https://www.kaggle.com/c/home-data-for-ml-course/leaderboard), but their performance often diminishes when applied to future scenarios. For Zillow, the challenge was twofold: not only did it need to predict the eventual selling price of a home, but it also had to estimate the uncertain timing of when the house would sell. These dual uncertainties made the task of price prediction much more complicated than traditional models might suggest.
In this analysis, I explore the dynamics between home sellers, real estate platforms like Zillow, and other market participants, focusing on the role of information asymmetry in house price predictions. The aim is to explain how the Zillow Offers program functions and to use a simple model to reason about the outcomes that could have been expected.
Simple Model
Let there be two time periods $t \in \{0,1\}$. At time 0, Zillow makes an offer $p_m$ to a house seller based on its machine learning price prediction model. However, the true market price is unknown to Zillow. Let the true market price in period $t$ be $p_t$.
Assumptions
It is important to note that the predicted prices provided by platforms such as Zillow are based on data the home seller shares, which may be distorted or may not fully capture the true market value. Sellers typically possess more granular, firsthand knowledge about their properties’ condition and unique features, so their price estimates ($p_{st}$) are likely to be closer to the true market price than those generated by a machine learning model. For this simple setting, I make the following assumptions.
Assumption 1: $p_{st}=p_t$
That is, the seller is assumed to have perfect knowledge of their house’s market price.
Assumption 2: The seller and the platform (Zillow) share the same expected rate of change in house prices, $d$.
This assumption ensures that any discrepancy between the predicted price $p_m$ and the seller’s estimate $p_{s0}$ stems solely from differences in information, rather than from divergent expectations about future price trends. Both parties expect the house price to change by the same factor $(1+d)$ in the next period: the seller expects $p_{s0}(1+d)$ and the platform expects $p_m(1+d)$.
Assumption 3: There exist alternative house-flipping firms with the same information as the seller.
Finally, Assumption 3 introduces competing local house-buying firms that are privy to the same information as the house sellers. These smaller firms, common in many U.S. regions, give sellers a competitive outside option, so that a seller is never forced to accept an offer below the true market price.
Analysis
In this analysis, I explore Zillow’s profit dynamics under different conditions in a housing market where the platform’s price prediction model $p_m$ may either overestimate or underestimate the true market price $p_0$ of a property. The outcomes of these varying situations depend on the relationship between Zillow’s price prediction, seller’s objective of maximizing profits and the actual market value.
Case 1: $p_m < p_0$
When Zillow’s predicted price is lower than the true market value, the seller, armed with perfect knowledge, will reject Zillow’s offer.
Instead, the seller would either sell the house to a local firm, which has access to similar information, or hold out for a better offer from a buyer on a different platform. In this case, the transaction does not occur and Zillow earns zero profit:
$\pi = 0$.
Case 2: $p_m \ge p_0$
Time Period 0
When Zillow’s predicted price is higher than or equal to the true market value, the seller is likely to accept Zillow’s offer, as it exceeds what they would receive from a local firm with perfect knowledge. The transaction goes through, and Zillow acquires the house at price $p_m$. For simplicity, I assume the cost of repairs (which Zillow typically undertakes before reselling) is negligible and can be ignored in this analysis.
Time Period 1
At this point, the market price for the house is realized as $p_1$.
Scenario 1: $p_1 \ge p_m$
If the market price rises to at least Zillow’s purchase price, Zillow makes a profit equal to the difference between the selling price at time 1 and the purchase price at time 0, or $p_1 - p_m$.
Scenario 2: $p_0 < p_1 < p_m$
If the market price rises but not enough to exceed Zillow’s higher purchase price $p_m$, Zillow experiences a loss, since the house was acquired at a price greater than its current market value. The loss in this case is $p_m - p_1$.
Scenario 3: $p_1 < p_0 < p_m$
If the market price falls between time periods, Zillow suffers an even greater loss, as it purchased the house at a price above even the original market value. The loss is again $p_m - p_1$, which is now larger because $p_1 < p_0$.
The overall profit function for Zillow under this setting is as follows:
Equation 1
$$
\pi = \begin{cases}
0, & \text{if } p_m < p_0 \\
p_1 - p_m, & \text{if } p_0 \leq p_m \leq p_1 \\
-(p_m - p_1), & \text{if } p_0 < p_1 < p_m \\
-(p_m - p_1), & \text{if } p_1 < p_0 < p_m
\end{cases}
$$
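Equation 1 can be written as a short function. The following is a minimal sketch under the model's assumptions (repair and holding costs ignored); the function name and the sample prices are made up for illustration.

```python
def status_quo_profit(p_m: float, p_0: float, p_1: float) -> float:
    """Platform profit under the original design (Equation 1).

    p_m: the platform's offer at time 0 (the model's prediction)
    p_0: true market price at time 0, known to the seller
    p_1: realized market price at time 1
    """
    if p_m < p_0:
        # The seller rejects a low offer and sells elsewhere: no transaction.
        return 0.0
    # The seller accepts; the platform buys at p_m and resells at p_1.
    return p_1 - p_m


# Hypothetical prices, in thousands of dollars.
print(status_quo_profit(p_m=290, p_0=300, p_1=310))  # Case 1: 0.0
print(status_quo_profit(p_m=305, p_0=300, p_1=320))  # Scenario 1: 15
print(status_quo_profit(p_m=315, p_0=300, p_1=310))  # Scenario 2: -5
print(status_quo_profit(p_m=315, p_0=300, p_1=290))  # Scenario 3: -25
```

Note that once a transaction occurs, the profit depends only on $p_1$ and $p_m$; the true price $p_0$ matters only through the seller's decision to accept.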
Finally, I introduce a parameter $\delta$ to represent the error of Zillow’s machine learning model, measured relative to the seller’s estimate $p_{s0}$ (which equals the true price $p_0$ under Assumption 1).
$$p_m = p_{s0} + \delta $$
A larger error $\delta$ leads to greater losses in scenarios where the market price rises modestly (Scenario 2) or falls (Scenario 3). In Scenario 1, where the market price rises beyond Zillow’s purchase price, a larger error lowers the platform’s profit. Note that if the error is negative (i.e., the predicted price is below the true market price), no transaction occurs, making this case irrelevant. Therefore, only situations where $\delta \ge 0$ are of concern.
A more accurate machine learning model (i.e., one with a smaller error parameter $\delta$) would reduce the risk of substantial losses for Zillow, since it would acquire properties at prices closer to their true value. However, this setup still leaves Zillow vulnerable to large losses, particularly when house prices decline. The platform earns a profit only if the price rises by more than the prediction error between the two periods ($p_1 - p_0 \ge \delta$), which makes it highly sensitive to market fluctuations. This highlights the inherent risk that platforms like Zillow face when relying on a predictive model in volatile markets.
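The break-even condition can be made explicit with a short worked derivation. As an illustrative assumption (the symbol $g$ is not part of the original setup), write the realized price as $p_1 = p_0(1+g)$, where $g$ is the realized growth rate between the two periods. Whenever a transaction occurs ($\delta \ge 0$):

$$
\pi = p_1 - p_m = p_0(1+g) - (p_0 + \delta) = p_0 g - \delta, \qquad \pi \ge 0 \iff g \ge \frac{\delta}{p_0}
$$

For example, with $p_0 = 300{,}000$ and $\delta = 15{,}000$, prices must rise by at least 5% before resale for the platform to break even.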
Outcome of Zillow Offers
In November 2021, Zillow made the decision to shut down its Offers program, incurring a significant financial loss of $400 million in a single quarter and a total of $881 million in that year. This led to Zillow laying off 25% of its workforce. Zillow’s competitor, Opendoor, which operates with a similar house-flipping business model, faced comparable struggles, posting a staggering $1.4 billion loss for 2022 and a $275 million loss in the next year, with revenue plummeting by 55%. These financial results from key players in the house-flipping industry raise questions about the long-term sustainability of this business model. The persistent losses suggest that the current setup—where platforms like Zillow and Opendoor purchase homes and rely heavily on predictive models for pricing and timing—may be inherently flawed. A major challenge is the information asymmetry between the platforms and sellers, which skews risk and pricing, making profitability difficult to achieve. This situation calls into question whether such a model can be adjusted or reimagined to ensure more sustainable operations in the future.
Solution
Could such platforms make changes to the business model to obtain better outcomes?
Operating at scale and providing quick preliminary price estimates are crucial steps for real estate platforms aiming to accelerate the house selling process. While machine learning models play an essential role in generating these initial price predictions, the model’s accuracy is vital to minimizing potential losses, as demonstrated in the previous analysis. Overprediction of house prices can lead to significant losses, while underprediction may result in missed sales opportunities. To strike a balance, platforms could adopt some of the following approaches to obtain a more accurate predictive model:
- A modified mean squared error loss function that assigns different weights to over- and under-predictions, allowing for more nuanced pricing decisions.
- Additionally, the pricing model could be enhanced by incorporating three key components:
  - Inherent Value of the House: This would reflect the property’s value based on historical data, independent of market trends.
  - Forecast for Time to Sell: This aspect models the expected time to sell a property, akin to a “time-to-event” model, helping estimate how long it may take to find a buyer.
  - Market Trend Forecasts: A time series-based model that forecasts regional market trends, allowing for adjustments in pricing based on local market conditions.
- Regularly updating the model with fresh data is essential for ensuring its predictions remain accurate and relevant.
- Incorporating alternative data sources—such as property images or detailed seller profiles—into the machine learning framework could provide additional context, further refining the price estimate.
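The first suggestion in the list above, a modified mean squared error, could look something like the sketch below. It is only one possible form: the function name and the weights `over_weight` and `under_weight` are hypothetical, and the 3:1 ratio is an arbitrary placeholder rather than a recommended value.

```python
def asymmetric_mse(y_true, y_pred, over_weight=3.0, under_weight=1.0):
    """Mean squared error with separate weights for over- and under-prediction.

    over_weight and under_weight are hypothetical tuning parameters.
    """
    total = 0.0
    for actual, predicted in zip(y_true, y_pred):
        error = predicted - actual
        # A positive error means the model overpredicts, so the platform
        # would overpay; that side is penalized more heavily here.
        weight = over_weight if error > 0 else under_weight
        total += weight * error ** 2
    return total / len(y_true)


true_prices = [300.0, 300.0]
print(asymmetric_mse(true_prices, [310.0, 300.0]))  # overpredict by 10 -> 150.0
print(asymmetric_mse(true_prices, [290.0, 300.0]))  # underpredict by 10 -> 50.0
```

With these weights, an overprediction of a given size costs three times as much as an underprediction of the same size, which pushes a model trained on this loss toward more conservative offers.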
While a more accurate machine learning model will undoubtedly reduce losses, it is important to note that this business model may still face challenges in sustaining profitability. A potential solution could involve experimenting with different pricing strategies or adjusting the way offers are made to ensure that the platform remains financially viable while still providing value to sellers and buyers alike.
Eliciting Willingness to Accept Solution
A potentially viable model for real estate platforms could involve modifying the current system to better elicit the true value of a house from the seller. In the traditional approach, the platform provides a price estimate to the seller, who then decides whether to accept or reject the offer. However, a more effective strategy might involve letting the seller state their minimum acceptable price upfront. This quoted price would then be compared to the price predicted by the platform’s machine learning model (which is not revealed to the seller). If the seller’s quoted price is below the model’s predicted price, the platform would purchase the house at the seller’s quoted price; otherwise, no transaction would occur.
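The decision rule described above can be sketched in a few lines. The function name is hypothetical, and the treatment of an exact tie ($p_q = p_m$) as "no transaction" follows the wording above literally.

```python
from typing import Optional


def quote_mechanism(p_q: float, p_m: float) -> Optional[float]:
    """Return the purchase price, or None if no transaction occurs.

    p_q: the seller's quoted minimum acceptable price
    p_m: the platform's model prediction, hidden from the seller
    """
    if p_q < p_m:
        # The platform buys at the seller's quote, not at its own prediction.
        return p_q
    return None


# Hypothetical prices, in thousands of dollars.
print(quote_mechanism(p_q=300, p_m=315))  # 300
print(quote_mechanism(p_q=320, p_m=315))  # None
```

The key design choice is that the model prediction acts only as a ceiling: it decides whether a purchase happens, but never sets the price paid.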
The price the seller quotes, denoted as $p_q$, can be analyzed through several scenarios:
Scenario 1
Suppose the seller, who knows the true value of the house ($p_0$), quotes a lower price ($p_q < p_0$). The seller would not benefit from doing so, given that alternative buyers are willing to pay $p_0$. Therefore, the seller would never quote a price below $p_0$.
Scenario 2
If the seller quotes a price higher than the true value ($p_q > p_0$), several sub-cases arise:
Case 1: $p_m > p_q > p_0$
In this case, the platform would accept either price, $p_0$ or $p_q$, but the seller earns more by quoting the higher price $p_q$.
Case 2: $p_q > p_m > p_0$
The platform would reject $p_q$ but would have accepted $p_0$, so the seller gains nothing by quoting a higher price.
Case 3: $p_q > p_0 > p_m$
The platform would not accept either quoted price, and the seller would be indifferent to their choice of price.
The only scenario where quoting a price higher than the true value ($p_0$) benefits the seller is Case 1, where the platform would accept both $p_0$ and $p_q$. However, if the machine learning model has a low error rate and the seller expects the platform’s predictions to be fairly accurate, it would be more prudent for the seller to quote the true value of the house ($p_0$), rather than attempting to overprice it. For simplicity, I assume that the seller quotes $p_0$—the true market value.
Profits of the platform under this setting are as follows:
Equation 2
$$
\pi = \begin{cases}
0, & \text{if } p_m < p_{0} \\
p_1 - p_0, & \text{if } p_0 < p_m \leq p_1 \\
p_1 - p_0, & \text{if } p_0 < p_1 < p_m \\
-(p_0 - p_1), & \text{if } p_1 < p_0 < p_m
\end{cases}
$$
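The two designs can be compared numerically. The sketch below encodes both profit functions under the model's assumptions (the seller quotes $p_0$, repair costs ignored); the function names and sample prices are made up for illustration, and the boundary case $p_m = p_0$ is avoided in the examples.

```python
def status_quo_profit(p_m, p_0, p_1):
    """Original design: the platform buys at its own prediction p_m."""
    return 0.0 if p_m < p_0 else p_1 - p_m


def quote_design_profit(p_m, p_0, p_1):
    """Proposed design: the platform buys at the seller's quote p_0,
    but only if that quote is below the hidden prediction p_m."""
    return 0.0 if p_m <= p_0 else p_1 - p_0


# Hypothetical prices in thousands: the model overpredicts by 15.
p_0, p_m = 300, 315
for p_1 in (320, 310, 290):
    old = status_quo_profit(p_m, p_0, p_1)
    new = quote_design_profit(p_m, p_0, p_1)
    print(p_1, old, new)
# 320 5 20
# 310 -5 10
# 290 -25 -10
```

In these examples the proposed design turns the modest-rise case from a loss into a profit and reduces the loss when prices fall.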
Subtracting Equation 1 from Equation 2 gives the change in platform profit under the new design, $\Delta\pi$:
$$
\Delta\pi = \begin{cases}
0, & \text{if } p_m < p_0 \\
p_m - p_0, & \text{if } p_0 < p_m \leq p_1 \\
p_m - p_0, & \text{if } p_0 < p_1 < p_m \\
p_m - p_0, & \text{if } p_1 < p_0 < p_m
\end{cases}
$$
In every case where a transaction occurs, the gain equals the model’s prediction error, $p_m - p_0 = \delta$.
As highlighted, the proposed model represents a significant improvement over the current system. Under this new design, the platform never earns less than under the status quo, and it earns strictly more whenever its model overpredicts and a transaction occurs. Notably, the platform experiences losses only when house prices fall over time, and those losses are smaller than under the current system. In the traditional model, the platform required a significant increase in house prices to ensure profitability. In contrast, the new approach enables platforms to generate profits even in markets with modestly rising house prices and to avoid losses when prices are stable, thus improving overall business sustainability.
Limitations/Improvements
- The seller is assumed to have perfect knowledge of the true market price of the house; this assumption may be too simplistic. The existence of local buying companies with perfect knowledge may likewise be too simplistic.
- The model assumes that both the platform and the seller have the same expectation of price change. Again, this assumption may be too simplistic.
- I assume that sellers quote the true value of the house and do not overreport. A richer model could incorporate seller beliefs about the accuracy of the price prediction model, as well as the risk preferences of the seller.
- This setting does not account for the existence of competing machine learning-based house-flipping buyers. It would be interesting to observe competing buyers under different scenarios such as:
  - Both have accurate machine learning price prediction models.
  - One has an inaccurate price prediction model and the other an accurate one.
  - Both have inaccurate price prediction models.
- Competing buyers with different market shares and sizes would be an interesting case to explore as well.
- The analysis assumes that the seller accepts or rejects the offer from Zillow only once. It does not consider the case where the seller rejects in time period 0 but tries again to sell the house in a future period to obtain a more favorable price. Exploring such scenarios could generate several unique insights.
- A policy maker would be more interested in the total surplus arising from the program. Under the status quo, house sellers are likely to enjoy a higher surplus. The proposed approach may leave total surplus unchanged, with the seller’s surplus falling and the buyer’s (Zillow’s) surplus rising. However, a stable business offering may have several other positive downstream economic implications.
Conclusion
Under the simplifying assumptions outlined in the analysis, Zillow and similar house-flipping platforms can reduce their losses or even generate profits by implementing simple design changes alongside machine learning-based pricing systems. A key takeaway from the analysis is that while improvements in predictive accuracy are important, they may not be as transformative as expected. Instead, platforms may benefit more from exploring design-based approaches that optimize the interaction between predicted prices and seller expectations.
However, it is important to recognize the limitations of the assumptions made in this analysis. These assumptions may be oversimplified or inaccurate, hence the results should be interpreted with caution. Further analysis and refinement of these models are necessary to fully understand their applicability in dynamic, real-world markets.