Real estate abstract
An automated real estate appraisal system (100) and method generate
estimates of real estate value using a predictive model such as
a neural network (908). The predictive model (908) generates these
estimates based on learned relationships among variables describing
individual property characteristics (905) as well as general neighborhood
characteristics at various levels of geographic specificity (906).
The system (100) may also output reason codes indicating relative
contributions (1009) of various variables to a particular result,
and may generate reports (701) describing property valuations, market
trend analyses, property conformity information, and recommendations
regarding loans based on risk related to a property.
Real estate claims
What is claimed is:
1. A computer-implemented process for appraising a real estate
property, comprising the steps of:
collecting training data;
developing a predictive model from the training data;
storing the predictive model;
obtaining individual property data for the real estate property;
generating a signal indicative of an appraised value for the real
estate property responsive to application of the obtained individual
property data to the stored predictive model;
developing an error model from the training data;
storing the error model; and
generating a signal indicative of an error range for the appraised
value responsive to application of the individual property data
to the stored error model.
2. The computer-implemented process of claim 1, wherein the error
model comprises a regression model.
3. A computer-implemented process for appraising a real estate
property, comprising the steps of:
collecting training data;
developing a predictive model from the training data;
storing the predictive model;
obtaining individual property data for the real estate property;
generating a signal indicative of an appraised value for the real
estate property responsive to application of the obtained individual
property data to the stored predictive model;
developing a lower percentile error model from the training data;
developing an upper percentile error model from the training data;
storing the lower percentile error model;
storing the upper percentile error model;
generating a signal indicative of a lower bound value for the real
estate property responsive to application of the obtained individual
property data to the stored lower percentile error model; and
generating a signal indicative of an upper bound value for the
real estate property responsive to application of the obtained individual
property data to the stored upper percentile error model.
4. The computer-implemented process of claim 3, wherein:
the lower percentile error model is a computer-implemented neural
network; and
the upper percentile error model is a computer-implemented neural
network.
5. A computer-implemented process for appraising a real estate
property, comprising the steps of:
obtaining individual property training data describing past real
estate sales;
aggregating the obtained property training data into area training
data sets, each area training data set describing a plurality of
sales within a geographic area;
developing a predictive model from the training data;
storing the predictive model;
obtaining individual property data for the real estate property;
and
generating a signal indicative of an appraised value for the real
estate property responsive to application of the obtained individual
property data to the stored predictive model.
6. The computer-implemented process of claim 5, wherein the step
of aggregating is repeated using successively larger geographic
areas until the number of sales within the geographic area over
a predetermined time period exceeds a predetermined number.
7. A computer-implemented process for appraising a real estate
property, comprising the steps of:
collecting training data;
performing the iterative substeps of:
applying input data to a predictive model;
ranking output data produced thereby responsive to a measure of
quality; and
adjusting operation of the model responsive to the results of the
ranking substep;
storing the predictive model;
obtaining individual property data for the real estate property;
and
generating a signal indicative of an appraised value for the real
estate property responsive to application of the obtained individual
property data to the stored predictive model.
8. The computer-implemented process of claim 7, wherein the predictive
model comprises a computer-implemented neural network having a plurality
of interconnected processing elements, each processing element comprising:
a plurality of inputs;
a plurality of weights, each associated with a corresponding input
to generate weighted inputs;
combining means, coupled to the weighted inputs, for combining
the weighted inputs; and
a transfer function, coupled to the combining means, for processing
the combined weighted inputs to produce an output.
9. A computer-implemented process for appraising a real estate
property, comprising the steps of:
selecting a geographic area surrounding the real estate property;
obtaining area data for the geographic area;
collecting training data;
developing a predictive model from the training data;
storing the predictive model;
obtaining individual property data for the real estate property;
and
generating a signal indicative of an appraised value for the real
estate property responsive to application of the obtained individual
property data and the obtained area data to the stored predictive
model.
10. The computer-implemented process of claim 9, further comprising
the steps of:
developing an error model from the training data;
storing the error model; and
generating a signal indicative of an error range for the appraised
value responsive to application of the individual property data
to the stored error model.
11. The computer-implemented process of claim 10, wherein the error
model comprises a regression model.
12. The computer-implemented process of claim 9, further comprising
the steps of:
developing a lower percentile error model from the training data;
developing an upper percentile error model from the training data;
storing the lower percentile error model;
storing the upper percentile error model;
generating a signal indicative of a lower bound value for the real
estate property responsive to application of the obtained individual
property data to the stored lower percentile error model; and
generating a signal indicative of an upper bound value for the
real estate property responsive to application of the obtained individual
property data to the stored upper percentile error model.
13. The computer-implemented process of claim 12, wherein:
the lower percentile error model is a computer-implemented neural
network; and
the upper percentile error model is a computer-implemented neural
network.
14. A computer-implemented process for appraising a real estate
property, comprising the steps of:
collecting training data;
developing a predictive model from the training data;
storing the predictive model;
obtaining individual property data for the real estate property,
the individual property data comprising a plurality of elements;
generating a signal indicative of an appraised value for the real
estate property responsive to application of the obtained individual
property data to the stored predictive model; and
for each element of the individual property data:
determining a relative contribution of the element to the appraised
value;
determining from each relative contribution a reason code value;
and
generating a signal indicative of the reason code value.
15. A system for appraising a real estate property, comprising:
a predictive model for determining an appraised value for the real
estate property;
training data input means, coupled to the predictive model, for
obtaining training data;
training data aggregation means, coupled to the training data input
means, for aggregating the training data into training data sets,
each training data set describing a plurality of sales within a
geographic area;
a model development component, coupled to the predictive model,
for training the predictive model from the training data;
a storage device for storing the trained predictive model;
individual property data input means, coupled to the predictive
model, for obtaining individual property data and sending the individual
property data to the predictive model;
area data input means, coupled to the individual property data
input means and to the predictive model, for selecting a geographic
area surrounding the real estate property, obtaining area data,
and sending the area data to the predictive model; and
an output device, coupled to the predictive model, for generating
a signal indicative of the appraised value.
16. The system of claim 15, wherein the predictive model comprises
a neural network.
17. The system of claim 15, further comprising:
an error model for determining an error range for the appraised
value;
and wherein:
the training data input means is coupled to the error model;
the model development component trains the error model from the
training data;
the storage device stores the trained error model;
the individual property data input means is coupled to the error
model and sends the individual property data to the error model;
the area data input means is coupled to the error model and sends
the area data to the error model; and
the output device generates a signal indicative of the error range.
18. The system of claim 17, wherein the error model comprises a
regression model.
19. The system of claim 15, further comprising:
a lower percentile error model for determining a lower bound for
the appraised value;
an upper percentile error model for determining an upper bound
for the appraised value;
and wherein:
the training data input means is coupled to the error model;
the model development component trains the lower percentile error
model and the upper percentile error model from the training data;
the storage device stores the trained lower percentile error model
and the trained upper percentile error model;
the individual property data input means is coupled to the lower
percentile error model and the upper percentile error model, and
sends the individual property data to the lower percentile error
model and the upper percentile error model;
the area data input means is coupled to the lower percentile error
model and the upper percentile error model and sends the area data
to the lower percentile error model and the upper percentile error
model; and
the output device generates a signal indicative of the lower bound
and the upper bound for the appraised value.
20. The system of claim 19, wherein:
the lower percentile error model comprises a neural network; and
the upper percentile error model comprises a neural network.
Real estate description
CROSS-REFERENCE TO RELATED APPLICATION
The subject matter of this application is related to the subject
matter of pending U.S. application Ser. No. 07/814,179, for "Neural
Network Having Expert System Functionality", by Curt A. Levey,
filed Dec. 30, 1991, the disclosure of which is incorporated herein
by reference.
The subject matter of this application is further related to the
subject matter of pending U.S. application Ser. No. 07/941,971,
for "Fraud Detection Using Predictive Modeling", by Krishna
M. Gopinathan et al., filed Sep. 8, 1992, the disclosure of which
is incorporated herein by reference.
37 C.F.R. 1.71 AUTHORIZATION
A portion of the disclosure of this patent document contains material
which is subject to copyright protection. The copyright owner has
no objection to the facsimile reproduction by anyone of the patent
document or the patent disclosure, as it appears in the Patent and
Trademark Office records, but otherwise reserves all copyright rights
whatsoever.
BACKGROUND OF THE INVENTION
1. Field of the Invention
This invention relates generally to real estate appraisals and
sales price predictions. In particular, the invention relates to
an automated real estate appraisal system and method that uses predictive
modeling to perform pattern recognition and classification in order
to provide accurate sales price predictions.
2. Description of Related Art
The "appraised value" of a real estate parcel, or property,
comprises some estimate of the full market value of the property
on a specified date. A property's appraised value is of great importance
in many types of real estate transactions, including sales and loans.
Conventionally, appraised value is determined by a professional
appraiser using both objective and subjective factors. One disadvantage
of such a method is the difficulty in ensuring that the appraiser
conducts a neutral, unbiased analysis in arriving at the appraised
value. This difficulty is often compounded by the fact that the
appraiser may be retained and paid by an interested party in the
contemplated transaction, such as a lender, mortgage broker, buyer,
or seller.
In order to reduce bias and provide more accurate appraisals, statistical
techniques may be used to obtain an independent, consistent, mathematically
derived estimate of a property's value to assist an appraiser in
generating an appraised value. Traditional statistical techniques,
such as multiple linear regression and logistic regression, have
been tried, but such techniques typically suffer from a number of
deficiencies. One deficiency is the inability of traditional regression
models to capture complex behavior in predictor variables resulting
from nonlinearities and interactions among predictor variables.
In addition, traditional regression models do not adapt well to
changing trends in the data, so that automated model redevelopment
is difficult to implement.
One example of the difficulty of applying a regression model to
appraisal problems is the uncertainty as to the optimal temporal
and geographical sample size for model development. A model developed
using all homes in one square city block might theoretically be
an effective predictor for that particular neighborhood, but it
may not be possible to develop such a model with sufficient stability
and reliability, due to the relatively small sample size. On the
other hand, a model developed using all homes sold in the United
States in the past month might have a sufficiently large sample
size, but might be unable to capture local, neighborhood characteristics
to provide an accurate appraisal. Thus, a significant deficiency
of traditional regression modeling techniques when applied to real
estate appraisals is the inability to successfully model neighborhood
characteristics while including a sufficiently large sample size
to develop a robust, stable statistical model.
It is desirable, therefore, to have an automated system that uses
available information regarding real estate properties to provide
accurate estimates of value. Preferably, such a system should be
flexible enough to allow model development in a relatively small
geographic area, it should be able to handle nonlinearities and
interactions among predictor variables without advance specification,
it should have high predictive accuracy, and it should have capability
for redevelopment of the underlying system model as new patterns
of real estate pricing emerge.
SUMMARY OF THE INVENTION
In accordance with the present invention, there is provided an
automated system (100) and method for real estate appraisals, which
uses one or more predictive models such as neural networks (908)
to generate estimates of real estate value. The predictive models
(908) generate these estimates based on learned relationships among
variables describing individual property characteristics (905).
The models (908) also learn relationships between individual property
characteristics (905) and area characteristics (906). Area characteristics
(906) are stored and applied at a level of geographic specificity
that varies according to the amount of data available at each of
several successively larger geographic areas. In this way the models
(908) are able to capture local neighborhood characteristics without
unduly reducing sample sizes, which would reduce reliability and
predictability.
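The choice of geographic level can be pictured as a search from the smallest area outward, stopping at the first level that holds enough recent sales. The following Python sketch is illustrative only. The level names are borrowed from the area averages screen of FIG. 6, while `MIN_SALES`, `sales_by_level`, and `choose_area_level` are hypothetical names and values assumed for this example and are not taken from the disclosed system.

```python
# Illustrative sketch only; the names and the threshold are assumptions.
LEVELS = ["neighborhood", "local", "extended", "region", "county"]
MIN_SALES = 30  # hypothetical minimum number of recent sales


def choose_area_level(sales_by_level, min_sales=MIN_SALES):
    """Return the smallest geographic level with more than min_sales sales.

    sales_by_level maps a level name to the number of sales recorded in
    that area over the chosen time period. If no level has enough sales,
    the largest level is returned as a fallback.
    """
    for level in LEVELS:
        if sales_by_level.get(level, 0) > min_sales:
            return level
    return LEVELS[-1]


# Made-up sale counts for one subject property.
counts = {"neighborhood": 4, "local": 19, "extended": 57,
          "region": 410, "county": 2900}
print(choose_area_level(counts))
```

With these made-up counts the neighborhood and local areas are too sparse, so the sketch settles on the extended area.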
The learned relationships among individual property characteristics
(905) and area characteristics (906) enable the system (100) to
estimate the value of the property being appraised. Error models
(909) may also be provided to generate an estimated value range
or error interval for the sales price. The appraised value and error
estimate may then be provided as output (907) to a human decision-maker,
along with other related information such as: reason codes that
reveal the relative contributions of various factors to the appraised
value; and various measures of market trends. Finally, the system
(100) periodically monitors its performance, and redevelops the
models (908, 909) when performance drops below a predetermined level.
BRIEF DESCRIPTION OF THE DRAWINGS
FIG. 1 is a block diagram of an implementation of the present invention.
FIG. 2 is a sample data entry screen that forms part of a typical
input/output interface for the present invention.
FIG. 3 is a sample quick data entry screen that forms part of a
typical input/output interface for the present invention.
FIG. 4 is a sample record selection screen that forms part of a
typical input/output interface for the present invention.
FIG. 5 is a sample sales price estimate screen that forms part
of a typical input/output interface for the present invention.
FIG. 6 is a sample area averages screen that forms part of a typical
input/output interface for the present invention.
FIG. 7 is a sample report produced by the present invention.
FIG. 8 is a flowchart illustrating the major functions and operation
of the present invention.
FIG. 9 is a block diagram showing the overall functional architecture
of the present invention.
FIG. 10 is a block diagram showing the property valuation process
of the present invention.
FIG. 11 is a flowchart showing a method of determining area and
obtaining area data according to the present invention.
FIG. 12 is a flowchart showing a method of generating reports according
to the present invention.
FIG. 13 is a flowchart showing a method of performing market trend
analysis according to the present invention.
FIG. 14 is a flowchart showing a method of determining property
conformity according to the present invention.
FIG. 15 is a flowchart showing a method of comparing an estimated
property value to user-specified values according to the present
invention.
FIG. 16 is a flowchart showing a method of generating recommendations
according to the present invention.
FIG. 17 is a diagram showing an example of geographic subdivision
according to the present invention.
FIG. 18 is a flowchart showing a method of aggregating individual
property data into successively larger geographical areas according
to the present invention.
FIG. 19 is a diagram showing an example of the relationship between
individual property characteristics and area characteristics.
FIG. 20 is a diagram of a single processing element within a neural
network.
FIG. 21 is a diagram illustrating hidden processing elements in
a neural network.
DETAILED DESCRIPTION OF THE PREFERRED EMBODIMENTS
The Figures depict preferred embodiments of the present invention
for purposes of illustration only. One skilled in the art will readily
recognize from the following discussion that alternative embodiments
of the structures and methods illustrated herein may be employed
without departing from the principles of the invention described
herein.
Referring now to FIG. 1, there is shown a block diagram of a typical
implementation of a system 100 in accordance with the present invention.
The user supplies property data to system 100 via input device 105.
Central processing unit (CPU) 101 runs software program instructions,
stored in program storage 107, which direct CPU 101 to perform the
various functions of system 100. In the embodiment illustrated herein,
the software program is written in the Microsoft Excel macro language
and the ANSI C language. Each of these languages may be run on a
variety of conventional hardware platforms. Data storage 103 contains
data describing real estate properties, as well as regional data.
It also contains model parameters. In accordance with the software
program instructions, CPU 101 accepts input from input device 105,
accesses data storage 103, and uses RAM 102 in a conventional manner
as a workspace. CPU 101, data storage 103, and program storage 107
operate together to provide predictive neural network models 908
for real estate appraisal, as well as error models 909 for generating
error ranges for the appraised values. If desired, multiple models
908 and 909 may be used (for example, one for each geographic region),
particularly when property pricing characteristics vary widely from
region to region. After neural network models 908 and error models
909 process the information, as described below, to obtain estimates
of property value and error range, a signal indicative of the estimate
and error range is sent from CPU 101 to output device 104.
In the embodiment illustrated herein, CPU 101 can be a mainframe
computer or a powerful personal computer; RAM 102 and data storage
103 are conventional RAM, ROM and disk storage devices for the CPU;
and output device 104 is a conventional means for either printing
results based on the signals generated by neural network models
908 and error models 909, displaying the results on a video screen
using a window-based interface system, or sending the results to
a database for later access.
Referring now also to FIGS. 2 through 7, there are shown sample
screens from a conventional window-based interface system (not shown)
that forms part of output device 104. FIG. 2 shows data entry form
201 that allows the user to enter data describing a property for
appraisal. Form 201 is also known as a Uniform Residential Appraisal
Report (URAR) form. It contains a number of data fields 202. Scroll
bars 203 are provided to allow navigation throughout form 201.
FIG. 3 shows quick data entry form 301 that allows quick entry
of property data without using a complete URAR form 201. This form
is intended for use when a quick estimate of property value is required.
A number of fields 302 are provided, which represent a subset of
the fields 202 in URAR form 201. Data entered on URAR form 201 for
a particular property is automatically transferred to quick form
301, and vice versa.
FIG. 4 shows record selection screen 401 that allows the user to
select among previously-entered property records in order to view
URAR form 201 for the selected record. Record selection screen 401
lists a plurality of records 402, showing the address 403, city
404, map reference 405, sale price 406, assessor parcel number (APN)
group 407, ZIP code 408, and sale date 409 for each record. Scroll
bars 410 are provided to allow navigation throughout the list of
records, and selected records are indicated by highlighting 411.
FIG. 5 shows sales price estimate screen 501 that provides appraisal
information. Estimated sales price 502 is shown, along with lowest
typical sales price 503 and highest typical sales price 504. Also
shown are positive contribution factors 505 that tend to drive the
price of the property up, and negative contribution factors 506
that tend to drive the price of the property down.
FIG. 6 shows area averages screen 601 that shows average values
602 for several property criteria 603 for a selected geographic
area, alongside comparative values for a selected property 605.
Clicking arrow buttons 606 changes the level of geographic specificity,
according to the following sequence: neighborhood, local, extended,
region, and county. The example shows neighborhood values, representing
the average values for all properties sold in the same neighborhood
as the selected property, over a period of time prior to the selected
sale.
Referring now to FIG. 7, there is shown statistical review report
701 summarizing property information and estimated value, and providing
recommendations regarding loan processing with respect to the property.
This type of report would typically be used when system 100 of the
present invention is employed to appraise properties in connection
with loan processing. Identification portion 702 identifies the
loan, property, and appraisal to which report 701 pertains. Explanatory
portion 703 gives general explanatory information concerning the
report. Regional trend analysis portion 704 reports average sales
prices for the county and ZIP code in the four preceding semiannual
periods, indicating market stability and providing a broad foundation
for valuation and risk analysis. Local trend analysis portion 705
reports average sales prices for smaller geographic areas, such
as the census tract, map grid, and assessor parcel number (APN)
group, in the four preceding semiannual periods, indicating local
market stability and providing further information useful for
valuation and risk analysis. Subject conformity portion 706 compares
sales price, square footage, and price per square foot for the property
with the norms for the neighborhood. Subject valuation portion 707
provides a value range for the property based on the characteristics
of the property and the region, and compares the value range with
an appraisal value determined by an independent human appraiser.
Subject valuation portion 707 also provides an indication of the
loan-to-value (LTV) ratio of the loan, and a comparison with a user-supplied
maximum LTV ratio. Summary and recommendations portion 708 summarizes
the information given in the other portions and recommends one of
"Proceed", "Caution", or "Suspend".
Referring now to FIG. 8, there is shown an overall flowchart illustrating
the major functions and operation of system 100. First, neural network
models 908 are trained 801 using training data describing a number
of individual real estate properties, characteristics, and prices,
as well as area characteristics. If real estate pricing characteristics
vary widely from region to region, it may be advantageous to use
different models 908 for the different regions (counties, for example).
Once neural network models 908 are trained, neural network model
parameters are stored 802. Error models 909, which are typically
regression models, are then trained 803 using additional training
data and output from neural network models 908. Once error models
909 are trained, error model parameters are stored 804, and system
100 is able to estimate prices and pricing errors for a subject
property. System 100 obtains 805 property data describing the subject
property 905, as well as data describing the area in which the subject
property is situated 906. System 100 then applies 806 property data
905 and area data 906 to the appropriate stored neural network model
908. It then applies 807 property data 905 and area data 906 to
the appropriate stored error model 909. The models 908 and 909 estimate
sales price, reason codes (described below), and estimated error,
which are output 808 to the user, or to a database, or to another
system via output device 104.
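The sequence of FIG. 8 can be illustrated with a deliberately simplified Python sketch. It is not the disclosed implementation: a one-variable least-squares line stands in for neural network model 908, a second line fitted to the absolute residuals stands in for error model 909, and the same made-up data are used to train both, whereas the text above mentions additional training data. The function names and all figures are assumptions made for the example.

```python
def fit_line(xs, ys):
    """Ordinary least-squares fit of y = a + b*x; returns (a, b)."""
    n = len(xs)
    mx = sum(xs) / n
    my = sum(ys) / n
    sxx = sum((x - mx) ** 2 for x in xs)
    sxy = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    b = sxy / sxx
    return my - b * mx, b


def predict(params, x):
    """Apply a stored (a, b) line to a single input value."""
    a, b = params
    return a + b * x


# Made-up training data: (square feet, sale price).
sqft = [1200, 1500, 1800, 2100, 2500, 3000]
price = [150_000, 182_000, 214_000, 262_000, 298_000, 371_000]

# Train and "store" the value model (a line stands in for the network).
value_model = fit_line(sqft, price)

# Train and store an error model on the value model's absolute residuals.
residuals = [abs(p - predict(value_model, s)) for s, p in zip(sqft, price)]
error_model = fit_line(sqft, residuals)

# Obtain subject property data and apply both stored models.
subject_sqft = 2250
estimate = predict(value_model, subject_sqft)
error = max(0.0, predict(error_model, subject_sqft))
print(f"estimate {estimate:,.0f}, "
      f"range {estimate - error:,.0f} to {estimate + error:,.0f}")
```

The point of the sketch is the ordering: both models are trained and stored before any subject property is presented, and the subject data are then applied to each stored model in turn.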
Referring now to FIG. 9, the overall functional architecture of
system 100 is shown. System 100 is broken down into two major components:
model development component 901 and property valuation component
902. Model development component 901 uses training data 904 describing
a number of real estate properties, characteristics, and prices
to build neural network models 908 containing information representing
learned relationships among a number of variables. Together, the
learned relationships form models 908 of the behavior of the variables.
Although neural network models 908 are used in the embodiment illustrated
herein, any type of predictive modeling technique may be used, such
as regression modeling. For purposes of illustration, the invention
is described here in terms of neural network statistical models
908. Model development component 901 also uses training data 904
to develop error models 909, which are typically regression models
used to estimate error in predicted sales prices generated by neural
network models 908.
Property valuation component 902 feeds input data describing the
subject property 905 and its geographic area 906 to neural network
models 908 and error models 909. It obtains results from models
908 and 909 and generates price estimates, error ranges, and reason
codes. A report is prepared using this information, and the report
is output 907 to a screen display or printer, or stored in
a database for future access.
Each of the two components 901 and 902 of system 100 will be described
in turn.
Model Development Component 901
Neural networks employ a technique of "learning" relationships
through repeated exposure to data and adjustment of internal weights.
They allow rapid model development and automated data analysis.
Essentially, such networks represent a statistical modeling technique
that is capable of building models 908 from data containing both
linear and non-linear relationships. While similar in concept to
regression analysis, neural networks are able to capture nonlinearity
and interactions among independent variables without pre-specification.
In other words, while traditional regression analysis requires that
nonlinearities and interactions be detected and specified manually,
neural networks perform these tasks automatically. For a more detailed
description of neural networks, see D. E. Rumelhart et al., "Learning
Representations by Back-Propagating Errors", Nature v. 323,
pp. 533-36 (1986), and R. Hecht-Nielsen, "Theory of the Backpropagation
Neural Network", in Neural Networks for Perception, pp. 65-93
(1992), the teachings of which are incorporated herein by reference.
Neural networks comprise a number of interconnected neuron-like
processing elements that send data to each other along connections.
The strengths of the connections among the processing elements are
represented by weights. Referring now to FIG. 20, there is shown
a diagram of a single processing element 2001. The processing element
receives inputs X_1, X_2, ..., X_n, either from other
processing elements or directly from inputs to the system. It multiplies
each of its inputs by a corresponding weight w_1, w_2, ..., w_n
and adds the results together to form a weighted sum
2002. It then applies a transfer function 2003 (which is typically
non-linear) to the weighted sum, to obtain a value Z known as the
state of the element. The state Z is then either passed on to another
element along a weighted connection, or provided as an output signal.
Collectively, states are used to represent information in the short
term, while weights represent long-term information or learning.
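The computation performed by a single processing element can be sketched in a few lines of Python. The logistic (sigmoid) function is used here only as an assumed example of a non-linear transfer function, since the description does not name a specific one, and the function names and numeric values are hypothetical.

```python
import math


def sigmoid(s):
    """Logistic transfer function (an assumed choice of non-linearity)."""
    return 1.0 / (1.0 + math.exp(-s))


def processing_element(inputs, weights, transfer=sigmoid):
    """Compute the state Z of one element: transfer(sum of w_i * X_i)."""
    if len(inputs) != len(weights):
        raise ValueError("inputs and weights must have the same length")
    weighted_sum = sum(w * x for w, x in zip(weights, inputs))
    return transfer(weighted_sum)


# Hypothetical inputs and weights for a three-input element.
z = processing_element([1.0, 0.5, -2.0], [0.4, -0.6, 0.1])
print(z)
```

Passing a different `transfer` argument changes the non-linearity without touching the weighted-sum step, which mirrors the separation between weighted sum 2002 and transfer function 2003 in FIG. 20.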
Processing elements in a neural network can be grouped into three
categories: input processing elements (those which receive input
data values); output processing elements (those which produce output
values); and hidden processing elements (all others). The purpose
of hidden processing elements is to allow the neural network to
build intermediate representations that combine input data in ways
that help the model learn the desired mapping with greater accuracy.
Referring now to FIG. 21, there is shown a diagram illustrating
the concept of hidden processing elements. Inputs 2101 are supplied
to a layer of input processing elements 2102. The outputs of the
input elements are passed to a layer of hidden elements 2103. Typically
there are several such layers of hidden elements. Eventually, hidden
elements pass outputs to a layer of output elements 2104, and the
output elements produce output values 2105.
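The flow of values from inputs through hidden elements to outputs can be sketched as follows. This is a minimal illustration under stated assumptions: the input layer simply passes its values on, every other element uses a logistic transfer function, and the weights and layer sizes are invented for the example.

```python
import math


def layer_forward(inputs, weight_rows):
    """Outputs of one layer; each row of weights defines one element."""
    return [
        1.0 / (1.0 + math.exp(-sum(w * x for w, x in zip(row, inputs))))
        for row in weight_rows
    ]


def network_forward(inputs, layers):
    """Pass values through each layer in turn and return the final outputs."""
    values = list(inputs)
    for weight_rows in layers:
        values = layer_forward(values, weight_rows)
    return values


# Hypothetical weights: 3 inputs -> 2 hidden elements -> 1 output element.
hidden = [[0.2, -0.4, 0.1], [0.7, 0.3, -0.5]]
output = [[1.5, -1.0]]
print(network_forward([1.0, 0.0, 2.0], [hidden, output]))
```

Adding further hidden layers only requires appending more weight tables to the `layers` list.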
Neural networks learn from examples by modifying their weights.
The "training" process, the general techniques of which
are well known in the art, involves the following steps:
1) Repeatedly presenting examples of a particular input/output
task to the neural network model;
2) Comparing the model output and desired output to measure error;
and
3) Modifying model weights to reduce the error.
This set of steps is repeated until further iteration fails to
decrease the error. Then, the network is said to be "trained."
Once training is completed, the network can predict outcomes for
new data inputs.
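The three training steps can be illustrated with a deliberately simplified sketch: a single linear element trained by gradient descent on squared error. The function name, learning rate, stopping tolerance, and example data are all assumptions chosen for illustration; a full network would use backpropagation to compute the weight changes for hidden elements.

```python
def train_linear_element(examples, learning_rate=0.05,
                         max_epochs=5000, tolerance=1e-12):
    """examples: list of (inputs, desired_output) pairs.
    Returns (weights, total squared error at those weights)."""
    n_inputs = len(examples[0][0])
    weights = [0.0] * n_inputs
    previous_error = float("inf")
    total_error = 0.0
    for _ in range(max_epochs):
        gradient = [0.0] * n_inputs
        total_error = 0.0
        for inputs, desired in examples:           # step 1: present examples
            output = sum(x * w for x, w in zip(inputs, weights))
            diff = output - desired                # step 2: measure error
            total_error += diff * diff
            for i, x in enumerate(inputs):
                gradient[i] += 2.0 * diff * x
        # Stop once further iteration fails to decrease the error.
        if previous_error - total_error <= tolerance:
            break
        previous_error = total_error
        # Step 3: modify weights to reduce the error.
        weights = [w - learning_rate * g / len(examples)
                   for w, g in zip(weights, gradient)]
    return weights, total_error

# Hypothetical task: desired = 2 * x + 1, with a constant 1.0 as a bias input.
examples = [([0.0, 1.0], 1.0), ([1.0, 1.0], 3.0),
            ([2.0, 1.0], 5.0), ([3.0, 1.0], 7.0)]
weights, error = train_linear_element(examples)
```

After training, the weights should be close to 2 and 1, so the element can predict the output for inputs it has not seen.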
In the present invention, data used to train models 908 are drawn
from various database files containing data on individual properties.
These data are aggregated to obtain medians and variances across
geographic areas. Thus, models 908 are able to capture relationships
among individual property characteristics, as well as relationships
between individual property characteristics and the characteristics
of the surrounding geographic area.
Referring now to FIG. 19, there is shown an example of this technique.
House 1901 has associated with it individual property characteristics
1902, namely 2500 square feet, 3 bedrooms, and a 6000 square foot
lot. In order to provide effective predictive modeling of the estimated
selling price of house 1901, neural network models 908 use area
characteristics 1904 for geographic area 1903. The area characteristics
1904 are 2246.4 average square feet, 2.5 average bedrooms, 7267.2
average square foot lot size, and $267,000 average selling price.
These represent averages for homes sold in area 1903 in the last
x months, where x is a predetermined time period. By comparing individual
property characteristics 1902 with area characteristics 1904, neural
network models 908 are able to more effectively estimate the selling
price of house 1901.
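Using the figures from FIG. 19, the comparison between a property and its area can be sketched as a set of differences (subject property minus area value). The dictionary keys and the function name are hypothetical; the exact way the models combine these inputs is not specified here.

```python
# Individual property characteristics 1902 and area characteristics 1904.
subject = {"sq_ft": 2500.0, "bedrooms": 3.0, "lot_sq_ft": 6000.0}
area = {"sq_ft": 2246.4, "bedrooms": 2.5, "lot_sq_ft": 7267.2,
        "price": 267000.0}

def differentials(subject, area):
    """Return subject-minus-area differences for characteristics
    present in both records."""
    return {name: subject[name] - area[name]
            for name in subject if name in area}

diffs = differentials(subject, area)
# The house is about 253.6 sq ft larger than the area average, has 0.5 more
# bedrooms, and sits on a lot about 1267.2 sq ft smaller.
```

Differences of this kind, together with the area selling price, give a model a direct signal of how the property stands relative to its surroundings.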
An important factor in the effectiveness of neural network models
908 is the sample size used to train models 908. Conventional regression
models and model designs for real estate appraisal generally use
small samples in an attempt to provide a homogeneous group of properties
in the developmental sample. See J. Mark & M. Goldberg, "Multiple
Regression Analysis and Mass Assessment: A Review of the Issues",
The Appraisal Journal v. 56(1), pp. 89-109 (1988), and H.-B. Kang
& A. Reichert, "An Empirical Analysis of Hedonic Regression
and Grid-Adjustment Techniques in Real Estate Appraisal", AREUEA
Journal v. 19, no. 1, pp. 70-91 (1991), the teachings of which are
incorporated herein by reference. For example, properties within
a single city block would generally provide effective predictor
models for capturing neighborhood characteristics within the block.
A problem with this approach is that a large number of distinct
models must be built. Since each model is created using a set of
training data describing properties within the associated city block,
an extremely large number of properties is required to effectively
train all the models.
On the other hand, use of larger geographic areas such as ZIP codes
results in diminished ability to capture local neighborhood characteristics.
The method of the present invention provides effective predictor
variables that preserve information describing neighborhood characteristics
without unduly increasing the number of models and predictor variables
required for training. It accomplishes this by aggregating individual
property data in the training data set into area characteristics
in a flexible manner, using the smallest geographic areas containing
sufficient data to produce reliable models 908. The models 908 are
thus able to capture area characteristics for relatively small geographic
areas where the data describing these characteristics are available.
Referring now to FIG. 17, there is shown an example of geographic
subdivision according to the present invention. Each region 1701
is divided into successively smaller geographic areas. In the example
shown, the geographic areas are ZIP codes 1702, census tracts 1703,
map coordinates 1704, and assessor parcel number (APN) groups 1705.
Other geographic areas, such as census blocks, or lot blocks, may
also be used.
Referring now also to FIG. 18, there is shown a flowchart of the
aggregation method. System 100 uses data describing real estate
sales activity for each month of a user-specified training period,
such as eighteen months. For each month within the training period,
system 100 performs the steps shown in FIG. 18. System 100 initially
defines 1804 the "neighborhood" as the smallest geographical
area, such as the APN group 1705. If there have been any sales in
the previous 12 months 1805, system 100 proceeds to step 1813. If
not, it defines 1806 the neighborhood as the next larger geographic
area, the map code 1704. If there have been at least 3 sales in
the previous 12 months 1807, system 100 proceeds to step 1813. System
100 continues to enlarge the definition of the neighborhood until
a predetermined minimum number of sales have occurred within a predetermined
period of time. The minimum number of sales and the period of time
in steps 1805, 1807, 1809, and 1811 may vary according to the optimal
sample size and geographic specificity required. In addition, the
number and size of the geographic areas may vary. Once the predetermined
minimum number of sales for a particular geographic area has been
met, system 100 determines 1813 medians, averages, and variances
for various property characteristics such as sales price, square
feet, number of bedrooms, etc.
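The neighborhood-enlargement loop of FIG. 18 might be sketched as below. The record layout, level names, and function names are hypothetical, and the minimum sales counts for the census tract and ZIP code levels are placeholders, since the text gives thresholds only for the first two levels. Population variance is used here as an assumption; the text does not say which variance estimator is intended.

```python
from statistics import mean, median, pvariance

def select_neighborhood(subject_areas, sales, levels, window_months=12):
    """Return (level_name, matching_sales) for the smallest area that
    meets its minimum number of sales, or (None, []) if none does.

    levels: list of (level_name, min_sales), smallest area first.
    """
    for level_name, min_sales in levels:
        area_id = subject_areas[level_name]
        matching = [s for s in sales
                    if s["areas"].get(level_name) == area_id
                    and s["months_ago"] <= window_months]
        if len(matching) >= min_sales:
            return level_name, matching
    return None, []

def summarize(matching, field):
    """Median, average, and (population) variance of one characteristic."""
    values = [s[field] for s in matching]
    return {"median": median(values), "mean": mean(values),
            "variance": pvariance(values)}

# Thresholds for census_tract and zip_code are assumed placeholders.
levels = [("apn_group", 1), ("map_code", 3),
          ("census_tract", 5), ("zip_code", 10)]
subject_areas = {"apn_group": "A1", "map_code": "M1",
                 "census_tract": "T1", "zip_code": "Z1"}
sales = [
    {"areas": {"apn_group": "A2", "map_code": "M1"}, "months_ago": 2, "price": 200000},
    {"areas": {"apn_group": "A3", "map_code": "M1"}, "months_ago": 5, "price": 250000},
    {"areas": {"apn_group": "A4", "map_code": "M1"}, "months_ago": 9, "price": 300000},
]
level, matching = select_neighborhood(subject_areas, sales, levels)
stats = summarize(matching, "price")
```

In this example no sales fall in the subject's APN group, so the neighborhood is enlarged to the map code level, where three recent sales are found.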
Property characteristics used as predictor variables in the embodiment
illustrated herein include, for example:
PREDICTOR VARIABLES IN AREAS MODEL
AIR_COND: type of air conditioning
BA_FLCND: condition of bathroom floor
BA_FLMAT: bathroom floor material
BA_NUM: number of bathrooms
BA_WNCON: condition of bathroom wainscot
BA_WNMAT: bathroom wainscot material
BEDRMS_N: number of bedrooms
FRPL_NUM: number of fireplaces
FRPL_TYP: type of fireplace
FL_ZONE: flood zone?
FLR_MAT: main floor material
FND_INF: foundation infestation?
FND_SETL: foundation settlement?
IMP_TYPE: improvement type (attached, townhouse, etc.)
LANDSCAP: adequate landscaping?
MAN_HOME: manufactured house?
OWN_TYPE: ownership type (condo, single family residence, etc.)
P_COND: condition of parking structure
P_SPACES: number of parking spaces
P_STRAGE: type of parking (garage, carport, etc.)
P_DOROPN: electric garage door opener?
ROOFTYPE: type of roofing material
R_TOT_N: number of rooms
LOT_SHAP: lot shape
SITE_INF: site influence (ocean, mountains, etc.)
SI_STM: public or private street maintenance?
SI_STT: street surface material
PARCLSIZ: size of parcel (typical, undersized or oversized)
SQ_FT_LA: square footage of living area
STRYSFDU: number of stories
STYLECOD: style of house (colonial, ranch, etc.)
TOPOCODE: topography of lot (level, hilly, etc.)
WAL_EXTT: exterior wall material
POOLTYPE: pool, spa, both, or none
AGE: age of home
HOA: home owner's dues?
ECONLIFE: economic life (remaining years) of house
LN_LOT: natural log of lot size
APN_COMP: median comps price (local neighborhood)
ZIP_COMP: median comps price (zip code or county-wide)
ZIPCMPBE: median # of bedrooms in comps (zip code or county-wide)
ZIPCMPBA: median # of bathrooms in comps (zip code or county-wide)
ZIPCMPSQ: median square footage in comps (zip code or county-wide)
ZIPCMPAG: median age in comps (zip code or county-wide)
ZIPCMPRM: median # of rooms in comps (zip code or county-wide)
ZIPCMPGA: median # of parking spaces in comps (zip code or county-wide)
ZIPCMPFP: median # of fireplaces in comps (zip code or county-wide)
APNDIFAG: age differential (current property minus local comps)
APNDIFBA: # bathrooms differential (current property minus local comps)
APNDIFBE: # bedrooms differential (current property minus local comps)
APNDIFFP: # fireplaces differential (current property minus local comps)
APNDIFGA: # park spaces differential (current property minus local comps)
APNDIFRM: # rooms differential (current property minus local comps)
APNDIFSQ: sq. footage differential (current property minus local comps)
Once system 100 has obtained predictor variables as described above
for each month in the training period, the predictor variables are
fed to networks 908 and networks 908 are trained. The embodiment
illustrated herein uses a modeling technique known as a backpropagation
neural network 908. This type of network 908 estimates parameters
which define relationships among variables using a training method.
The preferred training method, well known to those skilled in the
art, is called "backpropagation gradient descent optimization"
and is described in Gopinathan et al., although other well-known
neural network training techniques may also be used.
Once the neural networks have been trained using training data,
the network model definitions are stored in data files in a conventional
manner. These data files describe the neural network architecture,
weights, the data configuration, data dictionary, and file format.
In addition to developing and storing neural network models 908
as described above, model development component 901 also develops
error models 909. Typically, these error models 909 are implemented
as conventional regression models, known to those skilled in the
art, although other predictive modeling techniques, such as neural
networks, may be used. As with neural network models 908, different
error models 909 may be provided for different regions.
To develop error models 909, system 100 determines the absolute
percent error of the neural network model estimate for each record
in the training data set. Based on a set of input parameters, error
model 909 is trained to forecast the absolute percent error of the
neural network model estimate. Training methods for regression models
are well known in the art. An example of a set of input parameters
used in the embodiment illustrated herein is given below:
PREDICTOR VARIABLES IN AREAS ERROR MODEL
PRED_SP: predicted sales price
PRED_SP2: square of PRED_SP
PRED_SP3: cube of PRED_SP
APNPDIF: normalized difference between PRED_SP and local median price
APNPDIF2: square of APNPDIF
APNPDIF3: cube of APNPDIF
ZIPPDIF: normalized difference between PRED_SP and zip code median price
APNSRC: size of local neighborhood
AIR_COND: type of air conditioning
BA_FLMAT: bathroom floor material
BA_WNCON: bathroom wainscot condition
FND_INF: foundation infestation?
FRPL_TYP: type of fireplace
IMP_TYPE: improvement type (attached, townhouse, etc.)
MAN_HOME: manufactured house?
OWN_TYPE: ownership type (condo, single family residence, etc.)
PARCLSIZ: size of parcel (typical, undersized or oversized)
POOLTYPE: pool, spa, both, or none
P_COND: condition of parking structure
P_DOROPN: electric garage door opener?
P_STRAGE: type of parking (garage, carport, etc.)
SI_STM: public or private street maintenance?
TOPOCODE: topography of lot (level, hilly, etc.)
WAL_EXTT: exterior wall material
AGE: age of home
AGE2: square of AGE
AGE3: cube of AGE
APNDIFAG: difference between age of home and local median age
APNDFAG2: square of APNDIFAG
APNDFAG3: cube of APNDIFAG
APNDFBE3: cube of difference between # bedrooms and local median
APNDFFP2: square of difference between # fireplaces and local median
APNDFFP3: cube of difference between # fireplaces and local median
APNDIFGA: difference between # parking places and local median
APNDFGA2: square of APNDIFGA
APNDFRM2: square of difference between # rooms and local median
APNDFSQ3: cube of difference between square footage and local median
BA_NUM: number of bathrooms
ECONLIFE: economic life (remaining years) of house
ECONLIF3: cube of ECONLIFE
PSPACES3: cube of number of parking spaces
R_TOT_N: total number of rooms
R_TOT_N2: square of R_TOT_N
R_TOT_N3: cube of R_TOT_N
SQ_FT_LA: square footage
ZIPCMPAG: difference between age of home and zip code median age
ZIPCMAG3: cube of ZIPCMPAG
ZIPCMPBA: difference between # bathrooms and zip code median
ZIPCMBA2: square of ZIPCMPBA
ZIPCMPBE: difference between # bedrooms and zip code median
ZIPCMBE2: square of ZIPCMPBE
ZIPCMBE3: cube of ZIPCMPBE
ZIPCMFP2: square of difference between # fireplaces and zip code median
ZIPCMFP3: cube of difference between # fireplaces and zip code median
ZIPCMGA2: square of difference between # parking places and zip code median
ZIPCMSQ2: square of difference between square footage and zip code median
ZIPCMSQ3: cube of difference between square footage and zip code median
ZIP_COMP: median sales price across zip code
ZIPCOMP2: square of ZIP_COMP
ZIPCOMP3: cube of ZIP_COMP
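The construction of an error model can be illustrated with a much-reduced sketch: compute the absolute percent error for each training record, then fit an ordinary least-squares line that forecasts that error from a single predictor. Several points here are assumptions: the error is taken relative to the actual sales price, only one predictor is used (the embodiment uses many, including squared and cubed terms), and the function names and sample records are hypothetical.

```python
def absolute_percent_error(estimate, actual):
    """Absolute error of the estimate as a fraction of the actual price
    (assumed definition; the text does not spell out the denominator)."""
    return abs(estimate - actual) / actual

def fit_simple_regression(xs, ys):
    """Closed-form least-squares fit of y = intercept + slope * x."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    return mean_y - slope * mean_x, slope

# Hypothetical training records: (predictor value, model estimate, actual price).
records = [(0.00, 252000.0, 250000.0),
           (0.10, 310000.0, 295000.0),
           (0.25, 405000.0, 370000.0),
           (0.40, 520000.0, 455000.0)]
xs = [r[0] for r in records]
errors = [absolute_percent_error(r[1], r[2]) for r in records]
intercept, slope = fit_simple_regression(xs, errors)

def forecast_error(x):
    """Forecast the absolute percent error for a new predictor value."""
    return intercept + slope * x
```

In these invented records the error grows as the predictor grows, so the fitted slope is positive and the forecast error widens for properties that sit further from the local median.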
Property Valuation Component 902
As seen in FIG. 9, property valuation component 902 reads data
905 describing the property to be appraised (known as subject property)
and data 906 describing the surrounding geographical area, and
generates as output 907 a price estimate for the subject property.
Furthermore, property valuation component 902 outputs a range of
values based on the estimated maximum error of the price estimate,
as well as a list of contributing variables, or reason codes, for
the price estimate.
Property data 905 are generally entered by the user on a data entry
form such as those shown in FIGS. 2 and 3. The data may be entered
either interactively, or in batch mode using tape or disk storage
devices. Property data 905 describe the particular property to be
appraised, and they typically include the same types of predictor
variables as listed above for training data 904.
Area data 906 are collected from databases describing properties
in geographical areas surrounding the subject property. The method
by which area data 906 are collected is described below. Typically,
area data 906 include averages of the same types of predictor variables
as listed above for training data 904.
Referring now to FIG. 10, there is shown a flowchart of the property
valuation process of the present invention. System 100 obtains 1001
property data 905 from user input or batch input. Based on property
data 905, it determines 1001 the appropriate region to be used for
the analysis. As shown in FIG. 17, a region 1701 is a relatively
large geographic area containing a number of smaller geographic
areas. Each region 1701 may be associated with a separate neural
network model 908, as well as a separate error model 909. System
100 then loads 1003 neural network model 908 and error model 909
for the region 1701 containing the subject property.
System 100 then determines 1005 which area to use in the analysis
and obtains area data 906. In determining 1005 the optimal area
for the analysis, system 100 uses a technique that captures local
neighborhood characteristics while including a sufficiently large
sample size to preserve predictability and reliability. Generally,
system 100 accomplishes this by seeking the smallest geographic
area containing both the subject property and at least one other
property that was sold within the past x months, where x is a predetermined
time period.
Referring now also to FIG. 11, there is shown the method of determining
1005 the optimal area to use. As shown in the flowchart, system
100 uses the smallest geographic area containing at least one property
that was sold within the past x months, in addition to the subject
property. The minimum number of properties, the time period, and
the particular areas available for use, may vary depending on the
level of geographic specificity and sample size desired.
System 100 applies 1006 the appropriate neural network 908 to subject
property data 905 and area data 906. Neural network 908 generates
estimated value 1007. System 100 then determines 1008 reason codes
indicating which inputs to model 908 are most important in determining
estimated value 1007. Any technique to generate such reason codes
may be used. In the embodiment illustrated herein, the technique
set forth in co-pending U.S. application Ser. No. 07/814,179, for
"Neural Network Having Expert System Functionality", by
Curt A. Levey, filed Dec. 30, 1991, the disclosure of which is hereby
incorporated by reference, is used. System 100 uses the reason codes
to generate 1009 a list of contribution factors to the estimated
value, shown in FIG. 5 as positive contribution factors 505 and
negative contribution factors 506.
System 100 also estimates 1010 the error range of its appraisal.
In the embodiment illustrated herein, error estimation is performed
by applying error model 909, typically a regression model, to subject
property data 905 and area data 906. Error model 909 uses conventional
regression techniques to generate an absolute percent error estimate
E. System 100 generates a lower bound and an upper bound for the
error range by applying the following formulas:
where P is the estimated property value and E is the absolute percent
error estimate.
Alternatively, system 100 may estimate the error range using a
technique known in the art as robust backpropagation, as described
in H. White, "Supervised Learning as Stochastic Approximation",
International Joint Conference on Neural Networks, San Diego, Calif.
(1990), the teachings of which are incorporated herein by reference.
When robust backpropagation is used, system 100 does not require
error model 909. Rather, two additional secondary neural network
models 908 are used. Each of the two secondary models 908 is trained
to estimate a specified percentile of the conditional distribution
of sales prices. For example, the first secondary model 908 may
be trained to estimate the 10th percentile of the conditional distribution,
while the second secondary model 908 estimates the 90th percentile.
These models 908 are trained and implemented using the same technique
and the same predictor variables as described above for neural
network model 908. When estimating a sales price for a subject property,
the property data are sent to secondary models 908 in addition to
primary neural network model 908. Secondary models 908 produce lower
and upper bounds L and U for the error range.
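One widely used way to make an estimator target a given percentile is to minimize the so-called pinball (quantile) loss. The sketch below uses that loss purely as an illustration of the idea of percentile estimation; it is an assumption and is not asserted to be the loss used in robust backpropagation or in the embodiment. For brevity it fits a single constant rather than a neural network, and the function names and price data are hypothetical.

```python
def pinball_loss(actual, predicted, tau):
    """Asymmetric loss whose minimizer is the tau-th quantile (0 < tau < 1)."""
    diff = actual - predicted
    return tau * diff if diff >= 0 else (tau - 1.0) * diff

def best_constant(values, tau):
    """Pick, among the observed values, the constant prediction with the
    lowest total pinball loss."""
    return min(values,
               key=lambda q: sum(pinball_loss(y, q, tau) for y in values))

# Hypothetical sale prices in thousands of dollars.
prices = list(range(1, 100))
lower = best_constant(prices, 0.10)   # approximates the 10th percentile
upper = best_constant(prices, 0.90)   # approximates the 90th percentile
```

Replacing the constant with a model whose output depends on the property and area data would yield a lower and an upper bound conditioned on those inputs.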
Whichever technique is used to generate the lower and upper bounds
L and U, system 100 then outputs 1011 the estimated property value,
as well as the range defined by L and U, designated in FIG. 5 as
a lowest typical sales price 503 and a highest typical sales price
504.
Finally, system 100 generates 1012 reports as appropriate and as
requested by the user. A typical report is statistical review report
701 shown in FIG. 7.
Referring now to FIG. 12, there is shown a method of generating
1012 reports according to the present invention. System 100 analyzes
1202 market trends as shown in FIG. 13. It first determines 1302
and 1303 county and ZIP code sales price trends over the past 24
months. Then it determines 1304 local sales price trends by census
tract, map code, and APN group over the past 24 months. It classifies
1305 trends as stable, moderate upward or downward trend, or steep
upward or downward trend. The trends and their classifications are
used in generating regional trend analysis portion 704 and local
trend analysis portion 705 of statistical review report 701 shown
in FIG. 7. If alternative geographical subdivisions are used, the
above-described method of market trend analysis is altered accordingly.
System 100 then determines 1203 the degree of conformity of the
subject property with regard to the neighborhood. This is done according
to the method shown in FIG. 14. System 100 determines the median
and variance of neighborhood sales prices 1402, square footages
1403, and prices per square foot 1404. Medians and variances
for other variables may be collected as well, if desired. The distribution
within the neighborhood is used in generating subject conformity
portion 706 of statistical review report 701. System 100 determines
1405 whether the property deviates by more than one standard deviation
from the neighborhood norm. If not, system 100 classifies 1406 the
property as conforming. If the property deviates by more than one
standard deviation, system 100 determines 1407 if the property deviates
by more than two standard deviations. If not, system 100 classifies
1408 the property as non-conforming. If the property deviates by
more than two standard deviations, system 100 classifies 1409 the
property as extremely non-conforming. Additional levels of conformity
classification may be provided. System 100 uses the conformity classification
in generating summary and recommendations portion 708 of statistical
review report 701.
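The conformity test of FIG. 14 can be sketched as a three-way classification. The sketch assumes that the "neighborhood norm" is the neighborhood median and that the standard deviation is the square root of the neighborhood variance; the function name is hypothetical, and the text does not say which of the collected variables is tested.

```python
import math

def classify_conformity(value, neighborhood_median, neighborhood_variance):
    """Classify one property characteristic against the neighborhood."""
    std_dev = math.sqrt(neighborhood_variance)
    deviation = abs(value - neighborhood_median)
    if deviation <= std_dev:            # step 1405: within one std. deviation
        return "conforming"             # step 1406
    if deviation <= 2.0 * std_dev:      # step 1407: within two std. deviations
        return "non-conforming"         # step 1408
    return "extremely non-conforming"   # step 1409

# Hypothetical neighborhood: median price $250,000, std. deviation $10,000.
label = classify_conformity(265000.0, 250000.0, 10000.0 ** 2)
```

Additional levels of classification could be added by inserting further thresholds in the same pattern.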
System 100 generates 1204 subject valuation portion 707 based on
the estimated value determined by neural network 908.
System 100 generates 1205 summary and recommendations portion 708
using the method shown in FIGS. 15 and 16. Referring now to FIG.
15, there is shown the method of comparing an estimated property
value to user-specified values. System 100 determines 1502 whether
the appraised value as determined by a human appraiser falls within
the valuation range generated by neural network model 908 and error
model 909. If not, system 100 determines the percent outside the
range and outputs 1503 this value in summary and recommendations
portion 708. System 100 then determines 1504 if the loan-to-value
(LTV) ratio, based on the estimated value of the property and the
amount of the contemplated loan, is within a user-specified maximum
LTV. If not, system 100 determines the percent above the maximum
and outputs 1505 this value in summary and recommendations portion
708.
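The two checks of FIG. 15 might be sketched as follows. How the percentages are computed is an assumption: here the amount outside the range is expressed relative to the nearer bound, and the LTV excess is expressed relative to the maximum LTV. The function names and sample figures are hypothetical.

```python
def percent_outside_range(appraised, lower, upper):
    """Percent by which the human appraisal falls outside [lower, upper];
    0.0 if it falls within the range (step 1502)."""
    if appraised < lower:
        return (lower - appraised) / lower * 100.0
    if appraised > upper:
        return (appraised - upper) / upper * 100.0
    return 0.0

def percent_above_max_ltv(loan_amount, estimated_value, max_ltv):
    """Percent by which the loan-to-value ratio exceeds the user-specified
    maximum; 0.0 if the ratio is within the maximum (step 1504)."""
    ltv = loan_amount / estimated_value
    if ltv <= max_ltv:
        return 0.0
    return (ltv - max_ltv) / max_ltv * 100.0

# Hypothetical case: valuation range $240,000 to $300,000, appraisal $330,000,
# loan of $225,000 on an estimated value of $250,000, maximum LTV of 0.80.
outside = percent_outside_range(330000.0, 240000.0, 300000.0)
excess = percent_above_max_ltv(225000.0, 250000.0, 0.80)
```

In this example the appraisal lies about 10 percent above the range and the LTV of 0.90 is about 12.5 percent above the maximum, so both values would be reported.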
Referring now to FIG. 16, there is shown the method of generating
recommendations. System 100 determines 1602 whether the LTV is within
the maximum LTV. If not, it recommends suspension of the loan 1603.
If the LTV is within the maximum LTV, system 100 determines 1604
whether the property is conforming. If the property is not conforming,
system 100 determines 1605 whether the property is extremely non-conforming.
If the property is not extremely non-conforming, system 100 recommends
caution with regard to the contemplated loan 1606. If the property
is extremely non-conforming, system 100 recommends suspension of
the loan 1607. If the property is conforming, system 100 determines
1608 whether the market is declining. If so, it recommends caution
1609. If the market is not declining, system 100 determines 1610
whether the appraisal as performed by the human appraiser falls
within the range generated by neural network model 908 and error
model 909. If the appraisal does not fall within the range, system
100 recommends caution 1611. If the appraisal falls within the range,
system 100 recommends that the loan proceed 1612. System 100 outputs
its recommendation as part of summary and recommendations portion
708 of statistical review report 701.
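The decision sequence of FIG. 16 maps directly onto a chain of conditions. The following sketch follows the order of tests given above; the function name, argument names, and the string labels for the conformity classes and recommendations are hypothetical.

```python
def recommend(ltv_within_max, conformity, market_declining, appraisal_in_range):
    """Return 'suspend', 'caution', or 'proceed' for a contemplated loan.

    conformity is one of 'conforming', 'non-conforming',
    or 'extremely non-conforming'.
    """
    if not ltv_within_max:                        # step 1602
        return "suspend"                          # step 1603
    if conformity != "conforming":                # step 1604
        if conformity == "extremely non-conforming":   # step 1605
            return "suspend"                      # step 1607
        return "caution"                          # step 1606
    if market_declining:                          # step 1608
        return "caution"                          # step 1609
    if not appraisal_in_range:                    # step 1610
        return "caution"                          # step 1611
    return "proceed"                              # step 1612

result = recommend(True, "conforming", False, True)
```

Only a loan that passes every test receives a recommendation to proceed; any single failure yields caution or suspension.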
As an additional disclosure, the source code for the embodiment
illustrated herein of the invention is included below as an appendix.
It should be noted that terminology in the source code may differ
slightly from that in the remainder of the specification. Any differences
in terminology, however, will be easily understood by one skilled
in the art.
From the above description, it will be apparent that the invention
disclosed herein provides a novel and advantageous method of real
estate appraisal. The foregoing discussion discloses and describes
merely exemplary methods and embodiments of the present invention.
As will be understood by those familiar with the art, the invention
may be embodied in many other specific forms without departing from
the spirit or essential characteristics thereof. For example, other
predictive modeling techniques besides neural networks might be
used. In addition, other variables, geographic subdivisions, and
report generation techniques might be used.
Accordingly, the disclosure of the present invention is intended
to be illustrative of the preferred embodiments and is not meant
to limit the scope of the invention. The scope of the invention
is to be limited only by the following claims.