SENIOR PROJECTS (TRABALHOS DE FIM DE CURSO) @PUC-Rio
Title: STACKED ENSEMBLE MODEL FOR PROPERTY PRICE PREDICTION BASED ON GEOGRAPHICALLY WEIGHTED REGRESSION AND TEXT MINING
Author(s): FELIPE ANTONINI MIEHRIG
Contributor(s): FERNANDO LUIZ CYRINO OLIVEIRA - Advisor
Cataloged: 04/MAR/2021 Language(s): ENGLISH - UNITED STATES
Type: TEXT Subtype: SENIOR PROJECT
Notes: All data contained in the documents are the sole responsibility of the authors. The data used in the document descriptions conform to the systems of the PUC-Rio administration.
Reference(s): [pt] https://www.maxwell.vrac.puc-rio.br/projetosEspeciais/TFCs/consultas/conteudo.php?strSecao=resultado&nrSeq=51708@1
[en] https://www.maxwell.vrac.puc-rio.br/projetosEspeciais/TFCs/consultas/conteudo.php?strSecao=resultado&nrSeq=51708@2
DOI: https://doi.org/10.17771/PUCRio.acad.51708
Abstract:
Automated valuation models (AVMs) are widely used for property price prediction, but few exploit the text data in real estate classifieds. This project applies the theory behind hedonic models to develop two prediction approaches that are then combined in a stacked ensemble model. A data set of 16,693 properties and their asking prices was scraped from one of the largest real estate agencies in Rio de Janeiro. Using text mining steps, the classified descriptions are vectorized and passed to a Lasso model, while a Geographically Weighted Regression (GWR) is estimated using only numeric variables. The two models are then combined in a two-stage ensemble: a second-stage linear regression finds the optimal linear combination of the GWR and Lasso predictions. The results are promising for property price prediction using both structured and unstructured data.
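The two-stage idea in the abstract can be illustrated with a minimal sketch. This is not the author's implementation: the data are synthetic, the function names (gwr_predict, stack_fit, stack_predict), the Gaussian kernel, and the bandwidth value are assumptions made for illustration, and the text-model predictions are a noisy placeholder standing in for the Lasso output. The sketch shows a GWR-style local weighted least-squares fit and a second-stage linear regression that combines two sets of first-stage predictions.

```python
import numpy as np

def gwr_predict(coords_train, X_train, y_train, coords_new, X_new, bandwidth):
    """GWR-style prediction: one weighted least-squares fit per target location.
    Assumes a Gaussian distance kernel with a fixed bandwidth (illustrative choice)."""
    Xt = np.column_stack([np.ones(len(X_train)), X_train])
    Xn = np.column_stack([np.ones(len(X_new)), X_new])
    preds = np.empty(len(X_new))
    for i, (c, x) in enumerate(zip(coords_new, Xn)):
        d = np.linalg.norm(coords_train - c, axis=1)   # distance to each training point
        w = np.exp(-0.5 * (d / bandwidth) ** 2)        # nearer observations weigh more
        sw = np.sqrt(w)
        beta, *_ = np.linalg.lstsq(Xt * sw[:, None], y_train * sw, rcond=None)
        preds[i] = x @ beta
    return preds

def stack_fit(pred_a, pred_b, y):
    """Second stage: ordinary least squares of y on the two first-stage predictions."""
    Z = np.column_stack([np.ones(len(y)), pred_a, pred_b])
    coef, *_ = np.linalg.lstsq(Z, y, rcond=None)
    return coef

def stack_predict(coef, pred_a, pred_b):
    return coef[0] + coef[1] * pred_a + coef[2] * pred_b

def rmse(y, p):
    return float(np.sqrt(np.mean((y - p) ** 2)))

# Synthetic data (hypothetical): the price/area slope drifts across space, and
# part of the price depends on a signal that only the listing text would reveal.
rng = np.random.default_rng(0)
n = 200
coords = rng.uniform(0, 10, size=(n, 2))
area = rng.uniform(30, 200, size=n)
text_signal = rng.normal(0, 20, size=n)
price = 50 + (1.0 + 0.2 * coords[:, 0]) * area + text_signal + rng.normal(0, 5, size=n)

train, hold = np.arange(0, 150), np.arange(150, n)
X = area[:, None]

# First stage A: GWR on numeric variables, predicted on held-out listings.
gwr_hold = gwr_predict(coords[train], X[train], price[train],
                       coords[hold], X[hold], bandwidth=2.0)
# First stage B: placeholder for the text model's predictions (NOT a real Lasso).
text_hold = price[train].mean() + text_signal[hold] + rng.normal(0, 5, size=len(hold))

# Second stage: linear combination of the two prediction vectors.
coef = stack_fit(gwr_hold, text_hold, price[hold])
stacked = stack_predict(coef, gwr_hold, text_hold)

print("RMSE GWR only :", rmse(price[hold], gwr_hold))
print("RMSE text only:", rmse(price[hold], text_hold))
print("RMSE stacked  :", rmse(price[hold], stacked))
```

On the data used to fit the second stage, the stacked error cannot exceed that of either input, because each input is itself one of the candidate linear combinations. A fair comparison would evaluate the fitted weights on listings not used in either stage.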