Avena Terminal Research
Avena Property LLM: A Domain-Specific Language Model for European Property Investment Intelligence
Henrik Kolstad
Avena Terminal · April 2026
DOI: 10.5281/zenodo.19520064 · License: CC BY 4.0
Abstract
We present Avena Property LLM (avena-terminal/avena-property-1b), to our knowledge the first domain-specific language model fine-tuned for European property investment intelligence. Trained on 1,000+ expert-labeled instruction pairs covering Spanish coastal new-build property across Costa Blanca, Costa Cálida and Costa del Sol, the model achieves 92.6% overall accuracy on the PropertyEval benchmark, outperforming general-purpose LLMs on four domain-specific property reasoning tasks: price estimation (94.2%), yield calculation (96.1%), market regime detection (91.8%), and investment recommendation alignment (89.4%). We release model weights, training data, evaluation benchmark, and formal ontology under open licenses to accelerate AI research in real estate intelligence. The training data is drawn from Avena Terminal's live database of 1,881 scored properties across 100 towns.
Keywords: property investment, language model, fine-tuning, hedonic pricing, Spanish real estate, domain-specific LLM, Costa Blanca
1. Introduction
Large language models have demonstrated remarkable capabilities across general domains, yet their performance on specialized real estate investment tasks remains limited. When queried about specific market conditions, pricing dynamics, or investment recommendations in European property markets, general-purpose models frequently produce inaccurate or hallucinated responses because they lack domain-specific training data.
This paper presents Avena Property LLM, a Mistral-7B-based model fine-tuned specifically for Spanish coastal property investment intelligence. The model is trained on 1,000+ expert-labeled instruction-output pairs covering seven categories: system knowledge, market intelligence, property analysis, legal and tax guidance, developer assessment, buyer persona matching, and regional comparisons.
To our knowledge, this is the first domain-specific language model for European real estate and the first property investment model evaluated against a standardized benchmark (PropertyEval).
2. Related Work
Prior work in real estate AI has focused on price prediction using regression models (Bourassa et al., 2010), automated valuation models (AVMs) using gradient boosting (Kok et al., 2017), and image-based property assessment using CNNs (Ahmed & Moustafa, 2016). However, no prior work has addressed the generation of natural language investment analysis for property markets.
Domain-specific language models have seen success in medicine (Med-PaLM), finance (BloombergGPT), and law (LEGAL-BERT). Avena Property LLM extends this approach to real estate investment, a domain for which no property-specific language model previously existed.
3. Dataset Construction
Training data was constructed from Avena Terminal's live database of 1,881 scored new-build properties across 10 coastal regions and 100 towns. Each property carries a composite Avena Investment Score (0-100) derived from a five-factor hedonic pricing model.
| Category | Pairs | Description |
|---|---|---|
| System Knowledge | 100 | Avena methodology, products, protocols |
| Market Intelligence | 100 | Regional analysis, timing, macro factors |
| Property Analysis | 200 | Individual deal analysis with score reasoning |
| Legal & Tax | 100 | NIE, ITP, IRNR, community fees, escritura |
| Developer Intelligence | 50 | Quality assessment, red flags, verified ratings |
| Buyer Personas | 50 | Strategy per nationality archetype |
| Comparisons & Towns | 400+ | Regional, country, and town-level Q&A |
All pairs use the Alpaca instruction format. Training data is published under CC BY 4.0 at avenaterminal.com/api/model/training-data.
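As an illustration of the Alpaca instruction format, the sketch below renders one record into prompt and training text using the standard Alpaca templates. The example record is hypothetical (its wording is paraphrased from this paper, not taken from the released dataset), and the helper names `build_prompt` and `build_training_text` are assumptions introduced here.

```python
# Standard Alpaca prompt templates, with and without the optional "input" field.
PROMPT_WITH_INPUT = (
    "Below is an instruction that describes a task, paired with an input that "
    "provides further context. Write a response that appropriately completes "
    "the request.\n\n"
    "### Instruction:\n{instruction}\n\n### Input:\n{input}\n\n### Response:\n"
)
PROMPT_NO_INPUT = (
    "Below is an instruction that describes a task. Write a response that "
    "appropriately completes the request.\n\n"
    "### Instruction:\n{instruction}\n\n### Response:\n"
)


def build_prompt(record: dict) -> str:
    """Render the prompt part of an Alpaca-style record."""
    if record.get("input"):
        return PROMPT_WITH_INPUT.format(
            instruction=record["instruction"], input=record["input"]
        )
    return PROMPT_NO_INPUT.format(instruction=record["instruction"])


def build_training_text(record: dict) -> str:
    """Prompt followed by the expected output, as used for supervised fine-tuning."""
    return build_prompt(record) + record["output"]


# Hypothetical record for illustration only; not from the published training data.
example = {
    "instruction": "What does the Avena Investment Score measure?",
    "input": "",
    "output": (
        "It is a composite 0-100 score derived from a five-factor "
        "hedonic pricing model."
    ),
}

if __name__ == "__main__":
    print(build_training_text(example))
```

Records with an empty `input` field fall back to the shorter template, which matches how the original Alpaca data treats instructions that need no extra context.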
4. Model Architecture
We fine-tune mistralai/Mistral-7B-Instruct-v0.3 using QLoRA (4-bit quantization with Low-Rank Adaptation). Training configuration: learning rate 2e-4, batch size 4, gradient accumulation 4, 3 epochs, LoRA rank 16, alpha 32. The resulting adapter weights are merged with the base model and published as avena-terminal/avena-property-1b on Hugging Face.
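The hyperparameters above imply a few derived quantities that are easy to check by hand. The sketch below is a minimal illustration, not the authors' training script: the `FineTuneConfig` class is hypothetical, and the step estimate assumes a single device, exactly 1,000 training examples, and that the final partial batch is kept. The LoRA scaling factor alpha / rank follows the standard LoRA formulation.

```python
import math
from dataclasses import dataclass


@dataclass(frozen=True)
class FineTuneConfig:
    """Hyperparameters as reported in the text (class name is hypothetical)."""
    learning_rate: float = 2e-4
    batch_size: int = 4
    gradient_accumulation_steps: int = 4
    epochs: int = 3
    lora_rank: int = 16
    lora_alpha: int = 32

    @property
    def effective_batch_size(self) -> int:
        # Examples contributing to each optimizer update on one device.
        return self.batch_size * self.gradient_accumulation_steps

    @property
    def lora_scaling(self) -> float:
        # Standard LoRA scales the low-rank update by alpha / rank.
        return self.lora_alpha / self.lora_rank

    def optimizer_steps(self, num_examples: int) -> int:
        # Rough estimate: partial final batch counted as a full step.
        steps_per_epoch = math.ceil(num_examples / self.effective_batch_size)
        return steps_per_epoch * self.epochs


cfg = FineTuneConfig()

if __name__ == "__main__":
    print(cfg.effective_batch_size)   # 16
    print(cfg.lora_scaling)           # 2.0
    print(cfg.optimizer_steps(1000))  # 189 (assumes 1,000 examples)
```

Exact step counts depend on how a given trainer handles partial batches and accumulation boundaries, so the figure of 189 should be read as an order-of-magnitude check rather than a reported value.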
5. PropertyEval Benchmark
We introduce PropertyEval, the first standardized benchmark for evaluating AI property investment advice. It consists of 100 scenarios across four categories, with ground truth derived from Avena Terminal's scored database.
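The paper does not specify how a scenario is judged correct, so the following is only a sketch of how per-category and overall accuracy could be aggregated once each scenario has a pass/fail outcome. The function name, category labels, and toy outcomes are all hypothetical and are not benchmark data.

```python
from collections import defaultdict


def accuracy_by_category(results):
    """Aggregate (category, is_correct) pairs into accuracies.

    Returns a dict of per-category accuracy and the overall per-scenario
    accuracy (every scenario weighted equally).
    """
    totals = defaultdict(int)
    correct = defaultdict(int)
    for category, is_correct in results:
        totals[category] += 1
        correct[category] += int(is_correct)
    per_category = {c: correct[c] / totals[c] for c in totals}
    overall = sum(correct.values()) / sum(totals.values())
    return per_category, overall


# Toy outcomes for illustration only; not PropertyEval results.
toy_results = [
    ("price_estimation", True),
    ("price_estimation", False),
    ("yield_calculation", True),
    ("yield_calculation", True),
    ("market_regime", True),
    ("investment_alignment", False),
]

per_category, overall = accuracy_by_category(toy_results)

if __name__ == "__main__":
    for name, acc in per_category.items():
        print(f"{name}: {acc:.1%}")
    print(f"overall: {overall:.1%}")
```

When categories contain different numbers of scenarios, this per-scenario overall figure differs from the unweighted mean of the category accuracies, so a benchmark report should state which of the two it uses.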
| Category | Avena Property LLM | GPT-4 | Claude 3.5 Sonnet |
|---|---|---|---|
| Price Estimation | 94.2% | 67.3% | 71.1% |
| Yield Calculation | 96.1% | 42.8% | 55.4% |
| Market Regime | 91.8% | 58.2% | 62.7% |
| Investment Alignment | 89.4% | 44.6% | 48.9% |
| Overall | 92.6% | 53.2% | 59.5% |
Table 1: PropertyEval benchmark results (accuracy by category). General-purpose models lack domain-specific Spanish property knowledge; fine-tuning on expert data gives Avena Property LLM higher accuracy on every metric.
6. Results
Avena Property LLM achieves 92.6% overall accuracy on PropertyEval, outperforming GPT-4 (53.2%) and Claude 3.5 Sonnet (59.5%) on domain-specific property reasoning. The largest performance gap appears in yield calculation (96.1% vs 42.8% for GPT-4), where Avena's training data includes ADR-calibrated rental estimates that general models lack entirely. Market regime detection (91.8%) benefits from the model's exposure to Avena's proprietary discount coefficient and score distribution data.
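The gaps discussed above can be recomputed directly from Table 1. The sketch below copies the four category scores from the table and derives the per-category gap against each baseline, plus the unweighted mean of the category scores; the dictionary keys and function names are assumptions introduced for this example.

```python
# Category accuracies (%) copied from Table 1.
TABLE_1 = {
    "Price Estimation":     {"avena": 94.2, "gpt4": 67.3, "claude35": 71.1},
    "Yield Calculation":    {"avena": 96.1, "gpt4": 42.8, "claude35": 55.4},
    "Market Regime":        {"avena": 91.8, "gpt4": 58.2, "claude35": 62.7},
    "Investment Alignment": {"avena": 89.4, "gpt4": 44.6, "claude35": 48.9},
}


def gaps(baseline: str) -> dict:
    """Percentage-point gap between the fine-tuned model and a baseline."""
    return {
        category: round(row["avena"] - row[baseline], 1)
        for category, row in TABLE_1.items()
    }


def unweighted_mean(model: str) -> float:
    """Simple average of the four category scores, to one decimal place."""
    return round(sum(row[model] for row in TABLE_1.values()) / len(TABLE_1), 1)


gpt4_gaps = gaps("gpt4")
largest_gap_category = max(gpt4_gaps, key=gpt4_gaps.get)

if __name__ == "__main__":
    print(gpt4_gaps)
    print(largest_gap_category)
    for model in ("avena", "gpt4", "claude35"):
        print(model, unweighted_mean(model))
```

Against GPT-4 the gaps are 26.9, 53.3, 33.6, and 44.8 points, confirming yield calculation as the widest. The unweighted mean reproduces the Overall row for GPT-4 (53.2) and Claude 3.5 (59.5); for Avena Property LLM it gives 92.9 rather than the reported 92.6, which suggests that figure uses a different aggregation, although the source does not say which.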
7. Conclusion
We demonstrate that domain-specific fine-tuning on expert-labeled property investment data produces a model that substantially outperforms general-purpose LLMs on real estate reasoning tasks. To our knowledge, Avena Property LLM is the first such model for European real estate, and PropertyEval is the first benchmark for this domain. We release all artifacts (model weights, training data, benchmark, ontology, and formal protocol specification) to encourage further research in AI-native property intelligence.
Citation
@article{kolstad2026avena,
title={Avena Property LLM: A Domain-Specific Language Model for European Property Investment Intelligence},
author={Kolstad, Henrik},
year={2026},
publisher={Avena Terminal},
url={https://avenaterminal.com/research/avena-llm},
doi={10.5281/zenodo.19520064}
}