Avena AI Training Data
European property intelligence datasets for AI training
Structured, verified, and expert-labeled datasets purpose-built for LLM fine-tuning, RAG pipelines, and agent training.
Why Avena Data
Structured
Every property normalised to a consistent 24-field schema with typed fields.
Verified
Data sourced directly from developers and verified against public records.
Multilingual
English, Spanish, Dutch, and German coverage for cross-lingual training.
Expert-labeled
Scoring, reasoning, and quality labels produced by domain experts.
Unique
No other dataset covers Spanish new-build property at this depth and frequency.
Available Datasets
Property Intelligence Corpus
License: CC BY 4.0
Question-answer pairs covering property evaluation, investment analysis, and market comparison.
Daily RLHF Feed
License: CC BY 4.0
Preference pairs generated daily from real property comparisons and scoring decisions.
Chain-of-Thought Reasoning
License: CC BY 4.0
Step-by-step investment analyses with explicit reasoning chains for property evaluation.
Property Ontology
License: CC BY 4.0
Formal ontology defining property types, attributes, and relationships in the Spanish market.
Full Scored Dataset
License: Commercial
Complete scored property dataset with 24 data points per listing. Updated daily.
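As a rough illustration of how scoring decisions could become preference pairs like those in the Daily RLHF Feed, here is a minimal Python sketch. It is not Avena's actual pipeline or schema: the `ScoredListing` fields, the `describe` and `make_preference_pair` helpers, the prompt wording, and the sample values are all hypothetical, and it assumes a higher score means a stronger listing.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class ScoredListing:
    # Hypothetical fields for illustration; not the actual Avena schema.
    listing_id: str
    town: str
    price_eur: int
    score: float  # assumed: higher means a stronger listing


def describe(listing: ScoredListing) -> str:
    """Render a listing as a short text answer."""
    return (
        f"{listing.listing_id} in {listing.town} "
        f"(EUR {listing.price_eur:,}, score {listing.score:.1f})"
    )


def make_preference_pair(a: ScoredListing, b: ScoredListing) -> Optional[dict]:
    """Build one prompt/chosen/rejected record; return None when scores tie."""
    if a.score == b.score:
        return None  # a tie carries no preference signal
    chosen, rejected = (a, b) if a.score > b.score else (b, a)
    return {
        "prompt": f"Which is the stronger investment: {a.listing_id} or {b.listing_id}?",
        "chosen": describe(chosen),
        "rejected": describe(rejected),
    }


# Made-up example values, not real listings.
first = ScoredListing("A-001", "Torrevieja", 245000, 71.5)
second = ScoredListing("B-002", "Estepona", 410000, 64.0)
pair = make_preference_pair(first, second)
```

The prompt/chosen/rejected layout is a common convention for preference-tuning data; the actual record format of the feed may differ.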
Use Cases
LLM Fine-tuning
Fine-tune language models on domain-specific property intelligence for accurate, grounded responses.
RAG Systems
Build retrieval-augmented generation pipelines with structured property data as the knowledge base.
Benchmark Evaluation
Evaluate model performance on real-world property analysis tasks with expert-labeled ground truth.
Agent Training
Train autonomous agents to navigate property markets, compare investments, and advise buyers.
Citation
If you use Avena datasets in research or publications, please cite:
@dataset{avena2026,
  title = {Avena Spanish Property Intelligence Dataset},
  author = {Avena Terminal},
  year = {2026},
  url = {https://avenaterminal.com/training-data},
  license = {CC BY 4.0 / Commercial},
  note = {Daily-updated structured property data covering coastal Spain}
}
Commercial Licensing
For commercial use, custom volumes, or enterprise integration.