Data Sources & Pipeline
Where our data comes from and how it stays current
Pipeline Overview
Avena Terminal operates an automated data pipeline that ingests, validates, and scores property listings every 24 hours. The pipeline runs in three stages: ingestion (raw data pulled from source feeds), enrichment (cross-referencing with benchmark datasets and computing derived metrics), and scoring (applying the hedonic regression model to produce composite scores).
Each stage includes validation checks. Listings missing critical fields (price, area, or location coordinates) are flagged and excluded from scoring until the missing data is supplied. A price change greater than 30% between consecutive updates is treated as an anomaly and triggers a manual review flag before the new price is accepted into the scoring model.
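The two checks described above can be sketched roughly as follows. This is a minimal illustration, not the production validator: the `Listing` class, its field names, and both function names are assumptions introduced for the example. Only the list of critical fields and the 30% threshold come from the text.

```python
from dataclasses import dataclass
from typing import Optional

# Threshold from the text: changes above 30% require manual review.
PRICE_CHANGE_REVIEW_THRESHOLD = 0.30


@dataclass
class Listing:
    # Hypothetical field names; the real schema is not documented here.
    price: Optional[float]
    area_m2: Optional[float]
    lat: Optional[float]
    lon: Optional[float]


def missing_critical_fields(listing: Listing) -> list[str]:
    """Return the names of the critical fields that are absent."""
    missing = []
    if listing.price is None:
        missing.append("price")
    if listing.area_m2 is None:
        missing.append("area")
    if listing.lat is None or listing.lon is None:
        missing.append("coordinates")
    return missing


def needs_price_review(previous_price: float, new_price: float) -> bool:
    """True when the relative change between consecutive updates exceeds 30%."""
    if previous_price <= 0:
        # Assumption: a non-positive previous price cannot be compared,
        # so it is sent to review rather than silently accepted.
        return True
    change = abs(new_price - previous_price) / previous_price
    return change > PRICE_CHANGE_REVIEW_THRESHOLD
```

A listing would be scored only when `missing_critical_fields` returns an empty list and `needs_price_review` is false, or once a reviewer has cleared the flag.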
Primary Source: RedSP XML Feed
The core listing data comes from the RedSP (Red de Servicios de Promociones) XML feed, a structured data source that aggregates new build development listings from property promoters across Spain. The feed provides machine-readable data for each property including asking price, built area in square metres, bedroom and bathroom count, property type, GPS coordinates, energy rating, estimated delivery date, and developer name.
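As a rough illustration of the ingestion step, the sketch below parses a feed of this kind with Python's standard `xml.etree.ElementTree`. The element names (`property`, `price`, `built_area`, and so on) and the sample record are invented for the example; the actual RedSP schema is not described here and may differ.

```python
import xml.etree.ElementTree as ET

# Made-up sample record; tag names are assumptions, not the RedSP schema.
SAMPLE_FEED = """<properties>
  <property>
    <price>249000</price>
    <built_area>92</built_area>
    <beds>2</beds>
    <baths>2</baths>
    <type>apartment</type>
    <latitude>37.978</latitude>
    <longitude>-0.682</longitude>
    <energy_rating>B</energy_rating>
    <delivery_date>2026-06</delivery_date>
    <developer>Example Promotora</developer>
  </property>
</properties>"""


def text_or_none(node, tag):
    """Return the stripped text of a child element, or None if absent or empty."""
    child = node.find(tag)
    if child is None or child.text is None or not child.text.strip():
        return None
    return child.text.strip()


def convert(value, caster):
    """Apply caster (float or int) unless the value is missing."""
    return caster(value) if value is not None else None


def parse_feed(xml_text):
    """Turn the XML feed into a list of plain dicts, one per property."""
    root = ET.fromstring(xml_text)
    listings = []
    for node in root.iter("property"):
        listings.append({
            "price": convert(text_or_none(node, "price"), float),
            "area_m2": convert(text_or_none(node, "built_area"), float),
            "bedrooms": convert(text_or_none(node, "beds"), int),
            "bathrooms": convert(text_or_none(node, "baths"), int),
            "property_type": text_or_none(node, "type"),
            "lat": convert(text_or_none(node, "latitude"), float),
            "lon": convert(text_or_none(node, "longitude"), float),
            "energy_rating": text_or_none(node, "energy_rating"),
            "delivery_date": text_or_none(node, "delivery_date"),
            "developer": text_or_none(node, "developer"),
        })
    return listings
```

Missing or empty elements come back as `None`, which lets a later validation step decide whether the listing can be scored.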
Geographic Coverage
The terminal currently tracks new build properties across four coastal regions of Spain, covering more than 100 individual towns and municipalities. Coverage is concentrated on the areas with the highest density of new build activity and international buyer demand.
Costa Blanca South: Torrevieja, Orihuela Costa, Guardamar, Pilar de la Horadada, Los Montesinos, San Miguel de Salinas, and surrounding municipalities. The highest volume of tracked listings.
Costa Blanca North: Javea, Altea, Calpe, Moraira, Denia, Benidorm, Villajoyosa, and premium hillside and coastal towns. Higher average price points and lifestyle-driven demand.
Costa del Sol: Marbella, Estepona, Benahavis, Fuengirola, Mijas, Malaga city, Manilva, and the Golden Mile corridor. Spain's most internationally recognised property market.
Costa Cálida (Murcia): Mar Menor, Mazarron, Aguilas, Cartagena, and La Manga. Emerging market with lower entry prices and growing infrastructure investment.
How Yield Is Calculated
Gross rental yield is estimated using a comparable-based approach. For each tracked property, the pipeline identifies short-term rental listings on Airbnb and Booking.com within the same postcode area that match on property type (apartment, townhouse, or villa) and approximate bedroom count.
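A minimal sketch of the comparable-matching step is shown below. The record keys (`postcode`, `property_type`, `bedrooms`) and the function name are hypothetical, and the text does not say how close an "approximate" bedroom match must be, so the tolerance of one bedroom is an assumption.

```python
def select_comparables(rentals, postcode, property_type, bedrooms,
                       bedroom_tolerance=1):
    """Filter short-term rental records down to comparables for one property.

    rentals: iterable of dicts with 'postcode', 'property_type', 'bedrooms'.
    bedroom_tolerance: assumed meaning of "approximate bedroom count".
    """
    return [
        r for r in rentals
        if r["postcode"] == postcode
        and r["property_type"] == property_type
        and abs(r["bedrooms"] - bedrooms) <= bedroom_tolerance
    ]
```

A real pipeline would likely also need a fallback for postcodes with too few matches; that behaviour is not specified here.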
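From these comparables, we extract a median nightly rate and apply seasonally adjusted occupancy assumptions to estimate annual gross revenue. In its standard form, the formula is:
Gross yield (%) = (median nightly rate × estimated occupied nights per year) / asking price × 100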
Occupancy assumptions vary by region and season. Summer months (June through September) use higher occupancy rates (75-90%) while winter months use lower rates (20-45%), reflecting the seasonal nature of coastal Spanish tourism. Year-round destinations like Marbella carry higher baseline winter occupancy than seasonal markets.
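The calculation can be sketched as below, assuming the standard definition of gross yield (annual gross revenue divided by asking price). The function names and the month-by-month occupancy input are assumptions made for illustration; the actual regional occupancy tables are not reproduced here.

```python
from statistics import median

# Days per month for a non-leap year, January first.
DAYS_IN_MONTH = [31, 28, 31, 30, 31, 30, 31, 31, 30, 31, 30, 31]


def annual_gross_revenue(nightly_rate, occupancy_by_month):
    """Sum of rate x days x occupancy over the twelve months."""
    if len(occupancy_by_month) != 12:
        raise ValueError("expected 12 monthly occupancy values")
    return sum(
        nightly_rate * days * occupancy
        for days, occupancy in zip(DAYS_IN_MONTH, occupancy_by_month)
    )


def gross_yield_pct(comparable_rates, occupancy_by_month, asking_price):
    """Gross rental yield in percent, using the median comparable rate."""
    if asking_price <= 0:
        raise ValueError("asking price must be positive")
    rate = median(comparable_rates)
    return 100.0 * annual_gross_revenue(rate, occupancy_by_month) / asking_price


# Illustrative profile only: summer (Jun-Sep) and winter values sit inside
# the ranges quoted in the text; the shoulder-month value is an assumption.
EXAMPLE_OCCUPANCY = [0.30, 0.30, 0.55, 0.55, 0.55, 0.80,
                     0.80, 0.80, 0.80, 0.55, 0.30, 0.30]
```

Calling `gross_yield_pct([95.0, 110.0, 120.0], EXAMPLE_OCCUPANCY, 249000)` would return the estimated gross yield for a hypothetical property with three comparables.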
The resulting yield is a gross figure and does not account for management fees, maintenance, community charges, IBI tax, or income tax on rental earnings. Investors should expect net yields to be approximately 25-35% lower than the gross figures displayed.
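To make the gross-to-net adjustment concrete, the short sketch below applies the 25-35% reduction quoted above to a gross yield. The function name is hypothetical, and the result is only an indicative range, since actual costs vary by property.

```python
def net_yield_range(gross_yield_pct, low_reduction=0.25, high_reduction=0.35):
    """Return (pessimistic, optimistic) net yield in percent.

    The default reductions are the 25-35% range quoted in the text.
    """
    pessimistic = gross_yield_pct * (1 - high_reduction)
    optimistic = gross_yield_pct * (1 - low_reduction)
    return pessimistic, optimistic
```

For a displayed gross yield of 6.0%, this gives a net range of roughly 3.9% to 4.5%.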
Market Price Benchmarks
To assess whether a new build is priced above or below the market, we need a reliable benchmark for what "the market" charges per square metre in each location. This benchmark is constructed from multiple data sources:
Registradores: Primary benchmark. Actual recorded transaction prices at the municipal level, updated quarterly. The most authoritative source for what properties actually sell for (as opposed to what they are listed at).
INE: Provincial-level price trends used to extrapolate between Registradores reporting periods and to calculate location CAGR values.
Listing-price data: Supplementary source for municipalities where Registradores data is sparse. Listing prices are discounted by a region-specific negotiation factor (typically 5-12%) to approximate transaction prices.
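The arithmetic behind these benchmark sources can be sketched as follows. All three function names are hypothetical. The negotiation discount and CAGR follow their usual definitions; how the sources are actually weighted and combined is not specified here.

```python
def approx_transaction_price_per_m2(listing_price_per_m2, negotiation_discount):
    """Discount a listing price per m2 to approximate a transaction price.

    negotiation_discount: fraction such as 0.08 for 8% (text: typically 5-12%).
    """
    if not 0 <= negotiation_discount < 1:
        raise ValueError("discount must be a fraction in [0, 1)")
    return listing_price_per_m2 * (1 - negotiation_discount)


def premium_vs_benchmark_pct(asking_price, built_area_m2, benchmark_per_m2):
    """Percent by which a new build's price per m2 sits above (+) or below (-)
    the local benchmark."""
    price_per_m2 = asking_price / built_area_m2
    return 100.0 * (price_per_m2 - benchmark_per_m2) / benchmark_per_m2


def cagr(start_value, end_value, years):
    """Compound annual growth rate between two price levels."""
    if start_value <= 0 or years <= 0:
        raise ValueError("start_value and years must be positive")
    return (end_value / start_value) ** (1 / years) - 1
```

For example, a 100 m2 property asking 250,000 euros against a benchmark of 2,000 euros per m2 is priced 25% above the market under this measure.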
Update Frequency
Listing data: synced from the RedSP XML feed every 24 hours.
Rental comparables: nightly rates refreshed weekly from platform data.
Scores and derived metrics: re-calculated after each listing sync to reflect price changes.
Registradores benchmarks: updated when new quarterly transaction data is published.
INE price trends: updated on the INE publication schedule.
Hedonic model: coefficients re-estimated monthly using rolling 12-month data.
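As an illustration of the rolling 12-month window used for re-estimation, the sketch below selects the records that fall inside the window ending on a given date. The record key `date` and both function names are assumptions; which observations feed the model, and how it is fitted, is not described here.

```python
import calendar
from datetime import date


def rolling_window_start(as_of: date, months: int = 12) -> date:
    """Date exactly `months` calendar months before as_of.

    The day is clamped for shorter months (e.g. 31 March -> 28 February).
    """
    total = as_of.year * 12 + (as_of.month - 1) - months
    year, month_index = divmod(total, 12)
    month = month_index + 1
    day = min(as_of.day, calendar.monthrange(year, month)[1])
    return date(year, month, day)


def select_window(records, as_of: date, months: int = 12):
    """Keep records dated after the window start and up to as_of inclusive."""
    start = rolling_window_start(as_of, months)
    return [r for r in records if start < r["date"] <= as_of]
```

Running `select_window` once a month with the current date would yield the training set for that month's re-estimation, with older observations dropping out as the window moves forward.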