Identity
Every entity is uniquely identified by three things: its name, city, and entity type. Pointset normalises all three (lowercased, punctuation stripped) to deduplicate entries. For example, O'Brien's Café and obriens cafe resolve to the same entity.
Each entity is assigned an id prefixed ent_.
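The normalisation described above can be sketched as follows. This is a minimal illustration, not Pointset's actual implementation: the function names `normalise` and `entity_key` are hypothetical, and accent folding is an assumption inferred from the example in which Café matches cafe. The city value is also made up for the demonstration.

```python
import re
import unicodedata


def normalise(value: str) -> str:
    """Lowercase a value and strip punctuation (hypothetical helper)."""
    # Assumption: accents are folded (é -> e), because the documented
    # example treats "Café" and "cafe" as the same name.
    decomposed = unicodedata.normalize("NFKD", value)
    without_accents = "".join(
        ch for ch in decomposed if not unicodedata.combining(ch)
    )
    lowered = without_accents.lower()
    # Drop everything that is not a word character or whitespace.
    # Underscores survive, so a type like "muay_thai_gym" is unchanged.
    no_punctuation = re.sub(r"[^\w\s]", "", lowered)
    # Collapse runs of whitespace into single spaces.
    return " ".join(no_punctuation.split())


def entity_key(name: str, city: str, entity_type: str) -> tuple[str, str, str]:
    """Build the identity triple from the three normalised parts."""
    return (normalise(name), normalise(city), normalise(entity_type))


# Both spellings produce the same key, so they identify one entity.
key_a = entity_key("O'Brien's Café", "Dublin", "cafe")
key_b = entity_key("obriens cafe", "dublin", "Cafe")
print(key_a == key_b)
```

Using a tuple as the key keeps the three parts separate, so a name that happens to contain a city name cannot collide with a different entity.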
Fields
Entity data is stored in two layers:
First-class columns hold standard properties that Pointset always attempts to extract for every entity type: phone, website, latitude, longitude, instagram_url, facebook_url, open_hours, and google_maps_category. These are indexed and queryable.
The fields JSONB blob holds all other enriched data: custom fields such as price_of_americano, timetable, membership_fee, or anything else your subscription requested. The blob stores the superset of all known field values across every subscription that has ever requested data on this entity. When the API responds to your subscription, it returns only the fields you asked for.
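The per-subscription filtering can be pictured with a short sketch. The function name `filter_fields`, the sample values, and the choice to silently omit requested fields that have no stored value are all assumptions made for illustration; the actual API behaviour may differ.

```python
from typing import Any


def filter_fields(stored: dict[str, Any], requested: set[str]) -> dict[str, Any]:
    """Return only the stored fields that a subscription asked for.

    Assumption: a requested field with no stored value is simply left out.
    """
    return {name: value for name, value in stored.items() if name in requested}


# Hypothetical blob holding values requested by several subscriptions.
stored_fields = {
    "price_of_americano": "3.50",
    "timetable": "Mon-Fri 06:00-22:00",
    "membership_fee": "45.00",
}

# A subscription that only asked for two of the fields sees only those.
print(filter_fields(stored_fields, {"price_of_americano", "timetable"}))
```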
Lifecycle
| Status | Meaning |
|---|---|
| scraped_at set | Entity was discovered via Outscraper |
| enriched_at set | Custom fields have been extracted |
| removed_at set | Entity no longer appears in Outscraper results |
When an entity is removed (removed_at is set), an entity.removed webhook event is sent to affected subscriptions.
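A client that wants a single status label can derive one from the three timestamps. The sketch below is illustrative only: the `EntityTimestamps` class, the `lifecycle_state` function, the label strings, and the precedence order (removed over enriched over scraped) are assumptions, not part of the Pointset API.

```python
from dataclasses import dataclass
from datetime import datetime, timezone
from typing import Optional


@dataclass
class EntityTimestamps:
    """Hypothetical container for the three lifecycle timestamps."""
    scraped_at: Optional[datetime] = None
    enriched_at: Optional[datetime] = None
    removed_at: Optional[datetime] = None


def lifecycle_state(ts: EntityTimestamps) -> str:
    """Map the timestamps to one label.

    Assumption: removal takes precedence over enrichment, and
    enrichment takes precedence over discovery.
    """
    if ts.removed_at is not None:
        return "removed"
    if ts.enriched_at is not None:
        return "enriched"
    if ts.scraped_at is not None:
        return "scraped"
    return "unknown"


now = datetime.now(timezone.utc)
print(lifecycle_state(EntityTimestamps(scraped_at=now)))
print(lifecycle_state(EntityTimestamps(scraped_at=now, enriched_at=now)))
print(lifecycle_state(EntityTimestamps(scraped_at=now, removed_at=now)))
```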
Deduplication
Pointset generates a normalised key for each entity on write. If a scrape returns a name that already exists for the same city and type after normalisation, it updates the existing row rather than creating a duplicate.
Entity types
Entity type is a free-form string (cafe, muay_thai_gym, coworking_space). Pointset does not maintain a fixed taxonomy — any entity type you request is valid. The Discovery Catalog lists entity types that have been successfully enriched.