Private clubs have a data fragmentation problem that is specific, common, and fixable. Membership data lives in a CRM. Golf activity is captured in a tee sheet platform. On-site spending sits in a POS system. None of these share a reliable common key.
Names are formatted inconsistently across systems, emails are sometimes missing from the tee sheet, and the POS has no concept of a member ID. The result is that operators, general managers, golf operations leads, and membership teams cannot connect golf play to spending behavior at the member level.
They run three separate reports and try to reconcile them manually. Key decisions about staffing, merchandising, and member engagement get made without the data that would make them better. The problem isn't the data; it's that no one has built the plumbing to connect it.
BackNineIQ is that plumbing. Built with synthetic but controlled data to simulate a realistic club environment, it answers one question: when a member comes to play golf, what else do they do and spend?
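To make that question concrete, here is a minimal sketch in plain Python of what a per-visit answer might look like once identity is resolved. It assumes every record already carries a `backnine_global_id`, and that a "visit" means a round plus any spending on the same calendar day. The field names (`play_date`, `sale_date`, `outlet`, `amount`), the helper `summarize_visits`, and the sample values are hypothetical, not taken from the actual pipeline.

```python
from collections import defaultdict
from datetime import date

# Hypothetical, already identity-resolved records (synthetic values).
rounds = [
    {"backnine_global_id": "BN-001", "play_date": date(2024, 6, 1)},
    {"backnine_global_id": "BN-002", "play_date": date(2024, 6, 1)},
]
pos_sales = [
    {"backnine_global_id": "BN-001", "sale_date": date(2024, 6, 1),
     "outlet": "grill", "amount": 42.50},
    {"backnine_global_id": "BN-001", "sale_date": date(2024, 6, 1),
     "outlet": "pro_shop", "amount": 120.00},
    # Different day, so it is not counted against the 2024-06-01 round.
    {"backnine_global_id": "BN-001", "sale_date": date(2024, 6, 3),
     "outlet": "grill", "amount": 18.00},
]


def summarize_visits(rounds, pos_sales):
    """Return same-day spend per outlet for each (member, play date) visit."""
    # Index spending by (member, day) so each round is a single lookup.
    spend = defaultdict(lambda: defaultdict(float))
    for sale in pos_sales:
        key = (sale["backnine_global_id"], sale["sale_date"])
        spend[key][sale["outlet"]] += sale["amount"]

    summary = []
    for r in rounds:
        key = (r["backnine_global_id"], r["play_date"])
        by_outlet = dict(spend.get(key, {}))
        summary.append({
            "backnine_global_id": r["backnine_global_id"],
            "visit_date": r["play_date"],
            "spend_by_outlet": by_outlet,
            "total_spend": round(sum(by_outlet.values()), 2),
        })
    return summary


visit_summary = summarize_visits(rounds, pos_sales)
```

Members who played but spent nothing still appear with a zero total, which keeps "played and left" visible rather than silently dropped.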
Five stages. Each one cleanly separated so the pipeline is explainable, testable, and extensible.
backnine_global_id per resolved member.
Every tool chosen for a specific reason, plus a Lucid-style medallion flow diagram built from the architecture document.
A live Databricks Streamlit app with three stakeholder views, each page answering exactly one operational question.
platinum.rpt_member_visit_summary.
backnine_global_id, plus the identity bridge layer.
The choices that shaped the architecture, and what was knowingly accepted or given up.
| Decision | Rationale | Tradeoff |
|---|---|---|
| Deterministic identity only | A confirmed match is always more valuable than a probable one. Stakeholder trust in the data depends on confidence that joined records are real, not inferred. Probabilistic methods would require a confidence threshold that no one in a v1 product has calibrated yet. | Lower initial match rate: some members without consistent cross-system fields are excluded from analysis until source data quality improves at origin. |
| Identity resolved in Silver, not Gold | The backnine_global_id is solved once and reused everywhere. New source systems can integrate into the identity model without changing any Gold design. | Clean separation of concerns: Silver owns identity, Gold owns business behavior. No logic leaks between layers. |
| Streamlit over Power BI | Databricks-native deployment, Python-first development, full layout control per persona, and no licensing constraints. Each page could be tuned independently without fighting a shared report canvas. | Lower executive familiarity: Power BI is the known tool in most club contexts. Streamlit requires a URL deployment step that embedded Power BI avoids. |
| Platinum layer for app-ready outputs | The core business question requires a pre-joined, pre-aggregated surface. Pushing that logic into the Streamlit app would couple presentation to transformation, making both harder to maintain and test. | Extra pipeline stage and Platinum refresh latency, accepted in exchange for faster app queries and cleaner Streamlit code. |
| Three separate app pages vs. one unified dashboard | Each stakeholder asks fundamentally different questions. A unified dashboard would require cognitive filtering that reduces adoption. One question per page is a product decision, not an engineering shortcut. | More maintenance surface: three views to update when schema changes. Accepted as the correct product tradeoff for usability and stakeholder adoption. |
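
The "deterministic identity only" decision can be illustrated with a small sketch. It assumes the CRM is the identity spine and that a record from another system matches only on an exact normalized email or an exact normalized (name, phone) pair; anything else stays unresolved rather than being guessed. These matching rules, the `BN-0001` ID format, and all function and field names are assumptions for illustration, not the project's actual Silver-layer logic.

```python
import re


def norm_email(value):
    """Lowercase and trim an email; None if missing."""
    return value.strip().lower() if value else None


def norm_name(value):
    """'Smith, Jane ' and 'jane  smith' both become 'jane smith'."""
    if not value:
        return None
    value = value.strip().lower()
    if "," in value:  # "Last, First" -> "first last"
        last, first = [p.strip() for p in value.split(",", 1)]
        value = f"{first} {last}"
    return re.sub(r"\s+", " ", re.sub(r"[^a-z ]", "", value)).strip()


def norm_phone(value):
    """Keep the last 10 digits; None if fewer than 10 are present."""
    digits = re.sub(r"\D", "", value or "")
    return digits[-10:] if len(digits) >= 10 else None


def build_index(crm_members):
    """Assign one global ID per CRM member and index the exact-match keys.

    Simplification: duplicate keys overwrite each other here; a real
    pipeline would need to detect and exclude ambiguous keys.
    """
    by_email, by_name_phone = {}, {}
    for i, m in enumerate(crm_members, start=1):
        gid = f"BN-{i:04d}"  # hypothetical ID format
        email = norm_email(m.get("email"))
        name, phone = norm_name(m.get("name")), norm_phone(m.get("phone"))
        if email:
            by_email[email] = gid
        if name and phone:
            by_name_phone[(name, phone)] = gid
    return by_email, by_name_phone


def resolve(record, by_email, by_name_phone):
    """Return a backnine_global_id on an exact match, else None (no guessing)."""
    email = norm_email(record.get("email"))
    if email in by_email:
        return by_email[email]
    key = (norm_name(record.get("name")), norm_phone(record.get("phone")))
    return by_name_phone.get(key)


crm = [
    {"name": "Smith, Jane", "email": "Jane.Smith@example.com", "phone": "(555) 010-1234"},
    {"name": "Lee, Sam", "email": "sam.lee@example.com", "phone": "555-010-9999"},
]
by_email, by_name_phone = build_index(crm)

tee_record = {"name": "jane  smith", "email": None, "phone": "555.010.1234"}
pos_record = {"name": "J. Smith", "email": None, "phone": None}

tee_id = resolve(tee_record, by_email, by_name_phone)  # matches on name + phone
pos_id = resolve(pos_record, by_email, by_name_phone)  # too little to confirm
```

The unresolved POS record is the tradeoff from the table in miniature: it is left out of member-level analysis instead of being attached to a probable owner.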