Mobile-first ticketing platforms generate data across every touchpoint: ad impressions, session behavior, promo code redemptions, purchases, and post-sale activity. Each system captures a piece of the customer story; none of them completes it.
Marketing teams end up with attribution reports that don't account for promo impact, funnel reports disconnected from LTV, and campaign ROI calculations that can't be compared channel-to-channel because the underlying data was never modeled relationally.
The result is that decisions about where to spend the next marketing dollar are made with an incomplete signal, often optimizing for volume metrics when the actual margin story looks completely different.
TicketPush simulates that environment end to end: schema designed from scratch, realistic data generated via Python, and a four-page Power BI dashboard built to answer the specific questions a growth or revenue team actually asks.
Five stages, each one producing an artifact that the next stage depends on.
Four tools, eight tables, and a relational schema diagram showing how the data model connects.
A Power BI dashboard built around four distinct analytical questions; each page answers one and connects to the others through shared filters and consistent KPI definitions.
Screenshots for pages 03 and 04 are pending upload; placeholders are shown above.
The modeling choices that shaped the output, and what each one cost.
| Decision | Rationale | Tradeoff |
|---|---|---|
| Both attribution models, same view | First-touch and last-touch tell different stories about the same campaign spend. Showing them side by side in the same dashboard lets analysts see where they diverge, and where that divergence changes the channel investment decision. | More complex DAX: maintaining two attribution paths in the same Power BI model required careful measure isolation to prevent filter context bleed between models. |
| Customer Journey Summary as a pre-built table | Consolidating lifecycle metrics into a single per-user table makes every downstream query simpler. Without it, the LTV and funnel calculations would require multi-join logic on every dashboard page, adding latency and maintenance risk. | Refresh dependency: the summary table must be rebuilt whenever source data changes. Acceptable for a simulation; in production this becomes a scheduled pipeline job. |
| MySQL over a cloud warehouse | The relational schema logic is identical whether run in MySQL, BigQuery, or Redshift. MySQL was chosen for local development speed and to demonstrate that the analytical design is tool-agnostic: the queries transfer directly. | No warehouse-scale validation: local MySQL never exercises partitioning, concurrency, or cost behavior at production volume, though every SQL file in this project runs on BigQuery or Redshift with minimal syntax changes. |
| Behavioral repeat risk, not ML classifier | A rule-based repeat risk score (recency × frequency) is transparent and explainable. A trained classifier would produce better predictions but would require ground truth labels that don't exist in a simulated dataset. | Lower predictive accuracy, acknowledged as a v2 upgrade path once the dataset has real behavioral history to train against. |
| Python simulation over static seed data | Generating data programmatically means the dataset can be scaled, varied, or re-seeded to test edge cases. It also demonstrates the ability to model realistic behavioral distributions, not just assign random values. | Distributions are design assumptions, not observed behavior, so results validate the pipeline rather than real-world patterns. The offsetting benefit is scale: the simulation can generate 1,000 or 100,000 users with a single parameter change. |
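The first-touch vs. last-touch divergence described in the table can be sketched in plain Python. Everything here (the `touchpoints` tuples, the channel names, and the `attribute` helper) is illustrative, not the project's actual DAX measures or schema:

```python
from datetime import datetime

# Hypothetical touchpoint log: (user_id, channel, timestamp) tuples.
touchpoints = [
    ("u1", "paid_social", datetime(2024, 3, 1)),
    ("u1", "email",       datetime(2024, 3, 5)),
    ("u1", "paid_search", datetime(2024, 3, 9)),
    ("u2", "email",       datetime(2024, 3, 2)),
]

def attribute(touchpoints):
    """Return {user_id: (first_touch_channel, last_touch_channel)}."""
    journeys = {}
    for user, channel, ts in sorted(touchpoints, key=lambda t: t[2]):
        if user not in journeys:
            journeys[user] = [channel, channel]  # first touch is fixed once
        else:
            journeys[user][1] = channel          # last touch keeps updating
    return {u: tuple(c) for u, c in journeys.items()}

# u1 diverges: first-touch credits paid_social, last-touch credits paid_search.
result = attribute(touchpoints)
```

Where the two channels disagree for a large share of converters, the channel investment decision depends on which model you trust, which is exactly why the dashboard shows both.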
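The recency × frequency rule can be sketched as follows; the 180-day lookback window and the inverse-frequency component are assumptions for this example, and the project's exact formula and weights may differ:

```python
from datetime import date

def repeat_purchase_risk(last_purchase: date, purchase_count: int,
                         today: date, window_days: int = 180) -> float:
    """Transparent rule-based score in [0, 1]; higher = less likely to buy again.

    Recency component: fraction of the lookback window elapsed since the
    last purchase, capped at 1.0. Frequency component: inverse purchase
    count, so heavy buyers score lower risk.
    """
    days_since = (today - last_purchase).days
    recency = min(days_since / window_days, 1.0)
    frequency = 1.0 / purchase_count
    return recency * frequency

# A buyer last seen 90 days ago with 2 purchases: 0.5 * 0.5 = 0.25.
score = repeat_purchase_risk(date(2024, 3, 3), 2, today=date(2024, 6, 1))
```

Every input to the score is inspectable, which is the explainability argument in the table: an analyst can trace any user's risk number back to two observable facts.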
Dashboard views, SQL logic, and Python simulation: click any screenshot to expand.