Customer Request
The government of a European country commissioned a mobile app to estimate the likelihood of Covid-19 infection based on the phone owner's geolocation history.
Project Goal
Our primary objective was to implement data streaming, ensuring it scales efficiently from a few users to millions.
The main challenges we faced:
1. Merging multiple data streams, which required an experimental tool we had no prior experience with.
2. Handling unforeseen load-testing challenges and infrastructure costs.
Solution and Technologies
We worked with Python, Apache Beam, Airflow, and various Google Cloud services including Storage, BigQuery, Cloud SQL, Pub/Sub, Dataflow, and Composer.
The main challenge was integrating two data streams using Tula, a tool we had not used before. We designed the solution to handle user loads ranging from 10 to 1 million users.
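To make the idea of integrating two data streams concrete, here is a minimal plain-Python sketch of a windowed co-grouping: events from two streams are bucketed into fixed time windows and grouped by a shared key. It is an illustration only, not the project's pipeline. The event shape, the 60-second window, the function names, and the sample streams (`location_pings`, `exposure_reports`) are all assumptions made for the example. In Apache Beam the analogous building blocks are fixed windows and `CoGroupByKey`; whether the project used them is not stated here.

```python
from collections import defaultdict
from typing import Dict, Iterable, List, Tuple

# Hypothetical event shape: (key, timestamp in seconds, payload).
Event = Tuple[str, int, str]

WINDOW_SECONDS = 60  # assumed fixed window size, chosen for illustration


def window_start(timestamp: int, size: int = WINDOW_SECONDS) -> int:
    """Return the start of the fixed window that contains the timestamp."""
    return timestamp - timestamp % size


def co_group_by_window(
    left: Iterable[Event],
    right: Iterable[Event],
    size: int = WINDOW_SECONDS,
) -> Dict[Tuple[str, int], Tuple[List[str], List[str]]]:
    """Group payloads from two streams by (key, window start).

    Each result value is a pair: payloads from the left stream and
    payloads from the right stream that share the key and the window.
    """
    grouped = defaultdict(lambda: ([], []))
    for key, timestamp, payload in left:
        grouped[(key, window_start(timestamp, size))][0].append(payload)
    for key, timestamp, payload in right:
        grouped[(key, window_start(timestamp, size))][1].append(payload)
    return dict(grouped)


# Hypothetical sample data: two small in-memory "streams".
location_pings = [("cell-1", 5, "ping-a"), ("cell-1", 65, "ping-b")]
exposure_reports = [("cell-1", 10, "report-x")]

merged = co_group_by_window(location_pings, exposure_reports)
for (key, start), (pings, reports) in sorted(merged.items()):
    print(key, start, pings, reports)
```

A real streaming system also has to handle late and out-of-order events and keep its state bounded, which this in-memory sketch ignores.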
Conclusions on the Project
We delivered a functional version of the application, in line with the project's main objectives.
Dual data stream integration for