Developing scalable models for prediction of Demographic, Economic and Human Settlement parameters of different geographies and
providing business solutions based on the estimated attributes
December, 2018 - Present
Task Summary
- Developed scalable models for predicting demographic, economic, and human settlement parameters of different geographies to support business solutions. Parameters included affluence index, economic potential index, ambient population, and traffic flow.
- Developed a graph-based analysis model for identifying human settlement patterns and the distribution of family types across Indian metro cities.
- Developed a rural growth index based on patterns in man-made structures, Indian census data, government schemes deployed in each area, and year-over-year farming yields. Heavily involved in creating data cleaning and scraping pipelines on Airflow.
- Built data quality as a service using Spark, calculating statistical metrics with Amazon Deequ and providing an interface for implementing custom metrics, with support for data sources such as Snowflake, data lakes on S3, and Oracle. Users can also set up custom checks on metrics to ensure trust in the data.
- Introduced full observability and orchestration for serverless Spark applications running on Spark-on-K8s.
- Built ETL pipelines using Airflow and created a patch for Airflow 1.10.3 to expose more granular metrics for generating dashboards and setting up alerts using Prometheus and Grafana.
- Created a data ingestion and export pipeline: S3 -> Postgres (with geospatial manipulations) -> S3.
Knowledge Gained
- Different applications and use cases of geospatial data.
- Creating data ingestion and export pipelines: S3 -> Postgres (with manipulations) -> S3.
- Using distributed technologies (Spark and Hadoop) for big data.
- Object detection, recognition, and localization from satellite imagery.
- Different economic and demographic systems that contribute to human settlement.
- Developing, scaling, and deploying different statistical models.
- Understanding the DaaS (Data as a Service) business model.
- Handling data security issues across platforms.