Golden hours, golden opportunities

Pavan Dileep Tadikamalla

Data Engineer

Montevideo, Uruguay

About Me

Senior Data Engineer with 4+ years of experience building enterprise-scale data pipelines and big data platforms for Apple AdPlatforms at Tata Consultancy Services (TCS). I specialize in designing, developing, and optimizing high-performance ETL pipelines that process 2TB+ of advertising data daily using PySpark, Apache Spark, Databricks, and Delta Lake on Amazon Web Services (AWS).

Experience

Senior Data Engineer

Tata Consultancy Services (TCS) -- Montevideo, Uruguay

September 2022 - Present

  • Designed and developed scalable ETL and data pipelines using PySpark, Apache Spark, Databricks, and Delta Lake on Amazon Web Services (AWS), processing 2TB+ of advertising data daily and cutting pipeline latency by 30% for Apple marketing analytics.
  • Optimized distributed computing workloads and SQL queries via partition pruning, broadcast joins, and cluster tuning, delivering a 40% improvement in query performance and reducing AWS compute costs by 18%.
  • Built a comprehensive data quality and governance framework with automated schema enforcement, data validation, and anomaly detection, reducing data defects by 35% and maintaining 99.5% data accuracy across 8+ data sources.
  • Architected data warehouse and data lake solutions integrating 8+ advertising data sources, including impression data, clickstreams, and conversion events, into a Delta Lake lakehouse, enabling real-time processing and analytics.
  • Orchestrated end-to-end data pipelines using Apache Airflow DAGs and GitHub Actions CI/CD pipelines, enabling automated testing, deployment, and scheduling across development, staging, and production environments.
  • Developed Python automation scripts for internal tooling and client-facing workflows, eliminating manual processes; built Python Slack bots for real-time pipeline alerts, job status notifications, and SLA breach warnings.
  • Monitored pipeline health and data infrastructure using Datadog, Grafana, Splunk, and AWS CloudWatch, proactively resolving bottlenecks and reducing incident response time by 25%.
  • Managed data infrastructure inventory across AWS and on-premises systems, tracking resource utilization and rightsizing compute for cost efficiency.
  • Leveraged AI-assisted development tools, including Claude Sonnet and Claude Code, daily for code generation, debugging, and pipeline optimization, improving engineering productivity by 30%.
  • Collaborated cross-functionally with Apple product managers, data scientists, and business stakeholders to translate requirements into optimized data models, improving dashboard performance by 25%.

NOC Support Engineer

Tata Consultancy Services (TCS) -- Hyderabad, India

December 2021 - September 2022

  • Provided 24/7 production support for mission-critical Apache Spark and ETL pipeline workflows on AWS, maintaining 99.8% system availability across high-volume batch processing environments.
  • Delivered bare-metal hardware and software support for on-premises Linux servers, managing the full incident lifecycle from triage to resolution across hybrid on-premises and cloud environments.
  • Diagnosed data pipeline failures and performance bottlenecks using AWS CloudWatch, Datadog, and Splunk, performing root-cause analysis and delivering fixes that reduced recurring incidents by 30%.
  • Automated monitoring, alerting, and data validation workflows using Python scripting and Apache Airflow, reducing manual operational intervention by 40% and improving mean time to resolution (MTTR).
  • Maintained incident runbooks and technical documentation, enabling faster on-call handoffs and reducing escalation rates through cross-functional collaboration.
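A minimal, pure-Python sketch of the kind of automated data-validation check mentioned above; the field names, thresholds, and alert hand-off are hypothetical illustrations, not the production checks:

```python
from dataclasses import dataclass

@dataclass
class CheckResult:
    name: str
    passed: bool
    detail: str

def validate_batch(records, min_rows=1, required_fields=("event_id", "timestamp")):
    """Run lightweight data-quality checks on one batch of pipeline records.

    In a real workflow, any failed CheckResult would be forwarded to an
    alerting channel (e.g. a Slack webhook) instead of handled manually.
    """
    results = []

    # Volume check: an empty or undersized batch usually signals an
    # upstream failure rather than genuinely missing data.
    results.append(CheckResult(
        "row_count", len(records) >= min_rows,
        f"{len(records)} rows (min {min_rows})",
    ))

    # Completeness check: every record must carry the required fields.
    missing = sum(
        1 for r in records
        if any(r.get(f) in (None, "") for f in required_fields)
    )
    results.append(CheckResult(
        "required_fields", missing == 0,
        f"{missing} records missing required fields",
    ))
    return results

batch = [
    {"event_id": "a1", "timestamp": "2024-01-15T00:00:00Z"},
    {"event_id": "a2", "timestamp": None},
]
failures = [c for c in validate_batch(batch) if not c.passed]
print([c.name for c in failures])  # prints ['required_fields']
```

Wiring a function like this into an Airflow task is what turns a manual triage step into an automatic alert.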

Associate System Engineer

Tata Consultancy Services (TCS) -- Hyderabad, India

April 2021 - December 2021

  • Completed TCS onboarding training covering AWS cloud fundamentals, Linux system administration, and Python scripting basics.
  • Assisted the team in monitoring data pipelines, logging incidents, and resolving support tickets under senior engineer supervision.
  • Practiced SQL query writing, shell scripting, and basic ETL concepts in a collaborative project environment.
  • Participated in Agile daily standups and sprint ceremonies, developing familiarity with professional software delivery workflows.

Skills

Cloud & Infrastructure: Amazon Web Services - AWS (S3, EMR, Glue, Redshift, Lambda, Athena, CloudWatch), Docker, On-Premises, Bare-Metal, Linux, Cloud Computing

Data Engineering: PySpark, Apache Spark, Databricks, Delta Lake, ETL Pipeline Development, Data Ingestion

Orchestration & CI/CD: Apache Airflow, Prefect, GitHub Actions, CI/CD Pipelines, Batch Processing, Real-time Data

Performance & Query: Spark Tuning, SQL Optimization, Partition Pruning, Broadcast Joins, Caching, Query

Programming: Python (Advanced), SQL, Scala, Bash/Shell Scripting, Pandas, Git

Monitoring: Splunk, Datadog, Grafana, AWS CloudWatch, Incident Management, SLA Tracking, Alerting

Automation & AI Tools: Python Automation, Slack Bots (Python), Claude Sonnet, Claude Code, Cursor IDE, AI-assisted

Soft Skills: Cross-functional Collaboration, Stakeholder Communication, Problem Solving, Analytical Thinking, Agile

Education

Bachelor of Technology in Computer Science and Engineering

Vel Tech Rangarajan Dr. Sagunthala R&D Institute of Science and Technology -- Chennai, India

2016 - 2020

Let's Connect

I'm always open to new opportunities and interesting conversations.
