Golden hours, golden opportunities

Pavan Dileep Tadikamalla

Data Engineer

Montevideo, Uruguay

About Me

Senior Data Engineer with 4+ years of experience building enterprise-scale data pipelines and big data platforms for Apple AdPlatforms at Tata Consultancy Services (TCS). I specialize in designing, developing, and optimizing high-performance ETL pipelines that process 2TB+ of advertising data daily using PySpark, Apache Spark, Databricks, and Delta Lake on Amazon Web Services (AWS).

Experience

Senior Data Engineer

Tata Consultancy Services (TCS) -- Montevideo, Uruguay

September 2022 - Present

  • Designed and developed scalable ETL and data pipelines using PySpark, Apache Spark, Databricks, and Delta Lake on Amazon Web Services (AWS), processing 2TB+ of advertising data daily and cutting pipeline latency by 30% for Apple marketing analytics.
  • Optimized distributed computing workloads and SQL queries via partition pruning, broadcast joins, and cluster tuning, delivering a 40% improvement in query performance and reducing AWS compute costs by 18%.
  • Built a comprehensive data quality and governance framework with automated schema enforcement, data validation, and anomaly detection, reducing data defects by 35% and maintaining 99.5% data accuracy across 8+ data sources.
  • Architected data warehouse and data lake solutions integrating 8+ advertising data sources, including impression data, clickstreams, and conversion events, into a Delta Lake lakehouse, enabling real-time processing and analytics.
  • Orchestrated end-to-end data pipelines using Apache Airflow DAGs and GitHub Actions CI/CD pipelines, enabling automated testing, deployment, and scheduling across development, staging, and production environments.
  • Developed Python automation scripts for internal tooling and client-facing workflows, eliminating manual processes; built Python Slack bots for real-time pipeline alerts, job status notifications, and SLA breach warnings.
  • Monitored pipeline health and data infrastructure using Datadog, Grafana, Splunk, and AWS CloudWatch, proactively resolving bottlenecks and reducing incident response time by 25%.
  • Managed data infrastructure inventory across AWS and on-premises systems, tracking resource utilization and rightsizing compute for cost efficiency.
  • Leveraged AI-assisted development tools, including Claude Sonnet and Claude Code, daily for code generation, debugging, and pipeline optimization, improving engineering productivity by 30%.
  • Collaborated cross-functionally with Apple product managers, data scientists, and business stakeholders to translate requirements into optimized data models, improving dashboard performance by 25%.

NOC Support Engineer

Tata Consultancy Services (TCS) -- Hyderabad, India

December 2021 - September 2022

  • Provided 24/7 production support for mission-critical Apache Spark and ETL pipeline workflows on AWS, maintaining 99.8% system availability across high-volume batch processing environments.
  • Delivered bare-metal hardware and software support for on-premises Linux servers, managing the full incident lifecycle from triage to resolution across hybrid on-premises and cloud environments.
  • Diagnosed data pipeline failures and performance bottlenecks using AWS CloudWatch, Datadog, and Splunk, performing root-cause analysis and delivering fixes that reduced recurring incidents by 30%.
  • Automated monitoring, alerting, and data validation workflows using Python scripting and Apache Airflow, reducing manual operational intervention by 40% and improving mean time to resolution (MTTR).
  • Maintained incident runbooks and technical documentation, enabling faster on-call handoffs and reducing escalation rates through cross-functional collaboration.
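A minimal, pure-Python sketch of the kind of automated data-validation check mentioned above; the field names, thresholds, and alert hand-off are hypothetical illustrations, not the production checks:

```python
from dataclasses import dataclass

@dataclass
class CheckResult:
    name: str
    passed: bool
    detail: str

def validate_batch(records, min_rows=1, required_fields=("event_id", "timestamp")):
    """Run lightweight data-quality checks on one batch of pipeline records.

    In a real workflow, any failed CheckResult would be forwarded to an
    alerting channel (e.g. a Slack webhook) instead of handled manually.
    """
    results = []

    # Volume check: an empty or undersized batch usually signals an
    # upstream failure rather than genuinely missing data.
    results.append(CheckResult(
        "row_count", len(records) >= min_rows,
        f"{len(records)} rows (min {min_rows})",
    ))

    # Completeness check: every record must carry the required fields.
    missing = sum(
        1 for r in records
        if any(r.get(f) in (None, "") for f in required_fields)
    )
    results.append(CheckResult(
        "required_fields", missing == 0,
        f"{missing} records missing required fields",
    ))
    return results

batch = [
    {"event_id": "a1", "timestamp": "2024-01-15T00:00:00Z"},
    {"event_id": "a2", "timestamp": None},
]
failures = [c for c in validate_batch(batch) if not c.passed]
print([c.name for c in failures])  # prints ['required_fields']
```

Wiring a function like this into an Airflow task is what turns a manual triage step into an automatic alert.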

Associate System Engineer

Tata Consultancy Services (TCS) -- Hyderabad, India

April 2021 - December 2021

  • Completed TCS onboarding training covering AWS cloud fundamentals, Linux system administration, and Python scripting basics.
  • Assisted the team in monitoring data pipelines, logging incidents, and resolving support tickets under senior engineer supervision.
  • Practiced SQL query writing, shell scripting, and basic ETL concepts in a collaborative project environment.
  • Participated in Agile daily standups and sprint ceremonies, developing familiarity with professional software delivery workflows.

Skills

Cloud & Infrastructure: Amazon Web Services - AWS (S3, EMR, Glue, Redshift, Lambda, Athena, CloudWatch), Docker, On-Premises, Bare-Metal, Linux, Cloud Computing

Data Engineering: PySpark, Apache Spark, Databricks, Delta Lake, ETL Pipeline Development, Data Ingestion

Orchestration & CI/CD: Apache Airflow, Prefect, GitHub Actions, CI/CD Pipelines, Batch Processing, Real-time Data

Performance & Query: Spark Tuning, SQL Optimization, Partition Pruning, Broadcast Joins, Caching, Query

Programming: Python (Advanced), SQL, Scala, Bash/Shell Scripting, Pandas, Git

Monitoring: Splunk, Datadog, Grafana, AWS CloudWatch, Incident Management, SLA Tracking, Alerting

Automation & AI Tools: Python Automation, Slack Bots (Python), Claude Sonnet, Claude Code, Cursor IDE, AI-assisted

Soft Skills: Cross-functional Collaboration, Stakeholder Communication, Problem Solving, Analytical Thinking, Agile

Education

Bachelor of Technology in Computer Science and Engineering

Vel Tech Rangarajan Dr. Sagunthala R&D Institute of Science and Technology -- Chennai, India

2016 - 2020

Let's Connect

I'm always open to new opportunities and interesting conversations.
