Portfolio
Charitha Sree Sakhamuri
Data Engineer
Data Engineer with 3+ years of experience designing and operating scalable data pipelines, analytical data models, and cloud data platforms. Strong expertise in SQL, Python, and Spark for transforming raw, high-volume data into trusted datasets used for reporting, optimization, and AI/ML workflows. Proven ability to translate business and commercialization requirements into robust data models while enforcing data quality, privacy, and governance standards.
[email protected] (678)-739-1251
Experience
Data Engineer
Saayam For ALL — Remote, USA
- Built a real-time data processing pipeline using Apache Kafka and Apache Flink to ingest and process high-volume event streams, enabling near real-time reporting and operational insights.
- Designed and implemented scalable batch and streaming workflows on Databricks (Spark Structured Streaming) over an AWS S3 data lake, following Medallion (Bronze/Silver/Gold) architecture.
- Developed and managed production-grade DAGs in Apache Airflow to orchestrate ingestion, transformation, and validation workflows across multiple data sources.
- Engineered optimized data transformations using Python, PySpark, and advanced SQL (window functions, CTEs, performance tuning) to handle large-scale structured datasets efficiently.
- Integrated AWS services including S3, Lambda, IAM, and CloudWatch to manage storage, automation, access control, and monitoring of data pipelines.
- Improved pipeline performance and reliability by implementing Delta Lake schema enforcement, partitioning strategies, and streaming data validation checks, reducing latency and failure rates.
Data Engineer
RecVue — Hyderabad, Telangana
- Translated business and commercialization requirements into scalable conceptual, logical, and physical data models supporting revenue, usage, and churn analytics.
- Designed and maintained batch data pipelines using Python, SQL, and Spark to transform raw source data into analytics-ready and ML-ready datasets.
- Implemented a data quality framework covering schema enforcement, null checks, duplicate detection, and aggregation accuracy.
- Ensured proper handling of sensitive data by enforcing consistent schemas, controlled access patterns, and governed data exposure.
- Built structured warehouse tables optimized for complex analytical queries and downstream reporting.
- Optimized Spark transformations and SQL queries, improving pipeline reliability and reporting consistency by ~20%.
- Stored and managed large-scale structured datasets in cloud object storage (AWS S3).
Junior Data Engineer
Vijeta High School — Guntur, Andhra Pradesh
- Designed and maintained ETL pipelines integrating operational and reporting data from multiple source systems.
- Modeled structured datasets and aggregated tables to support dashboards and recurring analytical reports.
- Developed and optimized complex SQL queries for time-based analysis and performance reporting.
Data Engineer Intern
Skill Banc — Hyderabad, Telangana
- Assisted in building foundational data ingestion and transformation pipelines.
- Performed data validation, preprocessing, and schema alignment to support analytics workflows.
- Wrote optimized SQL queries for internal reporting and insights.
- Improved batch pipeline performance through Spark job tuning and query optimization.
Expertise
Languages & Querying: SQL (advanced joins, window functions, CTEs, query optimization), Python, PySpark
Data Processing & Platforms: Apache Spark, Databricks, Delta Lake, Spark Structured Streaming
Orchestration: Apache Airflow (DAG design & scheduling), Dagster
Streaming & Real-Time Systems: Apache Kafka, AWS Kinesis, Apache Flink
Data Architecture & Modeling: Medallion Architecture (Bronze/Silver/Gold), Star & Snowflake schema, Conceptual/Logical/Physical modeling
Cloud & Storage: AWS (S3, IAM, Lambda), Cloud Data Warehousing (Snowflake/Redshift/BigQuery-style systems)
Data Lake & Lakehouse: S3 Data Lake, Delta Lake, Partitioning strategies, Schema enforcement
Containerization & DevOps: Docker, CI/CD pipelines (GitHub Actions/Jenkins), Git
Analytics & BI: Tableau, Power BI, Excel
Education
Master of Science
The University of Texas at Arlington
Aug 2023 — May 2025
Bachelor of Science
ICFAI University, Hyderabad
July 2019 — May 2023