Location
Des Moines, IA
Salary
Not specified
Type
Full-time
Posted
Today
Job Description
Hi,
Greetings from Smart Work IT Services.
Role Title: Sr. Databricks Architect & Developer
Location: Remote
We are seeking a highly experienced Senior Databricks Architect & Developer to design, build, and optimize enterprise-grade ETL pipelines for large-scale data migration initiatives. The ideal candidate will have deep expertise in Databricks architecture, PySpark, Delta Lake, AWS cloud services, and PostgreSQL, with proven experience migrating legacy systems (including VSAM and sequential databases) to modern cloud-based platforms.
This role requires strong hands-on development skills, architecture leadership, performance optimization capabilities, and experience managing full-scale Databricks environments.
Required Skills
Databricks, Data Factory, PySpark, Pandas, SQL, PL/SQL, Spark Structured Streaming, AWS (S3, Glue, Lambda, Redshift, EMR, and cloud infrastructure), Python, ETL pipeline design, DevOps automation, data validation and transformation, SFTP, DoDSAFE, NIPRGPT, data visualization; optimization and monitoring: cluster autoscaling, spot instances, cost management, Azure Monitor, CloudWatch, Databricks logs
Preferred Skills
- Experience designing and building high-performance ETL pipelines using Databricks (PySpark, Delta Lake, Databricks Workflows) for data migration from various sources (including VSAM files) to PostgreSQL; a minimal pipeline sketch follows this list.
- Proficiency in architecting and configuring all aspects of the Databricks platform, including setup of Landing and Staging environments, job automation, and performance monitoring tools, ensuring smooth, secure, and efficient data flows for migration.
- Strong ability to apply advanced database knowledge (SQL, Databricks SQL, and PostgreSQL) to optimize load performance, balance loads, and manage rapid, large-volume cutovers.
- Additionally, this role contributes to data mapping, conceptual and technical design, and application and technical testing with Databricks Notebooks; implements data masking; and actively supports defect remediation.
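For illustration only, the sketch below shows what a landing-to-staging step of such a pipeline might look like in PySpark with Delta Lake. The bucket path, fixed-width offsets, column names, and table names are hypothetical assumptions for the example, not details from this posting.

```python
# Minimal sketch only: paths, offsets, and table names below are hypothetical.
from pyspark.sql import SparkSession, functions as F

spark = SparkSession.builder.appName("vsam-landing-to-staging").getOrCreate()

# Land a VSAM-derived fixed-width extract (assumed already converted to text).
raw = spark.read.text("s3://example-bucket/landing/customers/")

# Slice fixed-width fields by position; the offsets are illustrative.
landed = raw.select(
    F.trim(F.substring("value", 1, 10)).alias("customer_id"),
    F.trim(F.substring("value", 11, 40)).alias("customer_name"),
    F.to_date(F.trim(F.substring("value", 51, 8)), "yyyyMMdd").alias("open_date"),
)

# Basic validation before promotion: reject rows missing the key.
valid = landed.filter(F.col("customer_id").isNotNull())

# Write to a Delta staging table for downstream transformation.
valid.write.format("delta").mode("overwrite").saveAsTable("staging.customers")
```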
Responsibilities
- Develop high-performance, scalable ETL pipelines using Databricks (PySpark, Python, Delta Lake, Databricks Workflows) to migrate data from legacy source databases (sequential DBs), VSAM files, and other data sources to PostgreSQL.
- Configure and manage Databricks Landing and Staging schemas, ensuring smooth, secure, and efficient data flows for migration.
- Apply advanced database knowledge (SQL, Databricks SQL, and PostgreSQL) to optimize load performance and manage rapid, large-volume cutovers; a cutover sketch follows this list.
- Contribute to data mapping, conceptual and technical design, and application and technical testing with Databricks Notebooks; implement data masking and spider web and reverse spider web logic; and actively support defect resolution to achieve a high success rate.
Expected Deliverable(s): Setup and maintenance of Databricks.
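As a hedged illustration of the cutover step above, the following PySpark sketch masks a sensitive column and bulk-loads a staged Delta table into PostgreSQL over JDBC. The connection URL, credentials handling, table names, and tuning values are assumptions for the example only, not the actual migration targets.

```python
# Minimal sketch only: the JDBC URL, credentials, and table names are assumed.
from pyspark.sql import SparkSession, functions as F

spark = SparkSession.builder.appName("staging-to-postgres-cutover").getOrCreate()

staged = spark.table("staging.customers")

# Example data masking: hash a sensitive column before it leaves Databricks.
masked = staged.withColumn("customer_name", F.sha2(F.col("customer_name"), 256))

# Bulk-load into PostgreSQL; batchsize and write parallelism are common knobs
# for tuning large-volume cutover throughput.
(masked.repartition(8)
    .write.format("jdbc")
    .option("url", "jdbc:postgresql://db.example.com:5432/target")
    .option("dbtable", "public.customers")
    .option("user", "migration_user")
    .option("password", "<from-secret-scope>")  # e.g. via dbutils.secrets.get
    .option("batchsize", 10000)
    .mode("append")
    .save())
```

Balancing the number of write partitions and the JDBC batch size against what the PostgreSQL server can absorb is the usual lever for rapid, large-volume cutovers of this kind.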