Spark Developer
Job Description
We are seeking an experienced Senior Spark Data Developer to design, build, and optimize large-scale data processing solutions. The ideal candidate has deep expertise in Apache Spark, distributed systems, and modern data engineering practices. In this role, you will work closely with data architects, analysts, and cross-functional teams to deliver high-performance data pipelines and support advanced analytics initiatives.

### **Key Responsibilities**

* Design, develop, and optimize large-scale ETL/ELT pipelines using Apache Spark (Spark SQL, DataFrames, PySpark/Scala).
* Build and maintain scalable data solutions across batch and streaming architectures.
* Work with data architects to define data models, data flows, and storage strategies.
* Ensure high performance, reliability, and fault tolerance of data processing workloads.
* Collaborate with cross-functional teams including Data Science, Product, BI, and Engineering.
* Perform code reviews, enforce engineering best practices, and provide technical mentorship.
* Troubleshoot performance issues and optimize Spark jobs and cluster configurations.
* Contribute to CI/CD automation and data workflow orchestration.
* Work within Agile/Scrum processes, providing estimates and delivering on sprint commitments.

### **Required Qualifications**

* Bachelor’s degree in Computer Science, Engineering, Data Science, or a related field (or equivalent experience).
* 5–8+ years of experience in data engineering or big data processing.
* Strong hands-on experience with Apache Spark (Spark SQL, RDDs, DataFrames, Structured Streaming).
* Proficiency in Python or Scala for distributed data processing.
* Strong understanding of distributed systems, parallel processing, and performance tuning.
* Experience with cloud data platforms such as AWS EMR, Databricks, Azure Synapse, or GCP Dataproc.
* Expertise with relational and NoSQL databases (e.g., PostgreSQL, Snowflake, Delta Lake, Cassandra).
* Experience building automated pipelines using tools such as Airflow, ADF, Prefect, or similar.
* Strong knowledge of Git, CI/CD, and DevOps practices.
* Solid understanding of data warehousing concepts and best practices.

### **Preferred Qualifications**

* Hands-on experience with Databricks, including Delta Lake, Unity Catalog, and notebook workflows.
* Knowledge of real-time streaming technologies (Kafka, Kinesis, EventHub).
* Experience with containerization (Docker, Kubernetes).
* Familiarity with machine learning workflows and data preparation.
* Experience with Terraform or other IaC tools.
* Background in finance, insurance, or other enterprise domains (optional).

### **Soft Skills**

* Excellent communication and collaboration skills.
* Ability to work independently and lead complex technical initiatives.
* Strong analytical and problem-solving mindset.
* Ability to manage priorities and mentor junior engineers.
Originally posted on: Indeed