**Unfortunately, no visa sponsorship (now or in the future) or contracting**
Job Title: Spark Scala Developer (GCP Batch Processing)
Location: Hybrid - Bentonville, AR
We are looking for a skilled Spark Scala Developer with strong experience in Google Cloud Platform (GCP) to support and enhance our batch data processing pipelines. The ideal candidate will be hands-on, detail-oriented, and comfortable working across the full development lifecycle in a production environment.
Required Skills & Qualifications
· Strong programming experience in Scala with hands-on expertise in Apache Spark.
· Experience working with GCP services such as Dataproc, BigQuery, Cloud Storage, and Cloud Composer (Airflow).
· Solid understanding of batch data processing frameworks and distributed systems.
· Knowledge of data validation techniques, data quality checks, and debugging production data issues.
· Ability to analyze logs, identify root causes, and resolve production incidents efficiently.
· Familiarity with Unix/Linux environments and scripting.
Soft Skills
· Strong problem-solving and analytical skills.
· Ownership mindset with attention to detail and quality.
Role Expectations (Day-to-Day Activities)
· Develop and enhance Spark Scala batch jobs.
· Test and validate data pipelines in staging and production.
· Deploy code to production and ensure successful job execution.
· Investigate and resolve data or pipeline issues.
· Support ongoing operations and ensure pipeline reliability.
Key Responsibilities
· Design, develop, and maintain batch data processing jobs using Apache Spark (Scala) on GCP.
· Perform end-to-end development activities including coding, unit testing, and integration testing.
· Manage and execute production deployments following established release processes.
· Conduct data validation and reconciliation across production and staging environments to ensure data accuracy and consistency.
· Monitor batch jobs, troubleshoot failures, and provide timely support for production issues.
· Optimize Spark jobs for performance, scalability, and cost efficiency on GCP.
· Maintain documentation for pipelines, processes, and operational procedures.