Data Engineer

Rivago Infotech Inc
Charlotte, NC

Role : Data Engineer

Location : Charlotte, NC (Onsite)

Contract

End Client - Banking

Experience - 8+ years


Required Skills & Experience

Programming: Python/PySpark, Scala is a plus

Big Data: Hadoop (HDFS, YARN), Hive, Spark (optimization, tuning)

Orchestration: Apache Airflow

Databases/ETL: MongoDB (indexing, sharding, tuning); SQL Server & SSIS (development, migration); strong SQL & stored procedures

Data Lake: HDFS, Hive, Parquet/ORC, partitioning, compaction

APIs: REST-based ingestion; reverse engineering & lineage tools

CI/CD & DevOps: Git, Jenkins, Docker, IaC

Monitoring: logging, metrics, lineage


Key Responsibilities

Reverse Engineering & Data Mapping

Reverse engineer ETL pipelines (SSIS, Spark, stored procedures) to document data flows, logic, and transformations.

Perform detailed source-to-target mappings with field-level transformations and business rules.

Build data dictionaries, lineage, and mapping artifacts.

Collaborate with SMEs to uncover undocumented logic.

Identify data model gaps and recommend remediation.

ETL Pipeline Remediation

Design and refactor pipelines aligned to new source APIs and data contracts.

Re-engineer ETL for 1:1 functional parity during migrations.

Implement schema evolution, transformations, and mapping changes (batch & streaming).

Eliminate redundancy and optimize legacy logic.

Build modular, reusable pipelines using Spark/PySpark/Scala.

Modernize SSIS and integrate with orchestration frameworks.

Orchestrate workflows in Airflow (DAGs, dependencies, SLAs).

Implement logging, error handling, alerting, and metadata capture.

Data Storage Optimization

Simplify schemas; remove redundant/obsolete data across Hive and MongoDB.

Optimize partitioning, clustering, and file formats (Parquet, ORC, Avro).

Redesign MongoDB indexing, sharding, and collections.

Tune HDFS, Hive, MongoDB, and SQL Server for performance and cost.

Implement lifecycle management, archival, and retention.



Functional Skills

  • Experience in ETL migration/remediation projects
  • Strong reverse engineering of legacy ETL (SSIS, Spark, scripts)
  • Expertise in source-to-target mapping (STM), transformation specs, and lineage artifacts
  • Data modeling (dimensional, normalized, denormalized)
  • Schema evolution and zero-downtime migrations
  • Performance tuning across compute and storage layers
  • Strong debugging and problem-solving for distributed systems

Preferred Qualifications

AI/ML-assisted ETL remediation or code conversion

Experience with Wiz or Palo Alto Prisma (APIs, data models, risk metrics)

Prior Prisma to Wiz (or similar CSPM/CNAPP) migrations

Knowledge of CSPM/CNAPP domains (vulnerabilities, identities, exposures)

Experience in regulated, compliance-heavy environments



