Data Engineer, Pune

Working student, full-time · Pune

What you'll do

We are looking for a strong Senior Data Engineer with deep experience in Java-based data platforms and hands-on expertise with GCP, GCS, Iceberg and Parquet. The role involves building efficient data pipelines, improving storage and query performance, and enabling a scalable data lake architecture. Experience with Trino or Apache Spark is a plus.

 

Java plus data engineering experience is a must due to our tech stack; we will not consider PySpark candidates.

Key Responsibilities

1. Data Engineering and Development

  • Design and develop scalable data ingestion and transformation frameworks using Java.
  • Build and maintain Iceberg tables stored on GCS using Parquet format.
  • Continuously improve pipeline performance through better partitioning, compression, data layouts and efficient Java code.

2. Cloud Engineering (Google Cloud Platform)

  • Develop and optimize data solutions using GCP storage and compute services.
  • Tune GCS usage, IAM configuration and lifecycle rules for reliability and cost.
  • Implement data residency and security controls for high-performance, low-latency workloads.

3. Data Lake Operations

  • Manage Iceberg metadata, schema evolution, commit operations and manifest handling.
  • Improve read and write performance through partition strategies, clustering, file sizing and metadata compaction.
  • Troubleshoot concurrent write issues and optimize execution paths.

4. Integration and Query Layer

  • Work with Trino or Spark to run efficient queries on Iceberg datasets.
  • Improve Trino catalog performance through caching, connector tuning and configuration changes.
  • Integrate Java-based applications with data lake endpoints and reduce application query latencies.

5. Testing and Quality

  • Build comprehensive automated tests for schema validation, data correctness and regression detection.
  • Validate data performance under different loads and benchmark improvements.

6. DevOps and Observability

  • Implement CI/CD pipelines for data services.
  • Develop monitoring for Iceberg metadata operations, GCS performance, Trino query speeds and storage metrics.
  • Identify bottlenecks and drive continuous performance improvements across the platform.

Why we should decide on you

  • 5+ years of experience.
  • Prior experience migrating financial/regulatory datasets.
  • Experience with Regulatory Reporting or similar enterprise workloads.
  • Familiarity with large-scale performance benchmarking and cost modelling.

Required Skills

  • Strong Java development background.
  • Deep hands-on experience with GCP and GCS.
  • Practical experience with Apache Iceberg including table design and performance tuning.
  • Strong knowledge of Parquet format, compression options and file optimization techniques.
  • Good understanding of distributed systems and data consistency.
  • Experience building scalable, high-performance data platforms.

Nice to Have

  • Experience with Trino for federated querying.
  • Experience with Apache Spark for distributed data processing.
  • SQL tuning experience.
  • Knowledge of Oracle to Iceberg migration patterns.

Soft Skills

  • Strong analytical and debugging capability.
  • Clear communication and ability to work with cross-functional teams.
  • Ownership mindset and drive to deliver performance improvements without supervision.

Education

  • Bachelor's or Master's degree in Computer Science, Engineering or an equivalent field.

Why you should decide on us

  • Let’s grow together: join a market-leading SaaS company. Our agile character and culture of innovation enable you to shape our future.
  • We provide you with the opportunity to take on responsibility and participate in international projects. 
  • In addition to our buddy-program, we offer numerous individual and wide-ranging training opportunities during which you can explore technical and functional areas. 
  • Our internal mobility initiative encourages colleagues to transfer cross-functionally to gain experience, and promotes knowledge sharing.
  • We are proud of our positive working atmosphere characterized by a supportive team across various locations and countries and transparent communication across all levels. 
  • Together we're better - meet your colleagues at our numerous team events.

To get a first impression, we only need your CV, and we look forward to meeting you in a personal or virtual interview!

Recognizing the benefits of working in diverse teams, we are committed to equal employment opportunities regardless of gender, age, nationality, ethnic or social origin, disability, and sexual identity.

Are you interested? Apply now!
https://www.regnology.net

R&D_N_2025_02
R&D_N_2025_03

About us

Regnology is a leading technology firm on a mission to bring efficiency and stability to the financial markets. With an exclusive focus on regulatory reporting and more than 35,000 financial institutions, over 100 regulators, international organizations, and tax authorities relying on our solutions to process their regulatory reporting data, we are uniquely positioned to bring greater data quality, automation, and cost savings to all market participants. With a global team of over 1,200 employees, our clients can swiftly implement and derive value from our solutions and stay ahead of regulatory changes. Established in 2021 through the merger of BearingPoint RegTech and Vizor Software, Regnology is rapidly growing into a leading global regulatory reporting powerhouse.

Visit our website www.regnology.net
 

Want to know more about Regnology? Find our news and business events on LinkedIn: https://www.linkedin.com/company/regnology/mycompany/

Want to know more about life and people at Regnology? Check out our Instagram page: https://www.instagram.com/peopleofregnology/

We look forward to hearing from you!
Thank you for your interest in Regnology. Please fill out the following short form. If you have trouble uploading your data, feel free to contact us by email at recruiting@regnology.net.