Data Engineer

  • Company : Multicloud4u Technologies
  • Requirement Type : Full Time
  • Industry : Information Technology
  • Location : Gurgaon, Haryana, India (IN)
  • Key Skills : Python, PySpark, and SQL
Job highlights
  • Experience (Years) : 3 - 5
  • Domain Requirements : Data Engineer
  • Domain Experience : 2
  • Authorized To Work : India
Description

Data Engineer

Data engineers work closely with Subject Matter Experts (SMEs) to design the data model, develop data pipelines, and integrate our system with external systems containing the data. Data engineers also need to provide guidance and support on how to access and leverage the data foundation to create new workflows or analyze data.

Data Pipeline Development & Maintenance

• Integrate new data sources into Foundry using Data Connection
• Implement two-way integrations between Foundry and external systems
• Develop pipelines transforming tabular or unstructured data
• Implement data transformations in PySpark or Pipeline Builder to derive new datasets or create ontology objects (a brief illustrative sketch follows this list)
• Set up support structures for pipelines running in production
• Monitor and debug critical issues such as data staleness or data quality
• Improve the performance of data pipelines (latency, resource usage)
• Design and implement an ontology based on business requirements and available data
• Provide data engineering context for application development
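
The kind of PySpark transformation referred to above is deriving a new dataset from an existing one. The following is a minimal sketch only: the dataset paths, column names, and logic are illustrative and not from the posting, and it uses plain PySpark rather than any Foundry-specific transform wrapper.

```python
# Hypothetical PySpark transform: derive a cleaned, aggregated dataset
# from a raw one. Dataset paths and columns are illustrative only.
from pyspark.sql import SparkSession, functions as F

spark = SparkSession.builder.appName("derive_clean_orders").getOrCreate()

# Read the raw input (in Foundry this would typically come from a Data
# Connection sync; here a Parquet path stands in for it).
raw_orders = spark.read.parquet("/data/raw_orders")

clean_orders = (
    raw_orders
    .filter(F.col("order_status").isNotNull())        # drop incomplete rows
    .withColumn("order_date", F.to_date("order_ts"))  # normalise timestamp to date
    .groupBy("customer_id", "order_date")
    .agg(
        F.count("*").alias("order_count"),
        F.sum("order_amount").alias("total_amount"),
    )
)

clean_orders.write.mode("overwrite").parquet("/data/clean_orders")
```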

 

 

 

Minimum Criteria

• Between 1 and 3 years of experience, ideally in a customer-facing role
• Experience in Python/PySpark, or experience in another programming language with the willingness to learn Python and PySpark independently
• Data engineering experience preferred over data science
• Programming experience involving collaborative software development

 

 

 

Coding Skills

• Python – complete language proficiency
• SQL – proficiency in the query language (join types, filtering, aggregation) and in data modeling (relationship types, constraints)
• PySpark – basic familiarity (DataFrame operations, PySpark SQL functions) and with the differences from other DataFrame implementations such as Pandas (a short comparison sketch follows this list)
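
As a rough illustration of the Pandas/PySpark distinction mentioned above, the sketch below runs the same aggregation both ways; the table and column names are made up for the example.

```python
# Hypothetical side-by-side: the same aggregation in Pandas (eager, in-memory)
# and PySpark (lazy, distributed). Column and table names are illustrative.
import pandas as pd
from pyspark.sql import SparkSession, functions as F

spark = SparkSession.builder.appName("pandas_vs_pyspark").getOrCreate()

pdf = pd.DataFrame({"region": ["N", "N", "S"], "sales": [10.0, 20.0, 5.0]})

# Pandas: executes immediately and returns an in-memory result.
pandas_result = pdf.groupby("region", as_index=False)["sales"].sum()

# PySpark: builds a lazy plan; nothing runs until an action (show/collect/write).
sdf = spark.createDataFrame(pdf)
spark_result = sdf.groupBy("region").agg(F.sum("sales").alias("sales"))
spark_result.show()

# The same logic expressed as Spark SQL, closer to the SQL skills listed above.
sdf.createOrReplaceTempView("sales")
spark.sql("SELECT region, SUM(sales) AS sales FROM sales GROUP BY region").show()
```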

 

 

 

Desirable but not essential:

• Knowledge of industrial processes
• Knowledge of SAP, Oracle, Salesforce, or similar ERPs

 

 

 

Frameworks and Conceptual Familiarity

• Distributed compute – conceptual knowledge of Hadoop and Spark (driver, executors, partitions); a short illustration follows this list
• Databases – general familiarity with common relational database models and proprietary implementations such as SAP, Salesforce, etc.
• Git – knowledge of version-control / collaboration workflows and best practices
• Iterative working – familiarity with agile and iterative working methodologies and rapid user-feedback gathering
• Data quality – best practices
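
To make the "driver, executors, partitions" vocabulary concrete, here is a small, self-contained PySpark snippet; the sizes and partition counts are arbitrary examples, not values from the posting.

```python
# Tiny illustration of the Spark concepts named above: the driver builds a plan,
# executors run the tasks, and the data is split into partitions.
from pyspark.sql import SparkSession

spark = SparkSession.builder.appName("partition_demo").getOrCreate()

df = spark.range(1_000_000)                  # driver-side plan; no data moved yet
print(df.rdd.getNumPartitions())             # how many partitions the data is split into

repartitioned = df.repartition(8)            # redistribute work across executors
print(repartitioned.rdd.getNumPartitions())  # now 8

# Only at an action like show() does the driver schedule tasks on executors.
repartitioned.groupBy((repartitioned.id % 10).alias("bucket")).count().show()
```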

 

 

Contact Recruiter : [email protected]. Note: This requirement is either from Multicloud4u Technologies or from its global partner; please contact the recruiter directly for further information.