Data Engineer
Data engineers work closely with Subject Matter Experts (SMEs) to design
the data model, develop data pipelines, and integrate our system
with external systems containing the data. Data engineers also provide guidance and support on how to access and leverage the data foundation to create new workflows or analyze data.
Data Pipeline Development & Maintenance
• Integrate new data sources into Foundry using Data Connection
• Implement two-way integrations between Foundry and external systems
• Develop pipelines transforming tabular or unstructured data
• Implement data transformations in PySpark or Pipeline Builder
to derive new datasets or create ontology objects
• Set up support structures for pipelines running in production
• Monitor and debug critical issues such as data staleness or data quality regressions
• Improve performance of data pipelines (latency, resource usage)
• Design and implement an ontology based on business requirements and
available data
• Provide data engineering context for application development
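The pipeline responsibilities above center on deriving new datasets from raw ones. A minimal, framework-agnostic sketch of that kind of derivation is shown below; in practice this logic would live in a PySpark transform or Pipeline Builder, and the dataset and column names here are hypothetical illustrations only:

```python
import csv
import io

# Toy raw extract, standing in for a tabular dataset ingested via
# Data Connection. Columns are hypothetical, not from any real system.
RAW = """order_id,customer,amount,currency
1,ACME,100.50,USD
2,Globex,80.00,EUR
3,ACME,20.25,USD
"""

def derive_customer_totals(raw_csv: str) -> dict:
    """Derive a new 'dataset': total order amount per customer.

    In Foundry this aggregation would typically be a PySpark
    groupBy/agg inside a transform; the logic is the same.
    """
    totals = {}
    for row in csv.DictReader(io.StringIO(raw_csv)):
        totals[row["customer"]] = totals.get(row["customer"], 0.0) + float(row["amount"])
    return totals

print(derive_customer_totals(RAW))  # → {'ACME': 120.75, 'Globex': 80.0}
```

The derived table would then back an ontology object (e.g., a Customer with a total-spend property), per the design responsibilities above.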
Minimum Criteria Details
• Between 1 and 3 years of experience, ideally in a customer-facing role
• Experience in Python/PySpark, or experience in another
programming language and a willingness to learn Python and PySpark
independently
• Data engineering experience preferred over data science
• Programming experience involving collaborative software development
Coding Skills
• Python – complete language proficiency
• SQL – proficiency in querying language (join types, filtering, aggregation)
and data modeling (relationship types, constraints)
• PySpark – basic familiarity (DataFrame operations, PySpark SQL
functions) and awareness of the differences from other DataFrame
implementations (e.g., Pandas)
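As a rough calibration of the SQL proficiency described above (join types, aggregation, constraints), here is a minimal sqlite3 sketch; the tables and column names are hypothetical examples, not part of any real schema:

```python
import sqlite3

# In-memory database with two toy tables illustrating a relationship
# (foreign key), a join type, and an aggregation.
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE customers (id INTEGER PRIMARY KEY, name TEXT NOT NULL);
    CREATE TABLE orders (
        id INTEGER PRIMARY KEY,
        customer_id INTEGER REFERENCES customers(id),  -- relationship + constraint
        amount REAL NOT NULL
    );
    INSERT INTO customers VALUES (1, 'ACME'), (2, 'Globex');
    INSERT INTO orders VALUES (10, 1, 100.0), (11, 1, 50.0);
""")

# LEFT JOIN keeps customers with no orders; COALESCE turns the
# resulting NULL sum into 0.
rows = conn.execute("""
    SELECT c.name, COALESCE(SUM(o.amount), 0) AS total
    FROM customers c
    LEFT JOIN orders o ON o.customer_id = c.id
    GROUP BY c.name
    ORDER BY c.name
""").fetchall()
print(rows)  # ACME has orders; Globex still appears, with a 0 total
```

Swapping LEFT JOIN for an INNER JOIN would drop Globex entirely, which is exactly the kind of join-type distinction the bullet above refers to.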
Desirable but not essential:
• Knowledge of industrial processes
• Knowledge of SAP, Oracle, Salesforce, or similar ERPs
Frameworks and Conceptual Familiarity
• Distributed compute – conceptual knowledge of Hadoop and Spark (driver,
executors, partitions)
• Databases – general familiarity with common relational database models
and proprietary implementations, such as SAP, Salesforce, etc.
• Git – knowledge of version control / collaboration workflows and
best practices
• Iterative working – familiarity with agile and iterative working
methodology and rapid user feedback gathering concepts
• Data quality – familiarity with data quality best practices