Data Engineer

  • Company : Multicloud4u Technologies
  • Requirement Type : Full Time
  • Industry : Information Technology
  • Location : City: Gurgaon State: Haryana Country: India (IN)
  • Max Salary : 200000 INR
  • Key Skills : Python, PySpark and SQL

  • Experience in Year : 3 - 5
  • Domain Requirements : Data Engineer
  • Domain Experience : 2
  • Authorized To Work : India
  • Description

    Data Engineer


    Data engineers work closely with Subject Matter Experts (SMEs) to design the data model, develop data pipelines, and integrate our system with external systems containing the data. Data engineers also need to provide guidance and support on how to access and leverage the data foundation to create new workflows or analyze data.




    Data Pipeline Development & Maintenance


    • Integrate new data sources to Foundry using Data Connection


    • Implement 2-way integrations between Foundry and external systems


    • Develop pipelines transforming tabular or unstructured data

    • Implement data transformations in PySpark or Pipeline Builder to derive new datasets or create ontology objects

    • Set up support structures for pipelines running in production

    • Monitor and debug critical issues such as data staleness or data quality

    • Improve performance of data pipelines (latency, resource usage)

    • Design and implement an ontology based on business requirements and available data

    • Provide data engineering context for application development




    Minimum Criteria Details




    • Between 1 and 3 years of experience, ideally in a customer-facing role

    • Experience in Python/PySpark, or experienced in another programming language and willing to learn Python and PySpark on their own


    • Data engineering experience preferred over data science

    • Programming experience requiring collaborative software development




    Coding Skills




    • Python – complete language proficiency


    • SQL – proficiency in querying (join types, filtering, aggregation) and data modeling (relationship types, constraints)

    • PySpark – basic familiarity (DataFrame operations, PySpark SQL functions) and its differences from other DataFrame implementations
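
    As a rough illustration of the SQL level described above (join types, filtering, aggregation, relationship types, constraints), here is a minimal sketch using Python's built-in sqlite3 module. The customers/orders schema, the sample rows, and the function names are hypothetical and chosen only for this example; they are not part of the role's actual data model.

```python
import sqlite3


def build_demo_db():
    """Create an in-memory database with a hypothetical one-to-many schema."""
    conn = sqlite3.connect(":memory:")
    # SQLite only enforces foreign keys when this pragma is switched on.
    conn.execute("PRAGMA foreign_keys = ON")
    conn.executescript("""
        CREATE TABLE customers (
            id   INTEGER PRIMARY KEY,
            name TEXT NOT NULL UNIQUE
        );
        CREATE TABLE orders (
            id          INTEGER PRIMARY KEY,
            customer_id INTEGER NOT NULL REFERENCES customers(id),
            amount      REAL NOT NULL CHECK (amount >= 0)
        );
        INSERT INTO customers (id, name)
        VALUES (1, 'Asha'), (2, 'Ben'), (3, 'Chitra');
        INSERT INTO orders (customer_id, amount)
        VALUES (1, 50.0), (1, 150.0), (2, 20.0);
    """)
    return conn


def order_totals_by_customer(conn, min_amount):
    """Return (name, order_count, total_amount) for every customer.

    The LEFT JOIN keeps customers that have no qualifying orders. The amount
    filter sits in the ON clause rather than in WHERE, so it does not turn
    the LEFT JOIN into an inner join.
    """
    query = """
        SELECT c.name,
               COUNT(o.id)                AS order_count,
               COALESCE(SUM(o.amount), 0) AS total_amount
        FROM customers AS c
        LEFT JOIN orders AS o
               ON o.customer_id = c.id AND o.amount >= ?
        GROUP BY c.id, c.name
        ORDER BY c.name
    """
    return conn.execute(query, (min_amount,)).fetchall()
```

    Calling order_totals_by_customer(build_demo_db(), 100) returns one row per customer, including those with no orders of at least 100. Moving the amount filter into a WHERE clause would drop those customers, which is the kind of join-type distinction the SQL bullet refers to.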





    Desirable but not essential:


    • Knowledge of industrial processes

    • Knowledge of SAP, Oracle, Salesforce or similar ERPs




    Frameworks and Conceptual Familiarity




    • Distributed compute – conceptual knowledge of Hadoop and Spark (driver, executors, partitions)

    • Databases – general familiarity with common relational database models and proprietary instantiations, such as SAP, Salesforce etc.

    • Git – knowledge of version control / collaboration workflows and best practices

    • Iterative working – familiarity with agile and iterative working methodology and rapid user feedback gathering concepts

    • Data quality – best practices



    Contact Recruiter : [email protected]
    Note: This requirement is either from Multicloud4u Technologies or from its global partner. Please contact the recruiter directly for further information.