Job Information
Cummins Inc. Data Engineer - Senior in Pune, India
DESCRIPTION
Although the role category specified in the GPP is Remote, the requirement is for Hybrid.
Key Responsibilities:
Design and Automation: Deploy distributed systems for ingesting and transforming data from various sources (relational, event-based, unstructured).
Data Quality and Integrity: Implement frameworks to monitor and troubleshoot data quality and integrity issues.
Data Governance: Establish processes for managing metadata, access, and retention for internal and external users.
Data Pipelines: Build reliable, efficient, scalable, high-quality data pipelines with monitoring and alert mechanisms using ETL/ELT tools or scripting languages.
Database Structure: Design and implement physical data models to optimize database performance through efficient indexing and table relationships.
Optimization and Troubleshooting: Optimize, test, and troubleshoot data pipelines.
Large Scale Solutions: Develop and operate large-scale data storage and processing solutions using distributed and cloud-based platforms (e.g., data lakes, Hadoop, HBase, Cassandra, MongoDB, Accumulo, DynamoDB).
Automation: Use modern tools and techniques to automate common, repeatable, and tedious data preparation and integration tasks.
Infrastructure Renovation: Modernize data management infrastructure to drive automation in data integration and management.
Agile Development: Ensure the success of critical analytics initiatives using agile development methodologies such as DevOps, Scrum, and Kanban.
Team Development: Coach and develop less experienced team members.
RESPONSIBILITIES
Qualifications:
College, university, or equivalent degree in a relevant technical discipline, or equivalent experience, is required. Licensing may be required for compliance with export controls or sanctions regulations.
Competencies:
System Requirements Engineering: Translate stakeholder needs into verifiable requirements; establish acceptance criteria; track requirements status; assess the impact of changes.
Collaboration: Build partnerships and work collaboratively with others to meet shared objectives.
Communication: Develop and deliver communications that convey a clear understanding of the unique needs of different audiences.
Customer Focus: Build strong customer relationships and deliver customer-centric solutions.
Decision Quality: Make good and timely decisions that keep the organization moving forward.
Data Extraction: Perform ETL activities from various sources using appropriate tools and technologies.
Programming: Create, write, and test computer code, test scripts, and build scripts to meet business, technical, security, governance, and compliance requirements.
Quality Assurance Metrics: Apply measurement science to assess solution outcomes using ITOM and SDLC standards, tools, metrics, and KPIs.
Solution Documentation: Document information and solutions to enable improved productivity and effective knowledge transfer.
Solution Validation Testing: Validate configuration item changes or solutions using SDLC standards, tools, and metrics.
Data Quality: Identify, understand, and correct data flaws to support effective information governance.
Problem Solving: Solve problems using systematic analysis processes; implement robust, data-based solutions; prevent problem recurrence.
Values Differences: Recognize the value that different perspectives and cultures bring to an organization.
QUALIFICATIONS
Skills:
ETL/Data Engineering Solution Design and Architecture: Expert level.
SQL and Data Modeling: Expert level (ER modeling and dimensional modeling).
Team Leadership: Ability to lead a team of data engineers.
MSBI (SSIS, SSAS): Experience required.
Databricks (PySpark) and Python: Experience required.
Additional Skills: Snowflake, Power BI, Neo4j (good to have).
Communication: Strong written and verbal communication skills.
Preferred Experience:
8+ years of overall experience.
5+ years of relevant experience in data engineering.
Knowledge of the latest technologies and trends in data engineering.
Business Analysis: Familiarity with analyzing complex business systems, industry requirements, and data regulations.
Big Data Platforms: Design and development using open-source and third-party tools.
Tools: Spark, Scala/Java, MapReduce, Hive, HBase, Kafka.
SQL: Proficiency in the SQL query language.
Cloud-Based Implementation: Experience with clustered compute cloud-based implementations.
Large File Movement: Experience developing applications that require large file movement in cloud environments.
Analytical Solutions: Experience building analytical solutions.
IoT Technology: Intermediate experience preferred.
Agile Software Development: Intermediate experience preferred.
Job: Systems/Information Technology
Organization: Cummins Inc.
Role Category: Remote
Job Type: Exempt - Experienced
ReqID: 2414041
Relocation Package: No