Data Scientist, II

IN-Pune

APAC

Req #: 107220
Type: Employee | Regular Full-time

Zebra Technologies

Overview:

We are looking for a Data Science professional with expertise in PySpark/Databricks and experience across the stages of the Data Science project life cycle. The incumbent is expected to build and optimize data pipelines, tune and enhance models, and explain model output to business stakeholders. The team primarily works on Demand Forecasting, Promotion Modelling, and Inventory Optimization problems for CPG/Retail customers; prior experience in CPG/Retail is strongly preferred.

Responsibilities:

Essential Duties and Responsibilities: 

* Design, optimize, and maintain scalable ETL pipelines using PySpark and Databricks on cloud platforms (Azure/GCP).
* Develop automated data validation processes to proactively perform data quality checks.
* Allocate cloud resources optimally to control cloud costs.
* Create and schedule jobs on the Databricks platform; work with GitHub repositories and ensure that best practices are followed.
* Work on Supply Chain Optimization models, such as Demand Forecasting, Price Elasticity of Demand, and Inventory Optimization.
* Build and tune forecast models, identify improvement opportunities, and run experiments to prove value.
* Hold frequent conversations with business stakeholders; explain data deficiencies, forecast variances, and the role of different forecast drivers.
* Follow best practices in Architecture, Coding, and BAU operations.
* Collaborate with cross-functional teams, such as Business Stakeholders, Engagement Managers, Data Ops/Job Monitoring, Product Engineering/UI, Data Science, and Data Engineering.

Qualifications:

* Preferred education: Bachelor's degree in Computer Science/IT or a similar field with strong programming exposure, plus a Master's degree in Statistics, Operations Research, Mathematics, or Data Science.
* 3-8 years of experience in Data Science/Data Engineering. Exposure to Demand Forecasting and Inventory Optimization in CPG/Retail will be a big plus.
* Proven experience building and optimizing data pipelines/ETL processes in PySpark, Databricks, Python (Pandas/NumPy), and SQL. Experience working with Git as a collaboration tool.
* Good understanding of cloud platforms, preferably Azure.
* Exposure to conventional time series forecasting (ESM, ARIMA) and Machine Learning models (GBM, ANN, Random Forests).
* Ability to work independently with minimal supervision.
* Good communication skills: ability to present output to business stakeholders and convey data deficiencies.