Data Quality Engineer (with Data Science and Python)

Posted 5 days 9 hours ago by eTeam

Utrecht, Netherlands
Job Description

Job Title: Data Quality Engineer (with Data Science and Python)

Location: Utrecht, NL

Duration: 6-12 Months

Hiring Type: Contract


Job Summary:

We are seeking a detail-oriented Data Quality Engineer with a strong foundation in data science and Python programming. In this role, you will ensure data integrity across platforms, develop automated quality checks, and collaborate closely with data scientists, analysts, and engineers to deliver high-quality data for analytical and operational use.


Key Responsibilities:

• Design, develop, and maintain automated data quality validation frameworks using Python.

• Collaborate with data engineers and scientists to ensure data consistency, completeness, and reliability across pipelines.

• Identify data quality issues, investigate root causes, and implement resolutions or process improvements.

• Create and maintain data quality metrics and dashboards to monitor the health of data systems.

• Apply data science techniques (e.g., anomaly detection, clustering, statistical modeling) to identify and predict data quality issues.

• Work with stakeholders to define data quality requirements and standards.

• Perform exploratory data analysis (EDA) and profiling to assess quality and discover patterns or anomalies.

• Develop documentation and promote best practices for data validation and governance.


Required Skills and Qualifications:

• Bachelor's or Master's degree in Computer Science, Data Science, Engineering, or a related field.

• 3+ years of experience in a data engineering, data quality, or data science role.

• Strong programming skills in Python with hands-on experience in data manipulation (Pandas, NumPy).

• Proficiency with SQL and relational databases (e.g., PostgreSQL, MySQL).

• Experience with data quality tools (e.g., Great Expectations, Deequ, Apache Griffin) is a plus.

• Familiarity with data pipelines, ETL processes, and cloud platforms (AWS, GCP, or Azure).

• Knowledge of statistical techniques and machine learning concepts for data validation.

• Excellent analytical, troubleshooting, and communication skills.