Data Engineer / Analytics Engineer (f/m/d)

InCycling GmbH
Berlin Full-time 🌐 English
Added to JobCollate: April 9, 2026

AI Summary Powered by Gemini

This role is for a Data Engineer/Analytics Engineer who will build and own data pipelines at a Berlin-based startup in the chemical trading industry. Key requirements include Python and SQL experience, data wrangling skills, and a strong problem-solving mindset. The role offers the opportunity to shape the data infrastructure from the ground up.

Job Description

About the job

You will be the first dedicated data hire at InCycling, a Berlin-based startup building AI-driven infrastructure for surplus chemical trading in a highly regulated industry. The data you'll work with is messy, incomplete, and arriving in every format imaginable, from structured ERP exports to blurry photos of warehouse labels. Your job will be to own it: understanding what matters, what’s wrong, and what’s missing, in close collaboration with the DS/ML owner. This role starts in Data Engineering but will grow with you toward data science and analytics as the platform scales.

Tasks

In this role you will:
- Build Python-based data pipelines on AWS, from early foundations to a robust, maintainable system, to ingest, validate, transform, and analyze complex chemical data.
- Parse and extract structured data from messy sources: fragmented proprietary databases, PDFs, photos, free-form text.
- Understand why half the fields are blank, why the same chemical has 15 different names, and what to do about it. Develop domain understanding of our data.
- Help define our data infrastructure while it's still taking shape. The tools, patterns, and practices we establish now will stick around as the platform grows.
- Use AI tools (we use Claude Code) to structure and accelerate your work.
- Contribute to design discussions, code reviews, and retrospectives. We are a team of six; your opinions will matter.

Requirements

You should have experience or a good foundation in:
- Python and SQL for scripting and data manipulation (2–4 years of experience is ideal, but strong coursework and adjacent experience may suffice).
- Data wrangling, validation, and cleansing (whether algorithmic, rules-based, or visualization-based).
- Strong analytical skills and a problem-solving mindset: you genuinely enjoy the challenge of turning messy data into working schemas and structures.
- Good engineering practices: version control, testing, linting, and care for code quality.
- Strong communication skills, kindness, directness, and persistence.

Our working language is English, and we work 3/2 hybrid in Berlin.

The following skills (or a strong interest in learning them) are a plus:
- Experience with Pydantic, pandera, Great Expectations, rapidfuzz, or similar tools.
- Interest in modern, lightweight tech stacks, e.g. uv, Polars, dlt, DuckDB.
- Conceptual knowledge of enterprise orchestration and transformation tools (e.g. Airflow, Dagster, dbt), object-oriented programming, design patterns, CI/CD.
- Experience with PDF parsing (e.g. pdfplumber), OCR, and unstructured data extraction.
- LLM orchestration; agentic workflows for data, knowledge bases, vector databases.
- Domain knowledge in the chemical industry, pharmacology, or product supply.

Benefits

We offer:
- The opportunity to build a digital platform from the ground up, influencing key architectural and engineering decisions.
- A small, highly collaborative team with fast decision-making.
- Direct mentorship from experienced data scientists and engineers (you won’t be learning from docs alone!) and direct access to founders.
- A mission-driven product contributing to the circular economy by reducing waste and CO₂ emissions in the chemical industry.
- A pragmatic engineering culture with modern tools, including AI-assisted development.
- A flexible hybrid working setup and a culture built on ownership, collaboration, and continuous learning.
- Starting as a 12-month engagement tied to our funding cycle, with every intention of building a long-term journey together as InCycling grows.
- A starting salary in the €65–75k range (competitive for an early-stage startup in Berlin).
- Urban Sports Club membership as a company health benefit.
- Access to corporate employee benefits platforms with regularly changing monthly offers and discounts.

About InCycling

Every year, €26 billion worth of perfectly usable chemical raw materials are destroyed, generating millions of tonnes of avoidable CO₂ emissions and hazardous waste, not because they lack value, but because no system exists to capture it. InCycling is changing that. We are building a cloud-native, AI-driven B2B platform with deep SAP integration, automated regulatory compliance workflows, and ML-based supply-vs-demand matching for scalable surplus chemical management. We are live with our first global DAX-tier enterprise pilot across 7 production sites in 4 countries and 3 continents, with over 500 tonnes of surplus chemicals already in our deal pipeline. You would be joining at the moment when the foundation you build will define how the platform scales.

We welcome candidates from all backgrounds who believe in our mission. If you do not match 100% of the requirements but believe you can do the job, please apply: perfect candidates do not exist, but wonderful colleagues do.

Required Skills

IT