




Summary: Seeking a dedicated Senior AI Data Engineer to design, build, and scale robust ETL/ELT pipelines optimized for AI workloads, transform unstructured data, and maintain AI knowledge bases.

Highlights:

1. Design, build, and scale robust ETL/ELT pipelines optimized for AI workloads
2. Transform unstructured data for LLM consumption
3. Maintain and automate the data-to-model lifecycle

At TechBiz Global, we provide recruitment services to the top clients in our portfolio. We are currently looking for a dedicated **Senior AI Data Engineer** to join one of our **clients' teams**. If you're looking for an exciting opportunity to grow in an innovative environment, this could be the perfect fit for you.

#### **Responsibilities:**

* Design, build, and scale robust ETL/ELT pipelines optimized for AI workloads, including RAG, fine-tuning, and batch inference.
* Transform unstructured data sources such as PDFs, logs, and transcripts into structured and vectorized formats suitable for LLM consumption.
* Maintain and automate the data-to-model lifecycle, ensuring AI knowledge bases remain synchronized with changing business data.
* Develop and maintain real-time feature pipelines that support low-latency AI and machine learning applications.
* Integrate data platforms with Kafka and other event-driven systems to enable real-time processing and AI-driven responses.
* Manage and optimize Feature Stores to ensure consistency between model training and production environments.
* Implement automated data quality controls and validation processes to ensure the reliability and accuracy of AI training and inference data.
* Establish and maintain data lineage frameworks to provide traceability, auditability, and regulatory compliance across data workflows.
* Enforce data security, privacy, and governance standards, including PII protection and compliance with industry regulations.
* Manage data movement and synchronization across on-premises systems, cloud platforms, and data warehouses.
* Optimize data storage and retrieval strategies for Vector Databases to support high-performance RAG and AI search workloads.
* Collaborate with Data Scientists, ML Engineers, Software Engineers, and business stakeholders to deliver scalable AI data solutions.

#### **Requirements:**

* 10+ years of experience in Data Engineering or Backend Engineering with a strong focus on data platforms and pipelines.
* 2+ years of hands-on experience supporting AI/ML data pipelines, including data preparation for machine learning and generative AI applications.
* Expert-level proficiency in Python and SQL; experience with Java or Scala is an advantage.
* Strong experience building and maintaining real-time data streaming solutions using Apache Kafka, Flink, or Spark Streaming.
* Hands-on experience with modern data orchestration and transformation tools such as Airflow, dbt, and Prefect.
* Experience working with Vector Databases and Feature Stores to support AI and machine learning workloads.
* Strong knowledge of cloud-based data services on AWS, Azure, or GCP, including services such as Glue, Kinesis, Data Factory, or Dataflow.
* Experience deploying and managing data workloads in Kubernetes (K8s) environments.
* Proven experience handling sensitive data within regulated industries such as Fintech, Healthcare, or other compliance-driven environments.
* Strong understanding of data quality, governance, security, and privacy best practices.
* Bachelor's degree in Computer Science, Software Engineering, Information Systems, or a related technical field. Equivalent practical experience will also be considered.
* Excellent problem-solving skills and the ability to collaborate effectively with cross-functional engineering, data, and AI teams.


