Data Engineer (ETL)

Supermom Business

Location

Shanghai, Shanghai, China

Salary

Not specified

Type

Full-time

Posted

Today

via LinkedIn

Job Description

  • Responsible for data cleaning (ETL) and data warehouse construction to support large-scale AI models
  • Responsible for training and fine-tuning large AI models to meet the requirements of specific business scenarios
  • Responsible for developing supporting tools, such as dashboards and general business logic, so that AI model applications are usable in practice
  • Must have hands-on development experience and be able to lead a team or independently complete projects related to data collection and development

Requirements

  • A degree in computer science or a related field is preferred. Must have solid knowledge of machine learning, deep learning, and natural language processing; at least 1 year of experience developing applications with GPT or Gemini; and proficiency in a deep learning framework such as PyTorch or TensorFlow
  • Familiar with models such as Transformer, BERT, GPT, and fine-tuning algorithms like LoRA, with experience in fine-tuning models
  • Must have Java programming experience
  • Experience in backend Java development for data engineering use cases, particularly real-time processing with Apache Flink
  • Must have experience in data warehouse development and construction, such as using Flink and building ETL data cleaning pipelines
  • Experience with large model pre-training and practical application in business scenarios is a plus
  • Must have hands-on experience building your own solutions on top of open-source large models
  • Experience in conversational AI, marketing content generation, or machine translation is preferred
  • Priority will be given to candidates with hands-on experience in Google Cloud Platform (GCP), particularly those with experience in BigQuery

Benefits

  • Lead community-building for Southeast Asia's largest parenting ecosystem
  • Be at the forefront of connecting brands with real parents in authentic and impactful ways
  • Work with a passionate team driving innovation in the parenting space
  • Regional exposure across three of Southeast Asia's most dynamic markets
