Job Information
Cummins Inc. AI Intern - C in Beijing, China
This position is not available in GPP database. Talent Acquisition team member will fill in the Posting description after intake meeting.
Qualifications
Job title: AI Intern - C
The Cummins Data Science and Artificial Intelligence Department is seeking undergraduate or graduate students for in-depth research opportunities. These opportunities cover exploring and applying Large Language Models (LLMs) in vertical scenarios within the automotive and industrial domains.
Job summary:
• Explore application scenarios for Large Language Models in the automotive and industrial domains.
• Design and develop frameworks for Large Language Models in vertical domains.
• Build and optimize applications based on Large Language Models.
• Evaluate and analyze the effectiveness of Large Language Models in vertical domains.
Key responsibilities:
Code reverse analysis: Reverse-analyze the given C++ code against the API documentation, break down the core logic, accurately locate where each functional module is implemented in the code, and produce a structured analysis report. Independently develop engineering modules.
Construct datasets: Combine C++ code with large language models. Based on the results of the reverse analysis, work with business personnel to write natural-language descriptions corresponding to the C code, and build an NL2C (Natural Language to Code) dataset.
Model testing and training: Test the code-generation capabilities of different models in the industrial field, produce evaluation reports, and use the NL2C dataset for model fine-tuning to enhance the model's code-generation ability.
Design and implement application solutions for large language models: Build LLM-based solutions tailored to specific application scenarios.
Develop and optimize the training and inference processes of large language models: Improve the model's efficiency, accuracy, and interpretability.
Evaluate and analyze the application effects of large language models: Assess how large language models perform in the automotive and industrial fields, and make improvements and optimizations based on the results.
Write technical documents and reports: Document the application research and development of large language models in the automotive and industrial fields, recording work achievements, experiences, and lessons learned.
During and after the entire internship, present convincing and verified cases to team members and stakeholders.
Qualifications and competencies:
Currently pursuing an undergraduate or postgraduate degree in Computer Science, Data Science, Artificial Intelligence, or related fields: Possess a solid foundation in computer science, with familiarity in data structures, algorithms, and programming languages.
Have basic development capabilities in C++ and C, and be able to independently develop engineering modules. Preference will be given to those familiar with unit testing.
Be familiar with technologies related to large language models (LLMs): Understand concepts such as word embeddings, language models, Transformers, prompt engineering, retrieval-augmented generation, and model fine-tuning.
Have experience deploying mainstream open-source large language models such as ChatGLM, Qwen, and Llama, and understand their respective applications and technical characteristics.
Be proficient in Prompt Engineering and be good at improving the quality of model code generation through prompt optimization. Master lightweight fine-tuning techniques such as LoRA and Adapter. Preference will be given to those with experience in fine-tuning code domain models.
Be good at using large language models to assist in programming development. Preference will be given to those who have used AI code generation tools such as Cursor and GitHub Copilot.
Be familiar with Python or other programming languages: Be able to proficiently write and develop code using Python or other programming languages.
Have good learning and problem-solving abilities: Be able to quickly learn new knowledge and technologies and independently solve problems.
Have good communication and teamwork skills: Be able to effectively communicate with team members and collaborate to complete project tasks.
Job Engineering
Organization Cummins Inc.
Role Category Hybrid
Job Type Student - Internship
ReqID 2414242
Relocation Package No