VietSynth Project | Research Intern

Vietnam

Intern

26/03 — 04/04/2026

Job Description

To apply for this job, you need to complete both steps below:

STEP 1:

Please email your CV to the address below to submit your application directly to the company:

careerservice@vinuni.edu.vn

The recruiter will only receive your application if it is sent to the email address above.


STEP 2:

Kindly scroll to the bottom of this page and complete the short VinUni Tracking Form.

Filling out this form alone does not count as applying. Please note that this form is not part of the company’s application process. It only helps the Careers, Alumni, Industry and Development (CAID) Department discover more opportunities and follow up in case of system issues.

 

About AI for Vietnam
AI for Vietnam (AIV) is a U.S.-registered 501(c)(3) nonprofit organization founded by Vietnamese
professionals from the world’s leading technology corporations and prestigious academic institutions. We
serve as a bridge between global AI advancements and Vietnam’s growing tech ecosystem, uniting a
global network of Vietnamese experts to accelerate the nation’s technological progress.
At AIV, we believe that AI is more than a technological race - it is a cultural mission. We empower
Vietnam to leapfrog traditional development stages by building open, high-quality Vietnamese datasets
and AI solutions rooted in open innovation and community collaboration.
About the VietSynth Project
VietSynth is AIV’s flagship data creation initiative dedicated to building high-quality, domain-specific
Vietnamese datasets that address critical gaps in low-resource language AI. The project focuses on
curating and generating training data across specialized topics of interest, from healthcare and legal to
education, agriculture, and cultural heritage, ensuring that Vietnamese AI systems are grounded in
accurate, culturally nuanced, and domain-expert-validated data.
VietSynth combines synthetic data generation techniques with rigorous human-in-the-loop quality
assurance to produce datasets that rival or exceed human-curated alternatives at scale.
Role Overview
We are seeking motivated and talented Research Interns to join the VietSynth project. As a Research
Intern, you will contribute directly to the creation of high-quality Vietnamese datasets in specialized
domains. You will gain hands-on research experience working alongside experienced AI researchers and
engineers from leading organizations, while making a meaningful impact on Vietnamese AI development.
This is an excellent opportunity for students who want to build real research experience, contribute to
open-source projects, and develop skills at the intersection of NLP, data science, and generative AI.
Key Responsibilities
• Assist in designing and executing data collection, annotation, and curation pipelines for
domain-specific Vietnamese datasets.
• Support synthetic data generation experiments using state-of-the-art LLMs, including prompt
engineering, self-instruct, evol-instruct, and multi-agent techniques adapted for Vietnamese.
• Conduct literature reviews on data generation methodologies, low-resource NLP, and
domain-specific AI applications.

• Develop and apply quality evaluation frameworks to assess data fidelity, diversity, coverage, and
downstream utility.
• Perform data cleaning, deduplication, filtering, and formatting to ensure datasets meet rigorous
training standards.
• Collaborate with domain experts and native Vietnamese speakers to validate the accuracy and
cultural appropriateness of generated data.
• Contribute to technical documentation, research reports, and potential publications or
open-source releases.
• Participate in regular team meetings, paper readings, and knowledge-sharing sessions.
Ideal Candidate
Required Qualifications:
• Currently enrolled in a Bachelor’s or Master’s program in Computer Science, Data Science,
Computational Linguistics, NLP, Machine Learning, or a related field.
• Strong coding skills in Python and familiarity with common data processing libraries (Pandas,
NumPy) and NLP tools.
• Basic understanding of machine learning concepts and natural language processing
fundamentals.
• Interest in or experience with Large Language Models (LLMs), prompt engineering, or generative
AI.
• Strong attention to detail and ability to follow rigorous data quality standards.
• Self-motivated with excellent time management skills and the ability to work independently in a
remote setting.
• Good written and verbal communication skills in English.
Preferred Qualifications:
• Experience with deep learning frameworks such as PyTorch or Hugging Face Transformers.
• Familiarity with data annotation tools, crowdsourcing platforms, or human-in-the-loop
methodologies.
• Prior coursework or research experience in NLP, information extraction, or text mining.
• Experience with low-resource languages or multilingual NLP tasks.
• Knowledge of specific domains of interest (e.g., healthcare, legal, education, agriculture, cultural
studies).
• Prior contributions to open-source projects or published technical writing.
What You Will Gain
• Cutting-Edge Research Experience: Work on one of AI’s most pressing challenges, data
scarcity for low-resource languages, using the latest generative AI techniques.
• Publication Opportunities: Contribute to research that may be published at top-tier ML/NLP
conferences and shared as open-source resources, depending on the outcomes and the interns’
motivation. You will also have opportunities to receive direct guidance from experienced AI
researchers and professionals from Meta, NVIDIA, and top Vietnamese academic institutions.
• Real-World Impact: Help build foundational datasets that will power AI applications serving
millions of Vietnamese users.
• Professional Network: Connect with a global community of Vietnamese AI professionals across
nonprofit, academic, and industry sectors.

• Flexible & Hybrid: Work in a hybrid arrangement with flexible scheduling that accommodates
your academic commitments.

Application form

Full Name *
Email Address *
College  *
VinUni Email  *
Your Resume *