AI Safety Research Engineer

Isabella Luong

Three active research projects, all asking the same question: can we actually trust how we measure AI?

AI Safety Evaluations Adversarial Robustness Red-teaming Mech Interp
View Research Full CV
Scroll to explore
About

Research engineer at the
frontier of AI safety

I build the infrastructure and evidence base for trustworthy AI — determined to close the gap between how we think AI systems behave and how they actually do.

Isabella Luong

I'm Isabella (Thao My) Luong, a research engineer based in Ho Chi Minh City, Vietnam, focused on AI safety evaluations and adversarial robustness. Within five months of entering the field, I'm running three interconnected empirical research projects examining how current evaluation infrastructure fails at scale.

My work spans benchmark integrity (how single-pass evals miss trajectory-level failure), LLM-as-judge bias (how stylistic features distort scoring independent of quality), and cross-session threat detection (catching adversarial actors who fragment attacks across API sessions). These aren't separate interests — they're a systematic examination of where evaluation breaks down.

I was admitted to CAMBRIA (10% acceptance rate) and to two SPAR Spring 2026 projects, and am embedded in the EA/AI safety institutional ecosystem. I hold a B.Sc. in Information Technology from RMIT University Vietnam as a Vice-Chancellor Merit Scholar.

Location Ho Chi Minh City, Vietnam
Status Available for collaboration
Focus areas
AI Safety Evaluations Adversarial Robustness LLM-as-Judge Scalable Oversight Mechanistic Interpretability AI Welfare Benchmark Design
Stack
Python PyTorch HuggingFace Inspect (UK AISI) GNN Docker Git
Research

Active research projects

Benchmark Design · Animal Welfare Feb 2026 – Present
Research Mentee · Mentor: Allen Lu (Sentient Futures / Electric Sheep) · SPAR Spring 2026

Dynamic multi-turn benchmark evaluating frontier models on animal welfare reasoning under escalating adversarial pressure. Targeting submission to Inspect Evals (UK AISI); outputs feed into model specifications and training interventions at frontier labs.

  • Engineered an LLM-based scenario generation pipeline following ARENA dataset quality control standards, adapting ARENA's MCQ-oriented framework to open-ended adversarial welfare scenarios; implemented iterative rubric design for automated quality filtering, integrated manually curated few-shot exemplars to control output style and length distribution, and validated ~300 generated scenarios against benchmark contribution criteria
  • Refactored the dataset generation pipeline for reproducibility and robustness — replacing auto-generated few-shot prompting with curated exemplars, structuring multi-file pipeline outputs, and distilling community feedback into concrete generation constraints that measurably reduced eval-aware and formulaic scenario outputs
  • Identified and designed mitigation experiments for the comparability problem in dynamic multi-turn evaluation — where model-conditioned adversarial follow-ups cause non-equivalent pressure testing across models, confounding cross-model scoring; proposed and tested two solutions: a hybrid fixed-then-dynamic turn design providing a standardized baseline pressure before resuming adaptive generation, and controlled pressure injection ensuring welfare reasoning is consistently targeted regardless of Turn 1 model response
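The hybrid fixed-then-dynamic turn design described above can be sketched as follows. This is a minimal illustration, not the project's actual implementation — the function and class names (`run_hybrid_eval`, `Transcript`, `dynamic_attacker`) are hypothetical:

```python
# Sketch of a hybrid fixed-then-dynamic multi-turn eval loop.
# Assumption: all names here are illustrative, not the project's real API.
from dataclasses import dataclass, field
from typing import Callable, List

@dataclass
class Transcript:
    turns: List[dict] = field(default_factory=list)

    def add(self, role: str, content: str) -> None:
        self.turns.append({"role": role, "content": content})

def run_hybrid_eval(
    model: Callable[[List[dict]], str],             # model under test
    fixed_pressure: List[str],                      # standardized adversarial turns, shared across models
    dynamic_attacker: Callable[[Transcript], str],  # model-conditioned follow-up generator
    n_dynamic_turns: int = 3,
) -> Transcript:
    """Apply identical fixed pressure first, then resume adaptive generation."""
    t = Transcript()
    # Phase 1: fixed turns give every model the same baseline pressure,
    # so early-turn scores are directly comparable across models.
    for prompt in fixed_pressure:
        t.add("user", prompt)
        t.add("assistant", model(t.turns))
    # Phase 2: dynamic turns probe model-specific weaknesses.
    for _ in range(n_dynamic_turns):
        t.add("user", dynamic_attacker(t))
        t.add("assistant", model(t.turns))
    return t
```

The design choice: scoring on the fixed-phase turns isolates cross-model comparability, while the dynamic phase preserves the adaptive pressure that single-pass evals miss.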
Adversarial Detection · ML Security Feb 2026 – Present
Research Mentee · Mentors: Linh Le & David Williams-King (Mila / ERA) · SPAR Spring 2026

End-to-end detection system for cross-session malicious model misuse — targeting adversarial actors who decompose attack queries across multiple API sessions to evade per-session safety classifiers. Targeting publication at USENIX Security, NeurIPS D&B, and ICLR.

  • Building cross-session monitoring infrastructure — session tracker and code embedding pipeline with vector DB to semantically link malicious fragments
  • Constructing dependency graphs with a cross-session linker and subgraph extraction module, surfacing latent attack structures invisible to single-session classifiers
  • Developing a GNN detection architecture with adversarial training loops (5–10 rounds) and explainability-focused outputs generating structured attack explanations
  • Designing FragBench — a standardized benchmark suite with baselines and leaderboard for cross-session attack detection evaluation
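As a toy sketch of the cross-session linking step — embedding-similarity edges restricted to fragments from different sessions — the following shows the core idea; the names, the in-memory cosine search (standing in for the vector DB), and the threshold are all illustrative assumptions:

```python
# Sketch: link semantically similar code fragments across API sessions
# into a graph. Assumption: embeddings are precomputed; a brute-force
# cosine scan stands in for the vector-DB lookup described above.
import math
from collections import defaultdict
from typing import Dict, List

def cosine(a: List[float], b: List[float]) -> float:
    dot = sum(x * y for x, y in zip(a, b))
    na = math.sqrt(sum(x * x for x in a))
    nb = math.sqrt(sum(x * x for x in b))
    return dot / (na * nb) if na and nb else 0.0

def link_fragments(
    fragments: Dict[str, List[float]],   # fragment_id -> embedding
    session_of: Dict[str, str],          # fragment_id -> session_id
    threshold: float = 0.8,
) -> Dict[str, List[str]]:
    """Adjacency list over fragments, keeping only cross-session edges."""
    graph: Dict[str, List[str]] = defaultdict(list)
    ids = sorted(fragments)
    for i, a in enumerate(ids):
        for b in ids[i + 1:]:
            if session_of[a] == session_of[b]:
                continue  # within-session links are already visible to per-session classifiers
            if cosine(fragments[a], fragments[b]) >= threshold:
                graph[a].append(b)
                graph[b].append(a)
    return dict(graph)
```

The resulting graph is the kind of structure a downstream subgraph extractor and GNN classifier would consume.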
AI Safety · AI Welfare Nov 2025 – Present
AI Safety Researcher · Mentor: Philip Kratz · FutureKind Winter Fellowship

Three interconnected empirical projects forming a systematic examination of benchmark and evaluation failure in frontier AI systems — with implications for scalable oversight and reward robustness.

  • Characterizing trajectory-level behavioral drift under adversarial input pressure — identifying failure modes that standard single-pass evaluations systematically miss
  • Empirically measuring how stylistic features (verbosity, hedging, formatting) distort LLM-as-judge scores independent of answer quality
  • Investigating the absence of reliable ground truth in nonhuman welfare reasoning evaluations — proposing criteria for scalable oversight under partial verifiability
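The second project's measurement logic can be sketched as a paired comparison: hold answer content fixed, apply a stylistic transform, and attribute any score shift to style. This is an illustrative sketch only — the transforms and the `judge` callable are placeholder assumptions, not the study's actual protocol:

```python
# Sketch: measure how stylistic transforms shift an LLM judge's scores
# while answer content is held fixed. Assumption: `judge` is any callable
# returning a numeric quality score; the transforms are illustrative.
from statistics import mean
from typing import Callable, Dict, List

def hedge(answer: str) -> str:
    return "I might be wrong, but I believe " + answer

def pad(answer: str) -> str:
    return answer + " To elaborate further, there are several considerations worth noting."

def style_sensitivity(
    judge: Callable[[str], float],
    answers: List[str],
    transforms: Dict[str, Callable[[str], str]],
) -> Dict[str, float]:
    """Mean score shift per transform. Content is held fixed, so any
    nonzero shift reflects style bias rather than answer quality."""
    base = mean(judge(a) for a in answers)
    return {
        name: mean(judge(t(a)) for a in answers) - base
        for name, t in transforms.items()
    }
```

A judge that were perfectly style-invariant would return shifts of zero for every transform; systematic nonzero shifts are the bias the project quantifies.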
Training

AI Safety programs

01
Participant · London Initiative for Safe AI (LISA)

Accepted to the April 2026 cohort of Iliad's month-long intensive on technical AI alignment. Curriculum covers RL, learning theory, mechanistic interpretability, agent foundations, and scalable oversight including Debate. Strong performance serves as a pathway into the Iliad Fellowship (June–August 2026).

Apr 2026
02
Participant · Harvard Square, Boston · 10% acceptance rate

1 of 20 participants admitted worldwide. Completed a 3-week technical curriculum covering CNNs, ResNets, and transformers built from scratch; RL (DQN, PPO); RLHF; and mechanistic interpretability — plus a capstone on automated capability elicitation with LLM-as-judge.

CAMBRIA 2026 cohort
Jan 2026
03
Vietnam Country Lead

International AI policy and advocacy organization mobilizing youth around AI governance and safety. Establishing funding pipelines with tech corporations and philanthropic funders; organizing panel discussions on AI risks and securing partnerships for technical curriculum delivery across Vietnam.

Nov 2025 – Present
04
Attendee · Harvard University, Cambridge, MA

HPAIR's flagship annual conference uniting global leaders, researchers, and students across policy, technology, and business.

Feb 2026
Experience

Professional background

Founding AI Engineer — Agents Full-time
DeepSurg · Remote, Turkey HQ
Aug 2025 – Present

Building Scala AI — an intelligent surgical tutor that performs real-time phase recognition, safety assessment, and structured performance feedback for laparoscopic procedures, improving surgical training through procedure-aware analysis.

  • Architected an agentic AI surgical tutor that understands procedural context, identifies the current surgical step, and evaluates safety, quality, efficiency, and bleeding — generating structured training reports with scores, feedback, key moments, and improvement suggestions
  • Fine-tuned a CNN-based image classifier to detect the start and end of the Calot Triangle Dissection phase within full laparoscopic cholecystectomy videos
  • Explored a VLM-only workflow as an alternative approach to building the annotation dataset; collaborated with medical practitioners to lead the surgical segmentation dataset generation process
  • Developed a CV-based, segmentation-driven surgical phase recognition pipeline, with active expansion to cover additional phases: Clipping & Cutting, Gallbladder Dissection, Gallbladder Packaging, Cleaning / Coagulation, Gallbladder Retraction
  • Built a systematic evaluation pipeline to benchmark model performance across phases and metrics
  • To our knowledge, first-to-market solution performing end-to-end automated surgical phase recognition and AI-driven trainee assessment at this level
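The phase-recognition step above — turning per-frame classifier outputs into a phase's start and end — can be sketched roughly as below. This is a simplified stand-in, assuming majority-vote smoothing over per-frame probabilities; the actual DeepSurg pipeline may differ:

```python
# Sketch: derive a surgical phase interval (start/end frame) from noisy
# per-frame in-phase probabilities. Assumption: majority-vote smoothing
# is an illustrative stand-in for the pipeline's real post-processing.
from typing import List, Optional, Tuple

def smooth(probs: List[float], window: int = 5) -> List[bool]:
    """Majority-vote smoothing of per-frame in-phase decisions."""
    flags = [p >= 0.5 for p in probs]
    out = []
    for i in range(len(flags)):
        lo, hi = max(0, i - window // 2), min(len(flags), i + window // 2 + 1)
        votes = flags[lo:hi]
        out.append(sum(votes) > len(votes) / 2)
    return out

def phase_interval(probs: List[float], window: int = 5) -> Optional[Tuple[int, int]]:
    """First and last frame classified in-phase after smoothing, or None."""
    flags = smooth(probs, window)
    idx = [i for i, f in enumerate(flags) if f]
    return (idx[0], idx[-1]) if idx else None
```

Smoothing matters here because a single misclassified frame would otherwise fragment one phase into several spurious intervals.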
Python PyTorch MONAI nnU-Net OpenCV CUDA
Product Engineer Full-time
Avery Dennison Corporation · Long Hau IP, Vietnam
2023 – 2025

Hired as 1 of 7 from 500+ candidates through a rigorous annual campaign for a fast-tracked management position. Embedded as the sole technical hire within a 500-person manufacturing operation — serving as the de facto in-house software engineer and automation consultant for the Finance department and the entire Vietnam site.

  • Designed and deployed Python + UiPath RPA solutions across 4 finance divisions, eliminating 120+ manual hours/month of repetitive workflow overhead
  • Architected a tax reconciliation system that cut month-end closing time by 25% — later standardised as the APAC regional template
  • Independently owned 8 department-wide automation projects from scoping to go-live — full-cycle project management, risk assessment, software engineering, and deployment; represented Finance in managing and liaising across international automation initiatives
Python Streamlit Docker Apache Arrow UiPath Excel VBA
Education

Academic foundations

RMIT University Vietnam — Bachelor of Science in Information Technology · GPA: 3.75/4.0
Vice-Chancellor Scholar — Full-Tuition Merit Scholarship, Top 7 incoming students
Jun 2021 – May 2025
  • Ranked Top 8 in the 2025 IT graduating cohort; Top 5% university-wide with 16 of 21 courses at High Distinction (85%+)
  • Top student in: Software Engineering Project Management, Artificial Intelligence & Machine Learning, UI/UX Product Design
Poznań University of Technology · Poznań, Poland
Apr 2025

Selected among 90 European peers for a 5-day intensive workshop on systems thinking, Balanced Scorecard, and AI-integrated strategy. Presented to an international faculty panel.

Competitions

Selected competitions
& leadership

Champion — Ranked 1st twice
McKinsey Young Leaders for Vietnam (YLV) Fellowship
May – Nov 2024 · Ho Chi Minh City

Selective fellowship pairing high-potential Vietnamese leaders with McKinsey consultants and NGO partners for real-world consulting casework under evaluation conditions.

  • Ranked 1/20 teams twice by McKinsey consultants across two independent fieldwork cases evaluated under real consulting conditions
  • Fieldwork 1 — Boatman Foundation: Designed a multi-component technology-integrated intervention for undereducated children in Vietnam Highlands — digital learning tools, scholarship allocation logic, and a teaching ambassador deployment model
  • Fieldwork 2 — Vun Art: Led end-to-end operational and financial consulting; delivered data-driven product diversification strategy and targeted materials sourcing campaign
  • Shortlisted for McKinsey Young Leaders for Inclusion (international program); invited as Case Coach for YLV 2025 cohort
Champion
MAERSK × RMIT Sustainability Impact Challenge
May – Nov 2024

Industry-partnered innovation challenge requiring technically grounded, financially validated decarbonization strategies for global logistics operations.

  • Placed 1st among 50+ teams with a 5-year roadmap deploying AI telematics for real-time fuel optimization — cutting Maersk's carbon footprint by 40% with 12% ROI
  • Built quantitative models demonstrating $2.8M long-term savings with full ESG compliance
National Champion · Best Pitch
RMIT FinTech Startup Competition × KardiaChain
Jun – Sep 2023

National competition requiring full-stack product development and business validation, judged by blockchain industry leaders.

  • Architected HemoChain — Vietnam's first distributed blood transfusion coordination network on Hyperledger Fabric with Google Maps API for real-time hospital routing
  • Led dual technical and business teams across smart contract logic, UI development, and go-to-market strategy
National 2nd Place
ASEAN-China-India Youth Leadership Summit (ACIYLS)
Mar – Oct 2024 · Singapore & Vietnam
  • Vietnam — Technical Research Lead: Developed OAKIA — textile-to-textile circular system using AI-driven NIR spectroscopy for automated fabric composition identification, addressing Vietnam's 15M kg annual textile waste
  • Singapore — Systems Lead: Led a randomly assigned team of 10 to design Be-Cool — an urban cooling retrofit company deploying IoT-integrated passive cooling systems

Let's work on
safer AI together

I'm open to research collaborations, fellowship opportunities, and conversations about AI safety. Reach out if you're working on evaluations, scalable oversight, or adversarial robustness.