projects

Research projects with code, papers, and project pages.

research

Next-Gen CAPTCHAs

GUI-agent defense built around scalable, diverse cognitive-gap challenges.

Dive into Claude Code

A design-space study of current and future AI agent systems.

BiGain

Unified token compression for joint generation and classification.

FADRM

Fast and accurate data residual matching for dataset distillation.

benchmark

Open CaptchaWorld

A web-based platform for testing and benchmarking multimodal LLM agents.

PIXAR

A taxonomy, benchmark, and metrics suite for VLM image tampering.

Mobile-MMLU

A language-understanding benchmark for mobile intelligence.