Projects

Efficient Large Language Model Architecture for Long Context Modeling

Personal Project | May 2024 - Present

  • Prototyped novel LLM architecture combining YOCO, LongRoPE, Infini-attention, and BitNet b1.58
  • Reduced memory usage by 50% and increased inference throughput by 25%
  • Unified advanced caching and compressive attention mechanisms for million-token contexts

1.58-bit Ternary Language Models

Personal Project | March 2024 - Present | GitHub

  • Developed 1.58-bit ternary modeling approach based on BitNet b1.58 architecture
  • Achieved near-full-precision accuracy with 70% lower memory usage
  • Demonstrated robust performance for consumer-grade AI systems

Disney+ Unified Messaging Platform

Professional Project | 2021 - Present

  • 0→1 messaging platform supporting Disney+, Hulu, and ESPN+ across Email, Push, and In-App channels (on mobile, web, and connected TV devices)
  • Reached 250M+ subscribers with 4% incremental streaming engagement
  • Built rapid prototyping pipeline using v0, Lovable.dev, and bolt.new
  • Expanded platform to multiple businesses within the TWDC Family of Businesses