Projects
Efficient Large Language Model Architecture for Long Context Modeling
Personal Project • May 2024 - Present
- Prototyped a novel LLM architecture combining YOCO, LongRoPE, Infini-Attention, and BitNet b1.58
- Reduced memory usage by 50% and increased inference throughput by 25%
- Unified advanced caching and compressive attention mechanisms for million-token contexts
1.58-bit Ternary Language Models
Personal Project • March 2024 - Present • GitHub
- Developed a 1.58-bit ternary modeling approach based on the BitNet b1.58 architecture
- Achieved near-full-precision accuracy with 70% lower memory usage
- Demonstrated robust performance for consumer-grade AI systems
Disney+ Unified Messaging Platform
Professional Project • 2021 - Present
- 0→1 messaging platform supporting Disney+, Hulu, and ESPN+ across Email, Push, and In-App channels on mobile, web, and connected TV devices
- Reached 250M+ subscribers with a 4% incremental lift in streaming engagement
- Built rapid prototyping pipeline using v0, Lovable.dev, and bolt.new
- Expanded platform to multiple businesses across the TWDC family