Case study
VisionMap
3D Earth interface that generates narrated video tours from prompts using an AI media pipeline.
GenAI · 3D · Product
Overview
VisionMap combines an interactive 3D globe with an AI pipeline that produces location-based video tours. Users can select a region and generate a cohesive storyline and visuals.
Problem
Creating engaging, location-based video content takes time and specialized tools.
Solution
I integrated Unity (3D experience) with Python services and AI APIs to generate scripts, scene plans, and rendered video outputs.
Architecture
- Unity front-end selects location + prompt
- Python orchestrator calls LLM for script + structure
- RunwayML generates/edits visuals; the orchestrator stitches the clips into the final video
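The flow above can be sketched as a small orchestrator. This is a minimal illustration, not the project's real code: the step functions stand in for the actual OpenAI and RunwayML calls, and all names are assumptions.

```python
# Hypothetical stand-ins for the real OpenAI / RunwayML API calls,
# which are not shown in this case study.
def generate_script(location: str, prompt: str) -> str:
    # LLM call: turn location + prompt into a narrated script.
    return f"Tour of {location}: {prompt}"

def plan_scenes(script: str) -> list[str]:
    # LLM call: break the script into a scene plan.
    return [part.strip() for part in script.split(":")]

def render_scene(scene: str) -> str:
    # RunwayML call: render one scene into a video clip.
    return f"clip({scene})"

def stitch(clips: list[str]) -> str:
    # Combine rendered clips into the final tour video.
    return " + ".join(clips)

def run_pipeline(location: str, prompt: str) -> str:
    """End-to-end: script -> scene plan -> clips -> stitched video."""
    script = generate_script(location, prompt)
    scenes = plan_scenes(script)
    clips = [render_scene(s) for s in scenes]
    return stitch(clips)
```

Each stage takes only the previous stage's output, which keeps the pipeline linear and easy to inspect.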
Tech stack
- Unity for 3D interaction
- Python orchestration
- OpenAI API + RunwayML
Key engineering decisions
- Pipeline orchestration to keep creative steps deterministic and debuggable.
- Separation of concerns: Unity handles the UX; the backend handles generation.
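One way to make multi-step generation debuggable is to run named steps through a single driver that records every intermediate result. This is an illustrative sketch under assumed names, not the project's actual orchestrator:

```python
def run_steps(steps, payload):
    """Run (name, fn) steps in order, keeping each intermediate
    output in a trace so a bad result can be traced to the exact
    step that produced it (hypothetical sketch)."""
    trace = {}
    for name, fn in steps:
        payload = fn(payload)
        trace[name] = payload
    return payload, trace

# Example with trivial stand-in steps:
steps = [("script", str.upper), ("plan", str.split)]
output, trace = run_steps(steps, "hello world")
```

Because each step is a pure function of the previous output, rerunning the pipeline with the same inputs reproduces the same trace.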
Results
- Render time: 30s
What I’d improve next
- Add caching for reusable assets and rerun-at-step to reduce iteration time.
- Add safety filtering and content constraints for production readiness.
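The caching idea could look roughly like this: key each rendered asset by a hash of its generation parameters, so unchanged steps are skipped on rerun. A minimal sketch with assumed names, not a committed design:

```python
import hashlib

class AssetCache:
    """Cache rendered assets keyed by a hash of their generation
    parameters (illustrative sketch)."""

    def __init__(self):
        self._store = {}

    def key(self, params: dict) -> str:
        # Stable key: sorted params hashed so identical inputs collide.
        return hashlib.sha256(repr(sorted(params.items())).encode()).hexdigest()

    def get_or_render(self, params: dict, render):
        k = self.key(params)
        if k not in self._store:
            # Only call the expensive render step on a cache miss.
            self._store[k] = render(params)
        return self._store[k]
```

With per-step keys like this, "rerun at step N" falls out naturally: earlier steps hit the cache and only the changed step re-renders.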