AI Engineer & Content Creator

Hi, I'm
Your Name

I focus on video generation, image generation, and multimodal AI research. I am also a passionate content creator who makes AI-powered short films.

Full-Stack Developer · 🧠 AI Researcher · 🎬 Content Creator

About Me

AI Researcher × Full-Stack Engineer × Content Creator

Career Direction

AI Engineer / Machine Learning Engineer

Research Areas

Video Models & Generation · Image Generation · Temporal Models · Multimodal Learning

Triple Identity

Full-Stack Developer

Product design, algorithm R&D, engineering, testing & deployment — end-to-end capability

AI Researcher

Deep dive into video generation and multimodal domains, tracking cutting-edge papers

Content Creator

Directing, filming & editing — creating AI short films and cinematic driving footage

Skills

Cross-disciplinary skill stack

Current Skill Positioning

Full-stack AI Engineer focused on LLM fine-tuning, agent systems, RAG architecture, and production-oriented backend delivery, differentiated by causal inference and measurement skills.

LangGraph · MCP Protocol · GraphRAG · QLoRA · GRPO / DPO · FastAPI · Causal Inference

LLM & GenAI Engineering

Core strengths around model integration, fine-tuning, alignment, and inference optimization.

OpenAI / Claude / DeepSeek / Gemini / Qwen APIs · vLLM · SGLang · LoRA / QLoRA fine-tuning · GRPO / DPO alignment · Unsloth · LLaMA-Factory · KV Cache optimization · Flash Attention · MoE architectures · DeepSeek V3 / R1 techniques · HuggingFace Transformers

Agent Systems

Product-oriented agent orchestration, tool use, workflow automation, and guardrail design.

LangGraph · OpenAI Agents SDK · MCP Protocol (SSE / Stdio / HTTP) · Function Calling · ReAct · Multi-agent orchestration · Dify · Coze · N8N · GuardRails

RAG & Knowledge Systems

Retrieval, knowledge organization, query transformation, and context engineering across document AI systems.

GraphRAG · Milvus · ChromaDB · Faiss · BGE / M3 embeddings · Query Transformation · Reranking · Mem0 · DSPy Context Engineering · RAGFlow

Machine Learning & Multimodal

A combined view of classical ML, deep learning, and multimodal modeling that matches an applied-AI profile.

PyTorch · CNN Architectures · LSTM / GRU / Informer · XGBoost / LightGBM / CatBoost · Optuna · Feature Engineering · Model Fusion · Transfer Learning · CLIP · Vision Transformer (ViT) · LLaVA · Swin Transformer · OpenCV · Image Augmentation

Optimization, Infra & MLOps

Distributed training, inference optimization, service APIs, and deployment-minded engineering support.

DeepSpeed (ZeRO 1 / 2 / 3) · DDP / FSDP · Tensor / Pipeline Parallelism · Mixed Precision (fp16 / bf16 / fp8) · Megatron-LM · TensorRT · Quantization (GPTQ / AWQ / GGUF) · NCCL · FastAPI · Docker / Kubernetes · LangSmith · Wandb · Pydantic · SQLAlchemy / Alembic · MongoDB · GraphQL / RESTful API

Causal Inference & Analytics

The strongest differentiator: the ability to measure impact, not just build models or workflows.

Differentiator
A/B Testing · PSM · DID · DML · DAGs / do-calculus · IV / 2SLS · Sensitivity Analysis · RCT Design · SQL (Window Functions / CTEs / Joins) · Pandas · NumPy · Tableau · RFM / AARRR / Funnel / Cohort Analysis · Business Metrics

Projects

Full-stack AI platforms, document intelligence systems, and model-tuning workflows

Multi-Model AI Studio
Platform · Featured

A full-stack AI workspace that unifies hosted and self-hosted LLMs with chat, streaming responses, multimodal input, and batch inference.

React · TypeScript · FastAPI · SSE · LLM Platform

Multimodal Document RAG Platform
Document AI · Featured

A multimodal RAG system for PDF parsing, vector retrieval, and document-grounded chat, built as one integrated upload-to-answer experience.

React · FastAPI · LangChain · Milvus · RAG

Structured Extraction and Retrieval QA Platform
Document AI

A document intelligence platform that combines structured extraction, vector search, and grounded QA across radiology, medication, finance, and news workflows.

FastAPI · Qdrant · Chroma · LangChain · DeepSeek

Enterprise NL2SQL Fine-Tuning System
Model Tuning · Featured

An enterprise NL2SQL pipeline that generates schema-aware training data, then supports tuning, validation, and evaluation for natural-language SQL workflows.

LoRA · QLoRA · FastAPI · WebSocket · SQL

RL-Tuned Function-Calling Agent Pipeline
Agent

A function-calling agent pipeline for preference data generation and evaluation, designed to improve tool selection and argument quality.

DPO · Function Calling · Evaluation · FastAPI · Agents

OCR-Powered AI Data Analysis System
Document AI · Case Study

An OCR-driven analysis workflow that turns PDF and image content into structured extraction and visualization-ready data.

DeepSeek-OCR · Async Batching · FastAPI · Analytics

Research Papers

Academic research and technical exploration

Efficient Video Generation with Diffusion Models

CVPR 2026

Your Name, et al.

A novel efficient video diffusion architecture that significantly reduces computational cost while maintaining generation quality.

Video Generation · Diffusion Model · Efficiency

A Unified Framework for Multimodal Temporal Understanding

NeurIPS 2025

Your Name, et al.

A unified multimodal temporal understanding framework integrating visual, language, and audio signals for temporal reasoning.

Multimodal · Temporal · Understanding

Blog

Technical insights and reflections

Creative Works

AI Short Films · Cinematic Driving · Visual Stories

AI Film

AI-Generated Cyber City

A cyberpunk city short film generated with Sora and Runway

Driving

Mountain Road Sunset Drive

4K cinematic driving footage capturing sunset on mountain roads

AI Film

AI × Traditional Animation

A traditional Chinese animation short made with AI tools

Driving

City Night Cruise

Night driving through the city with neon lights and traffic

Contact

Let's connect

Whether it's job opportunities, technical discussions, or creative collaborations, feel free to reach out.