Art Critic AI Agent
Building an autonomous, multimodal AI pipeline for daily creative critique and content generation.
Link to Git Repo

Overview
I built an automated AI agent that critiques my daily render artworks in my Everyday Project and outputs shareable video content for social platforms.
Pipeline Breakdown
Stage 1 – Text & Audio Generation
- Input: Daily render image from my ongoing Everyday project.
- Perception: The agent analyzes the image using OpenAI’s GPT-4o Vision API.
- Later replaced by local LLMs (Mistral, Llama3 via Ollama) for a free, offline workflow.
- Reasoning: Generates an art critique text based on the image content.
- Voice: Converts the critique text to audio narration.
- Initially used OpenAI TTS, later integrated free alternatives like Coqui TTS and Piper TTS.
- Bonus: Suggest an existing artwork related to the artwork critiqued.
Stage 2 – Video Synthesis
- Combines:
- Daily render image (as static background)
- Audio narration
- Uses FFmpeg (open-source) to produce the final
.mp4
video.
Prompt
“Describe and critique this artwork in detail. Also suggest an existing piece of art that is similar to this based on your analysis. Check and make sure that it is an existing artwork”
Workflow Architecture





Tools & Technologies
- Text Generation: OpenAI GPT-4o, Mistral/Llama3 via Ollama
- Text-to-Speech: OpenAI TTS, Coqui TTS, Piper TTS
- Video Composition: FFmpeg
- APIs: Meta Graph API, YouTube Data API (explored for auto-posting)
Key Takeaways
- Created a multi-modal AI Agent exhibiting perception → reasoning → action.
- Transition from paid APIs to a free, local-first setup, increasing accessibility and sustainability.
- Open to future extensions:
- Auto-posting to platforms
- Engagement-driven feedback loops
- Fully autonomous daily outputs