Art Critic AI Agent

Building an autonomous, multimodal AI pipeline for daily creative critique and content generation.

Overview

I built an automated AI agent that critiques my daily render artworks in my Everyday Project and outputs shareable video content for social platforms.

Pipeline Breakdown

Stage 1 – Text & Audio Generation

Input: Daily render image from my ongoing Everyday project.
Perception: The agent analyzes the image using OpenAI’s GPT-4o Vision API.
- Later replaced by local LLMs (Mistral, Llama3 via Ollama) for a free, offline workflow.
Reasoning: Generates an art critique text based on the image content.
Voice: Converts the critique text to audio narration.

Initially used OpenAI TTS, later integrated free alternatives like Coqui TTS and Piper TTS.

Bonus: Suggest an existing artwork related to the artwork critiqued.

Stage 2 – Video Synthesis

Combines:
- Daily render image (as static background)
- Audio narration
Uses FFmpeg (open-source) to produce the final .mp4 video.

Prompt

“Describe and critique this artwork in detail. Also suggest an existing piece of art that is similar to this based on your analysis. Check and make sure that it is an existing artwork”

Workflow Architecture

Tools & Technologies

Text Generation: OpenAI GPT-4o, Mistral/Llama3 via Ollama
Text-to-Speech: OpenAI TTS, Coqui TTS, Piper TTS
Video Composition: FFmpeg
APIs: Meta Graph API, YouTube Data API (explored for auto-posting)

Key Takeaways

Created a multi-modal AI Agent exhibiting perception → reasoning → action.
Transition from paid APIs to a free, local-first setup, increasing accessibility and sustainability.
Open to future extensions:
- Auto-posting to platforms
- Engagement-driven feedback loops
- Fully autonomous daily outputs

Suraj Barthy

⌘+K Résumé Now Playbook → 9 years of Everyday 3D art