You're managing tasks on the go. Voice notes pile up in WhatsApp. Screenshots get buried in chat threads. Your team misses critical instructions because they're lost in message history. This n8n workflow transforms every WhatsApp message—voice, text, image, or video—into a structured Trello task with full transcription and media attachments. You'll learn how to build a 24/7 automated assistant that captures everything you send and delivers it to your team as actionable Trello cards.
The Problem: Task Management Through WhatsApp Doesn't Scale
Mobile professionals rely on WhatsApp for speed. You record a voice note with project instructions while driving. You snap a screenshot of a design issue between meetings. You text a quick reminder during lunch. But WhatsApp wasn't built for task management.
Current challenges:
- Voice notes require manual playback and note-taking by your team
- Images and screenshots disappear in long chat threads
- Text messages lack structure and priority indicators
- No way to track completion status
- Team members miss tasks buried in conversation flow
Business impact:
- Time spent: 2-4 hours per week manually transferring WhatsApp messages to task systems
- Error rate: 30-40% of verbal instructions require clarification
- Delayed response: Average 6-hour lag between message sent and task started
- Lost information: Screenshots and context vanish after 48 hours in active chats
The Solution Overview
This n8n workflow creates a WhatsApp-to-Trello bridge that processes every message type automatically. When you send anything to a designated WhatsApp contact, the workflow transcribes audio using OpenAI Whisper, downloads media files, and creates a Trello card with full content and attachments. The system runs continuously, processing messages within seconds of receipt. It handles German voice transcription, multi-image uploads, video attachments, and text parsing—all without manual intervention.
What You'll Build
This automation handles complete media-to-task conversion with these capabilities:
| Component | Technology | Purpose |
|---|---|---|
| Message Reception | WhatsApp Business API (via webhook) | Receives all incoming messages and media |
| Voice Transcription | OpenAI Whisper API | Converts German audio to text with high accuracy |
| Media Download | n8n HTTP Request nodes | Retrieves images, videos, and audio files from WhatsApp servers |
| Task Creation | Trello API | Creates cards with titles, descriptions, and attachments |
| Content Processing | n8n Function nodes | Parses message types and structures Trello card data |
| Error Handling | n8n Error Trigger | Captures failures and logs issues for review |
Key features:
- Processes voice notes, text, images, screenshots, and videos
- Transcribes audio to text automatically
- Attaches original media files to Trello cards
- Creates structured cards with titles and descriptions
- Supports keyword detection for labels (urgent, tomorrow)
- Operates 24/7 with instant processing
Prerequisites
Before starting, ensure you have:
- n8n instance (cloud or self-hosted version 1.0+)
- WhatsApp Business API account with webhook access
- OpenAI API key with Whisper access
- Trello account with API credentials (key + token)
- Trello board ID and list ID where tasks will be created
- Basic understanding of webhook configuration
- JavaScript knowledge for Function node customization (optional)
Step 1: Configure WhatsApp Webhook Reception
The workflow starts with a Webhook node that receives all WhatsApp messages. This node must be configured as a POST endpoint that WhatsApp can reach.
Set up the Webhook node:
- Add a Webhook node as your workflow trigger
- Set HTTP Method to POST
- Set Path to /whatsapp-webhook (or your preferred endpoint)
- Enable "Respond Immediately" to acknowledge receipt
- Set Response Code to 200
Node configuration:
{
  "httpMethod": "POST",
  "path": "whatsapp-webhook",
  "responseMode": "onReceived",
  "options": {}
}
Configure WhatsApp Business API:
- Log into your WhatsApp Business API provider dashboard
- Navigate to Webhook Configuration
- Enter your n8n webhook URL: https://your-n8n-instance.com/webhook/whatsapp-webhook
- Subscribe to message events: messages, media
messages,media - Verify the webhook with the validation token
Why this works:
WhatsApp sends a POST request to your webhook whenever a message arrives. The immediate response prevents timeout errors. The webhook receives a JSON payload containing message type, sender ID, and media URLs. This structure allows n8n to route different message types to appropriate processing branches.
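If you connect to Meta's Cloud API directly, the verification step arrives as a GET request carrying hub.mode, hub.verify_token, and hub.challenge query parameters, and Meta expects the challenge value echoed back. A POST-only Webhook node will not answer that request, so you will likely need a second Webhook node or a small handler for it. The sketch below shows the check itself; verifyWebhook and its return shape are illustrative names, not part of the original workflow, and your provider's dashboard may handle this differently.

```javascript
// Hypothetical helper: decides how to answer the webhook verification GET.
// `query` is the parsed query string; `expectedToken` is the verify token you chose.
function verifyWebhook(query, expectedToken) {
  const mode = query['hub.mode'];
  const token = query['hub.verify_token'];
  const challenge = query['hub.challenge'];

  if (mode === 'subscribe' && token === expectedToken) {
    // Echo the challenge back as the response body with a 200 status
    return { status: 200, body: String(challenge) };
  }
  // Any mismatch is rejected
  return { status: 403, body: 'Forbidden' };
}
```

In n8n, the returned status and body would map onto a Respond to Webhook node.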
Step 2: Parse Message Type and Extract Content
WhatsApp sends different JSON structures for text, voice, image, and video messages. A Function node analyzes the incoming payload and extracts relevant data.
Add a Function node after the Webhook:
// Extract message data from the WhatsApp webhook payload
const value = $input.item.json.entry?.[0]?.changes?.[0]?.value;
const message = value?.messages?.[0];

// Status notifications (delivered, read) carry no messages array
if (!message) {
  return { json: { messageType: 'none' } };
}

const content = {};

if (message.type === 'text') {
  content.text = message.text.body;
  // The first line, capped at 100 characters, becomes the card title
  content.title = content.text.split('\n')[0].substring(0, 100);
  content.description = content.text;
}

if (message.type === 'audio' || message.type === 'voice') {
  content.audioId = message.audio.id;
  content.mimeType = message.audio.mime_type;
}

if (message.type === 'image') {
  content.imageId = message.image.id;
  content.caption = message.image.caption || '';
}

if (message.type === 'video') {
  content.videoId = message.video.id;
  content.caption = message.video.caption || '';
}

// Flattened, normalized item for the downstream nodes
return {
  json: {
    messageType: message.type,
    messageId: message.id,
    senderId: message.from,
    ...content,
  },
};
Why this approach:
WhatsApp's webhook payload nests data several levels deep. This Function node flattens the structure and normalizes different message types into a consistent format. Later nodes can reference $json.messageType to route processing. For text messages, the first line becomes the Trello card title. For media, the ID enables file download from WhatsApp's servers.
Variables to customize:
- substring(0, 100): Adjust title length limit
- Add additional message types like document or location
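As a rough illustration of that second customization, the sketch below adds branches for document and location messages. The field names (document.id, document.filename, location.latitude, and so on) follow the WhatsApp Cloud API message objects; extractExtraContent and the output keys documentId and fileName are hypothetical names chosen for this example.

```javascript
// Sketch: extra branches for the Step 2 parser.
// `message` is one entry of the webhook's messages array.
function extractExtraContent(message) {
  const content = {};

  if (message.type === 'document') {
    content.documentId = message.document.id;
    content.fileName = message.document.filename || 'document';
    content.caption = message.document.caption || '';
  }

  if (message.type === 'location') {
    const { latitude, longitude, name, address } = message.location;
    // Use the place name as the card title when WhatsApp provides one
    content.title = (name || 'Location from WhatsApp').substring(0, 100);
    // Address (if any) on the first line, coordinates on the next
    content.description = [address, `${latitude}, ${longitude}`]
      .filter(Boolean)
      .join('\n');
  }

  return content;
}
```

The returned object can be spread into the parser's output the same way the existing content object is.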
Step 3: Download and Transcribe Voice Notes
Voice messages require two steps: downloading the audio file from WhatsApp, then transcribing it using OpenAI Whisper.
Download audio file:
Add an HTTP Request node with these settings:
{
  "method": "GET",
  "url": "https://graph.facebook.com/v18.0/{{ $json.audioId }}",
  "authentication": "headerAuth",
  "headerAuth": {
    "name": "Authorization",
    "value": "Bearer YOUR_WHATSAPP_ACCESS_TOKEN"
  },
  "options": {
    "response": {
      "response": {
        "responseFormat": "file"
      }
    }
  }
}
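The media ID lookup returns JSON metadata that includes a temporary download URL. A second HTTP Request node, pointed at that URL with the same Authorization header, retrieves the audio file as binary data. The responseFormat: file setting on the download request is critical: it stores the audio in n8n's binary data format, which the OpenAI node reads for transcription.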
Transcribe with OpenAI Whisper:
Add an OpenAI node configured for audio transcription:
{
  "resource": "audio",
  "operation": "transcribe",
  "model": "whisper-1",
  "binaryPropertyName": "data",
  "options": {
    "language": "de",
    "temperature": 0
  }
}
Critical settings:
- language: "de": Optimizes transcription for German audio
- temperature: 0: Maximizes accuracy by reducing randomness
- binaryPropertyName: "data": Points to the audio file from the previous node
Why this works:
WhatsApp doesn't provide direct audio URLs. You must first request the media ID, which returns a temporary download URL valid for 5 minutes. The HTTP Request node fetches the file before it expires. Whisper processes the binary audio data and returns a text transcription. Setting temperature to 0 keeps transcription as consistent as the model allows, which matters for task instructions.
Common issues:
- Token expiration → Refresh WhatsApp access token every 60 days
- Large audio files → Whisper supports up to 25MB; split longer recordings
- Language detection → Always specify language: "de" for German content
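The lookup-then-download sequence described above can be sketched outside n8n as two requests. This is an illustration under assumptions, not the workflow's actual nodes: downloadWhatsAppMedia is a hypothetical helper, and fetchImpl is an injectable parameter (defaulting to the global fetch in Node 18+) so the flow can be tried without network access.

```javascript
// Sketch of the two-step media download from the WhatsApp Graph API.
async function downloadWhatsAppMedia(mediaId, accessToken, fetchImpl = fetch) {
  const headers = { Authorization: `Bearer ${accessToken}` };

  // Step 1: resolve the media ID to metadata containing a short-lived URL
  const metaRes = await fetchImpl(`https://graph.facebook.com/v18.0/${mediaId}`, { headers });
  if (!metaRes.ok) throw new Error(`Media lookup failed: ${metaRes.status}`);
  const meta = await metaRes.json();

  // Step 2: download the binary from that URL, again with the bearer token
  const fileRes = await fetchImpl(meta.url, { headers });
  if (!fileRes.ok) throw new Error(`Media download failed: ${fileRes.status}`);

  return {
    mimeType: meta.mime_type,
    data: Buffer.from(await fileRes.arrayBuffer()),
  };
}
```

In the workflow itself, this corresponds to two HTTP Request nodes in sequence, with only the second one set to return a file.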
Step 4: Download Images and Videos
Media files follow a similar download pattern but require different handling for Trello attachment.
Add an HTTP Request node for images:
{
  "method": "GET",
  "url": "https://graph.facebook.com/v18.0/{{ $json.imageId }}",
  "authentication": "headerAuth",
  "headerAuth": {
    "name": "Authorization",
    "value": "Bearer YOUR_WHATSAPP_ACCESS_TOKEN"
  },
  "options": {
    "response": {
      "response": {
        "responseFormat": "file"
      }
    }
  }
}
For videos, use identical configuration but replace imageId with videoId.
Why this approach:
Images and videos use the same WhatsApp Graph API endpoint structure. The responseFormat: file setting downloads media as binary data that Trello can accept as attachments. WhatsApp compresses images to reduce bandwidth, so the downloaded file is the compressed version WhatsApp stored, not the sender's original.
Performance consideration:
Large video files may time out during download. Add a timeout setting of 60 seconds:
"options": {
"timeout": 60000,
"response": {
"response": {
"responseFormat": "file"
}
}
}
Step 5: Create Structured Trello Cards
The final step transforms processed content into Trello cards with appropriate titles, descriptions, and attachments.
Add a Trello node:
{
  "resource": "card",
  "operation": "create",
  "boardId": "YOUR_BOARD_ID",
  "listId": "YOUR_LIST_ID",
  "name": "{{ $json.title || 'New Task from WhatsApp' }}",
  "description": "{{ $json.description || $json.text || 'Media attachment included' }}",
  "additionalFields": {
    "attachments": "data"
  }
}
Card structure by message type:
| Message Type | Title | Description | Attachment |
|---|---|---|---|
| Voice note | First 100 chars of transcription | Full transcription text | Original audio file |
| Text message | First line of text | Complete message content | None |
| Image | "Image from WhatsApp" + caption | Caption text or "See attachment" | Image file |
| Video | "Video from WhatsApp" + caption | Caption text or "See attachment" | Video file |
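One way to express the table above in a Function node is a small mapping from message type to card fields. The sketch below is illustrative only: buildCard is a hypothetical helper, msg.transcription is an assumed field holding the Whisper output, and the ": " separator between the prefix and the caption is a guess, since the table doesn't specify one.

```javascript
// Sketch: map a flattened message (from Step 2) to a Trello card name and description.
function buildCard(msg) {
  switch (msg.messageType) {
    case 'audio': {
      const text = msg.transcription || '';
      // First 100 characters of the transcription become the title
      return { name: text.substring(0, 100) || 'Voice note from WhatsApp', desc: text };
    }
    case 'text':
      return { name: msg.title, desc: msg.description };
    case 'image':
    case 'video': {
      const kind = msg.messageType === 'image' ? 'Image' : 'Video';
      const caption = msg.caption || '';
      return {
        name: caption ? `${kind} from WhatsApp: ${caption}` : `${kind} from WhatsApp`,
        desc: caption || 'See attachment',
      };
    }
    default:
      // Same fallbacks as the Trello node expressions above
      return { name: 'New Task from WhatsApp', desc: 'Media attachment included' };
  }
}
```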
Add keyword detection for labels:
Insert a Function node before Trello creation:
// Scan the card text for priority keywords (English and German)
const description = ($json.description || $json.text || '').toLowerCase();
const labels = [];

if (description.includes('urgent') || description.includes('dringend')) {
  labels.push('URGENT_LABEL_ID'); // replace with your Trello label ID
}
if (description.includes('tomorrow') || description.includes('morgen')) {
  labels.push('TOMORROW_LABEL_ID'); // replace with your Trello label ID
}

// Pass the original fields through and add the detected label IDs
return {
  json: {
    ...$json,
    labelIds: labels,
  },
};
Why this works:
Trello's API accepts binary data directly through the attachments parameter. The workflow passes the downloaded media file from previous nodes without additional processing. The card name uses the transcription's first line for voice notes, making tasks scannable in Trello's board view. Labels provide visual priority indicators without manual tagging.
Variables to customize:
- Replace YOUR_BOARD_ID and YOUR_LIST_ID with your Trello board identifiers
- Add more keyword patterns for different labels
- Adjust title length limits based on your Trello board layout
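If the Trello node in your n8n version doesn't expose label or attachment fields the way the configuration above assumes, the same result can be reached through Trello's REST API with HTTP Request nodes. The sketch below only builds the request URLs so the parameters are easy to inspect; createCardUrl and attachmentUrl are hypothetical helper names, while idList, idLabels, and the multipart file field are Trello's own parameter names.

```javascript
// Sketch: request shapes for Trello's REST API. Nothing is sent here.
const TRELLO_BASE = 'https://api.trello.com/1';

// POST this URL to create a card; labelIds comes from the keyword detection step
function createCardUrl({ key, token, listId, name, desc, labelIds = [] }) {
  const params = new URLSearchParams({ key, token, idList: listId, name, desc });
  if (labelIds.length > 0) params.set('idLabels', labelIds.join(','));
  return `${TRELLO_BASE}/cards?${params}`;
}

// POST this URL with a multipart body whose "file" field holds the media
function attachmentUrl({ key, token, cardId }) {
  const params = new URLSearchParams({ key, token });
  return `${TRELLO_BASE}/cards/${cardId}/attachments?${params}`;
}
```

The card ID needed by the second request comes from the response to the first.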
Workflow Architecture Overview
This workflow consists of 12 nodes organized into 4 main sections:
- Message reception (Nodes 1-2): Webhook receives WhatsApp payload, Function node parses message type
- Media processing (Nodes 3-7): Conditional branches download audio/images/videos, Whisper transcribes voice notes
- Content structuring (Nodes 8-10): Function nodes format titles, descriptions, and detect keywords
- Trello delivery (Nodes 11-12): Trello node creates cards with attachments, Error Trigger logs failures
Execution flow:
- Trigger: WhatsApp webhook POST request
- Average run time: 3-8 seconds (depending on media size and transcription length)
- Key dependencies: WhatsApp Business API, OpenAI API, Trello API
Critical nodes:
- Function (Parse Message): Routes workflow based on message type (text/audio/image/video)
- HTTP Request (Download Media): Retrieves media files before WhatsApp URLs expire (5-minute window)
- OpenAI (Whisper): Transcribes German audio with 95%+ accuracy for task instructions
- Trello (Create Card): Delivers structured tasks with all media attachments intact
The complete n8n workflow JSON template is available at the bottom of this article.
Critical Configuration Settings
WhatsApp Business API Integration
Required fields:
- Access Token: Your WhatsApp Business API long-lived token (refresh every 60 days unless you use a non-expiring system user token)
- Phone Number ID: The WhatsApp number receiving messages
- Webhook Verify Token: Random string for webhook validation
- API Version: v18.0 or later, matching the version used in the media download URLs
Common issues:
- Using temporary access tokens → Results in authentication failures after 24 hours; always generate permanent tokens from the WhatsApp Business dashboard
- Webhook verification fails → Ensure verify token matches exactly (case-sensitive)
OpenAI Whisper Configuration
Required fields:
- API Key: Your OpenAI API key with Whisper access
- Model: whisper-1 (the Whisper model the API exposes)
- Language: de for German transcription
- Temperature: 0 for maximum accuracy
Why this approach:
Setting temperature to 0 minimizes randomness in transcription. This helps "Bitte erstelle einen Bericht" transcribe the same way each time, rather than as variations like "Bitte erstell einen Bericht." For task instructions, consistency matters more than creative interpretation. The language parameter improves accuracy by 15-20% compared to auto-detection for German audio.
Variables to customize:
- language: Change to en, es, fr for other languages
- temperature: Increase to 0.2-0.3 if transcriptions seem too rigid
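For reference, these settings map onto form fields of OpenAI's audio transcription endpoint (POST https://api.openai.com/v1/audio/transcriptions). The sketch below builds that multipart form with the language and temperature overridable; buildTranscriptionForm is a hypothetical helper, and it relies on the FormData and Blob globals available in Node 18+.

```javascript
// Sketch: build the multipart form for a Whisper transcription request.
const MAX_WHISPER_BYTES = 25 * 1024 * 1024; // 25MB limit noted in Step 3

function buildTranscriptionForm(audioBytes, fileName, { language = 'de', temperature = 0 } = {}) {
  // Reject oversized audio before building the request
  if (audioBytes.byteLength > MAX_WHISPER_BYTES) {
    throw new Error('Audio exceeds 25MB; split the recording first');
  }
  const form = new FormData();
  form.append('file', new Blob([audioBytes]), fileName);
  form.append('model', 'whisper-1');
  form.append('language', language);
  form.append('temperature', String(temperature));
  return form;
}
```

To send it, pass the form as the body of a fetch call with an Authorization: Bearer header carrying your API key; fetch sets the multipart boundary itself.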
Trello API Configuration
Required fields:
- API Key: Your Trello developer API key
- Token: OAuth token with read/write access to your boards
- Board ID: Found in board URL: trello.com/b/BOARD_ID/board-name
- List ID: Use Trello API to retrieve: GET /1/boards/BOARD_ID/lists
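The lists endpoint returns a JSON array with one object per list, each carrying at least an id and a name. A small lookup like the one below picks out the right ID by name; findListId is a hypothetical helper and "WhatsApp Tasks" is just an example list name.

```javascript
// Sketch: find a list ID by name in the response from GET /1/boards/BOARD_ID/lists.
function findListId(lists, listName) {
  const wanted = listName.trim().toLowerCase();
  const match = lists.find((list) => list.name.trim().toLowerCase() === wanted);
  // null signals that no list with that name exists on the board
  return match ? match.id : null;
}

// Example with a hand-written response
const exampleLists = [
  { id: 'list-1', name: 'Backlog' },
  { id: 'list-2', name: 'WhatsApp Tasks' },
];
const whatsappListId = findListId(exampleLists, 'whatsapp tasks');
```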
Performance optimization:
- Create a dedicated "WhatsApp Tasks" list to avoid cluttering existing workflows
- Use Trello's label system for priority rather than multiple lists
- Archive completed cards weekly to maintain board performance
Testing & Validation
Test each message type individually:
Voice note test: Send a 10-second German voice message saying "Dies ist ein Test für die Spracherkennung"
- Verify: Trello card appears with transcription as title
- Verify: Audio file attached to card
- Expected result: Card created within 5-8 seconds
Text message test: Send a two-line message with "Testaufgabe" on the first line and "Dies ist die Beschreibung" on the second
- Verify: Title is "Testaufgabe"
- Verify: Description contains full text
- Expected result: Card created within 2-3 seconds
Image test: Send a screenshot with caption "Design-Feedback"
- Verify: Image appears as Trello attachment
- Verify: Caption included in description
- Expected result: Card created within 3-5 seconds
Video test: Send a short video file
- Verify: Video attached to card
- Verify: File plays in Trello's media viewer
- Expected result: Card created within 5-10 seconds
Common troubleshooting:
| Issue | Cause | Solution |
|---|---|---|
| No card created | Webhook not receiving messages | Check WhatsApp webhook subscription status |
| Transcription empty | Wrong audio format | Ensure WhatsApp sends audio as .ogg or .mp3 |
| Media not attached | Binary data not passed | Verify responseFormat: file in HTTP Request nodes |
| Timeout errors | Large video files | Increase timeout to 60 seconds in node settings |
Production Deployment Checklist
| Area | Requirement | Why It Matters |
|---|---|---|
| Error Handling | Add Error Trigger node with Slack/email notification | Detect transcription failures within 5 minutes vs discovering them days later |
| Monitoring | Set up n8n execution logging | Track daily message volume and identify processing bottlenecks |
| API Rate Limits | Implement exponential backoff for Trello API | Prevent workflow failures when creating 50+ cards per hour |
| Token Refresh | Calendar reminder for WhatsApp token renewal (60 days) | Avoid sudden authentication failures that break the entire workflow |
| Backup Storage | Store audio files in cloud storage before Trello upload | Preserve original recordings if Trello attachment fails |
| Documentation | Add sticky notes to each node explaining its purpose | Reduce modification time from 2 hours to 20 minutes for future updates |
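For the rate limit row, exponential backoff can be sketched as a small retry wrapper. This is a minimal illustration rather than the workflow's built-in behavior: withBackoff and its options are hypothetical names, the delays (500ms doubling per attempt) are arbitrary defaults, and sleep is injectable so the logic can be tested without waiting.

```javascript
// Sketch: retry an async call with exponentially growing delays.
async function withBackoff(fn, { retries = 4, baseMs = 500, sleep } = {}) {
  const wait = sleep || ((ms) => new Promise((resolve) => setTimeout(resolve, ms)));
  for (let attempt = 0; ; attempt++) {
    try {
      return await fn();
    } catch (err) {
      if (attempt >= retries) throw err; // out of retries: surface the error
      await wait(baseMs * 2 ** attempt); // 500ms, 1s, 2s, 4s with the defaults
    }
  }
}
```

As written, the wrapper retries on any error. In practice you would probably retry only when Trello answers with HTTP 429 and let other failures reach the Error Trigger immediately.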
Real-World Use Cases
Use Case 1: Mobile Project Management
- Industry: Construction/field services
- Scale: 30-50 voice notes per day from job sites
- Modifications needed: Add GPS location extraction from WhatsApp metadata, create separate Trello lists per project site
Use Case 2: Client Communication Tracking
- Industry: Consulting/professional services
- Scale: 100+ client messages per week
- Modifications needed: Parse sender phone number to tag cards by client name, add custom fields for billable hours
Use Case 3: Content Creation Workflow
- Industry: Marketing/media production
- Scale: 20-30 ideas captured daily via voice
- Modifications needed: Add sentiment analysis to prioritize enthusiastic ideas, integrate with content calendar
Use Case 4: Remote Team Coordination
- Industry: Distributed software teams
- Scale: 40-60 task requests per day across time zones
- Modifications needed: Extract @mentions to assign Trello cards automatically, add due date parsing for "by Friday" phrases
Customizing This Workflow
Alternative Integrations
Instead of Trello:
- Asana: Swap Trello node for Asana node—requires project GID instead of board ID, supports subtasks for multi-step voice instructions
- ClickUp: Better for complex task hierarchies—add custom fields for priority scoring, requires Space ID + List ID configuration
- Notion: Use when you need rich text formatting—replace Trello node with Notion database integration, supports inline images in descriptions
Instead of OpenAI Whisper:
- AssemblyAI: Better for speaker diarization (multiple people in audio)—requires separate API key, supports real-time transcription
- Google Speech-to-Text: Lower cost at scale (>10,000 minutes/month)—requires Google Cloud credentials, slightly lower accuracy for German
Workflow Extensions
Add automated reporting:
- Add a Schedule node to run every Monday at 9 AM
- Query Trello API for cards created in past 7 days
- Generate summary email with task counts by label
- Nodes needed: +4 (Schedule, HTTP Request for Trello, Function for formatting, Send Email)
Scale to handle team messages:
- Add a Switch node after message parsing to route by sender phone number
- Create separate Trello lists per team member
- Implement @mention detection to assign cards automatically
- Performance improvement: Handles 200+ messages/day without rate limit issues
Integration possibilities:
| Add This | To Get This | Complexity |
|---|---|---|
| Google Calendar sync | Auto-create events for time-sensitive tasks | Medium (6 nodes) |
| Slack notifications | Real-time alerts when urgent tasks arrive | Easy (2 nodes) |
| Airtable logging | Track message volume and transcription accuracy over time | Medium (5 nodes) |
| Sentiment analysis | Flag frustrated or urgent voice tones automatically | Advanced (8 nodes) |
Get Started Today
Ready to automate your WhatsApp task management?
- Download the template: Scroll to the bottom of this article to copy the n8n workflow JSON
- Import to n8n: Go to Workflows → Import from File, paste the JSON
- Configure your services: Add API credentials for WhatsApp Business API, OpenAI, and Trello
- Test with sample data: Send a test voice note to verify transcription and card creation
- Deploy to production: Activate the workflow and start sending tasks via WhatsApp
Need help customizing this workflow for your specific needs? Schedule an intro call with Atherial.
