Transcribe speech to text from any audio file, in multiple languages, on Faro's own infrastructure and priced below the major transcription APIs.
Audio & Video
Transcribe spoken audio to text from any common audio file, with automatic language detection, punctuation, and capitalization. Works with recordings, voice notes, calls, meetings, podcasts, and more. Priced per minute of audio, with no per-file minimums.
You have an audio file and need its spoken words as text.
Not for generating speech from text, translating, separating speakers, or analyzing non-speech audio.
Each is a sub-skill of Speech to Text; the router picks the right one for your request.
Returns the transcript plus the audio duration. Billing is 1.5 credits per minute of audio.
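The per-minute rate above makes costs easy to estimate ahead of time. The following is a minimal sketch, not Faro's billing logic: the function name `estimate_credits` is hypothetical, and it assumes fractional minutes are billed pro rata. The source does not say whether durations are rounded, so the optional `round_up_minutes` flag shows the alternative.

```python
import math

CREDITS_PER_MINUTE = 1.5  # rate stated in this section


def estimate_credits(duration_seconds: float, round_up_minutes: bool = False) -> float:
    """Estimate the credits billed for one transcription.

    Assumption: billing is proportional to audio duration. Whether Faro
    rounds partial minutes is not stated, so rounding is opt-in here.
    """
    if duration_seconds < 0:
        raise ValueError("duration_seconds must be non-negative")
    minutes = duration_seconds / 60
    if round_up_minutes:
        minutes = math.ceil(minutes)
    return minutes * CREDITS_PER_MINUTE
```

Under the pro rata assumption, a 10-minute recording (600 seconds) comes to 15 credits and a 90-second voice note to 2.25 credits.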
Skills run through one gateway with your Faro token. Hand it an intent in plain language; Faro routes to the right sub-skill, runs it, and bills per call.
curl -X POST "https://skill.askfaro.com/skills/speech-to-text/run" \
-H "Authorization: Bearer $FARO_TOKEN" \
-H "Content-Type: application/json" \
-d '{"intent":{"prompt":"Transcribe this voice note for me."}}'
Example requests
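As a Python counterpart to the curl call in this section, here is a minimal sketch using only the standard library. The endpoint, headers, and JSON body are copied from the curl command; the helper names `build_request` and `run_skill` are hypothetical, and the assumption that the gateway replies with a JSON object is not confirmed by the source. How the audio file itself is attached is not shown in the source, so the sketch mirrors the curl call only.

```python
import json
import urllib.request

# Endpoint copied from the curl example in this section.
FARO_RUN_URL = "https://skill.askfaro.com/skills/speech-to-text/run"


def build_request(token: str, prompt: str) -> urllib.request.Request:
    """Build the same POST request as the curl example, without sending it."""
    body = json.dumps({"intent": {"prompt": prompt}}).encode("utf-8")
    return urllib.request.Request(
        FARO_RUN_URL,
        data=body,
        method="POST",
        headers={
            "Authorization": f"Bearer {token}",
            "Content-Type": "application/json",
        },
    )


def run_skill(token: str, prompt: str, timeout: float = 60.0) -> dict:
    """Send the request and parse the reply.

    Assumption: the gateway answers with a JSON object. Its field names
    are not documented here, so none are read.
    """
    request = build_request(token, prompt)
    with urllib.request.urlopen(request, timeout=timeout) as response:
        return json.loads(response.read().decode("utf-8"))
```

A call would look like `run_skill(os.environ["FARO_TOKEN"], "Transcribe this voice note for me.")` after `import os`. Keeping request construction separate from sending makes the payload easy to inspect before any credits are spent.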