In most qualitative research studies, such as focus groups or depth interviews, one of the challenges is to transcribe the audio recordings into written text for analysis. The most common method used so far is manual transcription, which is both costly and time-consuming. In this text, we explore the possibilities of using Artificial Intelligence (AI)…
Summary
The article discusses the challenges of transcribing focus group and in-depth interview recordings in languages other than English and explores the possibilities of using Artificial Intelligence (AI) for speech-to-text transcription as an alternative to time-consuming and expensive manual transcription. It recommends the Whisper model from OpenAI as the cheapest option and Microsoft Teams for online interviews. MS Word 365 and Whisper AI are recommended for transcribing recordings of personal interviews and focus groups. The article also emphasizes the importance of high-quality audio and provides tips for improving it.
MS Word for web (365)
- Transcription of recordings (speech to text) / live transcription
- 300 minutes per month
- Required license: Microsoft 365 (verification needed, subject to change)
- Info: https://support.microsoft.com/en-us/office/transcribe-your-recordings-7fc2efec-245e-45f0-b053-2a97531ecf57
- Supported languages: see link above
MS Teams
- a) Live transcription and saving of the transcript (transcribing an existing recording is not supported)
- For customers with the following licenses: Office 365 E1, Office 365 A1, Office 365/Microsoft 365 A3, Office 365/Microsoft 365 A5, Microsoft 365 E3, Microsoft 365 E5, Microsoft 365 F1, Office 365/Microsoft 365 F3, Microsoft 365 Business Basic, Microsoft 365 Business Standard, Microsoft 365 Business Premium SKU.
- Info: EN: https://support.microsoft.com/en-us/office/record-a-meeting-in-teams-34dfbe7f-b07d-4a27-b4c6-de62f1348c24
- b) Saving the audio/video recording; the file is deleted after some time. Details:
- EN: https://support.microsoft.com/en-us/office/record-a-meeting-in-teams-34dfbe7f-b07d-4a27-b4c6-de62f1348c24
- Required license: Office 365 Enterprise E1, E3, E5, F3, A1, A3, A5, M365 Business, Business Premium, or Business Essentials.
Google Speech-to-Text API
- https://cloud.google.com/speech-to-text#section-12
- Speech Recognition (without Data Logging – default): 0-60 Minutes – Free; Over 60 Minutes – $0.024 / minute
- Not tested
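To make the quoted price list easier to apply, here is a minimal sketch of the arithmetic. It uses only the two figures listed above (60 free minutes, then 0.024 USD per minute) and assumes the free allowance applies once per billing period; the function name is hypothetical, and actual billing rules (rounding, increments) may differ since the service was not tested.

```python
FREE_MINUTES = 60         # free allowance quoted above (assumed: once per billing period)
PRICE_PER_MINUTE = 0.024  # USD per minute beyond the free allowance, as quoted above


def google_stt_cost(minutes: float) -> float:
    """Estimated cost in USD for `minutes` of audio at the quoted rates."""
    billable = max(0.0, minutes - FREE_MINUTES)
    return round(billable * PRICE_PER_MINUTE, 2)


if __name__ == "__main__":
    for minutes in (45, 90, 600):
        print(f"{minutes} min -> {google_stt_cost(minutes):.2f} USD")
```

Under these assumptions, a single 90-minute focus group would cost about 0.72 USD and ten hours of recordings about 12.96 USD.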
Web services
- none of the services tested produced a convincing transcription of Czech
- 180 minutes/10 USD, 990 minutes/49 USD
- credit (pay as you go, not monthly payment)
- 0.02 USD/min
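The prices above are easier to compare once converted to a per-minute rate. The sketch below does only that conversion; the labels are placeholders, since the services are not named here, and it assumes the two packages and the pay-as-you-go rate are separate price points.

```python
# Prepaid packages quoted above: label -> (included minutes, price in USD).
# The labels are placeholders; the services are not named in the text.
PACKAGES = {
    "180-minute package": (180, 10.0),
    "990-minute package": (990, 49.0),
}
PAY_AS_YOU_GO_RATE = 0.02  # USD per minute, as quoted above


def price_per_minute(minutes: int, price_usd: float) -> float:
    """Effective price of one minute when the whole package is used up."""
    return price_usd / minutes


def per_minute_table() -> dict:
    """Per-minute rate for every price point, rounded to four decimals."""
    table = {
        label: round(price_per_minute(minutes, price), 4)
        for label, (minutes, price) in PACKAGES.items()
    }
    table["pay-as-you-go credit"] = PAY_AS_YOU_GO_RATE
    return table


if __name__ == "__main__":
    for label, rate in per_minute_table().items():
        print(f"{label}: {rate:.4f} USD/min")
```

The packages work out to roughly 0.056 and 0.049 USD per minute, so the pay-as-you-go credit at 0.02 USD per minute is the cheapest of the three, provided the package minutes would otherwise be fully used.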
Google Recorder (not tested)
- on Pixel phones, saving transcription to the cloud on newer phones
OpenAI Whisper model – custom installation
- advantages: fast transcription, free
- disadvantages: not accurate – requires corrections, no speaker identification (diarization)
- speaker identification (diarization) – the limitation can be worked around through additional modifications
- for reference: transcription of a 10-minute conversation takes 6 minutes of computational time (on Google hardware), but it should fit into the free tier
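A custom installation could look roughly like the sketch below. It assumes the open-source openai-whisper Python package (installed with pip, with ffmpeg available on the system); the "small" model size, the Czech language code "cs", and all helper names are illustrative choices, not the setup used for the measurements above. The time estimate simply scales the reference point of 6 minutes of compute per 10 minutes of audio, which will vary with hardware and model size.

```python
def estimate_compute_minutes(audio_minutes: float, ratio: float = 6 / 10) -> float:
    """Scale the reference point above (10 min of audio ~ 6 min of compute)."""
    return audio_minutes * ratio


def format_timestamp(seconds: float) -> str:
    """Render a position in seconds as HH:MM:SS."""
    hours, rest = divmod(int(seconds), 3600)
    minutes, secs = divmod(rest, 60)
    return f"{hours:02d}:{minutes:02d}:{secs:02d}"


def segments_to_text(segments) -> str:
    """One line per segment, prefixed with its start time.

    Timestamps make manual correction easier; speakers are NOT identified.
    """
    return "\n".join(
        f"[{format_timestamp(seg['start'])}] {seg['text'].strip()}"
        for seg in segments
    )


def transcribe(path: str, model_name: str = "small", language: str = "cs") -> str:
    """Transcribe an audio file with Whisper and return timestamped text."""
    import whisper  # imported lazily; requires the openai-whisper package

    model = whisper.load_model(model_name)
    result = model.transcribe(path, language=language)
    return segments_to_text(result["segments"])
```

Calling transcribe("interview.mp3") (a hypothetical file name) returns the transcript as timestamped lines that still need manual correction. By the reference point above, a 90-minute focus group would need roughly 54 minutes of compute.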
Quality audio is required
- for online interviews, I definitely recommend headphones and a microphone; even basic ones are better than none.
- for live interviews – if we don’t have a professional studio for group interviews – very good recommendations can be found at this link: https://www.indianscribes.com/4-ways-to-improve-focus-group-recordings/
Pay attention to:
- voices not overlapping,
- letting respondents finish their sentences,
- refraining from loudly expressing understanding to the respondent – sticking only to non-verbal expressions (nodding to indicate understanding), even though it can be difficult.
Conclusions:
For online interviews:
- MS Teams
For recordings/personal interviews/focus groups/speech-to-text transcription:
- MS Word 365
- Whisper AI – custom installation
- try to have the best possible audio quality.
Credit: Concept written by a human, text written by a human, translated to English by AI/ChatGPT, summary by AI/ChatGPT