CS log
Speech to Text Transcription with the Cloud Speech API
Overview
The Speech-to-Text API lets you transcribe audio speech files to text files in over 80 languages.
In this lab you send an audio file to the Speech API for transcription.
What to learn
In this lab, you explore the following:
- Creating a Speech-to-Text API request and calling the API with curl
- Calling the Speech-to-Text API with audio files in a different language
As in the earlier labs, create a credential (an API key) first, then create a request file like the one below.
{
  "config": {
    "encoding": "FLAC",
    "languageCode": "en-US"
  },
  "audio": {
    "uri": "gs://cloud-samples-data/speech/brooklyn_bridge.flac"
  }
}
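The same request body can also be generated from a script instead of being typed by hand. The following is a minimal sketch, assuming Python 3 with only the standard library; the helper name build_request is hypothetical and not part of the lab.

```python
import json


def build_request(uri, language_code="en-US", encoding="FLAC"):
    """Return a request body with the same shape as request.json above.

    build_request is a hypothetical helper name used for illustration.
    """
    return {
        "config": {"encoding": encoding, "languageCode": language_code},
        "audio": {"uri": uri},
    }


if __name__ == "__main__":
    body = build_request("gs://cloud-samples-data/speech/brooklyn_bridge.flac")
    # Write the body to request.json so curl can send it with --data-binary.
    with open("request.json", "w", encoding="utf-8") as f:
        json.dump(body, f, indent=2)
    print(json.dumps(body, indent=2))
```

Keeping the language code and encoding as parameters makes it easy to reuse the helper for other audio files.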
For the recording above, the API returned the following result.
student-00-7a2303ebe5cf@linux-instance:~$ curl -s -X POST -H "Content-Type: application/json" --data-binary @request.json \
"https://speech.googleapis.com/v1/speech:recognize?key=${API_KEY}" > result.json
student-00-7a2303ebe5cf@linux-instance:~$ cat result.json
{
  "results": [
    {
      "alternatives": [
        {
          "transcript": "how old is the Brooklyn Bridge",
          "confidence": 0.93497634
        }
      ],
      "resultEndTime": "1.770s",
      "languageCode": "en-us"
    }
  ],
  "totalBilledTime": "2s",
  "requestId": "4125953435626707581"
}
- The transcript value is the Speech API's text transcription of your audio file.
- The confidence value indicates how sure the API is that it has transcribed your audio accurately.
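To pull these two fields out of result.json programmatically, something like the following could work. It is a minimal sketch, assuming Python 3 and the response shape shown above; the function name top_transcripts is hypothetical, and the sample string is copied from the English result in this post.

```python
import json
from typing import List, Tuple

# Sample response copied from the English result above.
SAMPLE = """
{
  "results": [
    {
      "alternatives": [
        {
          "transcript": "how old is the Brooklyn Bridge",
          "confidence": 0.93497634
        }
      ],
      "resultEndTime": "1.770s",
      "languageCode": "en-us"
    }
  ],
  "totalBilledTime": "2s",
  "requestId": "4125953435626707581"
}
"""


def top_transcripts(response: dict) -> List[Tuple[str, float]]:
    """Return (transcript, confidence) for the first alternative of each result."""
    pairs = []
    for result in response.get("results", []):
        alternatives = result.get("alternatives", [])
        if alternatives:
            first = alternatives[0]
            # Missing fields fall back to an empty string and 0.0.
            pairs.append((first.get("transcript", ""), first.get("confidence", 0.0)))
    return pairs


if __name__ == "__main__":
    for transcript, confidence in top_transcripts(json.loads(SAMPLE)):
        print(f"{confidence:.2f}  {transcript}")
```

To read the file produced by curl instead of the embedded sample, replace json.loads(SAMPLE) with json.load(open("result.json")).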
Notice that you called the recognize method in the request above, which is the synchronous method. The Speech-to-Text API supports both synchronous and asynchronous speech-to-text transcription.
In this example a complete audio file was used, but the API also offers streaming recognition, which transcribes speech while the user is still speaking.
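The difference between the synchronous and asynchronous REST calls comes down to the method name in the URL: speech:recognize for the synchronous call used in this lab, and speech:longrunningrecognize for the asynchronous one. The following is a minimal sketch, assuming Python 3 and the standard library; make_speech_request is a hypothetical helper, and it only builds the HTTP request without sending it.

```python
import json
import urllib.parse
import urllib.request

BASE_URL = "https://speech.googleapis.com/v1/speech"


def make_speech_request(api_key, body, long_running=False):
    """Build (but do not send) a POST request for the Speech-to-Text REST API.

    long_running=False targets the synchronous speech:recognize method,
    long_running=True targets the asynchronous speech:longrunningrecognize method.
    """
    method = "longrunningrecognize" if long_running else "recognize"
    query = urllib.parse.urlencode({"key": api_key})
    return urllib.request.Request(
        f"{BASE_URL}:{method}?{query}",
        data=json.dumps(body).encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="POST",
    )


if __name__ == "__main__":
    body = {
        "config": {"encoding": "FLAC", "languageCode": "en-US"},
        "audio": {"uri": "gs://cloud-samples-data/speech/brooklyn_bridge.flac"},
    }
    # "YOUR_API_KEY" is a placeholder; sending would be urllib.request.urlopen(req).
    req = make_speech_request("YOUR_API_KEY", body)
    print(req.get_method(), req.full_url)
```

The asynchronous method returns an operation rather than the transcript itself, so a real client would still need to check that operation for the final result.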
Task 4. Speech-to-Text transcription in different languages
student-00-7a2303ebe5cf@linux-instance:~$ cat result.json
{
  "results": [
    {
      "alternatives": [
        {
          "transcript": "maître corbeau sur un arbre perché tenait en son bec un fromage maître Renard par l'odeur alléché lui tint à peu près ce langage et bonjour monsieur du corbeau",
          "confidence": 0.9039431
        }
      ],
      "resultEndTime": "12.720s",
      "languageCode": "fr-fr"
    }
  ],
  "totalBilledTime": "13s",
  "requestId": "7797713163026537027"
}
To transcribe audio in another language, change the languageCode parameter in request.json.
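Switching languages only touches two fields of the request: the language code and the audio URI. The following is a minimal sketch, assuming Python 3; with_language is a hypothetical helper, and the French audio URI is a placeholder because the post does not show the file that was used.

```python
import copy
import json


def with_language(request, language_code, uri):
    """Return a copy of the request with a new language code and audio URI."""
    updated = copy.deepcopy(request)  # leave the original request untouched
    updated["config"]["languageCode"] = language_code
    updated["audio"]["uri"] = uri
    return updated


if __name__ == "__main__":
    english = {
        "config": {"encoding": "FLAC", "languageCode": "en-US"},
        "audio": {"uri": "gs://cloud-samples-data/speech/brooklyn_bridge.flac"},
    }
    # Placeholder URI (assumption): point this at your own French FLAC file.
    french = with_language(english, "fr-FR", "gs://your-bucket/your-french-audio.flac")
    print(json.dumps(french, indent=2))
```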