oci-artificial-intelligence/ai-speech/transcribe-live-audio/transcribe-live-audio.md
Lines changed: 75 additions & 30 deletions
@@ -3,26 +3,28 @@
3
3
## Introduction
4
4
In this session, we will help users get familiar with OCI Speech live transcribe and teach them how to use our services via the cloud console.
5
5
6
-
***Estimated Lab Time***: 5 minutes
6
+
***Estimated Lab Time***: 30 minutes
7
7
8
8
### Objectives
9
9
10
10
In this lab, you will:
11
11
- Learn how to transcribe live audio to text from the OCI Console
12
12
- Invoke custom vocabulary (customizations) in the OCI Console
13
+
- Learn how to use OCI AI Speech Realtime Python SDK to create live transcription sessions
13
14
14
15
### Prerequisites:
15
16
- A Free tier or paid tenancy account in OCI (Oracle Cloud Infrastructure)
16
17
- Tenancy is whitelisted to be able to use OCI Speech
17
18
18
19
## Task 1: Navigate to Overview Page
19
20
20
-
Log into OCI Cloud Console. Using the Burger Menu on the top left corner, navigate to Analytics and AI menu and click it, and then select Language item under AI services.
21
+
Log into OCI Console. Using the Burger Menu on the top left corner, navigate to the **Analytics and AI** menu, and then select **Speech** under **AI Services**.
21
22

22
23
23
-
This will navigate you to the transcription jobs overview page.
24
-
On the left you can toggle between overview and transcription jobs listing page.
25
-
Under documentation you can find helpful links relevant to OCI speech service
24
+
This will navigate you to the Speech overview page.
25
+
From the left you can navigate to various OCI Speech offerings.
26
+
27
+
From the Documentation section you can find helpful links relevant to the OCI Speech service.
26
28

27
29
28
30
@@ -46,25 +48,34 @@ Under documentation you can find helpful links relevant to OCI speech service
46
48
47
49
To change transcription parameters, look to the <strong>Configure transcription</strong> menu to the right
48
50
49
-
1. Configure transcription
51
+
### Configure transcription
52
+
53
+
Here you can change parameters such as transcription model type, model domain, audio language, punctuation, partial and final silence thresholds, and partial results stability, and enable customizations.
Here you can change parameters such as transcription model type, audio language, punctuation, partial and final silence thresholds partial results stability and enable customizations
- <strong>Model type:</strong> Use this parameter to select a model to use for generating transcriptions. Currently supported model types are: `ORACLE` and `WHISPER`
57
+
58
+
> Note: Partial results are only supported by the `ORACLE` model
59
+
60
+
- <strong>Model domain:</strong> Use this parameter to configure the transcription model for specialized audio, e.g. audio that features specific medical terminology. Currently supported model domains are: `GENERIC` and `MEDICAL`
53
61
54
-
<strong>Choose domain:</strong> Use this parameter to configure the transcription model for specialized audio, e.g. audio that features specific medial terminology
62
+
> Note: The `MEDICAL` domain is only supported by the `ORACLE` model
55
63
56
-
<strong>Choose language:</strong> Use this parameter to configure the language of the speaker
64
+
- <strong>Language:</strong> Use this parameter to configure the transcription language. The `WHISPER` model supports automatic language detection.
57
65
58
-
<strong>Choose punctuation:</strong> Use this parameter to configure the punctuation mode for the transcription model
66
+
- <strong>Punctuation:</strong> Use this parameter to configure the punctuation mode for the transcription model. Currently supported punctuation modes are: `NONE`, `AUTO` and `SPOKEN`
59
67
60
-
<strong>Partial silence threshold:</strong> Use this parameter to configure how quickly partial results should be
61
-
returned
68
+
> Note: Punctuation mode `SPOKEN` is only supported by the `MEDICAL` domain
69
+
70
+
The following parameters are only supported by the `ORACLE` model:
71
+
72
+
- <strong>Partial silence threshold:</strong> Use this parameter to configure how quickly partial results should be returned. Value ranges from `0` to `2000` milliseconds.
62
73
63
-
<strong>Final silence threshold:</strong> Use this parameter to configure how long to wait before a partial result is finalized
74
+
- <strong>Final silence threshold:</strong> Use this parameter to configure how long to wait before a partial result is finalized. Value ranges from `0` to `5000` milliseconds.
64
75
65
-
<strong>Partial results stability:</strong> Use this parameter to configure the stability of partial results (amount of confidence required before returning a partial result)
76
+
- <strong>Partial results stability:</strong> Use this parameter to configure the stability of partial results (amount of confidence required before returning a partial result). Allowed values are `NONE`, `LOW`, `MEDIUM` and `HIGH`.
66
77
67
-
<strong>Enable customizations:</strong> Check this box to choose a customization to use during your transcription session
78
+
- <strong>Enable customizations:</strong> Check this box to choose a customization to use during your transcription session.
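
The constraints listed above (which domains, punctuation modes and thresholds go with which model) can be checked on the client side before a session is started. The following is a minimal sketch in plain Python, not part of the OCI SDK: the `TranscriptionConfig` class, its field names, its default values and the `validate` function are all hypothetical and exist only to illustrate the rules described in this task.

```python
from dataclasses import dataclass
from typing import List, Optional

# Allowed values, copied from the parameter descriptions in this task.
MODEL_TYPES = {"ORACLE", "WHISPER"}
MODEL_DOMAINS = {"GENERIC", "MEDICAL"}
PUNCTUATION_MODES = {"NONE", "AUTO", "SPOKEN"}
STABILITY_LEVELS = {"NONE", "LOW", "MEDIUM", "HIGH"}


@dataclass
class TranscriptionConfig:
    """Hypothetical container for the console settings (not an OCI SDK class).

    The defaults are placeholders chosen for this sketch, not service defaults.
    """
    model_type: str = "ORACLE"
    model_domain: str = "GENERIC"
    punctuation: str = "NONE"
    partial_silence_threshold_ms: Optional[int] = None   # ORACLE only, 0..2000
    final_silence_threshold_ms: Optional[int] = None     # ORACLE only, 0..5000
    partial_results_stability: Optional[str] = None      # ORACLE only


def validate(cfg: TranscriptionConfig) -> List[str]:
    """Return a list of problems; an empty list means the settings are consistent."""
    errors: List[str] = []

    # Each value must be one of the documented options.
    if cfg.model_type not in MODEL_TYPES:
        errors.append(f"unknown model type: {cfg.model_type}")
    if cfg.model_domain not in MODEL_DOMAINS:
        errors.append(f"unknown model domain: {cfg.model_domain}")
    if cfg.punctuation not in PUNCTUATION_MODES:
        errors.append(f"unknown punctuation mode: {cfg.punctuation}")

    # Cross-parameter rules taken from the notes above.
    if cfg.model_domain == "MEDICAL" and cfg.model_type != "ORACLE":
        errors.append("MEDICAL domain requires the ORACLE model")
    if cfg.punctuation == "SPOKEN" and cfg.model_domain != "MEDICAL":
        errors.append("SPOKEN punctuation requires the MEDICAL domain")

    # Settings that only the ORACLE model supports.
    oracle_only = {
        "partial silence threshold": cfg.partial_silence_threshold_ms,
        "final silence threshold": cfg.final_silence_threshold_ms,
        "partial results stability": cfg.partial_results_stability,
    }
    if cfg.model_type != "ORACLE":
        for name, value in oracle_only.items():
            if value is not None:
                errors.append(f"{name} is only supported by the ORACLE model")

    # Range and value checks for the ORACLE-only settings.
    partial = cfg.partial_silence_threshold_ms
    if partial is not None and not 0 <= partial <= 2000:
        errors.append("partial silence threshold must be between 0 and 2000 ms")
    final = cfg.final_silence_threshold_ms
    if final is not None and not 0 <= final <= 5000:
        errors.append("final silence threshold must be between 0 and 5000 ms")
    stability = cfg.partial_results_stability
    if stability is not None and stability not in STABILITY_LEVELS:
        errors.append(f"unknown partial results stability: {stability}")

    return errors
```

For example, `validate(TranscriptionConfig(model_type="WHISPER", model_domain="MEDICAL"))` reports a problem, because the `MEDICAL` domain is only supported by the `ORACLE` model. Returning a list of messages rather than raising on the first failure lets a caller show every conflicting setting at once.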
68
79
69
80
## Task 4: Enabling a customization
70
81
@@ -99,9 +110,10 @@ Alternatively, both the required packages can be installed using the `requiremen
99
110
pip install -r requirements.txt
100
111
```
101
112
113
+
### Python example:
114
+
102
115
OCI AI Speech live transcription uses websockets to relay audio data and receive text transcriptions in real time. This means your client must implement some key listener functions:
103
116
104
-
<strong>Python example:</strong>
105
117
```
106
118
on_result(result)
107
119
// This function will be called whenever a result is returned from the