Voice Router SDK - Gladia Provider

adapters/gladia-adapter

Classes

GladiaAdapter

Gladia transcription provider adapter

Implements transcription for the Gladia API with support for:

  • Synchronous and asynchronous transcription
  • Speaker diarization (identifying different speakers)
  • Multi-language detection and transcription
  • Summarization and sentiment analysis
  • Custom vocabulary boosting
  • Word-level timestamps

See

Gladia API Documentation: https://docs.gladia.io/

Examples

// Example: basic transcription with diarization
import { GladiaAdapter } from '@meeting-baas/sdk';

const adapter = new GladiaAdapter();
adapter.initialize({
  apiKey: process.env.GLADIA_API_KEY
});

const result = await adapter.transcribe({
  type: 'url',
  url: 'https://example.com/audio.mp3'
}, {
  language: 'en',
  diarization: true
});

console.log(result.data.text);
console.log(result.data.speakers);

// Example: summarization and sentiment analysis
// (assumes `audio` is an AudioInput, e.g. { type: 'url', url: '...' })
const result = await adapter.transcribe(audio, {
  language: 'en',
  summarization: true,
  sentimentAnalysis: true
});

console.log('Summary:', result.data.summary);

Extends

  • BaseAdapter

Methods

buildStreamingRequest()

private buildStreamingRequest(options?): StreamingRequest

Build streaming request with full type safety from OpenAPI specs

Maps normalized options to Gladia streaming request format, including all advanced features like pre-processing, real-time processing, post-processing, and message configuration.

Parameters
  • options?: StreamingOptions

Returns

StreamingRequest

buildTranscriptionRequest()

private buildTranscriptionRequest(audio, options?): InitTranscriptionRequest

Build Gladia transcription request from unified options

Parameters
  • audio: AudioInput
  • options?: TranscribeOptions

Returns

InitTranscriptionRequest

createErrorResponse()

protected createErrorResponse(error, statusCode?, code?): UnifiedTranscriptResponse

Helper method to create error responses with stack traces

Parameters
  • error: unknown - Error object or unknown error
  • statusCode?: number - Optional HTTP status code
  • code?: ErrorCode - Optional error code (defaults to extracted or UNKNOWN_ERROR)

Returns

UnifiedTranscriptResponse

Inherited from

BaseAdapter.createErrorResponse

deleteTranscript()

deleteTranscript(transcriptId, jobType): Promise<{ success: boolean; }>

Delete a transcription job and its associated data

Removes the transcription data from Gladia's servers. This action is irreversible. Supports both pre-recorded and streaming job IDs.

Parameters
  • transcriptId: string - The ID of the transcript/job to delete
  • jobType: "streaming" | "pre-recorded" (default: "pre-recorded") - Type of job: 'pre-recorded' or 'streaming'

Returns

Promise<{ success: boolean; }>

Promise with success status

Examples
// Example: delete a pre-recorded job
const result = await adapter.deleteTranscript('abc123');
if (result.success) {
  console.log('Transcript deleted successfully');
}

// Example: delete a streaming (live) job
const result = await adapter.deleteTranscript('stream-456', 'streaming');

See

https://docs.gladia.io/

extractSpeakers()

private extractSpeakers(transcription): Speaker[] | undefined

Extract speaker information from Gladia response

Parameters
  • transcription: TranscriptionDTO | undefined

Returns

Speaker[] | undefined

extractUtterances()

private extractUtterances(transcription): object[] | undefined

Extract utterances from Gladia response

Parameters
  • transcription: TranscriptionDTO | undefined

Returns

object[] | undefined

extractWords()

private extractWords(transcription): Word[] | undefined

Extract word timestamps from Gladia response

Parameters
  • transcription: TranscriptionDTO | undefined

Returns

Word[] | undefined

getAudioFile()

getAudioFile(transcriptId, jobType): Promise<{ success: boolean; contentType?: string; data?: ArrayBuffer; error?: { code: string; message: string; }; }>

Download the original audio file from a transcription

Gladia stores the audio files used for transcription and allows downloading them. This works for both pre-recorded and streaming (live) transcriptions.

Returns ArrayBuffer for cross-platform compatibility (Node.js and browser).

Parameters
  • transcriptId: string - The ID of the transcript/job
  • jobType: "streaming" | "pre-recorded" (default: "pre-recorded") - Type of job: 'pre-recorded' or 'streaming'

Returns

Promise<{ success: boolean; contentType?: string; data?: ArrayBuffer; error?: { code: string; message: string; }; }>

Promise with the audio file as ArrayBuffer, or error

Examples
// Example: save the audio to a file (Node.js)
const result = await adapter.getAudioFile('abc123');
if (result.success && result.data) {
  const buffer = Buffer.from(result.data);
  fs.writeFileSync('audio.mp3', buffer);
}

// Example: play the audio in the browser
const result = await adapter.getAudioFile('abc123');
if (result.success && result.data) {
  const blob = new Blob([result.data], { type: 'audio/mpeg' });
  const url = URL.createObjectURL(blob);
  audioElement.src = url;
}

// Example: download audio from a streaming (live) job
const result = await adapter.getAudioFile('stream-456', 'streaming');
if (result.success && result.data) {
  console.log('Audio file size:', result.data.byteLength, 'bytes');
}

See

https://docs.gladia.io/

getAxiosConfig()

protected getAxiosConfig(): object

Get axios config for generated API client functions. Configures headers and base URL using Gladia's x-gladia-key header.

Returns

object

baseURL

baseURL: string

headers

headers: Record<string, string>

timeout

timeout: number
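
For reference, a minimal sketch of the returned shape; the baseURL and x-gladia-key header are documented here, while the concrete timeout value is an assumption:

// Hypothetical shape of the returned config (timeout value is assumed, not documented)
const config = {
  baseURL: 'https://api.gladia.io',
  headers: { 'x-gladia-key': process.env.GLADIA_API_KEY ?? '' },
  timeout: 30_000
};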

Overrides

BaseAdapter.getAxiosConfig

getTranscript()

getTranscript(transcriptId): Promise<UnifiedTranscriptResponse<TranscriptionProvider>>

Get transcription result by ID

Parameters
  • transcriptId: string

Returns

Promise<UnifiedTranscriptResponse<TranscriptionProvider>>
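
For example, checking a previously submitted job by its ID (the status check mirrors the webhook example under transcribe() later in this section):

const status = await adapter.getTranscript('abc123');
if (status.data?.status === 'completed') {
  console.log('Transcript:', status.data.text);
}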

Overrides

BaseAdapter.getTranscript

handleWebSocketMessage()

private handleWebSocketMessage(message, callbacks?): void

Handle all WebSocket message types from Gladia streaming

Processes transcript, utterance, speech events, real-time processing results (translation, sentiment, NER), post-processing results (summarization, chapterization), acknowledgments, and lifecycle events.

Parameters
  • message: unknown
  • callbacks?: StreamingCallbacks

Returns

void
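
For illustration, a minimal sketch of the dispatch this performs, using message type strings documented in the Interfaces section and callback names from transcribeStream(); this is not the actual implementation:

// Illustrative sketch only; the real method also handles named entity recognition,
// acknowledgments, and lifecycle events ('start_session', 'start_recording',
// 'end_recording', 'end_session').
function dispatch(message: { type: string; data?: unknown }, callbacks?: StreamingCallbacks): void {
  switch (message.type) {
    case 'transcript': callbacks?.onTranscript?.(message.data as any); break;
    case 'speech_start': callbacks?.onSpeechStart?.(message.data as any); break;
    case 'speech_end': callbacks?.onSpeechEnd?.(message.data as any); break;
    case 'translation': callbacks?.onTranslation?.(message.data as any); break;
    case 'sentiment_analysis': callbacks?.onSentiment?.(message.data as any); break;
    case 'post_summarization': callbacks?.onSummarization?.(message.data as any); break;
    case 'post_chapterization': callbacks?.onChapterization?.(message.data as any); break;
  }
}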

initialize()

initialize(config): void

Initialize the adapter with configuration

Parameters
  • config: ProviderConfig

Returns

void

Inherited from

BaseAdapter.initialize

listTranscripts()

listTranscripts(options?): Promise<{ transcripts: UnifiedTranscriptResponse<TranscriptionProvider>[]; hasMore?: boolean; total?: number; }>

List recent transcriptions with filtering

Retrieves a list of transcription jobs (both pre-recorded and streaming) with optional filtering by status, date, and custom metadata.

Parameters
  • options?: ListTranscriptsOptions - Filtering and pagination options

Returns

Promise<{ transcripts: UnifiedTranscriptResponse<TranscriptionProvider>[]; hasMore?: boolean; total?: number; }>

List of transcripts with pagination info

Examples
// Example: filter by status
const { transcripts, hasMore } = await adapter.listTranscripts({
  limit: 50,
  status: 'done'
});

// Example: filter by date range
const { transcripts } = await adapter.listTranscripts({
  afterDate: '2026-01-01',
  beforeDate: '2026-01-31',
  limit: 100
});

// Example: filter by provider-specific custom metadata
const { transcripts } = await adapter.listTranscripts({
  gladia: {
    custom_metadata: { project: 'my-project' }
  }
});

See

https://docs.gladia.io/

normalizeListItem()

private normalizeListItem(item): UnifiedTranscriptResponse

Normalize a transcript list item to unified format

Parameters

  • item

Returns

UnifiedTranscriptResponse

normalizeResponse()

private normalizeResponse(response): UnifiedTranscriptResponse<"gladia">

Normalize Gladia response to unified format

Parameters
  • response: PreRecordedResponse

Returns

UnifiedTranscriptResponse<"gladia">

pollForCompletion()

protected pollForCompletion(transcriptId, options?): Promise<UnifiedTranscriptResponse<TranscriptionProvider>>

Generic polling helper for async transcription jobs

Polls getTranscript() until job completes or times out.

Parameters
  • transcriptId: string - Job/transcript ID to poll
  • options?: { intervalMs?: number; maxAttempts?: number; } - Polling configuration

Returns

Promise<UnifiedTranscriptResponse<TranscriptionProvider>>

Final transcription result
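
Since the method is protected, a usage sketch applies inside an adapter subclass; the interval and attempt counts are arbitrary example values:

// Inside a BaseAdapter subclass; jobId obtained from a prior transcribe() call.
const final = await this.pollForCompletion(jobId, {
  intervalMs: 3000,  // poll every 3 seconds (arbitrary)
  maxAttempts: 100   // stop after 100 polls (arbitrary)
});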

Inherited from

BaseAdapter.pollForCompletion

transcribe()

transcribe(audio, options?): Promise<UnifiedTranscriptResponse<TranscriptionProvider>>

Submit audio for transcription

Sends audio to Gladia API for transcription. If a webhook URL is provided, returns immediately with the job ID. Otherwise, polls until completion.

Parameters
  • audio: AudioInput - Audio input (currently only URL type supported)
  • options?: TranscribeOptions - Transcription options

Returns

Promise<UnifiedTranscriptResponse<TranscriptionProvider>>

Normalized transcription response

Throws

If audio type is not 'url' (file/stream not yet supported)

Examples
// Example: basic transcription
const result = await adapter.transcribe({
  type: 'url',
  url: 'https://example.com/meeting.mp3'
});

// Example: diarization, summarization, and custom vocabulary
const result = await adapter.transcribe({
  type: 'url',
  url: 'https://example.com/meeting.mp3'
}, {
  language: 'en',
  diarization: true,
  speakersExpected: 3,
  summarization: true,
  customVocabulary: ['API', 'TypeScript', 'JavaScript']
});

// Example: submit transcription with a webhook (returns immediately with the job ID)
const result = await adapter.transcribe({
  type: 'url',
  url: 'https://example.com/meeting.mp3'
}, {
  webhookUrl: 'https://myapp.com/webhook/transcription',
  language: 'en'
});

// Get job ID for polling
const jobId = result.data?.id;
console.log('Job ID:', jobId); // Use this to poll for status

// Later: Poll for completion (if webhook fails or you want to check)
const status = await adapter.getTranscript(jobId);
if (status.data?.status === 'completed') {
  console.log('Transcript:', status.data.text);
}

Overrides

BaseAdapter.transcribe

transcribeStream()

transcribeStream(options?, callbacks?): Promise<StreamingSession>

Stream audio for real-time transcription

Creates a WebSocket connection to Gladia for streaming transcription. First initializes a session via REST API, then connects to WebSocket.

Supports all Gladia streaming features:

  • Real-time transcription with interim/final results
  • Speech detection events (speech_start, speech_end)
  • Real-time translation to other languages
  • Real-time sentiment analysis
  • Real-time named entity recognition
  • Post-processing summarization and chapterization
  • Audio preprocessing (audio enhancement, speech threshold)
  • Custom vocabulary and spelling
  • Multi-language code switching

Parameters

  • options?: StreamingOptions - Streaming configuration options
  • callbacks?: StreamingCallbacks - Event callbacks for transcription results

Returns

Promise<StreamingSession>

Promise that resolves with a StreamingSession

Examples
// Example: basic real-time session
const session = await adapter.transcribeStream({
  encoding: 'linear16', // Unified format - mapped to Gladia's 'wav/pcm'
  sampleRate: 16000,
  channels: 1,
  language: 'en',
  interimResults: true
}, {
  onOpen: () => console.log('Connected'),
  onTranscript: (event) => {
    if (event.isFinal) {
      console.log('Final:', event.text);
    } else {
      console.log('Interim:', event.text);
    }
  },
  onError: (error) => console.error('Error:', error),
  onClose: () => console.log('Disconnected')
});

// Send audio chunks
const audioChunk = getAudioChunk();
await session.sendAudio({ data: audioChunk });

// Close when done
await session.close();

// Example: advanced streaming with real-time and post-processing add-ons
const session = await adapter.transcribeStream({
  encoding: 'linear16', // Use unified format
  sampleRate: 16000,
  language: 'en',
  sentimentAnalysis: true,
  entityDetection: true,
  summarization: true,
  gladiaStreaming: {
    pre_processing: {
      audio_enhancer: true,
      speech_threshold: 0.5
    },
    realtime_processing: {
      translation: true,
      translation_config: { target_languages: ['fr', 'es'] }
    },
    post_processing: {
      chapterization: true
    },
    messages_config: {
      receive_speech_events: true,
      receive_acknowledgments: true,
      receive_lifecycle_events: true
    }
  }
}, {
  onTranscript: (e) => console.log('Transcript:', e.text),
  onSpeechStart: (e) => console.log('Speech started at:', e.timestamp),
  onSpeechEnd: (e) => console.log('Speech ended at:', e.timestamp),
  onTranslation: (e) => console.log(`${e.targetLanguage}: ${e.translatedText}`),
  onSentiment: (e) => console.log('Sentiment:', e.sentiment),
  onEntity: (e) => console.log(`Entity: ${e.type} - ${e.text}`),
  onSummarization: (e) => console.log('Summary:', e.summary),
  onChapterization: (e) => console.log('Chapters:', e.chapters),
  onAudioAck: (e) => console.log('Audio ack:', e.byteRange),
  onLifecycle: (e) => console.log('Lifecycle:', e.eventType)
});

validateConfig()

protected validateConfig(): void

Helper method to validate configuration

Returns

void

Inherited from

BaseAdapter.validateConfig

Constructors

Constructor

new GladiaAdapter(): GladiaAdapter

Returns

GladiaAdapter

Inherited from

BaseAdapter.constructor

Properties

baseUrl

protected baseUrl: string = "https://api.gladia.io"

Base URL for provider API (must be defined by subclass)

Overrides

BaseAdapter.baseUrl

capabilities

readonly capabilities: ProviderCapabilities

Provider capabilities

Overrides

BaseAdapter.capabilities

name

readonly name: "gladia"

Provider name

Overrides

BaseAdapter.name

config?

protected optional config: ProviderConfig

Inherited from

BaseAdapter.config

Functions

createGladiaAdapter()

createGladiaAdapter(config): GladiaAdapter

Factory function to create a Gladia adapter

Parameters

  • config: ProviderConfig

Returns

GladiaAdapter
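
Example (assuming the factory is exported from the same package as GladiaAdapter):

import { createGladiaAdapter } from '@meeting-baas/sdk';

const adapter = createGladiaAdapter({
  apiKey: process.env.GLADIA_API_KEY
});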

preRecordedControllerDeletePreRecordedJobV2()

preRecordedControllerDeletePreRecordedJobV2<TData>(id, options?): Promise<TData>

Type Parameters

  • TData (default: AxiosResponse<void, any>)

Parameters

  • id: string
  • options?: AxiosRequestConfig<any>

Returns

Promise<TData>

preRecordedControllerGetAudioV2()

preRecordedControllerGetAudioV2<TData>(id, options?): Promise<TData>

Type Parameters

  • TData (default: AxiosResponse<Blob, any>)

Parameters

  • id: string
  • options?: AxiosRequestConfig<any>

Returns

Promise<TData>

preRecordedControllerGetPreRecordedJobV2()

preRecordedControllerGetPreRecordedJobV2<TData>(id, options?): Promise<TData>

Type Parameters

  • TData (default: AxiosResponse<PreRecordedResponse, any>)

Parameters

  • id: string
  • options?: AxiosRequestConfig<any>

Returns

Promise<TData>

preRecordedControllerInitPreRecordedJobV2()

preRecordedControllerInitPreRecordedJobV2<TData>(initTranscriptionRequest, options?): Promise<TData>

Type Parameters

  • TData (default: AxiosResponse<InitPreRecordedTranscriptionResponse, any>)

Parameters

  • initTranscriptionRequest: InitTranscriptionRequest
  • options?: AxiosRequestConfig<any>

Returns

Promise<TData>
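
A sketch of calling the generated functions directly; normally the adapter supplies this config via getAxiosConfig(), and it is an assumption here that the init response exposes the job ID as `id`:

const axiosConfig = {
  baseURL: 'https://api.gladia.io',
  headers: { 'x-gladia-key': process.env.GLADIA_API_KEY ?? '' }
};

// Initialize a pre-recorded job, then fetch its result by ID.
const init = await preRecordedControllerInitPreRecordedJobV2(
  { audio_url: 'https://example.com/audio.mp3' },
  axiosConfig
);
const job = await preRecordedControllerGetPreRecordedJobV2(init.data.id, axiosConfig);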

streamingControllerDeleteStreamingJobV2()

streamingControllerDeleteStreamingJobV2<TData>(id, options?): Promise<TData>

Type Parameters

  • TData (default: AxiosResponse<void, any>)

Parameters

  • id: string
  • options?: AxiosRequestConfig<any>

Returns

Promise<TData>

streamingControllerGetAudioV2()

streamingControllerGetAudioV2<TData>(id, options?): Promise<TData>

Type Parameters

  • TData (default: AxiosResponse<Blob, any>)

Parameters

  • id: string
  • options?: AxiosRequestConfig<any>

Returns

Promise<TData>

streamingControllerInitStreamingSessionV2()

streamingControllerInitStreamingSessionV2<TData>(streamingRequest, params?, options?): Promise<TData>

Type Parameters

  • TData (default: AxiosResponse<InitStreamingResponse, any>)

Parameters

  • streamingRequest: StreamingRequest
  • params?: StreamingControllerInitStreamingSessionV2Params
  • options?: AxiosRequestConfig<any>

Returns

Promise<TData>

transcriptionControllerListV2()

transcriptionControllerListV2<TData>(params?, options?): Promise<TData>

Type Parameters

  • TData (default: AxiosResponse<ListTranscriptionResponse, any>)

Parameters

  • params?: TranscriptionControllerListV2Params
  • options?: AxiosRequestConfig<any>

Returns

Promise<TData>

Interfaces

AudioChunkAckMessage

Properties

acknowledged

acknowledged: boolean

Flag to indicate if the action was successfully acknowledged

created_at

created_at: string

Date of creation of the message. The date is formatted as an ISO 8601 string

data

data: AudioChunkAckMessageData

The message data. "null" if the action was not successfully acknowledged

Nullable
error

error: AudioChunkAckMessageError

Error message if the action was not successfully acknowledged

Nullable
session_id

session_id: string

Id of the live session

type

type: "audio_chunk"

EndRecordingMessage

Properties

created_at

created_at: string

Date of creation of the message. The date is formatted as an ISO 8601 string

data

data: EndRecordingMessageData

The message data

session_id

session_id: string

Id of the live session

type

type: "end_recording"

EndSessionMessage

Properties

created_at

created_at: string

Date of creation of the message. The date is formatted as an ISO 8601 string

session_id

session_id: string

Id of the live session

type

type: "end_session"

InitTranscriptionRequest

Properties

audio_url

audio_url: string

URL to a Gladia file or to an external audio or video file

audio_to_llm?

optional audio_to_llm: boolean

[Alpha] Enable audio to llm processing for this audio

audio_to_llm_config?

optional audio_to_llm_config: GladiaAudioToLlmConfig

[Alpha] Audio to llm configuration, if audio_to_llm is enabled

callback?

optional callback: boolean

Enable callback for this transcription. If true, the callback_config property will be used to customize the callback behaviour

callback_config?

optional callback_config: CallbackConfigDto

Customize the callback behaviour (url and http method)

callback_url?

optional callback_url: string

[Deprecated] Use callback/callback_config instead. Callback URL we will do a POST request to with the result of the transcription

Deprecated
chapterization?

optional chapterization: boolean

[Alpha] Enable chapterization for this audio

code_switching_config?

optional code_switching_config: GladiaCodeSwitchingConfig

[Deprecated] Use language_config instead. Specify the configuration for code switching

Deprecated
context_prompt?

optional context_prompt: string

[Deprecated] Context to feed the transcription model with for possible better accuracy

Deprecated
custom_metadata?

optional custom_metadata: InitTranscriptionRequestCustomMetadata

Custom metadata you can attach to this transcription

custom_spelling?

optional custom_spelling: boolean

[Alpha] Enable custom spelling for this audio

custom_spelling_config?

optional custom_spelling_config: CustomSpellingConfigDTO

[Alpha] Custom spelling configuration, if custom_spelling is enabled

custom_vocabulary?

optional custom_vocabulary: boolean

[Beta] Can be either a boolean to enable custom_vocabulary for this audio or an array with a specific vocabulary list to feed to the transcription model

custom_vocabulary_config?

optional custom_vocabulary_config: CustomVocabularyConfigDTO

[Beta] Custom vocabulary configuration, if custom_vocabulary is enabled

detect_language?

optional detect_language: boolean

[Deprecated] Use language_config instead. Detect the language from the given audio

Deprecated
diarization?

optional diarization: boolean

Enable speaker recognition (diarization) for this audio

diarization_config?

optional diarization_config: DiarizationConfigDTO

Speaker recognition configuration, if diarization is enabled

display_mode?

optional display_mode: boolean

[Alpha] Allows changing the output display_mode for this audio. The output will be reordered, creating new utterances when speakers overlap

enable_code_switching?

optional enable_code_switching: boolean

[Deprecated] Use language_config instead. Detect multiple languages in the given audio

Deprecated
language?

optional language: TranscriptionLanguageCodeEnum

[Deprecated] Use language_config instead. Set the spoken language for the given audio (ISO 639 standard)

Deprecated
language_config?

optional language_config: LanguageConfig

Specify the language configuration

moderation?

optional moderation: boolean

[Alpha] Enable moderation for this audio

name_consistency?

optional name_consistency: boolean

[Alpha] Enable names consistency for this audio

named_entity_recognition?

optional named_entity_recognition: boolean

[Alpha] Enable named entity recognition for this audio

punctuation_enhanced?

optional punctuation_enhanced: boolean

[Alpha] Use enhanced punctuation for this audio

sentences?

optional sentences: boolean

Enable sentences for this audio

sentiment_analysis?

optional sentiment_analysis: boolean

Enable sentiment analysis for this audio

structured_data_extraction?

optional structured_data_extraction: boolean

[Alpha] Enable structured data extraction for this audio

structured_data_extraction_config?

optional structured_data_extraction_config: StructuredDataExtractionConfigDTO

[Alpha] Structured data extraction configuration, if structured_data_extraction is enabled

subtitles?

optional subtitles: boolean

Enable subtitles generation for this transcription

subtitles_config?

optional subtitles_config: SubtitlesConfigDTO

Configuration for subtitles generation if subtitles is enabled

summarization?

optional summarization: boolean

[Beta] Enable summarization for this audio

summarization_config?

optional summarization_config: SummarizationConfigDTO

[Beta] Summarization configuration, if summarization is enabled

translation?

optional translation: boolean

[Beta] Enable translation for this audio

translation_config?

optional translation_config: TranslationConfigDTO

[Beta] Translation configuration, if translation is enabled
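
A sketch of a typical request object; only audio_url is required, and the remaining fields are documented optional flags:

const request: InitTranscriptionRequest = {
  audio_url: 'https://example.com/meeting.mp3',
  diarization: true,
  sentiment_analysis: true,
  summarization: true,
  custom_metadata: { project: 'my-project' }
};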

NamedEntityRecognitionMessage

Properties

created_at

created_at: string

Date of creation of the message. The date is formatted as an ISO 8601 string

data

data: NamedEntityRecognitionMessageData

The message data. "null" if the addon failed

Nullable
error

error: NamedEntityRecognitionMessageError

Error message if the addon failed

Nullable
session_id

session_id: string

Id of the live session

type

type: "named_entity_recognition"

PostChapterizationMessage

Properties

created_at

created_at: string

Date of creation of the message. The date is formatted as an ISO 8601 string

data

data: PostChapterizationMessageDataProperty

The message data. "null" if the addon failed

Nullable
error

error: PostChapterizationMessageError

Error message if the addon failed

Nullable
session_id

session_id: string

Id of the live session

type

type: "post_chapterization"

PostFinalTranscriptMessage

Properties

created_at

created_at: string

Date of creation of the message. The date is formatted as an ISO 8601 string

data

data: StreamingTranscriptionResultDTO

The message data

session_id

session_id: string

Id of the live session

type

type: "post_final_transcript"

PostSummarizationMessage

Properties

created_at

created_at: string

Date of creation of the message. The date is formatted as an ISO 8601 string

data

data: PostSummarizationMessageDataProperty

The message data. "null" if the addon failed

Nullable
error

error: PostSummarizationMessageError

Error message if the addon failed

Nullable
session_id

session_id: string

Id of the live session

type

type: "post_summarization"

PostTranscriptMessage

Properties

created_at

created_at: string

Date of creation of the message. The date is formatted as an ISO 8601 string

data

data: TranscriptionDTO

The message data

session_id

session_id: string

Id of the live session

type

type: "post_transcript"

PreRecordedResponse

Properties

created_at

created_at: string

Creation date

id

id: string

Id of the job

kind

kind: "pre-recorded"

post_session_metadata

post_session_metadata: PreRecordedResponsePostSessionMetadata

For debugging purposes, send data that could help to identify issues

request_id

request_id: string

Debug id

status

status: PreRecordedResponseStatus

"queued": the job has been queued. "processing": the job is being processed. "done": the job has been processed and the result is available. "error": an error occurred during the job's processing.

version

version: number

API version

completed_at?

optional completed_at: string | null

Completion date when status is "done" or "error"

Nullable
custom_metadata?

optional custom_metadata: PreRecordedResponseCustomMetadata

Custom metadata given in the initial request

error_code?

optional error_code: number | null

HTTP status code of the error if status is "error"

Minimum

400

Maximum

599

Nullable
file?

optional file: PreRecordedResponseFile

The file data you uploaded. Can be null if status is "error"

Nullable
request_params?

optional request_params: PreRecordedResponseRequestParams

Parameters used for this pre-recorded transcription. Can be null if status is "error"

Nullable
result?

optional result: PreRecordedResponseResult

Pre-recorded transcription's result when status is "done"

Nullable

SentimentAnalysisMessage

Properties

created_at

created_at: string

Date of creation of the message. The date is formatted as an ISO 8601 string

data

data: SentimentAnalysisMessageData

The message data. "null" if the addon failed

Nullable
error

error: SentimentAnalysisMessageError

Error message if the addon failed

Nullable
session_id

session_id: string

Id of the live session

type

type: "sentiment_analysis"

SpeechEndMessage

Properties

created_at

created_at: string

Date of creation of the message. The date is formatted as an ISO 8601 string

data

data: SpeechMessageData

The message data

session_id

session_id: string

Id of the live session

type

type: "speech_end"

SpeechStartMessage

Properties

created_at

created_at: string

Date of creation of the message. The date is formatted as an ISO 8601 string

data

data: SpeechMessageData

The message data

session_id

session_id: string

Id of the live session

type

type: "speech_start"

StartRecordingMessage

Properties

created_at

created_at: string

Date of creation of the message. The date is formatted as an ISO 8601 string

session_id

session_id: string

Id of the live session

type

type: "start_recording"

StartSessionMessage

Properties

created_at

created_at: string

Date of creation of the message. The date is formatted as an ISO 8601 string

session_id

session_id: string

Id of the live session

type

type: "start_session"

StopRecordingAckMessage

Properties

acknowledged

acknowledged: boolean

Flag to indicate if the action was successfully acknowledged

created_at

created_at: string

Date of creation of the message. The date is formatted as an ISO 8601 string

data

data: StopRecordingAckMessageData

The message data. "null" if the action was not successfully acknowledged

Nullable
error

error: StopRecordingAckMessageError

Error message if the action was not successfully acknowledged

Nullable
session_id

session_id: string

Id of the live session

type

type: "stop_recording"

StreamingRequest

Properties

bit_depth?

optional bit_depth: StreamingSupportedBitDepthEnum

The bit depth of the audio stream

callback?

optional callback: boolean

If true, messages will be sent to the configured URL.

callback_config?

optional callback_config: CallbackConfig

Specify the callback configuration

channels?

optional channels: number

The number of channels of the audio stream

Minimum

1

Maximum

8

custom_metadata?

optional custom_metadata: StreamingRequestCustomMetadata

Custom metadata you can attach to this live transcription

encoding?

optional encoding: StreamingSupportedEncodingEnum

The encoding format of the audio stream. Supported formats:

  • PCM: 8, 16, 24, and 32 bits
  • A-law: 8 bits
  • μ-law: 8 bits

Note: No need to add WAV headers to raw audio; the API supports both raw and WAV formats.

endpointing?

optional endpointing: number

The endpointing duration in seconds. Endpointing is the duration of silence after which an utterance is considered finished

Minimum

0.01

Maximum

10

language_config?

optional language_config: LanguageConfig

Specify the language configuration

maximum_duration_without_endpointing?

optional maximum_duration_without_endpointing: number

The maximum duration in seconds without endpointing. If endpointing is not detected within this duration, the current utterance is considered finished

Minimum

5

Maximum

60

messages_config?

optional messages_config: MessagesConfig

Specify the websocket messages configuration

model?

optional model: "solaria-1"

The model used to process the audio. "solaria-1" is used by default.

post_processing?

optional post_processing: PostProcessingConfig

Specify the post-processing configuration

pre_processing?

optional pre_processing: PreProcessingConfig

Specify the pre-processing configuration

realtime_processing?

optional realtime_processing: RealtimeProcessingConfig

Specify the realtime processing configuration

sample_rate?

optional sample_rate: StreamingSupportedSampleRateEnum

The sample rate of the audio stream
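
A sketch of a streaming session request using the documented enum values; all fields are optional:

const streamingRequest: StreamingRequest = {
  encoding: 'wav/pcm',   // StreamingSupportedEncodingEnum
  bit_depth: 16,         // StreamingSupportedBitDepthEnum
  sample_rate: 16000,    // StreamingSupportedSampleRateEnum
  channels: 1,
  model: 'solaria-1',
  endpointing: 0.5,      // seconds of silence that ends an utterance (0.01-10)
  custom_metadata: { project: 'my-project' }
};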

StreamingResponse

Properties

created_at

created_at: string

Creation date

id

id: string

Id of the job

kind

kind: "live"

post_session_metadata

post_session_metadata: StreamingResponsePostSessionMetadata

For debugging purposes, send data that could help to identify issues

request_id

request_id: string

Debug id

status

status: StreamingResponseStatus

"queued": the job has been queued. "processing": the job is being processed. "done": the job has been processed and the result is available. "error": an error occurred during the job's processing.

version

version: number

API version

completed_at?

optional completed_at: string | null

Completion date when status is "done" or "error"

Nullable
custom_metadata?

optional custom_metadata: StreamingResponseCustomMetadata

Custom metadata given in the initial request

error_code?

optional error_code: number | null

HTTP status code of the error if status is "error"

Minimum

400

Maximum

599

Nullable
file?

optional file: StreamingResponseFile

The file data you uploaded. Can be null if status is "error"

Nullable
request_params?

optional request_params: StreamingResponseRequestParams

Parameters used for this live transcription. Can be null if status is "error"

Nullable
result?

optional result: StreamingResponseResult

Live transcription's result when status is "done"

Nullable

TranscriptionDTO

Properties

full_transcript

full_transcript: string

The full transcription in plain text, without any other information

languages

languages: TranscriptionLanguageCodeEnum[]

All the languages detected in the audio, sorted from most to least detected

utterances

utterances: UtteranceDTO[]

Transcribed speech utterances present in the audio

sentences?

optional sentences: SentencesDTO[]

If sentences is enabled, the sentence results

subtitles?

optional subtitles: SubtitleDTO[]

If subtitles is enabled, the subtitle results

TranscriptMessage

Properties

created_at

created_at: string

Date of creation of the message. The date is formatted as an ISO 8601 string

data

data: TranscriptMessageData

The message data

session_id

session_id: string

Id of the live session

type

type: "transcript"

TranslationMessage

Properties

created_at

created_at: string

Date of creation of the message. The date is formatted as an ISO 8601 string

data

data: TranslationMessageData

The message data. "null" if the addon failed

Nullable
error

error: TranslationMessageError

Error message if the addon failed

Nullable
session_id

session_id: string

Id of the live session

type

type: "translation"

UtteranceDTO

Properties

channel

channel: number

Audio channel this utterance was transcribed from

Minimum

0

confidence

confidence: number

Confidence on the transcribed utterance (1 = 100% confident)

end

end: number

End timestamp in seconds of this utterance

language

language: TranscriptionLanguageCodeEnum

Spoken language in this utterance

start

start: number

Start timestamp in seconds of this utterance

text

text: string

Transcription for this utterance

words

words: WordDTO[]

List of words of the utterance, split by timestamp

speaker?

optional speaker: number

If diarization is enabled, the speaker identification number

Minimum

0

WordDTO

Properties

confidence

confidence: number

Confidence on the transcribed word (1 = 100% confident)

end

end: number

End timestamp in seconds of the spoken word

start

start: number

Start timestamp in seconds of the spoken word

word

word: string

Spoken word
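
For example, iterating utterances and their words, assuming `transcription` is a TranscriptionDTO:

for (const u of transcription.utterances) {
  const speaker = u.speaker !== undefined ? `speaker ${u.speaker}` : 'unknown speaker';
  console.log(`[${u.start}s-${u.end}s] ${speaker}: ${u.text}`);
  for (const w of u.words) {
    console.log(`  ${w.word} (${w.start}s-${w.end}s, confidence ${w.confidence})`);
  }
}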

Type Aliases

ListTranscriptionResponseItemsItem

ListTranscriptionResponseItemsItem = PreRecordedResponse | StreamingResponse

StreamingSupportedBitDepthEnum

StreamingSupportedBitDepthEnum = typeof StreamingSupportedBitDepthEnum[keyof typeof StreamingSupportedBitDepthEnum]

The bit depth of the audio stream

StreamingSupportedEncodingEnum

StreamingSupportedEncodingEnum = object

The encoding format of the audio stream. Supported formats:

  • PCM: 8, 16, 24, and 32 bits
  • A-law: 8 bits
  • μ-law: 8 bits

Note: No need to add WAV headers to raw audio; the API supports both raw and WAV formats.

Properties

wav/alaw

readonly wav/alaw: "wav/alaw" = 'wav/alaw'

wav/pcm

readonly wav/pcm: "wav/pcm" = 'wav/pcm'

wav/ulaw

readonly wav/ulaw: "wav/ulaw" = 'wav/ulaw'

StreamingSupportedEncodingEnum

StreamingSupportedEncodingEnum = typeof StreamingSupportedEncodingEnum[keyof typeof StreamingSupportedEncodingEnum]

The encoding format of the audio stream. Supported formats:

  • PCM: 8, 16, 24, and 32 bits
  • A-law: 8 bits
  • μ-law: 8 bits

Note: No need to add WAV headers to raw audio; the API supports both raw and WAV formats.

StreamingSupportedModels

StreamingSupportedModels = object

The model used to process the audio. "solaria-1" is used by default.

Properties

solaria-1

readonly solaria-1: "solaria-1" = 'solaria-1'

StreamingSupportedModels

StreamingSupportedModels = typeof StreamingSupportedModels[keyof typeof StreamingSupportedModels]

The model used to process the audio. "solaria-1" is used by default.

StreamingSupportedSampleRateEnum

StreamingSupportedSampleRateEnum = typeof StreamingSupportedSampleRateEnum[keyof typeof StreamingSupportedSampleRateEnum]

The sample rate of the audio stream

TranscriptionControllerListV2Params

TranscriptionControllerListV2Params = object

Properties

after_date?

optional after_date: string

Filter for items after the specified date. Use with before_date for a range. Date in ISO format.

before_date?

optional before_date: string

Include items that occurred before the specified date in ISO format.

custom_metadata?

optional custom_metadata: object

Index Signature

[key: string]: unknown

date?

optional date: string

Filter items relevant to a specific date in ISO format (YYYY-MM-DD).

kind?

optional kind: TranscriptionControllerListV2KindItem[]

Filter the list based on the item type. Supports multiple values from the predefined list.

limit?

optional limit: number

The maximum number of items to return. Useful for pagination and controlling data payload size.

offset?

optional offset: number

The starting point for pagination. A value of 0 starts from the first item.

status?

optional status: TranscriptionControllerListV2StatusItem[]

Filter the list based on item status. Accepts multiple values from the predefined list.
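
A sketch of typical list parameters; the status values come from TranscriptionControllerListV2StatusItem below:

const params: TranscriptionControllerListV2Params = {
  after_date: '2026-01-01',
  before_date: '2026-01-31',
  status: ['done'],
  limit: 50,
  offset: 0
};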

TranscriptionControllerListV2StatusItem

TranscriptionControllerListV2StatusItem = typeof TranscriptionControllerListV2StatusItem[keyof typeof TranscriptionControllerListV2StatusItem]

TranscriptionLanguageCodeEnum

TranscriptionLanguageCodeEnum = object

Specify the language in which it will be pronounced when sound comparison occurs. Defaults to the transcription language.

Properties

af

readonly af: "af" = 'af'

am

readonly am: "am" = 'am'

ar

readonly ar: "ar" = 'ar'

as

readonly as: "as" = 'as'

az

readonly az: "az" = 'az'

ba

readonly ba: "ba" = 'ba'

be

readonly be: "be" = 'be'

bg

readonly bg: "bg" = 'bg'

bn

readonly bn: "bn" = 'bn'

bo

readonly bo: "bo" = 'bo'

br

readonly br: "br" = 'br'

bs

readonly bs: "bs" = 'bs'

ca

readonly ca: "ca" = 'ca'

cs

readonly cs: "cs" = 'cs'

cy

readonly cy: "cy" = 'cy'

da

readonly da: "da" = 'da'

de

readonly de: "de" = 'de'

el

readonly el: "el" = 'el'

en

readonly en: "en" = 'en'

es

readonly es: "es" = 'es'

et

readonly et: "et" = 'et'

eu

readonly eu: "eu" = 'eu'

fa

readonly fa: "fa" = 'fa'

fi

readonly fi: "fi" = 'fi'

fo

readonly fo: "fo" = 'fo'

fr

readonly fr: "fr" = 'fr'

gl

readonly gl: "gl" = 'gl'

gu

readonly gu: "gu" = 'gu'

ha

readonly ha: "ha" = 'ha'

haw

readonly haw: "haw" = 'haw'

he

readonly he: "he" = 'he'

hi

readonly hi: "hi" = 'hi'

hr

readonly hr: "hr" = 'hr'

ht

readonly ht: "ht" = 'ht'

hu

readonly hu: "hu" = 'hu'

hy

readonly hy: "hy" = 'hy'

id

readonly id: "id" = 'id'

is

readonly is: "is" = 'is'

it

readonly it: "it" = 'it'

ja

readonly ja: "ja" = 'ja'

jw

readonly jw: "jw" = 'jw'

ka

readonly ka: "ka" = 'ka'

kk

readonly kk: "kk" = 'kk'

km

readonly km: "km" = 'km'

kn

readonly kn: "kn" = 'kn'

ko

readonly ko: "ko" = 'ko'

la

readonly la: "la" = 'la'

lb

readonly lb: "lb" = 'lb'

ln

readonly ln: "ln" = 'ln'

lo

readonly lo: "lo" = 'lo'

lt

readonly lt: "lt" = 'lt'

lv

readonly lv: "lv" = 'lv'

mg

readonly mg: "mg" = 'mg'

mi

readonly mi: "mi" = 'mi'

mk

readonly mk: "mk" = 'mk'

ml

readonly ml: "ml" = 'ml'

mn

readonly mn: "mn" = 'mn'

mr

readonly mr: "mr" = 'mr'

ms

readonly ms: "ms" = 'ms'

mt

readonly mt: "mt" = 'mt'

my

readonly my: "my" = 'my'

ne

readonly ne: "ne" = 'ne'

nl

readonly nl: "nl" = 'nl'

nn

readonly nn: "nn" = 'nn'

no

readonly no: "no" = 'no'

oc

readonly oc: "oc" = 'oc'

pa

readonly pa: "pa" = 'pa'

pl

readonly pl: "pl" = 'pl'

ps

readonly ps: "ps" = 'ps'

pt

readonly pt: "pt" = 'pt'

ro

readonly ro: "ro" = 'ro'

ru

readonly ru: "ru" = 'ru'

sa

readonly sa: "sa" = 'sa'

sd

readonly sd: "sd" = 'sd'

si

readonly si: "si" = 'si'

sk

readonly sk: "sk" = 'sk'

sl

readonly sl: "sl" = 'sl'

sn

readonly sn: "sn" = 'sn'

so

readonly so: "so" = 'so'

sq

readonly sq: "sq" = 'sq'

sr

readonly sr: "sr" = 'sr'

su

readonly su: "su" = 'su'

sv

readonly sv: "sv" = 'sv'

sw

readonly sw: "sw" = 'sw'

ta

readonly ta: "ta" = 'ta'

te

readonly te: "te" = 'te'

tg

readonly tg: "tg" = 'tg'

th

readonly th: "th" = 'th'

tk

readonly tk: "tk" = 'tk'

tl

readonly tl: "tl" = 'tl'

tr

readonly tr: "tr" = 'tr'

tt

readonly tt: "tt" = 'tt'

uk

readonly uk: "uk" = 'uk'

ur

readonly ur: "ur" = 'ur'

uz

readonly uz: "uz" = 'uz'

vi

readonly vi: "vi" = 'vi'

yi

readonly yi: "yi" = 'yi'

yo

readonly yo: "yo" = 'yo'

zh

readonly zh: "zh" = 'zh'

TranscriptionLanguageCodeEnum

TranscriptionLanguageCodeEnum = typeof TranscriptionLanguageCodeEnum[keyof typeof TranscriptionLanguageCodeEnum]

Specify the language in which it will be pronounced when sound comparison occurs. Defaults to the transcription language.

Variables

StreamingSupportedBitDepthEnum

StreamingSupportedBitDepthEnum: object

Type Declaration

NUMBER_16

readonly NUMBER_16: 16 = 16

NUMBER_24

readonly NUMBER_24: 24 = 24

NUMBER_32

readonly NUMBER_32: 32 = 32

NUMBER_8

readonly NUMBER_8: 8 = 8

StreamingSupportedSampleRateEnum

StreamingSupportedSampleRateEnum: object

Type Declaration

NUMBER_16000

readonly NUMBER_16000: 16000 = 16000

NUMBER_32000

readonly NUMBER_32000: 32000 = 32000

NUMBER_44100

readonly NUMBER_44100: 44100 = 44100

NUMBER_48000

readonly NUMBER_48000: 48000 = 48000

NUMBER_8000

readonly NUMBER_8000: 8000 = 8000

TranscriptionControllerListV2StatusItem

TranscriptionControllerListV2StatusItem: object

Type Declaration

done

readonly done: "done" = 'done'

error

readonly error: "error" = 'error'

processing

readonly processing: "processing" = 'processing'

queued

readonly queued: "queued" = 'queued'
