Voice Router SDK - Azure Speech-to-Text Provider / adapters/azure-stt-adapter

Classes

AzureSTTAdapter

Azure Speech-to-Text transcription provider adapter

Implements transcription for Azure Cognitive Services Speech API with support for:

Batch transcription (async processing)
Speaker diarization (identifying different speakers)
Multi-language support
Custom models and acoustic models
Word-level timestamps
Profanity filtering
Punctuation and capitalization

See

https://learn.microsoft.com/azure/cognitive-services/speech-service/ Azure Speech Documentation

Examples

import { AzureSTTAdapter } from '@meeting-baas/sdk';

const adapter = new AzureSTTAdapter();
adapter.initialize({
  apiKey: process.env.AZURE_SPEECH_KEY,
  region: 'eastus' // Your Azure region
});

const result = await adapter.transcribe({
  type: 'url',
  url: 'https://example.com/audio.mp3'
}, {
  language: 'en-US',
  diarization: true
});

console.log(result.data.text);

// Custom model: `audio` is an audio input such as the URL object shown above
const customResult = await adapter.transcribe(audio, {
  language: 'en-US',
  diarization: true,
  metadata: {
    modelId: 'custom-model-id'
  }
});

// Submit transcription (Azure is always async)
const submission = await adapter.transcribe({
  type: 'url',
  url: 'https://example.com/audio.mp3'
}, {
  language: 'en-US',
  diarization: true
});

// Get transcription ID for polling
const transcriptionId = submission.data?.id;
console.log('Transcription ID:', transcriptionId);

// Poll for completion: resolves once the job completes, throws if it fails
const poll = async () => {
  if (!transcriptionId) throw new Error('No transcription ID returned');
  for (;;) {
    const status = await adapter.getTranscript(transcriptionId);
    const state = status.data?.status;
    if (state === 'completed') {
      console.log('Transcript:', status.data.text);
      return;
    }
    if (state !== 'queued' && state !== 'processing') {
      throw new Error(`Transcription failed (status: ${state})`);
    }
    await new Promise((resolve) => setTimeout(resolve, 5000)); // Poll every 5 seconds
  }
};
await poll();

Extends

BaseAdapter

Methods

buildTranscriptionProperties()

private buildTranscriptionProperties(options?): TranscriptionProperties

Build Azure-specific transcription properties using generated types

Parameters

Parameter	Type
`options?`	`TranscribeOptions`

Returns

TranscriptionProperties
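As a rough illustration of what such a mapping could look like, the sketch below translates two unified options into Azure batch transcription properties. The property names `diarizationEnabled` and `wordLevelTimestampsEnabled` come from Azure's batch transcription REST API. The `SketchOptions` type and the `wordTimestamps` option name are hypothetical, and the adapter's actual mapping may differ.

```typescript
// Hypothetical subset of TranscribeOptions; only `diarization` appears in this page's examples.
interface SketchOptions {
  diarization?: boolean;
  wordTimestamps?: boolean; // assumed option name, for illustration only
}

// Subset of Azure batch transcription properties (REST API field names).
interface SketchProperties {
  diarizationEnabled?: boolean;
  wordLevelTimestampsEnabled?: boolean;
}

// Copy only the options the caller actually set, so Azure's own defaults apply otherwise.
function buildPropertiesSketch(options: SketchOptions = {}): SketchProperties {
  const properties: SketchProperties = {};
  if (options.diarization !== undefined) {
    properties.diarizationEnabled = options.diarization;
  }
  if (options.wordTimestamps !== undefined) {
    properties.wordLevelTimestampsEnabled = options.wordTimestamps;
  }
  return properties;
}
```

Leaving unset options out of the result, rather than sending explicit `false` values, keeps the request body minimal.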

createErrorResponse()

protected createErrorResponse(error, statusCode?, code?): UnifiedTranscriptResponse

Helper method to create error responses with stack traces

Parameters

Parameter	Type	Description
`error`	`unknown`	Error object or unknown error
`statusCode?`	`number`	Optional HTTP status code
`code?`	`ErrorCode`	Optional error code (defaults to extracted or UNKNOWN_ERROR)

Returns

UnifiedTranscriptResponse

Inherited from

BaseAdapter.createErrorResponse

deleteTranscript()

deleteTranscript(transcriptId): Promise<{ success: boolean; }>

Delete a transcription and its associated data

Removes the transcription from Azure's servers. This action is irreversible.

Parameters

Parameter	Type	Description
`transcriptId`	`string`	The ID of the transcription to delete

Returns

Promise<{ success: boolean; }>

Promise with success status

Example

const result = await adapter.deleteTranscript('abc123-def456');
if (result.success) {
  console.log('Transcription deleted successfully');
}

See

https://learn.microsoft.com/azure/cognitive-services/speech-service/batch-transcription

deriveWsUrl()

protected deriveWsUrl(httpUrl): string

Derive a WebSocket URL from an HTTP base URL

Converts https:// → wss:// and http:// → ws://

Parameters

Parameter	Type
`httpUrl`	`string`

Returns

string

Inherited from

BaseAdapter.deriveWsUrl
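The scheme conversion described above can be sketched as follows. This is a standalone illustration, not the BaseAdapter source; the function name `deriveWsUrlSketch` and the decision to return other schemes unchanged are assumptions.

```typescript
// Swap the HTTP scheme for its WebSocket counterpart, keeping host, path and query intact.
function deriveWsUrlSketch(httpUrl: string): string {
  if (httpUrl.startsWith("https://")) {
    return "wss://" + httpUrl.slice("https://".length);
  }
  if (httpUrl.startsWith("http://")) {
    return "ws://" + httpUrl.slice("http://".length);
  }
  return httpUrl; // assumption: anything else is returned unchanged
}

const wsUrl = deriveWsUrlSketch("https://example.com/v1"); // "wss://example.com/v1"
```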

getAxiosConfig()

protected getAxiosConfig(): object

Get the axios config used by the generated API client functions. Configures the base URL and request headers using the Azure subscription key.

Returns

object

baseURL

baseURL: string

headers

headers: Record<string, string>

timeout

timeout: number

Overrides

BaseAdapter.getAxiosConfig
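A minimal sketch of an object with this shape is shown below. `Ocp-Apim-Subscription-Key` is the standard header for Azure subscription keys. The base URL pattern, the API version in the path, and the 60-second default timeout are assumptions made for illustration; the adapter may use different values.

```typescript
interface AxiosConfigSketch {
  baseURL: string;
  headers: Record<string, string>;
  timeout: number;
}

// Build a config from the key and region passed to initialize().
function buildAxiosConfigSketch(
  apiKey: string,
  region: string,
  timeoutMs: number = 60_000, // assumed default
): AxiosConfigSketch {
  return {
    // Assumed regional endpoint pattern and API version.
    baseURL: `https://${region}.api.cognitive.microsoft.com/speechtotext/v3.1`,
    headers: {
      "Ocp-Apim-Subscription-Key": apiKey,
      "Content-Type": "application/json",
    },
    timeout: timeoutMs,
  };
}
```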

getTranscript()

getTranscript(transcriptId): Promise<UnifiedTranscriptResponse<TranscriptionProvider>>

Get transcription result by ID

Poll this method to check transcription status and retrieve results.

Parameters

Parameter	Type	Description
`transcriptId`	`string`	Transcription ID from Azure

Returns

Promise<UnifiedTranscriptResponse<TranscriptionProvider>>

Transcription response with status and results

Overrides

BaseAdapter.getTranscript

initialize()

initialize(config): void

Initialize the adapter with configuration

Parameters

Parameter	Type
`config`	`ProviderConfig` & `object`

Returns

void

Overrides

BaseAdapter.initialize

listTranscripts()

listTranscripts(options?): Promise<{ transcripts: UnifiedTranscriptResponse<TranscriptionProvider>[]; hasMore?: boolean; total?: number; }>

List recent transcriptions with filtering

Retrieves a list of transcription jobs for the authenticated subscription. Azure uses OData filtering for advanced queries.

Parameters

Parameter	Type	Description
`options?`	`ListTranscriptsOptions`	Filtering and pagination options

Returns

Promise<{ transcripts: UnifiedTranscriptResponse<TranscriptionProvider>[]; hasMore?: boolean; total?: number; }>

List of transcripts with pagination info

Examples

const { transcripts, hasMore } = await adapter.listTranscripts({
  limit: 50
})

const { transcripts } = await adapter.listTranscripts({
  status: 'Succeeded',
  limit: 100
})

See

https://learn.microsoft.com/azure/cognitive-services/speech-service/batch-transcription
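To make the OData filtering concrete, the sketch below shows how list options might be turned into query parameters. The `filter`, `top` and `skip` parameter names and the `status` and `createdDateTime` filter properties follow Azure's list-transcriptions endpoint. The `ListQuerySketch` type and its `createdAfter` and `offset` fields are hypothetical, and the adapter may build its query differently.

```typescript
// Hypothetical option shape; only `status` and `limit` appear in the examples above.
interface ListQuerySketch {
  status?: string;
  createdAfter?: Date;
  limit?: number;
  offset?: number;
}

function buildListQuerySketch(query: ListQuerySketch): Record<string, string> {
  const clauses: string[] = [];
  if (query.status) {
    // OData string literals use single quotes; embedded quotes are doubled.
    clauses.push(`status eq '${query.status.replace(/'/g, "''")}'`);
  }
  if (query.createdAfter) {
    clauses.push(`createdDateTime gt ${query.createdAfter.toISOString()}`);
  }

  const params: Record<string, string> = {};
  if (clauses.length > 0) params.filter = clauses.join(" and ");
  if (query.limit !== undefined) params.top = String(query.limit);
  if (query.offset !== undefined) params.skip = String(query.offset);
  return params;
}

const listParams = buildListQuerySketch({ status: "Succeeded", limit: 100 });
// listParams.filter is "status eq 'Succeeded'" and listParams.top is "100"
```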

mapStatusToAzure()

private mapStatusToAzure(status): Status

Map unified status to Azure status format using generated enum

Parameters

Parameter	Type
`status`	`string`

Returns

Status
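One plausible shape for this mapping is sketched below. `NotStarted`, `Running`, `Succeeded` and `Failed` are the job states reported by Azure batch transcription. The pairing with unified statuses, the pass-through of Azure-native values, and the `undefined` result for unknown input are assumptions, not confirmed adapter behavior.

```typescript
type AzureStatus = "NotStarted" | "Running" | "Succeeded" | "Failed";

const AZURE_STATUSES: AzureStatus[] = ["NotStarted", "Running", "Succeeded", "Failed"];

// Assumed correspondence between unified and Azure statuses.
const UNIFIED_TO_AZURE = new Map<string, AzureStatus>([
  ["queued", "NotStarted"],
  ["processing", "Running"],
  ["completed", "Succeeded"],
  ["error", "Failed"],
]);

function mapStatusToAzureSketch(status: string): AzureStatus | undefined {
  const lowered = status.toLowerCase();
  // Azure-native values such as 'Succeeded' pass through (case-insensitively).
  const native = AZURE_STATUSES.find((s) => s.toLowerCase() === lowered);
  return native ?? UNIFIED_TO_AZURE.get(lowered);
}
```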

normalizeListItem()

private normalizeListItem(item): UnifiedTranscriptResponse

Normalize a transcript list item to unified format

Parameters

Parameter	Type
`item`	`Transcription`

Returns

UnifiedTranscriptResponse

normalizeResponse()

private normalizeResponse(transcription, transcriptionData): UnifiedTranscriptResponse

Normalize Azure transcription response to unified format

Parameters

Parameter	Type
`transcription`	`Transcription`
`transcriptionData`	`any`

Returns

UnifiedTranscriptResponse

normalizeStatus()

private normalizeStatus(status): "queued" | "processing" | "completed" | "error"

Normalize Azure status to unified status
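The normalization might look roughly like the sketch below, which maps the four Azure job states onto the unified statuses. Treating unrecognized values as "queued" is an assumption of this sketch; the adapter may handle them differently.

```typescript
type UnifiedStatus = "queued" | "processing" | "completed" | "error";

function normalizeStatusSketch(status: unknown): UnifiedStatus {
  switch (String(status).toLowerCase()) {
    case "succeeded":
      return "completed";
    case "running":
      return "processing";
    case "failed":
      return "error";
    case "notstarted":
    default:
      // Assumption: anything unrecognized is treated as not yet started.
      return "queued";
  }
}
```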

Parameters

Parameter	Type
`status`	`any`

Returns

"queued" | "processing" | "completed" | "error"

pollForCompletion()

protected pollForCompletion(transcriptId, options?): Promise<UnifiedTranscriptResponse<TranscriptionProvider>>

Generic polling helper for async transcription jobs

Polls getTranscript() until the job completes or polling times out.

Parameters

Parameter	Type	Description
`transcriptId`	`string`	Job/transcript ID to poll
`options?`	{ `intervalMs?`: `number`; `maxAttempts?`: `number`; }	Polling configuration
`options.intervalMs?`	`number`	-
`options.maxAttempts?`	`number`	-

Returns

Promise<UnifiedTranscriptResponse<TranscriptionProvider>>

Final transcription result

Inherited from

BaseAdapter.pollForCompletion
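A generic polling loop of this kind can be sketched as follows. This is not the BaseAdapter implementation: the `pollSketch` name, the `fetchOnce` callback, the default values for `intervalMs` and `maxAttempts`, and the choice to throw on timeout are all assumptions made for illustration.

```typescript
interface PollOptionsSketch {
  intervalMs?: number;
  maxAttempts?: number;
}

// Call `fetchOnce` repeatedly until the job reaches a terminal status or attempts run out.
async function pollSketch<T extends { data?: { status?: string } }>(
  fetchOnce: () => Promise<T>,
  options: PollOptionsSketch = {},
): Promise<T> {
  const { intervalMs = 3000, maxAttempts = 60 } = options; // assumed defaults

  for (let attempt = 1; attempt <= maxAttempts; attempt++) {
    const result = await fetchOnce();
    const status = result.data?.status;
    if (status === "completed" || status === "error") {
      return result; // terminal state: hand the final response back to the caller
    }
    if (attempt < maxAttempts) {
      await new Promise<void>((resolve) => setTimeout(resolve, intervalMs));
    }
  }
  throw new Error(`Polling timed out after ${maxAttempts} attempts`);
}
```

With an adapter instance, the callback would typically be `() => adapter.getTranscript(transcriptId)`. Unlike a `setTimeout` callback chain, awaiting the loop lets the caller know when the job has actually finished.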

transcribe()

transcribe(audio, options?): Promise<UnifiedTranscriptResponse<TranscriptionProvider>>

Submit audio for transcription

Azure Speech-to-Text uses batch transcription, which is processed asynchronously. transcribe() returns as soon as the job is submitted; poll getTranscript() with the returned transcription ID to retrieve the completed transcription.
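For orientation, the sketch below builds the kind of request body that Azure's batch transcription endpoint accepts when a job is created. The `contentUrls`, `locale`, `displayName`, `properties` and `model.self` fields follow Azure's REST API. The helper name, its parameters, and the default display name are hypothetical, and the adapter may assemble its request differently.

```typescript
interface BatchRequestSketch {
  contentUrls: string[];
  locale: string;
  displayName: string;
  properties?: { diarizationEnabled?: boolean };
  model?: { self: string }; // URL of a custom model resource
}

// Hypothetical helper: assemble a job-creation body from unified inputs.
function buildBatchRequestSketch(
  audioUrl: string,
  language: string,
  options: { diarization?: boolean; modelUrl?: string; displayName?: string } = {},
): BatchRequestSketch {
  const body: BatchRequestSketch = {
    contentUrls: [audioUrl],
    locale: language,
    displayName: options.displayName ?? "SDK transcription", // assumed default label
  };
  if (options.diarization !== undefined) {
    body.properties = { diarizationEnabled: options.diarization };
  }
  if (options.modelUrl) {
    body.model = { self: options.modelUrl };
  }
  return body;
}

const requestBody = buildBatchRequestSketch("https://example.com/audio.mp3", "en-US", {
  diarization: true,
});
```

Creating a job only queues it. The response to such a request identifies the job, and the transcript itself becomes available later through polling.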