Conversation
🦋 Changeset detected
Latest commit: 08502b9
The changes in this PR will be included in the next version bump. This PR includes changesets to release 20 packages.
📝 Walkthrough
Adds new telemetry attribute constants, surfaces STT model/provider and participant information into voice telemetry, creates and propagates user_turn spans in AudioRecognition, exposes RoomIO.linkedParticipant/localParticipant, and passes STT/provider/participant through AgentActivity/AgentSession to AudioRecognition.
Sequence Diagram
sequenceDiagram
participant Speech as Speech/Event
participant AR as AudioRecognition
participant OTel as OpenTelemetry
participant RoomIO as RoomIO
participant Agent as AgentActivity/AgentSession
Speech->>AR: START_OF_SPEECH (STT or VAD)
AR->>OTel: ensureUserTurnSpan()
OTel-->>AR: user_turn Span (bound to context)
AR->>RoomIO: getLinkedParticipant()
RoomIO-->>AR: ParticipantInfo
AR->>AR: attach attributes (participant, sttModel, sttProvider)
AR->>OTel: enter userTurnContext(span) and run hooks (onStartOfSpeech / EOU detection)
Speech->>AR: END_OF_SPEECH
AR->>OTel: end user_turn Span
Agent->>AR: close / cleanup
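The lifecycle in the diagram can be sketched roughly as follows. This is a minimal illustration, not the PR's implementation: FakeSpan is a stand-in for an OpenTelemetry span, and UserTurnTracker, the ParticipantInfo shape, and the attribute keys are all hypothetical names chosen for the example.

```typescript
// Stand-in for an OpenTelemetry-style span (assumption: NOT the real @opentelemetry/api type).
class FakeSpan {
  readonly name: string;
  readonly attributes: Record<string, string> = {};
  private ended = false;

  constructor(name: string) {
    this.name = name;
  }
  setAttribute(key: string, value: string): void {
    this.attributes[key] = value;
  }
  end(): void {
    this.ended = true;
  }
  isRecording(): boolean {
    return !this.ended;
  }
}

// Hypothetical participant shape for this sketch.
interface ParticipantInfo {
  sid?: string;
  identity: string;
}

class UserTurnTracker {
  private span: FakeSpan | undefined;
  private readonly getParticipant: () => ParticipantInfo | undefined;
  private readonly sttModel: string;
  private readonly sttProvider: string;

  constructor(
    getParticipant: () => ParticipantInfo | undefined,
    sttModel: string,
    sttProvider: string,
  ) {
    this.getParticipant = getParticipant;
    this.sttModel = sttModel;
    this.sttProvider = sttProvider;
  }

  // START_OF_SPEECH: reuse the active span while it is recording, else start a new one.
  ensureUserTurnSpan(): FakeSpan {
    if (this.span?.isRecording()) return this.span;
    const span = new FakeSpan('user_turn');
    const participant = this.getParticipant();
    // Attribute keys below are illustrative, not the constants added by the PR.
    if (participant) span.setAttribute('participant.identity', participant.identity);
    span.setAttribute('stt.model', this.sttModel);
    span.setAttribute('stt.provider', this.sttProvider);
    this.span = span;
    return span;
  }

  // END_OF_SPEECH: end the span so the next turn gets a fresh one.
  onEndOfSpeech(): void {
    if (this.span?.isRecording()) this.span.end();
    this.span = undefined;
  }
}

const tracker = new UserTurnTracker(
  () => ({ identity: 'caller-1' }),
  'example-model',
  'example-provider',
);
const firstTurn = tracker.ensureUserTurnSpan();
const sameTurn = tracker.ensureUserTurnSpan(); // still recording, so the span is reused
tracker.onEndOfSpeech();
const secondTurn = tracker.ensureUserTurnSpan(); // previous span ended, so a new one starts
```

The point of the isRecording() check is that repeated START_OF_SPEECH events (for example one from STT and one from VAD) map to a single span per turn.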
Estimated code review effort: 🎯 4 (Complex) | ⏱️ ~45 minutes
🚥 Pre-merge checks | ✅ 1 | ❌ 2
❌ Failed checks (2 warnings)
✅ Passed checks (1 passed)
Actionable comments posted: 0
Caution
Some comments are outside the diff and can’t be posted inline due to platform limitations.
⚠️ Outside diff range comments (2)
agents/src/voice/audio_recognition.ts (2)
758-765: ⚠️ Potential issue | 🟡 Minor
End userTurnSpan on close to prevent orphaned spans.
If close() is called while a user turn is in progress, the userTurnSpan will remain recording but never be ended, leading to incomplete telemetry data.
🛡️ Proposed fix
 async close() {
+  if (this.userTurnSpan?.isRecording()) {
+    this.userTurnSpan.setStatus({ code: 2, message: 'Session closed' }); // SpanStatusCode.ERROR = 2
+    this.userTurnSpan.end();
+    this.userTurnSpan = undefined;
+  }
   this.detachInputAudioStream();
   this.silenceAudioWriter.releaseLock();
   await this.commitUserTurnTask?.cancelAndWait();
   await this.sttTask?.cancelAndWait();
   await this.vadTask?.cancelAndWait();
   await this.bounceEOUTask?.cancelAndWait();
 }
701-714: ⚠️ Potential issue | 🟡 Minor
Consider ending userTurnSpan when clearing the user turn.
When clearUserTurn() is called, the transcript state is reset but the userTurnSpan may still be active. This could cause the next user turn to reuse the same span (if still recording), potentially merging distinct turns into one span and causing incorrect telemetry attribution.
🛡️ Proposed fix
 clearUserTurn() {
+  if (this.userTurnSpan?.isRecording()) {
+    this.userTurnSpan.setStatus({ code: 2, message: 'User turn cleared' });
+    this.userTurnSpan.end();
+    this.userTurnSpan = undefined;
+  }
   this.audioTranscript = '';
   this.audioInterimTranscript = '';
   this.audioPreflightTranscript = '';
   this.finalTranscriptConfidence = [];
   this.userTurnCommitted = false;
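To illustrate why both suggestions matter, here is a small self-contained sketch of the failure mode and the fix. StubSpan and TurnState are hypothetical stand-ins, not the classes in audio_recognition.ts; the numeric status code 2 simply mirrors the value used in the proposed fixes above.

```typescript
const STATUS_ERROR = 2; // same numeric code as in the proposed fixes

// Hypothetical stand-in for a span, tracking only what this sketch needs.
class StubSpan {
  ended = false;
  status: { code: number; message: string } | undefined;

  isRecording(): boolean {
    return !this.ended;
  }
  setStatus(status: { code: number; message: string }): void {
    this.status = status;
  }
  end(): void {
    this.ended = true;
  }
}

class TurnState {
  userTurnSpan: StubSpan | undefined;
  audioTranscript = '';

  // Reuses a recording span, which is exactly why a stale span must be ended first.
  startTurn(): StubSpan {
    if (this.userTurnSpan?.isRecording()) return this.userTurnSpan;
    const span = new StubSpan();
    this.userTurnSpan = span;
    return span;
  }

  // Shared helper: end the active span (if any) with an explanatory status.
  private endActiveSpan(message: string): void {
    if (this.userTurnSpan?.isRecording()) {
      this.userTurnSpan.setStatus({ code: STATUS_ERROR, message });
      this.userTurnSpan.end();
    }
    this.userTurnSpan = undefined;
  }

  clearUserTurn(): void {
    this.endActiveSpan('User turn cleared');
    this.audioTranscript = '';
  }

  close(): void {
    this.endActiveSpan('Session closed');
  }
}

const state = new TurnState();
const turnA = state.startTurn();
state.clearUserTurn(); // ends turnA instead of leaving it recording
const turnB = state.startTurn(); // a distinct span, so the turns are not merged
state.close(); // ends turnB, so no orphaned span remains
```

Factoring the logic into one helper is a design choice of this sketch; the proposed fixes inline it in each method.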
📜 Review details
Configuration used: Organization UI
Review profile: CHILL
Plan: Pro
📒 Files selected for processing (2)
agents/src/voice/audio_recognition.ts
agents/src/voice/audio_recognition_span.test.ts
🚧 Files skipped from review as they are similar to previous changes (1)
- agents/src/voice/audio_recognition_span.test.ts
🧰 Additional context used
📓 Path-based instructions (3)
**/*.{ts,tsx,js,jsx}
📄 CodeRabbit inference engine (.cursor/rules/agent-core.mdc)
Add SPDX-FileCopyrightText and SPDX-License-Identifier headers to all newly added files with '// SPDX-FileCopyrightText: 2025 LiveKit, Inc.' and '// SPDX-License-Identifier: Apache-2.0'
Files:
agents/src/voice/audio_recognition.ts
**/*.{ts,tsx}?(test|example|spec)
📄 CodeRabbit inference engine (.cursor/rules/agent-core.mdc)
When testing inference LLM, always use full model names from agents/src/inference/models.ts (e.g., 'openai/gpt-4o-mini' instead of 'gpt-4o-mini')
Files:
agents/src/voice/audio_recognition.ts
**/*.{ts,tsx}?(test|example)
📄 CodeRabbit inference engine (.cursor/rules/agent-core.mdc)
Initialize logger before using any LLM functionality with initializeLogger({ pretty: true }) from '@livekit/agents'
Files:
agents/src/voice/audio_recognition.ts
🧬 Code graph analysis (1)
agents/src/voice/audio_recognition.ts (4)
plugins/google/src/beta/gemini_tts.ts (1)
opts (165-167)
agents/src/telemetry/traces.ts (1)
tracer (150-150)
agents/src/telemetry/index.ts (1)
tracer (24-24)
agents/src/stt/index.ts (1)
SpeechEventType (11-11)
🔇 Additional comments (6)
agents/src/voice/audio_recognition.ts (6)
5-11: LGTM! OpenTelemetry imports are correctly structured and all are utilized in the implementation.
67-79: LGTM! The new tracing options and ParticipantInfo type are well-documented and provide appropriate flexibility for dynamic participant resolution.
164-197: LGTM! The ensureUserTurnSpan method correctly handles span reuse with the isRecording() check, and properly propagates participant and STT metadata. The userTurnContext helper correctly builds the context hierarchy.
346-364: LGTM! The span context wrapping for STT-based speech events is correctly implemented with appropriate variable scoping.
435-479: LGTM! The EOU detection span is correctly linked as a child of the user_turn span through the propagated context.
639-671: LGTM! The VAD event handling correctly backfills the span start time using speechDuration for accurate timing, and consistently wraps hooks with the appropriate OpenTelemetry context.
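The start-time backfill mentioned for lines 639-671 amounts to subtracting the already-elapsed speech from the event time. A minimal sketch follows, assuming the duration is expressed in milliseconds; the function name and the unit are assumptions for illustration, not taken from the PR.

```typescript
// Assumption: speechDurationMs is how long the user had already been speaking
// when the VAD START_OF_SPEECH event fired, in milliseconds.
function backfilledStartTimeMs(eventTimeMs: number, speechDurationMs: number): number {
  // Clamp negative durations to zero so the start time never lands after the event.
  return eventTimeMs - Math.max(0, speechDurationMs);
}

// VAD fires at t = 10000 ms after 250 ms of detected speech,
// so the span should start at t = 9750 ms rather than at the event time.
const spanStartMs = backfilledStartTimeMs(10000, 250);
```

Without the backfill, the user_turn span would under-report the turn length by the VAD detection delay.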
export interface ParticipantLike {
  sid: string | undefined;
  identity: string;
  kind: number;
should this be of type ParticipantKind instead?
    return this.#done;
  }

  get result(): T {
this doesn't feel great. It introduces the notion of (false) synchronicity in a - by definition - async utility.
What speaks against using await?
There was a problem hiding this comment.
This is currently used in only one place, RoomIO.linkedParticipant, which is a getter that expects synchronous access:
get isParticipantAvailable(): boolean {
  return this.participantAvailableFuture.done;
}
get linkedParticipant(): RemoteParticipant | undefined {
  if (!this.isParticipantAvailable) {
    return undefined;
  }
  return this.participantAvailableFuture.result;
}
Agreed that this doesn't look great, but it's a kind of workaround. Also, we guard by checking isParticipantAvailable before reading participantAvailableFuture.result, so in this case it should not throw but simply return undefined if the future is not done.
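For readers following the thread, here is a rough sketch of the pattern under discussion: a future that can still be awaited but also exposes done and a synchronous result getter that throws when read too early. This is not the Future class from the repository; the field names, the Participant shape, and RoomIOSketch are hypothetical, and TypeScript's private keyword is used in place of #private fields only to keep the sketch portable.

```typescript
class Future<T> {
  private settled = false;
  private value: T | undefined;
  private resolveFn!: (value: T) => void;
  // Async consumers can still `await future.promise`.
  readonly promise: Promise<T>;

  constructor() {
    // The executor runs synchronously, so resolveFn is assigned before the constructor returns.
    this.promise = new Promise<T>((resolve) => {
      this.resolveFn = resolve;
    });
  }

  get done(): boolean {
    return this.settled;
  }

  // Synchronous access: only valid once the future has resolved.
  get result(): T {
    if (!this.settled) throw new Error('Future is not done yet');
    return this.value as T;
  }

  resolve(value: T): void {
    if (this.settled) return; // ignore repeated resolution
    this.settled = true;
    this.value = value;
    this.resolveFn(value);
  }
}

// Hypothetical participant shape for this sketch.
interface Participant {
  identity: string;
}

class RoomIOSketch {
  readonly participantAvailableFuture = new Future<Participant>();

  // The `done` guard keeps the getter from throwing; it returns undefined until resolution.
  get linkedParticipant(): Participant | undefined {
    return this.participantAvailableFuture.done
      ? this.participantAvailableFuture.result
      : undefined;
  }
}

const io = new RoomIOSketch();
const before = io.linkedParticipant; // undefined: no participant linked yet
io.participantAvailableFuture.resolve({ identity: 'alice' });
const after = io.linkedParticipant; // the resolved participant
```

The trade-off raised in the review still applies: result is safe only behind the done guard, whereas awaiting the promise is safe everywhere but cannot be used inside a synchronous getter.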
…-js into brian/user-span-migration
Co-authored-by: devin-ai-integration[bot] <158243242+devin-ai-integration[bot]@users.noreply.github.com>