Kordant/tasks/core-services-implementation/12-voiceprint-mobile-integration.md
2026-05-31 22:03:18 -04:00

12. iOS CallKit and Android Telecom API for Real-Time Call Analysis

meta:

  • id: core-services-12
  • feature: core-services-implementation
  • priority: P2
  • depends_on: [core-services-11]
  • tags: [voiceprint, ios, android, callkit, telecom-api, real-time, mobile]

objective:

  • Integrate VoicePrint into the iOS and Android mobile apps via CallKit and Telecom API, enabling real-time call recording, analysis, and synthetic voice alerts during active phone calls.

deliverables:

  • iOS CallKit extension for call interception and recording
  • Android Telecom API integration for call screening and recording
  • Real-time audio streaming to server for analysis
  • Push notification alert when synthetic voice detected during call
  • On-device audio capture and upload pipeline

steps:

  1. iOS Implementation:
    • Create a Call Directory extension (a CXCallDirectoryProvider subclass) for caller identification
    • Implement CXCallObserverDelegate for call state monitoring (CXProvider applies only to VoIP calls the app itself handles)
    • Add audio recording permission (NSMicrophoneUsageDescription in Info.plist)
    • Stream call audio to server via WebSocket or upload after call ends
    • Show in-call alert overlay when synthetic voice detected
    • Handle app backgrounding and call recording continuity
  2. Android Implementation:
    • Implement TelecomManager with ConnectionService for call monitoring
    • Add READ_PHONE_STATE, RECORD_AUDIO, FOREGROUND_SERVICE permissions
    • Create call screening service that triggers on incoming/outgoing calls
    • Record call audio using MediaRecorder or AudioRecord
    • Upload audio to server for analysis after call ends
    • Show heads-up notification when synthetic voice detected
  3. Server-side integration:
    • Extend VoicePrint tRPC router with analyzeCallRecording endpoint
    • Handle multipart audio upload (WAV/MP3 format)
    • Queue analysis job, push result via WebSocket or push notification
    • Store analysis result linked to call metadata (number, duration, timestamp)
  4. Real-time vs. post-call analysis:
    • Phase 1: Post-call upload + analysis (simpler; no strict latency requirement)
    • Phase 2: Real-time streaming of audio chunks during the call (requires analysis latency under 500 ms)
  5. User experience:
    • Settings toggle: "Analyze calls for voice cloning"
    • After each analyzed call: summary card in app (genuine/suspicious/synthetic)
    • Emergency override: one-tap hangup + block number when synthetic detected
  6. Privacy and compliance:
    • Two-party consent state detection (disable recording in 2-party consent states)
    • User must explicitly opt-in before any call recording
    • Audio data encrypted in transit and at rest
    • Auto-delete audio after analysis (configurable retention: 0–30 days)
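
The privacy rules in step 6 can be sketched as two small pure functions: one that decides whether recording may start, and one that computes when the audio must be deleted. This is a minimal sketch under stated assumptions, not the project's implementation. All names (`RecordingContext`, `canRecordCall`, `audioDeleteAfter`) are hypothetical, and the list of all-party consent regions is passed in rather than hard-coded, since that list is an output of the legal review.

```typescript
// Hypothetical types and names, for illustration only.
interface RecordingContext {
  userOptedIn: boolean;      // explicit opt-in captured before any recording
  settingEnabled: boolean;   // the "Analyze calls for voice cloning" toggle
  userRegion: string | null; // region code chosen by the app; null when unknown
}

// Recording is allowed only with opt-in, the toggle on, and a known region
// that is not in the all-party consent list supplied by legal review.
function canRecordCall(
  ctx: RecordingContext,
  allPartyConsentRegions: readonly string[]
): boolean {
  if (!ctx.userOptedIn || !ctx.settingEnabled) return false;
  // Fail closed: an unknown region is treated as restricted.
  if (ctx.userRegion === null) return false;
  return allPartyConsentRegions.indexOf(ctx.userRegion) === -1;
}

const MAX_RETENTION_DAYS = 30;
const MS_PER_DAY = 24 * 60 * 60 * 1000;

// Clamp the configured retention to whole days in the 0..30 range.
function clampRetentionDays(days: number): number {
  if (!isFinite(days)) return 0;
  return Math.min(MAX_RETENTION_DAYS, Math.max(0, Math.floor(days)));
}

// With 0-day retention the deletion time equals the analysis time,
// meaning the audio is deleted as soon as analysis completes.
function audioDeleteAfter(analyzedAt: Date, retentionDays: number): Date {
  const days = clampRetentionDays(retentionDays);
  return new Date(analyzedAt.getTime() + days * MS_PER_DAY);
}
```

Treating an unknown region as restricted is a design choice in this sketch: it trades some coverage for a lower risk of recording where consent rules are unclear.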

tests:

  • Unit: Mock CallKit/Telecom callbacks, verify audio capture and upload logic
  • Integration: Test audio upload and analysis flow on device simulator
  • E2E: Receive call on device → record audio → upload → receive analysis notification

acceptance_criteria:

  • iOS app can record incoming call audio and upload to server for analysis
  • Android app can record incoming call audio and upload to server for analysis
  • Call recording only happens after explicit user opt-in
  • Two-party consent states are detected and recording is disabled (legal compliance)
  • Uploaded audio is analyzed by Azure Voice Live API and result pushed to device
  • Push notification sent within 30 seconds of analysis completion
  • In-app call summary shows: caller number, duration, analysis result, confidence score
  • Emergency hangup button available when synthetic voice detected
  • Audio data is encrypted in transit (TLS) and deleted after analysis (0-day retention default)
  • App handles backgrounding without losing call recording session
  • Recording doesn't interfere with normal call audio quality
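
The summary card and emergency-hangup criteria above could be backed by a small data shape like the following. This is an illustrative sketch only: it assumes the analyzer returns a score between 0 and 1 for how likely the voice is synthetic, and the two thresholds (0.4 and 0.8) are hypothetical placeholders, not values from this task. The names `CallSummary`, `verdictFor`, and `buildSummary` are also assumptions.

```typescript
type Verdict = "genuine" | "suspicious" | "synthetic";

// Fields mirror the summary card: caller number, duration, result, confidence.
interface CallSummary {
  callerNumber: string;
  durationSeconds: number;
  verdict: Verdict;
  confidence: number;           // clamped to the 0..1 range
  showEmergencyHangup: boolean; // one-tap hangup + block, synthetic only
}

// Hypothetical thresholds; real values would come from model evaluation.
const SUSPICIOUS_AT = 0.4;
const SYNTHETIC_AT = 0.8;

function verdictFor(syntheticScore: number): Verdict {
  if (syntheticScore >= SYNTHETIC_AT) return "synthetic";
  if (syntheticScore >= SUSPICIOUS_AT) return "suspicious";
  return "genuine";
}

function buildSummary(
  callerNumber: string,
  durationSeconds: number,
  syntheticScore: number
): CallSummary {
  if (!isFinite(syntheticScore)) {
    throw new RangeError("syntheticScore must be a finite number");
  }
  // Clamp out-of-range scores so the card never shows more than 100%.
  const confidence = Math.min(1, Math.max(0, syntheticScore));
  const verdict = verdictFor(confidence);
  return {
    callerNumber: callerNumber,
    durationSeconds: durationSeconds,
    verdict: verdict,
    confidence: confidence,
    showEmergencyHangup: verdict === "synthetic",
  };
}
```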

validation:

  • iOS: Test on physical device (simulator doesn't support CallKit), verify recording and upload
  • Android: Test on physical device, verify Telecom API integration and notification delivery
  • Server: Verify analyzeCallRecording endpoint accepts multipart upload and returns analysis
  • Legal review: Confirm 2-party consent logic covers all US states correctly
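
For the server check above, it may help to have a framework-free model of what analyzeCallRecording does with an upload before it is wired into the tRPC router. The sketch below is an assumption-laden illustration: it detects WAV or MP3 from the leading bytes, validates the call metadata, and places a job in an in-memory queue. `AnalysisQueue`, `sniffAudioFormat`, and the job shape are hypothetical names, and a real deployment would use a persistent queue.

```typescript
type AudioFormat = "wav" | "mp3";

// True when the bytes at `offset` spell out the ASCII string `text`.
function asciiAt(bytes: Uint8Array, offset: number, text: string): boolean {
  if (bytes.length < offset + text.length) return false;
  for (let i = 0; i < text.length; i++) {
    if (bytes[offset + i] !== text.charCodeAt(i)) return false;
  }
  return true;
}

// WAV: "RIFF" at byte 0 and "WAVE" at byte 8.
// MP3: an "ID3" tag, or a frame sync (eleven set bits) at the start.
function sniffAudioFormat(bytes: Uint8Array): AudioFormat | null {
  if (asciiAt(bytes, 0, "RIFF") && asciiAt(bytes, 8, "WAVE")) return "wav";
  if (asciiAt(bytes, 0, "ID3")) return "mp3";
  if (bytes.length >= 2 && bytes[0] === 0xff && ((bytes[1] ?? 0) & 0xe0) === 0xe0) {
    return "mp3";
  }
  return null;
}

// Call metadata stored with the result: number, duration, timestamp.
interface CallMetadata {
  number: string;
  durationSeconds: number;
  startedAt: string; // ISO 8601 timestamp
}

interface AnalysisJob {
  id: string;
  format: AudioFormat;
  metadata: CallMetadata;
  status: "queued";
}

// In-memory stand-in for the analysis job queue (hypothetical).
class AnalysisQueue {
  private jobs: AnalysisJob[] = [];
  private nextId = 1;

  enqueue(audio: Uint8Array, metadata: CallMetadata): AnalysisJob {
    const format = sniffAudioFormat(audio);
    if (format === null) {
      throw new Error("unsupported audio format: expected WAV or MP3");
    }
    if (!(metadata.durationSeconds >= 0)) {
      throw new RangeError("durationSeconds must be a non-negative number");
    }
    const job: AnalysisJob = {
      id: "job-" + this.nextId++,
      format: format,
      metadata: metadata,
      status: "queued",
    };
    this.jobs.push(job);
    return job;
  }

  pending(): number {
    return this.jobs.length;
  }
}
```

Checking the leading bytes rather than trusting the uploaded filename or content type is a deliberate choice here, since both of those are set by the client.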

notes:

  • iOS CallKit extensions run in separate process — share data via App Groups
  • Android Telecom API requires phone app to be default dialer (limited market penetration)
  • Alternative: Use accessibility service on Android for broader call recording (more invasive UX)
  • Real-time analysis requires chunking audio into 3–5 second segments and streaming them, which is much harder than post-call analysis
  • Consider starting with post-call analysis and adding real-time as Phase 2
  • Audio file sizes: 1 minute of 16-bit WAV at 16 kHz mono ≈ 1.9 MB; compress to AAC/MP3 for upload
  • The existing iOS VoicePrintViewModel.swift and Android VoicePrintViewModel.kt need updating
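
The file-size figure in the notes follows from the PCM formula bytes = sample rate × bytes per sample × channels × seconds. Assuming 16-bit samples, one minute at 16 kHz mono is 16,000 × 2 × 1 × 60 = 1,920,000 bytes, or about 1.9 MB before the small WAV header. The sketch below reproduces that arithmetic and shows one way to split captured samples into fixed-length chunks for the real-time phase. The function names are hypothetical, and the 4-second chunk length in the usage lines is only an illustrative value.

```typescript
// Size of raw PCM audio in bytes, excluding any container header.
function pcmBytes(
  sampleRateHz: number,
  bitsPerSample: number,
  channels: number,
  seconds: number
): number {
  return sampleRateHz * (bitsPerSample / 8) * channels * seconds;
}

// Split mono 16-bit samples into chunks of `chunkSeconds` each.
// The final chunk may be shorter. Chunks are views over the same
// buffer (subarray does not copy), so this stays cheap per call.
function splitIntoChunks(
  samples: Int16Array,
  sampleRateHz: number,
  chunkSeconds: number
): Int16Array[] {
  const samplesPerChunk = Math.floor(sampleRateHz * chunkSeconds);
  if (samplesPerChunk <= 0) {
    throw new RangeError("chunk length must cover at least one sample");
  }
  const chunks: Int16Array[] = [];
  for (let start = 0; start < samples.length; start += samplesPerChunk) {
    const end = Math.min(start + samplesPerChunk, samples.length);
    chunks.push(samples.subarray(start, end));
  }
  return chunks;
}

// Usage: one minute of 16 kHz, 16-bit mono audio.
const oneMinuteBytes = pcmBytes(16000, 16, 1, 60); // 1920000

// Usage: 10 seconds of audio in 4-second chunks gives 4 s + 4 s + 2 s.
const tenSeconds = new Int16Array(16000 * 10);
const chunks = splitIntoChunks(tenSeconds, 16000, 4);
```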