12. iOS CallKit and Android Telecom API for Real-Time Call Analysis
meta:
- id: core-services-12
- feature: core-services-implementation
- priority: P2
- depends_on: [core-services-11]
- tags: [voiceprint, ios, android, callkit, telecom-api, real-time, mobile]
objective:
- Integrate VoicePrint into the iOS and Android mobile apps via CallKit and Telecom API, enabling real-time call recording, analysis, and synthetic voice alerts during active phone calls.
deliverables:
- iOS CallKit extension for call interception and recording
- Android Telecom API integration for call screening and recording
- Real-time audio streaming to server for analysis
- Push notification alert when synthetic voice detected during call
- On-device audio capture and upload pipeline
steps:
- iOS Implementation:
- Create CallKit extension (CallDirectoryExtension) for caller identification
- Implement CXProvider delegate for call state monitoring
- Add audio recording permission (NSMicrophoneUsageDescription in Info.plist)
- Stream call audio to server via WebSocket or upload after call ends
- Show in-call alert overlay when synthetic voice detected
- Handle app backgrounding and call recording continuity
- Android Implementation:
- Implement TelecomManager with ConnectionService for call monitoring
- Add READ_PHONE_STATE, RECORD_AUDIO, FOREGROUND_SERVICE permissions
- Create call screening service that triggers on incoming/outgoing calls
- Record call audio using MediaRecorder or AudioRecord
- Upload audio to server for analysis after call ends
- Show heads-up notification when synthetic voice detected
- Server-side integration:
- Extend VoicePrint tRPC router with analyzeCallRecording endpoint
- Handle multipart audio upload (WAV/MP3 format)
- Queue analysis job, push result via WebSocket or push notification
- Store analysis result linked to call metadata (number, duration, timestamp)
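The server-side steps above can be sketched without committing to a framework. The TypeScript below is a minimal in-memory illustration of validating an upload, queueing an analysis job, and attaching the result to the call metadata. Every name in it (AnalysisQueue, CallMetadata, Verdict, the MIME type list) is hypothetical and not taken from the VoicePrint codebase. The tRPC wiring, multipart parsing, and the real analyzer are left out, and the analyzer is modelled as a synchronous callback only to keep the sketch short.

```typescript
// Hypothetical types for illustration; none come from the VoicePrint codebase.
type Verdict = "genuine" | "suspicious" | "synthetic";

interface CallMetadata {
  phoneNumber: string;
  durationSeconds: number;
  startedAt: string; // ISO 8601 timestamp
}

interface AnalysisResult {
  verdict: Verdict;
  confidence: number; // 0..1
}

interface AnalysisJob {
  id: string;
  metadata: CallMetadata;
  mimeType: string;
  audio: Uint8Array;
  status: "queued" | "done";
  result?: AnalysisResult;
}

// Assumed MIME types for the WAV/MP3 uploads named in the steps.
const ACCEPTED_MIME_TYPES = ["audio/wav", "audio/mpeg"];

class AnalysisQueue {
  private jobs: Record<string, AnalysisJob> = {};
  private pending: string[] = [];
  private nextId = 1;

  // Validates the upload and stores it as a queued job linked to its call metadata.
  enqueue(metadata: CallMetadata, mimeType: string, audio: Uint8Array): string {
    if (ACCEPTED_MIME_TYPES.indexOf(mimeType) === -1) {
      throw new Error("Unsupported audio format: " + mimeType);
    }
    if (audio.length === 0) {
      throw new Error("Empty audio upload");
    }
    const id = "job-" + this.nextId++;
    this.jobs[id] = { id, metadata, mimeType, audio, status: "queued" };
    this.pending.push(id);
    return id;
  }

  // Runs the oldest queued job through the supplied analyzer, then hands the
  // finished job to `notify` (a stand-in for a WebSocket or push notification).
  processNext(
    analyze: (audio: Uint8Array) => AnalysisResult,
    notify: (job: AnalysisJob) => void
  ): AnalysisJob | undefined {
    const id = this.pending.shift();
    if (id === undefined) return undefined;
    const job = this.jobs[id];
    if (!job) return undefined;
    job.result = analyze(job.audio);
    job.status = "done";
    notify(job);
    return job;
  }

  get(id: string): AnalysisJob | undefined {
    return this.jobs[id];
  }
}
```

In a real deployment the queue would be persistent and the analyzer asynchronous. The point of the sketch is only that the stored result stays attached to the number, duration, and timestamp of the call it came from.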
- Real-time vs. post-call analysis:
- Phase 1: Post-call upload + analysis (simpler, no strict latency requirement)
- Phase 2: Real-time streaming chunks during call (requires <500ms analysis)
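For Phase 2, the notes suggest splitting call audio into 3 to 5 second segments before streaming. A minimal sketch of that chunking step follows, assuming the captured audio is already available as 16-bit mono PCM samples. The function name chunkPcm and its parameters are hypothetical.

```typescript
// Splits 16-bit mono PCM samples into fixed-length segments for streaming.
// The final segment may be shorter than chunkSeconds.
function chunkPcm(
  samples: Int16Array,
  sampleRateHz: number,
  chunkSeconds: number
): Int16Array[] {
  const samplesPerChunk = Math.floor(sampleRateHz * chunkSeconds);
  if (!(samplesPerChunk >= 1)) {
    throw new Error("sampleRateHz * chunkSeconds must be at least 1 sample");
  }
  const chunks: Int16Array[] = [];
  for (let start = 0; start < samples.length; start += samplesPerChunk) {
    const end = Math.min(start + samplesPerChunk, samples.length);
    // subarray returns a view on the same buffer, so no audio data is copied.
    chunks.push(samples.subarray(start, end));
  }
  return chunks;
}
```

For example, 10 seconds of audio at 16 kHz split into 4-second chunks gives three segments of 64000, 64000, and 32000 samples. Each segment would then be sent over the WebSocket and analyzed within the Phase 2 latency budget.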
- User experience:
- Settings toggle: "Analyze calls for voice cloning"
- After each analyzed call: summary card in app (genuine/suspicious/synthetic)
- Emergency override: one-tap hangup + block number when synthetic detected
- Privacy and compliance:
- Two-party consent state detection (disable recording in 2-party consent states)
- User must explicitly opt-in before any call recording
- Audio data encrypted in transit and at rest
- Auto-delete audio after analysis (configurable retention: 0–30 days)
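The privacy rules above reduce to two small decisions: whether recording is allowed at all, and when stored audio must be deleted. The sketch below is one possible shape for that logic, with hypothetical names throughout. The list of consent-restricted regions is passed in as data rather than hard-coded, on the assumption that legal review supplies and maintains it, and how the user's region is detected is out of scope.

```typescript
interface RecordingPolicyInput {
  userOptedIn: boolean;
  userRegion: string; // e.g. a US state code
  // Regions where recording must be disabled; assumed to come from legal review.
  consentRestrictedRegions: string[];
}

// Recording is allowed only after explicit opt-in and outside restricted regions.
function mayRecordCall(input: RecordingPolicyInput): boolean {
  if (!input.userOptedIn) return false;
  return input.consentRestrictedRegions.indexOf(input.userRegion) === -1;
}

const MIN_RETENTION_DAYS = 0;
const MAX_RETENTION_DAYS = 30;

// Clamps a configured retention value to the 0 to 30 day range; invalid input
// falls back to the most conservative setting (delete immediately).
function clampRetentionDays(days: number): number {
  if (!isFinite(days)) return MIN_RETENTION_DAYS;
  return Math.min(MAX_RETENTION_DAYS, Math.max(MIN_RETENTION_DAYS, Math.floor(days)));
}

// Returns the time after which the audio for an analyzed call must be deleted.
function audioDeleteAfter(analysisCompletedAt: Date, retentionDays: number): Date {
  const msPerDay = 24 * 60 * 60 * 1000;
  return new Date(
    analysisCompletedAt.getTime() + clampRetentionDays(retentionDays) * msPerDay
  );
}
```

With the default of 0 days, audioDeleteAfter returns the completion time itself, which matches deleting audio as soon as analysis finishes.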
tests:
- Unit: Mock CallKit/Telecom callbacks, verify audio capture and upload logic
- Integration: Test audio upload and analysis flow on device simulator
- E2E: Receive call on device → record audio → upload → receive analysis notification
acceptance_criteria:
- iOS app can record incoming call audio and upload to server for analysis
- Android app can record incoming call audio and upload to server for analysis
- Call recording only happens after explicit user opt-in
- Two-party consent states are detected and recording is disabled (legal compliance)
- Uploaded audio is analyzed by Azure Voice Live API and result pushed to device
- Push notification sent within 30 seconds of analysis completion
- In-app call summary shows: caller number, duration, analysis result, confidence score
- Emergency hangup button available when synthetic voice detected
- Audio data is encrypted in transit (TLS) and deleted after analysis (0-day retention default)
- App handles backgrounding without losing call recording session
- Recording doesn't interfere with normal call audio quality
validation:
- iOS: Test on physical device (simulator doesn't support CallKit), verify recording and upload
- Android: Test on physical device, verify Telecom API integration and notification delivery
- Server: Verify analyzeCallRecording endpoint accepts multipart upload and returns analysis
- Legal review: Confirm 2-party consent logic covers all US states correctly
notes:
- iOS CallKit extensions run in separate process — share data via App Groups
- Android Telecom API requires the app to be set as the default dialer (limited market penetration)
- Alternative: Use accessibility service on Android for broader call recording (more invasive UX)
- Real-time analysis requires chunking audio into 3–5 second segments and streaming — much harder than post-call
- Consider starting with post-call analysis and adding real-time as Phase 2
- Audio file sizes: 1 minute of 16-bit WAV at 16 kHz mono = ~1.9 MB; compress to AAC/MP3 for upload
- The existing iOS VoicePrintViewModel.swift and Android VoicePrintViewModel.kt need updating
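The file-size figure in the notes can be checked with a short calculation. The sketch below assumes 16-bit samples (the notes give only the sample rate and channel count) and the standard 44-byte PCM WAV header. The function names are hypothetical, and the 32 kbps bitrate in the usage note is an illustrative value, not a project decision.

```typescript
const WAV_HEADER_BYTES = 44; // size of a canonical PCM WAV header

// Size of an uncompressed PCM WAV file in bytes.
function wavSizeBytes(
  durationSeconds: number,
  sampleRateHz: number,
  bitsPerSample: number,
  channels: number
): number {
  const bytesPerSample = bitsPerSample / 8;
  return WAV_HEADER_BYTES + durationSeconds * sampleRateHz * bytesPerSample * channels;
}

// Approximate size of a constant-bitrate compressed file (e.g. AAC or MP3),
// ignoring container overhead.
function compressedSizeBytes(durationSeconds: number, bitrateKbps: number): number {
  return (durationSeconds * bitrateKbps * 1000) / 8;
}
```

Here wavSizeBytes(60, 16000, 16, 1) gives 1920044 bytes, which is the ~1.9 MB quoted above. At an assumed 32 kbps, compressedSizeBytes(60, 32) gives 240000 bytes, roughly an eighth of the WAV size.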