Remote Microphone Streaming Setup
This guide shows how to use the ASR system with a client on one machine streaming audio to a server on another machine.
Architecture
┌─────────────────┐ ┌─────────────────┐
│ Client Machine │ │ Server Machine │
│ │ │ │
│ 🎤 Microphone │ ───WebSocket───▶ │ 🖥️ Display │
│ │ (Audio) │ │
│ client/ │ │ server/ │
│ mic_stream.py │ │ display_server │
└─────────────────┘ └─────────────────┘
Server Setup (Machine with GPU)
1. Start the server with live display
cd /home/koko210Serve/parakeet-test
source venv/bin/activate
PYTHONPATH=/home/koko210Serve/parakeet-test python server/display_server.py
Options:
python server/display_server.py --host 0.0.0.0 --port 8766
The server will:
- ✅ Bind to all network interfaces (0.0.0.0)
- ✅ Display transcriptions in real-time with color coding
- ✅ Show progressive updates as audio streams in
- ✅ Highlight final transcriptions when complete
2. Configure firewall (if needed)
Allow incoming connections on port 8766:
# Ubuntu/Debian
sudo ufw allow 8766/tcp
# CentOS/RHEL
sudo firewall-cmd --permanent --add-port=8766/tcp
sudo firewall-cmd --reload
3. Get the server's IP address
# Find your server's IP address
ip addr show | grep "inet " | grep -v 127.0.0.1
Example output: 192.168.1.100
Client Setup (Remote Machine)
1. Install dependencies on client machine
Create a minimal Python environment:
# Create virtual environment
python3 -m venv asr-client
source asr-client/bin/activate
# Install only client dependencies
pip install websockets sounddevice numpy
2. Copy the client script
Copy client/mic_stream.py to your client machine:
# On server machine
scp client/mic_stream.py user@client-machine:~/
# Or download it via your preferred method
3. List available microphones
python mic_stream.py --list-devices
Example output:
Available audio input devices:
--------------------------------------------------------------------------------
[0] Built-in Microphone
Channels: 2
Sample rate: 44100.0 Hz
[1] USB Microphone
Channels: 1
Sample rate: 48000.0 Hz
--------------------------------------------------------------------------------
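A listing like the one above can be produced from the `sounddevice` package installed in step 1. The sketch below is illustrative only and is not taken from `mic_stream.py`; the helper names `format_input_devices` and `list_input_devices` are hypothetical. It relies on `sounddevice.query_devices()` returning entries with `name`, `max_input_channels`, and `default_samplerate` fields.

```python
def format_input_devices(devices):
    """Render input-capable devices in a layout similar to the listing above.

    `devices` is any iterable of dicts with the keys that
    sounddevice.query_devices() provides for each device.
    """
    lines = []
    for index, dev in enumerate(devices):
        if dev["max_input_channels"] <= 0:
            continue  # skip output-only devices such as speakers
        lines.append(f"[{index}] {dev['name']}")
        lines.append(f"Channels: {dev['max_input_channels']}")
        lines.append(f"Sample rate: {dev['default_samplerate']} Hz")
    return "\n".join(lines)


def list_input_devices():
    """Query the real audio hardware (hypothetical helper, needs PortAudio)."""
    import sounddevice as sd  # imported lazily so the formatter has no hard dependency
    return format_input_devices(sd.query_devices())
```

The index printed in brackets is the position in the full device list, which is the value `sounddevice` expects when a device is selected by number, so it should correspond to what is passed to `--device`.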
4. Start streaming
python mic_stream.py --url ws://SERVER_IP:8766
Replace SERVER_IP with your server's IP address (e.g., ws://192.168.1.100:8766).
Options:
# Use specific microphone device
python mic_stream.py --url ws://192.168.1.100:8766 --device 1
# Change sample rate (if needed)
python mic_stream.py --url ws://192.168.1.100:8766 --sample-rate 16000
# Adjust chunk size for network latency
python mic_stream.py --url ws://192.168.1.100:8766 --chunk-duration 0.2
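To see what `--sample-rate` and `--chunk-duration` mean in practice, the following sketch works out how many audio frames one chunk holds and how many chunks are sent per second. It assumes the client sends one WebSocket message per chunk, which this guide does not state explicitly; the function names are hypothetical.

```python
def frames_per_chunk(sample_rate: int, chunk_duration: float) -> int:
    """Number of audio frames captured in one chunk."""
    return round(sample_rate * chunk_duration)


def chunks_per_second(chunk_duration: float) -> float:
    """How many chunks (assumed: one message each) are produced per second."""
    return 1.0 / chunk_duration


# Compare the chunk durations mentioned in this guide at 16000 Hz.
for duration in (0.05, 0.1, 0.2):
    frames = frames_per_chunk(16000, duration)
    rate = chunks_per_second(duration)
    print(f"{duration} s chunk: {frames} frames, {rate:.0f} chunks per second")
```

Shorter chunks reach the server sooner but produce more messages per second; longer chunks do the opposite, which can help on a network with high per-message overhead.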
Usage Flow
1. Start Server
On the server machine:
cd /home/koko210Serve/parakeet-test
source venv/bin/activate
PYTHONPATH=/home/koko210Serve/parakeet-test python server/display_server.py
You'll see:
================================================================================
ASR Server - Live Transcription Display
================================================================================
Server: ws://0.0.0.0:8766
Sample Rate: 16000 Hz
Model: Parakeet TDT 0.6B V3
================================================================================
Server is running and ready for connections!
Waiting for clients...
2. Connect Client
On the client machine:
python mic_stream.py --url ws://192.168.1.100:8766
You'll see:
Connected to server: ws://192.168.1.100:8766
Recording started. Press Ctrl+C to stop.
3. Speak into Microphone
- Speak naturally into your microphone
- Watch the server terminal for real-time transcriptions
- Progressive updates appear in yellow as you speak
- Final transcriptions appear in green when you pause
4. Stop Streaming
Press Ctrl+C on the client to stop recording and disconnect.
Display Color Coding
On the server display:
- 🟢 GREEN = Final transcription (complete, accurate)
- 🟡 YELLOW = Progressive update (in progress)
- 🔵 BLUE = Connection events
- ⚪ WHITE = Server status messages
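Terminal color coding of this kind is usually done with ANSI escape sequences. The sketch below shows one way the four categories could map to standard ANSI foreground colors; the dictionary keys and the `colorize` helper are hypothetical and are not taken from `display_server.py`.

```python
# Standard ANSI foreground color codes (supported by most terminals).
ANSI_COLORS = {
    "final": "\033[32m",       # green
    "partial": "\033[33m",     # yellow
    "connection": "\033[34m",  # blue
    "status": "\033[37m",      # white
}
ANSI_RESET = "\033[0m"


def colorize(kind: str, text: str) -> str:
    """Wrap text in the color for the given message kind, then reset."""
    return f"{ANSI_COLORS[kind]}{text}{ANSI_RESET}"


print(colorize("partial", "→ Hello this is"))
print(colorize("final", "✓ FINAL: Hello this is a test."))
```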
Example Session
Server Display:
================================================================================
✓ Client connected: 192.168.1.50:45232
================================================================================
[14:23:15] 192.168.1.50:45232
→ Hello this is
[14:23:17] 192.168.1.50:45232
→ Hello this is a test of the remote
[14:23:19] 192.168.1.50:45232
✓ FINAL: Hello this is a test of the remote microphone streaming system.
[14:23:25] 192.168.1.50:45232
→ Can you hear me
[14:23:27] 192.168.1.50:45232
✓ FINAL: Can you hear me clearly?
================================================================================
✗ Client disconnected: 192.168.1.50:45232
================================================================================
Client Display:
Connected to server: ws://192.168.1.100:8766
Recording started. Press Ctrl+C to stop.
Server: Connected to ASR server with live display
[PARTIAL] Hello this is
[PARTIAL] Hello this is a test of the remote
[FINAL] Hello this is a test of the remote microphone streaming system.
[PARTIAL] Can you hear me
[FINAL] Can you hear me clearly?
^C
Stopped by user
Disconnected from server
Client stopped by user
Network Considerations
Bandwidth Usage
- Sample rate: 16000 Hz
- Bit depth: 16-bit (int16)
- Bandwidth: ~32 KB/s per client (16,000 samples/s × 2 bytes per sample, single channel)
- Very low bandwidth - works well over WiFi or LAN
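The bandwidth figure follows directly from the audio format. The short calculation below reproduces it for raw PCM, assuming a single (mono) channel and ignoring WebSocket framing overhead; the function name is hypothetical.

```python
def pcm_bytes_per_second(sample_rate: int, bit_depth: int = 16, channels: int = 1) -> int:
    """Raw PCM data rate in bytes per second (no protocol overhead)."""
    return sample_rate * (bit_depth // 8) * channels


rate = pcm_bytes_per_second(16000)
print(f"16 kHz, 16-bit, mono: {rate} bytes/s ({rate * 8 / 1000:.0f} kbit/s)")

# For comparison: streaming at a device's native 48 kHz would triple the rate.
print(f"48 kHz, 16-bit, mono: {pcm_bytes_per_second(48000)} bytes/s")
```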
Latency
- Progressive updates: Every ~2 seconds
- Final transcription: When audio stops or on demand
- Total latency: ~2-3 seconds (network + processing)
Multiple Clients
The server supports multiple simultaneous clients:
- Each client gets its own session
- Transcriptions are tagged with client IP:port
- No interference between clients
Troubleshooting
Client Can't Connect
Error: [Errno 111] Connection refused
Solution:
- Check server is running
- Verify firewall allows port 8766
- Confirm server IP address is correct
- Test connectivity:
ping SERVER_IP
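Note that `ping` only confirms the host is reachable; it does not show whether port 8766 is open. A minimal sketch of a TCP-level check from the client machine, using only the Python standard library (the `port_open` helper is hypothetical):

```python
import socket


def port_open(host: str, port: int, timeout: float = 3.0) -> bool:
    """Return True if a TCP connection to host:port can be established."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Covers connection refused, timeouts, and unreachable hosts.
        return False


# Example: replace the address with your server's IP.
# print(port_open("192.168.1.100", 8766))
```

If this returns False while `ping` succeeds, the server is likely not running or a firewall is blocking the port.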
No Audio Being Captured
Recording started but no transcriptions appear
Solution:
- Check microphone permissions
- List devices:
python mic_stream.py --list-devices
- Try a different device:
--device N
- Test the microphone in other apps first
Poor Transcription Quality
Solution:
- Move closer to microphone
- Reduce background noise
- Speak clearly and at normal pace
- Check microphone quality/settings
High Latency
Solution:
- Use wired connection instead of WiFi
- Reduce chunk duration:
--chunk-duration 0.05
- Check network latency:
ping SERVER_IP
Security Notes
⚠️ Important: This setup uses WebSocket without encryption (ws://), so audio and transcriptions cross the network in plaintext.
For production use:
- Use WSS (WebSocket Secure) with TLS certificates
- Add authentication (API keys, tokens)
- Restrict firewall rules to specific IP ranges
- Consider using VPN for remote access
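As a starting point for the WSS item, the sketch below builds a certificate-verifying TLS context on the client side and rewrites a `ws://` URL to `wss://`. It is an illustration only: whether `mic_stream.py` and `display_server.py` accept TLS settings is not covered by this guide, and both helper names are hypothetical. With the `websockets` library, such a context can be passed as the `ssl` argument of `websockets.connect`.

```python
import ssl
from typing import Optional


def make_client_tls_context(ca_file: Optional[str] = None) -> ssl.SSLContext:
    """Create a TLS context that verifies the server certificate and hostname.

    Pass ca_file when the server uses a certificate signed by a private CA;
    otherwise the system trust store is used.
    """
    return ssl.create_default_context(cafile=ca_file)


def to_secure_url(url: str) -> str:
    """Rewrite ws:// to wss://, leaving other URLs unchanged."""
    if url.startswith("ws://"):
        return "wss://" + url[len("ws://"):]
    return url


ctx = make_client_tls_context()
url = to_secure_url("ws://192.168.1.100:8766")
# Assumed usage with the websockets library (server must also serve TLS):
#     async with websockets.connect(url, ssl=ctx) as ws: ...
print(url)
```

The server side would need a matching certificate and key, and certificate hostname checks generally require connecting by a name or IP that appears in the certificate.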
Advanced: Auto-start Server
Create a systemd service (Linux):
sudo nano /etc/systemd/system/asr-server.service
[Unit]
Description=ASR WebSocket Server
After=network.target
[Service]
Type=simple
User=YOUR_USERNAME
WorkingDirectory=/home/koko210Serve/parakeet-test
Environment="PYTHONPATH=/home/koko210Serve/parakeet-test"
ExecStart=/home/koko210Serve/parakeet-test/venv/bin/python server/display_server.py
Restart=always
[Install]
WantedBy=multi-user.target
Enable and start:
sudo systemctl enable asr-server
sudo systemctl start asr-server
sudo systemctl status asr-server
Performance Tips
- Server: Use GPU for best performance (~100ms latency)
- Client: Use low chunk duration for responsiveness (0.1s default)
- Network: Wired connection preferred, WiFi works fine
- Audio Quality: 16kHz sample rate is optimal for speech
Summary
✅ Server displays transcriptions in real-time
✅ Client sends audio from remote microphone
✅ Progressive updates show live transcription
✅ Final results when speech pauses
✅ Multiple clients supported
✅ Low bandwidth, low latency
Enjoy your remote ASR streaming system! 🎤 → 🌐 → 🖥️