koko210Serve 362108f4b0 Decided on Parakeet ONNX Runtime. Works pretty great. Realtime voice chat possible now. UX lacking.

2026-01-19 00:29:44 +02:00

Remote Microphone Streaming Setup

This guide shows how to use the ASR system with a client on one machine streaming audio to a server on another machine.

Architecture

┌─────────────────┐                    ┌─────────────────┐
│  Client Machine │                    │  Server Machine │
│                 │                    │                 │
│  🎤 Microphone  │  ───WebSocket───▶  │  🖥️  Display    │
│                 │      (Audio)       │                 │
│  client/        │                    │  server/        │
│  mic_stream.py  │                    │  display_server │
└─────────────────┘                    └─────────────────┘

Server Setup (Machine with GPU)

1. Start the server with live display

cd /home/koko210Serve/parakeet-test
source venv/bin/activate
PYTHONPATH=/home/koko210Serve/parakeet-test python server/display_server.py

Options:

python server/display_server.py --host 0.0.0.0 --port 8766

The server will:

✅ Bind to all network interfaces (0.0.0.0)
✅ Display transcriptions in real-time with color coding
✅ Show progressive updates as audio streams in
✅ Highlight final transcriptions when complete

2. Configure firewall (if needed)

Allow incoming connections on port 8766:

# Ubuntu/Debian
sudo ufw allow 8766/tcp

# CentOS/RHEL
sudo firewall-cmd --permanent --add-port=8766/tcp
sudo firewall-cmd --reload

3. Get the server's IP address

# Find your server's IP address
ip addr show | grep "inet " | grep -v 127.0.0.1

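Each matching line starts with inet followed by the address and prefix length, e.g. inet 192.168.1.100/24. In that case the server's IP address is 192.168.1.100.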
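
If the server has several interfaces, the ip addr output can list more than one candidate. As a rough cross-check, the sketch below asks the operating system which IPv4 address it would use for outbound traffic. The function names and the probe address (192.0.2.1, a documentation-only address) are illustrative choices, not part of the ASR project.

```python
import ipaddress
import socket


def primary_ipv4(probe_host: str = "192.0.2.1", probe_port: int = 80) -> str:
    """Return the IPv4 address the OS would use for outbound traffic.

    Connecting a UDP socket sends no packets; it only asks the kernel to
    pick a route, so the probe address never has to be reachable.
    Falls back to 127.0.0.1 when no route exists (e.g. no network).
    """
    with socket.socket(socket.AF_INET, socket.SOCK_DGRAM) as sock:
        try:
            sock.connect((probe_host, probe_port))
            return sock.getsockname()[0]
        except OSError:
            return "127.0.0.1"


def is_private_address(address: str) -> bool:
    """True for addresses in private ranges such as 192.168.0.0/16."""
    return ipaddress.ip_address(address).is_private


if __name__ == "__main__":
    print(primary_ipv4())
```

A private address (such as 192.168.x.x) is only reachable from the same LAN, so clients outside that network would need a VPN or port forwarding.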

Client Setup (Remote Machine)

1. Install dependencies on client machine

Create a minimal Python environment:

# Create virtual environment
python3 -m venv asr-client
source asr-client/bin/activate

# Install only client dependencies
pip install websockets sounddevice numpy

2. Copy the client script

Copy client/mic_stream.py to your client machine:

# On server machine
scp client/mic_stream.py user@client-machine:~/

# Or download it via your preferred method

3. List available microphones

python mic_stream.py --list-devices

Example output:

Available audio input devices:
--------------------------------------------------------------------------------
[0] Built-in Microphone
    Channels: 2
    Sample rate: 44100.0 Hz
[1] USB Microphone
    Channels: 1
    Sample rate: 48000.0 Hz
--------------------------------------------------------------------------------
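
A listing like the one above can be produced from the device records that sounddevice.query_devices() returns. The following is a minimal sketch, not the actual code of mic_stream.py; the function names are hypothetical, and only the dictionary keys name, max_input_channels, and default_samplerate are taken from the sounddevice library.

```python
def format_input_devices(devices) -> str:
    """Render input-capable devices in the layout shown above.

    `devices` is any iterable of dicts carrying the keys that
    sounddevice.query_devices() returns: 'name', 'max_input_channels',
    and 'default_samplerate'.
    """
    rule = "-" * 80
    lines = ["Available audio input devices:", rule]
    for index, dev in enumerate(devices):
        if dev["max_input_channels"] < 1:
            continue  # output-only device, not usable as a microphone
        lines.append(f"[{index}] {dev['name']}")
        lines.append(f"    Channels: {dev['max_input_channels']}")
        lines.append(f"    Sample rate: {dev['default_samplerate']} Hz")
    lines.append(rule)
    return "\n".join(lines)


def list_input_devices() -> str:
    """Query the real audio hardware (requires the sounddevice package)."""
    import sounddevice as sd  # imported lazily so the formatter has no dependencies

    return format_input_devices(sd.query_devices())
```

The index printed in brackets is the position in the full device list, so it stays valid as the value for --device even when output-only devices are skipped.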

4. Start streaming

python mic_stream.py --url ws://SERVER_IP:8766

Replace SERVER_IP with your server's IP address (e.g., ws://192.168.1.100:8766)

Options:

# Use specific microphone device
python mic_stream.py --url ws://192.168.1.100:8766 --device 1

# Change sample rate (if needed)
python mic_stream.py --url ws://192.168.1.100:8766 --sample-rate 16000

# Adjust chunk size for network latency
python mic_stream.py --url ws://192.168.1.100:8766 --chunk-duration 0.2
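
To show how the options above fit together, here is a minimal sketch of a streaming client. It is not the code of client/mic_stream.py. It assumes the server accepts raw mono int16 PCM as binary WebSocket messages, which this guide does not confirm, and the names frames_per_chunk and stream_microphone are hypothetical.

```python
import asyncio


def frames_per_chunk(sample_rate: int, chunk_duration: float) -> int:
    """Number of audio frames in one chunk, e.g. 16000 Hz * 0.1 s = 1600."""
    return round(sample_rate * chunk_duration)


async def stream_microphone(url: str, sample_rate: int = 16000,
                            chunk_duration: float = 0.1, device=None) -> None:
    """Capture mono int16 audio and send each chunk as a binary message.

    ASSUMPTION: the server accepts raw PCM bytes; the real protocol of
    client/mic_stream.py is not documented in this guide.
    """
    # Third-party packages from the install step, imported lazily.
    import sounddevice as sd
    import websockets

    loop = asyncio.get_running_loop()
    chunks: asyncio.Queue = asyncio.Queue()

    def on_audio(indata, frames, time_info, status):
        # Runs on the audio thread: copy the buffer and hand it to the event loop.
        loop.call_soon_threadsafe(chunks.put_nowait, indata.tobytes())

    async with websockets.connect(url) as ws:
        with sd.InputStream(samplerate=sample_rate, channels=1, dtype="int16",
                            blocksize=frames_per_chunk(sample_rate, chunk_duration),
                            device=device, callback=on_audio):
            while True:  # runs until the task is cancelled (Ctrl+C)
                await ws.send(await chunks.get())
```

A call such as asyncio.run(stream_microphone("ws://192.168.1.100:8766", device=1)) would correspond to the --url and --device options. A larger chunk duration means fewer, bigger messages; a smaller one means more frequent, smaller messages.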

Usage Flow

1. Start Server

On the server machine:

cd /home/koko210Serve/parakeet-test
source venv/bin/activate
PYTHONPATH=/home/koko210Serve/parakeet-test python server/display_server.py

You'll see:

================================================================================
ASR Server - Live Transcription Display
================================================================================
Server: ws://0.0.0.0:8766
Sample Rate: 16000 Hz
Model: Parakeet TDT 0.6B V3
================================================================================

Server is running and ready for connections!
Waiting for clients...

2. Connect Client

On the client machine:

python mic_stream.py --url ws://192.168.1.100:8766

You'll see:

Connected to server: ws://192.168.1.100:8766
Recording started. Press Ctrl+C to stop.

3. Speak into Microphone

Speak naturally into your microphone
Watch the server terminal for real-time transcriptions
Progressive updates appear in yellow as you speak
Final transcriptions appear in green when you pause

4. Stop Streaming

Press Ctrl+C on the client to stop recording and disconnect.

Display Color Coding

On the server display:

🟢 GREEN = Final transcription (complete, accurate)
🟡 YELLOW = Progressive update (in progress)
🔵 BLUE = Connection events
⚪ WHITE = Server status messages
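
Terminal colors like these are usually produced with ANSI escape sequences. The sketch below maps the four message kinds to standard bright ANSI colors; how display_server.py actually colors its output is an assumption, and the names ANSI_COLORS and colorize are hypothetical.

```python
# Standard ANSI "bright" foreground colour codes.
ANSI_COLORS = {
    "final": "\033[92m",       # green
    "partial": "\033[93m",     # yellow
    "connection": "\033[94m",  # blue
    "status": "\033[97m",      # white
}
ANSI_RESET = "\033[0m"


def colorize(kind: str, text: str) -> str:
    """Wrap text in the ANSI color for its message kind; unknown kinds stay plain."""
    color = ANSI_COLORS.get(kind)
    return f"{color}{text}{ANSI_RESET}" if color else text


if __name__ == "__main__":
    print(colorize("partial", "  → Hello this is"))
    print(colorize("final", "  ✓ FINAL: Hello this is a test."))
```

The reset sequence at the end of each string keeps the color from leaking into the next line of terminal output.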

Example Session

Server Display:

================================================================================
✓ Client connected: 192.168.1.50:45232
================================================================================

[14:23:15] 192.168.1.50:45232
  → Hello this is

[14:23:17] 192.168.1.50:45232
  → Hello this is a test of the remote

[14:23:19] 192.168.1.50:45232
  ✓ FINAL: Hello this is a test of the remote microphone streaming system.

[14:23:25] 192.168.1.50:45232
  → Can you hear me

[14:23:27] 192.168.1.50:45232
  ✓ FINAL: Can you hear me clearly?

================================================================================
✗ Client disconnected: 192.168.1.50:45232
================================================================================

Client Display:

Connected to server: ws://192.168.1.100:8766
Recording started. Press Ctrl+C to stop.

Server: Connected to ASR server with live display
[PARTIAL] Hello this is
[PARTIAL] Hello this is a test of the remote
[FINAL] Hello this is a test of the remote microphone streaming system.
[PARTIAL] Can you hear me
[FINAL] Can you hear me clearly?

^C
Stopped by user
Disconnected from server
Client stopped by user
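
The client lines above suggest that each server message carries a kind (partial, final, or informational) and a text. As an illustration only, the sketch below assumes messages are JSON objects with "type" and "text" fields. That schema is hypothetical, so check mic_stream.py for the real format before relying on it.

```python
import json


def format_server_message(raw: str) -> str:
    """Turn one server message into a console line like those shown above.

    ASSUMPTION: messages are JSON objects with "type" and "text" fields
    (hypothetical schema, not confirmed by this guide).
    """
    try:
        message = json.loads(raw)
    except json.JSONDecodeError:
        return f"Server: {raw}"  # not JSON: show the text unchanged
    if not isinstance(message, dict):
        return f"Server: {raw}"
    kind = str(message.get("type", "")).lower()
    text = message.get("text", "")
    if kind in ("partial", "final"):
        return f"[{kind.upper()}] {text}"
    return f"Server: {text}"
```

Anything that is not a partial or final result falls through to the "Server:" prefix, which matches how the connection greeting is displayed.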

Network Considerations

Bandwidth Usage

Sample rate: 16000 Hz
Bit depth: 16-bit (int16)
Bandwidth: ~32 KB/s per client (16000 samples/s × 2 bytes per sample, single channel)
Very low bandwidth - works well over WiFi or LAN
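
The bandwidth figure follows directly from the sample rate and bit depth. The short calculation below reproduces it, assuming a single (mono) channel and counting only the raw audio payload, not WebSocket or TCP overhead. The function names are illustrative.

```python
def audio_bandwidth(sample_rate_hz: int = 16000, bytes_per_sample: int = 2,
                    channels: int = 1) -> int:
    """Raw PCM payload in bytes per second (no WebSocket or TCP overhead)."""
    return sample_rate_hz * bytes_per_sample * channels


def chunk_size_bytes(chunk_duration_s: float, **kwargs) -> int:
    """Payload of a single chunk, e.g. 0.1 s of 16 kHz mono int16 audio."""
    return round(audio_bandwidth(**kwargs) * chunk_duration_s)


print(audio_bandwidth())       # 32000 bytes/s
print(audio_bandwidth() * 8)   # 256000 bits/s
print(chunk_size_bytes(0.1))   # 3200 bytes per 0.1 s chunk
```

At 256 kbit/s per client, even a modest WiFi link leaves plenty of headroom for several simultaneous streams.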

Latency

Progressive updates: Every ~2 seconds
Final transcription: When audio stops or on demand
Total latency: ~2-3 seconds (network + processing)
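
One way to get a progressive update roughly every 2 seconds is to count incoming audio bytes and trigger a transcription pass each time another 2 seconds of audio has arrived. The sketch below illustrates that idea; it is an assumption about how such scheduling could work, not a description of display_server.py, and the class name ProgressiveBuffer is hypothetical.

```python
class ProgressiveBuffer:
    """Accumulates PCM bytes and signals when a progressive update is due."""

    def __init__(self, sample_rate: int = 16000, bytes_per_sample: int = 2,
                 update_interval_s: float = 2.0) -> None:
        self.bytes_per_second = sample_rate * bytes_per_sample
        self.update_every = int(self.bytes_per_second * update_interval_s)
        self.audio = bytearray()
        self._last_update_len = 0

    def add(self, chunk: bytes) -> bool:
        """Append a chunk; return True when enough new audio has arrived."""
        self.audio.extend(chunk)
        if len(self.audio) - self._last_update_len >= self.update_every:
            self._last_update_len = len(self.audio)
            return True
        return False

    def finalize(self) -> bytes:
        """Return everything buffered for the final pass and reset."""
        data = bytes(self.audio)
        self.audio.clear()
        self._last_update_len = 0
        return data
```

With 0.1 s chunks, add() returns True on every 20th chunk. The update interval therefore sets a floor on how quickly partial text can appear, independent of network speed.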

Multiple Clients

The server supports multiple simultaneous clients:

Each client gets its own session
Transcriptions are tagged with client IP:port
No interference between clients
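
Keeping clients separate usually comes down to holding one state object per connection, keyed by the client's address. The following sketch shows that pattern; the class names are hypothetical and the real server may organize its sessions differently.

```python
from dataclasses import dataclass, field
from typing import Dict, Optional, Tuple


@dataclass
class ClientSession:
    """Per-connection state; nothing here is shared between clients."""
    ip: str
    port: int
    audio: bytearray = field(default_factory=bytearray)

    @property
    def tag(self) -> str:
        """Label used to mark transcriptions, e.g. 192.168.1.50:45232."""
        return f"{self.ip}:{self.port}"


class SessionRegistry:
    """Tracks one ClientSession per connected (ip, port) pair."""

    def __init__(self) -> None:
        self._sessions: Dict[Tuple[str, int], ClientSession] = {}

    def open(self, ip: str, port: int) -> ClientSession:
        session = ClientSession(ip, port)
        self._sessions[(ip, port)] = session
        return session

    def close(self, ip: str, port: int) -> Optional[ClientSession]:
        return self._sessions.pop((ip, port), None)

    def __len__(self) -> int:
        return len(self._sessions)
```

Because each session owns its own audio buffer, audio from one client can never end up in another client's transcription.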

Troubleshooting

Client Can't Connect

Error: [Errno 111] Connection refused

Solution:

Check server is running
Verify firewall allows port 8766
Confirm server IP address is correct
Test connectivity: ping SERVER_IP
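
Note that ping only shows that the host answers ICMP; it does not show whether port 8766 accepts TCP connections. A small check like the sketch below, run from the client machine, tests the port itself. The function name port_is_open is illustrative.

```python
import socket


def port_is_open(host: str, port: int = 8766, timeout: float = 3.0) -> bool:
    """Return True if a TCP connection to host:port succeeds within timeout."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Covers connection refused, timeouts, and unreachable hosts.
        return False


if __name__ == "__main__":
    print(port_is_open("192.168.1.100"))
```

A False result while ping succeeds usually points to the server not running, a firewall blocking the port, or the server listening on a different interface.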

No Audio Being Captured

Recording started but no transcriptions appear

Solution:

Check microphone permissions
List devices: python mic_stream.py --list-devices
Try different device: --device N
Test microphone in other apps first

Poor Transcription Quality

Solution:

Move closer to microphone
Reduce background noise
Speak clearly and at normal pace
Check microphone quality/settings

High Latency

Solution:

Use wired connection instead of WiFi
Reduce chunk duration: --chunk-duration 0.05
Check network latency: ping SERVER_IP

Security Notes

⚠️ Important: This setup uses WebSocket without encryption (ws://), so audio and transcriptions cross the network in plaintext.

For production use:

Use WSS (WebSocket Secure) with TLS certificates
Add authentication (API keys, tokens)
Restrict firewall rules to specific IP ranges
Consider using VPN for remote access
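
As a starting point for the first two items, the sketch below builds a certificate-verifying TLS context for a wss:// client and compares an API token in constant time. How the token is transmitted (header, query parameter, first message) is left open, since this guide does not define an authentication scheme; both function names are hypothetical.

```python
import hmac
import ssl


def client_tls_context() -> ssl.SSLContext:
    """TLS context for wss:// URLs: verifies the certificate and hostname."""
    return ssl.create_default_context()


def token_is_valid(presented: str, expected: str) -> bool:
    """Compare tokens in constant time to avoid leaking matches via timing."""
    return hmac.compare_digest(presented.encode(), expected.encode())
```

On the client side, the context can be passed to the websockets library as websockets.connect("wss://...", ssl=client_tls_context()). The server needs its own context loaded with a certificate and private key.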

Advanced: Auto-start Server

Create a systemd service (Linux):

sudo nano /etc/systemd/system/asr-server.service

[Unit]
Description=ASR WebSocket Server
After=network.target

[Service]
Type=simple
User=YOUR_USERNAME
WorkingDirectory=/home/koko210Serve/parakeet-test
Environment="PYTHONPATH=/home/koko210Serve/parakeet-test"
ExecStart=/home/koko210Serve/parakeet-test/venv/bin/python server/display_server.py
Restart=always

[Install]
WantedBy=multi-user.target

Enable and start:

# Reload systemd so it picks up the new unit file
sudo systemctl daemon-reload
sudo systemctl enable asr-server
sudo systemctl start asr-server
sudo systemctl status asr-server

Performance Tips

Server: Use a GPU for best performance (~100ms processing latency)
Client: Use low chunk duration for responsiveness (0.1s default)
Network: Wired connection preferred, WiFi works fine
Audio Quality: 16kHz sample rate is optimal for speech

Summary

✅ Server displays transcriptions in real-time
✅ Client sends audio from remote microphone
✅ Progressive updates show live transcription
✅ Final results when speech pauses
✅ Multiple clients supported
✅ Low bandwidth, low latency

Enjoy your remote ASR streaming system! 🎤 → 🌐 → 🖥️
