Great questions! Let me address each:
Synchronization
You're right to be curious - the ±500 Hz tolerance at 4 kSym/s (~12.5% of symbol rate) is better than expected. Here's what's happening:
Current implementation:
- GNU Radio's digital.clock_recovery_mm_ff for symbol timing
- digital.fll_band_edge_cc for coarse frequency correction
- Frame sync via correlation with known preamble (64-symbol Barker-like sequence)
The FLL band edge tracker is doing most of the heavy lifting for frequency offset. It's designed for continuous-phase modulations and works reasonably well with GFSK (which is what I'm actually using, not raw FSK - should have been clearer about that).
But you've identified a weak point: I haven't implemented fine carrier tracking after frame sync. The coarse FLL + preamble correlation is surprisingly robust in simulation, but I'm skeptical it'll hold up in real hardware with drift during the frame. This is on the Q2 2025 roadmap - probably need a pilot tone or decision-directed tracking.
The ±1 kHz failure is likely the FLL's tracking range limit. Beyond that, it loses lock entirely.
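To make the frame-sync step concrete, here is a minimal sketch of preamble correlation in plain Python. It is not the flowgraph's actual code: the 13-symbol Barker sequence stands in for the real 64-symbol Barker-like preamble, and find_preamble, the noise level, and the test signal are made up for illustration.

```python
import random

# Assumption: 13-symbol Barker code as a stand-in for the real 64-symbol preamble.
BARKER13 = [1, 1, 1, 1, 1, -1, -1, 1, 1, -1, 1, -1, 1]

def find_preamble(samples, preamble):
    """Return (offset, score) where the sliding correlation with the preamble peaks."""
    best_offset, best_score = 0, float("-inf")
    for offset in range(len(samples) - len(preamble) + 1):
        # zip() stops at the end of the preamble, so this is one correlation lag
        score = sum(s * p for s, p in zip(samples[offset:], preamble))
        if score > best_score:
            best_offset, best_score = offset, score
    return best_offset, best_score

# Hypothetical test signal: low-level noise with the preamble buried at offset 20.
rng = random.Random(1)
true_offset = 20
samples = [rng.gauss(0.0, 0.2) for _ in range(60)]
for i, p in enumerate(BARKER13):
    samples[true_offset + i] += p

offset, score = find_preamble(samples, BARKER13)
print(offset, round(score, 2))
```

A real receiver would compare the score against a threshold rather than just take the maximum. Any residual frequency offset rotates the phase across the preamble and lowers the peak, which is one reason the coarse FLL has to do its job before correlation.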
LDPC Code Choice
You're exactly right about the trade-off. I'm using custom codes from the DVB-S2/T2 family:
- 4FSK: Rate 3/4, n=1000, k=750 (4FSK needs higher rate for 6 kbps Opus)
- 8FSK: Rate 2/3, n=1125, k=750 (8FSK can afford lower rate for better protection)
These are short by LDPC standards (DVB-S2 uses 64800 bits!). I chose them because:
- Target latency: <80ms total (40ms Opus frame is non-negotiable)
- Belief propagation still converges in ~20-30 iterations with these sizes
- ~2 dB from Shannon limit (not optimal, but acceptable for voice)
Could I do better? Probably with optimized irregular LDPC codes designed specifically for 750-1000 bit blocks, but that's research territory. I borrowed proven codes from DVB to avoid reinventing the wheel.
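As a quick sanity check on the numbers above, the rate and parity overhead follow directly from the (n, k) pairs. This is only arithmetic on the figures quoted in this post; code_rate is a helper name invented for the example.

```python
from fractions import Fraction

# (n, k) pairs quoted above for each modulation
codes = {"4FSK": (1000, 750), "8FSK": (1125, 750)}

def code_rate(n, k):
    """Code rate k/n as an exact fraction."""
    return Fraction(k, n)

for name, (n, k) in codes.items():
    print(name, "rate", code_rate(n, k), "parity bits", n - k)
```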
ChaCha20 and Frame Loss
You've identified a real issue! ChaCha20 is indeed a stream cipher, and losing synchronization is catastrophic.
Current approach:
- Each 40ms frame is encrypted independently
- Frame number used as nonce (increments each frame)
- Key stays constant for the transmission
- Format: ChaCha20(frame_data, key, nonce=frame_number)
How frame loss is handled:
- Receiver knows expected frame sequence (from superframe counter)
- If frame N is lost (FER), receiver knows to skip that nonce
- Frame N+1 arrives → use nonce=N+1 → decrypts correctly
- No keystream reinitialization needed
The trick: Frame numbers are transmitted in a separate authenticated header (not encrypted), protected by its own LDPC code. This header has stronger protection (rate 1/3) than the voice payload (rate 2/3 or 3/4).
Failure mode: If the header is corrupted, the entire frame is discarded. This is why crypto overhead appears as increased FER - it's not the encryption itself, but the additional header that can fail.
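The per-frame nonce idea can be sketched in pure Python. This is an illustration, not the code running in the flowgraph: the ChaCha20 core follows the RFC 8439 layout (32-byte key, 12-byte nonce, 32-bit block counter), while frame_nonce, crypt_frame, the little-endian nonce encoding, and the starting counter of 0 are all assumptions made for the example.

```python
import struct

MASK32 = 0xFFFFFFFF

def _rotl(v, n):
    return ((v << n) & MASK32) | (v >> (32 - n))

def _quarter_round(s, a, b, c, d):
    s[a] = (s[a] + s[b]) & MASK32; s[d] = _rotl(s[d] ^ s[a], 16)
    s[c] = (s[c] + s[d]) & MASK32; s[b] = _rotl(s[b] ^ s[c], 12)
    s[a] = (s[a] + s[b]) & MASK32; s[d] = _rotl(s[d] ^ s[a], 8)
    s[c] = (s[c] + s[d]) & MASK32; s[b] = _rotl(s[b] ^ s[c], 7)

def chacha20_block(key, counter, nonce):
    """One 64-byte keystream block (RFC 8439 state layout)."""
    init = [0x61707865, 0x3320646E, 0x79622D32, 0x6B206574]
    init += struct.unpack("<8I", key)
    init += [counter & MASK32]
    init += struct.unpack("<3I", nonce)
    s = list(init)
    for _ in range(10):  # 10 double rounds = 20 rounds
        _quarter_round(s, 0, 4, 8, 12); _quarter_round(s, 1, 5, 9, 13)
        _quarter_round(s, 2, 6, 10, 14); _quarter_round(s, 3, 7, 11, 15)
        _quarter_round(s, 0, 5, 10, 15); _quarter_round(s, 1, 6, 11, 12)
        _quarter_round(s, 2, 7, 8, 13); _quarter_round(s, 3, 4, 9, 14)
    return struct.pack("<16I", *[(x + y) & MASK32 for x, y in zip(s, init)])

def chacha20_xor(key, nonce, data, initial_counter=0):
    """XOR data with the keystream; the same call encrypts and decrypts."""
    out = bytearray()
    for i in range(0, len(data), 64):
        block = chacha20_block(key, initial_counter + i // 64, nonce)
        out += bytes(a ^ b for a, b in zip(data[i:i + 64], block))
    return bytes(out)

def frame_nonce(frame_number):
    # Assumption: frame number encoded as a 96-bit little-endian nonce.
    return frame_number.to_bytes(12, "little")

def crypt_frame(key, frame_number, payload):
    return chacha20_xor(key, frame_nonce(frame_number), payload)

key = bytes(range(32))  # demo key only, never use a fixed key on air
frames = {n: crypt_frame(key, n, b"voice frame %d" % n) for n in range(3)}
del frames[1]  # frame 1 lost on air
for n, ciphertext in frames.items():
    print(n, crypt_frame(key, n, ciphertext))  # frames 0 and 2 still decrypt
```

Because each frame derives its keystream from its own number, losing frame 1 has no effect on frame 2. The flip side is that a frame number must never repeat under the same key, since reusing a (key, nonce) pair exposes the XOR of the two plaintexts.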
Alternative considered: Your multi-instance round-robin idea is clever! I didn't implement it because:
- Added complexity (state management)
- Frame numbers solve it more simply
- Voice can tolerate 5% loss (Opus error concealment)
For data applications (where 5% loss is unacceptable), your approach might be necessary.
Soft-Decision LDPC
You're correct - I'm NOT using soft-decision decoding yet!
Current implementation uses gr-fec's ldpc_decoder in hard-decision mode:
- FSK demod → hard bits (0/1) → LDPC decoder
- This creates the 4-5% FER floor
Why not soft-decision?
- Honestly: Implementation complexity
- gr-fec's sum-product decoder exists, but I couldn't get it working reliably in time for Phase 3 testing
- Hard-decision was "good enough" to validate the protocol design
Roadmap (2025?): Implement soft-decision properly:
- FSK demod → log-likelihood ratios (LLRs) → sum-product algorithm
- Expected improvement: 1-2 dB waterfall, FER floor <0.1%
- This should bring me closer to theoretical LDPC performance
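For a feel of what "FSK demod → LLRs" could look like, here is a rough sketch of max-log bit LLRs computed from per-tone detector energies. It is an approximation under stated assumptions, not the planned implementation: mfsk_bit_llrs, the natural-binary tone mapping, and the noise_scale parameter are hypothetical, and the exact noncoherent FSK metric involves Bessel functions rather than a plain energy difference.

```python
def mfsk_bit_llrs(tone_energies, noise_scale=1.0):
    """Max-log LLRs for one M-FSK symbol, MSB first.

    tone_energies[i] is the detector energy for tone i (M must be a power of 2).
    Assumes natural binary mapping of tone index to bits.
    Positive LLR means bit 0 is more likely.
    """
    m = len(tone_energies)
    bits_per_symbol = m.bit_length() - 1
    assert 1 << bits_per_symbol == m, "M must be a power of two"
    llrs = []
    for bit in reversed(range(bits_per_symbol)):
        # Strongest tone among candidates where this bit is 0, and where it is 1
        best0 = max(e for i, e in enumerate(tone_energies) if not (i >> bit) & 1)
        best1 = max(e for i, e in enumerate(tone_energies) if (i >> bit) & 1)
        llrs.append((best0 - best1) / noise_scale)
    return llrs

# Hypothetical 8FSK detector output: tone 5 (binary 101) is strongest,
# tone 6 (binary 110) is the runner-up.
energies = [0.1, 0.2, 0.1, 0.3, 0.2, 4.0, 0.9, 0.1]
llrs = mfsk_bit_llrs(energies)
hard_bits = [0 if value > 0 else 1 for value in llrs]
print(llrs, hard_bits)
```

The hard decision throws away the magnitudes. Here the two low-order bits get smaller magnitudes than the MSB because the runner-up tone disagrees with them, and that reliability information is exactly what a sum-product decoder can exploit.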
You're right that I'm leaving performance on the table. Hard-decision was a pragmatic choice for "get it working first."
FSK vs Other Modulations
You're absolutely right - FSK is suboptimal for power efficiency!
Why I chose GFSK:
- Simplicity: Easy to implement, debug, and test
- Constant envelope: Good for non-linear amplifiers (typical ham radios)
- Narrow bandwidth: 9-12 kHz fits in 12.5 kHz channels
- Phase continuity: GFSK (not raw FSK) maintains phase across symbols
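The constant-envelope and phase-continuity points can be checked with a toy modulator. This sketch is binary GFSK for simplicity (the actual modes are 4- and 8-level), and gaussian_taps, gfsk_modulate, the samples-per-symbol value, and the modulation index of 0.5 are illustrative assumptions rather than the real parameters.

```python
import cmath
import math

def gaussian_taps(bt, sps, span=4):
    """Gaussian pulse-shaping taps over `span` symbols, normalized to unit sum."""
    n = span * sps
    taps = []
    for i in range(n + 1):
        t = (i - n / 2) / sps  # time in symbol periods
        taps.append(math.exp(-2 * (math.pi * bt * t) ** 2 / math.log(2)))
    total = sum(taps)
    return [x / total for x in taps]

def gfsk_modulate(bits, bt=0.5, sps=8, mod_index=0.5):
    """Binary GFSK: NRZ -> Gaussian filter -> phase integrator -> exp(j*phase)."""
    nrz = [(1.0 if b else -1.0) for b in bits for _ in range(sps)]
    taps = gaussian_taps(bt, sps)
    # Full linear convolution of the NRZ waveform with the Gaussian taps
    shaped = [
        sum(taps[k] * nrz[i - k] for k in range(len(taps)) if 0 <= i - k < len(nrz))
        for i in range(len(nrz) + len(taps) - 1)
    ]
    phase, out = 0.0, []
    for f in shaped:
        phase += math.pi * mod_index * f / sps  # phase step per sample
        out.append(cmath.exp(1j * phase))
    return out

signal = gfsk_modulate([1, 0, 1, 1, 0, 0, 1, 0])
envelope = [abs(s) for s in signal]
print(min(envelope), max(envelope))
```

Since the information lives only in the integrated phase, every sample has magnitude 1 regardless of the data, which is why a saturated Class C stage does not distort it.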
Better alternatives:
- QPSK/OQPSK: 2 bits/symbol, better power efficiency, needs linear amp
- APSK: Even better, but more complex equalization
- GMSK: Similar to GFSK, proven (used in GSM)
Why not QPSK?
- Requires linear amplifier (not always available in ham radios)
- More sensitive to phase noise
- More complex synchronization
Future consideration: A "high-performance mode" with QPSK for fixed stations with linear amps could be interesting. Would gain 2-3 dB over GFSK.
But for initial deployment targeting typical ham gear (Class C finals), constant-envelope modulation seemed safer.
Target Frequencies
Primary target: VHF/UHF (144-148 MHz, 430-440 MHz)
- Where digital voice is most active
- Hardware available (MCM-iMX93 with SX1255 covers 400-930 MHz)
Possible extension: 6m, 2m, 70cm, 23cm
- Protocol is frequency-agnostic
- RF hardware is the limiting factor
Back-of-Envelope Check
Your math is correct:
- 8FSK: 8 kbps audio → 12 kbps coded → 4 kSym/s
- Symbol rate: 4000 symbols/second
- ±500 Hz tolerance: 12.5% of symbol rate
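The same check in a few lines of Python, using exact fractions to avoid rounding noise. The variable names are just for this example; the numbers are the ones quoted above.

```python
from fractions import Fraction

audio_bps = 8000            # Opus bitrate in 8FSK mode
code_rate = Fraction(2, 3)  # LDPC rate for 8FSK
bits_per_symbol = 3         # log2(8)
freq_tolerance_hz = 500

coded_bps = audio_bps / code_rate
symbol_rate = coded_bps / bits_per_symbol
tolerance_fraction = Fraction(freq_tolerance_hz) / symbol_rate

print(coded_bps, symbol_rate, float(tolerance_fraction))  # prints: 12000 4000 0.125
```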
This IS suspiciously good! You're right to question it.
Possible explanations:
- GFSK bandwidth: I'm using BT=0.5, which spreads the spectrum more than minimum-shift FSK. This might make the band-edge FLL more robust.
- Preamble length: 64 symbols gives the FLL time to converge before frame sync.
- Simulation optimism: GNU Radio's FLL might be more ideal than real hardware. Hardware testing will reveal the truth.
Action item: I should characterize the FLL's actual tracking range experimentally, not just trust simulation. This is hardware validation work.
Summary
You've identified several areas for improvement:
- Synchronization: Need fine tracking after frame sync
- LDPC codes: Could be optimized for block size, but DVB codes work
- ChaCha20: Solved with frame numbers, but header overhead creates FER
- Soft-decision: It's tricky to implement, so whether to do it comes down to implementation cost versus the expected gain
- Modulation: GFSK is suboptimal but practical for ham gear
- Frequency sync: Need to validate ±500 Hz claim with real hardware
Really appreciate the technical depth here - these are exactly the questions that improve the design. The honest answer is: simulation shows promise, but hardware will reveal where the weaknesses are. That's why on-air testing is critical.
I'm waiting for the LinHT, which should be released sometime next year.
73