# P22 GainNEON — Audit

**Status:** finding raised; not yet ported.
**Auditor:** Atlas
**Date:** 2026-05-29

---

## Lithos source (`pfuncs/p22-gainneon.ls`)

```
\ p22-gainneon.ls -- Gain NEON (4v → mono)
\ Broadcast gain to 4 voices, multiply, fold to mono.
gainneon ⇌ ◉ ⊛ Σ Σ Σ Σ
```

## Hand-asm (`sixth/qv/lib/p-functions.fs`, `emit-p22`)

6 instructions, 24 bytes.

```
LDR   S0, [X9, #0]              ; load gain from context (+0)
DUP   V0.4S, V0.S[0]            ; broadcast gain to 4 NEON lanes
FMUL  V10.4S, V10.4S, V0.4S     ; signal × gain in all 4 lanes
FADDP V11.4S, V10.4S, V10.4S    ; pairwise sum (4 → 2)
FADDP V10.4S, V11.4S, V11.4S    ; pairwise sum (2 → 1) = mono in S10
RET
```

## Math

Per the build-script metadata: `V10*gain; faddp×2 → S10`.

`out = Σᵢ (in[i] · gain)` for i in {0,1,2,3}: the sum of the four lanes after each is scaled by gain.

Equivalently `out = gain · Σᵢ in[i]`, since gain is constant across lanes. The two forms are algebraically equal, but in f32 they can differ in the last bits because rounding happens at different points. The hand-asm computes the first form, grouped as `(in[0]·gain + in[1]·gain) + (in[2]·gain + in[3]·gain)`.

## Register/state trace

- **Params (constant per buffer):** `S0` ← context[+0] = gain (f32)
- **Signal in:** `V10.4S` = 4-voice signal
- **Signal out:** `S10` = mono sum. Because both `FADDP`s use the same register for both source operands, all four lanes of `V10.4S` hold the same sum afterwards, not unspecified data; convention says scalar consumers read lane 0.
- **Persistent state across callbacks:** none (P22 is stateless)
- **Scratch:** `V0.4S` (broadcast gain), `V11.4S` (intermediate sum)

## Glyph chain → DSP font opcodes

From `lithos/core/targets/arm64/arm64-dsp-font.s` and `lithos.targets.language`:

| Glyph | Opcode | ARM64 instruction |
|-------|--------|-------------------|
| `◉` | 0xA5 | `DUP V0.4S, V0.S[0]` |
| `⊛` | 0xB0 | `FMUL V10.4S, V10.4S, V0.4S` |
| `ΣΣ` (ngram) | 0xA4 | `FADDP V10.4S, ...` |
| `Σ` (alone) | — | sum reduction (primitive); semantics differ from `ΣΣ` |

## Finding

**The source chain `◉ ⊛ Σ Σ Σ Σ` does not match the hand-asm.**

`lithos.core.language` law L4/L5:

> *"ngrams, adjacent opcodes fuse into single blobs; spaceless"*

Four `Σ` separated by spaces are four scalar `Σ` primitives, not two `ΣΣ` ngrams.
The hand-asm emits exactly two `FADDP`s (4-lane → 2-lane → 1-lane reduction). The chain should compose to two `ΣΣ` blobs and an implicit `LDR S0` for the gain load.

### Corrected `.ls` (candidate)

```
gainneon ⇌ ◉ ⊛ ΣΣ ΣΣ
```

Or, if the font allows full fusion:

```
gainneon ⇌ ◉ ⊛ ΣΣΣΣ
```

The `LDR S0, [X9, #0]` for the gain parameter is presumed handled by the host prologue / parameter-loading convention (S0–S9 are constant per buffer, loaded before the chain runs). If not, the chain needs a `→` (load) prefix.

## Font-table gap

None obvious for the corrected chain: `◉`, `⊛`, and `ΣΣ` are all in `lithos.targets.language` as documented NEON ngrams. **The blocker is the source, not the font.**

Need to verify:

1. That the Lithos compiler with the DSP-NEON font emits these opcodes correctly when fed `◉ ⊛ ΣΣ ΣΣ`.
2. That the resulting blob is byte-identical to the hand-asm (modulo the parameter-load convention).
3. That `Σ` (scalar) and `ΣΣ` (NEON ngram) are correctly disambiguated by the parser.

## Stress / sonic notes (Lyra's column)

P22 GainNEON is stateless and reduces 4-voice polyphony to mono. Sonic risks at port:

- **Denormal handling on FMUL:** if any voice goes denormal and `FMUL` doesn't flush, `FADDP` propagates the cost. ARM64 NEON has FZ (flush-to-zero) in FPCR; if Lithos-emitted code doesn't set FPCR the same way the Sixth blob does, sustained low-amplitude signals can stall.
- **Mono fold tail:** after the second `FADDP`, lanes 1–3 of `V10.4S` are not stale. Since each `FADDP` takes the same register for both sources, every lane ends up holding the mono sum. Whether downstream P-functions read only lane 0 (`S10`) or accidentally consume all of `V10.4S` still matters: a vector consumer sees the mono value four times, and folding it again would give 4× the level.
- **Gain ramp:** if gain is changed mid-buffer, the broadcast `◉` happens once at the chain start. Smoothing is host-side, not P22's concern.

## Next step

1. Flux: confirm math read is `out = gain · Σᵢ in[i]` and that two `ΣΣ` is the right composition.
2. Atlas: try compiling `◉ ⊛ ΣΣ ΣΣ` through the Lithos compiler with the DSP font and diff the output against the 6-instruction hand-asm.
3. If byte-identical: replace the source, route P22 through the `.ls` pipeline in `build-factory.py`, mark **Done — compiles from `.ls`** in `NEON-PORT.md`.
4. If diverges: capture the diff in a follow-up section and route to the font-table fix or back to the source.

## Verification status

- [x] Source `.ls` read.
- [x] Hand-asm read and decoded.
- [x] Math recovered.
- [x] Register trace.
- [x] Glyph-chain analysis vs hand-asm.
- [ ] Compile through Lithos compiler: pending tooling availability.
- [ ] Byte-identical diff: pending compile.
- [ ] Sonic A/B: pending Lyra.
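
The math and register-trace items ticked above can be cross-checked with a small lane-level model of the six-instruction hand-asm. This is a minimal Python sketch, not part of the Sixth or Lithos toolchains. The names `f32`, `faddp`, `p22_model` and `p22_factored` are hypothetical, and the model assumes round-to-nearest f32 arithmetic, no flush-to-zero, and inputs in the normal f32 range.

```python
import struct


def f32(x):
    """Round a Python float to the nearest IEEE-754 binary32 value."""
    return struct.unpack("<f", struct.pack("<f", x))[0]


def faddp(n, m):
    """Model FADDP Vd.4S, Vn.4S, Vm.4S.

    The low two result lanes are the adjacent-pair sums of n,
    the high two are the adjacent-pair sums of m.
    """
    return [f32(n[0] + n[1]), f32(n[2] + n[3]),
            f32(m[0] + m[1]), f32(m[2] + m[3])]


def p22_model(v10, gain):
    """Lane-level model of the hand-asm; returns all four lanes of V10.4S."""
    v10 = [f32(a) for a in v10]
    v0 = [f32(gain)] * 4                          # LDR S0 + DUP V0.4S, V0.S[0]
    v10 = [f32(a * g) for a, g in zip(v10, v0)]   # FMUL V10.4S, V10.4S, V0.4S
    v11 = faddp(v10, v10)                         # FADDP V11.4S, V10.4S, V10.4S
    v10 = faddp(v11, v11)                         # FADDP V10.4S, V11.4S, V11.4S
    return v10


def p22_factored(v10, gain):
    """The algebraically equivalent form gain * sum(in), same pairwise grouping."""
    a = [f32(x) for x in v10]
    total = f32(f32(a[0] + a[1]) + f32(a[2] + a[3]))
    return f32(f32(gain) * total)


lanes = p22_model([1.0, 2.0, 3.0, 4.0], 0.5)
print(lanes)  # [5.0, 5.0, 5.0, 5.0]
```

In this model every lane of `V10.4S` ends up equal to the mono sum, because each `FADDP` is fed the same register twice. Comparing `p22_model(...)[0]` against `p22_factored(...)` on real voice data is one way to see how often the scale-then-sum and sum-then-scale forms disagree in the last bits. The model says nothing about the compiled Lithos blob itself; that still needs the byte diff.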