196 Commits

Author SHA1 Message Date
Jędrzej Stuczyński 931ec03b28 feat: expose node's chain address on self-described API (#6815)
* feat: expose nym-nodes' on-chain address on v2 auxiliary endpoint

* moved swagger page outside the v1 route

* fixed swagger endpoint for nym-api

* post rebasing fixes

* remove redundant impl OfflineSigner for Arc<DirectSecp256k1HdWallet>
2026-06-12 10:36:27 +01:00
Jędrzej Stuczyński 8a93bce32f feat: additional mixnet improvements and metrics (#6874)
* wip

* batch processing of forward packets

* tmp: additional metrics for remote node

* fixed incorrect prometheus metric registration

* unified runtime metrics

* unify mixnet client metrics

* packet forwarding cleanup

* add batching for emptying the delay queue

* cleanup client io loop

* feat(nym-node): reap idle mixnet connections (ingress + egress)

Close mixnet connections that sit with no traffic past a configurable idle period (mixnet.debug.connection_idle_timeout, default 5min, 0 disables) to bound lingering tokio tasks/sockets.

Ingress handle_stream is read-only, so a silently-gone peer (NAT drop, crash without FIN, half-open) never triggers FIN/RST and the task would block on .next() forever; a new idle select arm closes it (the post-loop replay flush still runs, so nothing is stranded). Egress run_io_loop gets the symmetric arm keyed on last_send; on close EvictOnDrop clears the cache entry and the next packet transparently reconnects.

Adds a cumulative nym_node_network_idle_closed_ingress_mixnet_connections counter; egress reaping is observed via the existing active-egress gauge plus an exit_reason=idle_timeout log.
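
The idle check described above can be reduced to a small predicate. This is a minimal sketch, not the nym-node code: the function name `should_reap` and its signature are hypothetical, and only the "0 disables" behaviour and the 5 minute default come from the commit message.

```rust
use std::time::{Duration, Instant};

/// Hypothetical helper: returns true when a connection has been silent for at
/// least `idle_timeout`. A zero timeout disables reaping, mirroring the
/// "0 disables" behaviour described for mixnet.debug.connection_idle_timeout.
fn should_reap(last_activity: Instant, now: Instant, idle_timeout: Duration) -> bool {
    if idle_timeout.is_zero() {
        return false;
    }
    // saturating_duration_since yields zero if `now` is earlier than `last_activity`
    now.saturating_duration_since(last_activity) >= idle_timeout
}
```

On the ingress side `last_activity` would be the time of the last received packet; on the egress side it would be the `last_send` timestamp mentioned above.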

* downgrade sysinfo

* refactor(nym-node): split PacketForwarder into router + delay-queue tasks

Split the single PacketForwarder task into two concurrently-scheduled tasks connected by a bounded handoff channel, so intake and delayed-release no longer block each other.

PacketRouter (router.rs) is the intake task: sole consumer of the ingress channel, it applies the routing filter and either forwards zero/already-elapsed-delay packets directly or hands delayed ones to the delay task. Its per-packet work is sub-µs, so new packets no longer wait behind delayed-release processing (collapses the ForwarderQueue tail).

DelayForwarder (delay.rs) owns the NonExhaustiveDelayQueue exclusively (it can't be shared by reference). Its run loop services BOTH branches on every wakeup - draining pending inserts first to bring the queue current, then flushing everything now due - so the biased select can't let releases and inserts starve each other, and a freshly-arrived-but-already-due packet releases in the same pass (marginally improving DelayQueueOverrun).

The mixnet client is shared as Arc<C>; handoff-channel overflow is dropped as an egress drop rather than blocking, keeping intake decoupled from release.

* feat(nym-node): bound egress flush with a write timeout

Cap how long a single egress batch flush may block on a congested peer socket (mixnet.debug.connection_write_timeout, default 500ms, 0 disables), so a slow peer can no longer back this connection's egress queue up into the multi-second range - the root of the EgressQueue and SocketWrite tails.

A single timeout is treated as transient congestion: the un-fed tail of the batch is abandoned but the connection is retained. This is sound because NoiseStream::poll_write encrypts and buffers each frame synchronously, so a cancelled flush leaves the noise transport nonce-consistent and a later flush resumes the byte stream in order; a momentary spike therefore costs no re-handshake. Only MAX_CONSECUTIVE_WRITE_TIMEOUTS (3) timeouts in a row, i.e. a persistently congested peer, tear the connection down (it reconnects on the next packet); a successful flush resets the counter.

Buffer-size tuning (maximum_connection_buffer_size) deliberately left for live data.
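
The consecutive-timeout rule above can be sketched as a small counter. This is an illustration under assumptions: the types `FlushOutcome`, `Action` and `WriteTimeoutTracker` are hypothetical names, and only the constant and its value of 3 are taken from the commit message.

```rust
/// Value quoted in the commit message above.
const MAX_CONSECUTIVE_WRITE_TIMEOUTS: u32 = 3;

/// Outcome of a single egress flush attempt (hypothetical type).
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
enum FlushOutcome {
    Flushed,
    TimedOut,
}

/// What the connection loop should do next (hypothetical type).
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
enum Action {
    KeepConnection,
    TearDown,
}

/// Tracks consecutive write timeouts for one connection (hypothetical type).
#[derive(Debug, Default)]
struct WriteTimeoutTracker {
    consecutive: u32,
}

impl WriteTimeoutTracker {
    fn record(&mut self, outcome: FlushOutcome) -> Action {
        match outcome {
            // A successful flush resets the counter.
            FlushOutcome::Flushed => {
                self.consecutive = 0;
                Action::KeepConnection
            }
            // A timeout is tolerated until the limit is hit in a row.
            FlushOutcome::TimedOut => {
                self.consecutive += 1;
                if self.consecutive >= MAX_CONSECUTIVE_WRITE_TIMEOUTS {
                    Action::TearDown
                } else {
                    Action::KeepConnection
                }
            }
        }
    }
}
```

Two timeouts followed by a successful flush leave the connection up, and the count starts again from zero.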

* revert PacketForwarder split in favour of a single task that clears both channels on wake
2026-06-12 10:31:54 +01:00
Jędrzej Stuczyński 3730260cf0 feat(nym-node): mixnet packet latency instrumentation (#6852)
- PacketTrace stopwatch + generic Traced<T> carrier threaded receive -> socket-write
- TraceStage enum owns each stage's metric name/help/buckets; observations go straight
  to the global nym-metrics registry under a uniform mixnet_packet_* family
- stages: Unwrap, ReplayCheck (incl. deferral), ForwarderQueue, DelayQueue,
  DelayQueueOverrun (lateness beyond target release), EgressQueue, SocketWrite, Total
- node-side 1-in-N sampling via MixnetDebug.egress_trace_sample_rate (default 100, 0 disables)
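
A rough sketch of 1-in-N sampling with "0 disables" semantics follows. It assumes a simple counter-based sampler; the commit does not say whether the real sampling is counter-based or random, and `TraceSampler` is a hypothetical name.

```rust
/// Hypothetical counter-based 1-in-N sampler.
struct TraceSampler {
    sample_rate: u64,
    seen: u64,
}

impl TraceSampler {
    fn new(sample_rate: u64) -> Self {
        TraceSampler { sample_rate, seen: 0 }
    }

    /// Returns true for one packet out of every `sample_rate`;
    /// never returns true when the rate is 0 (tracing disabled).
    fn should_trace(&mut self) -> bool {
        if self.sample_rate == 0 {
            return false;
        }
        self.seen = self.seen.wrapping_add(1);
        self.seen % self.sample_rate == 0
    }
}
```

With the default rate of 100, this traces every hundredth packet, which keeps the stopwatch overhead off the other 99.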
2026-06-09 16:22:14 +01:00
benedettadavico 799dfd59bb Merge branch 'release/2026.11-xynomizithra' into develop 2026-06-09 16:24:28 +02:00
Jędrzej Stuczyński f944348216 bugfix: restore and fix node throughput tester (#6849) 2026-06-09 15:19:37 +01:00
Jędrzej Stuczyński b6202b5a6b chore: minor nym-node improvements (#6850)
* set TCP_NODELAY for mixnet connections

* bugfix: correctly compute count deferral threshold

* bugfix: make sure to flush pending packets waiting for bloomfilter check

* implement batch sending into mixnet connection

* adjust default nym-node connection settings
2026-06-08 08:37:41 +01:00
Jędrzej Stuczyński e8410b2302 feat: disable Nagle's algorithm for LP between nym-nodes (#6857) 2026-06-05 16:37:12 +01:00
Jędrzej Stuczyński 526cb9b8be Merge branch 'develop' into merge/release/2026.10-waterloo 2026-05-26 10:00:43 +01:00
Jędrzej Stuczyński 28b22f6b22 upgrade axum to 0.8.9 (and side deps) (#6808)
* upgrade axum to 0.8.9 (and side deps)

Bumps axum 0.7.5 → 0.8.9, axum-extra 0.9.4 → 0.12.6,
axum-client-ip 0.6.1 → 1.3.1, axum-test 16.2.0 → 20.0.0,
utoipa-swagger-ui 8.1 → 9.0.2.

* warn upon using fallback ip

Co-authored-by: coderabbitai[bot] <136622811+coderabbitai[bot]@users.noreply.github.com>

* chore: replace use of deprecated try_next()

* update console-subscriber to ensure single version of axum in the lock file

* removed unused axum-test dev-dep

---------

Co-authored-by: coderabbitai[bot] <136622811+coderabbitai[bot]@users.noreply.github.com>
2026-05-22 15:39:33 +01:00
Jędrzej Stuczyński 46c67440bb Mixnode stress testing (#6575)
* Squashing the mix stress testing branch (#6575)

reduced chain watcher per block log severity

update network monitors contract semver to 1.0.0

fix build issues

fix mixnet client dropping initial packet on egress reconnection

adjusted logs for network monitor agent

changed default testing interval to 2h

refresh NM contract information

explicit return type for batch submission

force mixnet listener task to get scheduled before beginning connectivity test

make sure to always use canonical ip for network monitor noise keys

feat: NMv3: make agents decide egress port (#6746)

add config v12->v13 config migration for nym nodes

fix formatting in wallet types

simplified client config creation

remove other swagger redirect

removed swagger redirect on /swagger/ route

log version info on startup

add workflows, contract address, and dockerfile

bugfix: use correct endpoints when setting up orchestrator (#6733)

clippy

adjust DEFAULT_MIN_STRESS_TESTED_NODES ratio

expose route with new performance metrics

fixes and additional docs

use stress testing scores

stub for usage of stress testing scores

stub traits

added new fields to nym-api config controlling usage of stress test data

guard against duplicate packets

prevent usage of chain_authorisation_check_max_attempts with value of 0

make sure duplicate results can't be inserted into the db

submit test results from orchestrator on an interval

docs and fixes

nym-api side of handling result submission

stubs for submitting results

NM orchestrator verifying nym-api result submission permissions

NM orchestrator to update announced key on startup

allow NM orchestrator to announce its identity key to the contract

stubs within nym-api for accepting NMv3 results

added additional metrics

docs

bugfixes + making sure to only assign mixnode testruns

fixed node refresher to only retrieve mixnodes and add additional metrics

topology metrics

defined basic prometheus metrics

authorised endpoint for returning prometheus data

create initial stub for prometheus metrics

post rebasing fixes

adjusted routes

missing implementation for storage getters

a lot of new stubs and db accessors

stubs for results endpoints

update utoipa tags for agent routes

shared auth between metrics and results

moved stale results eviction into the interval.tick branch

refactor and comments

create background process to evict stale data

include sphinx packet delay as part of the stats

fix mock construction

add median to the calculated latency distribution

remove unused imports

cleanup

performing testrun and submitting the results

assigning testruns to requesting agents

basic stub for http server for the NMv3 orchestrator

chore: rename existing 'NetworkMonitorAgent' to 'NodeStressTester'

make sure to use canonical ips within the noise config

fixed contract tests

cargo fmt

additional comments and unit tests

contract and nym-node support of NM agents being run on the same host

basic unit tests

refactoring

make agents retrieve mix port assignment from the orchestrator

provide sensible defaults to CLI arguments

stub the initial structure for the agent

chore: remove redundant import

missed tick behaviour

removed redundant mutex

removed redundant try_get_client

reuse existing constant for default nymnode port

add node refresher for periodic scraping of bonded nym-node details

- NodeRefresher periodically queries the mixnet contract for all bonded
  nodes and probes each node's HTTP API for host information, sphinx keys,
  noise keys, and key rotation IDs
- Extract NymNodeApiClientRetriever into nym-node-requests with port
  probing, identity verification, and host information signature checking
- Add clone_query_client on NyxdClient so the refresher can hold its own
  query client without locking the signing client
- Batch upsert for nym_node rows (single transaction instead of per-row)
- Reuse the new helpers in nym-api's node_describe_cache

ensure assignment of testrun begins an IMMEDIATE tx

construction of the orchestrator struct

initial set of cli args

make sure to not assign testable nodes too often

very initial database structure and cli

fixed construction of RoutableNetworkMonitors

remove redundant constructor for NoiseNode

forbid 0-nonsense config values

add type safety for test route construction

moved lioness and arrayref to workspace deps

fixed dockerfile build

always use canonical addresses in RoutableNetworkMonitors

fixed old contract formatting issues

removed redundant into() call

network monitor agent fixes

additional logs

config unit tests

more docs

standalone stress testing invocation

further refactoring and changes

refactor testing loop and return valid test result upon completion

initial sending/receiving test loop

generating reusable sphinx headers

additional structure for receiving ingress packets

initial scaffolding for NMv3 agent

added validation of x25519 noise key

removed unstable call to 'is_multiple_of'

remove calls to from_octets as they're unavailable in pre 1.91

additional docs/comments

propagating noise information about NM for mixnet routing

pass full socket address of the agent into the contract storage

feat: store noise keys alongside ip addresses within the contract

removed redundant comment

ensure NM packets can only go to NM

PR review comments

added additional docs

allow NM to replay packets + fix replay prometheus metrics

propagate information about nm agent to connection handler

updated nym-node config migration

feat: introduced nym-node websocket subscription for keeping updated list of NM agents

allow admin to also revoke monitor agents

remove agents upon orchestrator removal

fixed schema generation and regenerated the contract schema

removed rustc restriction on contracts-common

added client methods for interacting with the contract

added unit tests for contract methods

implemented logic of the network monitors contract

create initial structure for network monitors contract

start mix stress testing topic branch

* make nym-node default to the new blockstream rpc/ws node cluster

* reduced mixnet-client log severity

* set network monitors contract address for mainnet
2026-05-22 09:43:20 +01:00
Simon Wicky 71d4b5b3ea moving lp packets in lp-data crate (#6810)
* moving lp packets in lp-data crate

* one more bit

* fmt

* crate description
2026-05-20 14:32:01 +02:00
dynco-nym ad56645fc5 Block non-public IPR/NR checks (#6670)
* Block non-public IPR/NR checks

* Add CLI override flag
2026-04-15 15:59:38 +02:00
Jack Wampler 7b77091fb1 Nym Node spam logging (#6621)
prevent spam logs when downstream node is slow
2026-03-26 11:27:14 -06:00
mfahampshire 4077717d3a Max/lp stream framing (#6573)
* Add LpFrameKind::Stream variant with StreamFrameAttributes
- Define LP wire format for stream multiplexing
- Handle new variant in entry gateway match arm

* Replace MixnetStream with LP framing
- Replace custom header with LpFrameHeader
- Added sequence number for message ordering

* Revert accidental vergen bump

* Revert accidental bumps

* Rename Stream to SphinxStream and split match arms in client_handler

* Add LpFrameAttributes type alias for [u8; 14]
2026-03-19 15:30:59 +00:00
Jędrzej Stuczyński e2dd8ac743 feat: localnet v2 (#6277)
* squashing localnet-v2 commits (again)

cargo fmt

fixes to localnet purge

provide path in the error message

output args

log failed exec

print based on tty

check-prerequisites cmd

checked iptables update

basic kernel features check

enable ipv6 rules

add forwarding rules

squashing localnet-v2 commits

additional changes

propagate custom-dns flag to all run containers

remove is_mock from EcashManager

another localnet squash

unused import

chore: remove redundant testnet manager

missing impl

additional linux fixes

command to rebuild container image

wait for at least 2 blocks

additional node startup fixes

added --custom-dns flag to nym node setup

add gateway probe + wait for DKG magic file

fixed localnet down on linux

container ls

re-enable state resync

additional feature locking

macos adjustments

working nyxd startup on linux

wip linux box

wip

separating network inspect between macos and linux

initial linux feature locking

moved all container commands into a single location

finally working initial node performance

squashing orchestrator commits

cleanup

fixed condition for naive rearrangement

added cache of cosmwasm contracts for speed up on subsequent runs

'down' command

refreshing described cache after nodes are bonded

nym nodes setup + wip on nym api refresh

nodes setup WIP

first pass cleanup

placeholder for nym-node setup

bypassing the dkg

further progress on nym-api setup

wip: api setup

up/down/purge placeholders

persisting contract setup data

fix contract upload by forcing amd64 container platform

wip: contracts setup4

wip: contracts setup3

wip: contracts setup2

wip: contracts setup

include network setup

init and spawn nyxd

build nyxd image in dedicated orchestrator

build nyxd image

squashed cherry-picked lp changes

Bits and bobs to make everything work

Title

MacOS setup instructions

Docker/Container localnet

* clippy

* fixes on non-unix targets

---------

Co-authored-by: durch <durch@users.noreply.github.com>
2026-03-12 14:46:00 +00:00
Jędrzej Stuczyński 59254c92c3 bugfix: make sure to use old values from metrics debug config during v12 migration (#6546) (#6547) 2026-03-11 08:33:53 +00:00
Jędrzej Stuczyński 1ebb7e06c7 chore: rename LpMessage to LpFrame 2026-03-09 13:21:39 +00:00
Jędrzej Stuczyński ef65cf4c9e additional adjustments 2026-03-06 16:34:42 +00:00
Jędrzej Stuczyński 93ac638765 importing over changes from 'lp/persistent-node-connection' 2026-03-06 16:07:35 +00:00
Jędrzej Stuczyński 28c1637198 addressing LP PR comments 2026-03-04 09:33:28 +00:00
Jędrzej Stuczyński 4464d12103 clippy and review comments 2026-03-04 09:26:29 +00:00
Jędrzej Stuczyński 0d9d97e31e remove redundant LP state machine in favour of in place processing 2026-03-03 16:20:27 +00:00
Jędrzej Stuczyński 7a300bdd74 Merge branch 'develop' into merge/release/2026.5-raclette 2026-03-03 14:45:20 +00:00
Jędrzej Stuczyński 6569479083 feat: introduce /v3/unstable/nym-nodes/semi-skimmed to aggregate LP information (#6499)
* feat: introduce /v3/unstable/nym-nodes/semi-skimmed to aggregate LP information

nym-nodes will require this information to establish shared PSQ

* reorganised imports
2026-03-03 14:05:02 +00:00
Jędrzej Stuczyński 2cc9b05520 chore: split up lp listener (#6507)
* chore: split up lp listener

* rename 'build_lp'
2026-03-03 13:59:48 +00:00
Jędrzej Stuczyński 05b6f5e282 removed redundant LP states (#6509) 2026-03-03 13:58:47 +00:00
Jędrzej Stuczyński f6bd511599 feat: Lewes Protocol with PSQv2 (#6491)
* merging georgio/lp-psqv2-integration

* use authenticator on the responder's side

* nym-lp crate compiling

* moved the e2e test to nym-lp

* move key generation to peer

* moved principal generation

* update KKTResponder

* encapsulation key parsing

* Adding concrete types within KKT exchange

* initiator side of the full handshake

* responder side of the handshake and full e2e test

* fixed unit-tests within nym-kkt

* LpSession cleanup

* helpers for Transport

* revamp of the transport traits and initial work on client-side transport

* compiling nym-crypto

* 'working' client-entry dvpn reg

* Fix key conversion

* Slightly reduce use of rand08

* reverted back to libcrux repo refs

* initial telescoping reg

* removing dead code

* wip

* moved data encryption into the state machine

* restoring nym-lp tests

* update lp api model

* Add receiver index derivation

* Add receiver index derivation

* use derived receiver index

* feat: add kem key generation to nodes

* generate fresh x25519, mlkem768 and mceliece keys on config migration

* add lp peer config

* nym-node startup cleanup

* removed dependency on pre-rand09 from nym-lp

* re-expose LP information on the http API

* fixed tests compilation

* add peer config happy path tests

* formatting

* add more tests and fix bug

* better docs

* clippy and formatting issues

* return error on mceliece within NestedSession

* wasm fixes

* removed legacy nym-vpn-lib-wasm

* fixing wasm for real this time

* additional fixes

* add payload to kkt

* make clippy happy

* moved LP to nym-node crate

* cargo fmt

* integrate lpconfig payload

* fix response size trait impl

* Migrate receiver index

* Change receiver index to u32 and reorganize crates

* clippy

* hopefully final wasm fixes

* simple conversion method from semver to ciphersuite

* updated nym-node config template

* chore: remove duplicated code

---------

Co-authored-by: Georgio Nicolas <me@georgio.xyz>
2026-02-27 13:49:08 +00:00
Jędrzej Stuczyński 6edbece3ad bugfix: restore 'latest_measurement' field for nym-node /verloc endpoint (#6452) 2026-02-21 19:10:15 +00:00
Tommy Verrall 3eff6e5e3b fix testthroughput 2026-02-18 11:06:42 +01:00
Tommy Verrall a519f4ccb8 pr feedback
- Moved OTel CLI options into a separate OtelArgs
- Otel is built behind the feature flag otel
- Store timing is in microseconds
- Restore comments to existing files
2026-02-18 10:48:54 +01:00
Tommy Verrall 988df7cff7 sampling to avoid costs
- add otel timeouts
2026-02-17 09:10:52 +01:00
Tommy Verrall d28d0ac39e fix replay batch drop, harden error handling and scripts 2026-02-16 19:42:24 +01:00
Tommy Verrall dce4d6b34b otel: refactor key selection, add environment label, fix clippy 2026-02-16 19:13:11 +01:00
Tommy Verrall cb277fe487 otel: support signoz cloud ingestion key and TLS 2026-02-16 16:11:31 +01:00
Tommy Verrall e753f24ed1 localnet: fix runtime and gateway flags 2026-02-16 15:21:45 +01:00
Tommy Verrall 40a3cd28b7 otel: add tracing 2026-02-16 13:46:17 +01:00
Jędrzej Stuczyński 801dcdda1e do not run LP (#6422) 2026-02-06 08:41:19 +00:00
Jędrzej Stuczyński a151a03181 Lp/ip pool fixes (#6412)
* squashing Lp/ip pool fixes #6412

removed unused imports

gateway probe fixes

PSK injection + test fixes

cleanup minus PSK injection

combine with lp reg

moved authenticator peer registration to centralised location

bugfix: ensure IpPool never allocates gateway ip

ip pool allocation tests

* review fixes

* test fixes
2026-02-05 14:47:37 +00:00
Simon Wicky bd755385ed LP-fix : add LP x25519 key to the description (#6408)
* add x25519 key in LP description

* gateway probe adapt
2026-02-03 10:25:43 +01:00
Jędrzej Stuczyński 76ce1bc0f9 feat: use hex-encoding for lp key digests (#6394)
* feat: use hex-encoding for lp key digests

* removed needless borrow in test code

* gateway probe fixes
2026-01-30 08:44:29 +00:00
Jędrzej Stuczyński 7dd1dd1a6c Lp/two step dvpn reg (#6386)
* squashing Lp/two step dvpn reg #6386

fixed integration tests by extending the mocks

remove dead code

compiling client-side code

gateway side handling of updated lp-wg reg

wip: countless changes on the gateway handler side

splitting up NestedLpSession

* fixed lp-messages tests

* gateway probe fixes

* unused variable

* resolved nits
2026-01-29 13:38:21 +00:00
Jędrzej Stuczyński 8af759fb1d LP: include signing key digests to LP responses (#6373)
* include signing key digests to LP responses

* mock
2026-01-27 12:23:52 +00:00
Jędrzej Stuczyński a4638b8d2f Lp/use noise x25519 (#6372)
* use x25519 noise key for kkt instead of deriving one from ed25519

* removed client's IpAddr from RegistrationClient constructor

* Adjusted the gateway probe to inject correct lp data

* remove redundant argument from nym-lp-client

* consistent naming for HashFunction variants

* use workspace dep import for nym-kkt-ciphersuite

* struct renaming
2026-01-26 13:15:37 +00:00
Jędrzej Stuczyński a63a1e745e LP: modified LPRemotePeer to dynamically choose required KEM key hash (#6358)
* LP: modified LPRemotePeer to dynamically choose required KEM key hash

* nym-lp-client fixes
2026-01-23 11:41:55 +00:00
Jędrzej Stuczyński c1ddcc75cf LP: announced KEM key hashes (#6349)
* announce KEM key hashes and use generated value within LpStateMachine

* added digest of remote KEM key into LpSession

* changed the LpSession constructor to take explicit key materials for local and remote

this makes it easier to change keys required by each party without having to change all the interfaces everywhere again

* extended the changes to LpStateMachine constructor

* modify the interface to LpRegistrationHandler and LpListener

* gateway probe fixes

* temp nym-lp-client fixes

* review nits

* remove network test

* introduced v2/nym-nodes/described endpoint for returning nodes description alongside LP data

* missed V1 -> V2 description replacements

* removed deprecated call within mix-fetch

* use old v1 call in network stats
2026-01-22 14:29:33 +00:00
benedetta davico 7a9a04d846 Merge pull request #6238 from YichiZhang0613/fix_assertion
fix: fix assertion
2026-01-15 15:31:01 +01:00
Jędrzej Stuczyński de7a082e58 Merge branch 'develop' into merge/release/2026.1-niolo 2026-01-15 13:47:20 +00:00
Drazen Urch 8a00ed6071 LP Registration + Telescoping + Gateway Probe Localnet Mode (#6286)
* Add KKT cryptographic primitives

Post-quantum Key Encapsulation Mechanism (KEM) Key Transfer protocol.
Enables efficient distribution of post-quantum KEM public keys.

Squashed from georgio/noise-psq branch.

* Implement LP registration protocol with KKT/PSQ integration

Initial implementation of the Lewes Protocol (LP) for gateway registration:
- Add nym-lp crate with Noise protocol handshake
- Add LP listener to gateway for handling registrations
- Add LP client for registration flow
- Integrate KKT for post-quantum KEM key exchange
- Integrate PSQ for post-quantum PSK derivation
- Add Ed25519 authentication throughout
- Add docker/localnet support for testing

Co-authored-by: Jędrzej Stuczyński <jedrzej.stuczynski@gmail.com>

* Add LP telescoping with nested sessions and subsession support

Extends LP protocol with telescoping architecture for nested sessions:
- Add nested session support with KKpsk0 rekeying
- Add subsession support with collision detection
- Implement unified packet format with outer header
- Refactor gateway handlers for single-packet forwarding
- Add TTL-based state cleanup for stale sessions
- Add outer AEAD encryption layer
- Refactor registration client for packet-per-connection model

* Add gateway-probe localnet mode with WireGuard tunnel support

Adds localnet testing mode to gateway-probe for LP development:
- Add TestMode enum for different probe configurations
- Add --gateway-ip flag for direct gateway testing
- Implement two-hop WireGuard tunnel for localnet
- Add mock ecash support for testing without real credentials
- Add netstack Go bindings for userspace networking
- Restructure probe with mode and common modules
- Update README with localnet mode documentation

* Increase KCP fragment limit from u8 to u16

- Change frg field from u8 to u16 in packet header (25 bytes total)
- Update encode/decode to use get_u16_le/put_u16_le
- Update Segment struct frg field to u16
- Remove truncating cast in session.rs
- Max message size now ~91MB (65,535 fragments × MTU)
- Internal protocol only, no interop concerns

Nym uses KCP for reliability and multiplexing, not standard real-time
use cases. The u8 limit (255 fragments, ~355KB) was insufficient.

Addresses: nym-yih9
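
As an illustration of the wire change above, the widened fragment counter can be written and read as two little-endian bytes using only the standard library. The helper names below are hypothetical (the commit itself refers to `get_u16_le`/`put_u16_le`), and the 1400-byte segment size used in the usage note is an assumption chosen to be consistent with the "~91MB" figure.

```rust
/// Writes the fragment counter as two little-endian bytes (hypothetical helper).
fn put_frg(buf: &mut Vec<u8>, frg: u16) {
    buf.extend_from_slice(&frg.to_le_bytes());
}

/// Reads the fragment counter back; returns None if fewer than two bytes remain.
fn get_frg(buf: &[u8]) -> Option<u16> {
    match buf {
        [lo, hi, ..] => Some(u16::from_le_bytes([*lo, *hi])),
        _ => None,
    }
}

/// Upper bound on a fragmented message: at most u16::MAX fragments of `mss` bytes each.
fn max_message_size(mss: usize) -> usize {
    u16::MAX as usize * mss
}
```

With an assumed segment size of 1400 bytes, `max_message_size(1400)` is 91,749,000 bytes, whereas a `u8` counter would cap the same message at 255 × 1400 = 357,000 bytes.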

* Zeroize Ed25519 key material in to_x25519 conversion

Wrap hash and x25519_bytes in zeroize::Zeroizing to ensure private
key material is cleared from memory after use.

Closes: nym-k55g

* Return Result from KCP session input() for error detection

Change KcpSession::input() to return Result<(), KcpError> so callers
can detect invalid packets instead of silently ignoring them.

- Add ConvMismatch error variant for conversation ID mismatches
- Update driver to propagate errors from session.input()
- Update all test and example callers

Closes: nym-n0kk
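
A minimal sketch of what a fallible `input()` could look like is given below. Only the `Result<(), KcpError>` return type and the `ConvMismatch` variant name come from the commit message; the variant's fields, the `PacketTooShort` variant, the `accepted` counter and the little-endian position of the conversation ID are assumptions made for illustration.

```rust
/// Hypothetical error type; only the `ConvMismatch` name appears in the commit.
#[derive(Debug, PartialEq, Eq)]
enum KcpError {
    ConvMismatch { expected: u32, received: u32 },
    PacketTooShort,
}

/// Hypothetical, heavily reduced session state.
struct KcpSession {
    conv: u32,
    accepted: u64,
}

impl KcpSession {
    /// Accepts a packet whose first four bytes carry the conversation ID
    /// (little-endian here, as an assumption) and rejects anything else.
    fn input(&mut self, packet: &[u8]) -> Result<(), KcpError> {
        let received = match packet {
            [a, b, c, d, ..] => u32::from_le_bytes([*a, *b, *c, *d]),
            _ => return Err(KcpError::PacketTooShort),
        };
        if received != self.conv {
            return Err(KcpError::ConvMismatch { expected: self.conv, received });
        }
        self.accepted += 1;
        Ok(())
    }
}
```

A caller such as the driver can then propagate the error with `?` instead of silently dropping the packet.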

* Fix Zeroizing deref in ed25519 to_x25519 conversion

The from_bytes() function expects &[u8], need to deref the Zeroizing
wrapper to get the inner array.

* Add semaphore-based connection limiting for LP packet forwarding

Limits concurrent outbound connections when forwarding LP packets to
prevent file descriptor exhaustion under high load.

Key changes:
- Add max_concurrent_forwards config (default 1000)
- Add forward_semaphore to LpHandlerState
- Acquire semaphore permit before connecting in handle_forward_packet
- Return "Gateway at forward capacity" error when at limit

This provides load signaling so clients can choose another gateway
when the current one is overloaded.

Design note: Connection pooling was considered but provides minimal
benefit since telescope setup is one-time and targets are distributed
across many different gateways. See AIDEV-NOTE in LpHandlerState for
full analysis.

Closes: nym-xi3m
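
The fail-fast limiting described above can be sketched with the standard library alone. This is a stand-in rather than the gateway code, which presumably uses an async semaphore: `ForwardLimiter` and `ForwardPermit` are hypothetical names, and only `max_concurrent_forwards` and the error string are taken from the commit message.

```rust
use std::sync::atomic::{AtomicUsize, Ordering};
use std::sync::Arc;

/// Hypothetical stand-in for a semaphore: a bounded counter of in-flight forwards.
struct ForwardLimiter {
    in_flight: AtomicUsize,
    max_concurrent_forwards: usize,
}

/// Permit that releases its slot automatically when dropped.
struct ForwardPermit {
    limiter: Arc<ForwardLimiter>,
}

impl ForwardLimiter {
    fn new(max_concurrent_forwards: usize) -> Arc<Self> {
        Arc::new(ForwardLimiter {
            in_flight: AtomicUsize::new(0),
            max_concurrent_forwards,
        })
    }

    /// Non-blocking acquire: fails immediately when the limit is reached,
    /// so the client can be told to pick another gateway.
    fn try_acquire(self: &Arc<Self>) -> Result<ForwardPermit, &'static str> {
        let max = self.max_concurrent_forwards;
        self.in_flight
            .fetch_update(Ordering::AcqRel, Ordering::Acquire, |n| {
                if n < max { Some(n + 1) } else { None }
            })
            .map(|_| ForwardPermit { limiter: Arc::clone(self) })
            .map_err(|_| "Gateway at forward capacity")
    }
}

impl Drop for ForwardPermit {
    fn drop(&mut self) {
        self.limiter.in_flight.fetch_sub(1, Ordering::AcqRel);
    }
}
```

Tying the release to `Drop` means an early return or error in the forwarding path cannot leak a slot.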

* Return error on session unavailable in handle_subsession_packet

Replace .session().ok() with proper error handling to fail fast when
session is Closed or Processing after state machine processing.

Previously, the code silently continued with outer_key = None, which
could cause protocol errors downstream.

Closes: nym-8de0

* Use explicit bincode Options helper in nested_session

Add bincode_options() helper that returns DefaultOptions with explicit
big_endian and varint_encoding configuration. This future-proofs against
bincode 1.x/2.x default changes and makes serialization format explicit.

Updated all 4 bincode usages in nested_session.rs to use the helper.

* Deduplicate outer_key lookup pattern in nested_session.rs

Extract common state_machine.session().ok().and_then(...) pattern into
two helper methods:
- get_send_key() for encryption (outer_aead_key_for_sending)
- get_recv_key() for decryption (outer_aead_key)

Updated 6 call sites to use the helpers, reducing verbosity.

* Add LpConfig struct and AIDEV-NOTE documentation for KKT+PSQ

- Create config.rs with LpConfig struct (kem_algorithm, psk_ttl, enable_kkt)
- Export LpConfig from lib.rs
- Add AIDEV-NOTE to psk.rs explaining:
  - Why PSQ is embedded in Noise (single round-trip, PSK binding)
  - KEM migration path (X25519 → MlKem768 → XWing)
- Add AIDEV-NOTE to state_machine.rs explaining protocol flow:
  - KKTExchange → Handshaking → Transport state transitions
  - PSK derivation formula (ECDH || PSQ || salt)

* Add forward_timeout to LP client config

Add forward_timeout (30s default) to LpConfig and wrap send_forward_packet's
connect_send_receive call with tokio::time::timeout, matching the pattern
used by register() with registration_timeout.

This prevents indefinite hangs when forwarding packets through entry gateway.

* Add negotiated_version field to LpSession

Add AtomicU8 field to store the protocol version from handshake packet
headers. Includes getter and setter methods for future version negotiation
and compatibility checks.

- negotiated_version() returns current version (defaults to 1)
- set_negotiated_version() allows setting during handshake
- Subsessions inherit version 1 (can be enhanced to inherit parent's)
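
For illustration, the getter and setter pair described above might look like the following. The struct is reduced to the single field, and the choice of memory orderings is an assumption; the real `LpSession` carries far more state.

```rust
use std::sync::atomic::{AtomicU8, Ordering};

/// Reduced, hypothetical view of the session: only the version field is shown.
struct LpSession {
    negotiated_version: AtomicU8,
}

impl LpSession {
    /// New sessions default to protocol version 1.
    fn new() -> Self {
        LpSession { negotiated_version: AtomicU8::new(1) }
    }

    fn negotiated_version(&self) -> u8 {
        self.negotiated_version.load(Ordering::Acquire)
    }

    /// Takes `&self`, so the version can be set during the handshake
    /// without requiring exclusive access to the session.
    fn set_negotiated_version(&self, version: u8) {
        self.negotiated_version.store(version, Ordering::Release);
    }
}
```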

* Change MessageType from u16 to u32

Breaking wire protocol change: MessageType field increased from 2 bytes
to 4 bytes in LP packets. This future-proofs the message type space and
aligns with other u32 fields.

Changes:
- message.rs: #[repr(u32)], from_u32(), to_u32()
- error.rs: InvalidMessageType(u32)
- codec.rs: All serialization/deserialization updated to 4-byte msg_type
  - Cleartext parsing: inner_bytes[4..8], content at [8..]
  - AEAD parsing: decrypted[4..8], content at [8..]
  - Serialization: 4 bytes for message type
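
The new layout can be sketched as a parser that reads the 4-byte message type from bytes [4..8] and treats everything from byte 8 onwards as content. Several details are assumptions: the variants `Ping` and `Data` are placeholders (the commit does not list the real ones), the first four bytes are treated as opaque, and little-endian byte order is assumed. Only `from_u32`, `to_u32`, `InvalidMessageType(u32)` and the offsets come from the commit message.

```rust
/// Placeholder variants; the real enum's members are not listed in the commit.
#[repr(u32)]
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
enum MessageType {
    Ping = 0,
    Data = 1,
}

#[derive(Debug, PartialEq, Eq)]
enum ParseError {
    TooShort,
    InvalidMessageType(u32),
}

impl MessageType {
    fn from_u32(value: u32) -> Option<Self> {
        match value {
            0 => Some(MessageType::Ping),
            1 => Some(MessageType::Data),
            _ => None,
        }
    }

    fn to_u32(self) -> u32 {
        self as u32
    }
}

/// Splits an inner buffer into (message type, content): the type is read from
/// bytes [4..8] and the content starts at byte 8.
fn parse_inner(inner_bytes: &[u8]) -> Result<(MessageType, &[u8]), ParseError> {
    if inner_bytes.len() < 8 {
        return Err(ParseError::TooShort);
    }
    let mut raw = [0u8; 4];
    raw.copy_from_slice(&inner_bytes[4..8]);
    let value = u32::from_le_bytes(raw); // byte order is an assumption
    let msg_type = MessageType::from_u32(value).ok_or(ParseError::InvalidMessageType(value))?;
    Ok((msg_type, &inner_bytes[8..]))
}
```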

* Various smaller fixes

* Refactor LP to stream-oriented TCP processing

Gateway (handler.rs):
- Add bound_receiver_idx field for session-affine connections
- Convert handle() from single-packet to loop with EOF detection
- Add validate_or_set_binding() for receiver_idx validation
- Set binding in handle_client_hello after collision check
- Centralize emit_lifecycle_metrics in main loop only
- Add is_connection_closed() helper for graceful EOF

Client (client.rs):
- Add stream field for persistent TCP connection
- Add ensure_connected(), send_packet(), receive_packet(), close() methods
- Modify perform_handshake_inner() to use persistent stream
- Modify register_with_credential() to use persistent stream
- Modify send_forward_packet() to use persistent stream
- Keep connect_send_receive() for reference (marked dead_code)

This reduces handshake overhead from ~5 TCP connections to 1.

Drive-by: Fix log::info! -> info! in wireguard peer_controller.rs
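
The session-affine binding above can be illustrated with a reduced handler. This is a sketch under assumptions: `BindingError` is a hypothetical type, the receiver index is taken to be a `u32`, and the binding is set on the first packet seen, whereas the commit sets it in `handle_client_hello` after the collision check.

```rust
/// Hypothetical error for a packet that does not belong to the bound session.
#[derive(Debug, PartialEq, Eq)]
enum BindingError {
    Mismatch { bound: u32, received: u32 },
}

/// Reduced, hypothetical view of the per-connection handler.
struct ConnectionHandler {
    bound_receiver_idx: Option<u32>,
}

impl ConnectionHandler {
    /// Binds the connection to the first receiver index it sees and rejects
    /// any later packet that carries a different one.
    fn validate_or_set_binding(&mut self, receiver_idx: u32) -> Result<(), BindingError> {
        match self.bound_receiver_idx {
            None => {
                self.bound_receiver_idx = Some(receiver_idx);
                Ok(())
            }
            Some(bound) if bound == receiver_idx => Ok(()),
            Some(bound) => Err(BindingError::Mismatch { bound, received: receiver_idx }),
        }
    }
}
```

Because one TCP connection now carries many packets, this check is what keeps a long-lived stream tied to a single session.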

* Add persistent exit stream for entry→exit forwarding

Entry gateway now maintains a persistent TCP connection to the exit
gateway per client session, reusing it for all forward requests from
that client. This reduces TCP handshake overhead significantly.

Key changes:
- Add exit_stream: Option<(TcpStream, SocketAddr)> to LpConnectionHandler
- Modify handle_forward_packet() to open on first forward, reuse after
- Clear exit_stream on connection errors (auto-reconnect on next forward)
- Semaphore only acquired for connection opens, not reuse (sequential access)

* Fix code review issues for stream-oriented LP

- Add 30s timeout to exit stream I/O operations (nym-df31)
  Prevents handler from hanging on unresponsive exit gateway

- Return error on forward target address mismatch (nym-zegu)
  Previously warned and proceeded, which could mask bugs

- Close client stream on handshake error paths (nym-scvm)
  Prevents state machine inconsistency on timeout or failure

* Add LP registration idempotency and retry logic

Make LP registration resilient to network failures that could waste
credentials. When registration succeeds on the gateway but the response
is lost (e.g., network drop), clients can retry with the same WG key
and get the cached result instead of spending another credential.

Gateway-side:
- Add check_existing_registration() helper that looks up WG peer and
  returns cached GatewayData if already registered
- Add idempotency check in process_registration() dVPN branch
- Only return cached response if bandwidth > 0 (ensures registration
  was actually completed, not just peer created)
- Track idempotent registrations with lp_registration_dvpn_idempotent metric
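
The gateway-side check can be sketched roughly as follows. Only
`check_existing_registration`, `process_registration` and `GatewayData` are
names taken from this message; the `Gateway` struct, its counters and the
`assigned_slot` field are assumptions made for illustration.

```rust
use std::collections::HashMap;

type WgKey = [u8; 32];

/// Hypothetical, trimmed-down registration result.
#[derive(Clone, Debug, PartialEq)]
pub struct GatewayData {
    pub assigned_slot: u32,
}

struct Peer {
    bandwidth: u64,
    data: GatewayData,
}

#[derive(Default)]
pub struct Gateway {
    peers: HashMap<WgKey, Peer>,
    pub credentials_spent: u32,
    pub idempotent_hits: u32,
}

impl Gateway {
    /// Return the cached result only if the peer exists AND has bandwidth,
    /// i.e. the earlier registration actually completed.
    fn check_existing_registration(&self, key: &WgKey) -> Option<GatewayData> {
        self.peers
            .get(key)
            .filter(|peer| peer.bandwidth > 0)
            .map(|peer| peer.data.clone())
    }

    pub fn process_registration(&mut self, key: WgKey, bandwidth: u64) -> GatewayData {
        if let Some(cached) = self.check_existing_registration(&key) {
            self.idempotent_hits += 1;
            return cached; // retry: no credential is spent
        }
        // Full path: spend the credential and (re)create the peer.
        self.credentials_spent += 1;
        let data = GatewayData { assigned_slot: self.credentials_spent };
        self.peers.insert(key, Peer { bandwidth, data: data.clone() });
        data
    }
}
```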

Client-side:
- Add register_with_retry() to LpRegistrationClient that acquires
  credential once and retries handshake+registration on failure
- Add handshake_and_register_with_retry() to NestedLpSession for
  exit gateway registration via forwarding
- Add exponential backoff with jitter between retry attempts
- Verify outer session validity before nested session retry

Both retry methods clear state machine before retry to ensure fresh
handshake, and reuse the same credential across all attempts.
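
A rough sketch of exponential backoff with jitter around a retried operation.
The function names, the doubling factor and the "up to 50% additive jitter"
rule are assumptions, not the values used by `register_with_retry()`; the
credential is assumed to be acquired once by the caller and captured by the
`attempt_once` closure.

```rust
use std::time::Duration;

/// Delay before retry number `attempt` (0-based): `base * 2^attempt`,
/// capped at `max`, plus up to 50% extra scaled by `jitter_unit` in [0, 1].
pub fn backoff_delay(attempt: u32, base: Duration, max: Duration, jitter_unit: f64) -> Duration {
    let factor = 2u32.saturating_pow(attempt);
    let capped = base.saturating_mul(factor).min(max);
    let jitter = capped.mul_f64(jitter_unit.clamp(0.0, 1.0) * 0.5);
    capped + jitter
}

/// Run `attempt_once` up to `max_attempts` times (at least once),
/// sleeping with backoff between failures.
pub fn retry_with_backoff<T, E>(
    max_attempts: u32,
    base: Duration,
    max: Duration,
    mut attempt_once: impl FnMut() -> Result<T, E>,
    mut sleep: impl FnMut(Duration),
    mut jitter_unit: impl FnMut() -> f64,
) -> Result<T, E> {
    let mut attempt = 0;
    loop {
        match attempt_once() {
            Ok(value) => return Ok(value),
            // Out of attempts: surface the last error.
            Err(err) if attempt + 1 >= max_attempts => return Err(err),
            Err(_) => {
                sleep(backoff_delay(attempt, base, max, jitter_unit()));
                attempt += 1;
            }
        }
    }
}
```

The sleep and jitter sources are injected so the loop stays deterministic in
tests; a caller would pass a real sleep and a random value in [0, 1].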

* Add no-mix-acks feature flag to nym-sphinx-framing

When enabled, mix nodes skip ack extraction and forwarding entirely.
The full payload (including ack portion) is returned as the message.

Closes: nym-3wrr

* Create nym-lp-speedtest crate scaffold

- Created tools/nym-lp-speedtest/ with Cargo.toml
- Added main.rs with CLI argument parsing
- Created stub modules: client.rs, speedtest.rs, topology.rs
- Added to workspace members
- Verified compilation with cargo check

* Implement topology fetching for nym-lp-speedtest

- Add topology.rs with NymTopology integration
- Fetch mix nodes and gateways from nym-api
- Build GatewayInfo with LP addresses (port 41264)
- Provide random_route_to_gateway() for Sphinx routing
- Add required Cargo.toml dependencies

* Implement LP+Sphinx+KCP client with SURB support

- Add send_data() and send_data_with_surbs() methods for mixnet data
- Integrate KCP reliable delivery with Sphinx packet construction
- Add x25519 encryption keypair for SURB reply mechanism
- Wire up main.rs to test LP handshake and data path
- Add NymRouteProvider support in topology for SURB construction
- Refactor send_data() to delegate to send_data_with_surbs(0) (DRY)

The client can now:
- Perform LP handshake with gateways
- Send data through the mixnet wrapped in KCP + Sphinx packets
- Attach SURBs for bidirectional communication
- Return encryption keys for decrypting replies

* Rename nym-lp-speedtest to nym-lp-client and fix KCP bug

- Rename crate from nym-lp-speedtest to nym-lp-client
- Fix KCP bug: add driver.update() call before fetch_outgoing()
  Without update(), KCP never moves segments from snd_queue to snd_buf
- Update CLI name, about string, and user agent to match new name

* Add LP mixnet mode registration with nym address return

- Extend RegistrationMode::Mixnet to include client_ed25519_pubkey
  and client_x25519_pubkey for nym address construction
- Add LpGatewayData struct containing gateway_identity and
  gateway_sphinx_key for SURB reply routing
- Add lp_gateway_data field to LpRegistrationResponse for mixnet mode
- Implement success_mixnet() constructor for mixnet registrations
- Update gateway registration to insert clients into ActiveClientsStore
  for SURB reply delivery, matching the websocket flow

* Implement LP data handler on UDP:51264

- Add LpDataHandler for UDP data plane (port 51264)
- Decrypt LP layer and forward Sphinx packets to mixnet
- Add outbound_mix_sender to LpHandlerState
- Integrate data handler spawn into LpListener::run()
- Add metrics for data packets received/forwarded/errors

Implements nym-yzzm

* Fix replay protection vulnerability in LP data handler

Use state machine process_input() instead of manual decryption to ensure
proper replay protection:
- Counter check against receiving window
- Counter marking after successful decryption
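
For illustration, a generic sliding-window replay check with the same
check-then-mark ordering. This is not the LP state machine: `ReplayWindow`,
the 64-entry window and `accept()` are assumptions chosen to keep the sketch
small.

```rust
/// Tracks which of the last 64 counters have been accepted.
/// Bit 0 of `bitmap` corresponds to `highest`.
#[derive(Default)]
pub struct ReplayWindow {
    highest: u64,
    bitmap: u64,
    seen_any: bool,
}

impl ReplayWindow {
    /// True if `counter` is new and still inside the window.
    pub fn check(&self, counter: u64) -> bool {
        if !self.seen_any || counter > self.highest {
            return true;
        }
        let diff = self.highest - counter;
        diff < 64 && self.bitmap & (1u64 << diff) == 0
    }

    /// Record `counter` as used. Call only after successful decryption.
    pub fn mark(&mut self, counter: u64) {
        if !self.seen_any || counter > self.highest {
            let shift = if self.seen_any { counter - self.highest } else { 64 };
            self.bitmap = if shift >= 64 { 0 } else { self.bitmap << shift };
            self.bitmap |= 1;
            self.highest = counter;
            self.seen_any = true;
        } else {
            let diff = self.highest - counter;
            if diff < 64 {
                self.bitmap |= 1u64 << diff;
            }
        }
    }

    /// Check the window, run `decrypt`, and mark only on success, so a
    /// forged packet cannot burn a counter value.
    pub fn accept<T>(&mut self, counter: u64, decrypt: impl FnOnce() -> Option<T>) -> Option<T> {
        if !self.check(counter) {
            return None;
        }
        let plaintext = decrypt()?;
        self.mark(counter);
        Some(plaintext)
    }
}
```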

Also handle subsession actions gracefully (SendPacket ignored on UDP,
clients should use TCP control plane for rekeying).

Security fix for nym-yzzm implementation.

* feat(ipr): add KcpSessionManager for LP client KCP handling

- Add fetch_incoming() and recv() methods to KcpDriver for retrieving
  reassembled messages
- Create KcpSessionManager in ip-packet-router that manages KCP sessions
  keyed by conv_id (first 4 bytes of KCP packet header)
- Store ReplySurbs per session for sending anonymous replies
- Implement session timeout (5 min) and max sessions limit (10000)
- Add comprehensive tests for session lifecycle and KCP roundtrip
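
A simplified sketch of keying sessions by conv_id with the timeout and
session cap listed above. `SessionTable` is a hypothetical stand-in for
KcpSessionManager (no KCP state or ReplySurbs), and reading the conv_id as
little-endian is an assumption based on the reference KCP wire format.

```rust
use std::collections::HashMap;
use std::time::{Duration, Instant};

const SESSION_TIMEOUT: Duration = Duration::from_secs(5 * 60);
const MAX_SESSIONS: usize = 10_000;

/// conv_id = first 4 bytes of the KCP header (assumed little-endian).
pub fn conv_id(packet: &[u8]) -> Option<u32> {
    let b = packet.get(..4)?;
    Some(u32::from_le_bytes([b[0], b[1], b[2], b[3]]))
}

#[derive(Default)]
pub struct SessionTable {
    last_seen: HashMap<u32, Instant>,
}

impl SessionTable {
    /// Record activity for the packet's session. Returns `None` for a
    /// too-short packet or when a new session would exceed the cap.
    pub fn touch(&mut self, packet: &[u8], now: Instant) -> Option<u32> {
        let id = conv_id(packet)?;
        if !self.last_seen.contains_key(&id) && self.last_seen.len() >= MAX_SESSIONS {
            return None;
        }
        self.last_seen.insert(id, now);
        Some(id)
    }

    /// Drop sessions idle for at least the timeout; returns how many were removed.
    pub fn cleanup(&mut self, now: Instant) -> usize {
        let before = self.last_seen.len();
        self.last_seen
            .retain(|_, seen| now.duration_since(*seen) < SESSION_TIMEOUT);
        before - self.last_seen.len()
    }
}
```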

* feat(ipr): integrate KcpSessionManager into MixnetListener

- Add KcpSessionManager field to MixnetListener struct
- Add is_kcp_message() helper to detect KCP-wrapped payloads
- Add on_kcp_message() to process LP client KCP messages
- Refactor on_reconstructed_message() to route KCP vs regular IPR
- Add KCP tick timer (100ms) for session updates and cleanup
- Initialize KcpSessionManager in IpPacketRouter::run_service_provider()

KCP messages are detected by checking byte 4 for valid KCP commands
(81-84), which doesn't conflict with IPR protocol version bytes (6-8)
at position 0.

Closes: nym-96zl

* fix(ipr): prevent KCP detection false positives on IPR messages

Add secondary check in is_kcp_message() to exclude messages that match
IPR protocol header pattern (version 6-8 at byte 0, ServiceProviderType
0-2 at byte 1). This prevents false positives where IPR messages with
byte 4 in range 81-84 would be incorrectly routed to KCP processing.

Added 4 unit tests to validate the detection logic.
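
The two-stage check can be sketched as below. The byte offsets and ranges
come from this message; the function body itself is an illustrative
reconstruction, not the code in ip-packet-router.

```rust
/// KCP command byte sits at offset 4, after the 4-byte conv_id.
const KCP_CMD_OFFSET: usize = 4;

/// Heuristic: does `data` look like a KCP packet rather than an IPR message?
pub fn is_kcp_message(data: &[u8]) -> bool {
    let cmd = match data.get(KCP_CMD_OFFSET) {
        Some(byte) => *byte,
        None => return false, // too short to carry a KCP header
    };
    // Primary check: valid KCP commands are 81..=84.
    if !(81..=84).contains(&cmd) {
        return false;
    }
    // Secondary check: IPR header pattern (version 6-8 at byte 0,
    // ServiceProviderType 0-2 at byte 1) wins over the KCP match.
    let looks_like_ipr = (6..=8).contains(&data[0]) && data[1] <= 2;
    !looks_like_ipr
}
```

One consequence of this ordering, at least in the sketch: a KCP packet whose
conv_id happens to start with bytes matching the IPR pattern is treated as
IPR.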

Closes: nym-6f3x

* fix(ipr): wrap KCP client responses in KCP before SURB reply

- Modify on_kcp_message to handle responses directly instead of returning them
- Add handle_kcp_response method that wraps response in KCP and sends via mixnet
- Ensures KCP clients receive KCP-wrapped responses for proper reassembly

Closes: nym-7oh2

* fix(ipr): send KCP protocol packets in tick instead of just logging

- Add get_sender_tag() and fetch_outgoing_for_conv() to KcpSessionManager
- Change handle_kcp_tick() to actually send ACKs/retransmissions via mixnet
- Reduce KCP tick interval from 100ms to 10ms for better responsiveness

This fixes the KCP reliability protocol which was broken because
protocol packets (ACKs, retransmissions) were generated but never sent.

* feat(lp-client): wrap payload in IpPacketRequest before KCP

- Add nym-ip-packet-requests and bytes dependencies
- Wrap payload in IpPacketRequest::new_data_request() before sending to KCP
- Add LP_DATA_PORT constant (51264) and lp_data_address field to GatewayInfo

This ensures IPR can properly parse incoming messages as DataRequest.
LP framing (wrapping Sphinx in LP before sending) is a separate task.

* feat(lp-client): add LP session management and UDP data plane support

- Add wrap_data() and session_id() to LpRegistrationClient for LP packet
  creation after handshake
- Add init_lp_session() and close_lp_session() to SpeedtestClient for
  managing LP sessions
- Extract prepare_sphinx_fragments() helper to reduce code duplication
  between send_data_with_surbs() and send_data_via_lp()
- Add send_data_via_lp() for sending Sphinx packets through LP's UDP
  data plane (port 51264)

The LP session is kept alive after TCP handshake closes, allowing
subsequent wrap_data() calls for UDP transmission without re-handshaking.

* random formatting

* replaced all instances of bincode::serialize and bincode::deserialize with explicit lp_bincode_serialiser() within the LP

* additional formatting

* removed source of possible panic from nym-kkt

invalid KEM mapping will now return an Err rather than panicking

* integration test for LP entry registration

This includes creation of mocks of various gateway-related components, such as the PeerController

* changed ClientHelloData serialisation

the old variant using bincode did not produce constant-length output in some cases

* Fixed generation of receiver index

removes the possible clash with the bootstrap id

* Integration test for nested LP registration

- move `LpTransport` trait definition to shared `nym-lp-transport` crate
- make transport layer within `LpConnectionHandler` generic with respect to the forwarding target. It must, however, use the same type as the incoming client connection
- extracted explicit `LpConnectionHandler::establish_exit_stream` to more easily modify it in the future to fully protect the channel and disallow using untrusted egress points
- fix additional log-string interpolation nits

* resolved clippy issues pointed out by clippy 1.91

* added LP discovery into self-described endpoint:

- removed changes to the node bonding within the contract
- introduced '/api/v1/lewes-protocol' route within nym-node http api
- added 'lewes_protocol' field to 'NymNodeData' inside of NymNodeDescription
- refactored LpConfig to allow separate bind and announce addresses and used more strict typing

* chore: allow unwrap/expect within kkt benchmarking code

* chore: downgraded sha2 dep for cosmwasm compatibility

* clippy

* marking simd calls as unsafe

* fixed calls to '_mm_testz_si128'

* additional clippy fixes

---------

Co-authored-by: Georgio Nicolas <me@georgio.xyz>
Co-authored-by: Jędrzej Stuczyński <jedrzej.stuczynski@gmail.com>
2026-01-14 09:06:02 +00:00
Jędrzej Stuczyński 46fe1bc819 bugfix: mozzarella -> niolo config migration (#6259)
* bugfix: mozzarella -> niolo config migration

* clippy
2025-12-02 15:29:30 +00:00
zyc e50051795e Fix comment 2025-11-26 21:11:38 +08:00