netmaker

mirror of https://github.com/gravitl/netmaker.git synced 2026-04-23 00:17:10 +08:00

Author	SHA1	Message	Date
Abhishek Kondur	f8a0cfd744	v1.5.1: release notes (#3950) * v1.5.1: release notes * v1.5.1: release notes * v1.5.1: release notes * v1.5.1: release notes * v1.5.1: release notes * v1.5.1: update version tag * v1.5.1: update version tag	2026-03-31 20:01:57 +05:30
Abhishek Kondur	79c56b0c1c	NM-273: Vnat pool assignments fix (#3926) * NM-273: Vnat pool assignments fix * NM-273: rename var * NM-273: add 2 vCPUs indication for monitoring stack	2026-03-20 08:10:16 +05:30
Abhishek Kondur	292af315dd	NM-271: Scalability Improvements (#3921 ) * feat(go): add user schema; * feat(go): migrate to user schema; * feat(go): add audit fields; * feat(go): remove unused fields from the network model; * feat(go): add network schema; * feat(go): migrate to network schema; * refactor(go): add comment to clarify migration logic; * fix(go): test failures; * fix(go): test failures; * feat(go): change membership table to store memberships at all scopes; * feat(go): add schema for access grants; * feat(go): remove nameservers from new networks table; ensure db passed for schema functions; * feat(go): set max conns for sqlite to 1; * fix(go): issues updating user account status; * NM-236: streamline operations in HA mode * NM-236: only master pod should subscribe to updates from clients * refactor(go): remove converters and access grants; * refactor(go): add json tags in schema models; * refactor(go): rename file to migrate_v1_6_0.go; * refactor(go): add user groups and user roles tables; use schema tables; * refactor(go): inline get and list from schema package; * refactor(go): inline get network and list users from schema package; * fix(go): staticcheck issues; * fix(go): remove test not in use; fix test case; * fix(go): validate network; * fix(go): resolve static checks; * fix(go): new models errors; * fix(go): test errors; * fix(go): handle no records; * fix(go): add validations for user object; * fix(go): set correct extclient status; * fix(go): test error; * feat(go): make schema the base package; * feat(go): add host schema; * feat(go): use schema host everywhere; * feat(go): inline get host, list hosts and delete host; * feat(go): use non-ptr value; * feat(go): use save to upsert all fields; * feat(go): use save to upsert all fields; * feat(go): save turn endpoint as string; * feat(go): check for gorm error record not found; * fix(go): test failures; * fix(go): update all network fields; * fix(go): update all network fields; * feat(go): add 
paginated list networks api; * feat(go): add paginated list users api; * feat(go): add paginated list hosts api; * feat(go): add pagination to list groups api; * fix(go): comment; * fix(go): implement marshal and unmarshal text for custom types; * fix(go): implement marshal and unmarshal json for custom types; * fix(go): just use the old model for unmarshalling; * fix(go): implement marshal and unmarshal json for custom types; * NM-271:Import swap: compress/gzip replaced with github.com/klauspost/compress/gzip (2-4x faster, wire-compatible output). Added sync import. Two sync.Pool variables (gzipWriterPool, bufferPool): reuse gzip.Writer and bytes.Buffer across calls instead of allocating fresh ones per publish. compressPayload rewritten: pulls writer + buffer from pools, resets them, compresses at gzip.BestSpeed (level 1), copies the result out of the pooled buffer, and returns both objects to the pools. * feat(go): remove paginated list networks api; * feat(go): use custom paginated response object; * NM-271: Improve server scalability under high host count - Replace stdlib compress/gzip with klauspost/compress at BestSpeed and pool gzip writers and buffers via sync.Pool to eliminate compression as the dominant CPU hotspot. - Debounce peer update broadcasts with a 500ms resettable window capped at 3s max-wait, coalescing rapid-fire PublishPeerUpdate calls into a single broadcast cycle. - Cache HostPeerInfo (batch-refreshed by debounce worker) and HostPeerUpdate (stored as side-effect of each publish) so the pull API and peer_info API serve from pre-computed maps instead of triggering expensive per-host computations under thundering herd conditions. - Warm both caches synchronously at startup before the first publish cycle so early pull requests are served instantly. - Bound concurrent MQTT publishes to 5 via semaphore to prevent broker TCP buffer overflows that caused broken pipe disconnects. 
- Remove manual Disconnect+SetupMQTT from ConnectionLostHandler and rely on the paho client's built-in AutoReconnect; add a 5s retry wait in publish() to ride out brief reconnection windows. * NM-271: Reduce server CPU contention under high concurrent load - Cache ServerSettings with atomic.Value to eliminate repeated DB reads on every pull request (was 32+ goroutines blocked on read lock) - Batch UpdateNodeCheckin writes in memory, flush every 30s to reduce per-checkin write lock contention (was 88+ goroutines blocked) - Enable SQLite WAL mode + busy_timeout and remove global dbMutex; let SQLite handle concurrency natively (reads no longer block writes) - Move ResetFailedOverPeer/ResetAutoRelayedPeer to async in pull() handler since results don't affect the cached response - Skip no-op UpsertNode writes in failover/relay reset functions (early return when node has no failover/relay state) - Remove CheckHostPorts from hostUpdateFallback hot path - Switch to pure-Go SQLite driver (glebarez/sqlite), set CGO_ENABLED=0 * fix(go): ensure default values for page and per_page are used when not passed; * fix(go): rename v1.6.0 to v1.5.1; * fix(go): check for gorm.ErrRecordNotFound instead of database.IsEmptyRecord; * fix(go): use host id, not pending host id; * NM-271: Revert pure-Go SQLite and FIPS disable to verify impact Revert to CGO-based mattn/go-sqlite3 driver and re-enable FIPS to isolate whether these changes are still needed now that the global dbMutex has been removed and WAL mode is enabled. Keep WAL mode pragma with mattn-compatible DSN format. 
* feat(go): add filters to paginated apis; * feat(go): add filters to paginated apis; * feat(go): remove check for max username length; * feat(go): add filters to count as well; * feat(go): use library to check email address validity; * feat(go): ignore pagination if params not passed; * fix(go): pagination issues; * fix(go): check exists before using; * fix(go): remove debug log; * NM-271: rm debug logs * NM-271: check if caching is enabled * NM-271: add server sync mq topic for HA mode * NM-271: fix build * NM-271: push metrics in batch to exproter over api * NM-271: use basic auth for exporter metrics api * fix(go): use gorm err record not found; * NM-271: Add monitoring stack on demand * NM-271: -m arg for install script should only add monitoring stack * fix(go): use gorm err record not found; * NM-271: update docker compose file for prometheus * NM-271: update docker compose file for prometheus * fix(go): use user principal name when creating pending user; * fix(go): use schema package for consts; * NM-236: rm duplicate network hook * NM-271: add server topic to reset idp hooks on master node * fix(go): prevent disabling superadmin user; Co-authored-by: tenki-reviewer[bot] <262613592+tenki-reviewer[bot]@users.noreply.github.com> * fix(go): swap is admin and is superadmin; Co-authored-by: tenki-reviewer[bot] <262613592+tenki-reviewer[bot]@users.noreply.github.com> * fix(go): remove dead code block; https://github.com/gravitl/netmaker/pull/3910#discussion_r2928837937 * fix(go): incorrect message when trying to disable self; https://github.com/gravitl/netmaker/pull/3910#discussion_r2928837934 * NM-271: fix stale peers on reset_failovered pull and add HTTP timeout to metrics exporter Run the failover/relay reset synchronously in the pull handler so the response reflects post-reset topology instead of serving stale cached peers. Add a 30s timeout to the metrics exporter HTTP client to prevent PushAllMetricsToExporter from blocking the Keepalive loop. 
* NM-271: fix gzip pool corruption, MQTT topic mismatch, stale settings cache, and reduce redundant DB fetches - Only return gzip.Writer to pool after successful Close to prevent silently malformed MQTT payloads from a previously errored writer. - Fix serversync subscription to exact topic match since syncType is now in the message payload, not the topic path. - Prevent zero-value ServerSettings from being cached indefinitely when the DB record is missing or unmarshal fails on startup. - Return fetched hosts/nodes from RefreshHostPeerInfoCache so warmPeerCaches reuses them instead of querying the DB twice. - Compute fresh HostPeerUpdate on reset_failovered pull instead of serving stale cache, and store result back for subsequent requests. * NM-271: fix gzip writer pool leak, log checkin flush errors, and fix master pod ordinal parsing - Reset gzip.Writer to io.Discard before returning to pool so errored writers are never leaked or silently reused with corrupt state. - Track and log failed DB inserts in FlushNodeCheckins so operators have visibility when check-in timestamps are lost. - Parse StatefulSet pod ordinal as integer instead of using HasSuffix to prevent netmaker-10 from being misidentified as master pod. 
* NM-271: simplify masterpod logic * fix(go): use correct header; Co-authored-by: tenki-reviewer[bot] <262613592+tenki-reviewer[bot]@users.noreply.github.com> * fix(go): return after error response; Co-authored-by: tenki-reviewer[bot] <262613592+tenki-reviewer[bot]@users.noreply.github.com> * fix(go): use correct order of params; https://github.com/gravitl/netmaker/pull/3910#discussion_r2929593036 * fix(go): set default values for page and page size; use v2 instead of /list; * NM-271: use host name * Update mq/serversync.go Co-authored-by: tenki-reviewer[bot] <262613592+tenki-reviewer[bot]@users.noreply.github.com> * NM-271: fix duplicate serversynce case * NM-271: streamline gw updates * Update logic/auth.go Co-authored-by: tenki-reviewer[bot] <262613592+tenki-reviewer[bot]@users.noreply.github.com> * Update schema/user_roles.go Co-authored-by: tenki-reviewer[bot] <262613592+tenki-reviewer[bot]@users.noreply.github.com> * fix(go): syntax error; * fix(go): set default values when page and per_page are not passed or 0; * fix(go): use uuid.parse instead of uuid.must parse; * fix(go): review errors; * fix(go): review errors; * Update controllers/user.go Co-authored-by: tenki-reviewer[bot] <262613592+tenki-reviewer[bot]@users.noreply.github.com> * Update controllers/user.go Co-authored-by: tenki-reviewer[bot] <262613592+tenki-reviewer[bot]@users.noreply.github.com> * NM-163: fix errors: * Update db/types/options.go Co-authored-by: tenki-reviewer[bot] <262613592+tenki-reviewer[bot]@users.noreply.github.com> * fix(go): persist return user in event; * Update db/types/options.go Co-authored-by: tenki-reviewer[bot] <262613592+tenki-reviewer[bot]@users.noreply.github.com> * NM-271: signal pull on ip changes * NM-163: duplicate lines of code * NM-163: fix(go): fix missing return and filter parsing in user controller - Add missing return after error response in updateUserAccountStatus to prevent double-response and spurious ext-client side-effects - Use switch statements in 
listUsers to skip unrecognized account_status and mfa_status filter values * NM-271: signal pull req on node ip change * fix(go): check for both min and max page size; * NM-271: refresh node object before update * fix(go): enclose transfer superadmin in transaction; * fix(go): review errors; * fix(go): remove free tier checks; * fix(go): review fixes; * NM-271: streamline ip pool ops * NM-271: fix tests, set max idle conns * NM-271: fix(go): fix data races in settings cache and peer update worker - Use pointer type in atomic.Value for serverSettingsCache to avoid replacing the variable non-atomically in InvalidateServerSettingsCache - Swap peerUpdateReplace flag before draining the channel to prevent a concurrent replacePeers=true from being consumed by the wrong cycle --------- Co-authored-by: VishalDalwadi <dalwadivishal26@gmail.com> Co-authored-by: Vishal Dalwadi <51291657+VishalDalwadi@users.noreply.github.com> Co-authored-by: tenki-reviewer[bot] <262613592+tenki-reviewer[bot]@users.noreply.github.com>	2026-03-18 00:24:54 +05:30
Abhishek Kondur	e9675343a1	NM-241: Simplify grafana and Prometheus Setup, install script dir (#3868) * NM-241: add metrics secret to install script * NM-241: add install directory, download grafana files * NM-241: update exporter setup * NM-241: update exporter env vars * NM-241: update volume * NM-241: update promethues and grafana volumes * NM-241: remove caddy domain for prom * NM-241: rm graph grafana dashboard * NM-241: add container name to prom and grafana * NM-241: avoid creating new sub install folders	2026-03-02 11:23:48 +04:00
Abhishek Kondur	6b7d33fa77	v1.5.0: release notes (#3862) * v1.5.0: update release notes * v1.5.0: bump up version * v1.5.0: update release notes * v1.5.0: update release notes * v1.5.0: update release notes	2026-02-11 22:05:18 +04:00
Abhishek Kondur	bcf4402551	v1.4.0: release notes (#3797) * Update release notes * bump up version to v1.4.0 * update release notes	2025-12-22 20:01:44 +04:00
Vishal Dalwadi	36a88544af	Remove Flow Logs Infra Changes (#3778 ) * feat(go): define flow events; * feat(go): improve structure; * feat(go): improve structure; * feat(go): remove old flow definitions; * feat(sql): add clickhouse init scripts; * feat(sql): add protobuf spec; * fix(sql): store ip as string; * feat(go): move proto def to grpc dir; * feat(go): use node instead of host as type; optimize protobuf defs; * feat(go): add clickhouse db support; add endpoint to query flows; * fix(go): fix clickhouse config; * fix(go): use error response structure to report error; * feat(go): pass flow logging status to netclient; * feat(go): add peer ip identity map to host peer info; * feat(go): remove prefix from participant obj fields; * feat(go): add flow logs enabled field to host; * feat(go): add filtering to get flow api; * feat(go): fix record struct; * feat(go): add exporter url to server config; * feat(go): add exporter url to server config; * feat(go): enable flow logs by default; * feat(go): update nm-quick.sh; * feat(go): update nm-quick.sh; * feat(go): update nm-quick.sh; * feat(go): update nm-quick.sh; * feat(go): add db initialization logic; * feat(go): filter by network id; * fix(go): connection issue; * fix(go): connection issue; * fix(go): golang builder version; * feat(go): add server settings for flow logs; * feat(go): initialize clickhouse in pro; check for retention; * feat(go): add exporter feature flags; * feat(go): add grpc behind caddy; * feat(go): expose ports correctly; * fix(go): grpc caddyfile config; * fix(go): publish exporter feature flags on license validation; * fix(go): set server name for netmaker exporter; * fix(go): set server name for netmaker exporter; * fix(go): check for nil cancel func; * fix(go): add flow logs field to api host; * fix(go): add flow logs field to api host; * fix(go): remove port from grpc setting; * chore(go): tabs; * feat(go): introduce egress range participant type;. 
* feat(go): rename egress range to egress route for uniform language; * feat(go): rename egress range to egress route for uniform language; * feat: add peer addr identity map to host peer update; * feat: add address identity map to host peer update; * feat: add address identity map to host peer update; * feat: set correct from and to args; * feat: add support for filtering by node; * feat: use corresponding base image; * feat: update dockerfile base image version; * fix: disable flow logs for all host when global settings are changed; * refactor: setup flow logs manually;	2025-12-13 15:21:23 +04:00
Vishal Dalwadi	a4981ffd26	NM-168: Network Flow Logs (#3754 ) * feat(go): define flow events; * feat(go): improve structure; * feat(go): improve structure; * feat(go): remove old flow definitions; * feat(sql): add clickhouse init scripts; * feat(sql): add protobuf spec; * fix(sql): store ip as string; * feat(go): move proto def to grpc dir; * feat(go): use node instead of host as type; optimize protobuf defs; * feat(go): add clickhouse db support; add endpoint to query flows; * fix(go): fix clickhouse config; * fix(go): use error response structure to report error; * feat(go): pass flow logging status to netclient; * feat(go): add peer ip identity map to host peer info; * feat(go): remove prefix from participant obj fields; * feat(go): add flow logs enabled field to host; * feat(go): add filtering to get flow api; * feat(go): fix record struct; * feat(go): add exporter url to server config; * feat(go): add exporter url to server config; * feat(go): enable flow logs by default; * feat(go): update nm-quick.sh; * feat(go): update nm-quick.sh; * feat(go): update nm-quick.sh; * feat(go): update nm-quick.sh; * feat(go): add db initialization logic; * feat(go): filter by network id; * fix(go): connection issue; * fix(go): connection issue; * fix(go): golang builder version; * feat(go): add server settings for flow logs; * feat(go): initialize clickhouse in pro; check for retention; * feat(go): add exporter feature flags; * feat(go): add grpc behind caddy; * feat(go): expose ports correctly; * fix(go): grpc caddyfile config; * fix(go): publish exporter feature flags on license validation; * fix(go): set server name for netmaker exporter; * fix(go): set server name for netmaker exporter; * fix(go): check for nil cancel func; * fix(go): add flow logs field to api host; * fix(go): add flow logs field to api host; * fix(go): remove port from grpc setting; * chore(go): tabs; * feat(go): introduce egress range participant type;. 
* feat(go): rename egress range to egress route for uniform language; * feat(go): rename egress range to egress route for uniform language; * feat: add peer addr identity map to host peer update; * feat: add address identity map to host peer update; * feat: add address identity map to host peer update; * feat: set correct from and to args; * feat: add support for filtering by node; * feat: use corresponding base image; * feat: update dockerfile base image version; * fix: disable flow logs for all host when global settings are changed;	2025-12-12 14:12:00 +04:00
Abhishek Kondur	94f3716fdf	Merge pull request #3744 from gravitl/NM-167 NM-167: Auto delete Offline Nodes	2025-12-05 09:52:53 +04:00
abhishek9686	f4f4a02a8d	update script version	2025-11-07 16:10:55 +04:00
Abhishek K	ee9f848ac5	NM-134: Activate user onboarding (#3686) * activate Ui onboarding, removing network configuration * activate Ui onboarding, removing network configuration	2025-10-25 14:44:33 +04:00
abhishek9686	061ae11bac	add interface up and teardown option to ci script	2025-09-16 13:09:38 +05:30
abhishek9686	91a227f74b	add ci-runner script	2025-09-16 11:07:38 +05:30
abhishek9686	0932049163	resolve merge conflicts	2025-09-12 17:44:55 +05:30
abhishek9686	20b1b6cfd8	bump up install script version	2025-09-12 17:15:04 +05:30
Abhishek K	a8a0dd066c	NM-44: Device Approvals for Network Join (#3579 ) * add pending hosts apis, migration logic for network auto join field * fix pending hosts logic on join * delete pending hosts on host delete * ignore pedning device request if host in the network already * add peer update on host approval	2025-08-12 09:16:51 +05:30
abhishek9686	05fe84980a	update version tag on install script	2025-06-26 12:48:10 +05:30
Aceix	2df02f747e	Merge pull request #3504 from gravitl/depracate-rac-autodisable chore: deprecate rac autodisable flag	2025-06-24 23:43:44 +05:30
abhishek9686	968ffe4db2	update release tag in install script	2025-06-06 18:57:40 +05:30
abhishek9686	494cc7f367	Merge branch 'master' of https://github.com/gravitl/netmaker into develop	2025-06-06 17:40:54 +05:30
abhishek9686	cd639ae969	add version tag on install script	2025-06-06 17:40:32 +05:30
abhishek9686	39d35c160c	change dns default domain	2025-06-06 13:36:48 +05:30
Abhishek K	810ff21165	NET-2014: add audit log retention period, add timestamp for events (#3486 ) * revert inet gws from acl policies * add egress range with metric for inet gw * link pro inet funcs * add timestamp params to activity apis	2025-06-06 13:19:56 +05:30
Abhishek K	a1304b43d8	NET-2054: Auto Removal of Offline Nodes, fix enrollment key relay function (#3458 ) * check host ports on join * if 443 not available fallback to 51821 * if 443 not available fallback to 51821 * add config for auto delete of offline nodes * autocleanup offline nodes * delete offline nodes on startup * fix relay via join token	2025-05-24 08:21:47 +05:30
abhishek9686	b17b200581	udpate ip service	2025-04-01 10:04:57 +04:00
Aceix	880c3acfc1	fix: config to allow muti-net connections on netdesk (#3371)	2025-03-17 18:49:05 +04:00
Abhishek K	6ccfd10797	set managed dns to true (#3362)	2025-03-11 01:05:01 +04:00
Abhishek K	d46050cab4	Merge pull request #3357 from gravitl/add-rac-cfg-for-multiple-network-connections feat: add config to allow muti-net connections on netdesk	2025-03-11 00:52:39 +04:00
abhishek9686	984db44c78	fix extclient comms to gws	2025-03-05 23:06:38 +04:00
the_aceix	0e89eebc2a	feat: add config to allow muti-net connections on netdesk	2025-03-05 15:45:26 +00:00
Abhishek K	e13bf2c0eb	NET-1923: Add Metric Port to server config (#3306 ) * set default metrics port 8889 * set default metrics port 51821 * add metrics port to server config * bind caddy only on tcp * add var for pulling files * add new line * update peer update model * check if port is not zero * set replace peer to false on pull * do not replace peers on failover sync * remove debug log * add old peer update fields for backwards compatibility * add old json tag * add debug log in caller trace func	2025-02-04 08:44:24 +04:00
abhishek9686	d47be71f33	pull manifests from master	2025-01-10 13:20:37 +05:30
abhishek9686	8d4b2d572e	update comment	2025-01-09 10:47:33 +05:30
abhishek9686	d1a9fa92da	set failover	2025-01-09 10:46:42 +05:30
abhishek9686	25a09857cf	Revert "Reapply "pull test binary"" This reverts commit `554d575428`.	2025-01-09 10:06:02 +05:30
abhishek9686	4ddbc371a2	remove inet gw setup	2025-01-06 13:53:46 +04:00
abhishek9686	554d575428	Reapply "pull test binary" This reverts commit `42a958ee80`.	2025-01-01 15:44:14 +04:00
abhishek9686	42a958ee80	Revert "pull test binary" This reverts commit `fed3ce0ae7`.	2025-01-01 15:43:02 +04:00
abhishek9686	58050ac006	remove static port	2024-12-31 08:38:17 +04:00
abhishek9686	fed3ce0ae7	pull test binary	2024-12-30 22:25:18 +04:00
abhishek9686	eb01c5c869	increase sleep	2024-12-28 20:03:38 +04:00
abhishek9686	7a6ce59204	handle ip check gracefully	2024-12-28 15:57:36 +04:00
abhishek9686	27ca7f490e	listen on ipv6 if available	2024-12-20 14:28:22 +04:00
abhishek9686	7361571b6a	update default domain	2024-12-18 22:32:39 +04:00
Yabin Ma	5f21c8bb1d	NET-1778: scale test code changes (#3203 ) * comment ACL call and add debug message * add cache for network nodes * fix load node to network cache issue * add peerUpdate call 1 min limit * add debug log for scale test * release maps * avoid default policy for node * 1 min limit for peerUpdate trigger * mq options * Revert "mq options" This reverts commit `10b93d0118`. * set peerUpdate run in sequence * update for emqx 5.8.2 * remove batch peer update * change the sleep to 10 millisec to avoid timeout * add compress and change encrypt for peerUpdate message * add mem profiling and automaxprocs * add failover ctx mutex * ignore request to failover peer * remove code without called * remove debug logs * update emqx to v5.8.2 * change broker keepalive * add OLD_ACL_SUPPORT setting * add host version check for message encrypt * remove debug message * remove peerUpdate call control --------- Co-authored-by: abhishek9686 <abhi281342@gmail.com>	2024-12-10 10:15:31 +04:00
Yabin Ma	87ef555542	NET1847:Add STUN settings (#3235) * add setting to turn on/off STUN * sync stun setting in peerUpdate * sync stun servers setting in peerUpdate	2024-12-06 09:38:32 +04:00
Yabin Ma	508c4cf8a9	fix nm-quick.sh -p issue (#3234)	2024-12-03 13:29:44 +04:00
Abhishek K	8546f858c1	NET-1780: Bind Caddy to public IP, set default netclient to use port 443 (#3220) * bind caddy to public ip * set netclient on server to 443	2024-12-03 13:25:49 +04:00
abhishek9686	a964d2bf7d	set managed to false by default	2024-11-08 10:44:30 +04:00
abhishek9686	c11629b8b4	set up failover only in pro	2024-11-06 11:53:37 +04:00


456 Commits