failed: listen tcp 0.0.0.0:22000: listen: address already in use

We are using okteto on a Kubernetes cluster. We have many users running jobs, and we keep getting frozen pods. The initial sync seems fine, but a lot of the time, when the job completes on the Kubernetes pod and is syncing data back, it freezes. There are no errors on the containers or on the server node that launched the job. I looked in the syncthing log and found the errors below. I'm guessing this means the address/port is in use because of another user. If that seems to be the case based on the logs, how do I change the default port, and will each user need to use a different port? I don't see a /config/syncthing anywhere, and I'm thinking that's because it comes packaged with okteto.

[ABKAV] 09:32:20 INFO: Listen (BEP/tcp): listen tcp 0.0.0.0:22000: listen: address already in use
[ABKAV] 09:32:20 INFO: listenerSupervisor@tcp://0.0.0.0:22000: service tcp://0.0.0.0:22000 failed: listen tcp 0.0.0.0:22000: listen: address already in use
[ABKAV] 09:32:20 INFO: Listen (BEP/tcp): listen tcp 0.0.0.0:22000: listen: address already in use
[ABKAV] 09:32:20 INFO: listenerSupervisor@tcp://0.0.0.0:22000: service tcp://0.0.0.0:22000 failed: listen tcp 0.0.0.0:22000: listen: address already in use
[ABKAV] 09:32:20 INFO: Listen (BEP/tcp): listen tcp 0.0.0.0:22000: listen: address already in use
[ABKAV] 09:32:20 INFO: listenerSupervisor@tcp://0.0.0.0:22000: service tcp://0.0.0.0:22000 failed: listen tcp 0.0.0.0:22000: listen: address already in use
[ABKAV] 09:32:33 INFO: Exiting

Port 22000 is the port that okteto uses by default for SSH. Is more than one developer doing okteto up to the same namespace and pod? Is the same developer running more than one okteto up on the same machine?

Could you share the okteto.yaml the developers are using? It will help us understand and troubleshoot what's going on.

Thanks!
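For anyone following along: the "address already in use" message means some process on that host is already holding the port. Below is a minimal Python sketch for checking that on whichever machine you run it (the launch server, or inside the pod if Python is available there). The helper name `port_in_use` is made up for illustration and is not part of okteto or syncthing, and the check only tells you that the port is taken, not which process holds it.

```python
import errno
import socket


def port_in_use(port: int, host: str = "0.0.0.0") -> bool:
    """Return True if binding host:port fails with EADDRINUSE."""
    with socket.socket(socket.AF_INET, socket.SOCK_STREAM) as sock:
        # Set SO_REUSEADDR, as most servers do, so that sockets lingering
        # in TIME_WAIT are less likely to be reported as a conflict.
        sock.setsockopt(socket.SOL_SOCKET, socket.SO_REUSEADDR, 1)
        try:
            sock.bind((host, port))
        except OSError as exc:
            if exc.errno == errno.EADDRINUSE:
                return True
            raise  # some other problem, e.g. a permissions error
    return False


if __name__ == "__main__":
    # 22000 is the port that appears in the syncthing log above.
    print(f"port 22000 in use: {port_in_use(22000)}")
```

If this reports the port as taken while no okteto up session of yours is running, another process or user on the same host is probably holding it.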

I will get a copy of the yaml and post it.
To answer your questions: each developer generates their own pod on the cluster, but many developers are doing okteto up from the same server and in the same namespace. I also believe that in some cases a developer runs okteto up for different jobs from the same server.

In one specific case, though, we have a developer who is the only one in his namespace running okteto up, and his pods still freeze. The sequence is: his container runs, the job completes, and the sync freezes as the data is being written back. The freeze always happens when data is being pushed back from the server that launched the job. But yes, there are other developers using that same server for their jobs.

name: xxx-$USER
namespace: xxx
autocreate: true
image: repo.xxx.com/image
persistentVolume:
  enabled: false
command: bash
workdir: /build
sync:
  - .:/build
  #- .docker:/build/.docker
  #- platform:/build/platform
  #- src:/build/src
  #- bridge_apps:/build/bridge_apps
  #- name:/builder/name
  #- integration-tests:/build/integration-tests
#securityContext:
#  runAsUser: 30000
#  runAsGroup: 30000
#volumes:
#  - /opt/dir/apps/project/name_7_xxxxx
initContainer:
  image: repo.xxx.com:9900/okteto/bin:1.3.5
resources:
  requests:
    cpu: "0"
    memory: "12Gi"
    ephemeral-storage: "12Gi"
  limits:
    cpu: "8"

Thanks for helping. I have posted a copy of our yaml. I did have to change a few of the names, but the context is the same.

Is there a way to change the default port?

@bpopovic, yes, you can configure the ssh port by setting the reverse field in your okteto manifest.

I have tried to reproduce it with your manifest, but my sync is working as expected. Could you try setting the command field in your manifest to sh (command: sh) and running the command from there? Maybe there is an error in the application layer that exits the pod and stops the synchronisation.

The builds work some of the time and fail some of the time, depending on the developer. For some users it freezes 20 builds in a row, then maybe one passes, and then a lot more freeze. It's sporadic for some users and pretty constant for others. I will try your suggestion and also post the errors I'm seeing in the okteto log.

nce=0&timeout=0": context deadline exceeded (Client.Timeout exceeded while awaiting headers)" action=a7352dd9-6735-49a1-8e42-f87ddadb5274 version=1.15.8
time="2023-04-13T09:31:53-06:00" level=info msg="error getting events: failed to call syncthing [rest/events]: Get \"http://localhost:41713/rest/events?events=FolderSummary&limit=1&since=0&timeout=0\": context deadline exceeded (Client.Timeout exceeded while awaiting headers)" action=a7352dd9-6735-49a1-8e42-f87ddadb5274 version=1.15.8
time="2023-04-13T09:32:33-06:00" level=info msg="syncthing pull error local=true: syncing: no connected device has the required version of this file\n" action=a7352dd9-6735-49a1-8e42-f87ddadb5274 version=1.15.8
time="2023-04-13T09:32:33-06:00" level=info msg="syncthing error local=true retry 4: docker/WRSDK_ls1043ardb.tar.bz2: syncing: no connected device has the required version of this file\n" action=a7352dd9-6735-49a1-8e42-f87ddadb5274 version=1.15.8
time="2023-04-13T09:32:33-06:00" level=info msg="syncthing monitor error, sending disconnect signal: docker/WRSDK_ls1043ardb.tar.bz2: syncing: no connected device has the required version of this file\n" action=a7352dd9-6735-49a1-8e42-f87ddadb5274 version=1.15.8
time="2023-04-13T09:32:33-06:00" level=info msg="starting shutdown sequence" action=a7352dd9-6735-49a1-8e42-f87ddadb5274 version=1.15.8
time="2023-04-13T09:32:33-06:00" level=info msg="sent cancellation signal" action=a7352dd9-6735-49a1-8e42-f87ddadb5274 version=1.15.8
time="2023-04-13T09:32:33-06:00" level=info msg="stopping syncthing" action=a7352dd9-6735-49a1-8e42-f87ddadb5274 version=1.15.8
time="2023-04-13T09:32:33-06:00" level=info msg="ssh forward localhost:44441->0.0.0.0:22000 → done" action=a7352dd9-6735-49a1-8e42-f87ddadb5274 version=1.15.8
time="2023-04-13T09:32:33-06:00" level=debug msg="call to up.applyToApp cancelled: {}" action=a7352dd9-6735-49a1-8e42-f87ddadb5274 version=1.15.8
time="2023-04-13T09:32:33-06:00" level=info msg="ssh forward localhost:41713->0.0.0.0:8384 → done" action=a7352dd9-6735-49a1-8e42-f87ddadb5274 version=1.15.8
time="2023-04-13T09:32:33-06:00" level=info msg="command failed: wait: remote command exited without exit status or exit signal" action=a7352dd9-6735-49a1-8e42-f87ddadb5274 version=1.15.8
time="2023-04-13T09:32:33-06:00" level=info msg="ssh client for exec closed" action=a7352dd9-6735-49a1-8e42-f87ddadb5274 version=1.15.8
time="2023-04-13T09:32:33-06:00" level=info msg="terminal restored" action=a7352dd9-6735-49a1-8e42-f87ddadb5274 version=1.15.8
time="2023-04-13T09:32:33-06:00" level=info msg="syncthing ping error 0" action=a7352dd9-6735-49a1-8e42-f87ddadb5274 version=1.15.8
time="2023-04-13T09:32:33-06:00" level=info msg="terminating syncthing 112179 without wait" action=a7352dd9-6735-49a1-8e42-f87ddadb5274 version=1.15.8
time="2023-04-13T09:32:33-06:00" level=info msg="terminated syncthing 112179 without wait" action=a7352dd9-6735-49a1-8e42-f87ddadb5274 version=1.15.8
time="2023-04-13T09:32:33-06:00" level=info msg="stopping forwarders" action=a7352dd9-6735-49a1-8e42-f87ddadb5274 version=1.15.8
time="2023-04-13T09:32:33-06:00" level=info msg="stopped k8s forwarder" action=a7352dd9-6735-49a1-8e42-f87ddadb5274 version=1.15.8
time="2023-04-13T09:32:33-06:00" level=info msg="stopped SSH forward manager" action=a7352dd9-6735-49a1-8e42-f87ddadb5274 version=1.15.8
time="2023-04-13T09:32:33-06:00" level=info msg="completed shutdown sequence" action=a7352dd9-6735-49a1-8e42-f87ddadb5274 version=1.15.8
time="2023-04-13T09:32:33-06:00" level=info msg="ssh forward localhost:41713->0.0.0.0:8384 → failed to dial remote connection: ssh: unexpected packet in response to channel open: " action=a7352dd9-6735-49a1-8e42-f87ddadb5274 version=1.15.8
time="2023-04-13T09:32:33-06:00" level=info msg="ssh forward localhost:41713->0.0.0.0:8384 → failed to dial remote connection: ssh: unexpected packet in response to channel open: " action=a7352dd9-6735-49a1-8e42-f87ddadb5274 version=1.15.8
time="2023-04-13T09:32:33-06:00" level=info msg="ssh forward localhost:41713->0.0.0.0:8384 → failed to dial remote connection: ssh: unexpected packet in response to channel open: " action=a7352dd9-6735-49a1-8e42-f87ddadb5274 version=1.15.8
time="2023-04-13T09:32:33-06:00" level=info msg="ssh forward localhost:41713->0.0.0.0:8384 → failed to dial remote connection: ssh: unexpected packet in response to channel open: " action=a7352dd9-6735-49a1-8e42-f87ddadb5274 version=1.15.8
time="2023-04-13T09:32:33-06:00" level=info msg="ssh forward localhost:41713->0.0.0.0:8384 → failed to dial remote connection: ssh: unexpected packet in response to channel open: " action=a7352dd9-6735-49a1-8e42-f87ddadb5274 version=1.15.8
time="2023-04-13T09:32:33-06:00" level=info msg="ssh forward localhost:41713->0.0.0.0:8384 → failed to dial remote connection: ssh: unexpected packet in response to channel open: " action=a7352dd9-6735-49a1-8e42-f87ddadb5274 version=1.15.8
time="2023-04-13T09:32:33-06:00" level=info msg="ssh forward localhost:41713->0.0.0.0:8384 → failed to dial remote connection: ssh: unexpected packet in response to channel open: " action=a7352dd9-6735-49a1-8e42-f87ddadb5274 version=1.15.8
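Logs like the one above are easier to compare across frozen builds once the repeated lines are collapsed. The following is a rough Python sketch that tallies the `msg` field of each line. It assumes the `time=... level=... msg="..."` layout shown above; the `summarize` helper and the `SAMPLE` list are made up for illustration (the sample lines are copied from the log in this thread), and the curly-quote handling only exists because pasting into a forum can convert straight quotes.

```python
import re
from collections import Counter

# Captures the text inside msg="...", allowing backslash-escaped quotes.
MSG_RE = re.compile(r'msg="((?:[^"\\]|\\.)*)"')


def summarize(lines):
    """Count how often each msg value occurs in the given log lines."""
    counts = Counter()
    for line in lines:
        # Undo typographic quotes introduced by copy/paste, if any.
        line = line.replace("\u201c", '"').replace("\u201d", '"')
        match = MSG_RE.search(line)
        if match:
            counts[match.group(1)] += 1
    return counts


# Three lines taken from the log posted above.
SAMPLE = [
    'time="2023-04-13T09:32:33-06:00" level=info msg="stopping syncthing" '
    'action=a7352dd9-6735-49a1-8e42-f87ddadb5274 version=1.15.8',
    'time="2023-04-13T09:32:33-06:00" level=info msg="ssh forward '
    'localhost:41713->0.0.0.0:8384 → failed to dial remote connection: ssh: '
    'unexpected packet in response to channel open: " '
    'action=a7352dd9-6735-49a1-8e42-f87ddadb5274 version=1.15.8',
    'time="2023-04-13T09:32:33-06:00" level=info msg="ssh forward '
    'localhost:41713->0.0.0.0:8384 → failed to dial remote connection: ssh: '
    'unexpected packet in response to channel open: " '
    'action=a7352dd9-6735-49a1-8e42-f87ddadb5274 version=1.15.8',
]

if __name__ == "__main__":
    for msg, count in summarize(SAMPLE).most_common():
        print(count, msg)
```

To run it against a real log, pass the open file object to `summarize` instead of `SAMPLE`; where the okteto log file lives on your machine is not something this thread establishes, so check your own installation.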