The Flask ecosystem is atrocious. Coming from a Spring / Java Enterprise background, I had not fully appreciated how many batteries Spring includes.

I had a web server running under gunicorn with a single geventwebsocket.gunicorn.workers.GeventWebSocketWorker worker:

gunicorn -k geventwebsocket.gunicorn.workers.GeventWebSocketWorker -w 1 --threads 100 -b 0.0.0.0:5000 'app:create_app()'

This was not keeping up with both the UI and API requests, so I bumped up the worker count:

gunicorn -k geventwebsocket.gunicorn.workers.GeventWebSocketWorker -w 5 --threads 100 -b 0.0.0.0:5000 'app:create_app()'

This resulted in WebSocket connections constantly thrashing, because there was no guarantee that the worker that handled the initial handshake would also get that client's subsequent requests.
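A toy simulation makes the failure mode concrete. This is only a sketch: the Worker class, the simulate function, and the strict rotation across workers are all assumptions made for illustration. In reality whichever gunicorn worker accepts the connection first gets the request. The point it shows is that each worker keeps its own session state, so a follow-up request landing on a different worker finds nothing.

```python
import itertools


class Worker:
    """Toy stand-in for one gunicorn worker process with private memory."""

    def __init__(self, name):
        self.name = name
        self.sessions = set()  # session IDs known only to this process

    def handshake(self, sid):
        self.sessions.add(sid)

    def handle(self, sid):
        # A follow-up request only succeeds on the worker that did the handshake.
        return sid in self.sessions


def simulate(worker_count, requests_per_client=10):
    """Count follow-up requests that land on a worker without the session."""
    workers = [Worker(f"w{i}") for i in range(worker_count)]
    # Assumption: strict rotation stands in for "whichever worker accepts next".
    rotation = itertools.cycle(workers)
    next(rotation).handshake("client-1")
    return sum(
        not next(rotation).handle("client-1")
        for _ in range(requests_per_client)
    )


if __name__ == "__main__":
    print("failures with 1 worker: ", simulate(1))
    print("failures with 5 workers:", simulate(5))
```

With one worker every follow-up request finds its session; with five, most of them do not.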

From https://flask-socketio.readthedocs.io/en/latest/deployment.html :

Due to the limited load balancing algorithm used by gunicorn, it is not possible to use more than one worker process when using this web server. For that reason, all the examples above include the -w 1 option.
The workaround to use multiple worker processes with gunicorn is to launch several single-worker instances and put them behind a more capable load balancer such as nginx.

Gross. Okay.

Script to stand up multiple app servers:

gunicorn -k geventwebsocket.gunicorn.workers.GeventWebSocketWorker -w 1 --threads 100 -b 0.0.0.0:5000 'app:create_app()' &
gunicorn -k geventwebsocket.gunicorn.workers.GeventWebSocketWorker -w 1 --threads 100 -b 0.0.0.0:5001 'app:create_app()' &
gunicorn -k geventwebsocket.gunicorn.workers.GeventWebSocketWorker -w 1 --threads 100 -b 0.0.0.0:5002 'app:create_app()' &
gunicorn -k geventwebsocket.gunicorn.workers.GeventWebSocketWorker -w 1 --threads 100 -b 0.0.0.0:5003 'app:create_app()' &
gunicorn -k geventwebsocket.gunicorn.workers.GeventWebSocketWorker -w 1 --threads 100 -b 0.0.0.0:5004 'app:create_app()'
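Since the five lines above differ only in the port, the script could also be generated rather than copy-pasted. The following is a minimal sketch under my own assumptions: the helper names gunicorn_command and launch_script are hypothetical, and unlike the script above it backgrounds every instance and ends with a shell `wait`.

```python
import shlex

WORKER_CLASS = "geventwebsocket.gunicorn.workers.GeventWebSocketWorker"


def gunicorn_command(port, app="app:create_app()"):
    """Build the argv for one single-worker gunicorn instance on the given port."""
    return [
        "gunicorn",
        "-k", WORKER_CLASS,
        "-w", "1",
        "--threads", "100",
        "-b", f"0.0.0.0:{port}",
        app,
    ]


def launch_script(first_port=5000, count=5):
    """Return shell text that starts `count` instances on consecutive ports."""
    lines = [
        shlex.join(gunicorn_command(port)) + " &"  # shlex.join quotes the app spec
        for port in range(first_port, first_port + count)
    ]
    lines.append("wait")  # assumption: keep the script alive until all instances exit
    return "\n".join(lines)


if __name__ == "__main__":
    print(launch_script())
```

Generating the lines this way avoids quoting slips, because shlex.join handles the parentheses in the app spec.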

Nginx has to use sticky sessions when it load balances across these app instances; otherwise WebSockets will not function correctly. This is achieved by using ip_hash in the upstream block, which replaces the default round robin with a choice based on the client's IP address:

http {

    upstream gunicorns {
        ip_hash;
        server 127.0.0.1:5000;
        server 127.0.0.1:5001;
        server 127.0.0.1:5002;
        server 127.0.0.1:5003;
        server 127.0.0.1:5004;
    }

    server {
        listen 443 ssl default_server;
        root   /usr/share/nginx/html;
        ssl_certificate /path/to/crt;
        ssl_certificate_key /path/to/key;


        location / {
            client_max_body_size 200M;

            auth_basic           "Application Login";
            auth_basic_user_file /etc/nginx/htpasswd;

            proxy_set_header     X-Forwarded-For $proxy_add_x_forwarded_for;
            proxy_set_header     Host $host;

            proxy_pass           http://gunicorns;
        }

        location /socket.io {
            proxy_http_version   1.1;
            proxy_set_header     Upgrade $http_upgrade;
            proxy_set_header     Connection "upgrade";
            proxy_pass           http://gunicorns;
        }
    }
}
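To illustrate what ip_hash buys here, this is a rough sketch of IP-based stickiness. It is not nginx's actual hash function: SHA-256 and the sticky_upstream helper are my own stand-ins. What it does borrow from the nginx documentation is the choice of key, namely the first three octets of an IPv4 address or the whole IPv6 address.

```python
import hashlib
import ipaddress

UPSTREAM_PORTS = [5000, 5001, 5002, 5003, 5004]  # the five gunicorn instances


def sticky_upstream(client_ip, ports=UPSTREAM_PORTS):
    """Pick the same upstream port every time for a given client address."""
    addr = ipaddress.ip_address(client_ip)
    # Key on the first three octets for IPv4, the full address for IPv6.
    key = addr.packed[:3] if addr.version == 4 else addr.packed
    # Assumption: SHA-256 stands in for nginx's internal hash.
    digest = hashlib.sha256(key).digest()
    return ports[int.from_bytes(digest[:4], "big") % len(ports)]


if __name__ == "__main__":
    for ip in ("203.0.113.7", "203.0.113.200", "198.51.100.1"):
        print(ip, "->", sticky_upstream(ip))
```

One consequence of keying on the client address: everyone behind the same NAT or corporate proxy ends up on the same instance, so load may not spread evenly.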

But wait, there's more!

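With more than one app instance, Flask-SocketIO also has to be configured with a message queue so the instances can coordinate, for example when one instance emits an event to a client connected to another: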

socketio.init_app(app, cors_allowed_origins="*", message_queue='redis://')
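Why the queue matters can be shown with a toy model. Nothing below is Flask-SocketIO's API: ToyQueue and ToyServer are hypothetical in-memory stand-ins for Redis pub/sub and for one app instance. Without a shared queue an emit only reaches clients connected to the emitting instance; with one, every instance relays it to its own clients.

```python
class ToyQueue:
    """In-memory stand-in for a shared pub/sub channel such as Redis."""

    def __init__(self):
        self.subscribers = []

    def subscribe(self, callback):
        self.subscribers.append(callback)

    def publish(self, event, data):
        for callback in self.subscribers:
            callback(event, data)


class ToyServer:
    """Stand-in for one app instance holding its own client connections."""

    def __init__(self, name, queue=None):
        self.name = name
        self.clients = {}  # client id -> list of (event, data) received
        self.queue = queue
        if queue is not None:
            queue.subscribe(self._deliver)

    def connect(self, client_id):
        self.clients[client_id] = []

    def _deliver(self, event, data):
        # Hand the event to every client connected to this instance.
        for inbox in self.clients.values():
            inbox.append((event, data))

    def emit(self, event, data):
        if self.queue is None:
            self._deliver(event, data)  # only local clients hear it
        else:
            self.queue.publish(event, data)  # every subscribed instance relays it


if __name__ == "__main__":
    queue = ToyQueue()
    a, b = ToyServer("a", queue), ToyServer("b", queue)
    a.connect("alice")
    b.connect("bob")
    a.emit("news", "hi")
    print(b.clients["bob"])
```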