[Gluster-users] glusterfs client waiting on SYN_SENT to connect...

Liam Slusser lslusser at gmail.com
Tue Dec 14 20:58:40 UTC 2010


Just wanted to update you all.  It turns out the problem is my Juniper
firewall - sort of.  I've created a service on our Juniper that
describes "Gluster" and allows its TCP sessions to never time out.
The problem comes when a server crashes and the TCP connection isn't
"cleaned up".  It looks like the gluster client always starts with
the same outbound (source) TCP port, and in our firewall that
source/dest port combination is already in use (it never times out,
right?), so the firewall doesn't allow the session to be created
again and the connection is blocked.
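
To make the failure mode concrete, here is a toy model of a stateful firewall whose session entries never expire. It is only a sketch: the SessionTable class and its methods are made up for illustration, and dropping a SYN for an already-tracked 4-tuple is my reading of the behaviour described above, not documented Juniper behaviour.

```python
from typing import Set, Tuple

# (source IP, source port, destination IP, destination port)
Session = Tuple[str, int, str, int]


class SessionTable:
    """Toy stateful firewall: one entry per TCP 4-tuple, entries never expire."""

    def __init__(self) -> None:
        self.sessions: Set[Session] = set()

    def allow_syn(self, session: Session) -> bool:
        # Assumption: a SYN for a 4-tuple that is already tracked is dropped.
        if session in self.sessions:
            return False
        self.sessions.add(session)
        return True

    def clear(self, session: Session) -> None:
        # Stands in for an administrator manually closing the session.
        self.sessions.discard(session)


fw = SessionTable()
brick = ("10.10.10.101", 996, "10.20.10.102", 6996)

first = fw.allow_syn(brick)        # initial connection is admitted
# The server crashes here: no FIN/RST crosses the firewall, so the entry stays.
retry = fw.allow_syn(brick)        # same source port again: dropped
other = fw.allow_syn(("10.10.10.101", 58757, "10.20.10.102", 6996))  # new port
fw.clear(brick)
after_clear = fw.allow_syn(brick)  # admitted once the stale entry is gone

print(first, retry, other, after_clear)  # True False True True
```

A connection from a fresh source port gets through while the reused one stays stuck, which matches what I see with telnet versus glusterfs.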

So right now, if I run netstat -pan, I see:

tcp        0      1 10.10.10.101:996             10.20.10.102:6996            SYN_SENT    23491/glusterfs
tcp        0      1 10.10.10.101:997             10.20.10.102:6996            SYN_SENT    23491/glusterfs
tcp        0      1 10.10.10.101:1000            10.20.10.102:6996            SYN_SENT    23491/glusterfs
tcp        0      0 10.10.10.101:1001            10.20.10.102:6996            ESTABLISHED 23491/glusterfs
tcp        0      0 10.10.10.101:999             10.20.10.101:6996            ESTABLISHED 23491/glusterfs
tcp        0      1 10.10.10.101:998             10.20.10.101:6996            SYN_SENT    23491/glusterfs
tcp        0      1 10.10.10.101:1003            10.20.10.101:6996            SYN_SENT    23491/glusterfs
tcp        0      1 10.10.10.101:1002            10.20.10.101:6996            SYN_SENT    23491/glusterfs

Now if I kill the gluster process and restart it, notice that the
source ports don't change:

tcp        0      1 10.10.10.101:996             10.20.10.102:6996            SYN_SENT    23687/glusterfs
tcp        0      1 10.10.10.101:997             10.20.10.102:6996            SYN_SENT    23687/glusterfs
tcp        0      1 10.10.10.101:1000            10.20.10.102:6996            SYN_SENT    23687/glusterfs
tcp        0      0 10.10.10.101:1001            10.20.10.102:6996            ESTABLISHED 23687/glusterfs
tcp        0      0 10.10.10.101:999             10.20.10.101:6996            ESTABLISHED 23687/glusterfs
tcp        0      1 10.10.10.101:998             10.20.10.101:6996            SYN_SENT    23687/glusterfs
tcp        0      1 10.10.10.101:1003            10.20.10.101:6996            SYN_SENT    23687/glusterfs
tcp        0      1 10.10.10.101:1002            10.20.10.101:6996            SYN_SENT    23687/glusterfs

If I kill and restart it a few times, I can get lucky and get a
different source port, but you can see I'm still missing a few
bricks:

tcp        0      0 10.10.10.101:994             10.20.10.102:6996            ESTABLISHED 23745/glusterfs
tcp        0      0 10.10.10.101:995             10.20.10.102:6996            ESTABLISHED 23745/glusterfs
tcp        0      0 10.10.10.101:998             10.20.10.102:6996            ESTABLISHED 23745/glusterfs
tcp        0      1 10.10.10.101:1000            10.20.10.102:6996            SYN_SENT    23745/glusterfs
tcp        0      0 10.10.10.101:997             10.20.10.101:6996            ESTABLISHED 23745/glusterfs
tcp        0      0 10.10.10.101:996             10.20.10.101:6996            ESTABLISHED 23745/glusterfs
tcp        0      1 10.10.10.101:1003            10.20.10.101:6996            SYN_SENT    23745/glusterfs
tcp        0      1 10.10.10.101:1002            10.20.10.101:6996            SYN_SENT    23745/glusterfs
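
For anyone who wants to compare runs like the ones above without eyeballing them, here is a small sketch that groups a program's source ports by TCP state from pasted `netstat -pan` output. The function name ports_by_state is made up for this example, and the regular expression assumes the Linux net-tools column layout shown in this message.

```python
import re
from collections import defaultdict
from typing import Dict, List

# \s+ also matches newlines, so entries wrapped across two lines still parse.
ENTRY = re.compile(
    r"tcp\s+\d+\s+\d+\s+"
    r"(?P<src>\S+):(?P<sport>\d+)\s+"
    r"(?P<dst>\S+):(?P<dport>\d+)\s+"
    r"(?P<state>[A-Z_]+\d?)\s+"
    r"(?P<pid>\d+)/(?P<prog>\S+)"
)


def ports_by_state(netstat_output: str, program: str = "glusterfs") -> Dict[str, List[int]]:
    """Group the local (source) ports used by `program` by TCP state."""
    result: Dict[str, List[int]] = defaultdict(list)
    for match in ENTRY.finditer(netstat_output):
        if match.group("prog") == program:
            result[match.group("state")].append(int(match.group("sport")))
    return dict(result)


sample = """\
tcp        0      0 10.10.10.101:994             10.20.10.102:6996
       ESTABLISHED 23745/glusterfs
tcp        0      1 10.10.10.101:1000            10.20.10.102:6996
       SYN_SENT    23745/glusterfs
tcp        0      1 10.10.10.101:1003            10.20.10.101:6996
       SYN_SENT    23745/glusterfs
"""

summary = ports_by_state(sample)
print(summary)  # {'ESTABLISHED': [994], 'SYN_SENT': [1000, 1003]}
```

The ports listed under SYN_SENT are the ones whose sessions are presumably still held open on the firewall.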

Telnet always works because it picks a random source port:

$ telnet 10.20.10.102 6996
Trying 10.20.10.102...
Connected to glusterserver (10.20.10.102).
Escape character is '^]'.

$ netstat -pan|grep telne
tcp        0      0 10.10.10.101:58757           10.20.10.102:6996            ESTABLISHED 23622/telnet

Why doesn't gluster use a more random source port?  I'm going to
have to dig through the Juniper docs to see if I can manually close an
active session (let's hope), which should fix my immediate problem, but
it doesn't really fix the long-term problem.
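
One possible explanation for the repeating ports, sketched below: if a client binds each connection to the highest free port below some ceiling and scans downward, a restarted client will land on the same ports every time. That glusterfs chooses its ports this way is an assumption on my part, not something confirmed in this thread. The bind_below helper and the CEILING value are made up for the demo, and the ceiling is set in the unprivileged range so it runs without root.

```python
import errno
import socket
from typing import Tuple


def bind_below(ceiling: int, host: str = "127.0.0.1") -> Tuple[socket.socket, int]:
    """Bind to the highest free port below `ceiling`, scanning downward."""
    for port in range(ceiling - 1, 0, -1):
        sock = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
        try:
            sock.bind((host, port))
            return sock, port
        except OSError as exc:
            sock.close()
            # Skip ports that are taken; any other error is unexpected.
            if exc.errno != errno.EADDRINUSE:
                raise
    raise OSError(errno.EADDRINUSE, "no free port below ceiling")


CEILING = 50000  # hypothetical value; a real privileged-port scan would start below 1024

a, port_a = bind_below(CEILING)   # first "brick" connection
b, port_b = bind_below(CEILING)   # second one gets the next port down
a.close()
b.close()

c, port_c = bind_below(CEILING)   # "restarted client" picks the same port again
c.close()

print(port_a, port_b, port_c)
```

Binding to port 0 instead lets the kernel pick an ephemeral port, which is what telnet effectively does and why its source port differs on each run.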

Thoughts?

thanks,
liam

On Fri, Dec 3, 2010 at 6:51 PM, Liam Slusser <lslusser at gmail.com> wrote:
> Ah, the two different IPs are because I was changing my IPs for this mailing
> list and I guess I forgot that one.  :)  Will try adding a static route.
> Also going to snoop traffic and see if the gluster client is actually
> getting to the server or being blocked by the firewall.  I'll let you all know
> what I find.
>
> Thanks for the ideas.
>
> Liam
>
> On Dec 3, 2010 6:32 PM, <mki-glusterfs at mozone.net> wrote:
>> On Fri, Dec 03, 2010 at 04:25:18PM -0800, Liam Slusser wrote:
>>> [root at client~]# netstat -pan|grep glus
>>> tcp 0 1 10.8.10.107:1000 10.8.11.102:6996 SYN_SENT 3385/glusterfs
>>>
>>> from the gluster client log:
>>>
>>> However, the port is obviously open...
>>>
>>> [root at client~]# telnet 10.8.11.102 6996
>>> Trying 10.2.56.102...
>>> Connected to glusterserverb (10.8.11.102).
>>> Escape character is '^]'.
>>> ^]
>>> telnet> close
>>> Connection closed.
>>
>> Looking further... why is your telnet trying 10.2.56.102 when you
>> clearly specified 10.8.11.102? Also, what happens if you add a
>> specific route for the 10.8.11.0/24 block through the appropriate gw
>> without relying on the default gw to route for you? That way
>> you don't end up in a situation where the client is mistakenly
>> trying to go over the wrong interface. The telnet may be switching
>> to an alternate interface to see if it gets through?
>>
>> Mohan
>>
>> _______________________________________________
>> Gluster-users mailing list
>> Gluster-users at gluster.org
>> http://gluster.org/cgi-bin/mailman/listinfo/gluster-users
>


