[ut2004] Amd64 dedicated server locking up on vehicle maps

Clint Goudie-Nice clint at magicalspirits.net
Fri Oct 1 10:52:17 EDT 2004


I think the worst part about it is that it requires user intervention to
kick over the process...

Here's a quick kludge I cooked up last night that might help others.

I've noticed that when the server goes into this hang, none of the
sockets respond anymore, so I wrote a script that monitors the IRC
socket, and when it's hung twice in 60 seconds, it gets grumbly and
kills the process. The script I have running my UT server loops and
restarts the server after it's killed. I've only been running it for
half a day now, but I watched it hang once and get restarted without my
intervention.
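
A restart loop of the kind described above can be sketched roughly as follows. This is not the actual wrapper running on this server: the function name run_with_restarts, the RESTART_DELAY variable, and the launch limit are made up for illustration, and the real launch line is left as a placeholder.

```shell
#!/bin/bash
# Sketch of a restart wrapper: rerun a command every time it exits.
# run_with_restarts, RESTART_DELAY and the launch limit are illustrative names,
# not part of UT2004 or of the setup described in this post.

run_with_restarts() {
  local max_launches=$1   # stop after this many launches; 0 means never stop
  shift
  local launches=0
  local status=0
  while true
  do
    "$@"                  # run the server in the foreground until it exits or is killed
    status=$?
    launches=$((launches + 1))
    echo "server exited with status $status (launch $launches)"
    if [ "$max_launches" -ne 0 ] && [ "$launches" -ge "$max_launches" ]
    then
      return "$status"
    fi
    sleep "${RESTART_DELAY:-5}"
  done
}

# Example, with your own launch line in place of the placeholder:
# run_with_restarts 0 ./ucc-bin-linux-amd64 server <map and options>
```

With a limit of 0 the loop never ends, so a killall from the monitor simply causes the next launch.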

Use these goodies at your own risk ;)

I run the UnGateway from elmuerte http://ungateway.elmuerte.com/ and
have the IRC connection turned on... 

----begin monitor.sh----
#! /bin/bash
while true
do
  cat irc.txt | nc -w 10 localhost 6667 |grep QUIT
  if [ $? -ne 0 ]
  then
    echo "Unable to connect to server. Will try again in 30 seconds"
    sleep 30
    cat irc.txt | nc -w 10 localhost 6667 |grep QUIT
    if [ $? -ne 0 ]
    then
      echo "Server still not responding. Killing it and checking again in 60 seconds"
      killall -9 ucc-bin
      killall -9 ucc-bin-linux-amd64
      sleep 60
      cat irc.txt | nc -w 10 localhost 6667 |grep QUIT
      if [ $? -ne 0 ]
      then
        echo "Mailing Admin, cos it's really crashed bad"
        echo "Unable to recover crashed ut2004" | /bin/mail PageMeHere@wherever.com
      fi
    fi
  fi
  sleep 30
done
----end monitor.sh----
----begin irc.txt----
NICK Monitor
QUIT
----end irc.txt----

Btw, UnGateway doesn't actually know the QUIT command, so it spits out an
"UNKNOWN COMMAND: QUIT" which grep matches, giving a return code of 0.
Otherwise, if the socket is dead, nc times out, grep sees no output, and
the pipeline returns grep's exit code of 1.
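
The same check can be wrapped in a function so the exit-status logic is explicit. A minimal sketch, assuming the same nc options as monitor.sh; probe_irc is a name made up here, and the two lines it sends are the contents of irc.txt.

```shell
#!/bin/bash
# probe_irc HOST PORT
# Sends the same two lines as irc.txt and succeeds only if some reply line
# contains QUIT. The exit status of a pipeline is that of its last command,
# so a dead socket (nc prints nothing) makes grep, and the function, return 1.
probe_irc() {
  printf 'NICK Monitor\nQUIT\n' | nc -w 10 "$1" "$2" | grep -q QUIT
}

# Usage:
# if ! probe_irc localhost 6667; then echo "no reply from gateway"; fi
```

Using grep -q keeps the matched line out of the terminal; only the exit status is used.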

This probably wouldn't be hard to adapt for the webadmin either. I had
some issue getting netcat to drop the text to it though...
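
For the webadmin, one option is to swap netcat for curl, which speaks HTTP itself. This is a rough sketch and has not been tested against a UT2004 server: the URL and port are placeholders (use whatever port your webadmin listens on), and probe_webadmin is a made-up name. Any HTTP response counts as alive, even a login challenge.

```shell
#!/bin/bash
# probe_webadmin URL
# Succeeds if the web server sends back any HTTP response within 10 seconds.
# Without -f, curl exits 0 even on a 401 or 404; it exits non-zero when, for
# example, it cannot connect or the request times out.
probe_webadmin() {
  curl -s -m 10 -o /dev/null "$1"
}

# Usage (placeholder URL):
# probe_webadmin http://localhost:8080/ || echo "webadmin not responding"
```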

Clint

-----Original Message-----
From: Kingsley Foreman [mailto:kingsley at internode.com.au] 
Sent: Thursday, September 30, 2004 11:27 PM
To: ut2004 at icculus.org
Subject: Re: [ut2004] Amd64 dedicated server locking up on vehicle maps


I've been getting this with ONS for ages.

Reported it many times.
Still no response about it from anyone.
Had to go back to 32-bit.


----- Original Message ----- 
From: "Clint Goudie-Nice" <clint at magicalspirits.net>
To: <ut2004 at icculus.org>
Sent: Friday, October 01, 2004 2:16 PM
Subject: [ut2004] Amd64 dedicated server locking up on vehicle maps


> This seems to be a recurrent theme on our server. When people play 
> vehicle-based maps, generally Vehicle Invasion, within 2 or 3 maps 
> the server hangs with 100% CPU utilization. The only corrective action
> is to go to the server console and press ctrl-c, at which point I get the 
> "app requested exit" message, and then I have to press ctrl-c again... 
> Then I get this dump...
> 
> Developer Backtrace:
> [ 1]  ./ucc-bin-linux-amd64 [0x9c53bd]
> [ 2]  ./ucc-bin-linux-amd64 [0x9c563c]
> [ 3]  /lib64/tls/libpthread.so.0 [0x36fbf0c4a0]
> [ 4]  ./ucc-bin-linux-amd64 [0xbafa21]
> [ 5]  ./ucc-bin-linux-amd64 [0x8bb49a]
> [ 6]  ./ucc-bin-linux-amd64 [0x9725e2]
> [ 7]  ./ucc-bin-linux-amd64 [0x8bbb91]
> [ 8]  ./ucc-bin-linux-amd64 [0x9282e7]
> [ 9]  ./ucc-bin-linux-amd64 [0x9248a6]
> [10]  ./ucc-bin-linux-amd64 [0x916821]
> [11]  ./ucc-bin-linux-amd64 [0x5b6009]
> [12]  ./ucc-bin-linux-amd64 [0x56b786]
> [13]  ./ucc-bin-linux-amd64 [0x53498a]
> [14]  ./ucc-bin-linux-amd64(atan+0x26bf) [0x40630f]
> [15]  /lib64/tls/libc.so.6(__libc_start_main+0xf2) [0x36fb01c072]
> [16]  ./ucc-bin-linux-amd64(strcat+0xaa) [0x403e2a]
> Unreal Call Stack: AONSHoverCraft::UpdateVehicle <- ASVehicle::execUpdateVehicle <-
> UObject::ProcessEvent <- ASVehicle::preKarmaStep <- CallPreBodyStep <-
> ProcessPartitions <- KWorldStepSafeTime <- KTickLevelKarma <-
> TickAllActors <- ULevel::Tick <- TickLevel <- UGameEngine::Tick <-
> UpdateWorld <- UServerCommandlet::Main
> Exiting.
> FileManager: Reading 0 GByte 395 MByte 514 KByte 777 Bytes from HD took
> 9.104828 seconds (3.874906 reading, 5.229922 seeking).
> FileManager: 0.000000 seconds spent with misc. duties
> Name subsystem shut down
> Allocation checking disabled
> 
> From my memory all the hangs are generally in UpdateVehicle, although 
> it seems like other vehicles end up with the same issue. Anyone else 
> having this issue? Any idea how to work around it?
> 
> Thanks!
> 
> Clint
> 
>



