  Optimum1 Hey Macleod, Get Offa My Ewe Premium join:2001-08-22 Minneapolis, MN clubs: | Is Foldy1 stuck?
Looks like the client is hung? |
|
  bbarrera Premium,MVM join:2000-10-23 Sacramento, CA clubs: | that's weird. Working on it. |
|
  bbarrera Premium,MVM join:2000-10-23 Sacramento, CA clubs: | reply to Optimum1 Bizarre. This new FAH client has bugs, e.g. reporting issues. For some reason it got stuck and restarting once didn't help. Restarting again just now got it working. |
|
  parkut Crunch Addict Premium join:2001-12-15 Harrison Township, MI clubs: 
| Did you happen to notice if the system load reflected little to no CPU effort?
My monitoring scripts watch the average system load, and if it drops to zero for a period of time, they automatically restart the client.
-- Hello, my name is Bill and I'm a crunchaholic...
Proud to be the current host of Crunchenstein #1, #3, #5, and Foldy #3 |
|
  bbarrera Premium,MVM join:2000-10-23 Sacramento, CA clubs:
| parkut, I installed updates and rebooted. Then started FAH client from the terminal and walked away without verifying it got started.
This is the only time I can recall that restarting the client resulted in a stuck client. I won't bother adding logic to my monitoring scripts, as I have zero time for outside interests at the moment and am planning to migrate to Nagios to consolidate the various and sundry scripts I have for server monitoring and management.
BTW I saw your bug report thread on the other forum about Linux SMP client and incorrect % completion in unitinfo.txt, which hasn't affected my monitoring as I'm using qd-tools due to issues with inaccurate or ambiguous information in unitinfo.txt. |
|
  MstrBlstr Status - Tired Premium join:2005-03-15 Corpus Christi, TX
| reply to bbarrera
said by bbarrera: Bizarre. This new FAH client has bugs, e.g. reporting issues. For some reason it got stuck and restarting once didn't help. Restarting again just now got it working.
said by "kasson": The current A2 core versions are 2.01. They're substantially faster and contain some important bugfixes.
You need to manually delete your A2 core (version 2.0) to get the latest A2 core (version 2.01). |
|
  parkut Crunch Addict Premium join:2001-12-15 Harrison Township, MI clubs: 
| reply to Optimum1 bbarrera, it's great that your systems rarely hang. I've had it happen often enough over the years that it's no big deal for me to routinely put a load watchdog script on my systems.
For example, today a p2668 hung at 54%. My watchdog script detected that the system load had dropped to zero for over 15 minutes and restarted it.
10:01:01 up 38 days, 2:05, 0 users, load average: 0.00, 0.00, 0.00 |
|
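The load watchdog parkut describes (restart the client once the system load has sat at zero for over 15 minutes) could be sketched roughly as below. This is not parkut's actual script, which is not shown in the thread; it is a minimal Python sketch under stated assumptions. The function names, the 0.05 "near zero" threshold, the one-sample-per-minute polling, and the restart command are all hypothetical.

```python
import os
import subprocess
import time


def should_restart(load_samples, threshold=0.05, min_idle_samples=15):
    """Return True when the most recent `min_idle_samples` load readings
    are all at or below `threshold` (i.e. the box has been idle that long)."""
    if len(load_samples) < min_idle_samples:
        return False
    return all(s <= threshold for s in load_samples[-min_idle_samples:])


def watchdog(restart_cmd, interval_s=60, min_idle_samples=15, threshold=0.05):
    """Poll the 1-minute load average forever and run `restart_cmd`
    whenever the load has stayed near zero for the whole window.

    `restart_cmd` is an assumption: whatever command restarts the
    FAH client on your system, given as an argument list.
    """
    samples = []
    while True:
        one_min, _, _ = os.getloadavg()  # Unix only
        samples.append(one_min)
        samples = samples[-min_idle_samples:]  # keep only the window
        if should_restart(samples, threshold, min_idle_samples):
            subprocess.run(restart_cmd, check=False)
            samples.clear()  # give the client time to ramp up again
        time.sleep(interval_s)
```

With the defaults, 15 samples taken 60 seconds apart approximate the 15-minute window mentioned above. It would be started with something like `watchdog(["/path/to/restart-fah.sh"])`, where the path is a placeholder for your own restart script.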
  bbarrera Premium,MVM join:2000-10-23 Sacramento, CA clubs:
| Like I said, my next time investment is to move all server and app monitoring/alert/action to Nagios.
I'm currently running the SMP client on 16 Macs and 1 Linux box, and have been for several years. Except for this weird problem, the only downtime is:
- Stanford servers won't hand out new WUs
- Stanford servers hand out bad WUs
- kill the FAH client to increase computer performance (free up RAM needed for Photoshop and other apps)
- computer turned off
- forget to update FAH client after it expires
Therefore I'm surprised to hear you have hangs/lockups, although I seem to recall that you overclock and that is likely the issue.
So I could easily set up a watchdog script, but it hasn't been an issue and I'm migrating to Nagios, at which time I'll set up the appropriate actions (restart, email, text message, etc.) to be performed based on events. |
|