Pushing Boundaries?

Anybody who looks after a Domino server will know that the screengrab below from this site's log.nsf doesn't look good.

[Image: screengrab of this site's log.nsf]

Any entry with a start time and no end time tends to indicate a crash. At least it does in each of the cases above!
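
To make that rule of thumb concrete, here is a minimal Python sketch. It assumes the session entries have already been exported from log.nsf into plain records; the `start` and `end` field names and the sample values are hypothetical, not the real log.nsf layout.

```python
from typing import List, Optional, TypedDict


class Session(TypedDict):
    # Hypothetical field names; not the real log.nsf item names.
    start: str
    end: Optional[str]


def suspected_crashes(sessions: List[Session]) -> List[Session]:
    """Return sessions that have a start time but no end time."""
    return [s for s in sessions if s["start"] and not s["end"]]


# Made-up sample data, for illustration only.
sample: List[Session] = [
    {"start": "2009-10-16 06:02", "end": "2009-10-16 06:40"},
    {"start": "2009-10-16 06:41", "end": None},
    {"start": "2009-10-16 06:45", "end": ""},
]

crashed = suspected_crashes(sample)
```

A check like this would presumably also flag the session that is still running, so the newest entry needs a second look before being counted as a crash.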

As I feared yesterday, it's starting to look a lot like my latest round of changes to this site is killing the server. I just can't work out why. Nothing I've done is really pushing any boundaries. Surely?

The obvious culprit would be the major change, which is that I now build the entire comment thread using a Rich Text item and a Web Query Open agent each time a blog or article document is opened. There's nothing fancy going on there though and I'm thinking that's a red herring.

Right now I suspect it's more to do with creating new replies. The crashes seem to happen at random when you press the "Post It" button. When it does crash, the reply you posted doesn't save, yet something must run, because the browser receives the redirect back from the server, and that redirect is sent by the WQS agent (Print "Location: /bla"). Very odd.

While I look into it (and beg Prominic.NET for an 8.5.1 upgrade) it might be worth copying the text of the reply before you press the post button!

I should know not to try and push boundaries with Domino. If the only solution to this is a design rollback I'm going to be pretty peeved about it after I've put in so much effort to try and show that Domino can do anything other servers can. It feels a bit like one step forward and two steps back.

If it comes to it I might even ask Prominic to revert the server to 7.0.2 and/or move it to a Windows machine (it's on Linux now and Domino version is 8.5HF224).

Any ideas folks?

Comments

    • Ian
    • Fri 16 Oct 2009 07:10 AM

    Q: Do you have an agent that runs when someone leaves a comment? Do users receive a 500 error? Do you have any logging in the agent? I have experienced a similar situation in the past and we tracked it down to the agent log growing too large.

      • Jake Howlett
      • Fri 16 Oct 2009 07:17 AM

      Yes, there's a LotusScript WQS agent. Nothing special. Not sure what other people are seeing, but I just saw a "can't connect" page in the browser. The server seems to crash before it gets a chance to throw a 500.

      Agent doesn't use an Agentlog.

  1. Your server isn't on 8.0.2FP2 is it? If so, it might be helpful to know that I encountered a number of unexpected crashes which magically resolved when I reverted to FP1.

      • Jake Howlett
      • Fri 16 Oct 2009 07:23 AM

      Nope. It's on 8.5 (HF224 whatever that is). Think I skipped 8.0. Was on 7.0.2 before this.

      Hopefully I'll be on 8.5.1 soon.

  2. Sorry, scrap the above, I should have read your last sentence...

    • Erik Brooks
    • Fri 16 Oct 2009 08:13 AM

    Did you try the RTItem.Compact() call I suggested? That might be all you need.

      • Jake Howlett
      • Fri 16 Oct 2009 08:35 AM

      That's been in place since last night, so it didn't prevent the crashes this morning, although I'm not convinced the RT field is even to blame.

    • Erik Brooks
    • Fri 16 Oct 2009 08:15 AM

    Also, are you getting an NSD file? If so, search for "fatal" and post the call stack of the fatal thread so we can see it.

      • Jake Howlett
      • Fri 16 Oct 2009 08:36 AM

      I'm about to go looking for the NSD files. Not easy on a Linux box via the shell.
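
      Erik's suggestion to search the NSD for "fatal" can be sketched in a few lines of Python. This is only a rough sketch, assuming the NSD is a plain-text file: it collects each line containing "fatal" (case-insensitive) plus a few lines after it. The function names, the sample lines and the context size are all made up for illustration, and it makes no attempt to parse the call stack itself.

```python
from pathlib import Path
from typing import List


def find_fatal(lines: List[str], context: int = 20) -> List[List[str]]:
    """Return one chunk per line mentioning 'fatal'.

    Each chunk is the matching line plus up to `context` lines after it.
    """
    chunks = []
    for i, line in enumerate(lines):
        if "fatal" in line.lower():
            chunks.append(lines[i : i + context + 1])
    return chunks


def search_nsd(path: str, context: int = 20) -> List[List[str]]:
    """Read a text file (assumed to be an NSD) and search it for 'fatal'."""
    # errors="replace" so odd bytes in the dump don't stop the search.
    text = Path(path).read_text(errors="replace")
    return find_fatal(text.splitlines(), context)


# Made-up lines, not real NSD output.
sample = ["thread 1", "something Fatal here", "frame a", "frame b", "thread 2"]
chunks = find_fatal(sample, context=2)
```

      On the server itself, `search_nsd` would be pointed at whichever NSD file turns up; the path is deliberately not guessed here.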

  3. You can store NSDs in a Notes database (Lotus Notes/Domino Fault Reports, lndfr.nsf). Then it is as simple on Linux as on any other OS.

  4. If you still have problems with the crashes, you could add a lot of Prints to your code. Then, when you find a crash, dig through the log.nsf (if Prominic provides you with one), and see what the last part that ran was.

    From all the trouble you're having, I presume the agent doesn't crash the "normal" way, which results in errors going to the log (or that you don't have a log available).

    If you don't have a log.nsf available, OpenLog + LogEvent is another option.

  5. Most likely your crash has nothing to do with the WQO agents directly, as we have a CMS based completely around that, and we don't have any trouble pushing Domino in that direction (we have our own template engine that writes HTML directly to an rtitem on WQO).

    However, we have been having serious stability issues with the 8.5 HTTP stack, which was crashing the main servers several times a day without explanation. Upgrading to 8.5.1 seems to have resolved the situation for us.

      • Jake Howlett
      • Mon 19 Oct 2009 08:13 AM

      It's not happened since, so I'm starting to think it was "one of those things" and just a coincidence it was happening around the time I made the first changes to this site's design in years.

      Prominic have promised an 8.5.1 upgrade this week anyways.

About This Page

Written by Jake Howlett on Fri 16 Oct 2009

About This Website

CodeStore is all about web development. Concentrating on Lotus Domino, ASP.NET, Flex, SharePoint and all things internet.

Your host is Jake Howlett who runs his own web development company called Rockall Design and is always on the lookout for new and interesting work to do.

You can find me on Twitter and on Linked In.
