Domino's POP3 Server Breaks HTML Emails by Removing Characters
There's a bug when sending and retrieving HTML email from Domino that has been plaguing me for years. I've finally decided to take the time to investigate it fully and find a fix.
First, imagine the code below:
While i < 100
    j = 0
    html = ""
    While j < i
        html = html + "split string "
        j = j + 1
    Wend
    Set mail = New Email()
    mail.Subject = "Test " + CStr(i)
    mail.HTML = html
    mail.Send("jake howlett")
    i = i + 5
Wend
What this does is send me 20 test emails. Each email is longer than the one before it, and they all just repeat the words "split string" over and over.
The code to send the email is based on my Email class. All it does is create a multi-part MIME message. Nothing out of the ordinary. If you're using the MIME classes to send emails then this probably applies to you.
At some point in the loop the length of the message will get to such a size that something very worrying happens, as you can see below:
Notice the missing p?!
The Problem
From my investigations I've concluded the following:
If you use a 3rd party mail client (like Thunderbird) to download a multi-part HTML email from a mail file on a Domino server using POP3 then the POP3 server will remove the 655th character and put a line break in its place.
Looking at the very same email in the original mail file - using the Notes client - there's no missing character.
What I've also noticed is that it only removes the 655th character once, not every subsequent 655th character in the whole string.
Obviously this can be very, very bad. At its least worst, like in this example, it just looks like a typo. However, I've had numerous bug reports because it's broken the string inside an HTML tag, resulting in un-clickable links or, worse still, emails that just show raw HTML.
The problem seems only to occur when sending emails where the HTML content is made up of one very long string. I guess using very long strings is generally a bad idea in any case, but there's nothing actually wrong with it, and nothing half as wrong as the server removing characters.
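To check whether a given piece of HTML is at risk, it helps to look for any single line that reaches the 655th character. The following is a minimal Python sketch rather than LotusScript; the function name long_lines is hypothetical, and the 655 threshold is simply the value observed in the tests above, not a documented Domino limit.

```python
def long_lines(html, limit=655):
    """Return (line_number, length) for every line with at least `limit` characters."""
    return [
        (number, len(line))
        for number, line in enumerate(html.splitlines(), start=1)
        if len(line) >= limit
    ]


# The same kind of payload as the test emails: one phrase repeated on a single line.
one_long_line = "split string " * 60   # 780 characters, no line breaks
print(long_lines(one_long_line))       # [(1, 780)]
print(long_lines("<p>short</p>\n<p>also short</p>"))  # []
```

Anything this reports is a line long enough to lose a character under the behaviour described above.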
The Solution
What we need to do is avoid very long lines of HTML code in the email. The obvious solution is to add line breaks in the HTML as you build it in your code.
Unfortunately, in my case, I have way too many instances of code sending HTML email to make it practical to go and add new line breaks in to each email. Instead, what I did was put a fail-safe in to the Email class which sends the email. At the point it adds the HTML to the email it adds a new line at every point an HTML tag is closed, like so:
Call stream.WriteText(Replace(Me.str_HTMLPart, ">", ">"+Chr(10)))
There's still a slight chance that there could be a long string that avoids this rule, but it's unlikely enough for me to feel happy this has resolved the issue for now.
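For anyone sending from something other than LotusScript, the same fail-safe can be sketched in Python. This is an illustration of the Replace call above, not the Email class itself; break_after_tags and longest_line are hypothetical helper names. The second helper shows the residual risk: a long run of text with no tags in it still ends up on one line.

```python
def break_after_tags(html):
    """Insert a line feed after every '>' (a Python equivalent of the Replace call above)."""
    return html.replace(">", ">\n")


def longest_line(text):
    """Length of the longest line, to spot content the rule did not split."""
    return max((len(line) for line in text.splitlines()), default=0)


# Plenty of tags: every line ends up short.
many_tags = "<p>split string</p>" * 100
print(longest_line(break_after_tags(many_tags)))      # 16

# One long run of text inside a single tag: the rule cannot help here.
one_paragraph = "<p>" + "split string " * 60 + "</p>"
print(longest_line(break_after_tags(one_paragraph)))  # 784
```

Note that this breaks after every ">" character, including any that appear in text or attribute values rather than at the end of a tag.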
Summary
So, you've been warned. If you're sending HTML emails to a Domino server, make sure you break the HTML string up into reasonably short lines so that users who access their email via POP3 don't see broken HTML.
It all leaves me wondering why on earth the POP3 server would replace the 655th character in the first place. Assuming it does of course. My investigation wasn't exactly extensive but, from what I can tell, it definitely looks like it does. Why though? What's significant about 655? It's not like it's a power of 2 or anything.
Which leads me to the position I've kept for many years, namely that email is a text medium! There isn't a barge pole made long enough for me to touch HTML emailing in Notes and usually any client requests for such an abomination are prefaced with a long rant with that opinion!
Jake do you know if this is something that only happens with domino or can this sort of thing happen on any POP3 server?
Anything can happen on any server, in theory. I'd imagine this is a Domino-only bug, but then I just don't know.
My previous job was at a company that used Exchange for email and Notes for lots of web and some client apps. I noticed this occurred when HTML formatted emails were sent that contained long text strings, but never did the diagnostics that you have done. The solution as you stated is to add a line break regularly. Doesn't make any difference to the display of the HTML (as long as you don't break within tags).
Correct. As long as the new line isn't inside a tag you're ok.
At first I tried some code that would add a new line every 80 (or so) chars (but not removing a character of course) only to find that adding newlines inside a tag is bad, like you say.
I trust you've filed a bug report for this? I get the feeling IBM would be responsive to such a well-written explanation of the issue.
If I knew how to and/or thought it might be an easy process I probably would...
Slightly off topic but somewhat related...
I'm sending HTML email through Domino from a PHP app. I encountered an issue where the email did not send at all. Eventually it turned out that the Email class in PHP I was using used a '\n' as a newline character. By setting it to '\r\n' the problem went away.
In other words, HTML emails and new lines are very sensitive. I'm guessing it depends on the OS you are hosting Domino on which one to use.
As for why the 655th character? My uneducated guess is that this has something to do with the MIME encoding of emails, where long content is divided into blocks. It could even be so that Domino is inserting a new line itself and that the space you see is in fact a newline character. Just speculating here :)
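On the '\n' versus '\r\n' point above: SMTP uses CRLF as its line terminator, so normalising line endings before handing a message over is a reasonable precaution on any OS. The following is a minimal Python sketch; to_crlf is a hypothetical helper name, and whether bare '\n' was the exact reason Domino rejected the message in this case is an assumption.

```python
import re


def to_crlf(text):
    """Convert bare LF or bare CR line endings to CRLF, leaving existing CRLF pairs alone."""
    return re.sub(r"\r\n|\r|\n", "\r\n", text)


body = "<p>first line</p>\n<p>second line</p>\r\n<p>third line</p>"
print(repr(to_crlf(body)))
```

Matching "\r\n" first in the pattern is what stops an existing CRLF from being doubled up.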
The space you see is indeed a new line. Domino seems to remove the character ("p" in this case) and replace it with a space AND a new line (which is ignored by HTML rendering, so you just see a space).
Why not every 655th character though? If you send a line of HTML 2000 characters long it will split it in to two lines: one of 655 characters and a second line holding the remaining 1345 or so. You'd expect it to produce 4 lines, no? Very weird.
This reminds me of a similar problem that drove us nuts until we discovered the fix. This was about 10 years ago.
We were generating complex HTML code and placing it in a rich text field on a notes document. The problem presented itself quite similar to what you are seeing in the HTML email. That is, very long web pages would break partway down.
We generated a test quite similar to yours and found that when the length of the HTML approached 65,536 characters, a character was dropped and a new line was inserted. (I don't remember the exact number but it was not a power of 2 either.)
Our fix was similar to yours in that we wrote a library object (in Java) for sending the code to the rich text field which kept a count of the characters. When the character count neared 32,768 we would look for a '>' and insert a new line after it and reset the character count. That solved the problem and we never looked back. (That system created web sites that looked nothing like a generic Domino site.)
I think it's interesting that your problem occurs at 655 characters and ours occurred near 65,536.
Coincidence or conspiracy? You be the judge.
Peace,
Rob:-]
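The counting approach described in the comment above (count characters, and once the count nears the limit, break after the next '>' and reset) can be sketched in Python as follows. This is an illustration, not the original Java library; break_near_limit is a hypothetical name, and the threshold of 500 is an assumed safety margin below the 655 seen in the POP3 case.

```python
def break_near_limit(html, threshold=500):
    """Insert a line feed after the first '>' seen once the current line
    has reached `threshold` characters, then restart the count."""
    out = []
    count = 0
    for ch in html:
        out.append(ch)
        if ch == "\n":
            count = 0          # an existing line break also resets the count
            continue
        count += 1
        if ch == ">" and count >= threshold:
            out.append("\n")   # break only directly after a '>'
            count = 0
    return "".join(out)


html = "<p>split string</p>" * 100          # 1900 characters on one line
wrapped = break_near_limit(html)
print([len(line) for line in wrapped.splitlines()])   # [513, 513, 513, 361]
```

Compared with breaking after every '>', this adds far fewer line breaks, but it shares the same weakness: a run of text longer than the limit with no '>' in it stays on one line.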
Rob,
If I'm not mistaken, what you describe is the Domino paragraph size limit of 64K. Your rich text field can be a few GB big, but one paragraph inside it can never be more than 64K. Indeed inserting new lines is the solution.
I wonder if it is some built in Notes compliance with RFC 821. Many years ago I had trouble sending MIME/HTML messages due to strict adherence to the 1000 character limit per line in the above RFC. I added code to break up the html with New Lines and everything worked fine.
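MIME also has a standard way of staying under that per-line limit without touching the HTML itself: the quoted-printable transfer encoding, which wraps long lines using soft line breaks that the receiving client removes again. The following is a minimal Python sketch using the standard quopri module. It only demonstrates the encoding; whether sending the HTML part as quoted-printable avoids the Domino POP3 behaviour described in this post is untested and is an assumption.

```python
import quopri

# One long line of HTML, in the style of the test emails.
html = "<p>" + "split string " * 200 + "</p>"     # 2607 characters, no line breaks
raw = html.encode("utf-8")

encoded = quopri.encodestring(raw)                # bytes, wrapped with soft line breaks
widest = max(len(line) for line in encoded.splitlines())
print(widest)                                     # width of the longest encoded line

# Decoding removes the soft line breaks and restores the original bytes.
assert quopri.decodestring(encoded) == raw
```

A part encoded this way has to be sent with a Content-Transfer-Encoding: quoted-printable header so that the client knows to decode it.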
That might explain it. Although I don't think the RFC says anything about *removing* random characters ;-)
Just my two cents on the issue. I would need to take some time to test it out and/or read some RFCs.
Is the POP3 client set up to download just the headers first and then go and get the full content later?
*maybe* your client is retrieving the first 655 characters as part of the header and inserting a new line at 655. I believe there is a "TOP" command in the POP3 RFCs that allows specifying the number of lines to get along with the headers. Then when it gets the rest, Domino is just sending the 656+ and the client pieces it together.
I'm not saying it is not a Domino issue, just thinking about why that character.
Interesting idea, but, having thought about it, I don't think it's the case.
I sent the same emails to two different addresses - one Domino and one not. I then downloaded the emails in to two different accounts in Thunderbird - one connecting to Domino and one not. Only the Domino account showed the issue and neither account had the option to fetch headers enabled.
Jake
Tip: use IMAP, it works better ;-D
Came across the same issue in a LAMP setup and your solution worked perfectly.
I was sending HTML emails using the PHP mail function. So the issue is with POP3 clients and it's not specific to Domino.
Thanks for the solution.