Some necessary compromises (important read)

In preparing my program for remigrating posts from the old forum, I've come across several situations for which a clean translation will not be possible:
1. Font size changes within [code] blocks. After trying to accommodate these, I have found that nearly all of them were probably unintentional, leading to text that looks really weird. So I will be disabling this in the translation.
2. URLs not enclosed in [url] blocks. A lot of forumistas -- you know who you are! -- have posted URLs without turning them into actual links. They're just text strings that have to be copied and pasted to get to their intended destinations. Many of these URLs refer to the (now) old forum. My inclination at this point is not to try to find and translate these URLs. The problem is that, without the surrounding link tags that are used to parse the old forum's raw HTML, they could be anywhere in the text, and trying to find all of them would slow the translation process to a crawl. Properly formatted links will, of course, be translated to point to their correct destinations in the new forum.
3. Malformed HTML. I've finally come up with a solution for improperly nested (e.g. <b><i>text</b></i>) and unterminated (e.g. <i>text) HTML tags, which dotNetBB tolerated, but which vBulletin does not. But there are instances of malformed HTML resulting from too many links in a post and other reasons that will be impossible to decipher and correct. I'm afraid that these will have to be left to their original posters to fix up.
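To illustrate the kind of repair item 3 describes, here is a minimal sketch in Python of one way to fix improperly nested and unterminated tags with a stack. It is not the actual migration program. The function name repair_tags, the tag set INLINE_TAGS, and the choice to handle only attribute-free tags are all assumptions made for the example.

```python
import re

# Hypothetical set of tags to balance; the real program's list is not known.
INLINE_TAGS = {"b", "i", "u"}
# Matches simple tags without attributes, e.g. <b> or </i>.
TAG_RE = re.compile(r"<(/?)([a-zA-Z]+)>")

def repair_tags(html: str) -> str:
    """Re-nest overlapping tags and close any tags left open."""
    out = []     # pieces of the repaired text
    stack = []   # currently open tags, innermost last
    pos = 0
    for m in TAG_RE.finditer(html):
        out.append(html[pos:m.start()])
        pos = m.end()
        closing = m.group(1) == "/"
        name = m.group(2).lower()
        if name not in INLINE_TAGS:
            out.append(m.group(0))      # pass other tags through untouched
        elif not closing:
            stack.append(name)
            out.append(f"<{name}>")
        elif name in stack:
            # Close everything opened after `name`, close `name` itself,
            # then reopen the tags that had to be closed early.
            reopen = []
            while stack[-1] != name:
                inner = stack.pop()
                out.append(f"</{inner}>")
                reopen.append(inner)
            stack.pop()
            out.append(f"</{name}>")
            for inner in reversed(reopen):
                stack.append(inner)
                out.append(f"<{inner}>")
        # A closing tag with no matching opener is simply dropped.
    out.append(html[pos:])
    while stack:                         # terminate anything still open
        out.append(f"</{stack.pop()}>")
    return "".join(out)

print(repair_tags("<b><i>text</b></i>"))  # <b><i>text</i></b><i></i>
print(repair_tags("<i>text"))             # <i>text</i>
```

The empty <i></i> pair left behind in the first case is harmless and could be removed in a later pass. A sketch like this only covers simple overlaps; it does nothing for the undecipherable cases mentioned above.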
Questions and comments invited...
-Phil
Comments
Go for it! I think you're at the point where the better is the enemy of the good (meaning you could always make it better, but the amount of effort is not worth the minimal gain).
John R.
-Phil
If the posts were broken in the original they can be broken in the result. It's probably impossible to guess all the possible ways the input can be broken.
4. Font face changes within [code] blocks. After struggling with Word-generated code blocks that include Times New Roman (why?), I decided that there's no advantage to accommodating font changes there. There just isn't any way to keep it from looking crappy.
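As a rough illustration of dropping font changes inside code blocks, here is a minimal Python sketch. It assumes, purely for the example, that code blocks are delimited by [code]...[/code] and that font changes appear as <font ...> tags; the old forum's actual raw markup may differ, and strip_fonts_in_code is a hypothetical name.

```python
import re

# Assumed markup: [code]...[/code] blocks containing <font ...> tags.
CODE_RE = re.compile(r"\[code\](.*?)\[/code\]", re.DOTALL | re.IGNORECASE)
FONT_RE = re.compile(r"</?font\b[^>]*>", re.IGNORECASE)

def strip_fonts_in_code(post: str) -> str:
    """Remove font tags inside code blocks; leave the rest of the post alone."""
    def clean(match: re.Match) -> str:
        # Keep the block's text, drop only the opening and closing font tags.
        return "[code]" + FONT_RE.sub("", match.group(1)) + "[/code]"
    return CODE_RE.sub(clean, post)

sample = '<font size="4">hi</font> [code]<font face="Times New Roman">x = 1</font>[/code]'
print(strip_fonts_in_code(sample))
# <font size="4">hi</font> [code]x = 1[/code]
```

Font tags outside code blocks are left untouched, so only the size and face changes inside code are discarded.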
-Phil