Thread: .NET to convert HTML to text




.NET to convert HTML to text
user name
2006-06-08 02:04:45
I have an approach that does this, but it doesn't work very well with
more complex documents. Some people have very complex formatting on
outgoing emails.

Erick 

-----Original Message-----
From: Discussion of building .NET applications targeted for the Web
[mailto:DOTNET-WEB@DISCUSS.DEVELOP.COM] On Behalf Of Efran Cobisi
Sent: Wednesday, June 07, 2006 1:49 AM
To: DOTNET-WEB@DISCUSS.DEVELOP.COM
Subject: Re: [DOTNET-WEB] .NET to convert HTML to text

You could use regular expressions to strip out HTML tags from the
source; use them even if you want to preserve basic formatting, like BR
to CR/LF conversion.

To catch every HTML tag you could use <[^>]*?/?>
To catch just BR (and subsequently replace it with \n) you could use
<br[^>]*?/?>
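
For reference, here is a minimal C# sketch of the two patterns above. The
class and method names (HtmlTagStripper, ToPlainText) are hypothetical and
not part of any library. Making the BR pattern case-insensitive and emitting
"\r\n" for each BR are assumptions based on the CR/LF remark. It only strips
tags: entities such as &amp; are left as they are, and it is likely to have
the same trouble with complex documents that is mentioned elsewhere in this
thread.

```csharp
using System;
using System.Text.RegularExpressions;

// Hypothetical helper illustrating the regex approach from this message.
public static class HtmlTagStripper
{
    // <br>, <br/>, <br /> and so on. IgnoreCase is an assumption so that <BR> matches too.
    // Note: this pattern also matches any other tag whose name starts with "br".
    private static readonly Regex BrTag =
        new Regex(@"<br[^>]*?/?>", RegexOptions.IgnoreCase);

    // Any remaining tag, opening or closing.
    private static readonly Regex AnyTag = new Regex(@"<[^>]*?/?>");

    public static string ToPlainText(string html)
    {
        // Convert line breaks first, otherwise the generic pattern would remove them.
        string text = BrTag.Replace(html, "\r\n");

        // Then drop every other tag, keeping the text between tags.
        return AnyTag.Replace(text, string.Empty);
    }
}
```

Called as HtmlTagStripper.ToPlainText("Hello<br/>World <b>bold</b>"), this
should give "Hello", a CR/LF, then "World bold".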

HTH
--
Efran Cobisi, cobisi.com
Microsoft Certified Professional

Erick Thompson wrote:
> I need to convert HTML-formatted emails that are sent in to a plain text
> version. There are a ton of utilities out there that convert from HTML
> to text, but none of them appear to be .NET and suitable for use in an
> ASP.NET application. Does anyone have any suggestions for a way to
> convert HTML to text in an ASP.NET app?
>
> Thanks,
> Erick
>

===================================
This list is hosted by DevelopMentor  http://www.develop.com

View archives and manage your subscription(s) at
http://discuss.develop.com


