BlogEngine.NET version 1.3 ships with a syntax highlighting extension, but it is still in beta, so I looked around for another syntax highlighter, since this blog uses code snippets heavily. After looking for a while I found this syntax highlighter extension. It uses Wilco Bauwer's syntax highlighter library, which is impressive. The main problem with this extension is that it does not handle HTML tags and special HTML characters such as &nbsp;, &lt; and &gt; very well. Tags and special characters are left behind as garbage after the extension tries to highlight the code. You see something like this:

private voidTest&nbsp{



After inspecting the SyntaxHighlightingExtension.cs file, I saw that the extension matches the source code with a regular expression and feeds Wilco's highlighter the raw HTML (body). This is an incomplete implementation, and it causes the side effect I mentioned above: the HTML tags and special characters have to be cleaned out of the raw HTML (body) before it reaches the highlighter. So I changed the Highlight method of the extension. The resulting method looks something like this:

001    private string Highlight(HighlightOptions options)
002    {
003        string parsed;
004        uint id = NextCodeID();
005        string name = options.Language;
007        HighlighterBase highlighter = GetHighlighter(name);
008        if (highlighter != null)
009        {
010            name = highlighter.FullName;
011            highlighter.Parser = htmlParser;
013            string body = Regex.Replace(options.Code, @"<\s*br\s*/\s*>", "\r\n", RegexOptions.CultureInvariant | RegexOptions.IgnoreCase | RegexOptions.Singleline);
014            body = Regex.Replace(body, @"<(?![!/]?[>\s])[^>]*>", String.Empty, RegexOptions.CultureInvariant | RegexOptions.IgnoreCase | RegexOptions.Singleline);
015            body = HttpUtility.HtmlDecode(body);
017            try
018            {
019                parsed = highlighter.Parse(body);
020                //parsed = HttpUtility.HtmlDecode(parsed);
021            }
022            catch
023            {
024                name += " (not highlighted)";
025                parsed = options.Code;
026            }
027            finally
028            {
029                highlighter.ForceReset();
030            }
031        }
032        else
033        {
034            name += " (not highlighted)";
035            parsed = options.Code;
036        }
038        if (options.DisplayLineNumbers)
039        {
040            string[] lines = parsed.Split(new char[] { '\n' });
041            StringBuilder outputBuffer = new StringBuilder();
043            for (int i = 0; i < lines.Length; i++)
044            {
045                outputBuffer.AppendFormat(linenumberingTemplate, i+1, lines[i]);
046            }
048            return string.Format(OutputTemplate, id, name, options.Title, outputBuffer, options.InitialStyle);
049        }
051        return string.Format(OutputTemplate, id, name, options.Title, parsed, options.InitialStyle);
052    }

The change I made sits between lines 12 and 21. I replace the <br /> tags with line breaks, strip out the remaining HTML tags with a regular expression, and then call the HttpUtility.HtmlDecode method to decode the special HTML characters, so that the parser is fed normalized body text. And voilà, the extension started performing well.
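To see the three normalization steps in isolation, here is a minimal sketch that pulls them out of Highlight into a standalone helper. The class name CodeBodyNormalizer and the method name Normalize are made up for illustration and are not part of the extension. The sketch also accepts a plain <br> without the slash, which is an assumption that goes beyond the regular expression used in the post.

```csharp
using System;
using System.Text.RegularExpressions;
using System.Web;

// Hypothetical helper for illustration only; the extension does this inline in Highlight.
public static class CodeBodyNormalizer
{
    public static string Normalize(string rawHtml)
    {
        // 1. Turn <br /> (and, as an assumption beyond the post, plain <br>) into line breaks.
        string body = Regex.Replace(rawHtml, @"<\s*br\s*/?\s*>", "\r\n", RegexOptions.IgnoreCase);

        // 2. Strip the remaining tags. The negative lookahead skips a "<" that is
        //    followed by whitespace or ">", so an expression like "a < b" survives.
        body = Regex.Replace(body, @"<(?![!/]?[>\s])[^>]*>", string.Empty);

        // 3. Decode entities such as &lt; and &amp;. Note that &nbsp; decodes to
        //    U+00A0 (a non-breaking space), not to an ordinary space.
        return HttpUtility.HtmlDecode(body);
    }
}
```

For example, passing "<span>if (a &lt; b &amp;&amp; c &gt; d)</span><br />return;" to Normalize should give back the text if (a < b && c > d), then a line break, then return;. The order matters: if the entities were decoded first, the decoded < and > characters could be mistaken for tags and stripped.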

By the way, I want to remind you that if you are using the default editor (TinyMCE), you should first paste your source code into a plain text editor like Notepad++, and then copy it from Notepad++ and paste it into TinyMCE. I think TinyMCE should consider adding something like the Paste as Plain Text functionality that FCKEditor has.

Posted in: BlogEngine.NET


April 6. 2008 15:24
Thank you very much for the article! I was worried that I was doing something wrong, but I see now that it was not my mistake ;)

September 4. 2008 11:25
Andrey Kuzmenko
Thank you. It works very well!

October 17. 2008 14:35
Deepankar

It doesn't seem to work with Live Writer either. I can't get the code to display the way you show it above. Please explain how to write the code block in a blog entry so that it comes out the way the code does here.

Thanks in advance.


October 18. 2008 13:52

It is simple: just enclose your code block inside [code] [/code] markers and you are done. For example:

[code=csharp;ln=on;Sample highlighter]
public interface ISample
{
  int Value { get; }
}
[/code]

A formatted [code] marker must follow this format: [code=csharp|xml;ln=on|off;Description text]
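To make the three parts of the formatted marker concrete, here is a rough sketch of how such a marker could be picked apart with a regular expression. The pattern and the CodeMarkerParser class are assumptions written for this illustration; the extension's own regular expression may well differ, and this sketch only handles the formatted form, not a bare [code] marker.

```csharp
using System;
using System.Text.RegularExpressions;

// Hypothetical parser for illustration; not the extension's actual implementation.
public static class CodeMarkerParser
{
    // [code=<language>;ln=<on|off>;<description>] ...body... [/code]
    private static readonly Regex Marker = new Regex(
        @"\[code=(?<lang>[^;\]]+);ln=(?<ln>on|off);(?<title>[^\]]*)\](?<body>.*?)\[/code\]",
        RegexOptions.IgnoreCase | RegexOptions.Singleline);

    public static bool TryParse(string text, out string language, out bool lineNumbers,
                                out string title, out string body)
    {
        Match m = Marker.Match(text);

        // When nothing matches, the groups are empty strings and lineNumbers is false.
        language = m.Groups["lang"].Value;
        lineNumbers = m.Success &&
                      m.Groups["ln"].Value.Equals("on", StringComparison.OrdinalIgnoreCase);
        title = m.Groups["title"].Value;
        body = m.Groups["body"].Value;
        return m.Success;
    }
}
```

With the sample marker above, the language would come out as csharp, line numbers as on, and the description as Sample highlighter, with everything between the closing bracket and [/code] treated as the code body.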

October 18. 2008 14:11
Deepankar Raizada
When I use the above method I get the code like this, please see


October 19. 2008 17:44
Please copy and paste the following and check if this works. It seems like your problem is some formatting issue.
public static string StripHTML ( string value )
{
   // Strip the html tags
   string pattern = "<(.|\n)+?>";

   string strOutput = string.Empty;

   Regex regex = new Regex ( pattern, RegexOptions.IgnoreCase );

   // Replace all HTML tag matches with an empty string
   strOutput = regex.Replace ( value, string.Empty );

   // Replace all < and > with &lt; and &gt;
   strOutput = strOutput.Replace ( "<", "&lt;" );
   strOutput = strOutput.Replace ( ">", "&gt;" );

   return strOutput;
}


February 6. 2009 23:14
John -
Works like a charm.  Thanks!

February 21. 2009 00:28
When I try to use this extension (before AND after making the fixes to the C# code), the code only displays on one line when I paste it into the TinyMCE editor. When I use the plain text editor it's even worse. I tried pasting what was recommended above and no joy, check it out here:

February 26. 2009 14:47
ufc 96 live
thanks, this is great

March 28. 2009 18:53
Trackback from code0724

Syntax Highlighting Extension Test

May 11. 2009 13:33
тексты песен
Cool work..thanks for sharing....
