22 August, 2012

.Net and HTML 4.01


It's not that easy to make .Net produce valid HTML 4.01.

One of the problems is that even if you write <br> in your web form, .Net will "correct" it to <br /> before the HTML stream is sent to the web server. (The same goes for every HTML tag that has no closing tag.)

Another problem is that .Net adds hidden fields for view state whose IDs (__VIEWSTATE and __EVENTVALIDATION) do not conform to the HTML 4.01 standard, which requires an ID to begin with a letter.
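To make the ID rule concrete, here is a small sketch of a check against the HTML 4.01 definition of an ID token (a letter, followed by letters, digits, hyphens, underscores, colons or periods). The class name Html401Id and the method IsValid are made up for this illustration; they are not part of .Net.

```csharp
using System.Text.RegularExpressions;

// Hypothetical helper, for illustration only.
public static class Html401Id
{
    // HTML 4.01 ID token: starts with a letter, then any number of
    // letters, digits, "-", "_", ":" or ".".
    private static readonly Regex Token =
        new Regex(@"^[A-Za-z][A-Za-z0-9_:.\-]*\z");

    public static bool IsValid(string id)
    {
        return id != null && Token.IsMatch(id);
    }
}
```

With this check, Html401Id.IsValid("__VIEWSTATE") is false because of the leading underscore, while Html401Id.IsValid("VIEWSTATE") is true.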

This was a terrible issue for me, since I have a customer that requires HTML 4.01 that passes the W3C validator.

After some massive googling I found an article with a suggestion on how to manipulate the html-stream before it reaches the webserver. The article is written by Rick Strahl and can be found here: Capturing and Transforming ASP.NET Output with Response.Filter

This is done by creating a response filter.
The code for the filter can be found here
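
To give an idea of what such a filter does, here is a heavily simplified sketch. It is not Rick Strahl's ResponseFilterStream; the class name BufferingTransformStream and its constructor are my own invention for this illustration. It buffers everything written to it and applies one string transform when the stream is closed, assuming ASP.NET closes the filter at the end of the request and that no partial flushes need to reach the client earlier.

```csharp
using System;
using System.IO;
using System.Text;

// Hypothetical, simplified response filter (illustration only).
public class BufferingTransformStream : Stream
{
    private readonly Stream _inner;                    // the original Response.Filter
    private readonly Encoding _encoding;               // encoding of the response
    private readonly Func<string, string> _transform;  // applied to the complete output
    private readonly MemoryStream _buffer = new MemoryStream();
    private bool _closed;

    public BufferingTransformStream(Stream inner, Encoding encoding,
                                    Func<string, string> transform)
    {
        _inner = inner;
        _encoding = encoding;
        _transform = transform;
    }

    // Collect the output instead of passing it straight through.
    public override void Write(byte[] buffer, int offset, int count)
    {
        _buffer.Write(buffer, offset, count);
    }

    // Assumption: nothing is forwarded until Close, so Flush does nothing.
    public override void Flush() { }

    // Transform the complete output once and hand it to the original filter.
    public override void Close()
    {
        if (!_closed)
        {
            _closed = true;
            string html = _encoding.GetString(_buffer.ToArray());
            byte[] bytes = _encoding.GetBytes(_transform(html));
            _inner.Write(bytes, 0, bytes.Length);
            _inner.Flush();
            _inner.Close();
        }
        base.Close();
    }

    // The filter is write-only; everything else is unsupported.
    public override bool CanRead { get { return false; } }
    public override bool CanSeek { get { return false; } }
    public override bool CanWrite { get { return true; } }
    public override long Length { get { throw new NotSupportedException(); } }
    public override long Position
    {
        get { throw new NotSupportedException(); }
        set { throw new NotSupportedException(); }
    }
    public override int Read(byte[] buffer, int offset, int count)
    {
        throw new NotSupportedException();
    }
    public override long Seek(long offset, SeekOrigin origin)
    {
        throw new NotSupportedException();
    }
    public override void SetLength(long value)
    {
        throw new NotSupportedException();
    }
}
```

Buffering the whole page matters because a tag can be split across two Write calls, so replacing text chunk by chunk could miss matches. The sketch would be hooked up with something like Response.Filter = new BufferingTransformStream(Response.Filter, Response.ContentEncoding, html => html.Replace("/>", ">"));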

I implemented it in the code behind of my master page like this:


protected override void OnInit(System.EventArgs e)
{
    base.OnInit(e);
    //To remove warning about Byte Order Mark
    Response.ContentEncoding = new System.Text.UTF8Encoding(false);

    ResponseFilterStream filter = new ResponseFilterStream(Response.Filter);
    filter.TransformString += filter_TransformString;
    Response.Filter = filter; 
}

string filter_TransformString(string output)
{
    return MakeHTML4_01(output);
}

/// <summary>
/// Tries to adjust output to conform to HTML4.01 standard
/// </summary>
/// <param name="output">The rendered HTML to adjust</param>
/// <returns>The output adjusted to conform to the HTML4.01 standard</returns>
private string MakeHTML4_01(string output)
{
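    // Rename the hidden-field IDs so that they start with a letter
    output = output.Replace("id=\"__VIEWSTATE\"", "id=\"VIEWSTATE\"");
    output = output.Replace("id=\"__EVENTVALIDATION\"", "id=\"EVENTVALIDATION\"");
    // Turn XHTML-style "/>" back into ">" (this hits every "/>" in the output, not only tags)
    return output.Replace("/>", ">");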
}


(Of course, it might be faster to use a regexp to do the replacements, but I wanted the example to be clean.)
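
For comparison, a regexp version could look roughly like the sketch below. The class name Html401Regex and the method MakeHtml401 are hypothetical names for this illustration. Whether this is actually faster than plain string.Replace would have to be measured; what it does buy is handling both hidden fields with one pattern and dropping the space in front of "/>" as well, so <br /> becomes <br> instead of <br >.

```csharp
using System.Text.RegularExpressions;

// Hypothetical regexp-based variant of the replacements (illustration only).
public static class Html401Regex
{
    // Matches id="__VIEWSTATE" and id="__EVENTVALIDATION"; group 1 is the name
    // without the leading underscores.
    private static readonly Regex HiddenFieldId =
        new Regex("id=\"__(VIEWSTATE|EVENTVALIDATION)\"", RegexOptions.Compiled);

    // Matches "/>" together with any whitespace directly in front of it.
    private static readonly Regex SelfClosing =
        new Regex(@"\s*/>", RegexOptions.Compiled);

    public static string MakeHtml401(string output)
    {
        output = HiddenFieldId.Replace(output, "id=\"$1\"");
        return SelfClosing.Replace(output, ">");
    }
}
```

Like the string.Replace version, this still rewrites every "/>" in the output, including any that appear inside scripts or text.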