This article introduces two HttpClient variants for Windows 10 (UWP) apps that read a web page.
The first HttpClient turns off the internal cache. This prevents the same cached page from coming back after a query parameter in the web address has changed.
The second HttpClient handles the case in which a web server redirects the requested web address.
1) HttpClient with cache off
Problem: the same result string is always returned, even though the query parameters in the URL change.
Solution: switch off the web cache for the HttpClient.
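For this you have to use the namespace Windows.Web.Http. The protocol filter used below lives in Windows.Web.Http.Filters.
using Windows.Web.Http;          //*http cache
using Windows.Web.Http.Filters;  //*HttpBaseProtocolFilter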
C# code example
Calling an HttpClient with the cache disabled
First create a protocol filter (HttpBaseProtocolFilter).
Then set the filter's CacheControl.ReadBehavior to NoCache and pass the filter to the HttpClient constructor.
public static async Task<HtmlDocument> Web_Get_HtmlDocument(string sURL)
{
    //------------< fx_read_Page() >------------
    //* get the HTML Document of a website-URL
    try
    {
        //-< init >-
        //< HttpClient >
        //*noCache: always read from the server, never from the local cache
        HttpBaseProtocolFilter filter = new HttpBaseProtocolFilter();
        filter.CacheControl.ReadBehavior = HttpCacheReadBehavior.NoCache;
        filter.CacheControl.WriteBehavior = HttpCacheWriteBehavior.Default;

        HttpClient client = new HttpClient(filter);

        //httpClient
        string sHTML = "";
        //Client Request as string
        try
        {
            sHTML = await client.GetStringAsync(new Uri(sURL));
        }
        catch (Exception ex)
        {
            //clsSys.show_Message(ex.Message);
            clsSys.fx_Log("Error httpClient: " + ex.Message);
            return null;
        }
        //</ HttpClient >
        //-</ init >-

        //*log only the first characters; guard against pages shorter than 10 characters
        clsSys.fx_Log("read HTML=" + sHTML.Substring(0, Math.Min(10, sHTML.Length)) + "..");

        //< get HTMLdocument >
        //*create and load to local HtmlDocument
        HtmlDocument doc = new HtmlDocument();
        doc.LoadHtml(sHTML);
        //</ get HTMLdocument >

        //< output >
        return doc;
        //</ output >
    }
    catch (Exception ex)
    {
        clsSys.fx_Log("ERROR get HtmlDocument URL=" + sURL + " Msg:" + ex.Message);
        return null;
    }
    //------------</ fx_read_Page() >------------
}
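The cache problem described above shows up when several requests differ only in a query parameter. The following is a minimal sketch of how one cache-free client could be shared across such requests. The class name NoCacheFetcher, its method names, and the "page" query parameter are assumptions made for illustration and are not part of the original code.

```csharp
using System;
using System.Threading.Tasks;
using Windows.Web.Http;
using Windows.Web.Http.Filters;

// Hypothetical helper, not part of the original article.
public static class NoCacheFetcher
{
    // One shared client; its filter makes every read bypass the local HTTP cache.
    private static readonly HttpClient client = CreateClient();

    private static HttpClient CreateClient()
    {
        HttpBaseProtocolFilter filter = new HttpBaseProtocolFilter();
        filter.CacheControl.ReadBehavior = HttpCacheReadBehavior.NoCache;
        return new HttpClient(filter);
    }

    // Builds the request address; "page" is a placeholder parameter name.
    public static Uri BuildPageUri(string baseUrl, int page)
    {
        return new Uri(baseUrl + "?page=" + page);
    }

    // Reads the page for the given parameter value as a string.
    public static async Task<string> ReadPageAsync(string baseUrl, int page)
    {
        return await client.GetStringAsync(BuildPageUri(baseUrl, page));
    }
}
```

Calling ReadPageAsync with page 1 and then page 2 sends two separate requests to the server, because the filter does not let the client answer the second request from its cache.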
2) HttpClient with automatic redirection
If a web server redirects a web page internally, the HttpClient usually receives a 404 error or something similar instead of the page.
To make the HttpClient follow the redirect as a browser does, create the client from System.Net.Http with an HttpClientHandler whose AllowAutoRedirect property is set to true.
Namespace for the HttpClient
using System.Net.Http; //*HttpClientHandler
Example code for creating an HttpClient with redirection enabled
public static async Task<HtmlDocument> Web_Get_HtmlDocument(string sURL)
{
    //------------< fx_read_Page() >------------
    //* get the HTML Document of a website-URL
    try
    {
        //-< init >-
        //< HttpClient >
        //*follow redirects issued by the web server
        HttpClientHandler handler = new HttpClientHandler();
        handler.AllowAutoRedirect = true;
        HttpClient httpClient = new HttpClient(handler);

        //httpClient
        string sHTML = "";
        //Client Request as string
        try
        {
            sHTML = await httpClient.GetStringAsync(sURL);
        }
        catch (Exception ex)
        {
            //clsSys.show_Message(ex.Message);
            clsSys.fx_Log("Error httpClient: " + ex.Message);
            return null;
        }
        //</ HttpClient >
        //-</ init >-

        //*log only the first characters; guard against pages shorter than 10 characters
        clsSys.fx_Log("read HTML=" + sHTML.Substring(0, Math.Min(10, sHTML.Length)) + "..");

        //< get HTMLdocument >
        //*create and load to local HtmlDocument
        HtmlDocument doc = new HtmlDocument();
        doc.LoadHtml(sHTML);
        //</ get HTMLdocument >

        //< output >
        return doc;
        //</ output >
    }
    catch (Exception ex)
    {
        clsSys.fx_Log("ERROR get HtmlDocument URL=" + sURL + " Msg:" + ex.Message);
        return null;
    }
    //------------</ fx_read_Page() >------------
}
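To check whether a redirect actually took place, it can help to look at the address that finally answered the request. The following is a rough sketch under assumptions: the class name RedirectProbe, its method names, and the redirect limit are made up for illustration. After automatic redirects, response.RequestMessage.RequestUri should hold the last address requested, but this is worth verifying on the target platform.

```csharp
using System;
using System.Net.Http;
using System.Threading.Tasks;

// Hypothetical helper, not part of the original article.
public static class RedirectProbe
{
    // Creates a handler that follows redirects, capped at maxRedirects hops.
    public static HttpClientHandler CreateHandler(int maxRedirects)
    {
        return new HttpClientHandler
        {
            AllowAutoRedirect = true,
            MaxAutomaticRedirections = maxRedirects
        };
    }

    // Returns the address that finally answered the request.
    public static async Task<Uri> GetFinalUriAsync(HttpClient client, string url)
    {
        using (HttpResponseMessage response = await client.GetAsync(url))
        {
            // Throws HttpRequestException for non-success status codes such as 404.
            response.EnsureSuccessStatusCode();
            return response.RequestMessage.RequestUri;
        }
    }
}
```

A client built with new HttpClient(RedirectProbe.CreateHandler(5)) can be passed to GetFinalUriAsync. If the returned address differs from the requested one, the server redirected the request.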
# UWP: Universal Windows Platform