Problem:
When reading a web page in a WPF application, opening the URL with an HttpClient fails with an error about a Unicode character that has no mapping.
At first, I suspected that the encoded characters [] (%5B%5D) in the URL were causing the HttpClient error.
The actual cause is that the HttpClient cannot handle some of the Unicode characters in the page. That is why I rewrote the download code to use a WebRequest instead.
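The suspicion about the encoded characters can be checked quickly: %5B and %5D are simply the percent-encoded forms of [ and ]. A minimal sketch, assuming a hypothetical helper class UrlEncodingCheck (not part of the original code), shows the decoding with Uri.UnescapeDataString:

```csharp
using System;

public static class UrlEncodingCheck
{
    // Decodes percent-encoded characters in a URL fragment.
    public static string Decode(string sEncoded)
    {
        return Uri.UnescapeDataString(sEncoded);
    }

    public static void Main()
    {
        // %5B and %5D are the percent-encoded forms of '[' and ']'
        Console.WriteLine(Decode("country%5B%5D=3")); // prints "country[]=3"
    }
}
```

The encoded brackets are ordinary URL syntax, which fits the observation that they were not the real cause.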
Error message:
No mapping for the Unicode character exists in the target multi-byte code page. (Exception from HRESULT: 0x80070459)
URL
https://www.nigelfrank.com/de/search?keywords=&country%5B%5D=3&type=contract&order=posted
Faulty download code
Windows.Web.Http.HttpClient client = new Windows.Web.Http.HttpClient(filter);
String sHTML = await client.GetStringAsync(new Uri(sURL));
Solution: WPF, C# (partially UWP, 2018)
Read as a webrequest
//--< download as webrequest >--
//< WebRequest and Response >
WebRequest objRequest = WebRequest.Create(sURL);
objRequest.CachePolicy = new HttpRequestCachePolicy(HttpRequestCacheLevel.NoCacheNoStore);
HttpWebResponse objResponse = (HttpWebResponse)objRequest.GetResponse();
//</ WebRequest and Response >

//< Stream and Reader >
Stream objDataStream = objResponse.GetResponseStream();
StreamReader TextReader = new StreamReader(objDataStream);
sHTML = TextReader.ReadToEnd();
//</ Stream and Reader >
//--</ download as webrequest >--
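One possible explanation for why the WebRequest route works (an assumption, not verified against this particular page): a StreamReader decodes the stream as UTF-8 by default and substitutes the replacement character U+FFFD for bytes it cannot decode, instead of throwing. A small sketch with a hypothetical class name and made-up input bytes:

```csharp
using System;
using System.IO;
using System.Text;

public static class LenientDecodeDemo
{
    // Reads the whole stream as text, the same way the download code does.
    public static string ReadAll(Stream objDataStream)
    {
        using (StreamReader objReader = new StreamReader(objDataStream, Encoding.UTF8))
        {
            return objReader.ReadToEnd();
        }
    }

    public static void Main()
    {
        // 0xFF is never a valid byte in UTF-8 (made-up test data)
        byte[] bytes = { (byte)'a', 0xFF, (byte)'b' };
        string sText = ReadAll(new MemoryStream(bytes));

        // The invalid byte is replaced, no exception is thrown
        Console.WriteLine(sText[1] == '\uFFFD'); // prints "True"
    }
}
```

If the replaced characters matter for the result, the bytes would have to be decoded with the encoding the server actually uses.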
Complete code, C#, WPF
Download as webrequest and not as string
// required namespaces at the top of the file
using System;
using System.IO;
using System.Net;
using System.Net.Cache;
using System.Threading.Tasks;
using HtmlAgilityPack;          // HtmlDocument.LoadHtml (assumed: HtmlAgilityPack package)
using Windows.Web.Http.Filters; // only needed for the HttpClient branch

public static async Task<HtmlDocument> Web_Get_HtmlDocument(string sURL)
{
    //------------< Web_Get_HtmlDocument() >------------
    //* get the HTML Document of a website-URL
    try
    {
        //----< Read HTML from Website >----
        //< read HTML >
        string sHTML = "";
        try
        {
            string sDownloadType = "webrequest";
            if (sDownloadType == "webrequest")
            {
                //--< download as webrequest >--
                //< WebRequest and Response >
                WebRequest objRequest = WebRequest.Create(sURL);
                objRequest.CachePolicy = new HttpRequestCachePolicy(HttpRequestCacheLevel.NoCacheNoStore);
                HttpWebResponse objResponse = (HttpWebResponse)objRequest.GetResponse();
                //</ WebRequest and Response >

                //< Stream and Reader >
                Stream objDataStream = objResponse.GetResponseStream();
                StreamReader TextReader = new StreamReader(objDataStream);
                sHTML = TextReader.ReadToEnd();
                //</ Stream and Reader >
                //--</ download as webrequest >--
            }
            else
            {
                //--< download as string >--
                //-< HttpClient >-
                HttpBaseProtocolFilter filter = new HttpBaseProtocolFilter();
                filter.CacheControl.ReadBehavior = HttpCacheReadBehavior.NoCache;
                filter.CacheControl.WriteBehavior = HttpCacheWriteBehavior.Default;
                filter.CookieUsageBehavior = HttpCookieUsageBehavior.Default;

                Windows.Web.Http.HttpClient client = new Windows.Web.Http.HttpClient(filter);
                //-</ HttpClient >-

                //Client Request as string
                sHTML = await client.GetStringAsync(new Uri(sURL));
                //--</ download as string >--
            }
        }
        catch (Exception ex)
        {
            //clsSys.show_Message(ex.Message);
            clsSys.fx_Log("Error httpClient: " + ex.Message);
            return null;
        }
        //</ read HTML >
        //----</ Read HTML from Website >----

        //* log the first characters; Math.Min avoids an exception on very short responses
        clsSys.fx_Log("read HTML=" + sHTML.Substring(0, Math.Min(10, sHTML.Length)) + "..");

        //< get HTMLdocument >
        //*create and load to local HtmlDocument
        HtmlDocument doc = new HtmlDocument();
        doc.LoadHtml(sHTML);
        //</ get HTMLdocument >

        //< output >
        return doc;
        //</ output >
    }
    catch (Exception ex)
    {
        clsSys.fx_Log("ERROR get HtmlDocument URL=" + sURL + " Msg:" + ex.Message);
        return null;
    }
    //------------</ Web_Get_HtmlDocument() >------------
}
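The method above is declared async, but the WebRequest branch blocks on GetResponse(). A hedged sketch of a non-blocking variant, using GetResponseAsync and ReadToEndAsync, could look like this. The class and method names are hypothetical, and the sketch leaves out the cache policy and the HtmlDocument parsing:

```csharp
using System;
using System.IO;
using System.Net;
using System.Threading.Tasks;

public static class AsyncDownloadSketch
{
    // Downloads the content behind sURL as a string without blocking the caller.
    public static async Task<string> DownloadHtmlAsync(string sURL)
    {
#pragma warning disable SYSLIB0014 // WebRequest is marked obsolete on newer .NET versions
        WebRequest objRequest = WebRequest.Create(sURL);
#pragma warning restore SYSLIB0014

        // using blocks close the response, stream and reader when done
        using (WebResponse objResponse = await objRequest.GetResponseAsync())
        using (Stream objDataStream = objResponse.GetResponseStream())
        using (StreamReader objReader = new StreamReader(objDataStream))
        {
            return await objReader.ReadToEndAsync();
        }
    }

    public static async Task Main(string[] args)
    {
        if (args.Length == 0)
        {
            Console.WriteLine("usage: pass a URL as the first argument");
            return;
        }
        string sHTML = await DownloadHtmlAsync(args[0]);
        Console.WriteLine("read " + sHTML.Length + " characters");
    }
}
```

In a WPF application this keeps the UI thread free while the page loads; the returned string can then be passed to LoadHtml as in the complete code.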