Removing HTML content from a web request using C#

January 15, 2014

I have the following C# code, which gets the contents of a web page and stores them in a string variable.

    // Requires: using System.IO; using System.Net;
    WebRequest request = WebRequest.Create("http://www.arsenal.com");
    WebResponse response = request.GetResponse();
    Stream data = response.GetResponseStream();
    string html = string.Empty;
    // Read the whole response body into a string.
    using (StreamReader sr = new StreamReader(data))
    {
        html = sr.ReadToEnd();
    }

The code works, but I need to store the content of the page without the HTML tags and the JavaScript. Is there a way to do this (a built-in method or something ready-made)?
I have found ways of removing the HTML tags, but the JavaScript and CSS styles still bother me. I should mention that my way of removing the HTML is not working well either; I'm using regular expressions to do so.

As this question suggests, parsing HTML is a tricky process, and the best approach is to use a library.

I've used the HTML Agility Pack before with success, though this question lists other options.
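As a rough illustration of the library approach, here is a minimal sketch that assumes the HtmlAgilityPack NuGet package is installed. The class name HtmlTextExtractor and the method ToPlainText are hypothetical names chosen for this example, and the whitespace cleanup at the end is an extra step that may not suit every page. The idea is to drop the script and style elements from the parsed tree first, then read the remaining text.

```csharp
using System.Linq;
using System.Text.RegularExpressions;
using HtmlAgilityPack; // assumption: the HtmlAgilityPack NuGet package is referenced

// Hypothetical helper, not part of the library.
public static class HtmlTextExtractor
{
    public static string ToPlainText(string html)
    {
        var doc = new HtmlDocument();
        doc.LoadHtml(html);

        // SelectNodes returns null when nothing matches, so guard before iterating.
        var unwanted = doc.DocumentNode.SelectNodes("//script|//style");
        if (unwanted != null)
        {
            // Copy to a list so removal cannot disturb the enumeration.
            foreach (var node in unwanted.ToList())
            {
                node.Remove();
            }
        }

        // InnerText concatenates the remaining text nodes; DeEntitize turns
        // entities such as &amp; back into plain characters.
        string text = HtmlEntity.DeEntitize(doc.DocumentNode.InnerText);

        // Collapse runs of whitespace left behind by the removed markup.
        return Regex.Replace(text, @"\s+", " ").Trim();
    }
}
```

With the code from the question, this could be called as string text = HtmlTextExtractor.ToPlainText(html); once the page has been read into html.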
