Printing HTML entities using lxml in Python


I'm trying to make a div element from the string below, which contains HTML entities. Because the & that starts each entity is a reserved character in HTML, it is being escaped as &amp; in the output, so the entities are displayed as plain text. How can I avoid this so that the HTML entities are rendered properly?

    from lxml import etree
    import lxml.html

    s = 'actress adamari l&#243;pez , amgen launch spanish-language chemotherapy: myths or facts&#8482; website , resources'

    div = etree.Element("div")  # note the capital E: etree.element does not exist
    div.text = s                # the text is stored literally, so "&" is escaped on output

    lxml.html.tostring(div)

Output:

    <div>actress adamari l&amp;#243;pez , amgen launch spanish-language chemotherapy: myths or facts&amp;#8482; website , resources</div>

You can parse the string with fromstring(), so that the entities are decoded, and specify the encoding when calling tostring():

    >>> from lxml.html import fromstring, tostring
    >>> s = 'actress adamari l&#243;pez , amgen launch spanish-language chemotherapy: myths or facts&#8482; website , resources'
    >>> div = fromstring(s)  # parsing decodes the entities into real characters
    >>> print(tostring(div, encoding='unicode'))
    <p>actress adamari lópez , amgen launch spanish-language chemotherapy: myths or facts™ website , resources</p>
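If the element has to be built by hand, as in the question, one option is to decode the entities before assigning the text. The following is only a minimal sketch, assuming Python 3 (for html.unescape from the standard library); the helper name div_from_entity_text is hypothetical and not part of lxml.

```python
import html  # Python 3 standard library

import lxml.html
from lxml import etree


def div_from_entity_text(text):
    """Build a <div> whose text is `text` with HTML entities decoded.

    Hypothetical helper for illustration; not part of lxml.
    """
    div = etree.Element("div")
    # Decode "&#243;" into "ó" (and so on) *before* assigning, so lxml
    # receives real characters and has no entity ampersand left to escape.
    div.text = html.unescape(text)
    # encoding="unicode" returns a str and keeps non-ASCII characters as-is.
    return lxml.html.tostring(div, encoding="unicode")


s = "actress adamari l&#243;pez , chemotherapy: myths or facts&#8482; website"
result = div_from_entity_text(s)
```

A genuine ampersand in the data (for example "a &amp; b") is still serialized as &amp;, which is what valid HTML requires.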

As a side note, you should use lxml.html.tostring() when dealing with HTML data. From the lxml documentation:

Note that you should use lxml.html.tostring and not lxml.tostring. lxml.tostring(doc) will return the XML representation of the document, which is not valid HTML. In particular, things like <script src="..."></script> will be serialized as <script src="..." />, which confuses browsers.
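The difference between the two serializers can be seen on an empty script element. This is a small sketch rather than anything from the original answer; it assumes the XML serializer in question is lxml.etree.tostring, and the file name app.js is made up for illustration.

```python
import lxml.html
from lxml import etree

# An empty <script> element; "app.js" is a placeholder file name.
script = etree.Element("script", src="app.js")

# XML serialization collapses the empty element into a self-closing tag.
as_xml = etree.tostring(script, encoding="unicode")

# HTML serialization keeps the explicit closing tag that browsers expect.
as_html = lxml.html.tostring(script, encoding="unicode")
```

Here as_xml is the self-closing form and as_html keeps the separate closing tag.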


