Python BeautifulSoup: how to write the output to an HTML file
I modify an HTML file by removing some of its tags using BeautifulSoup, and I want to write the results to an HTML file. My code:
from bs4 import BeautifulSoup
from bs4 import Comment

soup = BeautifulSoup(open('1.html'), "html.parser")

# Remove unwanted tags and HTML comments from the tree
[x.extract() for x in soup.find_all('script')]
[x.extract() for x in soup.find_all('style')]
[x.extract() for x in soup.find_all('meta')]
[x.extract() for x in soup.find_all('noscript')]
[x.extract() for x in soup.find_all(text=lambda text: isinstance(text, Comment))]

# Print each top-level node of the cleaned document
html = soup.contents
for i in html:
    print(i)

# Write the prettified document to a file as UTF-8 bytes
html = soup.prettify("utf-8")
with open("output1.html", "wb") as file:
    file.write(html)
But since I use soup.prettify, it generates HTML like this:
<p>
 <strong>
  batam.tribunnews.com, bintan
 </strong>
 - tradisi pedang pora mewarnai serah terima jabatan pejabat di
 <a href="http://batam.tribunnews.com/tag/polres/" title="polres">
  polres
 </a>
 <a href="http://batam.tribunnews.com/tag/bintan/" title="bintan">
  bintan
 </a>
 , senin (3/10/2016).
</p>
But I want to get the same result that print gives:
<p><strong>batam.tribunnews.com, bintan</strong> - tradisi pedang pora mewarnai serah terima jabatan pejabat di <a href="http://batam.tribunnews.com/tag/polres/" title="polres">polres</a> <a href="http://batam.tribunnews.com/tag/bintan/" title="bintan">bintan</a>, senin (3/10/2016).</p> <p>empat perwira baru senin itu diminta cepat bekerja. tumpukan pekerjaan rumah sudah menanti di meja masing masing.</p>
So how can I make the written file look the same as the output of print i, with each tag and its content on the same line? Thanks.
Just convert the soup instance to a string and write it:
with open("output1.html", "w") as file:
    file.write(str(soup))
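For reference, here is a minimal self-contained sketch of the whole flow, assuming Python 3 and the bs4 package. The sample markup, the clean_html helper, and the temporary output path are made up for illustration and are not part of the original question. It shows that str(soup) keeps each tag and its content together instead of putting every tag on its own line the way prettify() does.

```python
import os
import tempfile

from bs4 import BeautifulSoup, Comment

# Hypothetical sample input standing in for the contents of 1.html
SAMPLE_HTML = (
    '<p><strong>Hello</strong> - see <a href="http://example.com/">this</a>.</p>'
    '<script>var x = 1;</script><!-- a comment -->'
)


def clean_html(markup):
    """Strip script/style/meta/noscript tags and comments; return compact HTML."""
    soup = BeautifulSoup(markup, "html.parser")
    # find_all returns a list, so extracting while looping is safe
    for tag in soup.find_all(["script", "style", "meta", "noscript"]):
        tag.extract()
    for comment in soup.find_all(string=lambda s: isinstance(s, Comment)):
        comment.extract()
    # str(soup) serializes without the extra newlines and indentation of prettify()
    return str(soup)


cleaned = clean_html(SAMPLE_HTML)

# Write to a temporary directory here; in the question this would be output1.html
out_path = os.path.join(tempfile.mkdtemp(), "output1.html")
with open(out_path, "w", encoding="utf-8") as f:
    f.write(cleaned)
```

Passing encoding="utf-8" to open() is a precaution for pages with non-ASCII text. If bytes are preferred, soup.encode("utf-8") written in "wb" mode should also give the compact form.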