java - Parse a table from HTML using jsoup
I have a problem scraping HTML text. Here is a sample of what I'm trying to extract from:
<table class="scripture">
  <tbody>
    <tr>
      <td class="verse" valign="top">
        <a name="2:1"></a><a class="vers" href="javascript:getparallel('luk', 2, 1);" title="klik om grondtekst en sv te zien"> 1 </a>
      </td>
      <td class="content">
        <span class="main">en het geschiedde in die dagen dat er een gebod uitging van keizer augustus dat heel de wereld ingeschreven moest worden.</span>
      </td>
    </tr>
  </tbody>
</table>
<table class="scripture">
  <tbody>
    <tr>
      <td class="verse" valign="top">
        <a name="2:2"></a><a class="vers" href="javascript:getparallel('luk', 2, 2);" title="klik om grondtekst en sv te zien"> 2 </a>
      </td>
      <td class="content">
        <span class="main">deze eerste inschrijving vond plaats toen cyrenius on syrië stadhouder was.</span>
      </td>
    </tr>
  </tbody>
</table>
This is similar to the problem in this link. I want the verse text and the scripture content. How can I achieve this?
So far I've tried:
Element table = doc.select("table[class=scripture]").first();
Log.e("bb", "passage1: " + table.ownText());
But it doesn't display anything. Any help is appreciated. Thanks.
Assuming you want the content of the span in the table that contains verse 2:2, you can do it with:
String verse = "2:2";
// Select the span of class main located inside the table of class scripture
// that contains a td of class verse holding a link whose name attribute equals verse
Element p = doc.select(
    String.format("table.scripture:has(td.verse a[name=%s]) span.main", verse)
).first();
System.out.println(p.text());
Output:
deze eerste inschrijving vond plaats toen cyrenius on syrië stadhouder was.
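If you want every verse rather than a single one, a loop over the tables is probably simpler than building one selector per verse. Note that ownText() returns only the text held directly by an element, not the text of its children, which is why it comes back empty for a table whose text all sits in nested cells; text() includes descendants. The following is a minimal sketch, assuming jsoup is on the classpath and the page follows the layout of the sample. The class name ScriptureParser, the method parseVerses and the constant SAMPLE_HTML are made up for illustration, and the sample markup is trimmed (the href and title attributes are dropped).

```java
import org.jsoup.Jsoup;
import org.jsoup.nodes.Document;
import org.jsoup.nodes.Element;

import java.util.LinkedHashMap;
import java.util.Map;

class ScriptureParser {

    // Trimmed copy of the sample from the question (assumption: real pages share this layout).
    static final String SAMPLE_HTML =
        "<table class=\"scripture\"><tbody><tr>"
        + "<td class=\"verse\"><a name=\"2:1\"></a><a class=\"vers\"> 1 </a></td>"
        + "<td class=\"content\"><span class=\"main\">en het geschiedde in die dagen dat er een gebod "
        + "uitging van keizer augustus dat heel de wereld ingeschreven moest worden.</span></td>"
        + "</tr></tbody></table>"
        + "<table class=\"scripture\"><tbody><tr>"
        + "<td class=\"verse\"><a name=\"2:2\"></a><a class=\"vers\"> 2 </a></td>"
        + "<td class=\"content\"><span class=\"main\">deze eerste inschrijving vond plaats toen "
        + "cyrenius on syri\u00eb stadhouder was.</span></td>"
        + "</tr></tbody></table>";

    // Maps each verse reference (the anchor's name attribute, e.g. "2:1")
    // to the text of the matching span.main, in document order.
    static Map<String, String> parseVerses(String html) {
        Document doc = Jsoup.parse(html);
        Map<String, String> verses = new LinkedHashMap<>();
        for (Element table : doc.select("table.scripture")) {
            Element anchor = table.select("td.verse a[name]").first();
            Element content = table.select("td.content span.main").first();
            if (anchor == null || content == null) {
                continue; // skip tables that do not match the sample layout
            }
            verses.put(anchor.attr("name"), content.text());
        }
        return verses;
    }

    public static void main(String[] args) {
        // Prints one line per verse: the reference followed by the verse text.
        parseVerses(SAMPLE_HTML).forEach((ref, text) -> System.out.println(ref + " " + text));
    }
}
```

The null checks matter because first() returns null when nothing matches, which would otherwise cause a NullPointerException on pages that deviate from the sample.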