html - Load nested links via Java with jsoup


I am working on a crawler using jsoup. I want to display the links to the categories of the Asian e-shop https://world.taobao.com/. My code is able to find links on the page with:

    Elements links = doc.select("a[href]");
    System.out.println("total results: " + links.size());

But it does not find all of them. I need to show the links to the categories, which are nested in many <div> tags.


Here is my code:

    package jsoup;

    import java.io.IOException;
    import org.jsoup.Jsoup;
    import org.jsoup.nodes.Document;
    import org.jsoup.nodes.Element;
    import org.jsoup.select.Elements;

    public class Crawler {

        public static final String CLS_NAME = "Crawler";
        public static final String URL_SOURCE = "https://world.taobao.com/";

        public static void main(String[] args) throws IOException {

            // Load the document (only the HTML the server returns)
            Document doc = Jsoup.connect(URL_SOURCE).get();

            // Select every <a> tag that has an "href" attribute
            Elements links = doc.select("a[href]");
            System.out.println("total results: " + links.size());

            // Print the link text and the absolute URL of each link
            for (Element url : links) {
                System.out.println(String.format("* [%s] : %s ", url.text(), url.attr("abs:href")));
            }
        }
    }

Could you please help me with this problem?

This has nothing to do with your code.

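This particular site generates parts of its content using JavaScript. jsoup does not execute JavaScript, so it is only able to see the static parts of the site, and you won't be able to crawl the generated links with it easily.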

You can still use tools such as Selenium for that, which execute the JavaScript code inside a real browser.
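To see why the static markup is not enough, here is a minimal sketch that uses only the Java standard library. The page source in `RAW_HTML`, the class name `StaticLinkDemo`, and both category paths are made up for illustration, and the deliberately naive regular expression only stands in for a parser that reads the markup without running scripts. It is not how jsoup works internally.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

public class StaticLinkDemo {

    // Hypothetical page source: one anchor is present in the markup itself,
    // a second one would only exist after a browser executes the script.
    public static final String RAW_HTML =
        "<html><body>"
        + "<a href=\"/static-category\">Static</a>"
        + "<div id=\"menu\"></div>"
        + "<script>"
        + "var a = document.createElement('a');"
        + "a.href = '/dynamic-category';"
        + "document.getElementById('menu').appendChild(a);"
        + "</script>"
        + "</body></html>";

    // Naive pattern for <a ... href="..."> in raw markup (illustration only).
    private static final Pattern ANCHOR_HREF =
        Pattern.compile("<a\\s[^>]*href=\"([^\"]*)\"", Pattern.CASE_INSENSITIVE);

    // Collect the href values of anchors that are literally in the markup.
    public static List<String> staticHrefs(String html) {
        List<String> hrefs = new ArrayList<>();
        Matcher m = ANCHOR_HREF.matcher(html);
        while (m.find()) {
            hrefs.add(m.group(1));
        }
        return hrefs;
    }

    public static void main(String[] args) {
        List<String> hrefs = staticHrefs(RAW_HTML);
        System.out.println("total results: " + hrefs.size());
        for (String href : hrefs) {
            System.out.println("* " + href);
        }
    }
}
```

Only `/static-category` is reported. The `/dynamic-category` link never appears as an `<a>` tag in the source, so any tool that reads the markup without running the script misses it, which is the situation with the category links on the site in question.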

