Hi All,
I am trying to capture the heading structure of a given web page in the order the headings appear in the page source, for example h1 first, then h2, h3, h3, h2, and so on. I also need to extract the text of each heading.
I am using the MS HTML DOM methods such as getElementsByTagName(), since that is the part of the library I know. However, it requires a tag name, and if I hard-code "h1" it returns only the h1 elements wherever they occur in the page, so I lose the sequence in which the different heading levels appear in the page source.
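To make the desired result concrete, here is a minimal sketch in Python. It uses the standard library's html.parser, not MSHTML, so it only illustrates the output I am after and is not a solution for that library. The names HeadingCollector and extract_headings are made up for this example, and nested or malformed headings are not handled.

```python
from html.parser import HTMLParser

HEADING_TAGS = {"h1", "h2", "h3", "h4", "h5", "h6"}


class HeadingCollector(HTMLParser):
    """Collects (tag, text) pairs for h1..h6 in the order they occur."""

    def __init__(self):
        super().__init__()
        self.headings = []    # results, in document order
        self._current = None  # heading tag currently open, if any
        self._parts = []      # text fragments seen inside that heading

    def handle_starttag(self, tag, attrs):
        # The parser lowercases tag names before calling this method.
        if tag in HEADING_TAGS:
            self._current = tag
            self._parts = []

    def handle_data(self, data):
        # Keep text only while a heading is open (includes nested inline tags).
        if self._current is not None:
            self._parts.append(data)

    def handle_endtag(self, tag):
        if tag == self._current:
            # Join the fragments and collapse runs of whitespace.
            text = " ".join("".join(self._parts).split())
            self.headings.append((tag, text))
            self._current = None


def extract_headings(html):
    """Return a list of (tag, text) tuples in page-source order."""
    parser = HeadingCollector()
    parser.feed(html)
    parser.close()
    return parser.headings


if __name__ == "__main__":
    sample = (
        "<h1>Title</h1><p>intro</p><h2>A</h2><h3>A1</h3>"
        "<h3>A <em>2</em></h3><h2>B</h2>"
    )
    for tag, text in extract_headings(sample):
        print(tag, text)
```

The point of the sketch is that all heading levels are checked in a single pass over the document, instead of one lookup per tag name, which is what keeps the original sequence.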
Thanks in advance.