module HTree
Public Class Methods
Source
# File htree/parse.rb, line 34
def HTree.parse(input)
  HTree.with_frozen_string_hash {
    parse_as(input, false)
  }
end
HTree.parse parses input and returns a document tree represented by HTree::Doc.
input should be a String or an object which responds to the read or open method. For example, IO, StringIO, Pathname, URI::HTTP and URI::FTP are acceptable. Note that the URIs need open-uri.
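The accepted input kinds can be illustrated with a small duck-typing sketch. This is not HTree's implementation: read_input is a hypothetical helper, and the order in which the checks are made is an assumption.

```ruby
require 'stringio'

# Hypothetical helper (not part of HTree): returns the text that would be
# handed to the parser, based on which methods the input responds to.
def read_input(input)
  if input.is_a?(String)
    input                          # a String is used as-is
  elsif input.respond_to?(:read)
    input.read                     # IO, StringIO, Pathname, ...
  elsif input.respond_to?(:open)
    input.open { |io| io.read }    # objects that only know how to open themselves
  else
    raise ArgumentError, "cannot read from #{input.class}"
  end
end

read_input("<p>hello</p>")
read_input(StringIO.new("<p>hello</p>"))
```

Both calls at the end return the same markup string; an object that responds to neither read nor open raises ArgumentError in this sketch.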
HTree.parse guesses whether input is HTML and whether it is XML.
If the input is guessed to be HTML, the default namespace in the result is set to http://www.w3.org/1999/xhtml, regardless of whether the input has an XML namespace declaration, and even if it is pre-XML HTML.
If the input is guessed to be HTML and not XML, all element and attribute names are downcased.
If the opened file or the read content has a charset method, HTree.parse decodes it according to $KCODE before parsing. Otherwise, HTree.parse assumes that the character encoding of the content is compatible with $KCODE. Note that the charset method is provided by URI::HTTP with open-uri.
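The charset rule can be sketched with the encoding API of current Ruby. This is only an analogue under stated assumptions: $KCODE has no effect since Ruby 1.9, so the sketch transcodes with String#encode to a target encoding instead, and decode_content is a hypothetical helper rather than anything HTree provides.

```ruby
require 'stringio'

# Hypothetical helper (not part of HTree): if the content advertises a
# charset, transcode it to the target encoding; otherwise return it unchanged
# and assume it is already compatible.
def decode_content(content, target = Encoding::UTF_8)
  text = content.respond_to?(:read) ? content.read : content.to_s
  return text unless content.respond_to?(:charset) && content.charset

  # dup so that a String passed in directly is never modified in place
  text.dup.force_encoding(content.charset).encode(target)
end

# Stand-in for what open-uri would return: an IO-like object with a charset.
latin1 = StringIO.new("caf\xE9".b)
def latin1.charset
  "iso-8859-1"
end

decode_content(latin1)
```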
Source
# File htree/parse.rb, line 48
def HTree.parse_xml(input)
  HTree.with_frozen_string_hash {
    parse_as(input, true)
  }
end
HTree.parse_xml parses input as XML and returns a document tree represented by HTree::Doc.
It behaves almost the same as HTree.parse, but it assumes that input is XML even if there is no XML declaration. This assumption causes the following differences.
- It does not downcase element names.
- The content of <script> and <style> elements is PCDATA, not CDATA.
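The name-case difference between the two entry points can be shown with a minimal sketch. normalize_names is a hypothetical helper used only for illustration; it is not an HTree method and does not reflect how HTree stores names internally.

```ruby
# Hypothetical helper (not part of HTree): applies the naming rule described
# above. In XML mode names are kept as written; in HTML (non-XML) mode the
# element name and all attribute names are downcased.
def normalize_names(element_name, attributes, xml:)
  return [element_name, attributes] if xml

  [element_name.downcase, attributes.transform_keys(&:downcase)]
end

normalize_names("DIV", { "CLASS" => "note" }, xml: false)
normalize_names("DIV", { "CLASS" => "note" }, xml: true)
```

The first call yields lowercase names, as HTree.parse would for HTML input; the second leaves them untouched, as HTree.parse_xml would.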
Public Instance Methods
Source
# File htree/equality.rb, line 10
def ==(other)
  check_equality(self, other, :usual_equal_object)
end
Compares tree structures.
Source
# File htree/equality.rb, line 16
def hash
  return @hash_code if defined? @hash_code
  @hash_code = usual_equal_object.hash
end
Returns the hash value for the tree structure.
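The pairing of == with a memoized hash can be sketched on a toy tree class. Node and comparison_key are hypothetical names; comparison_key merely stands in for the role usual_equal_object plays in the source above, and the sketch assumes nodes are not modified after creation.

```ruby
# Toy tree node (not an HTree class) with structural equality and a cached
# hash, so that structurally equal trees can be used as Hash keys.
class Node
  attr_reader :name, :children

  def initialize(name, children = [])
    @name = name
    @children = children.freeze
  end

  # Two nodes are equal when their names and child lists are equal.
  def ==(other)
    other.is_a?(Node) && comparison_key == other.comparison_key
  end
  alias eql? ==

  # Computed once and cached, mirroring the `defined? @hash_code` idiom.
  def hash
    return @hash_code if defined? @hash_code
    @hash_code = comparison_key.hash
  end

  protected

  # Hypothetical counterpart of usual_equal_object: the data that decides equality.
  def comparison_key
    [name, children]
  end
end

a = Node.new("p", [Node.new("b")])
b = Node.new("p", [Node.new("b")])
a == b
```

Because hash is derived from the same data that == compares, equal trees always produce equal hash values.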