Docstrings
TidierVest.html_attrs
— Method
Get an attribute.
html_attrs(html, string)
Input:
html – An HTMLDocument, HTMLElement or Vector{HTMLNode}
string::String (optional) – The attribute to return. If not provided, the function tries to return a list of the available attributes.
Output:
The indicated attribute, or a list of the available attributes
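A minimal sketch of reading one attribute from a set of elements. The sample markup is invented for illustration, and the snippet assumes parse_html and html_elements behave as described in their own entries on this page; the exact return type may vary between package versions.

```julia
using TidierVest

# Hypothetical sample markup, used only for illustration.
page = parse_html("""
<html><body>
  <a href="https://example.com/a">A</a>
  <a href="https://example.com/b">B</a>
</body></html>
""")

# Narrow the document down to its <a> elements.
links = html_elements(page, "a")

# Ask for a single attribute by name; one value per matched element is expected.
hrefs = html_attrs(links, "href")
```

Selecting the elements first and then asking for the attribute keeps each call simple, which matches the input types listed above.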
TidierVest.html_children
— Method
Get the children of an HTML node.
Input:
html – An HTMLDocument, HTMLElement or Vector{HTMLNode}
Output:
The children of the given HTML
TidierVest.html_elements
— Method
Returns HTML elements.
html_elements(html, string)
Input:
html – An HTMLDocument, HTMLElement or Vector{HTMLNode}
string – The element in the HTML that you want to find. It can be a String or a Vector{String}; if the latter, the function is applied to each entry in sequence.
Output:
Your HTML reduced to the element that you indicated
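The two forms of the string argument might be used as in the sketch below. The markup is a made-up example, and the Vector{String} call assumes the "in sequence" behaviour described above (each entry is applied to the result of the previous one).

```julia
using TidierVest

# Hypothetical sample markup, used only for illustration.
page = parse_html("""
<html><body>
  <ul><li>One</li><li>Two</li></ul>
  <p>Other content</p>
</body></html>
""")

# Single String: find every <li> in the document.
items = html_elements(page, "li")

# Vector{String}: first reduce to <ul>, then to the <li> elements inside it.
nested = html_elements(page, ["ul", "li"])
```

The vector form is a convenience for chaining several lookups without nesting calls by hand.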
TidierVest.html_table
— Method
Takes some HTML and turns it into a DataFrame. This only works when the HTML contains a clearly structured HTML table.
html_table(table_html)
Input:
table_html – Vector{HTMLNode}
Output:
A DataFrame
TidierVest.html_text
— Method
Returns the text of an HTML, keeping some of the whitespace.
TidierVest.html_text2
— Method
Returns the text of an HTML, but cleaner than html_text.
TidierVest.html_text3
— Method
Returns the text of an HTML.
html_text3(html)
Input:
html – HTMLDocument, HTMLElement or Vector{HTMLNode}
Output:
A single String or a Vector{String}, depending on the input
If you want or need whitespace and other details preserved, you can use html_text or html_text2.
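A short sketch of how the output type follows the input type, as described above. The markup is invented for illustration, and the snippet assumes parse_html and html_elements work as documented in their own entries.

```julia
using TidierVest

# Hypothetical sample markup, used only for illustration.
page = parse_html("""
<html><body>
  <p>First paragraph</p>
  <p>Second paragraph</p>
</body></html>
""")

# Vector{HTMLNode} input: one text entry per element is expected.
paragraphs = html_elements(page, "p")
texts = html_text3(paragraphs)

# HTMLDocument input: a single String for the whole document is expected.
whole = html_text3(page)
```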
TidierVest.minimal_html
— Method
minimal_html(html::AbstractString, title::AbstractString)::String
Takes some HTML and turns it into a minimal HTML document.
Input:
html – HTML string that goes in the body of the document
title – Title of the document
Output:
html – The complete HTML document
TidierVest.parse_html
— Method
Returns a parsed HTML from a string.
parse_html(str::String)::HTMLDocument
Input:
str::String
Output:
HTMLDocument
TidierVest.read_html
— Method
Returns a parsed HTML from a URL string or from a file.
read_html(url::String)
read_html(file::IOStream)
Input:
url::String
or
file::IOStream
Output:
HTMLDocument