Docstrings
Harbest.read_html
— Function
Returns the parsed HTML from a URL.
read_html(url::String)
Input:
url::String
Output
HTMLDocument
Harbest.html_elements
— Function
Returns HTML elements.
html_elements(html, string)
Input:
html
– It can be an HTMLDocument, HTMLElement or Vector{HTMLNode}.
string
– The element in the HTML that you want to find. It can be a String or a Vector{String}; if the latter, the function is applied with each string in sequence.
Output
Your HTML reduced to the element that you indicated
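A minimal sketch of how read_html and html_elements might be combined, based only on the signatures documented above. The URL is a placeholder, and the assumption that plain tag names such as "p" are accepted as the element string is not confirmed by this page.

```julia
using Harbest

# Placeholder URL (assumption): any reachable page would do.
html = read_html("https://example.com")

# Single element name: reduce the document to its <p> elements.
paragraphs = html_elements(html, "p")

# Vector of names: applied in sequence, so this looks for <a> inside <p>.
links = html_elements(html, ["p", "a"])
```

Passing a Vector{String} is a compact alternative to calling html_elements repeatedly on its own output.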
Harbest.html_attrs
— Function
Gets an attribute.
html_attrs(html, string)
Input:
html
– It can be an HTMLDocument, HTMLElement or Vector{HTMLNode}.
string::String
(optional) – The attribute that you want to return. If it is not provided, the function tries to return a list of the available attributes.
Output
Indicated attribute or a list of the available attributes
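As a hedged illustration of the two call forms described above, the sketch below first asks for one attribute and then omits the attribute name. The URL and the "href" attribute are illustrative assumptions; the exact shape of the returned values is not stated on this page.

```julia
using Harbest

# Placeholder URL (assumption): any reachable page with links would do.
html = read_html("https://example.com")
links = html_elements(html, "a")

# With an attribute name: return that attribute for the selected elements.
hrefs = html_attrs(links, "href")

# Without an attribute name: try to list the attributes that are available.
available = html_attrs(links)
```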
Harbest.html_table
— Function
Takes some HTML and turns it into a DataFrame, but only if it contains a clearly structured HTML table.
html_table(table_html)
Input:
table_html
– Vector{HTMLNode}
Output
A DataFrame
Harbest.html_text3
— Function
Returns the text of the HTML.
html_text3(html)
Input:
html
– HTMLDocument, HTMLElement or Vector{HTMLNode}
Output
A single String or a Vector{String} depending on the input
If you need to keep whitespace and other characters, you can use html_text or html_text2 instead.
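A short sketch of extracting text from selected elements, assuming the signatures documented above. The URL and the "h1" element name are placeholders chosen for illustration.

```julia
using Harbest

# Placeholder URL (assumption): any reachable page with a heading would do.
html = read_html("https://example.com")
headings = html_elements(html, "h1")

# Returns a String or a Vector{String}, depending on the input type.
titles = html_text3(headings)
```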
Harbest.html_text2
— Function
Returns the text of the HTML, but cleaner than html_text.
Harbest.html_text
— Function
Returns the text of the HTML, but with some whitespace.