Docstrings

TidierVest.html_attrs – Method

Gets an attribute of the HTML, or a list of the available attributes.

html_attrs(html,string)

Input:

  • html – An HTMLDocument, HTMLElement or Vector{HTMLNode}
  • string::String (optional) – The attribute to return. If it is not provided, the function tries to return a list of the available attributes.

Output

The indicated attribute, or a list of the available attributes
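
As a minimal sketch of the call described above, the following reads one attribute from a set of elements. The sample markup and the variable names are hypothetical and exist only for this illustration; the sketch assumes that html_elements returns a Vector{HTMLNode} that html_attrs accepts as its first argument.

```julia
using TidierVest

# Hypothetical sample markup, used only for this illustration.
doc = parse_html("""
<html><body>
  <a href="https://example.com/a" title="A">First</a>
  <a href="https://example.com/b" title="B">Second</a>
</body></html>
""")

# Select the <a> elements, then ask for one attribute by name.
links = html_elements(doc, "a")
hrefs = html_attrs(links, "href")

println(hrefs)
```

Leaving out the second argument should, according to the docstring, return a list of the available attributes instead.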

TidierVest.html_children – Method

Gets the children of the given HTML.

Input:

  • html – It can be HTMLDocument, HTMLElement or Vector{HTMLNode}

Output

The children of the given HTML

TidierVest.html_elements – Method

Returns HTML elements

html_elements(html,string)

Input:

  • html – It can be HTMLDocument, HTMLElement or Vector{HTMLNode}
  • string – The element to find in the HTML. It can be a String or a Vector{String}; with a vector, the function is applied once per string, in sequence.

Output

Your HTML reduced to the element that you indicated
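
The difference between the String and Vector{String} forms can be sketched as follows. The markup is hypothetical, and the sketch assumes that the vector form narrows the result step by step (first to the ul elements, then to the li elements inside them), as the description of the string argument suggests.

```julia
using TidierVest

# Hypothetical sample markup with one ordered and one unordered list.
doc = parse_html("""
<html><body>
  <ol><li>One</li><li>Two</li></ol>
  <ul><li>Apple</li></ul>
</body></html>
""")

# A single String: every <li> in the document.
all_items = html_elements(doc, "li")

# A Vector{String}: applied in sequence, first "ul", then "li" within that result.
ul_items = html_elements(doc, ["ul", "li"])

println(length(all_items))
println(length(ul_items))
```

The vector form is a shorthand for calling html_elements repeatedly on its own output.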

TidierVest.html_table – Method

Turns HTML into a DataFrame. This only works when the HTML contains a clearly structured table.

html_table(table_html)

Input:

  • table_html – Vector{HTMLNode}

Output

A DataFrame

TidierVest.html_text3 – Method

Returns the text of an HTML.

html_text3(html)

Input:

  • html – HTMLDocument, HTMLElement or Vector{HTMLNode}

Output

A single String or a Vector{String} depending on the input

If you need to keep whitespace and other details, use html_text or html_text2 instead.
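
The two output shapes mentioned above can be sketched like this. The markup and variable names are hypothetical; the exact whitespace handling of the returned strings is not shown here because the docstring does not specify it.

```julia
using TidierVest

# Hypothetical sample markup, used only for this illustration.
doc = parse_html("<html><body><h1>Title</h1><p>First</p><p>Second</p></body></html>")

paragraphs = html_elements(doc, "p")

# With a Vector{HTMLNode}, the docstring says to expect a Vector{String}.
texts = html_text3(paragraphs)

# With a single HTMLElement, the docstring says to expect a single String.
heading = html_text3(html_elements(doc, "h1")[1])

println(texts)
println(heading)
```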

TidierVest.minimal_html – Method

minimal_html(html::AbstractString, title::AbstractString)::String

Takes some HTML and turns it into a minimal HTML document.

Input:

  • html – HTML string that goes in the body of the document
  • title – title of the document

Output:

  • html – The complete HTML document

TidierVest.parse_html – Method

Returns parsed HTML from a string

parse_html(str::String)::HTMLDocument

Input:

  • str::String

Output

HTMLDocument

TidierVest.read_html – Method

Returns parsed HTML from a URL string or from an open file

read_html(url::String)
read_html(file::IOStream)

Input:

  • url::String

or

  • file::IOStream

Output

HTMLDocument
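
Both input forms can be sketched as below. The temporary file, its contents, and the placeholder URL are assumptions made for this illustration; the file-based call is used as the live example so that the sketch does not need network access.

```julia
using TidierVest

# Write a small hypothetical page to a temporary file so that no network is needed.
path, io = mktemp()
write(io, "<html><head><title>Demo</title></head><body><p>Hello</p></body></html>")
close(io)

# read_html(file::IOStream): pass an open file handle.
doc = open(path) do file
    read_html(file)
end

# read_html(url::String): pass a URL instead (placeholder URL, requires network access).
# doc = read_html("https://example.com")

println(html_text3(html_elements(doc, "p")))
```

Either way, the returned HTMLDocument can be passed on to html_elements and the other functions on this page.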
