% Generated by roxygen2: do not edit by hand
% Please edit documentation in R/clean_html.R
\name{clean_html}
\alias{clean_html}
\title{Clean an HTML-formatted Wikipedia page.
Nodes of interest are extracted from the DOM and then stripped of all HTML
tags and annotations.}
\usage{
clean_html(html)
}
\arguments{
\item{html}{URL of a Wikipedia page, or an HTML-formatted document.}
}
\value{
Plain-text document containing only the main text of the given Wikipedia page.
}
\description{
Clean an HTML-formatted Wikipedia page.
Nodes of interest are extracted from the DOM and then stripped of all HTML
tags and annotations.
}
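\examples{
\dontrun{
# Minimal usage sketch, not taken from the package sources.
# Assumptions: the URL below is only an illustrative choice, and the call
# needs network access to fetch the page.
text <- clean_html("https://en.wikipedia.org/wiki/R_(programming_language)")

# Per the Value section, the result should contain the main text of the
# page with the HTML tags and annotations removed.
print(text)
}
}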