Commit 6230c671 authored by David Fuhry

Fix small things

parent 888431ca
1 merge request: !71 Documentation: Final Report
......@@ -98,7 +98,7 @@
While Wikipedia does have a \textit{Physicists} category\footnote{\url{https://en.wikipedia.org/wiki/Category:Physicists}},
it is fragmented into somewhat arbitrary subcategories and thus not optimal to use as a
collection.
However, Wikipedia also features a "List of physicists" which contains 981 articles
However, Wikipedia also features a "List of physicists"\footnote{\url{https://en.wikipedia.org/wiki/List_of_physicists}} which contains 981 articles
that were used to build the corpus. \par
Data scraping was done using the R package \textit{WikipediR} which is a wrapper around the Wikipedia
API.
......@@ -202,7 +202,7 @@ training examples. It was possible to configure the bot to meet our needs withou
restrictions. \par
Wikipedia articles are particularly well suited for the process of information extraction,
because they generally are composed consistently. The different levels of detail and therefore information
were an issue when dealing in using these articles. \par
were an issue in using these articles. \par
Concluding the textmining part of our project we can assess that the functions
using mainly NER tags (get\_awards.R and get\_university.R) have high recall and relatively low
precision. The function get\_spouses.R, which is working with pattern matching, has low recall
......