diff --git a/INSTALL.md b/INSTALL.md
index 2eca6d63122c4b48fd7d9928459317d1e4982faa..dad2252a7823ec27e964aa4d38a45c4cd3ccfe35 100644
--- a/INSTALL.md
+++ b/INSTALL.md
@@ -80,7 +80,7 @@ R CMD build wikiproc
 R CMD INSTALL wikiproc_<version>.tar.gz
 ```
 
-That's it. You should be good to go and run the master script now.
+That's it for R. Run the master script, then continue with **Bot Setup** to install and run the Rasa bot.
 
 ## Manual installation
 
@@ -169,26 +169,28 @@ R CMD INSTALL wikiproc_<version>.tar.gz
 ```
 
 That's it. You should be good to go and run the master script now.
 
 ## Bot Setup
 
 In order to setup and run the [Rasa Bot](https://rasa.com/docs/) we recommend to use a [conda](https://conda.io/docs/user-guide/getting-started.html#managing-environmentsß) environment again with Python 3.6.7
 
-```
+```bash
 conda create -n rasa_env python=3.6.7
 source activate rasa_env
 ```
 
 You need to install [Rasa Core](https://rasa.com/docs/core/installation/) and [Rasa NLU](https://rasa.com/docs/nlu/installation/) to run the Bot
 
-```
+```bash
 pip install rasa_nlu
 pip install rasa_core
 ```
 
 Install the pipeline
 
-```
+```bash
 pip install sklearn_crfsuite
 pip install spacy
@@ -198,17 +200,17 @@ python -m spacy link en_core_web_md en
 
 Now you can train and run the bot
 
-```
+```bash
 cd rasa/
 make train
 ```
 
-```
+```bash
 make run
 ```
 
 Run in [Debug Mode](https://rasa.com/docs/core/debugging/) for more logging
 
-```
+```bash
 make run-debug
 ```
diff --git a/LICENSE b/LICENSE
index 6fe5ec17cbbd84186418d145888e6e8fe93e45d8..c0b41b03e13e2c27336ee902aa5009d419760d33 100644
--- a/LICENSE
+++ b/LICENSE
@@ -631,8 +631,8 @@ to attach them to the start of each source file to most effectively
 state the exclusion of warranty; and each file should have at least
 the "copyright" line and a pointer to where the full notice is found.
 
-    Wiki Rasa
-    Copyright (C) 2019 tm-chatbot
+    David Fuhry, Leonard Haas, Lukas Gehrke, Lucas Schons and Jonas Wolff
+    Copyright (C) 2019 Text Mining - Chatbot
 
     This program is free software: you can redistribute it and/or modify
     it under the terms of the GNU General Public License as published by
@@ -652,7 +652,8 @@ Also add information on how to contact you by electronic and paper mail.
   If the program does terminal interaction, make it output a short
 notice like this when it starts in an interactive mode:
 
-    Wiki Rasa Copyright (C) 2019 tm-chatbot
+    David Fuhry, Leonard Haas, Lukas Gehrke, Lucas Schons and Jonas Wolff Copyright (C) 2019
+    Text Mining - Chatbot
     This program comes with ABSOLUTELY NO WARRANTY; for details type `show w'.
     This is free software, and you are welcome to redistribute it
     under certain conditions; type `show c' for details.
diff --git a/README.md b/README.md
index c0efa49fa51d1653cb2496a7adcb720e05919375..5a72589082f4825ace6b9be03748d7c846d78e87 100644
--- a/README.md
+++ b/README.md
@@ -1,5 +1,86 @@
 # Wiki Rasa
 
+## Overview
+
+This repository contains all files required to download data from Wikipedia, process that data to extract facts about physicists, and build a chatbot based on the Rasa framework with that information.
+
+### Prerequisites
+
+The R script assumes all the packages listed in the `packages.list` file are installed within R. You may install them with:
+
+```r
+install.packages(readLines('processing/packages.list'))
+```
+
+Furthermore you will need a spacy installation with the English language data installed. By default the script expects to find this in a conda environment named `spcy`; if you need to change that, do so in the `Master.R` file.
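+
+If you do not have such an environment yet, a minimal sketch for creating it could look like the following (the environment name `spcy` matches the script's default; the Python version and package list are assumptions, adjust them to your setup):
+
+```bash
+# Sketch only: the Python version and packages are assumptions, adjust as needed
+conda create -n spcy python=3.6.7
+source activate spcy
+pip install spacy
+python -m spacy download en_core_web_md
+python -m spacy link en_core_web_md en
+```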
+
+To build the **wikiproc** package, navigate to the `processing` directory and run:
+
+```bash
+R CMD build wikiproc
+R CMD INSTALL wikiproc_<version>.tar.gz
+```
+
+_Note: This will require the [R Tools](https://cran.r-project.org/bin/windows/Rtools/) on Windows and possibly additional packages on *nix platforms._
+
+To run the Rasa bot, Rasa needs to be installed. It is recommended to do that in a conda environment; you may create one with:
+
+```bash
+conda create -n rasa_env python=3.6.7
+source activate rasa_env
+```
+
+In this environment install `rasa_nlu`, `rasa_core`, `sklearn_crfsuite` and `spacy`. Also download the spacy `en_core_web_md` language data.
+
+```bash
+pip install rasa_nlu
+pip install rasa_core
+pip install sklearn_crfsuite
+pip install spacy
+python -m spacy download en_core_web_md
+python -m spacy link en_core_web_md en
+```
+
+### Running
+
+The data processing side is handled by the `Master.R` script in the `processing/script` folder. The script assumes the working directory to be somewhere within the base directory `wiki-rasa`, so make sure to either call `Rscript` from within this directory or to set the working directory in R accordingly prior to sourcing. The easiest way is to call the script from the base directory of the repository:
+
+```bash
+Rscript processing/script/Master.R
+```
+
+This will download the required data, process it, and generate the data file required for the chatbot. After that, train the bot (don't forget to activate the conda environment if you're using one):
+
+```bash
+cd rasa/
+make train
+```
+
+You're ready to run the bot:
+
+```bash
+make run
+```
+
+### Installing on Debian
+
+For a detailed guide on installing on a Debian 9 machine, take a look at [Installation](INSTALL.md).
+
+### Building the Docker image
+
+**_Work in progress_**
+
+Run the build script for your system, e.g. `build_docker.bat` on Windows or `build_docker.sh` on Linux.
+
+_Note: This will do all processing, including the data download, inside the Docker container and thus results in a rather large container.
+Container size will be reduced in the future._
+
+After that you should be good to start the Docker container with:
+
+```bash
+docker run -it chatbot
+```
+
 ## Contributing
 
 Before merging please make sure to check the following:
@@ -32,38 +113,3 @@ When writing a function to extract a feature use the following as guidelines:
 * If your function is to be visible from the outside, make sure to add `@export` to the roxygen comment
 * Set the working directory to `wikiproc` and call `devtools::document()`
 * Step into `processing` and use `devtools::install("wikiproc")` to install the package
-
-## Installation
-
-You may use this software by installing the **wikiproc** package and then running the `master.R` script. There are also directions on how to install from scratch on a debian vm and on how to build a docker.
-
-### General prerequisites
-
-The script assumes all the packages in the `packages.list` file are installed within R. Furthermore you will need to have an spacy installation with the english language data installed. By default the script will assume to find this in a conda environment named `spcy`, if you need to change that do so in the `ProcessNER.R` file.
-
-To build the **wikiproc** package navigate to the processing directory and run:
-
-```bash
-R CMD build wikiproc
-R CMD INSTALL wikiproc_<version>.tar.gz
-```
-
-_Note: This will require the [R Tools](https://cran.r-project.org/bin/windows/Rtools/) on windows and possibly additional packages on *nix platforms._
-
-The data processing side is done by the `Master.R` script in the `r` folder. This may be called via `Rscript r/Master.R` from any command line or via `source("r/Master.R")` from within R. The script assumes the working direcory to be the base directory `wiki-rasa` so make sure to either call `Rscript` from within this directory or to set the working directory in R here prior to sourcing.
-
-### Installing on debian
-
-For a detailed guide on installing on a Debian 9 machine take a look at [Installation](docs/install_debian.md).
-
-### Building the docker
-
-**_Work in progress_**
-
-Run the build script for your system, e.g. on Windows `build_docker.bat` or `build_docker.sh` on Linux.
-
-After that you should be good to start the docker with
-
-```sh
-docker run -it chatbot
-```