README.md 1.89 KB
Newer Older
Sebastian Hellmann's avatar
Sebastian Hellmann committed
1
2
# MARVIN-config

Sebastian Hellmann's avatar
Sebastian Hellmann committed
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
24
25
Configuration files for MARVIN on the TIB servers, public for forking the architecture

# Acknowledgements
We thank Sören Auer and the Technische Informationsbibliothek (TIB) for providing three servers to run:

* the main DBpedia extraction on a monthly basis 
* community-provided extractors on Wikipedia, Wikidata or other sources 
* enrichment, cleaning and parsing services, so-called [Databus mods](https://github.com/dbpedia/databus-mods/) for open data on the Databus

This contribution by TIB is a great push towards incentivizing Open Data and establishing a global and national research and innovation data infrastructure. 

# Workflow

## Downloading the wikimedia dumps
TODO

## Running the extraction
TODO

## Deploy on Databus
TODO

## Run Databus-Derive (clone and parse)
Sebastian Hellmann's avatar
readme    
Sebastian Hellmann committed
26
27
28
29
30
31
32
33
34
35
36
37
38
39
40
41
42
43
44
45
46
47
48
49
50
51
52
53
54
55
56
57
On the respective server there is a user marvin-fetch, that has access to `/data/derive` containing the pom.xml of https://github.com/dbpedia/databus-maven-plugin/tree/master/dbpedia

```
# query to get all versions fro derive in xml syntax to paste directly into pom.xml
PREFIX dataid: <http://dataid.dbpedia.org/ns/core#>
SELECT distinct (?derive) WHERE {

    ?dataset dataid:group <https://databus.dbpedia.org/marvin/generic> .
    ?dataset dataid:artifact ?artifact .
    ?dataset dataid:version ?version .
    ?dataset dct:hasVersion "2019.08.30"^^xsd:string
	BIND (CONCAT("<version>",?artifact,"/${databus.deriveversion}</version>") as ?derive)
}
order by asc(?derive)


```
#######
# This is still manual, will be a cronjob soon
#######
su marvin-fetch
tmux a -t derive

WHAT=mappings
NEWVERSION=2019.08.30
# prepare
cd /data/derive/databus-maven-plugin/dbpedia/$WHAT
git pull
mvn versions:set -DnewVersion=$NEWVERSION
# run
mvn -T 23 databus-derive:clone -Ddatabus.deriveversion=$NEWVERSION
```
Sebastian Hellmann's avatar
Sebastian Hellmann committed
58
59
60
61
62
63
64

## Move data to download server (internal)
run marvin-fetch.sh script in databus/dbpedia folder

## Deploy official files