PRIMAT is an open-source (Apache License 2.0) toolbox for defining and executing privacy-preserving record linkage (PPRL) workflows.
It offers several components for data owners and the central linkage unit that provide state-of-the-art PPRL methods,
including Bloom-filter-based encoding and hardening techniques, LSH-based blocking, metric space filtering,
post-processing and more.
It offers modules for data owners and the linkage unit that provide state-of-the-art PPRL methods,
including Bloom-filter-based encoding and hardening techniques, LSH-based blocking, post-processing (clustering) and more.
[PRIMAT](https://dl.acm.org/citation.cfm?doid=3352063.3360392) is developed by the [Database Group](https://dbs.uni-leipzig.de/research/projects/pper_big_data) of the University of Leipzig, Germany.
## Using PRIMAT
To use PRIMAT in your project, add the following Maven dependency to your build configuration
```xml
<dependency>
    <groupId>de.uni-leipzig.dbs.pprl</groupId>
    <artifactId>primat-data-owner</artifactId>
    <version>1.0.1</version>
</dependency>
```
for data owner components, including pre-processing and encoding methods, or
```xml
<dependency>
    <groupId>de.uni-leipzig.dbs.pprl</groupId>
    <artifactId>primat-linkage-unit</artifactId>
    <version>1.0.1</version>
</dependency>
```
for linkage unit components, including linkage and post-processing (clustering) methods.
## PRIMAT Modules
- `primat-common` - Contains the shared data model and various utility functions, e.g., for input file handling, hashing, and feature extraction.
- `primat-data-owner` - Contains typical pre-processing functions as well as techniques to encode/mask records for PPRL.
- `primat-linkage-unit` - Provides functionality for batch and incremental linkage workflows, including blocking, similarity calculation, classification, post-processing (clustering), and evaluation.
- `primat-examples` - Contains example workflows showing use cases for PRIMAT.
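
To give a feel for what "encoding" on the data owner side and "similarity calculation" at the linkage unit mean, the following is a minimal, self-contained sketch of Bloom-filter-based encoding with Dice similarity. It is not PRIMAT's API: the class `BloomFilterSketch`, its methods, and the parameters (`m` bits, `k` hash functions, `secretKey`) are hypothetical names chosen for illustration, and the keyed hash is simplified (real deployments typically use an HMAC).

```java
import java.nio.charset.StandardCharsets;
import java.security.MessageDigest;
import java.security.NoSuchAlgorithmException;
import java.util.BitSet;
import java.util.LinkedHashSet;
import java.util.Locale;
import java.util.Set;

// Hypothetical illustration only; none of these names come from PRIMAT.
final class BloomFilterSketch {

    // Split a lower-cased value, padded with '_', into character bigrams.
    static Set<String> bigrams(String value) {
        String padded = "_" + value.toLowerCase(Locale.ROOT) + "_";
        Set<String> grams = new LinkedHashSet<>();
        for (int i = 0; i + 2 <= padded.length(); i++) {
            grams.add(padded.substring(i, i + 2));
        }
        return grams;
    }

    // Data owner side: map every bigram to k positions in an m-bit filter.
    // The k positions are derived by double hashing from one SHA-256 digest
    // of (secretKey + bigram); this is a simplification, not an HMAC.
    static BitSet encode(String value, String secretKey, int m, int k) {
        BitSet bits = new BitSet(m);
        for (String gram : bigrams(value)) {
            byte[] digest = sha256(secretKey + gram);
            int h1 = toInt(digest, 0);
            int h2 = toInt(digest, 4);
            for (int i = 0; i < k; i++) {
                bits.set(Math.floorMod(h1 + i * h2, m));
            }
        }
        return bits;
    }

    // Linkage unit side: Dice similarity of two bit vectors,
    // 2 * |a AND b| / (|a| + |b|), computed without seeing the plaintext.
    static double dice(BitSet a, BitSet b) {
        BitSet common = (BitSet) a.clone();
        common.and(b);
        int total = a.cardinality() + b.cardinality();
        return total == 0 ? 0.0 : 2.0 * common.cardinality() / total;
    }

    // Read four digest bytes starting at off as a big-endian int.
    private static int toInt(byte[] d, int off) {
        return ((d[off] & 0xff) << 24) | ((d[off + 1] & 0xff) << 16)
                | ((d[off + 2] & 0xff) << 8) | (d[off + 3] & 0xff);
    }

    private static byte[] sha256(String s) {
        try {
            return MessageDigest.getInstance("SHA-256")
                    .digest(s.getBytes(StandardCharsets.UTF_8));
        } catch (NoSuchAlgorithmException e) {
            throw new IllegalStateException(e); // SHA-256 is required on every JVM
        }
    }

    public static void main(String[] args) {
        // Both data owners must use the same key and parameters.
        BitSet a = encode("Smith", "shared-secret", 1024, 10);
        BitSet b = encode("Smyth", "shared-secret", 1024, 10);
        System.out.println("Dice similarity: " + dice(a, b));
    }
}
```

Because similar strings share most of their bigrams, their bit vectors overlap heavily, so the linkage unit can classify a pair as a match when the Dice score exceeds a chosen threshold. The filter length and number of hash functions used here are arbitrary example values.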
## Privacy-preserving Record Linkage
- Task of identifying records in different databases referring to the same person
...
...
@@ -25,7 +58,7 @@ post-processing and more.
- Scalability to millions of records
- High linkage quality
## PRIMAT
## PRIMAT: Overview
- PPRL tool covering the entire PPRL life-cycle
- Flexible definition and execution of PPRL workflows
...
...
@@ -53,11 +86,11 @@ post-processing and more.
|Component/Module | Function/Feature | Status |
|-----------------|------------------|--------|
| Data generator & corruptor | - Data generation<br> - Data corruption | Implemented<br>Planned |
| Data generator & corruptor | - Data generation<br> - Data corruption | Integration outstanding<br>Planned |