thiuda authored
843c16a3

Constructivist Machine Learning (conML)

Dependencies

  • Python >= 3.7.4
  • (optional) conda or virtualenv
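
Before installing, you may want to confirm that the active interpreter meets the minimum version listed above. The following is a minimal sketch using only the standard library; the helper name `meets_minimum` is illustrative and not part of conML.

```python
import sys

# Minimum interpreter version, taken from the dependency list above.
MINIMUM = (3, 7, 4)


def meets_minimum(version=None, minimum=MINIMUM):
    """Return True if `version` (default: the running interpreter) is >= `minimum`."""
    if version is None:
        version = sys.version_info[:3]
    # Tuples compare element by element, so (3, 8, 0) >= (3, 7, 4) is True.
    return tuple(version) >= tuple(minimum)


if __name__ == "__main__":
    current = ".".join(str(part) for part in sys.version_info[:3])
    status = "OK" if meets_minimum() else "older than the listed minimum"
    print("Python", current, status)
```

Run it inside the activated virtual environment so that it checks the interpreter you will actually use.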

Installation

  1. Install your favorite virtual environment manager and activate an environment, e.g. conda (recommended) or virtualenv. Run all commands inside your virtual environment.

  2. Move into your project folder and install all dependencies.

pip install -r requirements.txt
  3. Install conML.
python setup.py install

Documentation

You'll find the documentation here.

Quick Start

1. Define main ingredients

  1. Begin by importing the conML module:
import conML
  2. Now, create the constructor. It needs a list of tuples, each pairing an abbreviation with an instantiated unsupervised machine learning model from the scikit-learn library. The type of construction must also be specified; currently only conceptual construction is supported.
from sklearn.cluster import KMeans
from sklearn.cluster import AgglomerativeClustering

unsup_models = [("Kme", KMeans()), ("Agg", AgglomerativeClustering())]
constructor = conML.construction("conceptual", unsup_models)
  3. The second component is the feature selector, which combines a filter method and an embedded method. The embedded method is applied as soon as a predefined number of features or samples is exceeded; otherwise the filter method is used.
from sklearn.feature_selection import VarianceThreshold, SelectFromModel
from sklearn.ensemble import ExtraTreesClassifier

# Named "variance" rather than "filter" to avoid shadowing the built-in filter().
variance = VarianceThreshold(2000)
embedded = SelectFromModel(ExtraTreesClassifier())
selector = conML.feature_selection(filter_method=variance, embedded_method=embedded)
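
The switching rule described in this step can be sketched as a small decision function. This is only an illustration of the rule as stated in the text: the function name, the parameter names and the limits used in the example are hypothetical, and conML's actual limits and configuration options are not shown here.

```python
def choose_selection_method(n_samples, n_features, max_samples, max_features):
    """Return "embedded" if either limit is exceeded, otherwise "filter".

    max_samples and max_features are hypothetical limits chosen for
    illustration; they are not conML defaults.
    """
    if n_samples > max_samples or n_features > max_features:
        return "embedded"
    return "filter"


# A block of 100 samples with 352 features stays below both example limits.
small_block = choose_selection_method(100, 352, max_samples=1000, max_features=500)

# Exceeding either limit alone is enough to switch to the embedded method.
wide_block = choose_selection_method(100, 600, max_samples=1000, max_features=500)
```

With these example limits, `small_block` is `"filter"` and `wide_block` is `"embedded"`.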
  4. Next, define a reconstructor. Its definition follows the same scheme as the constructor, except that you select supervised learning models.
from sklearn.ensemble import RandomForestClassifier
from sklearn.svm import SVC
from sklearn.neighbors import KNeighborsClassifier

sup_models = [("Rf", RandomForestClassifier()),
              ("Svc", SVC()),
              ("Kne", KNeighborsClassifier())]

reconstructor = conML.reconstruction("conceptual", sup_models)
  5. Finally, define the deconstructor. Its only dependency is the previously defined reconstructor.
deconstructor = conML.deconstruction("conceptual", reconstructor)
  6. The knowledge search operates on blocks of type pandas.DataFrame. It is recommended to pass the blocks with the help of a generator. First, let's load the example dataset. Name the features «0.0.n», where n is the feature number. The T column should contain the timestamps; Sigma and Z should be empty.
import os
import pandas as pd

path = os.path.join(os.path.expanduser("~"), ".conML", "toyset.csv")
columns_names = [f"0.0.{i}" for i in range(1, 353)] + ["T", "Sigma", "Z"]

df = pd.read_table(path, index_col=False, sep=" ", names=columns_names)
df["Z"], df["Sigma"] = "", ""
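
If you prepare your own dataset, it can help to generate and check the column names before the search. The sketch below follows the naming scheme described in this step («0.0.n» features followed by T, Sigma and Z); the helper names are illustrative and not part of conML, and the check reflects only the rule stated here.

```python
import re

# Feature columns follow the scheme "0.0.n", where n is the feature number.
FEATURE_PATTERN = re.compile(r"^0\.0\.[0-9]+$")
META_COLUMNS = ["T", "Sigma", "Z"]


def make_column_names(n_features):
    """Build the feature names 0.0.1 .. 0.0.n followed by T, Sigma and Z."""
    return [f"0.0.{i}" for i in range(1, n_features + 1)] + META_COLUMNS


def find_invalid_names(names):
    """Return the names that are neither a "0.0.n" feature nor T, Sigma or Z."""
    return [
        name for name in names
        if name not in META_COLUMNS and not FEATURE_PATTERN.match(name)
    ]


# Same layout as the example dataset: 352 features plus three meta columns.
names = make_column_names(352)
invalid = find_invalid_names(["0.0.1", "feat2", "T"])
```

Here `names` has 355 entries and `invalid` contains only `"feat2"`. For a DataFrame, you could pass `list(df.columns)` to `find_invalid_names`.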
  7. After loading the example dataset, define a generator that yields blocks of 100 samples.
block_size = 100

def generate_blocks():
    for start in range(0, df.shape[0], block_size):
        yield df.iloc[start:start+block_size]
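
Note that the last block is shorter whenever the number of rows is not a multiple of the block size. The sketch below shows the same slicing pattern as a reusable function that takes the data and the block size as arguments; it uses a plain list as a stand-in so that it runs without pandas, and the name `iter_blocks` is illustrative.

```python
def iter_blocks(rows, block_size):
    """Yield consecutive slices of `rows` with at most `block_size` items each."""
    if block_size <= 0:
        raise ValueError("block_size must be positive")
    for start in range(0, len(rows), block_size):
        # Slicing past the end is safe; the final block is simply shorter.
        yield rows[start:start + block_size]


# 250 rows in blocks of 100 give two full blocks and one partial block.
sizes = [len(block) for block in iter_blocks(list(range(250)), 100)]
print(sizes)  # [100, 100, 50]
```

For a DataFrame, the same loop works with `df.shape[0]` as the length and `df.iloc[start:start + block_size]` as the slice, as in the generator above.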

2. Starting the knowledge search

After defining the individual components and the block generator, you can start the knowledge search. Use a context manager to get a KnowledgeSearcher object. After each block is processed, the database and the unused fraction of the block are returned. Save them in lists for later analysis. To track the knowledge search, pass True to the stdout parameter.

components = (constructor, selector, reconstructor, deconstructor)
dbs, haldes = [], []

with conML.knowledge_searcher(*components, stdout=True) as searcher:
    for block in generate_blocks():
        db, halde = searcher.search(block)
        dbs.append(db)
        haldes.append(halde)
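
For the later analysis, one simple measure is the share of each block that ended up unused. The sketch below assumes only that a searcher exposes `search(block)` returning a pair, as in the loop above, and that blocks and the unused part support `len()`. `DummySearcher` is a hypothetical stand-in so that the example runs without conML; its rule (odd numbers are left unused) says nothing about how conML decides.

```python
class DummySearcher:
    """Hypothetical stand-in for a searcher: keeps even numbers, leaves odd ones."""

    def search(self, block):
        db = [x for x in block if x % 2 == 0]
        halde = [x for x in block if x % 2 != 0]
        return db, halde


def unused_fractions(searcher, blocks):
    """Return, per block, the share of samples that the search left unused."""
    fractions = []
    for block in blocks:
        _db, halde = searcher.search(block)
        if len(block) == 0:
            # Avoid dividing by zero for empty blocks.
            fractions.append(0.0)
            continue
        fractions.append(len(halde) / len(block))
    return fractions


blocks = [[1, 2, 3, 4], [2, 4, 6], [1, 3]]
fractions = unused_fractions(DummySearcher(), blocks)
print(fractions)  # [0.5, 0.0, 1.0]
```

With the lists collected above, the same ratio could be computed from each `halde` and its block, assuming both support `len()` as a pandas.DataFrame does.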

3. Saving the knowledge database

Now that you have successfully completed the knowledge search, save the database to your hard drive.

home_path = os.path.expanduser("~")
db.save(home_path)