franzi - schranzi · Paul Kuehnel · 595807db
--- a/.gitlab-ci.yml
+++ b/.gitlab-ci.yml
+image: python:3.9
+# Change pip's cache directory to be inside the project directory since we can
+# only cache local items.
+variables:
+  PIP_CACHE_DIR: "$CI_PROJECT_DIR/.cache/pip"
+# Pip's cache doesn't store the python packages
+# https://pip.pypa.io/en/stable/topics/caching/
+#
+# If you want to also cache the installed packages, you have to install
+# them in a virtualenv and cache it as well.
+cache:
+  paths:
+    - .cache/pip
+    - venv/
+before_script:
+  - python --version ; pip --version  # For debugging
+  - pip install virtualenv
+  - virtualenv venv
+  - source venv/bin/activate
+#test:
+#  script:
+#    - pip install ruff tox  # you can also use tox
+#    - pip install --editable ".[test]"
+#    - tox -e py,ruff
+test:
+  script:
+    - pip install -r requirements.txt
+    - python test.py
+  #artifacts:
+  #  paths:
+  #    - build/*
+pages:
+  script:
+    - pip install sphinx sphinx-rtd-theme
+    - cd docs
+    - make html
+    - mv build/html/ ../public/
+  artifacts:
+    paths:
+      - public
+  rules:
+    - if: $CI_COMMIT_BRANCH == $CI_DEFAULT_BRANCH
+#deploy:
+#  stage: deploy
+#  script: echo "Define your deployment script!"
+#  environment: production
--- a/docs/source/Notizen.md
+++ b/docs/source/Notizen.md
 # Big Data Lab Course: Autoencoder
 ## 0. Test Data Frame
-    * get familiar with the loom file format
+    * get familiar with the loom file format (use loompy, read the loom file and build a pandas data frame)
    * check what a representative pandas data frame could look like
    * create one for testing
 ### 0.1 Datenformat
@@ -9,10 +9,10 @@
    * every row is a gene, which can be expressed or not
 ## 1. Data-Access
-1. Get Data from API 
+    1. Get Data from API 
-2. Get it into a "good" format
+    2. Get it into a "good" format
-    -> what do we need?
+        -> what do we need?
-### 1.2. Loom-Files
+# 1.1. Loom-Files
    * Idea: keep the metadata via an ID, take the data without metadata and feed it into the autoencoder.
    * Idea: extract pandas data frames for the autoencoder,
            so each data frame = 1 input for the encoder.
@@ -26,5 +26,16 @@
 ## 2. Auto Encoder 
+* read up on the topic
+* which libraries?
+* scientific computing resources: how do we use them?
 ## 3. Visualisation
+Goal: visualise the clusters
+* latent space
+* Idea: autoencoder output, e.g. 50 dimensions
+    * apply dimensionality reduction algorithms to it
+        * e.g. t-SNE and UMAP; which one makes sense, and how many dimensions do we want to reduce to? Probably 2
+        * then run the cluster analysis on the 2-dimensional result?
+        * which one, e.g. k-means?
+        * [Document on clustering algorithms](https://www.kde.cs.uni-kassel.de/wp-content/uploads/ws/LLWA03/fgml/final/Kirchner.pdf)
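
The visualisation idea in section 3 of the notes (a roughly 50-dimensional latent space, reduced to 2 dimensions, then clustered) could be sketched as follows. This is only an illustration under assumptions: no autoencoder exists in this diff, so the latent vectors are random stand-in data, scikit-learn is assumed as the library, t-SNE is picked over UMAP, k-means is picked as the clustering algorithm, and the helper names `make_fake_latent` and `embed_and_cluster` are hypothetical.

```python
import numpy as np
from sklearn.cluster import KMeans
from sklearn.manifold import TSNE


def make_fake_latent(n_per_cluster=50, n_dims=50, n_clusters=3, seed=0):
    """Stand-in for the autoencoder output: Gaussian blobs in n_dims dimensions.

    Assumption: the real latent vectors would come from the encoder instead.
    """
    rng = np.random.default_rng(seed)
    centers = rng.normal(scale=10.0, size=(n_clusters, n_dims))
    blobs = [c + rng.normal(size=(n_per_cluster, n_dims)) for c in centers]
    return np.vstack(blobs)


def embed_and_cluster(latent, n_clusters=3, seed=0):
    """Reduce the latent vectors to 2 dimensions with t-SNE, then run k-means there."""
    embedding = TSNE(
        n_components=2, perplexity=30.0, random_state=seed
    ).fit_transform(latent)
    labels = KMeans(
        n_clusters=n_clusters, n_init=10, random_state=seed
    ).fit_predict(embedding)
    return embedding, labels


latent = make_fake_latent()
embedding, labels = embed_and_cluster(latent)
print(embedding.shape)      # one 2-D point per input row
print(np.bincount(labels))  # number of points assigned to each cluster
```

Clustering on the 2-dimensional embedding matches the open question in the notes, but t-SNE does not preserve global distances, so clustering directly on the latent vectors and using the embedding only for plotting may be worth comparing.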