@@ -1,4 +1,4 @@
-# Cookiecutter Cookiekaker Data Science Template inspired by @vasinkd and @drivendata
+# Cookiekaker Cookiecutter Data Science Template inspired by @vasinkd and @drivendata
_A not quite logical, and unreasonably standardized, but flexible project structure for doing and sharing data science work at certain motivation and place._
@@ -26,20 +26,9 @@ $ cookiecutter https://github.com/metya/cookiekaker
```
__Features:__
-- May choose python3.5, python3.6, python3.7, python3.8
-
-
- Creation of virtual environment is limited to virtualenv.
- Creation of virtual environment also sets up git and dvc version control and pre-commit hooks
-- Project library renamed from src to project_name, which lets you use the created library on your machine from anywhere
-- Added pipeline folder to store all dvc pipelines there
-- Added data/features folder
-- Added settings.py to illustrate how to use .env file
-- Added an empty notebook "1.0-{{cookiecutter.author_name}}-dvc-pipeline.ipynb" to store all dvc pipeline creation commands and to illustrate that numbering notebooks is a good idea
-- Cleared make_dataset.py since I find it too restrictive and confusing
-- Removed aws sync functions
-- Removed data folder from .gitignore since dvc version control takes care of .gitignore
-- Removed tox.ini since .pre-commit.yaml is enough for me
+- Good structure of folders from SOTA projects
### The resulting directory structure
------------
@@ -54,8 +43,6 @@ The directory structure of your new project looks like this:
│ ├── external <- Data from third party sources.
│ ├── interim <- Intermediate data that has been transformed.
│ ├── processed <- The final, canonical data sets for modeling.
-│ ├── features <- Features may be stored here
-│ ├── inference <- Inference stages may be stored here
│ └── raw <- The original, immutable data dump.
│
├── docs <- A default Sphinx project; see sphinx-doc.org for details
@@ -66,8 +53,6 @@ The directory structure of your new project looks like this:
│ the creator's initials, and a short `-` delimited description, e.g.
│ `1.0-jqp-initial-data-exploration`.
│
-├── references <- Data dictionaries, manuals, and all other explanatory materials.
-│
├── reports <- Generated analysis as HTML, PDF, LaTeX, etc.
│ └── figures <- Generated graphics and figures to be used in reporting
│
@@ -78,17 +63,11 @@ The directory structure of your new project looks like this:
│
├── __init__.py
│
-└── <project_name> <- Source code for use in this project.
+└── nets <- Source code for nets and other stuff used in this project.
├── __init__.py <- Makes {{cookiecutter.repo_name}} a Python module
│
├── settings.py <- illustrates how to use .env file
│
- ├── data <- Scripts to download or generate data
- │ └── make_dataset.py
- │
- ├── features <- Scripts to turn raw data into features for modeling
- │ └── featurize.py
- │
└── models <- Scripts to train models and then use trained models to make
│ predictions
└── train.py