diff --git a/README.txt b/README.txt index d8e190b..6a380fa 100644 --- a/README.txt +++ b/README.txt @@ -1,37 +1,181 @@ -# (from) July 7 2022 -# This repo will be actively worked on from now. -# This file will be my ToDo list of things to take care of +# Docker Compose setup for CKAN -ckan/ckan-docker-base: For the base images Dockerfiles (prod and dev) and related scripts -ckan/ckan-docker: For the project-oriented image template (prod and dev). Patching only done in Dev ### This repo! -All the other images should live in separate repos +* [Overview](#overview) +* [Quick start](#quick-start) +* [Development mode](#development-mode) + * [Create an extension](#create-an-extension) + * [Running the debugger (pdb / ipdb)](#running-the-debugger-pdb--ipdb) +* [CKAN images](#ckan-images) + * [Extending the base images](#extending-the-base-images) + * [Applying patches](#applying-patches) +* [Known Issues](#known-issues) +* [License](#license) -1. Solr - use ckan-solr (https://github.com/ckan/ckan-solr) -2. PostgreSQL - use current method (base image: postgres:12-alpine from DockerHub, enhanced in a Dockerfile) - ### This may change to be more like Solr though -3. Redis - use current method (DockerHub image: redis:${REDIS_VERSION} specified as a compose service in the compose file) - latest image to used is redis:6 -4. nginx - base image: nginx:1.19.8-alpine from DockerHub, enhanced in a Dockerfile) -5. DataPusher - built from the actual datapusher repo (https://github.com/ckan/datapusher) -6. CKAN - built from the ckan/ckan-base:2.9.5 base image (which is built from the ckan/ckan-docker-base repo) -7. CKAN Worker - add new (ckan worker) container in the compose setup -Versions 2.9 and 2.10 (when it's out) only. 
Plan the repo layout for having multiple versions - OKFN could used as an example +## Overview -Go through all the new changes in the current repo and use those for the new repo if they make sense -- Francesco's PR https://github.com/ckan/ckan/pull/4635 which is a beauty! -- use FROM ubuntu:focal for ckan -- Health Checks https://github.com/ckan/ckan/pull/6812 -- Restarts https://github.com/ckan/ckan/pull/6569 -- Make asure ARGs are used if they are added to compose file -- Check out Florian's docs https://github.com/dbca-wa/ckan/blob/dbca2022/doc/maintaining/installing/install-from-docker-compose.rst -- Check out Florian's repo https://github.com/dbca-wa/ckan/tree/dbca2022 -- Documentation to be re-done from scratch...anything that could be useful can be mentioned here eg: local storage for ckan.ini +This is a set of Docker images and configuration files to run a CKAN site. -- Had to update the prerun.py script as it was failing on check_solr_connection +| CKAN version | Docker tag production | Docker tag development | Notes | +| --- | --- | --- | --- | +| 2.7 | `openknowledge/ckan-base:2.7` | `openknowledge/ckan-dev:2.7` | | +| 2.8 | `openknowledge/ckan-base:2.8` | `openknowledge/ckan-dev:2.8` | | +| 2.9 | `openknowledge/ckan-base:2.9` | `openknowledge/ckan-dev:2.9` | If you need Python 2 images use the `2.9-py2` tags (not recommended) | +| master | `openknowledge/ckan-base:master` | `openknowledge/ckan-dev:master` | The `master` images are updated daily so they might be slightly out of date | -ToDo (workarounds to fix) -1. nginx - what caching should I implement? -2. DataPusher - needed to use a custom requirements.txt (see https://github.com/ckan/datapusher/pull/251) \ No newline at end of file +It includes the following images, all based on [Alpine Linux](https://alpinelinux.org/): + +* CKAN: modified from keitaro/ckan (see [CKAN Images](#ckan-images) for more details). File uploads are stored in a named volume.
+* DataPusher: modified from keitaro/datapusher +* PostgreSQL: Official PostgreSQL image. Database files are stored in a named volume. +* Solr: CKAN's [pre-configured Solr image](https://github.com/ckan/ckan-solr). Index data is stored in a named volume. +* Redis: standard Redis image + +The site is configured via env vars (the base CKAN image loads [ckanext-envvars](https://github.com/okfn/ckanext-envvars)), which you can set in the `.env` file. + +## Quick start + +Copy the included `.env.example` and rename it to `.env` to modify it depending on your own needs. + +Using the default values in the `.env.example` file will get you a working CKAN instance. There is a sysadmin user created by default with the values defined in `CKAN_SYSADMIN_NAME` and `CKAN_SYSADMIN_PASSWORD` (`ckan_admin` and `test1234` by default). I shouldn't be telling you this but obviously don't run any public CKAN instance with the default settings. + +To build the images: + + docker-compose build + +To start the containers: + + docker-compose up + +## Development mode + +To develop local extensions use the `docker-compose.dev.yml` file: + +To build the images: + + docker-compose -f docker-compose.dev.yml build + +To start the containers: + + docker-compose -f docker-compose.dev.yml up + +See [CKAN Images](#ckan-images) for more details of what happens when using development mode. + + +### Create an extension + +You can use the paster template in much the same way as a source install, only executing the command inside the CKAN container and setting the mounted `src/` folder as output: + + docker-compose -f docker-compose.dev.yml exec ckan-dev /bin/bash -c "paster --plugin=ckan create -t ckanext ckanext-myext -o /srv/app/src_extensions" + +From CKAN 2.9 onwards, the `paster` command used for common CKAN administration tasks has been replaced with the `ckan` command.
You can create an extension as in previous versions by executing the command inside the CKAN container and setting the mounted `src/` folder as output: + + docker-compose -f docker-compose.dev.yml exec ckan-dev /bin/bash -c "ckan generate extension --output-dir /srv/app/src_extensions" + +The new extension will be created in the `src/` folder. You might need to change the owner of its folder to have the appropriate permissions. + + +### Running the debugger (pdb / ipdb) + +To run a container and be able to add a breakpoint with `pdb` or `ipdb`, run the `ckan-dev` container with the `--service-ports` option: + + docker-compose -f docker-compose.dev.yml run --service-ports ckan-dev + +This will start a new container, displaying the standard output in your terminal. If you add a breakpoint in a source file in the `src` folder (`import pdb; pdb.set_trace()`) you will be able to inspect it in this terminal next time the code is executed. + + +## CKAN images + +``` + +-------------------------+ +----------+ + | | | | + | openknowledge/ckan-base +----------------> ckan | (production) + | | | | + +-----------+-------------+ +----------+ + | + | + +-----------v------------+ +----------+ + | | | | + | openknowledge/ckan-dev +-----------------> ckan | (development) + | | | | + +------------------------+ +----------+ + + +``` + +The Docker images used to build your CKAN project are located in the `ckan/` folder.
There are two Dockerfiles: + +* `Dockerfile`: this is based on `openknowledge/ckan-base` (with the `Dockerfile` in the `/ckan-base/` folder), an image with CKAN and all its dependencies, properly configured and running on [uWSGI](https://uwsgi-docs.readthedocs.io/en/latest/) (production setup) +* `Dockerfile.dev`: this is based on `openknowledge/ckan-dev` (with the `Dockerfile` in the `/ckan-dev/` folder), which extends `openknowledge/ckan-base` to include: + + * Any extension cloned on the `src` folder will be installed in the CKAN container when booting up Docker Compose (`docker-compose up`). This includes installing any requirements listed in a `requirements.txt` (or `pip-requirements.txt`) file and running `python setup.py develop`. + * The CKAN image used will include the development requirements needed to run the tests. + * CKAN will be started running on the paster development server, with the `--reload` option to watch changes in the extension files. + * Make sure to add the local plugins to the `CKAN__PLUGINS` env var in the `.env` file. + +From these two base images you can build your own customized image tailored to your project, installing any extensions and extra requirements needed. + +### Extending the base images + +To perform extra initialization steps you can add scripts to your custom images and copy them to the `/docker-entrypoint.d` folder (the folder should be created for you when you build the image). Any `*.sh` and `*.py` file in that folder will be executed just after the main initialization script ([`prerun.py`](https://github.com/okfn/docker-ckan/blob/master/ckan-base/setup/prerun.py)) is executed and just before the web server and supervisor processes are started.
+ +For instance, consider the following custom image: + +``` +ckan +├── docker-entrypoint.d +│ └── setup_validation.sh +├── Dockerfile +└── Dockerfile.dev + +``` + +We want to install an extension like [ckanext-validation](https://github.com/frictionlessdata/ckanext-validation) that needs to create database tables at startup. We create a `setup_validation.sh` script in a `docker-entrypoint.d` folder with the necessary commands: + +```bash +#!/bin/bash + +# Create DB tables if not there +paster --plugin=ckanext-validation validation init-db -c $CKAN_INI +``` + +And then in our `Dockerfile` we install the extension and copy the initialization scripts: + +```Dockerfile +FROM openknowledge/ckan-dev:2.9 + +RUN pip install -e git+https://github.com/frictionlessdata/ckanext-validation.git#egg=ckanext-validation && \ + pip install -r https://raw.githubusercontent.com/frictionlessdata/ckanext-validation/master/requirements.txt + +COPY docker-entrypoint.d/* /docker-entrypoint.d/ +``` + +### Applying patches + +When building your project-specific CKAN images (the ones defined in the `ckan/` folder), you can apply patches +to CKAN core or any of the built extensions. To do so, create a folder inside `ckan/patches` with the name of the +package to patch (i.e. `ckan` or `ckanext-??`). Inside you can place patch files that will be applied when building +the images. The patches will be applied in alphabetical order, so you can prefix them sequentially if necessary. + +For instance, check the following example image folder: + +``` +ckan +├── patches +│ ├── ckan +│ │ ├── 01_datasets_per_page.patch +│ │ ├── 02_groups_per_page.patch +│ │ └── 03_or_filters.patch +│ └── ckanext-harvest +│ └── 01_resubmit_objects.patch +├── Dockerfile +└── Dockerfile.dev + +``` + + +## Known Issues + +* Running the tests: Running the tests for CKAN or an extension inside the container will delete your current database. We need to patch CKAN core in our image to work around that.
\ No newline at end of file diff --git a/OLD.README.txt b/auxilliary stuff/OLD.README.txt similarity index 100% rename from OLD.README.txt rename to auxilliary stuff/OLD.README.txt diff --git a/auxilliary stuff/TODO.LIVE.txt b/auxilliary stuff/TODO.LIVE.txt new file mode 100644 index 0000000..d8e190b --- /dev/null +++ b/auxilliary stuff/TODO.LIVE.txt @@ -0,0 +1,37 @@ +# (from) July 7 2022 +# This repo will be actively worked on from now. +# This file will be my ToDo list of things to take care of + +ckan/ckan-docker-base: For the base images Dockerfiles (prod and dev) and related scripts +ckan/ckan-docker: For the project-oriented image template (prod and dev). Patching only done in Dev ### This repo! + +All the other images should live in separate repos + +1. Solr - use ckan-solr (https://github.com/ckan/ckan-solr) +2. PostgreSQL - use current method (base image: postgres:12-alpine from DockerHub, enhanced in a Dockerfile) + ### This may change to be more like Solr though +3. Redis - use current method (DockerHub image: redis:${REDIS_VERSION} specified as a compose service in the compose file) + latest image to used is redis:6 +4. nginx - base image: nginx:1.19.8-alpine from DockerHub, enhanced in a Dockerfile) +5. DataPusher - built from the actual datapusher repo (https://github.com/ckan/datapusher) +6. CKAN - built from the ckan/ckan-base:2.9.5 base image (which is built from the ckan/ckan-docker-base repo) +7. CKAN Worker - add new (ckan worker) container in the compose setup + +Versions 2.9 and 2.10 (when it's out) only. Plan the repo layout for having multiple versions - OKFN could used as an example + +Go through all the new changes in the current repo and use those for the new repo if they make sense +- Francesco's PR https://github.com/ckan/ckan/pull/4635 which is a beauty! 
+- use FROM ubuntu:focal for ckan +- Health Checks https://github.com/ckan/ckan/pull/6812 +- Restarts https://github.com/ckan/ckan/pull/6569 +- Make asure ARGs are used if they are added to compose file +- Check out Florian's docs https://github.com/dbca-wa/ckan/blob/dbca2022/doc/maintaining/installing/install-from-docker-compose.rst +- Check out Florian's repo https://github.com/dbca-wa/ckan/tree/dbca2022 +- Documentation to be re-done from scratch...anything that could be useful can be mentioned here eg: local storage for ckan.ini + +- Had to update the prerun.py script as it was failing on check_solr_connection + +ToDo (workarounds to fix) + +1. nginx - what caching should I implement? +2. DataPusher - needed to use a custom requirements.txt (see https://github.com/ckan/datapusher/pull/251) \ No newline at end of file diff --git a/build-ckan-without-compose.sh b/auxilliary stuff/build-ckan-without-compose.sh similarity index 100% rename from build-ckan-without-compose.sh rename to auxilliary stuff/build-ckan-without-compose.sh diff --git a/environ b/auxilliary stuff/environ similarity index 100% rename from environ rename to auxilliary stuff/environ
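
Note on the "Applying patches" section of the README added in this diff: it says patches are grouped in one folder per package and applied in alphabetical order. The sketch below only illustrates that ordering rule; it is not the image's actual build logic. The function name `plan_patches` is hypothetical, and it assumes plain lexicographic sorting of file names, which the real build script may or may not use.

```python
from pathlib import PurePosixPath


def plan_patches(paths):
    """Group patch files by their package folder and sort each group by name.

    Hypothetical helper: it assumes the layout ckan/patches/<package>/<file>.patch
    described in the README and plain lexicographic ordering of file names.
    """
    plan = {}
    for raw in paths:
        path = PurePosixPath(raw)
        if path.suffix != ".patch":
            continue  # ignore anything that is not a patch file
        plan.setdefault(path.parent.name, []).append(path.name)
    # Sort packages and, within each package, the patch file names.
    return {package: sorted(names) for package, names in sorted(plan.items())}


# Paths taken from the example tree in the README, deliberately shuffled.
example = [
    "ckan/patches/ckan/03_or_filters.patch",
    "ckan/patches/ckanext-harvest/01_resubmit_objects.patch",
    "ckan/patches/ckan/01_datasets_per_page.patch",
    "ckan/patches/ckan/02_groups_per_page.patch",
]

for package, names in plan_patches(example).items():
    print(package, names)
```

Under plain string sorting, a file named `10_b.patch` sorts before `2_a.patch`, which is one reason to use zero-padded prefixes such as `01_`, `02_` as the README's example does.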