Brett 2022-07-18 10:48:08 +02:00
parent 3522bf54c3
commit e9579b254f
5 changed files with 210 additions and 29 deletions


@@ -1,37 +1,181 @@
# Docker Compose setup for CKAN
* [Overview](#overview)
* [Quick start](#quick-start)
* [Development mode](#development-mode)
* [Create an extension](#create-an-extension)
* [Running the debugger (pdb / ipdb)](#running-the-debugger-pdb--ipdb)
* [CKAN images](#ckan-images)
* [Extending the base images](#extending-the-base-images)
* [Applying patches](#applying-patches)
* [Known Issues](#known-issues)
* [License](#license)
## Overview
This is a set of Docker images and configuration files to run a CKAN site.
| CKAN version | Docker tag production | Docker tag development | Notes |
| --- | --- | --- | --- |
| 2.7 | `openknowledge/ckan-base:2.7` | `openknowledge/ckan-dev:2.7` | |
| 2.8 | `openknowledge/ckan-base:2.8` | `openknowledge/ckan-dev:2.8` | |
| 2.9 | `openknowledge/ckan-base:2.9` | `openknowledge/ckan-dev:2.9` | If you need Python 2 images use the `2.9-py2` tags (not recommended) |
| master | `openknowledge/ckan-base:master` | `openknowledge/ckan-dev:master` | The `master` images are updated daily so they might be slightly out of date |
It includes the following images, all based on [Alpine Linux](https://alpinelinux.org/):
* CKAN: modified from keitaro/ckan (see [CKAN Images](#ckan-images) for more details). File uploads are stored in a named volume.
* DataPusher: modified from keitaro/datapusher
* PostgreSQL: Official PostgreSQL image. Database files are stored in a named volume.
* Solr: CKAN's [pre-configured Solr image](https://github.com/ckan/ckan-solr). Index data is stored in a named volume.
* Redis: standard Redis image
The site is configured via environment variables (the base CKAN image loads [ckanext-envvars](https://github.com/okfn/ckanext-envvars)), which you can set in the `.env` file.
## Quick start
Copy the included `.env.example` file to `.env` and modify it to suit your needs.
Using the default values in the `.env.example` file will get you a working CKAN instance. A sysadmin user is created by default with the values defined in `CKAN_SYSADMIN_NAME` and `CKAN_SYSADMIN_PASSWORD` (`ckan_admin` and `test1234` by default). I shouldn't be telling you this but obviously don't run any public CKAN instance with the default settings.
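As a rough illustration of the step above, the following sketch creates `.env` from `.env.example` and replaces the default sysadmin password. The helper name `init_env` and its arguments are made up for this example and are not part of the repo; it also assumes the password is stored on a single `CKAN_SYSADMIN_PASSWORD=` line.

```bash
#!/bin/bash
# Hypothetical helper (not part of this repo): create .env from .env.example
# and replace the default sysadmin password.
# Assumption: the password lives on a single CKAN_SYSADMIN_PASSWORD=... line.
init_env() {
  project_dir="$1"
  new_password="$2"
  env_file="$project_dir/.env"

  # Never overwrite an existing .env; only copy the example when it is missing.
  [ -f "$env_file" ] || cp "$project_dir/.env.example" "$env_file" || return 1

  # Drop the old password line and append the new one, so no characters in
  # the password need escaping (as they would in a sed replacement).
  grep -v '^CKAN_SYSADMIN_PASSWORD=' "$env_file" > "$env_file.tmp"
  printf 'CKAN_SYSADMIN_PASSWORD=%s\n' "$new_password" >> "$env_file.tmp"
  mv "$env_file.tmp" "$env_file"
}
```

It could be called as `init_env . 'a-long-random-password'` from the folder that contains `.env.example`, before building the images.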
To build the images:

    docker-compose build

To start the containers:

    docker-compose up

## Development mode
To develop local extensions use the `docker-compose.dev.yml` file:
To build the images:

    docker-compose -f docker-compose.dev.yml build

To start the containers:

    docker-compose -f docker-compose.dev.yml up

See [CKAN Images](#ckan-images) for more details of what happens when using development mode.
### Create an extension
You can use the paster template in much the same way as in a source install, only executing the command inside the CKAN container and setting the mounted `src/` folder as output:

    docker-compose -f docker-compose.dev.yml exec ckan-dev /bin/bash -c "paster --plugin=ckan create -t ckanext ckanext-myext -o /srv/app/src_extensions"

From CKAN 2.9 onwards, the `paster` command used for common CKAN administration tasks has been replaced with the `ckan` command. You can create an extension in the same way as in previous versions, executing the command inside the CKAN container and setting the mounted `src/` folder as output:

    docker-compose -f docker-compose.dev.yml exec ckan-dev /bin/bash -c "ckan generate extension --output-dir /srv/app/src_extensions"

The new extension will be created in the `src/` folder. You might need to change the owner of its folder to have the appropriate permissions.
### Running the debugger (pdb / ipdb)
To run a container and be able to add a breakpoint with `pdb` or `ipdb`, run the `ckan-dev` container with the `--service-ports` option:

    docker-compose -f docker-compose.dev.yml run --service-ports ckan-dev

This will start a new container, displaying the standard output in your terminal. If you add a breakpoint in a source file in the `src` folder (`import pdb; pdb.set_trace()`) you will be able to inspect it in this terminal next time the code is executed.
## CKAN images
```
+-------------------------+                +----------+
|                         |                |          |
| openknowledge/ckan-base +----------------> ckan     | (production)
|                         |                |          |
+-----------+-------------+                +----------+
            |
            |
+-----------v------------+                 +----------+
|                        |                 |          |
| openknowledge/ckan-dev +-----------------> ckan     | (development)
|                        |                 |          |
+------------------------+                 +----------+
```
The Docker images used to build your CKAN project are located in the `ckan/` folder. There are two Docker files:
* `Dockerfile`: this is based on `openknowledge/ckan-base` (with the `Dockerfile` in the `/ckan-base/<version>` folder), an image with CKAN and all its dependencies, properly configured and running on [uWSGI](https://uwsgi-docs.readthedocs.io/en/latest/) (production setup)
* `Dockerfile.dev`: this is based on `openknowledge/ckan-dev` (with the `Dockerfile` in the `/ckan-dev/<version>` folder), which extends `openknowledge/ckan-base` to include:
  * Any extension cloned into the `src` folder will be installed in the CKAN container when booting up Docker Compose (`docker-compose up`). This includes installing any requirements listed in a `requirements.txt` (or `pip-requirements.txt`) file and running `python setup.py develop`.
  * The CKAN image used will include the development requirements needed to run the tests.
  * CKAN will be started on the paster development server, with the `--reload` option to watch changes in the extension files.
  * Make sure to add the local plugins to the `CKAN__PLUGINS` env var in the `.env` file.
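The last point can be scripted if you add plugins often. This is a minimal sketch, assuming `CKAN__PLUGINS` is an unquoted, space-separated list on a single line of `.env`; the helper name `add_plugin` and the plugin name `myext` are invented for illustration.

```bash
#!/bin/bash
# Hypothetical helper (not part of this repo): append a plugin name to the
# CKAN__PLUGINS line of an env file.
# Assumption: the value is an unquoted, space-separated list on one line.
add_plugin() {
  env_file="$1"
  plugin="$2"

  # Rewrite only the CKAN__PLUGINS line; every other line is copied as is.
  awk -v plugin="$plugin" '
    /^CKAN__PLUGINS=/ { $0 = $0 " " plugin }
    { print }
  ' "$env_file" > "$env_file.tmp" && mv "$env_file.tmp" "$env_file"
}
```

For example, `add_plugin .env myext` would turn `CKAN__PLUGINS=envvars image_view` into `CKAN__PLUGINS=envvars image_view myext`. The name is always appended at the end, so edit the line by hand if the order of your plugins matters.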
From these two base images you can build your own customized image tailored to your project, installing any extensions and extra requirements needed.
### Extending the base images
To perform extra initialization steps you can add scripts to your custom images and copy them to the `/docker-entrypoint.d` folder (The folder should be created for you when you build the image). Any `*.sh` and `*.py` file in that folder will be executed just after the main initialization script ([`prerun.py`](https://github.com/okfn/docker-ckan/blob/master/ckan-base/setup/prerun.py)) is executed and just before the web server and supervisor processes are started.
For instance, consider the following custom image:
```
ckan
├── docker-entrypoint.d
│   └── setup_validation.sh
├── Dockerfile
└── Dockerfile.dev
```
We want to install an extension like [ckanext-validation](https://github.com/frictionlessdata/ckanext-validation) that needs to create database tables at startup. We create a `setup_validation.sh` script in a `docker-entrypoint.d` folder with the necessary commands:
```bash
#!/bin/bash
# Create DB tables if not there
paster --plugin=ckanext-validation validation init-db -c $CKAN_INI
```
And then in our `Dockerfile` we install the extension and copy the initialization scripts:
```Dockerfile
FROM openknowledge/ckan-dev:2.9
RUN pip install -e git+https://github.com/frictionlessdata/ckanext-validation.git#egg=ckanext-validation && \
    pip install -r https://raw.githubusercontent.com/frictionlessdata/ckanext-validation/master/requirements.txt
COPY docker-entrypoint.d/* /docker-entrypoint.d/
```
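To make the startup behaviour described in this section more concrete, here is a simplified sketch of a loop that runs every `*.sh` and `*.py` file found in `/docker-entrypoint.d`. It is an illustration only, not the actual entrypoint script from the base image: the function name `run_init_scripts` is hypothetical, and stopping at the first failing script is an assumption made for this example.

```bash
#!/bin/bash
# Simplified illustration (not the real entrypoint of the base image).
# Runs *.sh and *.py files from an init folder in glob (alphabetical) order.
run_init_scripts() {
  init_dir="${1:-/docker-entrypoint.d}"

  # Nothing to do when the folder does not exist.
  [ -d "$init_dir" ] || return 0

  for f in "$init_dir"/*; do
    [ -e "$f" ] || continue   # empty folder: the glob did not match anything
    case "$f" in
      *.sh) echo "Running init file $f"; bash "$f" || return 1 ;;
      *.py) echo "Running init file $f"; python "$f" || return 1 ;;
      *)    echo "Ignoring $f (not a .sh or .py file)" ;;
    esac
  done
}
```

With the example image above, `run_init_scripts /docker-entrypoint.d` would run `setup_validation.sh` once, before the web server starts.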
### Applying patches
When building your project-specific CKAN images (the ones defined in the `ckan/` folder), you can apply patches
to CKAN core or any of the built extensions. To do so, create a folder inside `ckan/patches` with the name of the
package to patch (i.e. `ckan` or `ckanext-??`). Inside you can place patch files that will be applied when building
the images. The patches will be applied in alphabetical order, so you can prefix them sequentially if necessary.
For instance, check the following example image folder:
```
ckan
├── patches
│   ├── ckan
│   │   ├── 01_datasets_per_page.patch
│   │   ├── 02_groups_per_page.patch
│   │   └── 03_or_filters.patch
│   └── ckanext-harvest
│       └── 01_resubmit_objects.patch
├── Dockerfile
└── Dockerfile.dev
```
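The ordering rule can be sketched in a few lines of shell. This is not the code used in the image build: the function names `patch_plan` and `apply_patches` are hypothetical, and the use of `patch -p1` from the root of each package's source folder is an assumption about how the patch files were generated.

```bash
#!/bin/bash
# Illustrative sketch only (not the build code of the images).
# Print "<package> <patch file>" pairs in the order they would be applied.
patch_plan() {
  patches_dir="$1"
  for pkg_dir in "$patches_dir"/*; do
    [ -d "$pkg_dir" ] || continue
    pkg=$(basename "$pkg_dir")
    # Shell globs expand in sorted order, which gives the alphabetical order.
    for patch_file in "$pkg_dir"/*.patch; do
      [ -f "$patch_file" ] || continue
      printf '%s %s\n' "$pkg" "$(basename "$patch_file")"
    done
  done
}

# Apply every planned patch inside <src_dir>/<package>.
# Assumptions: patches_dir is an absolute path and patches were made with -p1.
apply_patches() {
  patches_dir="$1"
  src_dir="$2"
  patch_plan "$patches_dir" | while read -r pkg patch_file; do
    echo "Applying $patch_file to $src_dir/$pkg"
    (cd "$src_dir/$pkg" && patch -p1 < "$patches_dir/$pkg/$patch_file") || return 1
  done
}
```

Running `patch_plan ckan/patches` on the example folder lists the three `ckan` patches in numeric prefix order, followed by the `ckanext-harvest` one.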
## Known Issues
* Running the tests: Running the tests for CKAN or an extension inside the container will delete your current database. We need to patch CKAN core in our image to work around that.


@@ -0,0 +1,37 @@
# (from) July 7 2022
# This repo will be actively worked on from now.
# This file will be my ToDo list of things to take care of
ckan/ckan-docker-base: For the base images Dockerfiles (prod and dev) and related scripts
ckan/ckan-docker: For the project-oriented image template (prod and dev). Patching only done in Dev ### This repo!
All the other images should live in separate repos
1. Solr - use ckan-solr (https://github.com/ckan/ckan-solr)
2. PostgreSQL - use current method (base image: postgres:12-alpine from DockerHub, enhanced in a Dockerfile)
### This may change to be more like Solr though
3. Redis - use current method (DockerHub image: redis:${REDIS_VERSION} specified as a compose service in the compose file)
latest image to be used is redis:6
4. nginx - base image: nginx:1.19.8-alpine from DockerHub, enhanced in a Dockerfile
5. DataPusher - built from the actual datapusher repo (https://github.com/ckan/datapusher)
6. CKAN - built from the ckan/ckan-base:2.9.5 base image (which is built from the ckan/ckan-docker-base repo)
7. CKAN Worker - add new (ckan worker) container in the compose setup
Versions 2.9 and 2.10 (when it's out) only. Plan the repo layout for having multiple versions - OKFN could be used as an example
Go through all the new changes in the current repo and use those for the new repo if they make sense
- Francesco's PR https://github.com/ckan/ckan/pull/4635 which is a beauty!
- use FROM ubuntu:focal for ckan
- Health Checks https://github.com/ckan/ckan/pull/6812
- Restarts https://github.com/ckan/ckan/pull/6569
- Make sure ARGs are used if they are added to compose file
- Check out Florian's docs https://github.com/dbca-wa/ckan/blob/dbca2022/doc/maintaining/installing/install-from-docker-compose.rst
- Check out Florian's repo https://github.com/dbca-wa/ckan/tree/dbca2022
- Documentation to be re-done from scratch...anything that could be useful can be mentioned here eg: local storage for ckan.ini
- Had to update the prerun.py script as it was failing on check_solr_connection
ToDo (workarounds to fix)
1. nginx - what caching should I implement?
2. DataPusher - needed to use a custom requirements.txt (see https://github.com/ckan/datapusher/pull/251)