Apt repository mirroring with Docker
Why mirror a distribution's apt repository?
Mirroring an apt repository with Docker can be helpful for several reasons:
- It lets you build your containers faster, since packages are fetched over a local connection
- Debian repositories are sometimes unstable, so a local mirror can stabilize your CI
- You control the lifetime of the repositories, so you can be sure that your very old apps will still build over time, even if the distribution owner decides to remove a release from its repositories
Diving into the code
A simple Apache instance can serve the repository to your containers.
Here is the code for this simple Dockerfile:
    FROM httpd:2.4.48

    # install required packages
    # (rsync is used by refresh.sh and ps, from procps, by health.sh;
    # they are installed explicitly in case the base image lacks them)
    RUN apt-get update && apt-get install -y wget rsync procps

    # copy scripts
    RUN echo "**** Copy scripts"
    COPY ./refresh.sh /root/refresh.sh
    RUN chmod 755 /root/refresh.sh
    COPY ./health.sh /root/health.sh
    RUN chmod 755 /root/health.sh
    COPY ./entrypoint.sh /root/entrypoint.sh
    RUN chmod 755 /root/entrypoint.sh

    # remove http index
    RUN rm -f /usr/local/apache2/htdocs/index.html

    # entrypoint: start update script & apache
    ENTRYPOINT /root/entrypoint.sh
As you can see, nothing hard: a simple HTTP server, with wget to download the official repositories and a few scripts to manage the mirroring.
Let's look at entrypoint.sh:
    #!/bin/bash

    cd /root

    # launch repo refresh asynchronously
    nohup ./refresh.sh &
    PID=$!

    # disable health route if the sync process fails
    nohup ./health.sh "$PID" &

    # serve http
    httpd-foreground
Here again, nothing scary. We launch the refresh.sh script in the background, get its PID, and launch the health.sh script to monitor the refresh script.
Here is the health script:
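    #!/bin/bash

    # wait as long as the refresh process (PID passed as $1) is alive
    while ps -p "$1" > /dev/null; do
        sleep 1
    done

    # the refresh process is gone: remove the health marker
    rm -f /usr/local/apache2/htdocs/healthz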
There is little more to say: if the refresh script stops, we remove the file that tells the orchestrator (such as Kubernetes) that our pod is healthy, which lets you configure alerts to handle the problem.
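The same watcher can be written without ps, which may be handy on minimal images. This is only a sketch: watch_pid is a hypothetical helper name, and it relies on "kill -0", which sends no signal and only tests whether the process still exists.

```shell
#!/bin/bash
# Hypothetical helper (not part of the original scripts):
# block until process $1 is gone, then remove the marker file $2.
watch_pid() {
    local pid=$1 marker=$2
    # kill -0 sends no signal; it only checks that the process exists
    while kill -0 "$pid" 2>/dev/null; do
        sleep 1
    done
    rm -f "$marker"
}
```

It would be called like health.sh, for example: watch_pid "$PID" /usr/local/apache2/htdocs/healthz &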
The core of the system: downloading and refreshing repositories
The last thing missing from this stack is the repository data.
We must download the repositories we need and refresh them periodically.
In my case, I have an old containerized application based on Ubuntu 12.04. This release is now unmaintained and has been archived in a legacy repository by Canonical. Nothing tells me that they will still keep this legacy repository online tomorrow. The other thing to take into account is that these repositories are now static. So it is more reliable to download this repository separately, only once, into a persistent volume.
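The "only once" part boils down to a marker file that is created after a successful download and checked before starting a new one. A minimal sketch of the idea, where run_once is a hypothetical helper name and the marker path is whatever file you choose on the persistent volume:

```shell
#!/bin/bash
# Hypothetical helper: run a command only if the marker file is absent,
# and create the marker only when the command succeeds.
run_once() {
    local marker=$1
    shift
    if [ -f "$marker" ]; then
        echo "already done: $marker"
        return 0
    fi
    # the marker is written only on success, so a failed run is retried next time
    "$@" && touch "$marker"
}
```

For example: run_once /root/old-releases.ubuntu.com/done wget --mirror ... (with the wget options of your choice).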
Once this persistent part has finished downloading, we can download the other repositories in parallel: for now, the Debian and Ubuntu repositories.
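Running downloads in parallel in a shell script means starting each one in the background and then waiting for all of them. A small sketch of that pattern, assuming a hypothetical run_parallel helper that takes one command line per argument; unlike a bare "wait", it also reports whether any job failed:

```shell
#!/bin/bash
# Hypothetical helper: start every command line given as an argument in the
# background, wait for all of them, and return 1 if any of them failed.
run_parallel() {
    local pids="" pid failed=0 cmd
    for cmd in "$@"; do
        sh -c "$cmd" &
        pids="$pids $!"
    done
    # wait on each PID individually to collect its exit status
    for pid in $pids; do
        wait "$pid" || failed=1
    done
    return "$failed"
}
```

This sketch does not retry failed jobs; the retry loop around wget would go inside each command.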
Finally, we use "mv" to bring the new mirror online as fast as possible.
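The swap can be isolated in a small function. This is a sketch under assumptions: swap_in is a hypothetical name, and the ".old" suffix for the previous copy is my own choice. Note that mv is only a quick rename when the staging and served directories are on the same filesystem; across filesystems it falls back to copying.

```shell
#!/bin/bash
# Hypothetical helper: replace the served directory $2 with the freshly
# downloaded directory $1, so the mirror is only offline between two renames.
swap_in() {
    local staged=$1 live=$2 old="${2}.old"
    [ -d "$staged" ] || return 1      # nothing to bring online
    rm -rf "$old"
    if [ -e "$live" ]; then
        mv "$live" "$old" || return 1 # move the previous copy out of the way
    fi
    mv "$staged" "$live" || return 1  # bring the new copy online
    rm -rf "$old"                     # slow cleanup happens after the swap
}
```

For example: swap_in /root/archive.ubuntu.com/ubuntu /usr/local/apache2/htdocs/ubuntu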
The only thing that remains is to tell the orchestrator that the pod is ready to serve by creating the "healthz" file, then wait one month before refreshing.
Putting it all together:
    #!/bin/bash

    # sync every month
    while true; do
        # tell that a sync is in progress
        echo "active" > /usr/local/apache2/htdocs/syncing

        echo "**** Mirror unmaintained Ubuntu releases"
        # only if not downloaded before: unmaintained repos are static, so a
        # persistent volume on /root/old-releases.ubuntu.com is recommended
        if [ -f /root/old-releases.ubuntu.com/done ]; then
            echo "already downloaded"
        else
            cd /root && \
            while true; do
                wget \
                    --continue \
                    --mirror \
                    --no-verbose \
                    --no-parent \
                    --exclude-directories=/old-images/ubuntu/.temp \
                    --reject "current" \
                    --reject "pxelinux.cfg" \
                    ftp://old-releases.ubuntu.com/old-images/ubuntu && \
                touch /root/old-releases.ubuntu.com/done && break
            done
        fi

        # merge into the maintained Ubuntu repo in order to serve all releases from one place
        echo "syncing"
        cd /root/old-releases.ubuntu.com/old-images && rsync -rt ubuntu /root/archive.ubuntu.com

        echo "**** Mirror maintained Ubuntu releases"
        (
            cd /root && \
            while true; do
                wget \
                    --continue \
                    --mirror \
                    --no-verbose \
                    --no-parent \
                    --exclude-directories=/ubuntu/.temp \
                    --reject "current" \
                    --reject "pxelinux.cfg" \
                    ftp://archive.ubuntu.com/ubuntu/ && \
                break
            done
        ) &

        echo "**** Mirror Debian releases"
        (
            cd /root && \
            while true; do
                wget \
                    --continue \
                    --mirror \
                    --no-verbose \
                    --no-parent \
                    --exclude-directories=/debian/.temp \
                    --reject "current" \
                    --reject "pxelinux.cfg" \
                    ftp://deb.debian.org/debian/ && \
                break
            done
        ) &

        # wait for both background downloads to finish
        wait

        echo "**** Move mirrors to http server"
        rm -rf /root/old || echo "Old not present"
        mv /usr/local/apache2/htdocs/ubuntu /root/old
        mv /root/archive.ubuntu.com/ubuntu /usr/local/apache2/htdocs/ubuntu
        rm -rf /root/old || echo "Old not present"
        mv /usr/local/apache2/htdocs/debian /root/old
        mv /root/deb.debian.org/debian /usr/local/apache2/htdocs/debian

        echo "healthy"
        echo "ok" > /usr/local/apache2/htdocs/healthz

        # tell that the sync is finished
        echo "inactive" > /usr/local/apache2/htdocs/syncing

        # wait one month
        sleep 2592000
    done
Let's test the repo
Just build this small Dockerfile to test (replace the IP and port with yours):
    FROM ubuntu:20.04

    RUN echo "deb http://192.168.1.10:7000/ubuntu/ focal main restricted universe multiverse" > /etc/apt/sources.list
    RUN echo "deb http://192.168.1.10:7000/ubuntu/ focal-updates main restricted universe multiverse" >> /etc/apt/sources.list
    RUN echo "deb http://192.168.1.10:7000/ubuntu/ focal-security main restricted universe multiverse" >> /etc/apt/sources.list
    RUN echo "deb http://192.168.1.10:7000/ubuntu/ focal-backports main restricted universe multiverse" >> /etc/apt/sources.list
    RUN echo "deb http://192.168.1.10:7000/ubuntu/ focal partner" >> /etc/apt/sources.list

    RUN apt-get update && apt-get install -y vim && sleep 20
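If you test against several releases, the repeated lines can be generated instead of typed. A small sketch, assuming a hypothetical mirror_sources helper; it only covers the four standard pockets with the usual components and leaves out the partner entry:

```shell
#!/bin/bash
# Hypothetical helper: print sources.list entries for an Ubuntu mirror.
# $1 = base URL of the mirror (no trailing slash), $2 = release codename
mirror_sources() {
    local base=$1 codename=$2 pocket
    for pocket in "" -updates -security -backports; do
        echo "deb ${base}/ubuntu/ ${codename}${pocket} main restricted universe multiverse"
    done
}
```

For example: mirror_sources http://192.168.1.10:7000 focal > /etc/apt/sources.list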
Get it on GitHub
You can find all the sources at https://github.com/sebk69/small-repo-mirror