Setting up a blog with Pelican and GitHub Pages
I describe how I set up my static blog/website in Python with Pelican, pandoc, Docker, Dockerhub, GitHub Pages, and Travis CI.
Here is the workflow I wanted to have:
- I write new content in Markdown files.
- I commit and push the sources to my GitHub repository.
- That's it. The website is automatically updated after two minutes, thanks to Travis CI and Docker. My CV is also automatically converted from Markdown to PDF via LaTeX.
Setting this up was not straightforward and it did require significant upfront investment.
Creating a GitHub repo for your website
I assume you're creating your personal webpage. You need to create a yourname.github.io repo where yourname is your GitHub login. You can also use the same method for an organization page, with minor adjustments. My repo is here.
The master branch will contain the automatically generated HTML content. Only Travis CI will write to this branch with a force push. The sources will be in an orphan branch named sources.
Clone your repo. Let's assume your local path is /home/yourname/git/yourname.github.io/.
Setting up Pelican
Now create your website locally with Pelican. If you're starting from scratch this is not the simplest step! You can refer to Pelican's documentation.
Here is an excerpt of my repo's file structure:
pelican-plugins/                    # git clone the official pelican-plugins repo
plugins/                            # put your own plugins here
content/                            # put your contents here
    images/                         # images that you use in your posts
    pages/                          # static pages
        about.md
        ...
    2016-01-01-my-blog-post.md      # URL will be http://yourname.github.io/my-blog-post/
    ...
themes/                             # put your themes and templates here
    pure/
        static/
        templates/
output/                             # HTML output generated by Pelican
Dockerfile                          # these two files are for Docker
run_docker.sh
Makefile                            # these files are auto-generated by Pelican
pelicanconf.py
publishconf.py
I write pages and posts in Markdown files within the content/ subdirectory. I can use the Jupyter Notebook to edit Markdown files with the ipymd package. This is convenient when my posts contain a lot of code.
The theme's files (Jinja templates, CSS and JS files) are in themes/pure/ (pure is the name of the CSS framework I'm using). I use a few Pelican plugins, which are in the pelican-plugins/ subdirectory (a cloned repo). I also have a custom plugin in plugins/ (see later in this post).
When Pelican generates the website, the HTML files are saved in the output/ subfolder, which is not tracked by git.
Pelican comes with a tool, pelican-quickstart, that initializes the file structure for a new site. It creates a default pelicanconf.py, a complete Makefile, and a few other things. You put all of your site's parameters in pelicanconf.py. Here is an excerpt of my pelicanconf.py:
THEME = 'themes/pure'
PATH = 'content'
# Extract a post's date from the filename (raw string, so that \d is not
# treated as a string escape):
FILENAME_METADATA = r'(?P<date>\d{4}-\d{2}-\d{2})-(?P<slug>.*)'
STATIC_PATHS = ['images']
EXTRA_PATH_METADATA = {
    'images/favicon.png': {'path': 'favicon.png'},
}
# Markdown extensions:
MD_EXTENSIONS = ['codehilite(css_class=highlight,'
                 'guess_lang=False,linenums=False)',
                 'headerid',
                 'extra']
# Pagination:
DEFAULT_PAGINATION = 5
PAGINATION_PATTERNS = (
    (1, '{base_name}/', '{base_name}/index.html'),
    (2, '{base_name}/page/{number}/', '{base_name}/page/{number}/index.html'),
)
PLUGIN_PATHS = ['pelican-plugins', 'plugins']
# Pelican plugins:
PLUGINS = [
    # These plugins are part of the official `pelican-plugins` repo:
    'render_math',
    'summary',
    'neighbors',
    # This one is a custom plugin:
    'pdf',
]
ARTICLE_URL = '{slug}/'
ARTICLE_SAVE_AS = '{slug}/index.html'
PAGE_URL = '{slug}/'
PAGE_SAVE_AS = '{slug}/index.html'
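To see what FILENAME_METADATA and the URL settings do together, here is a small standalone sketch. It only mimics the idea with Python's re module: describe_post is a hypothetical helper, not part of Pelican, and I'm assuming the pattern is matched against the filename without its extension.

```python
import re

# Same pattern and URL settings as in the pelicanconf.py excerpt above.
FILENAME_METADATA = r'(?P<date>\d{4}-\d{2}-\d{2})-(?P<slug>.*)'
ARTICLE_URL = '{slug}/'
ARTICLE_SAVE_AS = '{slug}/index.html'


def describe_post(filename):
    """Return the date, slug, URL and output path derived from a post filename.

    Hypothetical helper for illustration; Pelican does this internally.
    """
    # Assumption: only the base name (without extension) carries the metadata.
    base_name = filename.rsplit('.', 1)[0]
    match = re.match(FILENAME_METADATA, base_name)
    if match is None:
        raise ValueError('Filename does not match FILENAME_METADATA: ' + filename)
    metadata = match.groupdict()
    return {
        'date': metadata['date'],
        'slug': metadata['slug'],
        'url': ARTICLE_URL.format(**metadata),
        'save_as': ARTICLE_SAVE_AS.format(**metadata),
    }


info = describe_post('2016-01-01-my-blog-post.md')
print(info['date'], info['url'], info['save_as'])
```

With these settings, the date never appears in the URL: it is only used as the post's metadata, and the slug alone decides where the page ends up.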
At this point, develop and configure your site locally until it's ready to be made public. I generally use make html to generate the website locally, make regenerate to have it regenerated automatically while I work on it, and make serve to browse it locally at http://localhost:8000.
The publishconf.py file makes a few adjustments to your pelicanconf.py to make your website ready to be published (mainly specifying the public URL of your website).
Automatically generating a PDF version of my CV
One of the pages of my site contains my CV in Markdown. I wanted to have a PDF version automatically available, using pandoc and LaTeX to convert from Markdown to PDF. I created a quick-and-dirty plugin for this purpose:
# This is in `plugins/pdf/__init__.py`
import os
import tempfile

from pelican import signals

# The pandoc command. The CV is saved in a static `pdfs/` subdirectory.
CMD = ('pandoc {fn} -o content/pdfs/cv.pdf '
       '-V geometry:margin=1in '
       '--template=template.tex')


def generate_pdf(p):
    with tempfile.TemporaryDirectory() as tmpdir:
        print("Generating cv.pdf")
        with open('content/pages/about.md', 'r') as f:
            contents = f.read()
        fn = os.path.join(tmpdir, 'about.md')
        # Remove the page's original front matter.
        contents = contents[contents.index('\n---') + 4:]
        # Add title and author in Markdown front matter.
        contents = ('---\n'
                    'title: Curriculum vitae\n'
                    'author: Cyrille Rossant\n'
                    '---\n\n' +
                    contents)
        with open(fn, 'w') as f:
            f.write(contents)
        os.system(CMD.format(fn=fn))


def register():
    # Create the PDF before generating the site.
    signals.initialized.connect(generate_pdf)
Now, as part of the build process, a content/pdfs/cv.pdf file is automatically generated. This ensures that the PDF is always in sync with that page. This PDF is not tracked by git. It will be automatically generated by Travis CI.
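The least obvious part of the plugin is the slicing that swaps the page's metadata for a pandoc title block. Here is that step isolated as a small function, so it can be tried without pandoc or Pelican. The function name and the sample page are hypothetical, and I'm assuming the page starts with a metadata block closed by a line beginning with ---.

```python
def replace_front_matter(contents, title, author):
    """Swap a page's metadata header for a pandoc title block.

    Hypothetical helper: it assumes the original metadata block is closed
    by a line that starts with '---'.
    """
    # Find the line that closes the original metadata, then skip past it.
    marker = '\n---'
    end = contents.index(marker) + len(marker)
    body = contents[end:]
    header = '---\ntitle: {}\nauthor: {}\n---\n\n'.format(title, author)
    return header + body


# Made-up page content, for illustration only.
page = '---\ntitle: About\nslug: about\n---\n\n# Experience\n\nSome text.\n'
print(replace_front_matter(page, 'Curriculum vitae', 'Your Name'))
```

Note that str.index raises a ValueError if the closing --- line is missing, so a page without metadata would make the build fail rather than silently produce a wrong PDF.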
Setting up Travis CI
Now we're going to set up Travis CI. We'll tell Travis to build the website at every push to the sources branch, and to force push the output to the master branch. This ensures that the website is automatically built and deployed.
Here's my .travis.yml:
language: python
python:
  - "3.5"
sudo: required
services:
  - docker
branches:
  only:
    - sources
env:
  global:
    secure: "xxxxxxxxxxxx"
install:
  - pip install ghp-import
  - git clone https://github.com/getpelican/pelican-plugins
script:
  - make publish github
A few things to note:
- I use Docker to build the website and the PDF but this is optional.
- If you don't use Docker, you'll have to install Pelican and other dependencies to build your website.
- I put an encrypted version of an authentication key to allow Travis to push to the master branch of the repo. Refer to this page to see how to generate and encrypt an authentication key.
- I use the ghp-import tool to push the generated website to the master branch. Note that this tool is destructive: here it will destroy your master branch every time. You will always have a single commit in master with the latest version of your website.
- The build process occurs in make publish github, which is readily provided by the default Makefile. What this command does is:
  - Generate your website in output/.
  - Commit the output/ to the master branch.
  - Force push that branch to GitHub. GitHub Pages takes care of the rest and updates your website automatically at http://yourname.github.io.
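For readers who prefer to see the three deployment steps spelled out, here is a rough Python sketch of what make publish github boils down to. This is not the Makefile itself: the commit message, the origin remote, and the function names are my assumptions, and in my setup the first command is replaced by a Docker run and the push is authenticated with the encrypted key.

```python
import shlex
import subprocess


def deploy_commands(content_dir='content', output_dir='output',
                    settings='publishconf.py', branch='master'):
    """Return the three deployment commands, in order (illustrative only)."""
    return [
        # 1. Generate the website in output/.
        ['pelican', content_dir, '-o', output_dir, '-s', settings],
        # 2. Commit the content of output/ to the target branch.
        #    The commit message is an arbitrary placeholder.
        ['ghp-import', '-m', 'Generate Pelican site', '-b', branch, output_dir],
        # 3. Force push that branch to GitHub (assumes a remote named origin).
        ['git', 'push', '--force', 'origin', branch],
    ]


def deploy(dry_run=True):
    """Print each command, and run it unless dry_run is set."""
    for command in deploy_commands():
        print(' '.join(shlex.quote(arg) for arg in command))
        if not dry_run:
            # check=True stops the deployment at the first failing step.
            subprocess.run(command, check=True)


deploy(dry_run=True)
```

Running it with dry_run=True only prints the commands, which is a safe way to check them before letting anything touch the master branch.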
Setting up Docker
The default Makefile contains the command pelican content/ -o output/ -s publishconf.py to generate your website. However, since I'm using Docker, I've replaced this command with bash run_docker.sh, described below.
The main reason why I'm using Docker here is that installing LaTeX takes a while, and using Docker makes the build process slightly faster on Travis CI. The image is big though (almost 1GB), mainly because of LaTeX, and I'd be happy to find a way to make it smaller. It would make the build process faster.
Using Docker also gives me a bit more control over the dependencies I need. But it certainly makes the setup more complicated. Don't use it if you don't need it.
First, install Docker locally. This is not necessarily straightforward: follow all instructions. Also, create a Dockerhub account, and create a yourname/pelican repository. Dockerhub is like GitHub, but for Docker images.
Then, create a Dockerfile at the root of your repo with the following:
FROM python:3
MAINTAINER yourname <your@email.com>
# Update OS
RUN apt-get update
RUN sed -i 's/# \(.*multiverse$\)/\1/g' /etc/apt/sources.list
RUN apt-get -y upgrade
# Install dependencies
# I need LaTeX and pandoc to generate the CV:
RUN apt-get install make git tex-common texlive pandoc -y
RUN pip install pelican Markdown ghp-import
RUN pip install --upgrade pelican Markdown ghp-import
WORKDIR /site
# Generate the website when running the container:
CMD pelican content/ -o output/ -s publishconf.py
Starting from a Python 3 image, we add LaTeX, pandoc, Pelican, Markdown, and ghp-import, and we generate the website.
When you run a container based on this image, you'll have to mount your repository as a data volume so that the Docker container has access to it.
Build your image with docker build -t yourname/pelican . (note the trailing dot!). This will download the Python 3 image and build an image with your Dockerfile instructions.
The next step is to upload the image to your Dockerhub account with docker login and docker push yourname/pelican. Travis CI will download it and use it to build your website.
Finally, here's a tiny bash script to pull the latest version of the image from Dockerhub and run it to generate the website:
# This is run_docker.sh
# Pull the latest Docker image
docker pull yourname/pelican
# Generate the site with pelican
docker run -t -v $(pwd):/site yourname/pelican
The last line of the script runs our container. The -v $(pwd):/site option mounts the current directory (typically your ~/git/yourname.github.io/ repo) to the /site directory, which is our container's working directory.
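If you'd rather drive Docker from Python than from bash, the same two steps could look like the sketch below. The function names are hypothetical, and the image name and mount point are simply the ones used in this post. Passing the arguments as a list also means that a repository path containing spaces doesn't need any quoting.

```python
import os
import subprocess


def docker_run_command(repo_dir, image='yourname/pelican', mount_point='/site'):
    """Build the docker run command that mounts repo_dir inside the container."""
    # Docker expects an absolute host path on the left-hand side of -v.
    volume = '{}:{}'.format(os.path.abspath(repo_dir), mount_point)
    return ['docker', 'run', '-t', '-v', volume, image]


def build_site(repo_dir='.', image='yourname/pelican'):
    """Pull the latest image, then generate the site in repo_dir/output/."""
    subprocess.run(['docker', 'pull', image], check=True)
    subprocess.run(docker_run_command(repo_dir, image), check=True)


# Only print the command here; call build_site() to actually run Docker.
print(docker_run_command('.'))
```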
Phew, that's it! Now I can edit the Markdown sources, commit and push to GitHub, and the website is built automatically by Travis CI. To sum up, the build process done by Travis CI at every push to sources is as follows:
- Clone the current yourname.github.io repo
- Pull the yourname/pelican image from Dockerhub
- Create and run a container based on this image, with the current directory containing the sources mounted inside the container
- The container, which has Pelican, LaTeX, and pandoc installed, generates the website in output/, including the PDF version of one of the pages
- The output is committed to the master branch via ghp-import
- The master branch is pushed to the GitHub yourname.github.io repo thanks to the authentication token