Jekyll - Invalid US-ASCII character '\xE2'
Context
I’ve been using the 2.5-stable branch of the jekyll/minima theme without issues. However, I recently wanted to customize the <head/> section of my site. The master branch provides _includes/custom-head.html for this purpose, but this feature isn’t available in the stable branch.
Attempting the Upgrade
I had two options:
- Submit a pull request to backport the custom head feature to the 2.5-stable branch
- Upgrade from 2.5-stable to master
I chose the second approach and updated my _config.yml to use the master branch (since I was already using a remote theme, it was cheap and easy):
-remote_theme: jekyll/minima@2.5-stable
+remote_theme: jekyll/minima@master
Unfortunately, this broke my local build😭:
Generating...
Remote Theme: Using theme jekyll/minima
Jekyll Feed: Generating feed for posts
Conversion error: Jekyll::Converters::Scss encountered an error while converting 'assets/css/style.scss':
Invalid US-ASCII character "\xE2" on line 255
/usr/local/src/rbenv/versions/3.3.4/lib/ruby/gems/3.3.0/gems/jekyll-sass-converter-1.5.2/lib/jekyll/converters/scss.rb:123:in `rescue in convert': Invalid US-ASCII character "\xE2" on line 255 (Jekyll::Converters::Scss::SyntaxError)
raise SyntaxError, "#{e} on line #{e.sass_line}"
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
Investigation / RCA
ℹ️ This issue only affected my local Jekyll build running in an Ubuntu-based Docker container (following 🎉 Develop GitHub Pages locally in a Ubuntu Docker Container (latest) by Bill Raymond🏆). GitHub Pages handled the master branch upgrade without any problems, which gave me hope for a solution.
After investigating what changed in the master branch, I traced the build failure to a specific pull request:
- https://github.com/jekyll/minima/pull/855
- https://github.com/jekyll/minima/commit/478d99d18540948394c3d359f83b27abbcc325c8
The culprit was in _sass/minima/_layout.scss, specifically these lines containing the • character:
.force-inline {
display: inline;
&::before {
content: "•";
padding-inline: 5px;
}
}
The • character is a bullet symbol with Unicode code point U+2022. Its UTF-8 encoding is the byte sequence 0xE2 0x80 0xA2. The first byte matches the build error message exactly: “Invalid US-ASCII character "\xE2"”. In other words, Jekyll was attempting to parse the file as US-ASCII instead of UTF-8, and 0xE2 is not a valid US-ASCII byte.
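The byte-level claim is easy to check in plain Ruby. The snippet below is a small sketch that is not part of Jekyll or minima (the variable names are mine). It prints the UTF-8 bytes of the bullet and shows that the same bytes are not a valid string once they are labelled as US-ASCII:

```ruby
# The bullet character from _layout.scss, written as a Unicode escape
# so this script does not depend on its own source encoding.
bullet = "\u2022"

# Its UTF-8 encoding is three bytes; the first one is 0xE2.
utf8_bytes = bullet.encode("UTF-8").bytes.map { |b| format("0x%02X", b) }
puts utf8_bytes.join(" ")

# Relabel the same bytes as US-ASCII (no conversion happens here).
# Bytes above 0x7F are not valid US-ASCII, so the string becomes invalid.
as_ascii = bullet.dup.force_encoding("US-ASCII")
puts as_ascii.valid_encoding?
```

This only illustrates why a parser that assumes US-ASCII rejects the file at the first byte of the bullet; it does not reproduce the Sass converter's own code path.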
Research into this issue revealed that the problem stems from the local environment's locale configuration: without a UTF-8 locale, Ruby (and therefore Jekyll) defaults to US-ASCII encoding. Relevant resources:
- https://www.janmeppe.com/blog/invalid-US-ASCII-character/
- https://github.com/mmistakes/minimal-mistakes/issues/1809
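To see whether a given environment is affected, it may help to ask Ruby which encoding it picked up. The following is a minimal diagnostic sketch, assuming it is run inside the same container as the Jekyll build; it uses only the standard `Encoding.default_external` and `ENV` APIs, and the variable name and warning text are my own:

```ruby
# Encoding that Ruby assumes for data read from outside the process.
# On Linux this is derived from the locale environment variables.
external = Encoding.default_external

puts "LANG=#{ENV['LANG'].inspect} LC_ALL=#{ENV['LC_ALL'].inspect}"
puts "Encoding.default_external=#{external.name}"

# Anything other than UTF-8 here suggests the locale is not set up for UTF-8.
unless external == Encoding::UTF_8
  warn "Default external encoding is not UTF-8; non-ASCII bytes may be rejected."
end
```

If this reports US-ASCII, the environment is likely in the state described in the resources above.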
While adding @charset "UTF-8"; at the top of the SCSS file can force the correct encoding (see this StackOverflow post), this approach doesn’t scale well across multiple files.
Resolution
The solution was to configure the Docker container’s locale settings. Following guidance from this AskUbuntu post, I added the English language pack to my Dockerfile:
RUN apt-get -y install language-pack-en
This ensures Ruby/Jekyll correctly detect UTF-8 encoding. After rebuilding the container, the build succeeded🎉✅:
Remote Theme: Using theme jekyll/minima
Jekyll Feed: Generating feed for posts
done in 1.801 seconds.
Auto-regeneration: enabled for '/workspaces/xxx'
LiveReload address: http://127.0.0.1:35729
Server address: http://127.0.0.1:4000
Server running... press ctrl-c to stop.