I recently changed the theme of my blog (and likely hammered your RSS/Atom reader in the process). The new theme is more lightweight, readable and responsive, the banner has a glitch effect in pure CSS, and the articles are now available in Markdown, so you can download them to read outside of your web browser or to archive them, instead of having to print web pages.
The latest article contained snippets of radare2's output,
with fancy utf8-powered arrows.
Unfortunately, despite the presence of the charset utf8; directive in my nginx configuration,
the rendering was quite ugly:
[0x080485ba]> pd 20 @ sub.memcpy_8cb
â•’ (fcn) sub.memcpy_8cb 491
│ 0x080488cb 83ec5c sub esp, 0x5c
│ 0x080488ce c744244c0000. mov dword [esp + 0x4c], 0
│ 0x080488d6 c74424480000. mov dword [esp + 0x48], 0
│ 0x080488de 8b442464 mov eax, dword [esp + 0x64]
│ 0x080488e2 83e001 and eax, 1
│ 0x080488e5 85c0 test eax, eax
│ ┌─< 0x080488e7 751b jne 0x8048904
│ │ 0x080488e9 837c246477 cmp dword [esp + 0x64], 0x77
│ ┌──< 0x080488ee 7614 jbe 0x8048904
│ ││ 0x080488f0 8b442464 mov eax, dword [esp + 0x64]
│ ││ 0x080488f4 83e00f and eax, 0xf
│ ││ 0x080488f7 85c0 test eax, eax
│ ┌───< 0x080488f9 7509 jne 0x8048904
│ │││ 0x080488fb 8b442468 mov eax, dword [esp + 0x68]
│ │││ 0x080488ff 833800 cmp dword [eax], 0
│ ┌────< 0x08048902 750d jne 0x8048911
│ │└└└─> 0x08048904 c74424100000. mov dword [esp + 0x10], 0
│ │ ┌─< 0x0804890c e99d010000 jmp 0x8048aae
│ └────> 0x08048911 8b442460 mov eax, dword [esp + 0x60]
│ │ 0x08048915 8a00 mov al, byte [eax]
[0x080485ba]>
The trick is that nginx appends charset=utf8 to the Content-Type header only if the
MIME type is one of
text/html, text/xml, text/plain, text/vnd.wap.wml, application/javascript or application/rss+xml,
and I'm serving the source of my articles with the type text/markdown.
Without a declared charset, the browser has to guess the encoding, and it guessed wrong.
I just had to add text/markdown to the charset_types
directive to serve utf8-powered documents:
[0x080485ba]> pd 20 @ sub.memcpy_8cb
╒ (fcn) sub.memcpy_8cb 491
│ 0x080488cb 83ec5c sub esp, 0x5c
│ 0x080488ce c744244c0000. mov dword [esp + 0x4c], 0
│ 0x080488d6 c74424480000. mov dword [esp + 0x48], 0
│ 0x080488de 8b442464 mov eax, dword [esp + 0x64]
│ 0x080488e2 83e001 and eax, 1
│ 0x080488e5 85c0 test eax, eax
│ ┌─< 0x080488e7 751b jne 0x8048904
│ │ 0x080488e9 837c246477 cmp dword [esp + 0x64], 0x77
│ ┌──< 0x080488ee 7614 jbe 0x8048904
│ ││ 0x080488f0 8b442464 mov eax, dword [esp + 0x64]
│ ││ 0x080488f4 83e00f and eax, 0xf
│ ││ 0x080488f7 85c0 test eax, eax
│ ┌───< 0x080488f9 7509 jne 0x8048904
│ │││ 0x080488fb 8b442468 mov eax, dword [esp + 0x68]
│ │││ 0x080488ff 833800 cmp dword [eax], 0
│ ┌────< 0x08048902 750d jne 0x8048911
│ │└└└─> 0x08048904 c74424100000. mov dword [esp + 0x10], 0
│ │ ┌─< 0x0804890c e99d010000 jmp 0x8048aae
│ └────> 0x08048911 8b442460 mov eax, dword [esp + 0x60]
│ │ 0x08048915 8a00 mov al, byte [eax]
[0x080485ba]>
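In nginx terms, the fix looks roughly like the following. This is a minimal sketch rather than a verbatim copy of my configuration: it assumes the directives live in the relevant server (or http) block, and that the .md files are already mapped to text/markdown elsewhere in the configuration.

```nginx
# Declare the charset to advertise in the Content-Type header.
charset utf8;

# Setting charset_types replaces the default list instead of extending it,
# so the defaults are repeated here with text/markdown appended.
# text/html is always handled by nginx and does not need to be listed.
charset_types text/xml text/plain text/vnd.wap.wml
              application/javascript application/rss+xml
              text/markdown;
```

After reloading nginx, requesting the headers of one of the Markdown files (for example with curl -I) should show a Content-Type of text/markdown with the charset appended.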
You can now enjoy the source of the articles with a real charset.