Personal Diarying in Circles

August 16, 2023

I am experimenting with sharing my diary entries with my family, using the awesome matrix-based Circles app. I write in an offline Diary app. My offline diary is perfect for my diarying, and Circles is a great tool for sharing stories. What I want to do is combine them, posting my daily diary entries, perhaps at the end of each day, as Stories (is that what we call them?) in a Circle.

I have tried posting my diary as password-protected web pages. Compared with that or other options such as email, some advantages of sharing through Circles include: family will be notified when I post something; a convenient way for them to reply; the app layout is optimised for reading such stories, and relatives' stories; the messages are in a standard protocol and can be accessed by other matrix apps, not locked in to one provider; it helps advance the state of matrix development (my knowledge, contribution to the ecosystem); it provides a compelling way to introduce matrix usage to my family.

Why don't I just use the Circles app the way it's designed to be used? This question is worth exploring at some length. As a programmer with some experience of toolmaking, both adapting mechanical tools and professionally making software tools, I take the approach of adapting the available tools to the task I want to perform. That's one reason I use Freedom (open source) software: it gives me the possibility to adapt it, or to invite any small group of like-minded people to adapt it to our collective needs. If I used proprietary software I would never have any possibility of adapting it, and that entire line of creativity in my life would be closed to me.

I have tried diarying directly in a matrix app before, and it didn't work for me. Writing offline, in a dedicated diary editor, with the text stored locally in plain files, is a nice, fast, and powerful way of working. I want a full-screen editor with room to compose a whole page and arrange images to illustrate it. I want to see a preview of the result before posting. I want to be able to go back and insert new content on a previous day's entry, and move paragraphs around. I want to be able to write where and when I like, and not be obstructed just because I am in a remote camp site with no data connection. (It was crushingly frustrating being forced to stop writing after about three sentences when I tried using Element-android with no data connection.) I want to be able to compose a whole day's writing, or a meaningful chunk, before my family see the opening sentence.

It is also vital to me that my long term archive materials, including my diary, are stored in a long term accessible format. With this diary I am storing plain files (text and images) in ordinary file-systems, and in a version control system, and synchronising a copy of the data across three locations (my phone, my laptop/desktop, and one of my servers). This arrangement lets me read and edit my diary from any device. It also ensures I still have a copy or two when I lose a device or its storage drive fails. The redundancy protects against sudden loss, and the version control enables me to recover from user error.

For all these reasons, I will keep using the offline diary as my primary. Sharing via matrix and receiving comments will be secondary.

(I would like to trust my Matrix server, Synapse, to retain my matrix message history for as long as I want, but it is simply not good at archiving. The design priorities of this server and of matrix as a whole are for short term storage, neglecting long term robustness: for example, I receive “unable to decrypt” errors far too often. Its storage format is opaque, and its import/export is weak and awkward: in 2023 there's still no way to transfer history between matrix servers of the same type or different types.)

Today's experiment involves using Matrix Commander, a command-line tool for posting messages and performing other tasks in matrix.

My first attempt uses matrix-commander by itself, and can send markdown text but does not cope with images. After setting up matrix-commander in docker according to its instructions, and logging in to my matrix account, I ran it like this:

FILE="$HOME/Diary/2023/08/15.md"
docker run --rm --name matrix-commander \
    -v "$HOME/.config/matrix-commander:/data:z" \
    matrixcommander/matrix-commander \
    --room='!xxxxxxxxxxxxxxxxxx:example.net' \
    --markdown -m "$(<"$FILE")"

The --room option identifies the matrix room I want to use in Circles.

I experimented with splitting the diary page into paragraphs using matrix-commander's --split '\n\n' option. That results in multiple matrix messages, each of a more typical size. That doesn't necessarily work so well in a Circles context, where it is generally expected that a “story” comes as a whole and any replies are sent to the story as a whole. Split into paragraphs, each reply or comment must then be attached to a specific paragraph, and as a result the replies may be harder to find and to comprehend. However, the semantics in that respect are not well defined, and either way is possible. A division into mid-sized blocks containing multiple paragraphs each would be possible, if I had some way of deciding where to break the page. I might consider using a horizontal divider line (which in markdown format is represented by ---) in my diary entries for this purpose.
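A minimal sketch of that mid-sized division, assuming the divider is a line containing only three or more hyphens. The function name split_into_blocks and the sample page are illustrative only, not part of matrix-commander.

```python
import re

# A line consisting only of three or more hyphens (optionally followed by
# spaces): the markdown horizontal rule assumed here as the block divider.
DIVIDER = re.compile(r'^-{3,}[ \t]*$', re.MULTILINE)

def split_into_blocks(page_md):
    """Split a diary page into blocks at divider lines, dropping empty blocks."""
    blocks = (block.strip() for block in DIVIDER.split(page_md))
    return [block for block in blocks if block]

# Illustrative page: two paragraphs, a divider, then one more paragraph.
page = "Morning walk.\n\nSaw a heron.\n\n---\n\nEvening by the fire.\n"
blocks = split_into_blocks(page)
for block in blocks:
    print(repr(block))
```

Each block could then be sent as one message. Note that in markdown a --- line directly beneath a line of text marks a heading rather than a divider, so this sketch relies on the divider being surrounded by blank lines.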

Next I want to support images. Images are by far the most common non-text insertion in my diarying. I rarely attach an audio or video clip or another kind of file. (This is partly because I am wary of the storage space. I duplicate the files to the diary storage from my camera storage, because if I link to the original files I don't have a well developed way to manage keeping the links intact.)

matrix-commander does not have a way to automatically obey the image attachment instructions in markdown input text. It does have a way to send an image as a separate operation.

First I tried pre-processing the markdown input text, converting the image links from relative file references like this: ![](IMG_xxxx.jpg) to URLs like this: ![](https://my.example.net/Diary/2023/08/IMG_xxxx.jpg). There were a few problems. The way I currently publish my diary, it's behind password authentication, and the published images are scaled to various resolutions with file names like this: IMG_xxxx-640.jpg. Those I could overcome.
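That rewriting step might look roughly like the following sketch. The function name absolutise_images and its width parameter are assumptions for illustration; the URL base and the -640 suffix follow the examples above.

```python
import re

# Matches a markdown image: ![alt](target "optional title")
IMAGE_MD = re.compile(r'!\[([^\]]*)\]\(([^) ]*)( +"([^"]*)")?\)')

def absolutise_images(page_md, url_base, year, month, width=None):
    """Rewrite relative image references in PAGE_MD to absolute URLs.

    If WIDTH is given, insert a '-WIDTH' suffix before the file extension,
    matching scaled file names such as IMG_xxxx-640.jpg.
    """
    def replace(match):
        alt_text, target, title_part = match.group(1, 2, 3)
        if '://' in target:
            return match.group(0)  # already a URL: leave it untouched
        stem, dot, ext = target.rpartition('.')
        if width is not None and dot:
            target = f'{stem}-{width}.{ext}'
        return f"![{alt_text}]({url_base}/{year}/{month}/{target}{title_part or ''})"
    return IMAGE_MD.sub(replace, page_md)

print(absolutise_images('![](IMG_1234.jpg)',
                        'https://my.example.net/Diary', '2023', '08', width=640))
```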

More significantly, matrix clients don't like reading an image from an arbitrary URL. In general they expect a matrix “mxc:” URL which they resolve through the user's matrix server. The matrix spec says that in a “text” message, the URL in any embedded <img src="x"> must be an mxc: URI. However, when sending a message of subtype image, it says only that the URL is “typically” an mxc:// URI, so if I split each image out as a separate message it should be possible to reference an arbitrary URL there.
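To make the two cases concrete, here is a sketch of the content of each kind of m.room.message event, written as Python dictionaries. The helper names and the example values are illustrative; the field names are those of the matrix client-server specification as I understand it.

```python
def text_message_content(plain, html):
    """Content of a 'text' message carrying an HTML-formatted body.
    Any <img src="..."> inside HTML is expected to use an mxc: URI."""
    return {
        "msgtype": "m.text",
        "body": plain,
        "format": "org.matrix.custom.html",
        "formatted_body": html,
    }

def image_message_content(url, filename, mimetype="image/jpeg"):
    """Content of a dedicated 'image' message.
    URL is typically an mxc:// URI returned by an upload."""
    return {
        "msgtype": "m.image",
        "body": filename,
        "url": url,
        "info": {"mimetype": mimetype},
    }

# Placeholder mxc: URI, not a real upload.
example = image_message_content("mxc://example.net/xxxxxx", "IMG_xxxx.jpg")
print(example)
```

Whether any given client will actually fetch a non-mxc URL from an image message is something to test rather than assume.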

Clearly the matrixy thing to do is upload the images to the matrix server and use the resulting mxc:// URI.

For this more substantial pre-processing I will switch to invoking matrix-commander from Python.

#!/usr/bin/env python3
"""usage: $0 DIARY_DAY_FILE.md [MATRIX-COMMANDER-ARG...]
"""

from contextlib import redirect_stdout
import io
import os
import re
import sys

sys.path.append('/app/matrix_commander')
from matrix_commander import main

def mc(args):
    """Run matrix-commander
    """
    argv = ['mc', '-s/data/store', '-c/data/credentials.json'] + args
    ret_code = main(argv)
    assert ret_code == 0

URL_BASE = 'https://diary.julian.foad.me.uk'

md_file = sys.argv[1]
other_args = sys.argv[2:]

m = re.match(r'^(.*)/(20\d\d)/(\d\d)/(\d\d).md$', md_file)
base_dir, year, month, day = m.group(1, 2, 3, 4)
print(base_dir, year, month, day)

image_md_pattern = re.compile(r'!\[([^\]]*)\]\(([^) ]*)( +"([^"]*)")?\)')

def replacer(match):
    alt_text, orig_url, opt_title_part, opt_title = match.group(1, 2, 3, 4)

    upload_path = f'{base_dir}/{year}/{month}/{orig_url}'
    upload_args = ['--upload', upload_path, '--plain']
    with redirect_stdout(io.StringIO()) as f:
        mc(other_args + upload_args)
    # output has the URI and the key-dictionary (if encrypted), like "mxc://...    None"
    output = f.getvalue()
    match = re.match(r'^(mxc://[^ ]*)    ([^ ]*)$', output)
    new_uri, key_dict = match.group(1, 2)
    print((alt_text, orig_url, opt_title, "->", upload_path, new_uri, key_dict)) 
    replacement = f"![{alt_text}]({new_uri}{opt_title_part or ''})"
    return replacement

md_orig = open(md_file).read()
message_md = image_md_pattern.sub(replacer, md_orig)

print()
print(message_md)

message_args = ['--markdown', '-m', message_md]
mc(other_args + message_args)

Where I got to on the first day is running this Python script, which is intended to upload each image to the matrix server and then replace the markdown references to point to the uploaded images using mxc: URIs.

I'm running it like this:

docker run --rm -ti --name matrix-commander \
    -v $HOME/.config/matrix-commander:/data:z \
    -v $HOME/src/diary-to-matrix:/app/diary-to-matrix:ro,z \
    -v $HOME/Diary:/mnt/Diary:ro,z \
    --entrypoint python3 matrixcommander/matrix-commander \
    /app/diary-to-matrix/diary-to-matrix.py \
    /mnt/Diary/2023/08/15.md --room='!xxx'

Issues I dealt with at this point include:

The matrix-commander image upload code opens the file in 'r+b' mode where the '+' means it needs write access to the file. I don't know if it needs write access; I suspect not. I'm mounting my data source volume read-only for safety. I can work around this temporarily by making a copy of the source data to use.
At first I was calling the “-i image” function, which posts an image message; I needed to change to the “upload an image” function, which only uploads the image and returns an mxc: URI for it.
The --upload call returns an mxc: URI by printing it to stdout rather than returning it to the Python caller. Work-around: wrap the call in “with redirect_stdout(...)” from contextlib and parse the output.
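The stdout-capturing work-around can be shown in isolation. In this sketch fake_upload is a hypothetical stand-in for the real upload call, and the mxc: URI it prints is a made-up placeholder.

```python
from contextlib import redirect_stdout
import io
import re

def fake_upload(path):
    """Hypothetical stand-in for a call that reports its result only on stdout."""
    print("mxc://example.net/AbCdEf    None")
    return 0

def capture_stdout(func, *args):
    """Call FUNC(*ARGS); return (its return value, everything it printed)."""
    buffer = io.StringIO()
    with redirect_stdout(buffer):
        ret = func(*args)
    return ret, buffer.getvalue()

ret_code, output = capture_stdout(fake_upload, "IMG_0001.jpg")
# Expect "<mxc uri><spaces><key dictionary or None>" on one line.
match = re.match(r'^(mxc://\S+)\s+(.*)$', output)
mxc_uri = match.group(1) if match else None
print(ret_code, mxc_uri)
```

This only captures text written through Python's sys.stdout. Output written by a child process or directly to the underlying file descriptor would not be caught this way.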

To fix matrix-commander to not require read-write access to input files, I suspect the following substitution may work:

sed -i -e 's/"r+b"/"rb"/' /app/matrix_commander/matrix_commander.py

(I tried adding this near the beginning of my program, and it still failed the same way:

# Patch matrix_commander.py to not require write access to input files.
os.system("sed -i -e 's/\"r+b\"/\"rb\"/' /app/matrix_commander/matrix_commander.py")

Silly me, I now realise: it had already executed from matrix_commander import main, reading the Python code from matrix_commander.py, before it executed this substitution on it.)
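The ordering problem can be reproduced with a throwaway module. In this sketch demo_open_mode is a hypothetical stand-in for matrix_commander.py: patching the file after the import leaves the already-loaded code unchanged until the module is reloaded.

```python
import importlib
import pathlib
import sys
import tempfile

with tempfile.TemporaryDirectory() as tmp_dir:
    # demo_open_mode.py is a hypothetical stand-in for matrix_commander.py.
    source = pathlib.Path(tmp_dir) / "demo_open_mode.py"
    source.write_text('MODE = "r+b"\n')
    sys.path.insert(0, tmp_dir)
    try:
        import demo_open_mode  # the source file is read and executed here

        # Patch the file on disk after the import, as the sed command would.
        source.write_text(source.read_text().replace('"r+b"', '"rb"'))
        mode_without_reload = demo_open_mode.MODE  # unchanged: old code is in memory

        importlib.reload(demo_open_mode)  # re-read the patched source
        mode_after_reload = demo_open_mode.MODE
    finally:
        # Tidy up so the demo leaves no trace in the interpreter state.
        sys.path.remove(tmp_dir)
        sys.modules.pop("demo_open_mode", None)

print(mode_without_reload, mode_after_reload)
```

The simpler fix in practice is to run the patch before the import statement.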

I don't know if posting a big page of paragraphs with several images will display well. I can also try the other way I mentioned, splitting into paragraphs, and in that case can use the dedicated “post an image” message type instead of markdown with embedded images.

I'm interested to see how this experiment pans out.

I discussed this with the creator of Circles, C V Wright, in the #circles:futo.org matrix room. The idea of posting multi-paragraph “stories” with images and other embedded content seems to be broadly in line with what he would like to see become possible in Circles. There are some ideas around it.

Back to my proof-of-concept. Using matrix-commander --upload I can upload an image and get back an mxc: URI.

Substituting this into the markdown, I post the resulting message. But Circles isn't showing the image; it's showing the image alt-text. FluffyChat shows the image (although very small, about a quarter of the column width). Hydrogen doesn't show anything for that paragraph. So that's not looking promising!

(I'm not testing in Element right now because it's harder to switch to a test account.)

The formatting looks like this (copied from FluffyChat's message details):

"formatted_body": "<p>Text paragraphs...</p>\n<p><img alt=\"IMG_202308xxx.jpg\" src=\"mxc://my.test.server.example.net/xxxxxx\" /></p>",

(This is with no encryption for the image file.)

Next I'll try a dedicated “image” message.

#!/usr/bin/env python3
"""usage: $0 DIARY_DAY_FILE.md [MATRIX-COMMANDER-ARG...]
"""

from contextlib import redirect_stdout
import io
import os
import re
import sys

# Patch matrix_commander.py to not require write access to input files.
# This must run before the import below, which reads the code from disk.
os.system("sed -i -e 's/\"r+b\"/\"rb\"/' /app/matrix_commander/matrix_commander.py")

sys.path.append('/app/matrix_commander')
from matrix_commander import main

def mc(args):
    """Run matrix-commander
    """
    argv = ['mc', '-s/data/store', '-c/data/credentials.json'] + args
    ret_code = main(argv)
    assert ret_code == 0

URL_BASE = 'https://diary.julian.foad.me.uk'

md_file = sys.argv[1]
other_args = sys.argv[2:]

m = re.match(r'^(.*)/(20\d\d)/(\d\d)/(\d\d).md$', md_file)
base_dir, year, month, day = m.group(1, 2, 3, 4)
print(base_dir, year, month, day)

image_md_pattern = re.compile(r'!\[([^\]]*)\]\(([^) ]*)( +"([^"]*)")?\)')

def upload_image(upload_path, encrypted=False):
    """Upload image file from UPLOAD_PATH.
       Return (mxc_uri, key_dict)
       key_dict is None if not encrypted.
    """
    upload_args = ['--upload', upload_path] + ([] if encrypted else ['--plain']) 
    with redirect_stdout(io.StringIO()) as f:
        mc(other_args + upload_args)
    # output has the URI and the key-dictionary (if encrypted), like "mxc://...    None"
    output = f.getvalue()
    match = re.match(r'^(mxc://[^ ]*)    ([^ ]*)$', output)
    mxc_uri, key_dict = match.group(1, 2)
    if key_dict == 'None':
        key_dict = None
    return mxc_uri, key_dict

def send_para_with_image(para, match):
    """Send paragraph PARA with an image in it, by first extracting and
       uploading the image, then replacing the local filename with the
       upload mxc: URI.
       In FluffyChat, displays the image very small (~1/4 column width);
       In Circles and Hydrogen, displays nothing :-( .
    """
    alt_text, orig_url, opt_title_part, opt_title = match.group(1, 2, 3, 4)

    upload_path = f'{base_dir}/{year}/{month}/{orig_url}'
    mxc_uri, key_dict = upload_image(upload_path)
    print((alt_text, orig_url, opt_title, "->", upload_path, mxc_uri, key_dict))
    replacement = f"![{alt_text}]({mxc_uri}{opt_title_part or ''})"
    # Send the whole paragraph, with only the matched image reference replaced.
    para_md = para[:match.start()] + replacement + para[match.end():]
    message_args = ['--markdown', f'--message={para_md}']
    mc(other_args + message_args)

def send_para_as_image(para, match):
    """Send an image-message by itself, extracted from markdown paragraph PARA.
    """
    alt_text, orig_url, opt_title_part, opt_title = match.group(1, 2, 3, 4)

    upload_path = f'{base_dir}/{year}/{month}/{orig_url}'
    print((orig_url, "->", upload_path))
    image_args = [f'--image={upload_path}']
    mc(other_args + image_args)

def send_markdown(message_md):
    message_args = ['--markdown', '-m', message_md]
    mc(other_args + message_args)

md_orig = open(md_file).read()
md_split = md_orig.split('\n\n')
for para in md_split:
    print("# PARA: ", para)
    match = image_md_pattern.search(para)
    if match:
        print("# IMAGE MATCH: ", match.groups())
        send_para_as_image(para, match)
    else:
        print("# MARKDOWN")
        send_markdown(para)

Got that working, splitting my markdown into paragraphs, sending images as separate messages of image type. (Usually in my diary writing, each image is a separate paragraph, though not always.)
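For the paragraphs that mix text and images, one possible approach is to break the paragraph into an ordered list of text and image segments and send each in turn. This is only a sketch; split_para is an illustrative name, and the sending itself is left out.

```python
import re

# Matches a markdown image: ![alt](target "optional title")
IMAGE_MD = re.compile(r'!\[([^\]]*)\]\(([^) ]*)( +"([^"]*)")?\)')

def split_para(para):
    """Split PARA into ('text', markdown) and ('image', target) segments, in order."""
    segments = []
    pos = 0
    for match in IMAGE_MD.finditer(para):
        text = para[pos:match.start()].strip()
        if text:
            segments.append(('text', text))
        segments.append(('image', match.group(2)))
        pos = match.end()
    tail = para[pos:].strip()
    if tail:
        segments.append(('text', tail))
    return segments

for kind, value in split_para("Lunch stop. ![view](IMG_0002.jpg) Then onwards."):
    print(kind, value)
```

Each 'text' segment would go out as a markdown message and each 'image' segment as an image message. The alt text and title are dropped in this sketch.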

Of course the result comes out with paragraphs arranged in bottom-to-top order on the Circles UI :-)

It comes out the right way up in other clients. It looks OK in Hydrogen, with just a light grey time stamp separating each paragraph. It looks better in FluffyChat which does not print time stamps between paragraphs that are close together in time. There we see just a sequence of paragraphs (one in each “message bubble”) and images, almost as clean as my original blog page style.

C V Wright said, “I hate to say this, but what if you created a thread for each entry? ... Then I think we would show the threaded messages in (forward) chronological order.”

A thread is not unreasonable. I may try. The desired top-to-bottom display order is not inherently more likely: that seems like a happenstance. But more interestingly, a thread is one more mechanism that could represent the joining of several messages together into a group, besides other ideas we discussed earlier. Compared with, for example, representing a whole day's diary page inside one message of a new type defined as “list of primitive messages”, some of the different capabilities of a thread may be advantages for this scenario, such as being able to contain messages from different authors (friends) and being displayed “outside the main flow” in clients. At the least, the fact that threading is already in the matrix formal specification is a practical advantage.
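For reference, a threaded message is an ordinary message whose content carries an m.relates_to relation pointing at the thread's root event. The sketch below builds such content as a Python dictionary; the helper name in_thread and the event IDs are illustrative placeholders, and it says nothing about how any particular tool exposes threading.

```python
def in_thread(content, thread_root_id, latest_event_id=None):
    """Return a copy of message CONTENT marked as belonging to the thread
    rooted at THREAD_ROOT_ID. The m.in_reply_to entry is the fallback for
    clients without thread support; it points at the latest event in the
    thread, or at the root if none is given."""
    related = dict(content)
    related["m.relates_to"] = {
        "rel_type": "m.thread",
        "event_id": thread_root_id,
        "is_falling_back": True,
        "m.in_reply_to": {"event_id": latest_event_id or thread_root_id},
    }
    return related

# "$root_event_id" is a placeholder, e.g. the event ID of a day's heading message.
reply = in_thread({"msgtype": "m.text", "body": "Second paragraph"}, "$root_event_id")
print(reply)
```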

Other issues:

ERROR: matrix-commander: E148: room_send failed with error 'RoomSendError: M_LIMIT_EXCEEDED Too Many Requests - retry after 1189ms'. There is no detection and re-try for rate limiting. I'll have it sleep for a second between paragraphs as a work-around.
This version blindly replaces any paragraph containing an image with just an image. It needs to handle other image-containing paragraphs more gracefully.
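Rather than a fixed sleep, the delay could be taken from the error text itself. This is a sketch under stated assumptions: the send argument is a hypothetical callable that returns None on success or the error text on failure, which is not how matrix-commander's main() reports errors, so some adapter would be needed.

```python
import re
import time

def retry_delay_seconds(error_text, default=1.0):
    """Extract the 'retry after NNNms' hint from ERROR_TEXT, in seconds."""
    match = re.search(r'retry after (\d+)ms', error_text)
    return int(match.group(1)) / 1000 if match else default

def send_with_retry(send, max_attempts=5, sleep=time.sleep):
    """Call SEND until it succeeds; return the number of attempts used.

    SEND is a hypothetical callable returning None on success, or the
    error text on failure. Only rate-limit errors are retried.
    """
    for attempt in range(1, max_attempts + 1):
        error_text = send()
        if error_text is None:
            return attempt
        if 'M_LIMIT_EXCEEDED' not in error_text:
            raise RuntimeError(error_text)
        sleep(retry_delay_seconds(error_text))
    raise RuntimeError(f'still rate limited after {max_attempts} attempts')

# Demonstration with a fake sender that is rate limited once, then succeeds.
responses = ["RoomSendError: M_LIMIT_EXCEEDED Too Many Requests - retry after 1189ms", None]
delays = []
attempts = send_with_retry(lambda: responses.pop(0), sleep=delays.append)
print(attempts, delays)
```

The sleep function is passed in as a parameter so that the demonstration can record the delays instead of actually waiting.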

Then an issue about image captions came up, and I got involved in a discussion about standardising them, in the #matrix-design:matrix.org room.

What I would like to show now is a pair of “before” (blog page) and “after” (matrix/Circles) screen-shots. Unfortunately I've been working with my real diary which I don't want to show here. Anyway it's not particularly exciting yet: it just looks like normal matrix messages.

It would perhaps look more impressive if I add headings (the diary date, at least) to show clearly where each new day's entry begins, as matrix clients' date stamps are usually deliberately unobtrusive.
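A date heading could be derived from the diary file's path, which already encodes the year, month and day. A minimal sketch, assuming the YYYY/MM/DD.md layout used above; the function names and the choice of a second-level markdown heading are illustrative.

```python
from datetime import date
import re

def day_heading(md_file):
    """Return a markdown heading for the diary day encoded in MD_FILE's path."""
    m = re.match(r'^(.*)/(20\d\d)/(\d\d)/(\d\d)\.md$', md_file)
    if not m:
        raise ValueError(f'not a diary day file: {md_file}')
    year, month, day = (int(part) for part in m.group(2, 3, 4))
    # date() also validates the numbers, e.g. it rejects month 13.
    return f'## {date(year, month, day).isoformat()}'

def with_heading(md_file, page_md):
    """Prepend the day's heading to the page text."""
    return f'{day_heading(md_file)}\n\n{page_md}'

print(with_heading('/mnt/Diary/2023/08/15.md', 'Text paragraphs...'))
```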

I hope to get back to this.

All posts © Julian Foad and licensed CC-BY-ND except quotes, translations, or where stated otherwise