The Waving Flag: Social Media Posts From The Archive

Wednesday, 22 April 2026


Background

Whilst checking my Bluesky feed I noticed someone posting links to, and images from, their blog. Not new posts, but posts from years ago. It struck me this was a good use of a blog archive. However, I didn't want to do this manually: even daily posts would soon pall.

In January I tidied all the posts here, and removed all the rubbish. I currently have 464 posts. That's more than enough for a lengthy series of re-posts. I just needed a suitable script I could automate.

Thanks to perplexity.ai, I now have working Python scripts, for both Bluesky and the Fediverse, that produce posts like this:

Read on if you are running Linux (Xubuntu 24.04) and you'd like to try them.

Instructions

[1] Create a venv folder

As Ubuntu 24.04 (and all its variants) is quite fussy about running Python scripts, it's best to set up a dedicated virtual environment (venv) for them. Both scripts use the same venv. This isolates the scripts and holds all the required libraries in one place.

The first step is to install python3-venv, if it isn't already installed:

sudo apt install python3-venv

Next create the venv in a suitable folder and install the required libraries with these commands in a terminal:

python3 -m venv ~/.local/share/pipx/venvs/archive_bots
source ~/.local/share/pipx/venvs/archive_bots/bin/activate
pip install requests beautifulsoup4 Pillow

Please note: I created the archive_bots folder in a hidden directory with all the others I've created, but you can use any folder in your home folder.

[2] Bluesky script

First copy this script and save it somewhere in your home folder. I used ~/System/scripts/random_bsky_post.py

import random
import requests
import xml.etree.ElementTree as ET
from datetime import datetime, timezone
import os
from bs4 import BeautifulSoup
from PIL import Image
import io

# Replace these placeholders with your own details
# (environment variables are a safer alternative to hardcoding)
HANDLE = "bsky_handle"
APP_PASSWORD = "app_password"
SITEMAP_URL = "https://yourblog.blogspot.com/sitemap.xml"

def get_urls(sitemap_url):
    xml = requests.get(sitemap_url, timeout=30)
    xml.raise_for_status()
    root = ET.fromstring(xml.text)
    ns = {"sm": "http://www.sitemaps.org/schemas/sitemap/0.9"}
    return [loc.text.strip() for loc in root.findall(".//sm:loc", ns)]

def get_og_metadata_with_image(session_jwt, url):
    try:
        resp = requests.get(url, timeout=15)
        resp.raise_for_status()
        soup = BeautifulSoup(resp.text, 'html.parser')
        
        title = soup.find("meta", property="og:title")
        title = title["content"].strip() if title else "Blog post"
        
        desc = soup.find("meta", property="og:description")
        desc = desc["content"].strip()[:300] if desc else "15mm historical wargaming"
        
        img_tag = soup.find("meta", property="og:image")
        thumb_blob = None
        if img_tag:
            img_url = img_tag["content"]
            if "://" not in img_url:
                # Resolve a relative image URL against the post URL
                from urllib.parse import urljoin
                img_url = urljoin(url, img_url)
            
            img_resp = requests.get(img_url, timeout=15)
            img_resp.raise_for_status()
            
            # Resize image to reasonable size (1000x1000 max)
            img = Image.open(io.BytesIO(img_resp.content))
            img.thumbnail((1000, 1000))
            
            img_byte_arr = io.BytesIO()
            img.save(img_byte_arr, format='JPEG', quality=85)
            img_bytes = img_byte_arr.getvalue()
            
            # Upload as blob
            blob_resp = requests.post(
                "https://bsky.social/xrpc/com.atproto.repo.uploadBlob",
                headers={
                    "Authorization": f"Bearer {session_jwt}",
                    "Content-Type": "image/jpeg"
                },
                data=img_bytes,
                timeout=30
            )
            blob_resp.raise_for_status()
            thumb_blob = blob_resp.json()["blob"]
        
        return title, desc, thumb_blob
    except Exception:
        # Fall back to generic text if the page can't be fetched or parsed
        return "Blog post", "15mm historical wargaming", None

urls = get_urls(SITEMAP_URL)
pick = random.choice(urls)

session = requests.post(
    "https://bsky.social/xrpc/com.atproto.server.createSession",
    json={"identifier": HANDLE, "password": APP_PASSWORD},
    timeout=30,
)
session.raise_for_status()
session = session.json()
session_jwt = session["accessJwt"]

title, description, thumb_blob = get_og_metadata_with_image(session_jwt, pick)

text = f"""A post from my blog's archive:
{title}
#tabletop #wargames #miniatures"""

facets = []
for tag in ["#tabletop", "#wargames", "#miniatures"]:
    tag_pos = text.find(tag)
    if tag_pos != -1:
        tag_start = len(text[:tag_pos].encode('utf-8'))
        tag_end = tag_start + len(tag.encode('utf-8'))
        facets.append({
            "index": {"byteStart": tag_start, "byteEnd": tag_end},
            "features": [{"$type": "app.bsky.richtext.facet#tag", "tag": tag[1:]}]
        })

record = {
    "$type": "app.bsky.feed.post",
    "text": text,
    "createdAt": datetime.now(timezone.utc).isoformat().replace("+00:00", "Z"),
    "facets": facets,
    "embed": {
        "$type": "app.bsky.embed.external",
        "external": {
            "uri": pick,
            "title": title,
            "description": description
        }
    }
}

if thumb_blob:
    record["embed"]["external"]["thumb"] = thumb_blob

resp = requests.post(
    "https://bsky.social/xrpc/com.atproto.repo.createRecord",
    headers={"Authorization": f"Bearer {session_jwt}"},
    json={
        "repo": session["did"],
        "collection": "app.bsky.feed.post",
        "record": record,
    },
    timeout=30,
)
resp.raise_for_status()
print("Posted successfully!")
print(f"Post URI: {resp.json()['uri']}")

The script requires three pieces of information to work: your Bluesky handle (or username), an app password and the URL of your blog's sitemap.

Your Bluesky handle is on your profile page (don't add the @). You can create an app password via Settings|Privacy and Security|App passwords (don't use your regular password). The final piece is your blog's name (if it's hosted on Blogger) or your sitemap's full URL if not.

Once you have this information, edit this section accordingly:

# Replace these placeholders with your own details
# (environment variables are a safer alternative to hardcoding)
HANDLE = "bsky_handle"
APP_PASSWORD = "app_password"
SITEMAP_URL = "https://yourblog.blogspot.com/sitemap.xml"
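If you'd rather keep credentials out of the file itself, the same three values can be read from environment variables instead. This is a sketch of my own; the variable names (BSKY_HANDLE and so on) are just suggestions, not anything standard:

```python
import os

# Read the three values from the environment, falling back to placeholders.
# The variable names are arbitrary; pick whatever suits your setup.
HANDLE = os.environ.get("BSKY_HANDLE", "bsky_handle")
APP_PASSWORD = os.environ.get("BSKY_APP_PASSWORD", "app_password")
SITEMAP_URL = os.environ.get("BLOG_SITEMAP", "https://yourblog.blogspot.com/sitemap.xml")
```

With cron you can define the variables at the top of your crontab, so the script file contains no secrets.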

Then edit these sections to reflect your blog's content:

return "Blog post", "15mm historical wargaming", None
#tabletop #wargames #miniatures

The script is now ready for testing.

[3] Fediverse script

First copy this script and save it somewhere in your home folder. I used ~/System/scripts/random_fedi_post.py

import os
import random
import requests
import xml.etree.ElementTree as ET
from bs4 import BeautifulSoup

MASTODON_BASE_URL = "https://your_instance.social"
MASTODON_TOKEN = "your_access_token"
SITEMAP_URL = "https://yourblog.blogspot.com/sitemap.xml"

def get_urls(sitemap_url):
    xml = requests.get(sitemap_url, timeout=30)
    xml.raise_for_status()
    root = ET.fromstring(xml.text)
    ns = {"sm": "http://www.sitemaps.org/schemas/sitemap/0.9"}
    return [loc.text.strip() for loc in root.findall(".//sm:loc", ns)]

def get_og_metadata(url):
    try:
        resp = requests.get(url, timeout=15)
        resp.raise_for_status()
        soup = BeautifulSoup(resp.text, "html.parser")

        title = soup.find("meta", property="og:title")
        title = title["content"].strip() if title else "Blog post"

        desc = soup.find("meta", property="og:description")
        desc = desc["content"].strip() if desc else "15mm historical wargaming"

        img = soup.find("meta", property="og:image")
        img_url = img["content"].strip() if img else None

        return title, desc, img_url
    except Exception:
        # Fall back to generic text if the page can't be fetched or parsed
        return "Blog post", "15mm historical wargaming", None

def upload_media(image_url):
    if not image_url:
        return None

    img_resp = requests.get(image_url, timeout=30)
    img_resp.raise_for_status()

    filename = image_url.split("?")[0].rsplit("/", 1)[-1] or "image.jpg"
    files = {
        "file": (filename, img_resp.content)
    }

    resp = requests.post(
        f"{MASTODON_BASE_URL}/api/v2/media",
        headers={"Authorization": f"Bearer {MASTODON_TOKEN}"},
        files=files,
        timeout=30,
    )
    resp.raise_for_status()
    return resp.json()["id"]

urls = get_urls(SITEMAP_URL)
pick = random.choice(urls)
title, description, image_url = get_og_metadata(pick)

status = f"""A post from the Waving Flag archive:
{title}
{pick}
#tabletop #wargames #miniatures"""

media_id = upload_media(image_url)

data = {
    "status": status,
    "visibility": "public",
}
if media_id:
    data["media_ids[]"] = [media_id]

resp = requests.post(
    f"{MASTODON_BASE_URL}/api/v1/statuses",
    headers={"Authorization": f"Bearer {MASTODON_TOKEN}"},
    data=data,
    timeout=30,
)
resp.raise_for_status()
print(resp.json()["url"])

The script requires three pieces of information to work: the name of your Fediverse (Mastodon) instance, an access token and the URL of your blog's sitemap.

You can create an access token via Preferences|Development|Your applications page.

Once you have this information, edit this section accordingly:

MASTODON_BASE_URL = "https://your_instance.social"
MASTODON_TOKEN = "your_access_token"
SITEMAP_URL = "https://yourblog.blogspot.com/sitemap.xml"

Then edit these sections to reflect your blog's content:

return "Blog post", "15mm historical wargaming", None
#tabletop #wargames #miniatures

The script is now ready for testing.

[4] Testing

To test both scripts run these commands in a terminal:

/home/user/.local/share/pipx/venvs/archive_bots/bin/python /home/user/System/scripts/random_bsky_post.py
/home/user/.local/share/pipx/venvs/archive_bots/bin/python /home/user/System/scripts/random_fedi_post.py

Replace user with your username, and adjust the script and venv locations if you used anything different.

If they don't work I suggest checking the three lines that contain your credentials and blog information.
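For the sitemap in particular, it helps to confirm that the URL actually returns parseable XML. The helper below splits the parsing step out of get_urls so it can be tried on its own; it's my sketch, assuming the standard sitemap namespace the scripts already use:

```python
import xml.etree.ElementTree as ET

# Same namespace the scripts use
SITEMAP_NS = {"sm": "http://www.sitemaps.org/schemas/sitemap/0.9"}

def locs_from_sitemap_xml(xml_text):
    """Return every <loc> URL found in a sitemap document."""
    root = ET.fromstring(xml_text)
    return [loc.text.strip() for loc in root.findall(".//sm:loc", SITEMAP_NS)]
```

Fetch your sitemap with requests.get(SITEMAP_URL).text and pass it in; an empty list usually means the URL is wrong or your blog platform lays its sitemap out differently.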

[5] Automation with cron

I chose to run each script every three hours with these lines in my list of cron jobs:

0 */3 * * * /home/user/.local/share/pipx/venvs/archive_bots/bin/python /home/user/System/scripts/random_bsky_post.py
5 */3 * * * /home/user/.local/share/pipx/venvs/archive_bots/bin/python /home/user/System/scripts/random_fedi_post.py

This usually means I post three or four times a day as my computer is not always on. You can adjust the frequency of the cron jobs or set it to post once a day, week or month: whatever works best for your content.
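Because cron runs silently, it can also be worth redirecting each job's output to a log file so any failures are visible later. The same jobs with logging added would look like this (the log path is just an example):

```shell
0 */3 * * * /home/user/.local/share/pipx/venvs/archive_bots/bin/python /home/user/System/scripts/random_bsky_post.py >> /home/user/archive_bots.log 2>&1
5 */3 * * * /home/user/.local/share/pipx/venvs/archive_bots/bin/python /home/user/System/scripts/random_fedi_post.py >> /home/user/archive_bots.log 2>&1
```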


Salute The Flag

If you'd like to support this blog why not leave a comment, or buy me a beer.