Moving my diary to Obsidian

Jul 7 1AM (Mon)

9 min 1900
🌤️ 🤔->😉
Home
diary obsidian personal private update

Keeping private thoughts; private

Intro

So, I recently decided to migrate from Diaro to Obsidian for my diary. Not that I have a problem with Diaro. For the low price of $20 (I believe it was), I think it was a good deal. They gave me a lifetime subscription to their site, which uses Dropbox as its sync backend. They never lost a single entry, and they helped me chronicle my life over the last three or four years. One of the great features was that you could upload a photo, and it would tag the location and set the time and date based on the photo. I hope I can find something similar in Obsidian.

Getting the data out

Getting the data out of the service wasn’t exactly the easiest process. They offered a few different export options (CSV, text, and PDF), but the options were different depending on whether you exported from mobile or from the website. This was a little bit difficult because none of the formats they offered included photos, at least not in an easy-to-access form. You could try to export the photos by ripping them out of the PDFs, but that was a nightmare. Luckily, I was able to do some digging through the Dropbox account that Diaro was using as its sync backend. In there, I found some XML files and all my photos as JPEGs. The XML wasn’t the easiest to parse, but thankfully the structure was relatively sane: attachments were in one table, locations in another, and the entries themselves in a third, all in one giant XML document.
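
If you want to see what is in your own backup before converting anything, a quick sketch like the one below can list each table and how many rows it holds. It assumes the layout the migration script at the bottom of this post relies on (`<table name="...">` elements containing `<r>` rows). The sample XML, the `<data>` root, and the `diaro_attachments` table name are made up for illustration and may not match a real backup.

```python
import xml.etree.ElementTree as ET

# Invented sample that mimics the layout the migration script expects:
# one <table name="..."> per record type, each holding <r> rows.
SAMPLE = """<data>
  <table name="diaro_entries">
    <r><uid>e1</uid><title>First</title></r>
    <r><uid>e2</uid><title>Second</title></r>
  </table>
  <table name="diaro_attachments">
    <r><uid>a1</uid><entry_uid>e1</entry_uid><filename>photo.jpg</filename></r>
  </table>
  <table name="diaro_locations"></table>
</data>"""


def count_rows_per_table(xml_text):
    """Return {table name: number of <r> rows} for every <table> in the document."""
    root = ET.fromstring(xml_text)
    return {
        table.get("name", ""): len(table.findall("r"))
        for table in root.iter("table")
    }


if __name__ == "__main__":
    for name, rows in count_rows_per_table(SAMPLE).items():
        print(f"{name}: {rows} rows")
```

To try it on a real file, read the backup's text and pass it in instead of `SAMPLE`.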

Custom scripting

I was able to cook up a little bit of Python magic to get all this converted into a standardized markdown format with some YAML front matter and the images embedded at the bottom of the posts. All in all, this took two to three days of work over a weekend. Nothing too crazy, but I do wish they offered a better export option. Maybe something with JSON, or at least CSVs with attachments easily linked? Maybe a photo dump option as well. Just some food for thought. Either way, I can’t complain too much: I spent a whopping $20 on a lifetime license, and they never lost a single post, though arguably I should probably thank Dropbox for that, as they were the back end. Thankfully, the app always worked on every single device, and the website was reliable, so no digs against them. I’m gonna include a link to the Python script, or maybe just embed it in this post at the bottom, we’ll see. Just in case it can help somebody else down the road move their notes over from this service to either Obsidian or another markdown-based diary application. Thanks!
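
To make the target format concrete, here is a minimal sketch of what one converted note could look like: YAML front matter on top, the entry text, then image embeds at the bottom. The function name `render_entry` and its parameters are hypothetical, and the `../media/` path simply mirrors the one used in the full script. It is a simplified illustration, not the script itself.

```python
def render_entry(title, date_str, tags, body, image_names, media_path="../media/"):
    """Build one Markdown note: YAML front matter, body text, then image embeds."""
    lines = ["---", f"title: {title}", f"date: {date_str}"]
    if tags:
        # Written as a YAML flow sequence, e.g. tags: [summer, family]
        lines.append("tags: [" + ", ".join(tags) + "]")
    lines.append("---")
    lines.append("")
    lines.append(body)
    for name in image_names:
        # Each image becomes a standard Markdown embed after the text
        lines.append("")
        lines.append(f"![{name}]({media_path}{name})")
    return "\n".join(lines) + "\n"


if __name__ == "__main__":
    print(render_entry(
        "Beach day",
        "2024-07-04 10:00:00",
        ["summer", "family"],
        "Went to the beach.",
        ["a.jpg"],
    ))
```

One caveat: a title or tag containing a colon or other YAML-special characters would need quoting to stay valid YAML, which this sketch skips.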

🖖 Jelloeater

Migration script

This script is used to migrate data from Diaro to Obsidian, parsing XML files and creating markdown files with metadata. It’s kind of crap, but it DOES work. Feel free to email me if you need help… ❤️

#!/usr/bin/env -S uv run --quiet
# /// script
# requires-python = ">=3.12"
# dependencies = [
#     "pymarkdownlnt",
# ]
# ///

import xml.etree.ElementTree as ET
from datetime import datetime
import zipfile
import os
import re
import pymarkdown


class Location:
    def __init__(self, uid, title, address, lat, lng, zoom):
        self.uid = uid
        self.title = title
        self.address = address
        self.lat = lat
        self.lng = lng
        self.zoom = zoom

    def __repr__(self):
        return (
            f"Location(uid='{self.uid}', title='{self.title}', "
            f"address='{self.address}', lat={self.lat}, lng={self.lng}, zoom={self.zoom})"
        )


class Attachment:
    def __init__(self, uid, entry_uid, type, filename, position):
        self.uid = uid
        self.entry_uid = entry_uid
        self.type = type
        self.filename = filename
        self.position = position

    def __repr__(self):
        return (
            f"Attachment(uid='{self.uid}', entry_uid='{self.entry_uid}', "
            f"type='{self.type}', filename='{self.filename}', position={self.position})"
        )


def parse_locations_xml(xml_file_path):
    """
    Parses an XML file containing location data into a list of Location objects.

    Args:
        xml_file_path (str): The path to the XML file.

    Returns:
        list: A list of Location objects.
    """
    tree = ET.parse(xml_file_path)
    root = tree.getroot()

    locations = []
    for r_element in root.findall("r"):
        uid = r_element.find("uid").text if r_element.find("uid") is not None else ""
        title = (
            r_element.find("title").text if r_element.find("title") is not None else ""
        )
        address = (
            r_element.find("address").text
            if r_element.find("address") is not None
            else ""
        )

        lat_text = (
            r_element.find("lat").text if r_element.find("lat") is not None else "0.0"
        )
        lng_text = (
            r_element.find("lng").text if r_element.find("lng") is not None else "0.0"
        )
        zoom_text = (
            r_element.find("zoom").text if r_element.find("zoom") is not None else "0"
        )

        try:
            lat = float(lat_text)
        except ValueError:
            lat = 0.0

        try:
            lng = float(lng_text)
        except ValueError:
            lng = 0.0

        try:
            zoom = int(zoom_text)
        except ValueError:
            zoom = 0

        location = Location(uid, title, address, lat, lng, zoom)
        locations.append(location)
    return locations


def parse_attachments_xml(xml_file_path):
    """
    Parses an XML file containing attachment data into a list of Attachment objects.

    Args:
        xml_file_path (str): The path to the XML file.

    Returns:
        list: A list of Attachment objects.
    """
    tree = ET.parse(xml_file_path)
    root = tree.getroot()

    attachments = []
    for r_element in root.findall("r"):
        uid = r_element.find("uid").text if r_element.find("uid") is not None else ""
        entry_uid = r_element.find("entry_uid").text if r_element.find("entry_uid") is not None else ""
        type = r_element.find("type").text if r_element.find("type") is not None else ""
        filename = r_element.find("filename").text if r_element.find("filename") is not None else ""
        position_text = r_element.find("position").text if r_element.find("position") is not None else "0"

        try:
            position = int(position_text)
        except ValueError:
            position = 0

        attachment = Attachment(uid, entry_uid, type, filename, position)
        attachments.append(attachment)
    return attachments


def write_files():
    """
    Parses the Diaro XML backup (DiaroBackup.xml) and writes one markdown
    file per entry into the output directory.
    """
    # ET.parse opens, decodes and closes the file for us
    root = ET.parse('DiaroBackup.xml').getroot()

    folders = {
        folder.find('uid').text: folder.find('title').text
        for folder in root.findall(".//table[@name='diaro_folders']/r")
    }
    # locations = {
    #     location.find('uid').text: location.find('name').text
    #     for location in root.findall(".//table[@name='diaro_locations']/r")
    # }
    # moods = {
    #     mood.find('uid').text: mood.find('name').text
    #     for mood in root.findall(".//table[@name='diaro_moods']/r")
    # }

    entries = root.findall(".//table[@name='diaro_entries']/r")

    output_dir = "diaro_markdown_entries"
    os.makedirs(output_dir, exist_ok=True)

    markdown_files = []

    xml_path = "./locations.xml"
    parsed_locations = parse_locations_xml(xml_path)

    xml_path_attach = "./attachments.xml"
    parsed_attach = parse_attachments_xml(xml_path_attach)

    for entry in entries:
        date_ms = int(entry.find('date').text)
        dt_object = datetime.fromtimestamp(date_ms / 1000)
        date_str = dt_object.strftime("%Y-%m-%d %H:%M:%S %z")

        text_element = entry.find('text')
        text = text_element.text if text_element is not None and text_element.text else "No content"


        title_element = entry.find('title')
        title = title_element.text if title_element is not None and title_element.text else "Untitled"

        if title == "Untitled":
            title = re.sub(r"[^\w]", " ", text.split('.')[0][:30])
            # Use the first X characters of text if title is "Untitled"

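        mood_element = entry.find('mood')
        # Bug fix: read the value from the mood element, not the title element
        mood = mood_element.text if mood_element is not None and mood_element.text else "3"

        # Flip the scale (1 becomes 5, 5 becomes 1) and add an emoji;
        # any other value is passed through unchanged
        mood_labels = {"1": "5-🥰", "2": "4-🙂", "3": "3-😑", "4": "2-🙁", "5": "1-😢"}
        mood = mood_labels.get(mood, mood)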


        media_path = "../media/"
        entry_uid = entry.find('uid').text
        attachments = ""
        for i in parsed_attach:
            if i.entry_uid == entry_uid:
                # Add the separator once, then append every attachment
                # (resetting the string on each match kept only the last one)
                if not attachments:
                    attachments = "\n\n---------------------------------------------------------------"
                if i.type in ("video", "audio"):
                    # Plain link for media that cannot be embedded as an image
                    attachments += f"\n\n[{i.type.capitalize()}: {i.filename}]({media_path}{i.filename})\n\n"
                else:
                    # Images, and any unknown type, are embedded
                    attachments += f"\n\n![{i.filename}]({media_path}{i.filename})\n\n"


        # Create a safe filename
        safe_title = re.sub(r'[^\w\s-]', '', title).strip().lower()
        safe_title = re.sub(r'[-\s]+', '-', safe_title) if safe_title else "entry"
        filename = f"{dt_object.strftime('%Y-%m-%d')}-{safe_title}.md"



        # Front Matter
        front_matter = ""
        front_matter += "---\n"
        if title:
            front_matter += f"title: {title}\n"
        else:
            front_matter += f"title: Entry on {dt_object.strftime('%Y-%m-%d')}\n"


        front_matter += f"date: {date_str.rstrip()}\n"

        folder_uid = entry.find('folder_uid').text
        if folder_uid and folder_uid in folders:
            front_matter += f"folder: {folders[folder_uid]}\n"

        # location_uid = entry.find('location_uid').text
        # if location_uid and location_uid in locations:
        #     front_matter += f"location: {locations[location_uid]}\n"

        location_element = entry.find('location_uid')
        location = location_element.text if location_element is not None and location_element.text else "N/A"

        # Look up the entry's location; stays None if the entry has no
        # location or its uid is not in locations.xml
        location_obj = None
        if location != "N/A":
            location_obj = next((loc for loc in parsed_locations if loc.uid == location), None)

        if location_obj is not None:
            front_matter += f"location: {location_obj.lat},{location_obj.lng}\n"
            front_matter += f"location_title: {location_obj.title}\n"
            front_matter += f"location_address: {location_obj.address}\n"
        else:
            front_matter += "location: Unknown\n"

        if mood:
            front_matter += f"mood: {mood}\n"

        tags_text = entry.find('tags').text
        if tags_text and tags_text.strip(',') :
            tags = [tag.strip() for tag in tags_text.split(',') if tag.strip()]
            front_matter += f"tags: {tags}\n"

        # mood_uid = entry.find('mood').text
        # if mood_uid and mood_uid in moods:
        #     front_matter += f"mood: {moods[mood_uid]}\n"

        weather_temp = entry.find('weather_temperature').text
        weather_icon = entry.find('weather_icon').text
        weather_desc = entry.find('weather_description').text

        if weather_temp:
            front_matter += f"temp_c: {weather_temp}\n"
        if weather_icon:
            weather_data = weather_icon.split("-")

            time_of_day = weather_data[0]
            if time_of_day == "day":
                front_matter += "time_of_day: day\n"
            elif time_of_day == "night":
                front_matter += "time_of_day: night\n"
            else:
                front_matter += "time_of_day: unknown\n"

            weather_icon = weather_data[1:]
            if weather_icon:
                if 'alt' in weather_icon:
                    weather_icon.remove('alt')
                front_matter += f"weather: [{', '.join(weather_icon)}]\n"

        if weather_desc:
            front_matter += f"forecast: {weather_desc}\n"

        if "-----BEGIN PGP MESSAGE-----" in text:
            front_matter += "encrypted: true\n"
        else:
            front_matter += "encrypted: false\n"

        front_matter += "---\n\n"

        converted_entry = front_matter + text + attachments + "\n"

        # Write markdown file
        filepath = os.path.join(output_dir, filename)
        with open(filepath, 'w', encoding='utf-8') as f:
            f.write(converted_entry)
        markdown_files.append(filepath)


if __name__ == "__main__":
    write_files()
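
The script repeats the same "find the child, check for None, fall back to a default" pattern for nearly every field. One possible cleanup, sketched below, is a pair of small helpers. The names `child_text` and `child_number` are hypothetical and not part of the script above. The helpers also cover the case where a tag exists but is empty (for example `<title/>`), where `.text` is `None`.

```python
import xml.etree.ElementTree as ET


def child_text(element, tag, default=""):
    """Return the text of the first <tag> child, or default if missing or empty."""
    child = element.find(tag)
    if child is None or child.text is None:
        return default
    return child.text


def child_number(element, tag, cast, default):
    """Like child_text, but convert with cast (float or int); fall back on bad input."""
    try:
        return cast(child_text(element, tag, str(default)))
    except ValueError:
        return default


if __name__ == "__main__":
    # Invented row with a valid lat, an invalid zoom, an empty title and no lng
    row = ET.fromstring("<r><uid>loc1</uid><lat>40.7</lat><zoom>abc</zoom><title/></r>")
    print(child_text(row, "uid"))
    print(child_number(row, "lat", float, 0.0))
    print(child_number(row, "zoom", int, 0))
```

With helpers like these, each field in `parse_locations_xml` and `parse_attachments_xml` could shrink to a single call.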

AI version

I ran this through Gemini to see if it would emulate my style… eh…

I recently made the switch from [Diaro](https://diaroapp.com) to [Obsidian](https://obsidian.md/) for my personal diary. While I had a great experience with Diaro - their lifetime subscription was a steal at around $20, and their Dropbox sync never lost a single entry over three or four years. Diaro's feature to automatically tag location, time, and date from uploaded photos was incredibly convenient, and I'm hoping to find a similar workflow in Obsidian.

## Extracting the Data

Getting my data out of Diaro proved to be the trickiest part. Diaro offered various export options (CSV, text, PDF), but these varied by platform (mobile vs. web) and none provided an easy way to export photos alongside the entries. Ripping images from PDFs was an absolute nightmare.

Fortunately, I dug into the Dropbox backend Diaro used for syncing. There, I discovered XML files containing all my entries and, more importantly, all my photos as JPEGs. The XML structure was relatively sane, with attachments, locations, and the diary entries themselves all contained within one large document, which made parsing manageable.

## Custom Scripting for Conversion

To get everything into a standardized Markdown format with YAML front matter and embedded images, I cooked up some Python magic. This conversion process took me a solid two to three days over a weekend. While not "crazy" in terms of effort, I do wish Diaro offered more robust export options – perhaps JSON or CSVs with easily linked attachments, or even a dedicated photo dump.

That said, for a $20 lifetime license and the peace of mind that came with never losing a single post (kudos to Dropbox for the reliable backend!), I can't really complain too much. The Diaro app itself was consistently reliable across all my devices, and the website was solid.

I'll include a link to the Python script, or perhaps embed it directly in this post, just in case it can help someone else migrate their notes from Diaro to Obsidian or another Markdown-based diary application down the road.

Thanks for reading!
