---
slug: "nextcloud-external-edits"
title: "Getting Nextcloud to play nice with external file changes"
date: "2025-02-13"
modified: "2025-02-13"
description: "An exciting* tale of getting Nextcloud to handle externally changed files gracefully. With detours through docker wrangling and inotify"
---
You may find yourself hosting a home server.
And you may find yourself wanting to put a bunch of storage in it and use it as a NAS which you access through SMB or NFS on your LAN.
And you may ask yourself, "Well, how can I access these files through a web interface when I'm not at home, and share them with other people when needed?"
And then you land on Nextcloud.
Or at least I did. And I've been using Nextcloud happily as a self-hosted cloud storage solution, using the External Storage plugin to access files from a directory that I'm also exposing to my network over NFS and SMB. There's one tiny little problem though: Nextcloud doesn't particularly like it when you change the files it manages outside of Nextcloud. There's a setting that automatically rescans external folders for changes when you browse to them through the web interface or the mobile apps. That fixes it then, right? End of article and all is good in the world.
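(For reference, that's the per-mount "check for changes" setting on the external storage. If I have the names right, you can also flip it from the command line; a quick sketch, assuming the mount in question has ID 1:)

```bash
# list external storage mounts and their IDs
occ files_external:list

# 1 = "once every direct access", 0 = never
occ files_external:option 1 filesystem_check_changes 1
```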
Except it doesn't account for the Nextcloud desktop client. When polling for server updates, whatever API endpoints the desktop app is using aren't rescanning the folders. So when you use the desktop client to sync files to your machine, it won't pick up the remote changes you made directly through the NAS side of things (until you browse to the folder using the web interface, but that's not particularly convenient.)
(TL;DR: there's a git repo in the "Takeaways" section at the bottom of the post.)
## The status quo
I've attempted several solutions to this problem, the two main ones being external syncing software like Syncthing, and accessing the SMB share over my VPN. SMB doesn't like an unstable connection, so using it over a VPN under sometimes less-than-great network conditions wasn't particularly reliable. Syncthing worked alright (when I used it a couple of years back), but it lacks features the official client has, such as creating Nextcloud share links directly from Windows Explorer, and the virtual drive mode. The virtual drive client is quite neat, because it lets me browse all of my files in Explorer without having to sync them until I actually need to interact with them.
I accepted the situation as it was for a while. I've been using the Nextcloud app on android to auto-upload my pictures (one-way, so no problems there). For my notes I've been using the Remotely Save plugin for Obsidian using Nextcloud's WebDAV API (so not editing the files externally) and more or less just not syncing anything else at all, mostly doing file editing on the go through Nextcloud's web interface.
## Amnesia
But then, on a fateful day in December, a little syncing conflict ended up causing the mobile version of my Scratchpad note (in which I had put our location in the huge parking garage under the CCH convention center in Hamburg) to be overwritten by the version that was on my laptop, which contained a little reminder about something to look up later at home. Remotely Save obviously couldn't know which was more important to me. This, however, did not change the fact that we spent 20 minutes looking for my car. Little conflicts like this (which it just throws out) have caused me some minor memory loss in the past. My ADHD brain can't remember things, so all of my actually useful memory has been outsourced to computers. If I lose data, it's basically amnesia to me.
This little conflict got me lamenting the fact that I can't just use the official Nextcloud syncing clients to a friend who was around at the time. Their response boiled down to "why don't you just do everything through the Nextcloud clients and stop editing your files externally".
I'm very spite-driven, so this statement obviously triggered me to finally figure this out once and for all.
## OCC
Nextcloud comes with a CLI tool called OCC. This talks to the server locally and lets sysadmins trigger a whole bunch of commands, mostly for server management and maintenance. It's a great help to get you out of a pinch you can't solve through the web UI, like the time I set up OIDC wrong and it stripped the admin role from my admin user, locking me out of the server (oops).
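If you've never touched it: OCC runs from a shell inside the container (the linuxserver image ships an `occ` wrapper, which is why the script later in this post can call it bare; on a manual install it's usually `sudo -u www-data php occ`). Something like:

```bash
# "nextcloud" is whatever your container happens to be called
docker exec -it nextcloud bash

occ status                       # version, install state, maintenance mode
occ group:adduser admin riksolo  # how you un-lock-yourself-out after an OIDC mishap
```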
There's a command in OCC that is particularly interesting here, though: `occ files:scan`. This can be used to rescan all of the files that belong to a user for changes. Now, running this on my complete collection of files (~600,000 files at this point) would take over an hour and be quite CPU-intensive, so that's not something I can just put on a 5-minute cronjob and call it a day.
I was more or less ready to start thinking of different solutions (or give Syncthing another go, it's been a year or two after all), but then, in my greatest time of need, I happened to remember inotify. If you're not aware, inotify is a Linux Kernel API that lets you monitor files in your filesystem for changes, and then make things happen. This can be combined quite nicely with `files:scan`'s `--path` option to only scan single files when they are updated. Rescanning a single file isn't particularly resource intensive and only takes about a second on my system. All seems good, right? Except I ran into the next problem: OCC only runs locally.
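To make the difference concrete (the paths here are just examples from my own setup):

```bash
# rescan everything belonging to one user -- takes over an hour at ~600,000 files
occ files:scan riksolo

# rescan a single changed file, without recursing into anything
occ files:scan --path="/riksolo/files/Documents/scratchpad.md" --shallow
```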
You may think "well, just write a little script to set up an inotify watcher that runs the OCC command and you're off to the races". And you'd be right, if I didn't run all of my services in Docker containers.
## Docker Wrangling
If you know Docker, you probably know that it mainly operates on the principle of one (main) process per container. If you don't know Docker, I envy you, but that's all you need to know for now.
Now, this is a fine and dandy paradigm and usually what you want when you're containerizing. If you want to have multiple things going on, you set up separate containers and let them talk to each other using Docker's internal networking. The problem is, I can't call OCC through a network call to Nextcloud's API. I _need_ to run these commands inside the container.
My first thought was writing a separate little container to watch the files, give it access to the host's docker socket file, and let it use `docker exec` to run commands on the Nextcloud container. The problem is, giving a container access to docker.sock gives it access to _everything_ docker is doing on the system. That's quite a lot, and even if the container should be quite secure (since all it does is watch files and run a script), it's still a pretty bad practice.
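Roughly, that version would have looked something like this, running on the host or in a sidecar container with the docker CLI and `/var/run/docker.sock` mounted (container name and paths are made up, and it naively assumes filenames without spaces):

```bash
# watch the storage on the host side...
inotifywait -mr -e close_write,delete /mnt/storage/riksolo |
while read -r DIR EVENTS FILE; do
    # ...and reach into the Nextcloud container to trigger the rescan
    docker exec nextcloud occ files:scan \
        --path="/riksolo/files${DIR#/mnt/storage/riksolo}${FILE}" --shallow
done
```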
My next thought was to hijack the crontab that Nextcloud was already using to run its scheduled jobs. I mounted the crontab file to the host and messed around adding some extra lines, but getting these to run ended up becoming a lot more of a hassle than expected.
While I was poking around the running Nextcloud container, it became very clear that there were, in fact, quite a few processes running. Poking around a little bit more, I saw some references to something called "s6-rc". An init system, in a Docker container. Not what I would have expected, but that could certainly be convenient.
A little Googling later, I figured out that s6-rc is part of s6-overlay, which is a set of scripts that helps you manage the lifecycle of a Docker container. It can actually be added quite easily to an existing image, but `linuxserver/nextcloud` already uses it for getting everything rolling.
Basically, s6-rc looks for service definitions in `/etc/s6-overlay/s6-rc.d`. It expects a folder, with the name of your service being the folder name (I called mine svc-inotify). Your folder should then contain a file named `run`, which is a bash script, and a plaintext file called `type` which contains either "oneshot" (for initialization tasks) or "longrun" (for daemons).
Additionally, I've added a `dependencies.d` subfolder containing a blank file named `init-services` to make sure that my inotify script runs after all of the other init steps, so that it only starts watching files when Nextcloud is running.
After that, all you have to do is add a blank file with the name of your service to `/etc/s6-overlay/s6-rc.d/user/contents.d`, and your `run` script will now happily get triggered whenever the container starts.
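Put together, the whole service definition is just a handful of files. Sketched as shell commands (with `watch-files.sh` standing in for the actual watcher script from the next section):

```bash
SVC=/etc/s6-overlay/s6-rc.d/svc-inotify

mkdir -p "$SVC/dependencies.d"
echo 'longrun' > "$SVC/type"               # it's a daemon, not a oneshot
touch "$SVC/dependencies.d/init-services"  # wait for the normal init to finish
cp watch-files.sh "$SVC/run"               # the script from the next section
chmod +x "$SVC/run"

# register the service so s6-rc actually starts it
touch /etc/s6-overlay/s6-rc.d/user/contents.d/svc-inotify
```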
## Actually doing the magic
After getting my test script to happily write "hello-world.txt" to the root of the container whenever it started, all that was left to do was to write a little script to do what we came here to do in the first place: watch files for changes and tell Nextcloud to rescan them. Going into this, I expected that to be the hardest part (and this whole thing to be a little one hour project). After all the Docker shenanigans, it actually turned out to be the simplest bit.
That script turned out like so. Do please excuse my messy bash, I (fortunately) don't write a lot of it.
```bash
#!/usr/bin/with-contenv bash
# shellcheck shell=bash

# Check if inotifywait exists. If not, install it.
if ! command -v inotifywait >/dev/null 2>&1; then
    echo "No inotifywait found, installing inotify-tools"
    apk add inotify-tools
fi

# Patterns that should be excluded from the watches (editor swap files).
BASEEXCLUDE='.*\.swp|.*\.swx'

# The main function where all of the magic happens. Takes 4 arguments:
# - The folder to watch.
# - Patterns to exclude.
# - Two arguments to replace things in the paths coming from inotify,
#   to turn them into a path Nextcloud understands.
detect(){
    WATCH=$1
    EXCLUDE=$2
    REPLACEIN=$3
    REPLACEWITH=$4

    echo "watching $WATCH and excluding $EXCLUDE"

    # This uses inotifywait to watch for changes,
    # then runs the code in the while block to
    # handle the file paths it spits out.
    inotifywait -mre close_write,delete \
        --format '%e|%w|%f' \
        --exclude "$EXCLUDE" \
        "$WATCH" | while read -r RAWEVENT
    do
        # inotifywait spits out the events like
        # "eventtype|/directory/|filename",
        # here we split them into their three parts.
        IFS='|' read -ra SPLITEVENT <<< "$RAWEVENT"
        EVENT=${SPLITEVENT[0]}
        DIR=${SPLITEVENT[1]}
        FILE=${SPLITEVENT[2]}

        # By default, scan the whole directory:
        # we can't scan a file that's been deleted.
        REPLACEDFILE=$DIR
        # If the event wasn't a delete, we append the filename
        # so we only scan the file that was changed.
        if [[ $EVENT != 'DELETE' ]]; then
            REPLACEDFILE+=$FILE
        fi

        # Replace the beginning of the path:
        # inotify spits out the filesystem path,
        # Nextcloud expects the path relative to the user directory.
        if [ -n "$REPLACEIN" ] && [ -n "$REPLACEWITH" ]; then
            REPLACEDFILE=${REPLACEDFILE/$REPLACEIN/$REPLACEWITH}
        fi

        # Actually run the OCC command.
        occ files:scan --path="$REPLACEDFILE" --shallow
    done
}

# Run the scan function on the folders I want scanned, one watcher per folder.
# e.g. a write to /external/riksolo/Audio/song.mp3 becomes a scan of
# /riksolo/files/Audio/song.mp3
detect /external/riksolo/Audio "$BASEEXCLUDE" /external/riksolo /riksolo/files &
...
detect /external/riksolo/Videos "$BASEEXCLUDE" /external/riksolo /riksolo/files
```
## Takeaways
This setup actually works really well for my purposes. It does what I set out to do, and it works quite well for my small, personal Nextcloud instance. It takes a few minutes for inotify to set up the watchers for all of the files I'm telling it to watch, which indicates to me that this wouldn't scale particularly well to larger instances. On the other hand, I think the use case of being able to access files both through SMB/NFS and the cloud web interface is mostly a home-server thing anyway.
I did end up limiting myself to only setting up the watchers for the base folders where I actually need them. inotifywait has to spend a teeny tiny bit of time crawling every file and directory to set up the watches. My external storage includes things I don't need to sync with the official clients but that amount to a lot of files, such as a Steam library and some game server data. The total directory is over 600,000 files, and when I tried to watch all of it I ended up killing the process after giving it 45 minutes. The folders I am watching still easily add up to some tens of thousands of files, though.
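One related thing worth knowing if you point this at a big tree: the kernel caps the number of inotify watches per user, so you may also have to raise that limit. This is a host-level setting, so it goes on the Docker host, not in the container; roughly (the exact value is arbitrary):

```bash
# check the current per-user watch limit
sysctl fs.inotify.max_user_watches

# raise it persistently
echo "fs.inotify.max_user_watches=1048576" >> /etc/sysctl.d/99-inotify.conf
sysctl --system
```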
Additionally, I was happy to see that the hardware load of scanning the files is actually more or less negligible on my little homelab box (a 6-core Ryzen 5 1600 running at 3.2 GHz). Even running a test where I copied over a large folder with a bunch of images didn't make the htop graphs flinch with any kind of significance.
If you want to implement this for your own Nextcloud instance, I set up [a little example repo here](https://git.riksolo.com/RikSolo/nextcloud-docker-inotify). I obviously don't take any responsibility if anything goes wrong with your data, or if you get fired because you broke your day job's Nextcloud instance. That said, in the few weeks between setting this up and writing this post, my server hasn't spontaneously caught fire yet!
It'd sure be nice if Nextcloud would just rescan the files when it's trying to sync them using the desktop client, though...
