Archiving web content is important, not only to have an exact copy of the content as it was when you read it, but also as a means of preserving information in an ever-volatile world.
In the past I was a huge fan of del.icio.us, a social bookmarking and tagging service. Syncing my bookmarks across different devices felt like the best choice – until I revisited some of the links and found that many of them were dead. Ever since then I have been trying to keep an archived copy of the sites around.
Personally, I still think that historious is an absolutely awesome service and I happily pay the subscription fee every year. historious handles more complex sites like reddit or YouTube perfectly and is a great way to quickly archive, tag and search – basically my old del.icio.us workflow but on steroids.
However, this still means you rely on someone else to host the service for you. And given how volatile the web is, historious might just disappear tomorrow. The answer: host your own application to do the job. But which one?
Over the course of 2023 and 2024 I evaluated several solutions with very different ideas about how to archive a website and – more importantly – what to archive from it.
I always come back to what I personally consider the best software around: Shiori.
While Shiori has some issues with reddit posts, it handles classic forum posts just fine. The content extractor can also archive PDF files and news articles from pretty much all the websites I ever visit. Unlike with solutions such as Obsidian’s Web Clipper, which often picks the wrong content, I can be confident that I do not need to double-check the archive of everything I push into Shiori.
Shiori also has a basic Manifest V2 browser extension that allows tagging and saving content with two clicks – absolutely awesome and all you really need! It does one thing and it does it well.
You can decide on a post-by-post basis whether an archive is publicly available, and the user interface works great on mobile devices too.
Unlike Readeck or Wallabag, Shiori extracts the content I want. This might not sound like a big deal to you, but for me it is incredibly important to have a solution I can depend on. And unlike with MyBase, the content stays intact when it is archived.
Over a long period of time I have come to the realization that Shiori is simply the best fit for my use case. The out-of-the-box experience is smooth, and the software stays out of your way. I cannot recommend it highly enough.