Corrupt empty folders on remote... what to do?

While using the remotestorage.js client, with a 5apps account, I sometimes end up with a folder that has all its contents deleted, but the empty folder remains with some partial metadata. It happens rarely, so I don’t have steps to reproduce, but it’s pretty disruptive because it results in a path prefix being unusable.

If I try to click one of these folders in Inspektor.5apps.com, I’ll get an exception like:

Error while processing route: index Folder description at https://storage.5apps.com/myuser/myapp/shouldBeDeletedFolder/ is not JSON

Then I can’t store to a subpath of /shouldBeDeletedFolder/ until this resolves… which, after a long time, it mysteriously does (a sweeper, maybe?).

I’m hoping for advice on a workaround or on how to debug this. I didn’t find a relevant issue in the remotestorage.js or remotestorage-server GitHub repos.

I’m not sure I can follow entirely. Let’s try to unpack it a bit:

I sometimes end up with a folder that has all its contents deleted, but the empty folder remains with some partial metadata.

This should obviously not happen just out of nowhere, and I don’t know of any code at 5apps that would be able to delete the content of folders, other than the one that runs cleanup jobs after someone cancels their whole account.

Folders in RS are just paths really. As per the spec it’s not possible to DELETE a folder path. You have to send DELETE requests for all the documents it contains, and then the folder will disappear from the parent folder’s item list.
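To illustrate the point about deleting documents rather than folders, here is a minimal sketch of how a client could empty a folder. It assumes a client object exposing `getListing(path)` and `remove(path)` (remotestorage.js's BaseClient offers both); the `ListingClient` interface and the `removeFolderContents` helper are names made up for this example, not part of the library.

```typescript
// Minimal subset of a remoteStorage client that this sketch relies on.
// (Hypothetical interface; BaseClient in remotestorage.js is assumed to fit it.)
interface ListingClient {
  getListing(path: string): Promise<unknown>;
  remove(path: string): Promise<unknown>;
}

// Recursively DELETE every document below `folder` (which must end in "/").
// The folder itself is never deleted; it simply stops being listed once empty.
// Returns the document paths that were removed.
async function removeFolderContents(
  client: ListingClient,
  folder: string
): Promise<string[]> {
  // An empty folder may come back as {} or as undefined, so default to {}.
  const listing = ((await client.getListing(folder)) ?? {}) as Record<string, unknown>;
  const removed: string[] = [];
  for (const name of Object.keys(listing)) {
    const path = folder + name;
    if (name.slice(-1) === '/') {
      // Subfolder: descend and remove its documents first.
      const nested = await removeFolderContents(client, path);
      removed.push(...nested);
    } else {
      await client.remove(path);
      removed.push(path);
    }
  }
  return removed;
}
```

If caching is enabled on the client, the listing may be served from the local cache, so treat the returned list as what the client believed was there rather than as confirmation from the server.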

So, the question would be which “partial metadata” is left in this situation, and if the documents are actually gone, or if maybe the folder listing is broken, i.e. it doesn’t contain the items, while the items still exist.

Then I can’t store to a subpath of /shouldBeDeletedFolder/ until this resolves

What does “this resolves” mean exactly? You should always be able to store any file to any path in the tree basically, no matter if the folder “exists” beforehand or not.


Edit: we (at 5apps) did see some connection issues between our RS API and Redis host recently, so maybe this could have something to do with potentially broken folder listings/metadata…

Sorry, I didn’t mean to suggest that any objects/files were deleted when they shouldn’t be. My program intentionally sent delete requests for all the objects/files under a path. Here is an actual browser console session:

Works as expected:

> await raw_rsclient.getListing('/never_used_path/')  
[requests fetch] Response {type: 'cors', url: 'https://storage.5apps.com/dustinw/chatgptweb/never_used_path/', redirected: false, status: 200, ok: true, …}
remotestoragejs.js?v=90824a05:225 [WireClient] Successful request 43f7c78e19b08750294798ed695b6e5c 
< {}

An apparently-corrupt path:

> await raw_rsclient.getListing('/approxModtime4/')  
remotestoragejs.js?v=90824a05:225 [requests fetch] Response {type: 'cors', url: 'https://storage.5apps.com/dustinw/chatgptweb/approxModtime4/', redirected: false, status: 500, ok: false, …}
remotestoragejs.js?v=90824a05:225 [WireClient] Successful request null
Uncaught Folder description at https://storage.5apps.com/dustinw/chatgptweb/approxModtime4/ is not JSON

getAll('/approxModtime4/') fails with the same error. storeObject and getObject are working for me now for paths rooted at the problematic /approxModtime4/, so I may have been mistaken when I said I can’t store to a subpath of such a path.
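As a possible app-side workaround, the listing call can be wrapped so that a broken folder does not abort the whole sync. This is only a sketch: `tryGetListing`, `ListingSource`, and `ListingResult` are hypothetical names, and the only library call assumed is `getListing(path)` as used in the console session above.

```typescript
// Only the call this sketch needs; a remotestorage.js client is assumed to fit.
interface ListingSource {
  getListing(path: string): Promise<unknown>;
}

// Result of a listing attempt. On failure, `items` is empty and `error` is set.
interface ListingResult {
  ok: boolean;
  items: Record<string, unknown>;
  error?: string;
}

// Fetch a folder listing without letting a rejected request propagate.
async function tryGetListing(
  client: ListingSource,
  folder: string
): Promise<ListingResult> {
  try {
    const listing = await client.getListing(folder);
    return { ok: true, items: (listing ?? {}) as Record<string, unknown> };
  } catch (err) {
    // The rejection in the session above appears to be a plain string,
    // so handle both strings and Error objects.
    const error = err instanceof Error ? err.message : String(err);
    return { ok: false, items: {}, error };
  }
}
```

When `ok` is false, the app could skip that folder's listing and fall back to `getObject` on paths it already knows about, since those calls still work under the affected folder.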

That folder is named /approxModtime4 because /approxModtime3, /approxModtime2, etc. were previously corrupted, and I’ve just been kicking this problem down the road while focusing on other dev tasks. The problems with those paths later “resolved” at some point (I don’t know why), in the sense that they no longer show up in getListing('/'), and getListing('/approxModtime3/') no longer errors.

Maybe the Redis issue resulted in an incomplete execution of the workflow for removing metadata associated with the folder after all the documents under it are deleted?

Anyway, thanks so much for your help and providing the service!

This explains it! The request to the folder failed with a 500 status (internal server error), and I also just found the respective failed requests from our API server to our Redis server.

It looks like metadata was corrupted indeed, because I can reproduce the failed response without current connection issues.

Not that it helps you immediately, but at first glance it seems like a Last-Modified date is missing on some document’s metadata. This points to an unknown bug in Liquor Cabinet (maybe related to handling failed PUT requests), to which I have to say congrats for finding it, because we haven’t seen one in a very long time! :sweat_smile:

I’m going to have a closer look and see what’s going on exactly. And we should probably also make it easier to handle failed folder listing requests in rs.js, too.

@dustin I found the exact issue and fixed it for the two of your folders that were affected. The item was still stored in the folder’s item list (a set of paths in Redis), while the item metadata key had been deleted.

Would you mind sharing the code that leads to the situation (in private if you want)? I’m a little perplexed as to how this would fail repeatedly in the same way, and not just be a wild coincidence happening a single time exactly at that point in the code/process.

The item was still stored in the folder’s item list (a set of paths in Redis), while the item metadata key had been deleted.

What is “item” in this context? A leaf/file path under what I called a “corrupt folder” (e.g. /approxModtime4/)?

I haven’t tried to reproduce the bug yet; maybe that would be better than dumping the whole codebase on you. It’s a fork of Niek/chatgpt-web on GitHub (a ChatGPT web interface using the OpenAI API), where I’m adding a sync feature. It’s not ready to be made public, but I could share it with you privately.

Yes, it’s just path/filename that is stored in a set/list in Redis by Liquor Cabinet, so it’s more efficient to assemble folder listings. It’s not really relevant for the client-side issue, or a potential workaround in rs.js or in your app.
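To make the inconsistency concrete, here is a simplified in-memory model of it: a folder's item list on one side, per-item metadata on the other. The shapes and names (`FolderState`, `findOrphanedItems`) are invented for illustration and are not Liquor Cabinet's actual Redis schema.

```typescript
// Simplified stand-in for the two structures described above.
// (Hypothetical shapes; the real data lives in Redis keys managed by the server.)
interface FolderState {
  // Item names recorded in the folder's item list.
  items: string[];
  // Per-item metadata, keyed by item name.
  metadata: Record<string, { lastModified?: string }>;
}

// Return the items that are still listed in the folder but whose metadata
// is missing or lacks a Last-Modified value.
function findOrphanedItems(folder: FolderState): string[] {
  return folder.items.filter((name) => {
    const meta = folder.metadata[name];
    return meta === undefined || meta.lastModified === undefined;
  });
}

// Example: "b.json" is still listed, but its metadata entry is gone.
const example: FolderState = {
  items: ['a.json', 'b.json'],
  metadata: { 'a.json': { lastModified: 'Tue, 02 Jan 2024 10:00:00 GMT' } },
};
const orphans = findOrphanedItems(example); // ['b.json']
```

A folder in this state cannot be rendered as a complete listing, which would be consistent with the 500 response seen for the affected path.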

Reproducible bugs are ideal, of course! But feel free to DM me any code snippets here, also if you haven’t reproduced it yet.