There’s a corner case in my rework of the S3 backend for Armadietto: sometimes the Content-Type of a document will not be available when the rest of the metadata is. What’s the best value to use in the folder listing? application/octet-stream is the standard when the actual type is unknown, but I wonder whether the empty string, or omitting the Content-Type key from the JSON, would lead to a better user experience.
I think the correct solution would be to find out why it’s not available and then fix that, instead of introducing a new quasi-standard for when the software is unreliable. If the rest of the metadata is available, then it seems wrong to me that this specific property should be allowed to be missing.
Maybe we could help with the underlying issue?
The underlying issue is that the Content-Type of an S3 blob (“object”) is not returned in listings, which are used to generate the remoteStorage folder data. The ETag, Size [Content-Length] and Last-Modified are. It seems wrong to me, too, that the Content-Type is not returned, but that’s what Amazon has implemented, and I’m not aware of S3-compatible storage that does include it.
It’s possible to cache the Content-Type, but when not cached, one must make a HEAD request to get the Content-Type. Usually that will succeed, but on occasion it might fail, perhaps because requests are being made to S3 too fast. If the HEAD request fails, it’s not clear when another request would succeed, so the remoteStorage folder data must be returned without the Content-Type datum.
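To make the fallback concrete, here is a minimal sketch of that lookup order: cache first, then a HEAD request, and no value at all if the HEAD fails. It is not Armadietto’s actual code. The `head` parameter, the `resolveContentType` name and the in-memory `Map` are all hypothetical; `head` stands in for whatever S3 client call performs the HEAD request.

```typescript
// Hypothetical signature: resolves to the Content-Type reported by a HEAD
// request for `key`, or rejects if the request fails (e.g. throttling).
type HeadContentType = (key: string) => Promise<string>;

// In-memory cache for illustration only; a real backend might persist this.
const contentTypeCache = new Map<string, string>();

async function resolveContentType(
  key: string,
  etag: string,
  head: HeadContentType,
): Promise<string | undefined> {
  // Include the ETag so an entry is never reused after the object changes.
  const cacheKey = `${key}\n${etag}`;
  const cached = contentTypeCache.get(cacheKey);
  if (cached !== undefined) return cached;

  try {
    const contentType = await head(key);
    contentTypeCache.set(cacheKey, contentType);
    return contentType;
  } catch {
    // HEAD failed: report "unknown" rather than guessing a type.
    return undefined;
  }
}
```

Returning `undefined` instead of a placeholder such as application/octet-stream leaves the decision about what to show in the folder listing to the caller.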
Hmm, OK. I would generally not recommend creating folder listings from S3 requests directly, because that will be considerably slower than a local cache, even without content types.
Aside from it being slow, I would also be worried about consistency when fetching listings immediately after PUT requests.
At 5apps, we’ve been using S3-compatible storage for over a decade now, but we keep all metadata in Redis, and Liquor Cabinet can issue a single query to assemble the entire directory listing in one go and very quickly. Maybe it doesn’t have to be Redis, or any additional server process like that, but keeping some kind of local cache is probably the right choice when using any kind of object storage as a back-end.
… that said, the correct choice for an incomplete listing is probably to not include the property at all, since it’s currently a SHOULD in the spec.
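As a rough illustration of leaving the property out, the sketch below builds one folder item from an S3 listing entry and only adds Content-Type when a value is known. The `ListingEntry` shape and the `toFolderItem` name are hypothetical; the item keys follow the ones mentioned in this thread.

```typescript
// Hypothetical shape of one entry from an S3 listing.
interface ListingEntry {
  key: string;
  etag: string;
  size: number;
  lastModified: Date;
}

// One item of the folder listing; Content-Type is optional.
interface FolderItem {
  ETag: string;
  "Content-Length": number;
  "Last-Modified": string;
  "Content-Type"?: string;
}

function toFolderItem(entry: ListingEntry, contentType?: string): FolderItem {
  const item: FolderItem = {
    ETag: entry.etag,
    "Content-Length": entry.size,
    "Last-Modified": entry.lastModified.toUTCString(),
  };
  // Omit the key entirely when the type is unknown or empty,
  // instead of emitting "" or a guessed default.
  if (contentType) {
    item["Content-Type"] = contentType;
  }
  return item;
}
```

A client can then distinguish “type unknown” (key absent) from a document that really was stored as application/octet-stream.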
We have to keep this in mind when finally tackling remotestorage/remotestorage.js issue #1108, “client.getListing not returning full info”.
Not returning the field is what happens by default.