hhmx.de

Föderation · Fr 14.03.2025 02:04:27

A list of Digital Service Providers outside the jurisdiction of the United States of America.

https://codeberg.org/Linux-Is-Best/Outside_Us_Jurisdiction

This is a group project, so feel free to reach out if you have any suggestions, or learn any new information.

Föderation EN Fr 14.03.2025 02:27:41

@Linux @grrrr_shark I believe the Mastodon hosting service @mastohost is based in Europe.

Föderation EN Fr 14.03.2025 09:15:24

@Linux I wouldn't call Masto.host a reseller of OVH 😊 but yes Masto.host uses OVH to host the server infrastructure.

But it also relies on other services, namely bunny.net/ for CDN and cloudns.net/ for DNS, which are also in Europe.

@fuzzface @grrrr_shark

Föderation EN So 16.03.2025 10:25:59

@mastohost uses OVH object storage based on Swift, right? Did you cross 1 PB yet? Does OVH perform deduplication for byte-wise identical files? That would be a significant environmental advantage! (Cf. jortage.com/ .)

Föderation EN So 16.03.2025 10:38:26

@nemobis Yes, Masto.host uses OVH Swift Object Storage but it's not at a PB. There is daily remote media cache deletion for everyone hosted on Masto.host masto.host/re-mastodon-media-s so the media storage has not been growing exponentially.

Regarding deduplication, I have never found any OVH documentation that mentions it.

Föderation EN So 16.03.2025 11:17:14

@mastohost Thanks! I've now asked on the cloud@ml.ovh.net mailing list: the subject is "Swift object storage and deduplication", can't remember whether the archives are public.

I can find a lot of documentation on how rings use file hashes, but nothing on how those hashes are created in the first place.

docs.openstack.org/swift/lates
docs.openstack.org/security-gu

Föderation EN So 16.03.2025 11:29:11

@nemobis I know they created a custom solution for OpenStack Swift storage:

blog.ovhcloud.com/dealing-with

blog.ovhcloud.com/dealing-with

Don't know if this is still in place or if there is something else they are doing now.

Föderation EN So 16.03.2025 11:30:58

@mastohost As of July 2024, storage.sbg.cloud.ovh.net was based on Swift, according to a message in the mailing list from an OVH engineer "Re: [cloud] High Latencies and high error rate - SBG Swift Object Storage".

Föderation EN Sa 22.03.2025 13:35:35

@mastohost An OVH engineer answered that "On our object storage swift, the hashing mechanism is based on the account/bucket/path rather than the file's content." So there is no deduplication.
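
A minimal sketch of why hashing on the name rules out deduplication. This is not OVH's or Swift's actual ring code: the helper names `path_key` and `content_key`, the sample tenants, and the choice of MD5 and SHA-256 are assumptions for illustration only.

```python
import hashlib

def path_key(account: str, bucket: str, path: str) -> str:
    """Placement key derived only from the object's name (account/bucket/path)."""
    name = f"/{account}/{bucket}/{path}"
    return hashlib.md5(name.encode("utf-8")).hexdigest()

def content_key(data: bytes) -> str:
    """Key derived only from the object's bytes (what deduplication would need)."""
    return hashlib.sha256(data).hexdigest()

# The same cached media file stored by two hypothetical tenants.
media = b"the same cached image bytes"
key_a = path_key("tenant-a", "media", "cache/1.png")
key_b = path_key("tenant-b", "media", "cache/1.png")

# Name-based keys differ, so the store sees two unrelated objects,
# even though a content-based key would be identical for both copies.
names_differ = key_a != key_b
content_matches = content_key(media) == content_key(bytes(media))
```

With name-based keys the store never learns that the two objects hold the same bytes, which matches the engineer's answer.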

Föderation EN Sa 22.03.2025 13:55:02

@shlee @nemobis The main problem I have with these solutions is that when someone wants to move out of Masto.host I would have no simple way to provide them with a backup of the cached media files.

So, deduplication on the actual object storage filesystem would be much better.

Föderation EN Sa 22.03.2025 13:57:33

@mastohost @nemobis good point.. I'll raise a ticket for that situation.

Föderation EN Sa 22.03.2025 14:03:09

@shlee There are other issues, but that is the main one :P

For example, say a media file gets cached and you want to delete it because it contains illegal content or something else you don't want to host. You would then need to go through the databases one by one and delete every reference to that file. It's probably impossible unless you store the hash that each proxied file generates, or something like that.

There are more edge cases like that if I spend some time thinking about it.

@nemobis

Föderation EN Sa 22.03.2025 14:04:43

@mastohost @nemobis this is a good point. I figure you would need an api endpoint to ban a hash?

then it should clean itself and stop future uploads.
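
A rough sketch of the "ban a hash" idea, assuming an in-memory index rather than any real proxy or API. The class `MediaIndex` and its methods are hypothetical names, not part of Mastodon, jortage, or OVH.

```python
import hashlib

class MediaIndex:
    """Hypothetical index: content hash -> set of (tenant, path) references."""

    def __init__(self):
        self.refs = {}       # digest -> {(tenant, path), ...}
        self.banned = set()  # digests that must never be stored again

    def upload(self, tenant, path, data):
        """Record a reference; return the digest, or None if the content is banned."""
        digest = hashlib.sha256(data).hexdigest()
        if digest in self.banned:
            return None
        self.refs.setdefault(digest, set()).add((tenant, path))
        return digest

    def ban(self, digest):
        """Ban a digest and return every reference that has to be cleaned up."""
        self.banned.add(digest)
        return self.refs.pop(digest, set())

index = MediaIndex()
bad = index.upload("tenant-a", "cache/x.png", b"unwanted bytes")
index.upload("tenant-b", "cache/y.png", b"unwanted bytes")
removed = index.ban(bad)
```

Because the index is keyed by hash, one `ban` call finds the references for every tenant at once instead of searching each database separately.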

Föderation EN Sa 22.03.2025 14:11:09

@shlee @mastohost @nemobis keep in mind you don't just want cryptographic hashes but also perceptual hashes for moderation.

(I'm actually going to prototype something in the T&S side of this space soon)
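
To illustrate the difference, here is a toy "average hash", one simple kind of perceptual hash. It is only a sketch and not the prototype mentioned above: the pixel grids are invented sample data, and real tools first downscale and grayscale the image.

```python
def average_hash(pixels):
    """Build a bit string: 1 where a pixel is brighter than the image mean.

    `pixels` is a list of rows of grayscale values (0-255), assumed to be
    already downscaled to a small grid.
    """
    flat = [value for row in pixels for value in row]
    mean = sum(flat) / len(flat)
    bits = 0
    for value in flat:
        bits = (bits << 1) | (1 if value > mean else 0)
    return bits

def hamming(a, b):
    """Number of differing bits between two hashes."""
    return bin(a ^ b).count("1")

original = [[10, 200], [220, 30]]
recompressed = [[12, 198], [223, 28]]   # slightly altered copy
different = [[200, 10], [30, 220]]      # unrelated image
```

A cryptographic hash would treat `original` and `recompressed` as unrelated, while the average hash of both is the same; matching is then done by Hamming distance rather than equality.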

Föderation EN Sa 22.03.2025 14:13:42

@shlee @mastohost @nemobis though for media management, you could probably do an object storage that is content-addressed under the hood, such that file paths are just a reference to the content hash
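
A minimal in-memory sketch of that idea, assuming SHA-256 as the content hash. `ContentAddressedStore` is a hypothetical name; a real store would keep blobs on disk and the path table in a database.

```python
import hashlib

class ContentAddressedStore:
    """Paths are references; the bytes are stored once per content hash."""

    def __init__(self):
        self.blobs = {}  # content hash -> bytes
        self.paths = {}  # file path -> content hash

    def put(self, path, data):
        digest = hashlib.sha256(data).hexdigest()
        previous = self.paths.get(path)
        self.blobs.setdefault(digest, data)  # stored only on first sight
        self.paths[path] = digest
        if previous is not None and previous != digest:
            self._collect(previous)
        return digest

    def get(self, path):
        return self.blobs[self.paths[path]]

    def delete(self, path):
        digest = self.paths.pop(path)
        self._collect(digest)

    def _collect(self, digest):
        # Drop the blob once no path references it any more.
        if digest not in self.paths.values():
            del self.blobs[digest]

store = ContentAddressedStore()
store.put("tenant-a/1.png", b"same bytes")
store.put("tenant-b/1.png", b"same bytes")
blob_count = len(store.blobs)
```

Each tenant still sees its own paths, so exporting one tenant's files is a walk over its entries in `paths`.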

Föderation EN Sa 22.03.2025 14:18:01

@thisismissem I have thought about that many times.

I never moved forward because when I started to consider the server resources necessary to maintain a large database, deal with proxies, etc., I was not sure if that would end up being more expensive and not bring any real-world advantage.

@shlee @nemobis

Föderation EN Sa 22.03.2025 14:22:35

@mastohost @thisismissem @nemobis (unless I am mistaken) jortage/s3-proxy does that... you have separate api tokens per tenant.. and they upload a file and get back a unique path... and the file path is written into the DB with a hash to a file

The file is uploaded to the backend if it's the first upload, or deleted from the proxy on the second upload.

The jortage dev was talking about a "rivet protocol".. so Mastodon instances could ask "do you have a file with this hash already?" before uploading, and if so just ask for a reference URI to the known file... to save an upload

(I really need a day or so to make an official end to end spec for a fediverse storage protocol/solution... then somebody could code it for me :annoyingdog: )
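
The "ask before uploading" step could look roughly like this. It is a sketch of the general idea only, not the actual rivet protocol or the jortage proxy: `DedupServer`, `store_media`, and the `media/<hash>` URI shape are all made up here.

```python
import hashlib

class DedupServer:
    """Hypothetical storage backend that can answer 'do you have this hash?'."""

    def __init__(self):
        self.objects = {}        # content hash -> bytes
        self.bytes_received = 0  # counts upload traffic for the demo

    def has(self, digest):
        return digest in self.objects

    def upload(self, data):
        digest = hashlib.sha256(data).hexdigest()
        self.bytes_received += len(data)
        self.objects[digest] = data
        return digest

def store_media(server, data):
    """Client side: hash locally, upload only if the server lacks the content."""
    digest = hashlib.sha256(data).hexdigest()
    if not server.has(digest):
        server.upload(data)
    return f"media/{digest}"  # reference URI to the known file

server = DedupServer()
first = store_media(server, b"abc")
second = store_media(server, b"abc")
```

The second call returns the same reference without sending the bytes again, which is the saved upload.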

Föderation EN Sa 22.03.2025 14:13:23

@shlee Yes, something like that.

In the end, I am still not sure if server resources/costs (to deal with the proxy redirects) plus extra development and maintenance vs object storage resources/costs would be a net positive.

@nemobis

Föderation EN Sa 22.03.2025 17:26:53

@mastohost @shlee @nemobis Anyone with the token can just run an `mc mirror` or equivalent to leave with all their files.