Edit: obligatory explanation (thanks mods for squaring me away)…
What you see via the UI isn’t “all that exists”. Unlike Reddit, where everything is a black box, there are a lot more eyeballs who can see “under the hood”. Any instance admin, proper or rogue, gets a ton of information that users won’t normally see. The attached example demonstrates that while users will only see upvote/downvote tallies, admins can see who actually performed those actions.
Edit: To clarify, not just YOUR instance admin gets this info. This is ANY instance admin across the Fediverse.
So any instance admin can analyze all users’ upvotes/downvotes and possibly derive political standpoints, likes/dislikes, opinions and location data from them?
Yes.
Just muddling around I’ve built queries that: (a) list all of my posts & comments, everybody who voted on them, and their votes; (b) tally how many times specific users have upvoted or downvoted me; (c) identify the most prolific voters across the Fediverse and the communities they are voting in; (d) identify users with the same username or display name across all instances and correlate the activity across those accounts.
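For anyone wondering what a query like (b) looks like, here is a minimal sketch. The `post` and `vote` tables, their columns, and the account names are all made up for illustration; a real Lemmy database is laid out differently, and this is not the query from the screenshot.

```python
import sqlite3

# Hypothetical, simplified schema; NOT Lemmy's actual table layout.
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE post (id INTEGER PRIMARY KEY, author TEXT, title TEXT);
    CREATE TABLE vote (post_id INTEGER, voter TEXT, score INTEGER);  -- +1 or -1
""")
conn.executemany("INSERT INTO post VALUES (?, ?, ?)", [
    (1, "me@example.social", "First post"),
    (2, "me@example.social", "Second post"),
])
conn.executemany("INSERT INTO vote VALUES (?, ?, ?)", [
    (1, "alice@one.example", 1),
    (1, "bob@two.example", -1),
    (2, "alice@one.example", 1),
    (2, "bob@two.example", -1),
    (2, "carol@three.example", 1),
])

def tally_votes_on(author):
    """Return (voter, upvotes, downvotes) for each account that voted on author's posts."""
    return conn.execute("""
        SELECT v.voter,
               SUM(CASE WHEN v.score > 0 THEN 1 ELSE 0 END) AS upvotes,
               SUM(CASE WHEN v.score < 0 THEN 1 ELSE 0 END) AS downvotes
        FROM vote v
        JOIN post p ON p.id = v.post_id
        WHERE p.author = ?
        GROUP BY v.voter
        ORDER BY v.voter
    """, (author,)).fetchall()

for row in tally_votes_on("me@example.social"):
    print(row)
```

The point is only that once per-voter rows sit in a table the admin controls, a tally like this is one join and one `GROUP BY` away.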
These are all for the sake of learning and are innocuous the way I’m using them. It is plain to see that someone with skills and an agenda could make more out of it than I have.
So you have raw database access and you can see that data. Why is this surprising? The systems I’ve used that store data encrypted take massive usability hits around exchanging and authenticating keys, to the point where it sucks so bad I just want to disable it (Matrix is a good example; no question their key exchange bullshit is hindering their adoption). I’m not saying this couldn’t be fixed, but should it? Most services that use a database will be in line with what you discovered about how Lemmy uses its database. Storing something encrypted that is meant to be viewed publicly is the same outcome with more steps. If someone cares enough to monetize it, they can just patch the code to change whatever behavior they don’t like. I haven’t seen anything about an acceptance test for Lemmy instances or anything that requires someone to run an unaltered version of Lemmy. How do you know the server admin isn’t already doing all of this? You don’t. Don’t expect privacy in public spaces.
So you can get the users voting on posts on other instances?
Could it be anonymized, so you can get exact up/downvote data from your instance, but when it comes to other instances you only get the absolute up/downvotes?
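A rough sketch of what that split could look like, assuming votes arrive as `(voter, score)` pairs with voters written as `name@domain`. The function name and `LOCAL_DOMAIN` are hypothetical and nothing here is existing Lemmy behavior. Note that the filtering would have to happen on the sending instance: a receiving admin controls their own code and can simply remove a filter like this.

```python
LOCAL_DOMAIN = "home.example"  # hypothetical name for "your" instance

def visible_votes(votes, local_domain=LOCAL_DOMAIN):
    """Keep per-user rows for local accounts; reduce remote accounts to bare totals.

    votes: iterable of (voter, score), voter like "name@domain", score +1 or -1.
    """
    local_rows = []
    remote_totals = {"up": 0, "down": 0}
    for voter, score in votes:
        domain = voter.rsplit("@", 1)[-1]
        if domain == local_domain:
            local_rows.append((voter, score))   # admin still sees who voted
        elif score > 0:
            remote_totals["up"] += 1            # identity dropped, count kept
        else:
            remote_totals["down"] += 1
    return local_rows, remote_totals

rows, totals = visible_votes([
    ("dana@home.example", 1),
    ("alice@one.example", 1),
    ("bob@two.example", -1),
    ("carol@three.example", 1),
])
print(rows, totals)
```

Even in this form the totals could leak information for low-traffic posts, where a single remote vote is easy to attribute.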
So you have raw database access and you can see that data. Why is this surprising? The systems I’ve used that store data encrypted take massive usability hits around exchanging and authenticating services, to the point where it sucks. I’m not saying this couldn’t be fixed, but should it? Most services that use a database will be in line with what you discovered about how Lemmy uses its database. Storing something encrypted that is meant to be viewed publicly is the same outcome with more steps. If someone cares enough to monetize it, they can just patch the code to change whatever behavior they don’t like. I haven’t seen anything about an acceptance test for Lemmy instances or anything that requires someone to run an unaltered version of Lemmy. How do you know the server admin isn’t already doing all of this? You don’t. Don’t expect privacy in public spaces.
You posted three times, may want to delete the extras. Did you press post multiple times?
It seems these multi-posts are typically coming from a user getting an error message when their post actually goes through, then they try posting again.
After I learned about that I’ve been bookmarking comments I want to reply to, copy my intended post in another document, then check later to see if what I wrote was actually posted. If yes, yay, don’t have to worry about multiposting. If no, I just post once the server isn’t being weird.
That is exactly what happened. I posted, it said network error, and it acted like I hadn’t submitted my comment. Rinse, repeat, and here we are. It also looks like they were auto-deleted, though? I don’t see them and I didn’t delete them.
Never mind, I found them and deleted them, and got the same network error while deleting. Lucky me, I picked lemmy.ml before the Reddit exodus.
To further this thought, it makes it really easy for any motivated party to profile accounts.
Create an account that posts intentionally politically motivated news or comments, then look at who upvotes and downvotes them.
Rinse and repeat a few times and now you have the data you want.
How is this different than any other website?
I can’t just spin up a website and automatically get that info from other websites, but I can spin up a lemmy instance and get that info from everyone it’s federated with.
I agree, someone has to store and maintain your data, but giving all instances access to it is a risk that could be avoided
How is it even possible to do a SQL query on the database from another instance?
Makes no sense, databases should be private and behind the HTTP API. Why is he showing a SQL query as evidence?
So I’ll assume this is done via the HTTP API then. If that’s the case, why does an instance need to see this information from other instances? By “need” I mean whether there’s an actual purpose for that info being exposed.
You don’t query another instance’s database.
When your instance is federated with another, your instance will sync a local copy of threads and interactions from that instance.
You then query your own database and instantly have access to everyone else’s interaction data.
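To make that concrete, here is a toy sketch of the receiving side. It assumes votes travel between instances as ActivityPub-style `Like` / `Dislike` activities that name the voting account in `actor`, which is my understanding of how Lemmy federates votes. The `vote` table, `receive_activity`, and the URLs are invented for the example and are not Lemmy’s actual code or schema.

```python
import json
import sqlite3

# This instance's OWN database; nothing here touches a remote server.
local_db = sqlite3.connect(":memory:")
local_db.execute("CREATE TABLE vote (actor TEXT, object TEXT, score INTEGER)")

def receive_activity(raw_json):
    """Store an incoming vote activity locally. Returns True if it was a vote."""
    activity = json.loads(raw_json)
    score = {"Like": 1, "Dislike": -1}.get(activity.get("type"))
    if score is None:
        return False  # not a vote; ignored in this sketch
    local_db.execute(
        "INSERT INTO vote VALUES (?, ?, ?)",
        (activity["actor"], activity["object"], score),
    )
    return True

# A remote user upvotes a post; their instance sends the activity to ours.
incoming = json.dumps({
    "type": "Like",
    "actor": "https://one.example/u/alice",    # hypothetical remote voter
    "object": "https://two.example/post/42",   # hypothetical post
})
receive_activity(incoming)

# Later, the local admin queries the local copy and sees who voted.
print(local_db.execute("SELECT actor, object, score FROM vote").fetchall())
```

The voter’s identity arrives as part of the message itself, so it ends up in the local copy without anyone ever querying the remote database.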
Wow. Off-topic, but that sounds inefficient for very large networks of instances. Sounds like the federation is doing more than it should.
Is there some place to learn about the federation protocol?