lemm.ee plans for mitigating image upload abuse
PriorProject (@PriorProject@lemmy.world) · Posts: 9 · Comments: 266 · Joined 2 yr. ago
Mod actions are public on Lemmy, here's the modlog of actions related to your account: https://lemmy.world/modlog?page=1&userId=1589367
The comment on these actions is:
reason: Please stop calling people pedophiles
The ban will expire in 3 days.
My money is also on IO. Outside of CPU and RAM, it's the most likely resource to get saturated (especially if using rotational magnetic disks rather than an SSD; magnetic disks are going to be the performance limiter by a lot for many workloads). It's also the one that OP said nothing about, suggesting it's a blind spot for them.
In addition to the excellent command-line approaches suggested above, I recommend installing netdata on the box as it will show you a very comprehensive set of performance metrics without having to learn to collect each one on the CLI. A downside is that it will use RAM proportional to the data retention period, which if you're swapping hard will be an issue. But even a few hours of data can be very useful and with 16gb of ram I feel like any swapping is likely to be a gross misconfiguration rather than true memory demand... and once that's sorted dedicating a gig or two to observability will be a good investment.
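For anyone who wants a quick sanity check on IO before installing anything, here is a minimal Python sketch of how a disk-busy percentage can be derived from two samples of Linux's /proc/diskstats. It relies on the tenth statistics field (milliseconds spent doing I/O). The sample lines and their values are made up for illustration, and this is not how netdata itself collects the metric.

```python
# Sketch: estimate how busy each disk was between two /proc/diskstats samples.
# Assumes the Linux diskstats layout: major, minor, name, then the stat fields.

def parse_io_ticks(diskstats_text):
    """Map device name -> milliseconds spent doing I/O (stat field 10)."""
    ticks = {}
    for line in diskstats_text.splitlines():
        parts = line.split()
        if len(parts) < 13:
            continue  # skip blank or unexpectedly short lines
        ticks[parts[2]] = int(parts[12])
    return ticks


def utilization(before_text, after_text, interval_seconds):
    """Percent of the interval each device spent busy, capped at 100."""
    before = parse_io_ticks(before_text)
    after = parse_io_ticks(after_text)
    interval_ms = interval_seconds * 1000.0
    result = {}
    for name, end in after.items():
        if name in before:
            busy_ms = end - before[name]
            result[name] = min(100.0, 100.0 * busy_ms / interval_ms)
    return result


# Synthetic samples, pretending they were taken 1 second apart.
SAMPLE_A = "8 0 sda 100 0 800 50 200 0 1600 400 0 1000 450"
SAMPLE_B = "8 0 sda 150 0 1200 80 300 0 2400 900 0 1950 980"
print(utilization(SAMPLE_A, SAMPLE_B, 1.0))
```

On a real box you would read /proc/diskstats twice with a sleep in between and pass both reads in. A disk that sits near 100% while the machine feels slow points at IO as the bottleneck.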
- Does this happen all the time or intermittently?
- If you retry does it work later?
- Can you link a post where it happens?
I just tested and it looks ok to me, but world is flaky sometimes. It could just be random timeouts from slowness that would self-resolve if you retry.
Tailscale is out, unfortunately. Because the server also runs Plex and I need to use it with Chromecast on remote access...
I rather suspect you already understand this, but for anyone following along... Tailscale can be combined with other networking techniques as well. So one could:
- Access Plex from a Chromecast on your home network using your physical IP, and on your tailnet using the overlay IP.
- Or one could have some services exposed publicly and others exposed on the tailnet. So Immich could be on the tailnet while Plex is exposed differently.
It's not an all or nothing proposition, but of course the more networking components you have the more complicated everything gets. If one can simplify, it's often well worth doing so.
Good luck, however you approach it.
So for something like Jellyfin that you are sharing to multiple people you would suggest a VPS running a reverse proxy instead of using DDNS and port forwarding to expose your home IP?
I run my Jellyfin on Tailscale and don't expose it directly to the internet. This limits remote access to my own devices, or the devices of those I'm willing to help install and configure tailscale on. I don't really trust Jellyfin on the public internet though. It's both a bit buggy, which doesn't bode well for security posture... and also a misconfiguration that exposes your content could generate a lot of copyright liability even if it's all legitimately licensed since you're not allowed to redistribute it.
But if you do want it publicly accessible there isn't a huge difference between a VPS proxying and a dynamic DNS setup. I have a VPS and like it, but there's nothing I do with it that couldn't be done with Cloudflare tunnel or dyndns.
What VPS would you recommend? I would prefer to self host, but if that is too large of a security concern I think there is a real argument for a VPS.
I use linode, or what used to be linode before it was acquired by Akamai. Vultr and Digitalocean are probably what I'd look to if I got dissatisfied. There's a lot of good options available. I don't see a VPS proxy as a security improvement over Cloudflare tunnel or dyndns though. Tailscale is the security improvement that matters to me, by removing public internet access to a service entirely, while letting me continue to use it from my devices.
Do I need to set up NGINX on a VPS (or similar cloud based server) to send the queries to my home box?
A proxy on a VPS is one way to do this, but not the only way and not necessarily the best one... depending on your goals.
- You can also use port-forwarding and dyndns to just expose the port off your home-ip. If your ISP is sucky, this may not work though.
- You can also use Cloudflare's free tunneling product, which is basically a hosted proxy that acts like a super port-forward that bypasses sucky ISP restrictions.
- If you want to access Immich yourself from your own devices but don't need to make it available to (many) others on devices you don't control, I like and use tailscale the best. The advantage of tailscale is that Immich remains on a private network, not directly scannable from the internet. If there's a preauth exploit published and you don't pay attention to update promptly, scanners WILL exploit your Immich instance with internet-exposed techniques... whereas tailscale allows you to access services that internet scanners cannot connect to, which is a nice safety net.
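Whichever option you pick, it helps to confirm what is actually reachable from outside. The following is a rough Python sketch of a TCP reachability check that could be run from a machine outside your home network (a VPS, for example). The hostname and port in the usage comment are hypothetical placeholders, not values from this thread.

```python
import socket


def port_is_reachable(host, port, timeout=3.0):
    """Return True if a TCP connection to host:port succeeds within timeout."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Covers refused connections, timeouts, and DNS failures.
        return False


# Hypothetical usage, run from outside your network:
#   port_is_reachable("home.example.org", 443)
```

With port-forwarding or a proxy you would expect True for the exposed port. With a tailnet-only service you would expect False from any machine that is not on the tailnet, which is the safety net described above.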
Do I need to purchase a domain (randomblahblah.xyz) to use as the main access route from outside my house?
Not for tailscale, and I don't think for Cloudflare tunnel. Yes for a VPS proxy.
I've run a VPS for a long while and use multiple techniques for different services.
- Some services I run directly on the VPS because it's simple and I want them to be truly publicly accessible.
- Other services I run on a bigger server at home and proxy through the VPS because although I want them to be publicly accessible, they require more resources than my VPS has available. When I get around to installing Immich, there's a decent chance it will go into this category.
- Still other services, I run wherever and attach them to my tailnet. These I access myself on my own devices (or maybe invite a handful of trusted people into my tailnet), but aren't visible to the public internet. If I decide not to use immich's shared gallery features (and so don't need it publicly accessible) or decide I don't trust it security-wise... it will go here instead of the proxy-by-vps category.
- ...create a sidebar with some contents... At least some of these communities have empty sidebars.
- Every community needs enough moderators. A single-mod community is not "enough" for a healthy community because things can blow up when you're asleep or away, even in a community that was previously inactive. If a community member reaches out to offer to join a single-mod team... that contact warrants a response from the existing mod. Not necessarily to immediately accept the offer, but at least to discuss the possibility of extra mod coverage.
- It's just not at all true that if others aren't posting there's no moderation work that could be done. Mods of inactive communities can jumpstart them by soliciting feedback on proposed rules, advertising them elsewhere, making scheduled discussion posts, and more. Some of these things can be done by a "regular" community member as well, but if community members try to include mods in discussions about how best to promote the community and the mods ignore them... that's a sign that the community is abandoned.
- If a mod is notified that their community is about to get reassigned and they don't respond... the community is definitely abandoned.
All of which is to say, there are lots of ways to detect abandoned communities when post volume is low, and the process I highlighted is the standard way to request a takeover.
I use k8s at work and have built a k8s cluster in my homelab... but I did not like it. I tore it down, and am currently using podman, and don't think I would go back to k8s (though I would definitely use docker as an alternative to podman and would probably even recommend it over podman for beginners even though I've settled on podman for myself).
- K8s itself is quite resource-consuming, especially on ram. My homelab is built on old/junk hardware from retired workstations. I don't want the kubelet itself sucking up half my ram. Things like k3s help with this considerably, but that's not quite precisely k8s either. If I'm going to start trimming off the parts of k8s I don't need, I end up going all the way to single-node podman/docker... not the halfway point that is k3s.
- If you don't use hostNetworking, the k8s model, where traffic routes only within the cluster except for egress, is all pure overhead. It's totally necessary when you have a thousand engineers slinging services around your cluster, but there's no benefit to this level of rigor in service management in a homelab. Here again, the networking in podman/docker is more straightforward and maps better to the stuff I want to do in my homelab.
- Podman accepts a subset of k8s resource-yaml as a docker-compose-like config interface. This lets me use my familiarity with k8s configs in my podman setup.
Overall, the simplicity and lightweight resource consumption of podman/docker are what I value at home. The extra layers of abstraction and constraints k8s employs are valuable at work, where we have a lot of machines and a lot of people that must coordinate effectively... but I don't have those problems at home and the overhead (compute overhead, conceptual overhead, and config overhead) of k8s' solutions to them is annoying there.
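To make the point about podman accepting k8s-style resource configs concrete, here is a small Python sketch that builds a minimal Pod manifest and prints it as JSON (which is also valid YAML). The pod name, image, and ports are illustrative assumptions, and which manifest fields podman honors beyond these basics is something to check against the podman docs rather than take from this sketch.

```python
import json


def pod_manifest(name, image, host_port, container_port):
    """Build a minimal k8s-style Pod manifest as a plain dict."""
    return {
        "apiVersion": "v1",
        "kind": "Pod",
        "metadata": {"name": name},
        "spec": {
            "containers": [
                {
                    "name": name,
                    "image": image,
                    "ports": [
                        {"containerPort": container_port, "hostPort": host_port}
                    ],
                }
            ]
        },
    }


# Hypothetical service: an nginx container published on host port 8080.
manifest = pod_manifest("web", "docker.io/library/nginx:latest", 8080, 80)
print(json.dumps(manifest, indent=2))
```

Saving that output to a file and handing it to `podman kube play` should be roughly equivalent to a small docker-compose service definition, with no kubelet or cluster networking involved.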
The more normal transfer path is to offer to take over a specific community or communities by:
- Reaching out to the existing mod and asking to be added to the mod team.
- Documenting their lack of response after a few days or a week.
- Documenting the failure to abide by Lemmy world moderation guidelines: https://lemmy.world/post/424735 by linking to spam or off-topic posts and to communities that lack rules/useful-sidebar-content, etc.
- Posting this info in !moderators@lemmy.world and offering to take over moderation.
This is better than mass deletion because it keeps whatever small list of existing subscribers and post content intact across the transition. For moderation, Lemmy world admins will get notified of reports and can address anything that violates instance rules.
Check the sidebar, there is a rule. This post shouldn't spoil the result in the title, and it should spoiler-tag or nsfw the image.
Do you have a source on this? 2w later world is still missing and this post is the first mention of it I've found in a few minutes of sleuthing.
Edit: NVM, found this comment with a GitHub issue link: https://lemmy.world/comment/2195281
I wanted to plug one of them over USB, but it seems that docker just doesn't like to have volumes on external drives. AFAIK docker starts before the drive is fully mounted, preventing it from doing so. I couldn't find any reliable way to work around this (but I'm open to suggestions!).
You haven't said what operating-system you're using, how your mount was configured, or how you're starting docker or your containers. An external drive is the normal way to do this, though, and I do it on Linux with ZFS drives and docker-compose auto-starting the containers and it works fine.
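If the race really is docker starting before the drive is mounted, one generic workaround is to gate container startup on the mount point being live. Below is a minimal Python sketch of that idea. It is not the setup described above, the path in the usage comment is a hypothetical placeholder, and the `is_mounted` and `sleep` parameters exist only so the function can be exercised without a real drive.

```python
import os
import time


def wait_for_mount(path, timeout=60.0, interval=1.0,
                   is_mounted=os.path.ismount, sleep=time.sleep):
    """Poll until path is a mount point; return False if timeout elapses."""
    waited = 0.0
    while True:
        if is_mounted(path):
            return True
        if waited >= timeout:
            return False
        sleep(interval)
        waited += interval


# Hypothetical usage in a startup script, before launching the containers:
#   if wait_for_mount("/mnt/external"):
#       ...start docker-compose here...
```

Checking for a mount point rather than for the directory matters because the empty directory exists even when the drive is absent, and a container would then write into it on the root disk.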
11,263 lbs, huh? It's not a kind estimate, but not unrealistic either.
If a proxy is useful, I believe this is the implementation that powers Caddy2's QUIC support.
Permanently Deleted
I feel like you're combatively advocating for a specific vision and not collecting and processing feedback as your OP suggests, at any rate... you don't seem to be understanding what I was trying to say at all... but it's not something I'm going to fight about with someone who is questioning if I know what a multi-reddit is and dismissing client-side techniques as nonsense without seeming to understand why they were being discussed in the first place.
I'll leave with these thoughts, do with them what you will:
- I'm not interested in any multireddit feature that reduces sub privacy. I'd consider it a net loss for lemmy.
- On Reddit, multi-reddits are personal in nature. Such a personal multireddit for lemmy doesn't require interaction with federation or privacy changes.
- I realize that a shared super-community feature is frequently requested on Lemmy aimed at addressing duplication of communities across instances. I don't think that's more than superficially similar to actual multireddits, and I don't think it's a good idea because it creates moderation problems that are far worse than the community duplication problems it purports to address.
Permanently Deleted
What you've described is one way. It could also be a filtered view based on the subscribed/all feed which provides a single API call that can return material from multiple communities. I'm not suggesting that a client-side only solution is a GOOD solution. But from an information-flow perspective, I'm suggesting that multireddits are a "local" function. They are so local that they're possible without server-side support at all, and especially local enough not to require representation in the federated feed... which is a more significant change with potential impacts to other federated projects like kbin and mastodon... and shouldn't require relaxing privacy constraints in any case.
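To illustrate the "filtered view" idea, here is a minimal Python sketch of a multireddit done purely on the client: take whatever the subscribed/all feed returned and keep only posts from a chosen set of communities. The post dictionaries and their field names are hypothetical and are not the real Lemmy API shape.

```python
def multi_view(feed, communities):
    """Keep only posts whose community is in the user's personal multi."""
    wanted = {c.lower() for c in communities}
    return [post for post in feed if post["community"].lower() in wanted]


# Hypothetical feed, as if already fetched with a single API call.
feed = [
    {"community": "selfhosted@lemmy.world", "title": "Immich behind a proxy"},
    {"community": "news@example.org", "title": "Unrelated"},
    {"community": "homelab@example.org", "title": "Podman at home"},
]

homelab_multi = multi_view(feed, ["selfhosted@lemmy.world", "homelab@example.org"])
print([p["title"] for p in homelab_multi])
```

Nothing here touches federation or exposes a subscription list. The list of communities lives only on the client, which is the sense in which the feature is "local".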
The Beehaw admins made this choice, and documented their rationale here: https://beehaw.org/post/567170
Permanently Deleted
Anyway, what's the feedback on privacy issue with allowing any user to have read-only access to your community subscribe list...
I wouldn't want this in exchange for multi-reddits. You can a little bit infer the communities someone subscribes to from their comment activity, but as it stands one can choose to privately lurk and this would eliminate that... silently for existing users in the absence of some big series of announcements to make it well known.
Why are multi-reddits a thing that involves federation at all? Multi-reddits as they exist on Reddit itself could be implemented entirely client-side, the server side stuff just syncs the behavior of multiple client apps. Why does the concept of a multi-reddit need to extend outside of the user's instance?
Nutbutter sort of covered it.
- Tailscale creates a virtual network.
- That network can be (and is by default) private in that no one can join that you don't allow, and in that respect it's similar to your home network. You can join your laptop, desktop, and phone to your tailnet... but probably you cannot join your Chromecast or smart-television (they don't publish tailscale clients for these devices).
- If you configure Jellyfin to listen on your tailnet and not on the Internet... then you can access Jellyfin from anywhere using a device that is connected to your tailnet, but attackers on the Internet cannot access Jellyfin without first accessing your tailnet, which is hard to do.
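As a rough illustration of "listen on your tailnet and not on the Internet", the Python sketch below picks the tailnet address out of a machine's local addresses so a service can bind to that address instead of every interface. It assumes Tailscale's default behavior of assigning IPv4 addresses from 100.64.0.0/10. The address list is made up, and how you then feed the bind address to Jellyfin depends on its own settings, which are not covered here.

```python
import ipaddress

# Tailscale hands out IPv4 addresses from the CGNAT range by default.
TAILNET_RANGE = ipaddress.ip_network("100.64.0.0/10")


def pick_tailnet_address(local_addresses):
    """Return the first local address inside the tailnet range, else None."""
    for addr in local_addresses:
        if ipaddress.ip_address(addr) in TAILNET_RANGE:
            return addr
    return None


# Hypothetical addresses for one machine: LAN, tailnet, loopback.
addresses = ["192.168.1.20", "100.101.102.103", "127.0.0.1"]
print(pick_tailnet_address(addresses))
```

A service bound only to that address accepts connections arriving over the tailnet interface, while a service bound to 0.0.0.0 also answers on the LAN and on any forwarded public port.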
The security/convenience tradeoff of tailscale is pretty good if you want to access a service from anywhere, but only from your own devices and only from supported operating systems (Linux, windows, OSX, android... not sure about iOS). It is another networking layer, which can be mind-bending... but as much as such a layer can be easy to use... tailscale is as easy as any of them.
However, Tailscale's backend is not open-source. They may not log all the data passed through, but they certainly can look at it.
This sentence is nonsense though.
- Tailscale is end to end encrypted, tailscale cannot quietly see your traffic.
- Tailscale COULD, by default, surreptitiously join a node to your tailnet. If you're super paranoid, they provide a way to disable this but it makes tailscale much less convenient to use: https://tailscale.com/kb/1226/tailnet-lock/
- Tailscale is phenomenally transparent about security and has WAY higher standards than self-hosters: https://tailscale.com/security/.
- Tailscale clients are open source, and they employ the author of Headscale, an open source implementation of the Tailscale control protocols.
There is very little to fear from Tailscale as a provider, and they support the headscale project if you want to go that route (which I do... but not because I am concerned about Tailscale's integrity or security posture).
It's worth considering some commercially developed options as well: https://prostasia.org/blog/csam-filtering-options-compared/
The Cloudflare tool in particular is freely and widely available: https://blog.cloudflare.com/the-csam-scanning-tool/
I am no expert, but I'm quite skeptical of db0's tool:
I'm no expert, but my belief is that open tools are likely to be hamstrung permanently compared to the tools developed by big companies and the most effective solutions for Lemmy must integrate big company tools (or gov/nonprofit tools if they exist).
PS: Really impressed by your response plan. I hope the Lemmy world admins are watching this post, I know you all communicate and collaborate. Disabling image uploads is, I think, a very effective temporary response until detection and response tooling can be improved.