  • I had a similar idea: Could search engines be broken up and distributed instead of being just a couple of monoliths?

    Reading the HN thread, the short answer is: NO.

    Still, it's fun to imagine what it might look like if only...

    I think the OP is looking for an answer to the problem of Google having a monopoly that gives them the power to make it impossible to be challenged. The cost to replicate their search service is so astronomical that it's basically impossible to replace them. Would the OP be satisfied if we could make cheaper components that all fit together to make a competing but decentralized search service? Breaking down the technical problems is just the first step; the basic concepts for me are:

    Crawling -> Indexing -> Storing/host index -> Ranking

    All of them are expensive because the internet is massive! If each of these were isolated but still interoperable then we get some interesting possibilities: Basically you could have many smaller specialized companies that can focus on better ranking algorithms for example.

    • What if crawling was done by the owners of each website and then submitted to an index database of their choice? This flips the model around so things like robots.txt might become less relevant. Bad actors and spammers, however, now don't need any SEO tricks to flood a database or mislead as to their actual content; they can just submit whatever they like! These concerns feed into the next step:
    • What if there were standard indexing functions, similar to how you have many standard hash functions? How a site is indexed plays an important role in how ranking will work (or not) later. You could have a handful of popular general-purpose index algorithms that most sites would produce and then submit (e.g. keywords, images, podcasts, etc.) combined with many more domain-specific indexing algorithms (e.g. product listings, travel data, mapping, research). Also, if the functions were open standards then it would be possible for a browser to run the index function on the current page and compare the result to the submitted index listing. It could warn users that the page they are viewing is probably either spam or misconfigured in some way that makes the index not match what was submitted.
    • What if the stored indexes were hosted in a distributed way similar to DNS? Sharing the database would lower individual costs. Companies with bigger budgets could replicate the database to provide their users with a faster service. Companies with fewer resources would be able to use the publicly available indexes yet still be competitive.
    • Enabling more competition between different ranking methods will hopefully reduce the effectiveness of SEO gaming (or maybe make it worse as the same content is repackaged for each and every index/rank combination). Ranking could happen locally (although this would probably not be efficient at all, the fact that it might even be possible is quite a novel thought).
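
    As a rough sketch of the "standard indexing function" idea above: the site owner and the browser both run the same open function over the page text, and the browser compares its result with the submitted one. Everything here is hypothetical and only for illustration; `keyword_index`, `index_digest` and `matches_submission` are made-up names, and a top-N keyword count stands in for whatever a real standard would define.

```python
import hashlib
import re
from collections import Counter


def keyword_index(text: str, top_n: int = 5) -> list[tuple[str, int]]:
    """Hypothetical 'standard' index function: the top-N word counts of a page."""
    # Lowercase the text and keep runs of three or more letters as keywords.
    words = re.findall(r"[a-z]{3,}", text.lower())
    counts = Counter(words)
    # Sort by count (descending), then alphabetically, so the result is deterministic.
    return sorted(counts.items(), key=lambda kv: (-kv[1], kv[0]))[:top_n]


def index_digest(index: list[tuple[str, int]]) -> str:
    """Serialize an index canonically and hash it, so it is cheap to compare."""
    canonical = ";".join(f"{word}:{count}" for word, count in index)
    return hashlib.sha256(canonical.encode("utf-8")).hexdigest()


def matches_submission(page_text: str, submitted_digest: str) -> bool:
    """What a browser could do: re-index the page it sees and compare digests."""
    return index_digest(keyword_index(page_text)) == submitted_digest


# The site owner indexes their own page and submits the digest.
honest_page = "cheap flights to lisbon, cheap hotels in lisbon"
submitted = index_digest(keyword_index(honest_page))

# A visitor's browser checks the page it actually received.
print(matches_submission(honest_page, submitted))
print(matches_submission("buy pills now pills pills", submitted))
```

    An exact digest match is deliberately strict and would break on any small page edit; a real scheme would probably need some tolerance (for example, comparing keyword overlap instead of hashes).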

    Sigh, enough daydreaming already...

  • Was just trying to watch the original Star Wars from when I was young and found out that it is simply not available for sale. My money is no good! Then I found this Project 4K77.

  • +1 servarr. It took me a while to navigate the (high) sea of information but eventually I got a setup I like. I started, like you say, just running qBit but found the search results limited and tedious to review manually. Get started with Prowlarr if nothing else. No need to jump in the deep end with everything all at once; once you see how it works you can add other components later.

  • I selfhost my own email and you are absolutely correct: it is much easier to receive than to send. I use a 3rd party to send all my outgoing mail on my behalf.

  • This is my experience too. The sites hosting the articles that I want to read only provide the first paragraph and then a link back to the webpage. News is just headlines. I love that RSS doesn't allow much formatting so you end up with an experience focused on the content itself (and no ads). It feels like a long time since I really enjoyed my RSS feeds.

  • No matter if it is greed, competitiveness, narcissism, another personality trait or some combination of them, the point was that we as a society should not consider becoming a billionaire as model behavior. By all means be the best sports player or musician or top surgeon and make as much money as you are legally allowed. Most tech billionaires are just not impressive enough to justify their current net worth.

  • Great reply, thank you. OP points out that the situation appears hopeless and I often leave feeling that capitalism has truly captured all the regulators and is now free to grind all value out of society. Assuming we get a decent amount of the population on the same page, what is the next step? Is there no room for reforms? I have a feeling that only when public discussion consistently prioritizes human well-being above all else can any progress even be attempted.

  • The wealthy actively lobby for tax breaks and relaxed regulation; meanwhile the working majority don't seem to be able to stand together and demand social programs or protections from big businesses. The government is not corrupt for delivering the change that is asked of it. Easier said than done, but change for the better is possible.

  • I should have prefaced my situation better: I live in a country where the ISP censors certain websites and online services. The closest Linode is not on my continent (so the latency is noticeable). So my need to be connected to the WireGuard VPN really depends on what I'm doing. Having a split DNS system is seamless and I only activate the VPN manually as needed (both at home and when I'm out). Otherwise I would have just asked my ISP for a static IP, opened some ports and installed Tailscale for everything else.

  • Thanks will take a look! Sad to hear you eventually gave up but I'm encouraged by the concept. It would make my current setup much simpler and is in keeping with my ethos that I want as much as possible done locally. The VPS should be no more than a piece of networking infrastructure.

  • I recently made the switch to Vaultwarden when I read a series of articles making predictions about passkeys and how they are lining up to replace passwords. Bitwarden apparently is ready to implement whatever standard becomes most popular and I had FOMO of being left behind if I stuck with KeePass only. Previously I was using various KeePass-compatible apps and then syncing the KDBX database with my Nextcloud. (Vaultwarden is an unofficial, self-hostable server implementation that is compatible with the Bitwarden clients.)

  • Before you post a snappy "just do X" or "try this software", try it yourself: consent-letter-2123.pdf. My complaint is not trivial.

  • That just shows how dishonest Adobe is being. For example if a form was named "gov-form.xfa" instead of "gov-form.pdf" then my whole expectation would be different as it is obviously not a PDF and so I shouldn't treat it as such.

  • Were you able to convert the form into an open format and also preserve all the original functionality? If this is true then there is absolutely no excuse for these forms not being offered in alternative formats. There are some tools that will let you 'flatten' an XFA form to a static PDF but this destroys all the dynamic parts of the original.

  • Feels very hostile, right? I assume that all these smart XFA forms still have an online legacy dumb equivalent that is far less easy to use (both for the user and the government).

  • Adobe does sell licences for other companies to use the XFA format but even the software you linked has a free reader that pushes you to the paid full version. Also not FOSS.

  • I'm glad that not everyone is oblivious to my suffering! Thanks for the validation!