Posts
12
Comments
1,377
Joined
2 yr. ago

  • Same, same. Maybe one day I'll travel there and see for myself. Where I live people just walk 10-20 m, get a cart, go shopping, put the groceries in the car, walk the 20 m again to return it and drive home. Nobody is being a prick at the supermarket. However, I've observed that some people don't return their carts at IKEA.

  • The problem with that is a few years is a bit short to get real benefits out of it. And the Wikipedia article contradicts the statement that productivity went down. Actually issues and errors went down, half the workforce was alright with it and they saved tens of millions of Euros. And then they cancelled it. That decision wasn't backed by technical or factual reasons at all. Many people said they were fine with Linux. Issues were, for example, that they had old and outdated computers. As a reason to switch back they claimed that sync to mobile phones had issues (...as if the government workforce syncs their calendars to their mobile phones...), and those were issues with the groupware suite. None of it had anything to do with Linux, productivity or the people who sat in front of the computers and actually had to use them. There were quite a few benefits, and from the technical side things were going well despite the admins not being backed by their superiors and the city. They did a final study which contains quite a few, mostly dubious, statements, and Microsoft was also involved in the switch back.

    You CAN enforce a change. Sure, change is always hard in the beginning. But we do it all the time. The story of LiMux is more: You can destroy anything if you really want to. And politics likes to twist things so it suits their narrative. (And lobbyism is a thing and Microsoft is better at it than the Linux community.)

  • Maybe I should have worded things a bit differently. The 5-second snippets are for style transfer. I think it only picks up on the frequency spectrum of a new voice and knows how to handle that because it's been trained with several other voices. I suppose one or two sentences aren't enough to get the pacing right and all the distinct features of a human speaker. I didn't get good results anyways. Tools like that from ElevenLabs recommend you upload 30 minutes to 3 hours of speech.

    I've managed to get the TTS running. The German thorsten/tacotron2-DDC is very good in my opinion. Could be the thing I was looking for. It just gets all the abbreviations and names wrong, but the flow of the voice is quite good. And it's fast, even on my laptop. Sadly I also read that Coqui-AI are shutting down. Seems to be difficult to compete against the big-tech companies who integrate their proprietary TTS tech for free into the platforms.

    I agree. Packaging and integration are some of the most important aspects if you want to actually use something. A research project is also nice, but those don't solve my everyday tasks. And I can't maintain too many development environments with complex dependencies and copy-and-paste everything to the command line. It absolutely needs to be available on the platform and there needs to be a wrapper that integrates it into the other software I use. We have that for espeak, flite and all the old-fashioned tools. But it's completely missing for the last 5 years of technological advancements...
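
    One way to work around the abbreviation problem mentioned above is to expand abbreviations before the text ever reaches the TTS engine. The following is only a minimal sketch: the function name `expand_abbreviations` and the lookup table are hypothetical and not part of Coqui TTS or any other project, and the German entries are just illustrative.

```python
import re

# Hypothetical lookup table (an assumption for illustration, not from any TTS project).
ABBREVIATIONS = {
    "z.B.": "zum Beispiel",
    "usw.": "und so weiter",
    "ca.": "circa",
    "Dr.": "Doktor",
}


def expand_abbreviations(text: str, table: dict[str, str] = ABBREVIATIONS) -> str:
    """Replace each known abbreviation with its spoken form."""
    # Try longer abbreviations first so overlapping entries resolve predictably.
    keys = sorted(table, key=len, reverse=True)
    # (?<!\w) prevents a match that starts in the middle of a longer word.
    pattern = re.compile(r"(?<!\w)(?:" + "|".join(re.escape(k) for k in keys) + r")")
    return pattern.sub(lambda match: table[match.group(0)], text)


if __name__ == "__main__":
    print(expand_abbreviations("Das dauert ca. 5 Minuten, z.B. morgen."))
```

    The expanded string could then be handed to whatever TTS front end is in use. Names would need a separate pronunciation dictionary, which this sketch does not attempt.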

  • I think we're way past that. I've fiddled around a bit with 'bark' and another more common (open?) solution to do voice cloning. It takes something like a 5-second audio clip of someone talking and it can extract features from that, train an AI model and transfer the 'style' of that voice to arbitrary speech. I don't really know if it's technically similar to the AI tools that can paint an astronaut on a horse and draw it in the style of van Gogh... but it's the same idea. And bark and other tools can also synthesize speech with an AI model. You can just give it text and instruct it to talk in a relaxed female voice, and it'll do it. However, I wasn't able to get good results out of it. It's nice to play around with, but it's not yet feasible for real-world use. And it takes a proper graphics card (or a cloud service that provides you with GPU compute) to run it.

    I don't think these tools use phonemes and the old-fashioned ways of doing it. It is machine learning and AI 'magic' that makes those tools sound smoother and more realistic.

    What I also like is coqui-ai. It seems to be entirely free and the samples sound like a completely different level compared to established tools like espeak-ng. Sadly it isn't packaged in any of the Linux distributions I use. And I really don't understand why. It also doesn't need crazy system specs. But it doesn't tie into the desktop at all and requires you to set up conda environments and handle the CUDA libraries, and just running the 'pip install TTS' they listed in their GitHub repo didn't do it for me.

    (I excluded the commercial tools here. Big-Tech has some alright TTS. Google, Amazon, Apple, ... they're all usable. elevenlabs.io offer exceptionally good TTS, I think that's what the AI narrated YouTube videos are made with. And I sometimes use the button to convert heise online articles to speech while doing the laundry or other stuff in the house that doesn't take enough time for me to start a podcast. I just wish there was a button on my laptop that'd do the same thing with free software and offer similar quality.)

    [Edit: Forget what I said last. I've been distro-hopping lately and it seems coqui-ai/TTS is available in the Linux I installed last week. I'm going to try it tomorrow.]

  • Sure. Seems the Thorsten voice is in every FLOSS text-to-speech project. I think he (the real Thorsten) also does YouTube videos about that topic.

    I don't know of any other free software Android speech software (that also speaks German) except for espeak. And I need something that can talk to me in the car. For other purposes it sounds a bit rough in my opinion. Eventually I would like something more state of the art with a more human-like sound. And something that properly ties into my Linux desktop and brings local STT and TTS to every application. I think the components are there already. But we're still missing the proper integration into both platforms. (And maybe a few more voices and training data for new ones in several languages.)

  • Ah okay, I can see that being useful. Seems we have a different workflow. I rarely look at that extra info at the bottom. Usually just to see how many files I selected and their total size. If I'm concerned with single files, I either don't care for the size and extra info, or I switch to the list view and have it displayed next to each file if I'm organizing stuff. I'll also sort them by size or whatever in that case. But I'm not concerned with the exact file info while doing regular stuff. So I wouldn't use that use-case for a single-click very often.

  • Or you drag over the files. Or press something like Ctrl or Shift while clicking. I mean you have to do that anyways, even with double-click by default, or you'd lose focus on the first file. And it's rarely the case that you just want to focus a single file.

  • I think that grant is to spread European values on the internet and help independent research. They basically don't care at all who you are; the only important thing is that you release your results open-access and the code under an open-source license.

  • Yeah, I don't know if OP was looking for that. They specified 'FOSS' in the title. But I think Google can also do local STT nowadays, I haven't tried it for quite a while. Sayboard and FUTO work remarkably well. I personally am struggling a bit more with the reverse part: TTS. There isn't much except for espeak if you want other languages than English (and maybe Russian since there is another project that does a few other languages.) But I skipped on the Google services on my phone.

  • I think that's it. The two mentioned things in the previous comments are also what I've seen floating around. Sayboard and FUTO's voiceinput. The former is free software and FUTO releases under a source-available license. Additionally you can use something like Kõnele (available in F-Droid) to connect to cloud-based services. Disregarding free software, there are probably a few others with a proprietary license. For example Google's STT that is baked into their Android versions.

  • Because you rent them and don't own them. It's also illegal to sell a book that you rented from the library. Or get a DVD from the library and then copy it. It's a measure they put into place so you're not allowed to duplicate the thing. Hence they don't grant you the same kind of ownership you'd have over a physical item.

  • Sure. But I mean that isn't unique to WordPress or WooCommerce. I mean there are other CMS and e-commerce solutions. And this one isn't even more closely aligned to my requirements than any of the other big platforms. Or did I miss something and WooCommerce has more features baked into its free software core than all the other platforms?

    I'll have a closer look at it, but I'm currently still evaluating some of the other recommendations. I kind of disregarded a solution like that at first, because in the past I've used CMSs for everything. But having a broad and general CMS with many features and then customizing it with several modules and add-ons also gets complex and hard to maintain over time. Nowadays I kind of prefer solutions that are tailored to a use case instead of extending a CMS, and I've replaced my WordPress installs with static sites. But I'm not set on it. If WooCommerce turns out to be good and usable without additional paid extensions, I might as well use that.

  • Wow, thx. That looks really good. At one point I've tried Odoo and decided it was too much and too complicated for me. But this looks a bit cleaner. And everything seems to be free software, even some of the add-ons. I definitely have to try this. I just hope it doesn't eat all the resources on my VPS.

  • I'd say 6-12 years. Maybe including about 1 hard disk failing. I forgot what the mean time to failure is for a hard disk. And in a decade I'll probably have all the disks filled to the brim, my usage pattern will have changed, and a new one will have 10x the network speed, 4x more storage and be way faster in every aspect.
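
    The "about 1 disk failing" guess can be sanity-checked with a rough back-of-the-envelope calculation. This is only a sketch: it assumes independent failures at a constant annual failure rate, and the disk count and rate used in the example call are made-up placeholders, not measured values for any particular drive.

```python
def expected_failures(n_disks: int, annual_failure_rate: float, years: float) -> float:
    """Expected number of disk failures over the period (simple linear estimate)."""
    return n_disks * annual_failure_rate * years


def prob_at_least_one_failure(n_disks: int, annual_failure_rate: float, years: float) -> float:
    """Chance that at least one disk fails, treating each disk-year as independent."""
    return 1.0 - (1.0 - annual_failure_rate) ** (n_disks * years)


if __name__ == "__main__":
    # Placeholder inputs: 4 disks, 2% assumed annual failure rate, 10 years of use.
    print(expected_failures(4, 0.02, 10))
    print(prob_at_least_one_failure(4, 0.02, 10))
```

    Plugging in the failure rate a manufacturer or a published drive statistic reports for the actual model would give a more meaningful number.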

  • Thanks. I think I'm going to spin up a container and try a few. I had hoped there was something entirely without important paid add-ons, maybe a hobby project or something by the free software community for niche use cases. But I see, I'll probably have to use one of the proper solutions. I got quite a few recommendations now.

  • I read it's more like 20-200 years. But there are differences. Recorded CD-Rs are worst. Burn DVDs if you can. And bought (pressed(?)) disks perform considerably better. But don't expose them to UV light or scratch them too much.

    With books it depends on how people store these. They can mold. But if you take care to store them right... I mean there are books that are hundreds of years old. I think books are usually lost to things like a fire, flooding, or people deliberately getting rid of them. Otherwise, printed information will survive for quite some time. And I too think it's the better collectible. And they are fun to use. I like them better than reading on a screen.

  • I don't miss the times when my living room had several shelves with movies and CDs and Xbox games. Nowadays I have everything stored on a NAS in the basement and a Spotify and Netflix subscription.

    I find it difficult to compare those concepts, owning vs renting, pirating vs buying. They're all very different and all have their use. I just think I'm not the one who likes to collect movies and music on physical disks. And streaming it to the phone or TV is more convenient anyways.