  • Apologies, I wasn't clear: I actually work "on" VR, namely I'm a software developer who writes VR/AR code.

    Still though... I also do work "in" VR, as I have numerous demos where I'm coding in the headset. Most recently you can check this 1 min video https://www.youtube.com/watch?v=CGvc4kNXiUY that I did for https://futuretextlab.info/ and it's all open source, cf https://git.benetou.fr/utopiah/text-code-xr-engine/src/branch/fot-sloan-companion . To clarify a bit: I drag & drop files on my (Linux) filesystem and they are reflected in AR in that example. I can open them and manipulate them, and if it's code (here JavaScript and AFrame) it can live reload part of the scene, etc.

    I'm also working "in" VR for the NLNet-sponsored project xrsh aka XRshell https://nlnet.nl/project/xrsh/ where, thanks to WASM, we basically put a (small) Linux system with its terminal on a Web page, and thus can code and work in the headset.

  • use Linux without a DE. [...] (/j)

    ... actually (tips fedora hat) no, but seriously, most of what we NEED is fine that way. It sounds ludicrous until you try Sxmo on a phone and you can't help but GENUINELY wonder "Damn... did I get scammed all those years?"

  • Because people ask for an IDE, rather than an editor, I will say:

    Vim + terminal(s) + containerization (e.g. Docker CLI, Python venv) + live reloading (e.g. nodemon or inotify, or in the browser using e.g. server-sent events) + repository management (e.g. git in the CLI to juggle between branches, push/pull locally/remotely)

    IMHO this is very VERY light (0 wait even on an RPi Zero) and yet very flexible.

    Also most of that can be "saved" via e.g. screen, the CLI tool, which allows having named windows in a terminal and a lot more, then reattaching via e.g. screen -raAD, locally or remotely; see the sketch below.
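
    A minimal sketch of that workflow, assuming a Node.js project; the session name, window titles, and file names are made up for illustration:

    ```sh
    # Start a detached screen session named "devbox" so the whole
    # workspace survives disconnects (reattach later with screen -raAD)
    screen -dmS devbox

    # Add a named window for the editor
    screen -S devbox -X screen -t editor vim .

    # Add a named window for live reloading: nodemon reruns the app on file changes
    screen -S devbox -X screen -t reload npx nodemon server.js

    # Add a named window (a plain shell) for repository management in the CLI
    screen -S devbox -X screen -t git bash

    # Attach to the session (detach again with Ctrl-a d)
    screen -raAD devbox
    ```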

  • in any way shape or form

    I'd normally accept the challenge if you hadn't added that. You did though, and it, namely a system (arguably intelligent), made an image, several images in fact. The fact that we dislike or like the aesthetics of it, or that the way it was done (without a prompt) differs from how it is currently done, remains irrelevant according to your own criteria, of which there are none. Anyway, my point with AARON isn't about this piece of work specifically, rather that there is prior work, and this one is JUST an example. Consequently the starting point is wrong.

    Anyway... even if you did question this, I argued for more, showing that I did try numerous (more than 50) models, including very current ones. It even makes me curious whether you, who are arguing for the capabilities and their progress, have tried more models than I did, and if so, where I can read about it and what you learned from such attempts.

  • Language models on their own do indeed have lots of limitations, however there is a lot of potential in coupling them with other types of expert systems.

    Absolutely, I even have a dedicated section "Trying to insure combinatoriality/compositionality" in my notes on the topic https://fabien.benetou.fr/Content/SelfHostingArtificialIntelligence

    Still, while keeping this in mind, we must also remain mindful of what each system can actually do, and not conflate that with what we WANT it to do but that it cannot do yet, and might never be able to.

  • Image gen did not exist in any way shape or form before.

    Typical trope while promoting a "new" technology. A classic example is 1972's AARON https://en.wikipedia.org/wiki/AARON which, despite being based on neither an LLM (so not CLIP) nor even ML, is still creating novel images. So... image generation has existed since at least the 70s, more than half a century ago. I'm not saying it's equivalent to the implementations since DALLE (it is not), but to somehow ignore the history of a research field is not doing it justice. I have also been modding https://old.reddit.com/r/computationalcrea/ for 9 years, so since before OpenAI was even founded, just to give some historical context. Also 2015 means 6 years before CLIP. Again, not to say this is equivalent, solely that generative AI has a long history, and thus setting its start dates at grand moments like AlphaGo or DeepBlue (and on this topic I can recommend Rematch from Arte) ... is very much arbitrary and in no way helps to predict what's yet to come, both in terms of what's achievable and even the pace.

    Anyway, I don't know what you actually tried, but here is a short list of the 58 (as of today) models I tried https://fabien.benetou.fr/Content/SelfHostingArtificialIntelligence and that's excluding the popular ones, e.g. ChatGPT, Mistral LeChat, DALLE, etc., which I also tried.

    I might be making "the same mistake" but, as I hope you can see, I do keep on trying what I believe is the state of the art on a pretty much weekly basis.

  • What an impressive waste of resources. It's portrayed as THE most important race and yet what has been delivered so far?

    Slightly better TTS or OCR, photography manipulation that is commercially unusable because sources can't be traced, summarization that can introduce hallucinations, ... sure, all of that is interesting in terms of academic research, with potentially some use cases... but it's not as if it didn't exist before, at nearly the same quality, for a fraction of the resources.

    It's a competition where "winners" actually don't win much, quite a ridiculous situation to be in.

  • converting audio files to markdown must be a pretty recent feature

    Quite curious... does it actually do that, and if so how? Because STT to get a plain-text file or subtitles (so with timing) has been available via e.g. Whisper quite efficiently for a while now (see the sketch below). If it does do more, though, e.g. structure (differentiating a title, a list, etc.), I'd like to learn how.
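
    For reference, a minimal sketch of that existing Whisper route, assuming the openai-whisper CLI is installed; the file and model names are placeholders:

    ```sh
    # Plain-text transcript of the recording
    whisper talk.mp3 --model small --output_format txt

    # Subtitles with timing information
    whisper talk.mp3 --model small --output_format srt
    ```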

  • audio transcription tool

    Thanks for the clarification, but I'm a bit confused here: audio transcription, i.e. STT, as done by e.g. Whisper? If so, what's the use case? When I think of Office documents, audio transcription is not something I have in mind.

  • FWIW, if you are interested in such tooling, consider also soffice and pandoc, which have (as far as I can tell) similar features but have existed for years and are not related to Microsoft; see the sketch after the edit below.

    Edit: not related to Microsoft AND Google; it seems the transcription aspect (which IMHO is still weird in that context, but OK) is done via Google servers, cf https://lemmy.ml/post/23629310/15586865
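
    A sketch of those two conversion routes, with placeholder file names; pandoc targets markdown directly, while soffice converts between office formats and e.g. plain text:

    ```sh
    # pandoc: convert an Office document straight to markdown
    pandoc report.docx -o report.md

    # soffice (LibreOffice) headless: convert to e.g. plain text
    soffice --headless --convert-to txt report.docx
    ```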