The only time pseudocode might make sense is in academic papers, where there might be strict page limits. Even then, the paper should link to the actual code. Quite often the pseudocode glosses over or even misses important implementation details.
In Java it’s quite difficult to forget a semicolon and still have a syntactically correct program.
I think braces are incredibly important. With Python it can be confusing which indentation is correct when copying code between files. Different indentations can easily still be syntactically correct, but the outcomes are vastly different. I often have to double and triple check that I copied the code with the correct indentation.
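A minimal example of the hazard (the function names are made up): the same pasted line parses fine at either indentation level, but the behavior changes completely.

```python
def total(values):
    result = 0
    for v in values:
        result += v
    return result            # after the loop: sums everything

def total_wrong(values):
    result = 0
    for v in values:
        result += v
        return result        # pasted one level deeper: still valid,
                             # but returns on the first iteration

print(total([1, 2, 3]))        # 6
print(total_wrong([1, 2, 3]))  # 1
```

With braces, the second version would either fail to compile or at least make the mistake visible in the block structure.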
Most of the GPUs belong to the big tech companies like OpenAI, Google, and Amazon. AI startups rarely buy their own GPUs (often they’re just using the OpenAI API). I don’t think big tech will have any problem figuring out what to do with all their GPU compute.
No, that’s not necessary. The only thing they need to do is find an I-frame (of which there are plenty), cut at that frame, show the ad instead, and then resume the original video after the ad is done. No extra encoding is involved. It’s just like concatenating video files together.
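A hedged sketch of that kind of stream-copy stitching, assuming ffmpeg is available and the cut points land on keyframes; the file names are hypothetical. `-c copy` just remuxes the existing packets, so nothing is re-encoded:

```python
import subprocess

# Hypothetical pieces: the video split at an I-frame, plus the ad.
parts = ["video_before_cut.mp4", "ad.mp4", "video_after_cut.mp4"]

# The concat demuxer reads a list of files and plays them back to back.
with open("concat_list.txt", "w") as f:
    for p in parts:
        f.write(f"file '{p}'\n")

subprocess.run(
    ["ffmpeg", "-f", "concat", "-safe", "0",
     "-i", "concat_list.txt", "-c", "copy", "stitched.mp4"],
    check=True,
)
```

For this to work the pieces need matching codecs and parameters, which a service that encodes both the videos and the ads itself can guarantee.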
I’ve done stuff like this before. It’s not too difficult, at least not with H264. Not sure about YouTube’s own format, but I’d guess it’s quite similar.
15 years ago people made fun of AI models because they could mistake some detail in a bush for a dog. Over time the models became more resistant to those kinds of errors. The fix was more data and better models.
It’s the same type of error as a hallucination: the model is overly confident about something it’s wrong about. I don’t see why these errors would be any different.
Most improvements in machine learning have been made by increasing the data (and by using models that generalize better over larger datasets).
Perfect data isn’t needed, as the errors will “even out”. Although now there’s the problem that most new content on the Internet is low-quality AI garbage.
A database of ad timestamps, like SponsorBlock’s, only works if the ads happen at the same timestamps (and are of equal length) for every viewer. That’s not necessarily the case.
The only reliable way I can come up with is a database of known ads to look for, but that would have to be huge to cover all possible ads. There’s also a risk of false positives (skipping parts of the video when there’s no ad).
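A rough sketch of what that lookup could look like: fingerprint short windows of the stream and compare them against fingerprints of known ads. Everything here (the toy average hash, the threshold, the frame representation) is a stand-in for illustration, not how any real blocker works:

```python
def ahash(frame):
    """Average hash of a frame given as a flat list of grayscale pixels."""
    avg = sum(frame) / len(frame)
    return tuple(1 if p > avg else 0 for p in frame)

def hamming(a, b):
    """Number of differing bits between two hashes."""
    return sum(x != y for x, y in zip(a, b))

def find_ad(stream_frames, ad_frames, max_dist=3):
    """Return the start index of a known ad in the stream, or None."""
    ad_sig = [ahash(f) for f in ad_frames]
    for start in range(len(stream_frames) - len(ad_sig) + 1):
        window = stream_frames[start:start + len(ad_sig)]
        if all(hamming(ahash(w), s) <= max_dist
               for w, s in zip(window, ad_sig)):
            return start
    return None
```

The false-positive risk mentioned above lives entirely in `max_dist`: too loose and ordinary video gets skipped, too strict and slightly re-encoded ads slip through.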
I’m not sure a SponsorBlock-like solution would work. SponsorBlock is entirely reliant on timestamps provided by users.
A similar solution for YouTube’s ads would only work if the ads always appeared at the same timestamps and had the same length. That’s not necessarily the case, as ads can be inserted at any point.
They don’t need to do any extra transcoding. It’s not that costly to stitch videos together. Done at the right locations, it’s like copying one text file into another.
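The analogy can be almost literal. MPEG transport streams (.ts, the segment format HLS streaming is built on) are designed so that plain byte concatenation yields a playable stream; players generally tolerate the timestamp discontinuity at the splice. The file names here are hypothetical:

```python
# Byte-for-byte concatenation of transport stream segments.
with open("stitched.ts", "wb") as out:
    for part in ["before.ts", "ad.ts", "after.ts"]:
        with open(part, "rb") as f:
            out.write(f.read())
```

MP4 has container-level indexes, so it needs a remux (like the concat demuxer sketch earlier) rather than raw concatenation, but still no re-encode.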
Agree.