LPT Do it.
The weird part is that most modern office software has version control built right in.
And I still do this with all my files anyway.
I've had the built-in version control do unexpected things, so I play it safe and create named backup files. I usually end up using that one file, but I've been saved on occasion.
It's just not trustworthy.
Use the date/time in your file name, using GMT:
Metrics of Sales 2024-05-22_14-29.docx
Very unlikely to have 2 docs with the same down-to-the-minute time stamp in the name.
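If typing the stamp by hand is the sticking point, a tiny script can do it. This is only a rough sketch: the function name `stamped_name` is made up for illustration, and it assumes the `YYYY-MM-DD_HH-MM` layout from the example above.

```python
from datetime import datetime, timezone
from pathlib import Path


def stamped_name(path, now=None):
    """Return the file name with a UTC 'YYYY-MM-DD_HH-MM' stamp before the extension.

    `now` should be a timezone-aware datetime; it defaults to the current time.
    """
    now = now or datetime.now(timezone.utc)
    p = Path(path)
    # Convert to UTC (GMT) so the stamp does not depend on the local time zone.
    stamp = now.astimezone(timezone.utc).strftime("%Y-%m-%d_%H-%M")
    return f"{p.stem} {stamp}{p.suffix}"


# Prints the name with the current UTC time inserted before ".docx".
print(stamped_name("Metrics of Sales.docx"))
```

Because year, month, day, hour and minute are in descending order, the names also sort chronologically in a file listing.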
If you think this process involves enough mindpower to check the time, let alone figure out where the dashes are in whatever language keyboard setup I'm using at the time, you are wildly overestimating how much care goes into doing this.
I generally do this on my NAS, combined with nightly and bi-weekly backups, plus a 6-mo safety backup, to a backup drive. Also, basic off-site nightly backups for important stuff. If I worked on really important stuff that required lots of versioning, though, I'd probably go with a versioning system instead of inserting the date.
I should write my resume in LaTeX.
Done it and highly recommend it
Do you have a good LaTeX template for it? I did make a data-driven LaTeX PDF for my resume, but it's a nightmare when applying for jobs these days, since they have that ATS parser nonsense, which will throw the entire resume out if it isn't a very plain and boring Word document without much formatting.
Haha my first thought seeing this meme is "do you want to start writing LaTeX by hand? Because this is how you start..."
I have, and it is so worth it. I then use GitHub / GitLab releases to "release" a built PDF for my reference.
I do this using overleaf. It's been much easier to maintain and update since switching.
I wrote mine in LaTeX, highly recommend.
I mean, I spent years writing LaTeX for school so it was real simple and mindless. YMMV
I have enjoyed switching mine to HTML format, which I then generate a PDF from. The only downside is that different browsers can render stuff slightly differently, but that's normally fixable with a one-line CSS change. And it's not like I need to update my resume constantly on different machines.
git tag "FINAL FINAL FINAL DRAFT - v20"
Git is like shit for Word documents
That's why we wrote our thesis in LaTeX: https://github.com/jonte/GGS-report/blob/a9d9d20bcc22a524629e371ce5984f131490b743/report.lyx#L362
I also have my reports in latex inside a git repo, complete with a makefile to generate graphs from csv containing simulation results. However I am too ashamed to publish the entire version control to a public repo
#LyX 2.0 created this file. For more info see http://www.lyx.org/
Wait, I thought you guys did it manually...
Anyway, I should still learn it.
Unzip the docx with a pre-commit hook
(This is not a serious suggestion)
Just like word documents are shit for papers and theses/dissertations it turns out. The formatting alone is a nightmare.
.gitattributes can invoke Word on windows to diff versions, and there are plenty of open source scripts that can do it if you don't have a copy of Word (or Windows) lying around.
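For anyone without Word, here is a rough sketch of what such a script could look like, using only the Python standard library. It assumes the usual docx layout (body text in `word/document.xml`, paragraphs as `w:p`, text runs as `w:t`); the script name `docx2txt.py` and the function `docx_to_text` are hypothetical. Git's textconv mechanism can then call it: a `.gitattributes` line such as `*.docx diff=docx`, plus `git config diff.docx.textconv "python3 docx2txt.py"`.

```python
import sys
import zipfile
import xml.etree.ElementTree as ET

# WordprocessingML namespace used by the elements in word/document.xml.
W = "{http://schemas.openxmlformats.org/wordprocessingml/2006/main}"


def docx_to_text(path):
    """Return the document body as plain text, one line per paragraph.

    `path` may be a file name or a file-like object. Formatting, images,
    comments and tracked changes are ignored in this sketch.
    """
    with zipfile.ZipFile(path) as archive:
        root = ET.fromstring(archive.read("word/document.xml"))
    lines = []
    for para in root.iter(W + "p"):
        # A paragraph's text is split across runs; join the pieces back together.
        lines.append("".join(t.text or "" for t in para.iter(W + "t")))
    return "\n".join(lines)


if __name__ == "__main__" and len(sys.argv) > 1:
    # git passes the path of a temporary copy of the file as the only argument.
    print(docx_to_text(sys.argv[1]))
```

This only makes `git diff` readable. The repository still stores every revision of the binary file.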
But Word is like shit for papers. Use LaTeX instead.
Why on Earth would you curse yourself with MS Office anyway, especially if writing docs is your professional responsibility?
Why not use Git+Markdown+Pandoc, have your copy, data and layout separate?
I understand that a lot of institutions/companies impose stylistic/technical requirements for docs and publications; that still doesn't mean you gotta stay married to the worst tooling.
Still better than using file names.
I learned LaTeX just so I could effectively use git in it.
Yes, I also made my thesis slides in LaTeX, which was nice as I could reuse the figures.
I mean yes, you can use beamer to make slides, but it is a lot less flexible than ppt/LibreOffice Impress.
Had to write a paper in college with 100 citations.
We used zotero for citation management, and it would dump a bibtex file on demand.
The paper was written in markdown, stored in git, and rendered through pandoc. We would cite a paper with parentheses and something resembling an id, like (lewis).
We gave pandoc a “citation style definition”, and it took care of everything. Every citation was perfectly formatted. The bibliography was perfectly formatted. Inline references were perfect. Numbering was perfect. All the metadata was ripped from pdfs automatically. It was downright magical.
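For the curious, the render step probably boils down to a single pandoc call. Here is a hedged sketch of it wrapped in Python: the file names `paper.md`, `refs.bib`, `style.csl` and `paper.pdf` are placeholders, not the ones we used, and `--citeproc` assumes a reasonably recent pandoc (older versions used a separate citeproc filter). In pandoc's markdown, a citation is written like `[@lewis]`, where `lewis` is the key from the bibtex file.

```python
import subprocess


def pandoc_command(source, bibliography, csl, output):
    """Build the pandoc argument list for a citation-aware render."""
    return [
        "pandoc", source,
        "--citeproc",                    # resolve [@key] citations and build the bibliography
        "--bibliography", bibliography,  # e.g. the bibtex file exported from Zotero
        "--csl", csl,                    # the citation style definition
        "-o", output,                    # output format is inferred from the extension
    ]


def build(source="paper.md", bibliography="refs.bib", csl="style.csl", output="paper.pdf"):
    """Run pandoc; raises CalledProcessError if the render fails."""
    subprocess.run(pandoc_command(source, bibliography, csl, output), check=True)
```

Calling `build()` in the repository root would regenerate the PDF. Swapping the output name to `paper.docx` should give a Word file from the same sources.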
This is what I (a non coder who only knows git "download the Yuzu repo before they nuke it" and git "give me all the updates") want to do when I get to write a paper. How much git did you have to learn to do this?
This is just the basic workflow: make changes to a file, then git add and git commit. Other features of git, like branching, can be leveraged for greater control but are optional. What makes it magical is three separate systems working together in harmony: git, Zotero and pandoc. Zotero is a citation manager that you can use to store scientific articles, papers, theses, etc., and it can produce a bibliography file. Pandoc can reference that file, along with the citations, from the makefile to create a cleanly typeset Word document or LaTeX PDF with precise numbering, table of contents, citations and bibliography in the correct format, without you needing to edit anything.
yep, markdown is a great alternative to LaTeX if you don't need fancy layouts or anything special
Markdown + pandoc means it goes through an intermediary latex template on the way to pdf land - which means your markdown can be a bastardized mix of markdown, html, latex commands, and sometimes more ;)
Exactly my workflow, but I used R Markdown!
I absolutely love R markdown! Being able to iterate on your analysis and report at the same time is fantastic
I also added a Makefile for mine (LaTeX), and it would add the commit hash to the front page (with an asterisk if the repository had uncommitted changes).
So, if I gave a draft to someone and got feedback, I'd know exactly which revision it was.
Hey, amazing idea, can you share the code?
Sure thing. This also includes the beamer bit which I used for my defense. It's all pretty hacky but hope it's useful!
# Makefile for compiling LaTeX documents.
#
# Errors aren't handled gracefully (tex doesn't write to stderr, it seems)
# If you encounter errors, use "make verbose"
#
# For small changes (probably those without references), use "make quick"
#
# Thanks to https://gist.github.com/Miliox/4035649 for dependency outline

TEX = pdflatex
BTEX = biber
MAKE = make -s
TEXFLAGS = -halt-on-error
# $(MAIN).log is dumb if we have multiple targets!
SILENT = > /dev/null || cat $(MAIN).log
SILENT_NOER = 2>/dev/null 1>/dev/null
EDITOR = vim -p
PDFVIEW = evince
MAIN = main
PRES = presentation
ALL = $(MAIN).pdf
RECURS = media/ manuscripts/
VERSION := $(shell git rev-parse --short HEAD | cut -c 1-4)$(shell git diff-index --quiet HEAD && (echo -n ' ';git log -1 --format=[%cd]) || (echo -n '* '; date -u '+[%c]'))

all: recurs $(ALL)

pres: $(PRES).pdf

scratch: scratch.pdf

scratch.pdf: scratch.tex
	@echo "TEX (final) $<"
	@$(TEX) $(TEXFLAGS) $< $(SILENT)

verbose: SILENT = ''
verbose: $(ALL)

recurs: $(RECURS)
	@$(foreach DIR, $(RECURS), \
		echo "MAKE (CD) $(CURDIR)/$(DIR)"; \
		$(MAKE) -C $(DIR) $(MAKECMDGOALS);)
	@echo "MAKE (CD) ./"

clean:
	@echo "SH (RM) Not recursing; 'make allclean' to clear generated files."
	@rm -f *.aux *.log *.out *.pdf *.bbl *.blg *.toc *.lof *.lot *.bcf *.run.xml

allclean: recurs
	@echo "SH (RM) A clean directory is a happy directory"
	@rm -f *.aux *.log *.out *.pdf *.bbl *.blg *.toc *.lof *.lot *.bcf *.run.xml

version:
	@echo "SH (ver) $(VERSION)"
	@echo $(VERSION) > VERSION.tex

nixpages: main.pdf
	@echo "PDF (pdftk)"
	@pdftk main.pdf cat 1 4-end output final.pdf

quick: $(MAIN).tex version
	@echo "TEX (final) $<"
	@$(TEX) $(TEXFLAGS) $< $(SILENT)

$(MAIN).pdf: $(MAIN).tex $(MAIN).bbl all.tex tex/abstract.tex tex/intro.tex tex/appendix.tex tex/some_section.tex tex/some_other_section.tex
	@echo "TEX (draft) $<"
	@$(TEX) $(TEXFLAGS) --draftmode $< $(SILENT)
	@echo "TEX (final) $<"
	@$(TEX) $(TEXFLAGS) $< $(SILENT)

$(MAIN).bbl: $(MAIN).aux
	@echo "BIB (bib) $(MAIN)"
	@$(BTEX) $(MAIN) > /dev/null

$(MAIN).aux: $(MAIN).tex $(MAIN).bib version
	@echo "TEX (draft) $<"
	@$(TEX) $(TEXFLAGS) --draftmode $< $(SILENT)

$(PRES).pdf: $(PRES).tex $(PRES).bbl tex/beamer*.tex tex/slides/*.tex
	@echo "TEX (draft) $<"
	@$(TEX) $(TEXFLAGS) --draftmode $< $(SILENT)
	@echo "TEX (final) $<"
	@$(TEX) $(TEXFLAGS) $< $(SILENT)

$(PRES).bbl: $(PRES).aux
	@echo "BIB (bib) $(PRES)"
	@$(BTEX) $(PRES) > /dev/null

$(PRES).aux: $(PRES).tex $(MAIN).bib
	@echo "TEX (draft) $<"
	@$(TEX) $(TEXFLAGS) --draftmode $< $(SILENT)

edit:
	@echo "EDIT (fork) $(EDITOR)"
	@$(EDITOR) ./tex/*.tex *.tex

view:
	@echo "VIEW (fork) $(PDFVIEW)"
	@$(PDFVIEW) $(ALL) $(SILENT_NOER) &
Makefile in other comments. You'll need something like this on the title page (this assumes you use my Makefile, which puts the version in VERSION.tex; that's the literal name of the file, not a placeholder):
{\bf{\color{red}DOCUMENT REVISION:}} {\color{blue}\input{VERSION}}
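If the one-line VERSION shell expression in the Makefile is hard to read, here is roughly the same logic as a Python sketch. It is an approximation, not a drop-in replacement: the function names are made up for illustration, it assumes `git` is on the PATH and that it runs inside the repository, and the date format may differ slightly from what `date -u '+[%c]'` prints.

```python
import subprocess
from datetime import datetime, timezone


def format_version(short_hash, dirty, stamp):
    """Format like 'a9d9 [date]' for a clean tree, 'a9d9* [date]' with uncommitted changes."""
    marker = "*" if dirty else ""
    # Only the first four characters of the hash are kept, as in the Makefile.
    return f"{short_hash[:4]}{marker} [{stamp}]"


def git(*args):
    """Run a git command and return its trimmed standard output."""
    result = subprocess.run(["git", *args], capture_output=True, text=True, check=True)
    return result.stdout.strip()


def current_version():
    """Build the revision string for the repository in the current directory."""
    short_hash = git("rev-parse", "--short", "HEAD")
    # diff-index --quiet exits non-zero when tracked files differ from HEAD.
    dirty = subprocess.run(["git", "diff-index", "--quiet", "HEAD"]).returncode != 0
    if dirty:
        stamp = datetime.now(timezone.utc).strftime("%c")  # time of the build
    else:
        stamp = git("log", "-1", "--format=%cd")            # date of the last commit
    return format_version(short_hash, dirty, stamp)


def write_version_file(path="VERSION.tex"):
    """Write the revision string to the file that the title page inputs."""
    with open(path, "w") as handle:
        handle.write(current_version() + "\n")
```

Calling `write_version_file()` before each LaTeX run would play the same role as the `version` target.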
docx files are actually zip archives with xml in them
Let me tell you something. I cannot tell you what company, but I have been tasked with putting Excel files in git "because they are just zip archives with xml" and it is just a disaster. Every time you save the document, it will save certain parts of the XML in arbitrary ways (for example, each image is in a list and the order of that list is random every time), some metadata, like the last-modified time, is rewritten every time, and finally all the XML files are one single line. The git diffs are completely useless and noisy, and just looking at the Excel file will cause git to consider it updated. So sure, you can use git to snapshot your Office documents... But just don't.
If you are, like I once was, the poor fool who has to maintain a bunch of VBA macros... Extract them into files and source control those. Make a script to extract them and to put them back, and use git-lfs for the actual workbook if you need a template workbook.
Now pardon me, I need to add this to the agenda for my next therapy.
Just fork git to handle zipping, formatting and ignoring metadata! Or just put your office document in the cloud and use the basic versioning it provides.
Doesn't matter, to git they are still binary files, which means it'll check in each revision as an entirely new copy.
Yes, you might only see the most recent one in your working directory, but under the hood, all the other ones are still there in the repo.
Someone could probably build a tool which sits in between you and Git, which unzips the file before committing and after pulling, so Git sees the raw xml file, but you always see the zipped docx.
edit: never mind. Just read @petersr@lemmy.world's comment explaining why this is a bad idea.
Which isn't any different than keeping them as separate files space wise so what's the problem?
(Other than Word having built-in versioning.)
I think you can write a clean/smudge filter that will turn a docx into a tree (folder).
What's a good way to learn about Latex and Git. I've tried learning on my own but it's very overwhelming.
Overleaf is easy to use and has tutorials for LaTeX
Never heard of latex but I can help you with Git.
What do you want to know?
It is a pity that Markdown does not have the possibilities of Latex.
Typst is Markdown-ish with the possibilities of LaTeX.
I wrote about half of my thesis in R Markdown, using Git to back up my work. It's fantastic because you can have your plots and statistics integrated directly into your paper, and formatting in Markdown is much easier than straight-up LaTeX.
R markdown is awesome. I'd always use it for my biostatistics tests and assignments.
Me with Jupyter Notebooks
I recently read a tutorial titled: "how to annoy your collaborators: a git CI pipeline for LaTeX" ;)
I encountered an engineering firm that did this. I wanted to do it too.
The company I worked for at the time (said engineering firm was doing subcontracting for us) was full of older business people who could never in a million years have wrapped their heads around the idea.
I also met this at a contracting job. Drove me bonkers.
This is how you know they're irrelevant. Take the time to learn or just retire.
Latex and git ❤️
"Delete this repository" ate my homework.
git checkout -b final_version_revised2_REALLYFINALTHISTIME
git commit -am "holy fuck I hope this really is the last edit"
git push
Okay, I have a question. I would love to write my papers in LaTeX, but none of my colleagues use it. Is there a way to reasonably collaborate with coauthors who only use Word and for whom LaTeX would be confusing and difficult?
You don't. You could try Overleaf or some WYSIWYG editor for LaTeX, but both need some getting used to and at least a minute amount of effort. Overleaf probably has the lowest barrier to entry (zero setup required), but beyond the free tier it is a paid service.
It's possible to selfhost overleaf if you don't want to pay them
"some wysiwyg editor for LaTeX"
LyX is basically that.
It depends on what sort of collaboration. For things on which I was the sole author, like my dissertation, I leveraged the miracle that is pandoc. Every email my advisor got from me was a perfectly formatted Word doc with a flawless bibliography and he never had to learn what the hell LaTeX is.
But if you have multiple contributors going back and forth, or need to keep long-lived discussions in the track changes panel, you’re better off not trying to teach others a new tool. Unless they have a genuine interest in it, in which case the WYSIWYG editors can be fun.
Markdown and pandoc are a match made in heaven for this. If you didn't know, Markdown is a plain-text format with a simple syntax for formatting (which gets carried over when you use pandoc). It supports LaTeX equations, can carry metadata as a YAML block at the top of the file (which pandoc can use for customization), and supports citations with a bibliography file. Pandoc is a document converter between multiple formats and can produce Word files, PowerPoints, HTML files, LaTeX PDFs (books, reports, Beamer presentations), etc. You can also give pandoc a template to work with, and it will produce output in that format. Not to mention, since it's plain text, you can apply git version control and also use makefiles to iteratively compile new outputs.
There is also R Markdown (or its newer successor, Quarto), which is the same Markdown pipeline but can also run the code inside a section, attach the result to the Markdown file, and then do the whole pandoc thing afterwards. Think of it as Jupyter Notebook-style literate programming with Markdown. Here's a demonstration of its capabilities. https://youtu.be/_D-ux3MqGug
Assuming your colleagues can work with git but not LaTeX, you can set up a git repo with just Markdown files, collaborate on that, and have a makefile or Docker container generate the final Word file or PDF. Here's a good example of a pandoc makefile: https://gist.github.com/kristopherjohnson/7466917
In the worst-case scenario, where they only work with Word files, you can generate one from your Markdown files, share it with them, and pull the changes they send back on the Word document into your Markdown.
P.S. I assume Org-Mode can also substitute Markdown here in the pipeline. But I haven’t committed to it, so I’m not fully sure.
IMO LyX is way simpler than LaTeX for basic stuff, but because it is literally not Microsoft Word I couldn't really use it to collaborate with people this semester, let alone convince them to work on a full LaTeX document. LyX would be the way to go if my colleagues were even remotely interested in learning about literally anything. You can lead a horse to water, but you can't make it drink...
Don't you automatically put everything relevant you create in a version control system? And if not, why?
There's no thinking involved in it. Create repo; run editor. The sequence is automatic.
Only makes sense if it's text files (like source code). Even if DOCX files are just a bunch of XML files wearing a ZIP trenchcoat, as this guy says, chances are git doesn't know that, so it'll treat the whole thing as a binary file and save each revision as a separate file entirely, in which case you haven't really accomplished much other than hiding away all those intermediate versions in an invisible drawer.
I'm dumb, can someone explain this joke to me? Wtf is a git repo?
Git is a tool that makes it convenient and lightweight to keep past snapshots of a directory of text files (called a repository) and compare them. It also makes it easy to have multiple people work in parallel on the content of the directory, see the differences and merge everything into a common version. It is essential in programming, it's called versioning or version control.
It is not easy to access for non-programmers, though, because it's based on slightly obscure command lines. So it's a bit of over-engineering to use it for a single file edited by a single person, especially because you can now put those on the cloud and have some form of version control that lets you easily compare and go back to previous versions graphically.
It may be worth it if it's a long document that you work upon for a long time, such as a PhD thesis.
Thanks for explaining!
Git tracks changes for a folder full of code (aka "repo") between saves (called "commits") so you can revert back to previous versions. It's intended for software but there's nothing stopping you from using it for documents
Thanks!
I wrote my thesis in Google Docs on my university account.
Google Docs is perfect for stuff like this. Great history management, better (though not great) at formatting and stuff, has features like revisions from editors and notes. Way better than Word IMO.
Don't put binary files in git
It's not ideal, but for a thesis --- which ideally has an end date after which it won't be used --- it's not a huge problem I'd argue.
Both Microsoft Office and Open/LibreOffice files are zip files containing mostly XML, so it's not that bad actually, as long as you expose the files inside to git.
What's the issue with binaries in git? Just that diff'ing binary files is useless?
They are generally large, incompressible, and replaced wholesale instead of updated like text files. All files stay in the repo history forever, so they make repos big and slow compared to text files, with no advantages provided (e.g. as you said, diffing etc. is useless).
If a binary file needs to be stored in git, it's usually more appropriate to use git LFS for that file. Git LFS stores the binary outside of the repo in the same way that database engines store binary outside of the respective table.
In this case, it would be much smarter to use version control on the text in the document, not the binary file, which is a feature of essentially every document writer program.
Between zfs and git, all my important data is versioned.
BTRFS for all us lame folks.
PS: Windows Previous Versions is actually pretty good, but no one uses it on desktop.
Don't forget to push.
Several times I've lost large chunks of work because I usually copy files from the main folder to backup folders, but occasionally I copy files from a folder that was an old backup, reverting all files everywhere by mistake.
Fourth panel from Mark Pilgrim:
Counterpoint: advisor said no.
"Just use Word, everyone else does. I have never heard of this latex thing, so must be just some trendy useless overengineered software that does Word's job but worse. Word can track changes just fine, and you can leave comments."
proceeds to strikethrough, highlight, and inline comment everything instead of using either of those features
"I want to read what you wrote, not fight technology"
proceeds to email you three separate times after forgetting to attach v28 about how a graphic looks wrong because Word ate it
While correct in the sense of word and versioning via mail being a nightmare, I really don't think you can expect anyone to learn latex just so they can comment in your document. I would have offered to send a pdf. Shoot me.
I would have never considered doing anything but sending a PDF. Even if they do know LaTeX. Unless they're offering to help edit the code for me, what good is it? It's objectively harder to read than the formatted PDF.
That said, marking up a PDF is much more difficult and does require more specialised software and know-how than editing plain text or even editing a Word document. So there are some advantages to it.
you can still use word with git. it's versioning first, diffing and merging only where possible. since you probably won't branch you won't need the latter, though.
Preaching to the choir. "But Box already supports 'versioning', why use a confusing hacker tool instead?"
Missing diffs is a problem, though.
I don't get how Microsoft owns GitHub yet still hasn't figured out any way to create a git-compatible spec for Excel, Word, and PowerPoint files.
Dude was, shall we say, hands-on about certain things. My dissertation is still embargoed because he is paranoid about being scooped. Joke's on him, everything that hasn't been published is not exciting enough to meet his own metric for publishability.
I did this and I had no issues with any of the theses I have submitted in my bachelor's or master's.
First year calculus teacher, thank you SO much for forcing us to write submissions in latex.
Also, Overleaf is a thing. This is not like my 1st year of uni; this is 11 years later or so. If your fucking professor never heard of LaTeX, they are just bad at academia and shouldn't be teaching, honestly. It's not just about the field knowledge.
That's assuming they are competent enough to even use a PDF.
I do this, but from Word.
I learned LaTeX for my master's thesis. Never used it again afterwards, except for my résumé.