arxiv-latex-cleaner

Python ★ 6.9k updated 2mo ago

arXiv LaTeX Cleaner: Easily clean the LaTeX code of your paper to submit to arXiv

This is a command-line tool for researchers who write academic papers in LaTeX and need to submit them to arXiv, the popular preprint server. The tool takes your paper's source folder and produces a cleaned-up copy that is ready to zip and upload.

The cleaning process addresses two practical concerns: privacy and file size. For privacy, it strips all comments from your LaTeX code, which would otherwise be visible to anyone who downloads the source from arXiv. It can also remove helper commands, such as todo notes, that you defined while writing but do not want in the final submission. For file size, arXiv has a 50MB limit, so the tool removes unused image files and unused tex files, and can optionally resize images, compress PDFs, or convert PNG files to JPG to shrink the package.
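
To make the comment-stripping step concrete, here is a minimal sketch in Python. It is not the tool's actual implementation: the function name strip_comments is hypothetical, and the sketch assumes that an escaped percent sign (\%) must survive and that a bare % is kept at the end of a line so LaTeX's line-joining behaviour does not change. It does not handle verbatim environments.

```python
import re

# A '%' starts a comment unless an odd number of backslashes precedes it.
# The group captures an even run of backslashes (e.g. a '\\' line break)
# so that it can be put back in front of the '%'.
_COMMENT = re.compile(r"(?<!\\)((?:\\\\)*)%.*")


def strip_comments(tex: str) -> str:
    # Illustrative helper (hypothetical name): remove comment text line by line.
    cleaned = []
    for line in tex.splitlines():
        if line.lstrip().startswith("%"):
            continue  # the whole line is a comment, so drop it
        # Keep the '%' itself but discard everything after it.
        cleaned.append(_COMMENT.sub(r"\1%", line))
    return "\n".join(cleaned)


if __name__ == "__main__":
    sample = "Accuracy is 50\\% % TODO: recheck\n% private note\nDone."
    print(strip_comments(sample))
```

Running the sketch on the sample keeps the escaped percent sign in the first line, drops the private note entirely, and leaves the last line unchanged.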

For diagrams drawn with TikZ (a common way to create figures directly in LaTeX), the tool can replace the raw source code with pre-compiled image files, which hides your diagram code from public view. There is also support for custom find-and-replace rules written in a config file, which lets you swap out your own shorthand commands for standard LaTeX before submitting.

You can install it with pip, or with Homebrew on macOS. You point it at your input folder, optionally pass flags or a config file, and it produces a new cleaned folder alongside the original. The project is from Google Research and is written in Python; it requires Python 3.9 or higher.
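
Once the cleaned folder exists, the remaining step is to zip it and check its size. The following Python sketch shows one way to do that with the standard library. The folder name paper_arXiv, the function names, and the reading of the 50MB figure as 50 * 1024 * 1024 bytes are assumptions for illustration, not part of the tool.

```python
import shutil
from pathlib import Path

# Size limit quoted above; treating "MB" as 1024 * 1024 bytes is an assumption.
ARXIV_LIMIT_BYTES = 50 * 1024 * 1024


def zip_cleaned_folder(cleaned_dir: str) -> Path:
    # Create <cleaned_dir>.zip next to the folder and return its path.
    src = Path(cleaned_dir)
    archive = shutil.make_archive(str(src), "zip", root_dir=src)
    return Path(archive)


def fits_limit(archive: Path, limit: int = ARXIV_LIMIT_BYTES) -> bool:
    # True if the archive is no larger than the given limit in bytes.
    return archive.stat().st_size <= limit


if __name__ == "__main__":
    folder = "paper_arXiv"  # hypothetical name of the cleaned output folder
    if Path(folder).is_dir():
        archive = zip_cleaned_folder(folder)
        print(archive, "fits" if fits_limit(archive) else "is too large")
```

If the archive turns out to be too large, the optional image resizing and PDF compression described earlier are the natural next things to try.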