word_cloud
A little word cloud generator in Python
A Python library that generates word cloud images from text, sizing each word by how often it appears. Works from a Python script or the command line, and supports custom shapes, multiple languages, and color options.
A word cloud is a visual in which words from a piece of writing are displayed at different sizes: the more frequently a word appears in the source text, the larger it is drawn. Word clouds are commonly used in presentations, reports, and social media to give a quick visual impression of what a document or dataset is about.
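The frequency-to-size idea can be illustrated with a few lines of standard-library Python. This is only a sketch of the concept, not the library's actual layout algorithm: the function name font_sizes, the linear scaling, and the size limits are all assumptions made for the example.

```python
import re
from collections import Counter


def font_sizes(text, min_size=10, max_size=60):
    """Map each word to a font size that grows with its frequency.

    Hypothetical helper for illustration only; the scaling rule is an
    assumption and is not taken from the wordcloud library.
    """
    # Lowercase the text and split it into simple word tokens.
    words = re.findall(r"[a-z']+", text.lower())
    counts = Counter(words)
    if not counts:
        return {}
    top = max(counts.values())
    # Scale linearly so the most frequent word gets max_size.
    return {
        word: round(min_size + (max_size - min_size) * count / top)
        for word, count in counts.items()
    }


sizes = font_sizes("data data data model model chart")
for word, size in sorted(sizes.items(), key=lambda item: -item[1]):
    print(word, size)
```

The most frequent word receives the largest size, and rarer words shrink toward the minimum.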
Install the library with pip (pip install wordcloud) or conda (conda install -c conda-forge wordcloud), the two most common Python package managers. Once installed, you can use it from a Python script or directly from the command line through the wordcloud_cli command. The command-line version takes a plain text file as input and writes an image file, so you can generate a word cloud without writing any code.
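If you want to drive the command-line tool from a script, one option is to call it through subprocess. The sketch below assumes the wordcloud package is installed and exposes the wordcloud_cli command with its --text and --imagefile options; the helper names build_cli_command and run_cli and the file names are hypothetical.

```python
import shutil
import subprocess


def build_cli_command(text_path, image_path):
    """Return the argument list for one wordcloud_cli run (hypothetical helper)."""
    return ["wordcloud_cli", "--text", text_path, "--imagefile", image_path]


def run_cli(text_path, image_path):
    """Run wordcloud_cli, failing early if the command is not on PATH."""
    if shutil.which("wordcloud_cli") is None:
        raise RuntimeError("wordcloud_cli not found; install the wordcloud package first")
    # check=True raises CalledProcessError if the tool exits with an error.
    subprocess.run(build_cli_command(text_path, image_path), check=True)


# Example call (file names are placeholders):
# run_cli("mytext.txt", "wordcloud.png")
```

Keeping the command construction separate from the call makes it easy to inspect or log the exact command before running it.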
The library supports several visual styles. A simple word cloud places words on a plain background, sizing each by its frequency and fitting it wherever free space remains. A masked version lets you supply a shape image, and the words are arranged to fill only that shape, so you could get a word cloud in the outline of an animal or a letter. The library also supports color customization and languages beyond English, including Arabic.
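A minimal sketch of the simple and masked styles from Python might look like the following. It assumes the wordcloud and numpy packages are installed and uses the library's WordCloud class with its background_color and mask parameters, where pure white mask pixels are treated as masked out. The helpers circle_mask and render, the circular shape, and the output file names are made up for the example; in practice the mask would usually be loaded from an image file.

```python
def circle_mask(size):
    """Build a size x size grid: 0 inside a circle (drawable), 255 outside.

    Hypothetical helper; white (255) cells are where no words should go.
    """
    center = (size - 1) / 2
    radius = size / 2
    return [
        [0 if (x - center) ** 2 + (y - center) ** 2 <= radius ** 2 else 255
         for x in range(size)]
        for y in range(size)
    ]


def render(text, path, mask_rows=None):
    """Render text to an image file, optionally restricted to a mask shape."""
    # Third-party imports are kept here so circle_mask works without them.
    import numpy as np
    from wordcloud import WordCloud

    mask = None if mask_rows is None else np.array(mask_rows, dtype=np.uint8)
    cloud = WordCloud(background_color="white", mask=mask)
    cloud.generate(text)
    cloud.to_file(path)
    return path


# Example calls (sample_text is any reasonably long string you supply):
# render(sample_text, "simple.png")
# render(sample_text, "circle.png", mask_rows=circle_mask(400))
```

When a mask is given, the output image takes its dimensions from the mask, so the mask needs to be large enough for words to fit.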
For working with PDFs, the README suggests piping the text output of pdftotext, a PDF-to-text conversion tool that many Linux distributions include by default, into the word cloud command.
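The same pipeline can be sketched in Python with subprocess, assuming the conversion tool is pdftotext and that wordcloud_cli reads its text from standard input when no --text option is given. The helper names build_pdf_pipeline and pdf_to_wordcloud and the file names are hypothetical.

```python
import subprocess


def build_pdf_pipeline(pdf_path, image_path):
    """Return the two commands of the PDF-to-word-cloud pipeline."""
    # "-" tells pdftotext to write the extracted text to standard output.
    extract = ["pdftotext", pdf_path, "-"]
    # With no --text option, wordcloud_cli is assumed to read from stdin.
    render = ["wordcloud_cli", "--imagefile", image_path]
    return extract, render


def pdf_to_wordcloud(pdf_path, image_path):
    """Extract text from a PDF and feed it to wordcloud_cli."""
    extract, render = build_pdf_pipeline(pdf_path, image_path)
    extracted = subprocess.run(extract, check=True, capture_output=True)
    subprocess.run(render, input=extracted.stdout, check=True)


# Example call (file names are placeholders):
# pdf_to_wordcloud("mydocument.pdf", "wordcloud.png")
```

Capturing the extracted text before passing it on keeps the sketch short; for very large PDFs, connecting the two processes with a pipe would avoid holding the whole text in memory.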
The library is MIT licensed and tested against several recent versions of Python. It depends on three common Python packages: NumPy for math, Pillow for image handling, and matplotlib for plotting. The code was originally shared in a 2012 blog post and has been maintained and expanded since then.
Where it fits
- Generate a word cloud image from a plain text file using the command line, no coding required
- Create a shaped word cloud that fills the outline of any image, such as a logo or animal silhouette
- Visualize the most frequent topics in a document or dataset for a presentation or report
- Add word cloud generation to a Python data pipeline or web application