How does it work?

In the simplest terms:

  1. Each page is given a numerical index. The first page of the first book is page 1, the first page of the second book is 411 (each book has 410 pages) and so on, up to the last page in the entire library, which has the index 29^3200.
  2. This number is run through a mathematical function to produce another number, which we write as 3200 digits in base 29 (29 is the number of characters in our limited alphabet).
  3. Each digit of the base-29 result is mapped to the character at that position in the limited 29-character alphabet (0 → a, 1 → b, ...) to produce the content of that page.
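
Step 3 can be sketched in a few lines of Python. This is only an illustration, not the site's actual code: the alphabet ordering (26 letters followed by space, comma and full stop) and the names ALPHABET, PAGE_LENGTH, number_to_page and page_to_number are assumptions made for the example.

```python
# Assumed alphabet: the 26 lowercase letters plus space, comma and full stop.
# The real ordering used by the site may differ.
ALPHABET = "abcdefghijklmnopqrstuvwxyz ,."
BASE = len(ALPHABET)      # 29
PAGE_LENGTH = 3200        # characters per page


def number_to_page(value: int) -> str:
    """Write `value` as 3200 base-29 digits and map each digit to a character."""
    if not 0 <= value < BASE ** PAGE_LENGTH:
        raise ValueError("value does not fit in 3200 base-29 digits")
    chars = []
    for _ in range(PAGE_LENGTH):
        value, digit = divmod(value, BASE)   # peel off the lowest digit
        chars.append(ALPHABET[digit])
    return "".join(reversed(chars))          # most significant digit first


def page_to_number(page: str) -> int:
    """Inverse of number_to_page: read the page back as a base-29 number."""
    value = 0
    for ch in page:
        value = value * BASE + ALPHABET.index(ch)
    return value
```

With this mapping, the number 0 becomes a page of 3200 "a" characters, and any page made only of alphabet characters converts back to exactly one number.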

Crucially, the mathematical function is reversible, meaning that we can instead give it the contents of a page and determine the page number that it appears on.
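
To see what "reversible" means in practice, here is a minimal sketch of one possible invertible function. It is not necessarily the function this site uses: it assumes a simple affine map modulo 29^3200, with a hypothetical multiplier A and offset C, and it counts pages from 0 rather than 1 for simplicity. Any multiplier that shares no factor with 29 has a modular inverse, which is what makes the map reversible.

```python
from math import gcd

N = 29 ** 3200     # number of distinct pages
A = 3 ** 5000      # hypothetical multiplier; must share no factor with 29
C = 7 ** 3000      # hypothetical offset

assert gcd(A, N) == 1          # guarantees that A has an inverse modulo N
A_INV = pow(A, -1, N)          # modular inverse (Python 3.8+)


def scramble(index: int) -> int:
    """Map a page index to the number that encodes the page's content."""
    return (A * index + C) % N


def unscramble(value: int) -> int:
    """Recover the page index from the number encoding a page's content."""
    return ((value - C) * A_INV) % N
```

Because every index maps to a distinct value and back again, no lookup table is needed in either direction. A real implementation would likely want a function that mixes its input more thoroughly, since an affine map sends neighbouring indices to closely related outputs.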

For a more detailed explanation and to view the source code, please see the GitHub repository.

Is it fake? Is it just creating pages with my search terms and saving them so I can see them again later?

No, that would be cheating! It is easy to assume that this website just takes whatever text you search for, inserts it into a random page, and saves that page on disk so that you can find it again at a later date. Similarly with pages of gibberish, what’s to say they aren’t just generated randomly when you request them and saved for later?

Building a Library of Babel this way would require unobtainable amounts of storage. A single page (content only) takes up 3200 bytes, with a single byte per ASCII character. 3200 bytes multiplied by 29^3200 unique pages gives us a total size of 1.509×10^4671 TB, an infeasible amount of data to store, to say the least.
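
The figure above can be checked with logarithms, which avoids printing a number thousands of digits long. The sketch below assumes 1 TB = 10^12 bytes; the function name library_size_tb is made up for the example.

```python
from math import floor, log10

PAGE_BYTES = 3200      # one byte per character
ALPHABET_SIZE = 29
PAGE_LENGTH = 3200     # characters per page


def library_size_tb() -> tuple[float, int]:
    """Return (mantissa, exponent) of the total size in TB, in scientific notation."""
    # log10(3200 * 29**3200) = log10(3200) + 3200 * log10(29)
    log_bytes = log10(PAGE_BYTES) + PAGE_LENGTH * log10(ALPHABET_SIZE)
    log_tb = log_bytes - 12                 # assumes 1 TB = 10**12 bytes
    exponent = floor(log_tb)
    mantissa = 10 ** (log_tb - exponent)
    return mantissa, exponent


mantissa, exponent = library_size_tb()
print(f"{mantissa:.3f} x 10^{exponent} TB")
```

Running it prints 1.509 x 10^4671 TB, matching the total quoted above.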

Thus, to build a virtual Library of Babel, we have to use a method that doesn’t require any storage. Pages are generated on the fly based on their page number, and the same page number will continue to generate the same page forever, unless the algorithm is changed.

Where can I learn more about the Library of Babel?

If you haven’t already read the story then start there! The website also contains lots of supplemental information and theory, as well as another implementation of the library itself.