This painstaking process took place over several weeks for the 100+ recipes. Eventually, I ended up with 100+ Markdown files with the latest version of the code and text of the first edition. Then, I manually merged the two versions, integrating the latest version of the text obtained from the PDF into the Markdown sources. I also converted the original Jupyter notebooks into Markdown. I cleaned the output semi-manually with some Python.
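As an illustration of the notebook-to-Markdown conversion mentioned above, here is a minimal sketch that uses only the standard library. It is not the script used for the book: the function name `notebook_to_markdown` and its `lang` parameter are hypothetical, and it assumes notebooks in the nbformat 4 JSON layout (a top-level `cells` list). Cell outputs are ignored.

```python
import json


def notebook_to_markdown(nb_json: str, lang: str = "python") -> str:
    """Convert the JSON text of a notebook into a Markdown string.

    Hypothetical helper for illustration: Markdown cells are copied
    verbatim, code cells are wrapped in fenced blocks, outputs are dropped.
    """
    fence = "`" * 3  # built dynamically so this example can itself be fenced
    nb = json.loads(nb_json)
    parts = []
    for cell in nb.get("cells", []):
        source = cell.get("source", "")
        # In nbformat 4, "source" may be a string or a list of lines.
        if isinstance(source, list):
            source = "".join(source)
        source = source.strip("\n")
        if not source.strip():
            continue  # skip empty cells
        if cell.get("cell_type") == "markdown":
            parts.append(source)
        elif cell.get("cell_type") == "code":
            parts.append(f"{fence}{lang}\n{source}\n{fence}")
    return "\n\n".join(parts) + "\n"
```

In practice, `jupyter nbconvert --to markdown` does this conversion and also handles outputs and images. A hand-rolled version like this one is mainly useful when the output needs to be customized.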
How to obtain Markdown files from the PDF? I used pdftotext, a little tool that extracts plain text from a PDF file. To see LaTeX equations in Markdown files on GitHub, one has to use a browser extension, such as one for Chrome, because GitHub doesn't support this natively. To save time and to avoid the issues of the first edition, I arranged for the editors to do only a light proofreading pass. A restriction I had was that I wanted the Markdown files to be nicely rendered on GitHub. At the same time, I had negotiated with Packt Publishing that I would not use anything other than Markdown and Jupyter, except for the very last minor edits made by the publisher in PDF. So I was left with two branches of the text: the original unedited notebooks, and the final, edited, proofread text in PDF. For the new edition, I wanted to start from the final version of the first edition. The text edits weren't backported into the original notebook sources.
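The pdftotext extraction and the Python cleanup described above could be scripted roughly as follows. This is a sketch under assumptions, not the actual code used for the book: the helper names `pdf_to_text` and `clean_extracted_text` are hypothetical, the cleanup rules (dropping page-number lines, rejoining words hyphenated across line breaks) are guesses at typical PDF residue, and `pdftotext` must be installed and on the `PATH`.

```python
import re
import subprocess


def pdf_to_text(pdf_path: str) -> str:
    """Run pdftotext and return the extracted text.

    Passing "-" as the output file makes pdftotext write to stdout.
    """
    result = subprocess.run(
        ["pdftotext", pdf_path, "-"],
        capture_output=True,
        text=True,
        check=True,
    )
    return result.stdout


def clean_extracted_text(text: str) -> str:
    """Apply a few assumed cleanup rules to raw pdftotext output."""
    # pdftotext separates pages with form feed characters.
    text = text.replace("\f", "\n")
    # Drop lines that contain only a page number.
    lines = [
        line.rstrip()
        for line in text.splitlines()
        if not re.fullmatch(r"\s*\d+\s*", line)
    ]
    text = "\n".join(lines)
    # Rejoin words hyphenated across a line break. This also merges
    # genuine compounds split at a line end, so the result still needs
    # a manual check.
    text = re.sub(r"(\w)-\n(\w)", r"\1\2", text)
    # Collapse runs of blank lines into a single blank line.
    text = re.sub(r"\n{3,}", "\n\n", text)
    return text.strip() + "\n"
```

A typical use would be `clean_extracted_text(pdf_to_text("chapter.pdf"))`, where `chapter.pdf` is a placeholder file name. The result is plain text, so headings, code blocks, and equations still have to be marked up by hand when merging into the Markdown sources.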
Then, the editing process took place in Word and, after the layout process, in PDF. I had written the first edition in Jupyter notebooks, and I had developed a home-made tool to convert the notebooks into Word, the only format accepted by Packt. In this post, I'll give an overview of the technical process I've used to write the book, using Markdown, Jupyter Notebook, pandoc, and pelican. The writing process was much less painful than with the first edition. A few recipes are exclusive to the printed book and ebook, to be purchased on Packt and Amazon. The released text is available under the CC-BY-NC-ND license, while the code is under the MIT license.
However, the main novelty is that almost the entire book is now freely available on GitHub. As usual, all of the code is available on GitHub as Jupyter notebooks. There are a few new recipes introducing recent libraries such as Dask, Altair, and JupyterLab. All 100+ recipes have been updated to the latest versions of Python, IPython, Jupyter, and all of the scientific packages.