PDF to Audio Book - Infovistar

Overview

In this tutorial, you will learn how to create an audiobook from a given PDF in Python. We will use the PyPDF2 module to extract the text from the PDF file and the pyttsx3 module to convert that text to speech.

Installing the modules

To install the PyPDF2 and pyttsx3 modules, along with their dependencies, use the pip command:

pip install pypdf2

pip install pyttsx3
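To confirm that both installs succeeded before running the tutorial script, a quick check along these lines may help. This is a minimal sketch: the helper name missing_modules is hypothetical, and it relies only on the standard library's importlib.util.find_spec, which returns None for a top-level module that cannot be found.

```python
from importlib.util import find_spec


def missing_modules(names):
    """Return the module names from `names` that cannot be found for import."""
    return [name for name in names if find_spec(name) is None]


# The import names used in this tutorial; pip package names are not case-sensitive,
# but import names are.
missing = missing_modules(["PyPDF2", "pyttsx3"])
if missing:
    print("Not installed:", ", ".join(missing))
else:
    print("PyPDF2 and pyttsx3 are both available.")
```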

Explanation

1) Import the libraries

2) Create a PDF file object

3) Create a PDF reader object

4) Print the number of pages in the PDF file

5) Convert the text to speech using pyttsx3

6) Count the number of pages and run the loop up to the page you want to read
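Step 6 mentions reading only up to a chosen page. Readers usually think in page numbers starting at 1, while PDF libraries index pages from 0, so a small conversion helper can avoid off-by-one mistakes. The sketch below is one possible way to do it; the name page_indexes and its parameters are hypothetical and not part of PyPDF2.

```python
def page_indexes(total_pages, first_page=1, last_page=None):
    """Return 0-based indexes for the 1-based, inclusive range first_page..last_page.

    Values outside the document are clamped to the first and last page.
    """
    if last_page is None or last_page > total_pages:
        last_page = total_pages
    first_page = max(first_page, 1)
    return range(first_page - 1, last_page)


# Pages 3 to 5 of a 10-page document map to indexes 2, 3 and 4.
print(list(page_indexes(10, 3, 5)))
```

In the script, looping with "for num in page_indexes(pages, 3, 5):" instead of over every page would then read only pages 3 to 5.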

The Code

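import pyttsx3
import PyPDF2

# Open the PDF in binary mode; the with-block closes the file when we are done
with open('sample.pdf', 'rb') as book:
    # PdfReader replaces PdfFileReader, which was removed in PyPDF2 3.0
    pdfReader = PyPDF2.PdfReader(book)
    pages = len(pdfReader.pages)
    print(f"Total Pages: {pages}")
    print("\n\n")

    # Initialise the text-to-speech engine
    speaker = pyttsx3.init()

    # Page indexes start at 0, so range(pages) covers every page
    for num in range(pages):
        page = pdfReader.pages[num]
        text = page.extract_text()

        # Show the 1-based page number alongside the extracted text
        print(f"Page: {num + 1}\n")
        print(text)
        print("\n\n")

        # Queue the text and block until it has been spoken
        speaker.say(text)
        speaker.runAndWait()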
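The script above speaks through the speakers. To keep the result as an audio file, pyttsx3 also offers save_to_file, and the speaking speed can be adjusted through the "rate" property. The following is a hedged sketch rather than a definitive implementation: the helper names clean_text and speak_to_file are hypothetical, the default rate of 150 is an assumed value, and the audio format actually written depends on the speech driver of your platform. The engine is passed in as an argument, so the helpers themselves need no imports.

```python
def clean_text(text):
    """Collapse line breaks and repeated spaces left over from PDF extraction."""
    return " ".join(text.split())


def speak_to_file(engine, text, out_path, rate=150):
    """Synthesise `text` into `out_path` using a pyttsx3-style engine.

    Returns False without touching the engine when there is nothing to read,
    for example on a scanned page with no extractable text.
    """
    text = clean_text(text)
    if not text:
        return False
    engine.setProperty("rate", rate)     # speaking rate in words per minute
    engine.save_to_file(text, out_path)  # queue the synthesis job
    engine.runAndWait()                  # process the queue and write the file
    return True


# Possible usage inside the tutorial's loop (assumes pyttsx3 is installed):
#   speaker = pyttsx3.init()
#   speak_to_file(speaker, text, f"page_{num + 1}.wav")
```

Passing the engine in keeps a single pyttsx3 engine for the whole book instead of creating one per page.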