Importing your own scanned text into LingQ

I love using LingQ for my language learning, and to make the most of it, I wanted to import one of my favourite French books that I have only in hardcopy. Whilst the LingQ process for importing text is as easy as could be, preparing the text involved a bit of trial and error, so I thought I would share what I do.

The book is La délicatesse by David Foenkinos, which is full of vocabulary I’d like to internalise. By the way, I love the movie with Audrey Tautou too.

Step 1: Scanning

I found it best either to scan each double page individually or to scan to a multi-page PDF. Either way, it’s important to scan with OCR (optical character recognition) turned off, so that you get a plain image PDF with no text layer behind the image. On the scanner I used, this option is called a “non-editable PDF”, which is a slight misnomer. If you scan with OCR turned on, you may later have trouble overriding the text layer the scanner embedded, unless of course you can change the scanner’s OCR language to French, which wasn’t possible with my office equipment.

Step 2: OCR process

I use a program called Nitro, but Adobe Acrobat would probably be similar. After importing the PDF, I go to Review > OCR > Options > Advanced, and select French as the recognition language. Then I click OK; the text recognition process only takes a few seconds.

Step 3: Create text document

Still in Nitro, I go to Convert > To Word > Convert. A Word doc with the text opens up. In my experience some blocks of text end up in the wrong place, so I continue…

Step 4: Cleaning the text

I copy and paste the text in the correct order into a Notepad document. The purpose of this is to strip out any weirdness that comes from the Word doc.

Step 5: Assembling the text

I then copy and paste the text from Notepad into a WordPad document, which is easier to work with than Notepad. In WordPad I get rid of any OCR errors, for example hyphens that have magically turned into bullet points, or unnecessary spaces, etc.

It might be easier to edit in Word, though. If you want to get rid of double or even triple spaces between words, you can do a simple find and replace.
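If you are comfortable running a small script, the same clean-up can be automated. The following is only a rough sketch in Python, not part of my Nitro/WordPad routine: the function name clean_ocr_text and the set of bullet characters are my own assumptions, so adjust them to whatever your OCR output actually contains.

```python
import re

def clean_ocr_text(text: str) -> str:
    """Tidy a few common OCR artefacts in plain text (illustrative only)."""
    # Assumption: these bullet characters stand in for hyphens/dashes
    # that the OCR misread. Extend the set if your scan shows others.
    text = re.sub(r"[•▪●]", "-", text)
    # Collapse runs of two or more spaces/tabs into a single space.
    text = re.sub(r"[ \t]{2,}", " ", text)
    # Drop any spaces or tabs left dangling at the end of a line.
    text = re.sub(r"[ \t]+$", "", text, flags=re.MULTILINE)
    return text

# Made-up sample text, not taken from the book.
sample = "•  Bonjour,   dit-elle.  \n•  Oui,  répondit-il."
print(clean_ocr_text(sample))
```

It is still worth reading through the result by eye, since a script like this only catches the patterns you tell it about.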

Save the file as a LingQ-compatible DOCX file if you want to import the whole book in one go.

Step 6: Importing the text

You can then use the Import ebook function in LingQ to easily create a lesson with the text you scanned.

Or you can create a new course and then add each chapter as a lesson.
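If you go the chapter-by-chapter route, splitting the cleaned text by hand gets tedious. Here is a hedged sketch of how it could be scripted in Python. It assumes each chapter starts with a line containing nothing but a number, which may not match your book; the function name split_chapters is hypothetical, and nothing here talks to LingQ itself.

```python
import re

def split_chapters(text: str) -> list[str]:
    """Split text into chapters at lines that contain only a number."""
    # Assumption: a chapter heading is a line made up solely of digits.
    parts = re.split(r"^[ \t]*\d+[ \t]*$", text, flags=re.MULTILINE)
    # Discard empty pieces and trim surrounding blank lines.
    return [part.strip() for part in parts if part.strip()]

# Made-up sample text, not taken from the book.
book = "1\nPremier chapitre.\n\n2\nDeuxième chapitre.\n"
for number, chapter in enumerate(split_chapters(book), start=1):
    print(number, chapter)
```

Be aware that page numbers left over from the scan would also count as headings under this rule, so remove those first. Each resulting piece can then be pasted into its own lesson.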

Resources for learning French

I’ve been learning French since I was in 4th grade. For a long time I had a complicated love-hate relationship with this beautiful language, thanks to many years of relentless drilling, but now I just enjoy experiencing some French and Quebecois culture by reading, listening and speaking (a love interest certainly helps).

In this post I’ve collected some resources I’ve found particularly useful. I recommend a combination of various sources. No one source alone will get you to fluency.

Any questions or suggestions? I would love to hear them and include them in this post!
