Text Cleanup

Script for Adobe InDesign
Latest update 9/28/2022, version 5.1

The script repairs common flaws in text in one operation, rather than through repeated visits to the InDesign Find/Change dialog. It was created to clean up text imported from word processing applications, content that is often loaded with unwanted formatting: excess spaces between words, multiple spaces used to align text, unwanted line breaks, and more.

  • Reduce multiple spaces or tabs to one
  • Change double hyphens to em or en dash
  • Fix foot and inch marks
  • Remove empty paragraphs
  • Remove forced line breaks
  • Remove excess at end of paragraphs and stories
Download
FREE 30 DAY TRIAL

Single-user perpetual license

How to use the script

The interface is divided into five sections: Search, Spaces, Tabs, Other, and Settings. Enable desired options and click the OK button to begin.

When processing begins, the display is magnified so that each proposed change is easier to inspect before it is confirmed. The display mode is also changed to Normal, and Show Hidden Characters is enabled.

The first match found is selected in the layout and a confirmation dialog appears on screen. A description of the proposed change is displayed above a series of buttons:

Zoom: - / + — click minus to decrease zoom and show more of the page. Click plus to increase zoom for closer inspection of the proposed change. After the change is confirmed or skipped, zoom returns to the value set in the interface.

Confirm — the selected text is changed and displayed. The button Confirm becomes OK and the former button OK becomes Undo. The remaining buttons other than Cancel are disabled. Click OK to continue as described below. Undo reverts the change and restores the prior confirmation dialog, where Skip may be chosen instead, if desired.

OK — (for either confirmation dialog) the selected text is changed as indicated, and the next proposed change is selected.

OK all — the selected text and all remaining matching instances are changed without further user intervention. This applies only to the current task (each selected option is one task). When processing of the next task begins, the confirmation dialog is displayed again. This repeats until all selected tasks are completed.

Skip — the selected text is not changed, and the next proposed change is selected.

Skip all — the selected text is not changed and all remaining matching instances are ignored. Processing resumes with the next task and again the confirmation dialog is displayed.

Cancel — processing ceases without further changes. Any changes previously accepted remain. If desired, the Edit menu item Undo restores the document to its state prior to launching the script.
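
The button behavior described above can be sketched as a small loop over the matches of one task. This is only an illustration in Python, not the script's own code: the function name run_task, the choice strings, and the ask callback are all hypothetical, and the Confirm/Undo preview step is left out for brevity.

```python
from typing import Callable, List, Sequence, Tuple


def run_task(matches: Sequence[str], ask: Callable[[str], str]) -> Tuple[List[str], bool]:
    """Decide which matches of ONE task to change.

    `ask` stands in for the confirmation dialog (hypothetical) and returns
    "ok", "ok_all", "skip", "skip_all" or "cancel".
    Returns (accepted matches, True if the user canceled).
    """
    accepted: List[str] = []
    for index, match in enumerate(matches):
        choice = ask(match)
        if choice == "ok":
            accepted.append(match)            # change this one, show the next
        elif choice == "ok_all":
            accepted.extend(matches[index:])  # this one and all that remain
            break
        elif choice == "skip":
            continue                          # leave this one, show the next
        elif choice == "skip_all":
            break                             # ignore the rest of this task
        elif choice == "cancel":
            return accepted, True             # earlier changes are kept
        else:
            raise ValueError("unknown choice: %r" % choice)
    return accepted, False
```

A caller would run this once per selected option and stop processing the remaining tasks as soon as the second return value is True.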

The confirmation dialog appears centered on screen and near the top of the window. If the dialog obscures the layout, it may be moved to another location on screen, even to a secondary display. The dialog will maintain its new position until the next launch of the script.

If any story has overset text, a proposed change that falls within the overset text cannot be shown because that text is hidden. In this case, a warning is displayed and the user may continue regardless, or decline and remedy the overset text before trying again. Declining is the recommended choice, because each change is then visible and can be confirmed.

The original intent of the script was to correct text imported from an unreliable source such as a word processing application. When the script is used on a completed document, be aware that most options will upset text flow, so use them with care.

The user is notified when processing is complete, or if processing is canceled.

Section 1: Search

Document — changes apply to the entire active document, which is the document in the top-most window if multiple documents are open.

Story — changes apply to the selected story. If no story is selected, the choice is disabled. The user may also choose Document to increase the scope of text affected.

Zoom — the percentage to which the display is magnified when processing begins. This allows the user a closer look at selected text to better judge if changes are acceptable.

Section 2: Spaces

Change all special to normal — special space characters are changed to normal space characters. Examples of special space characters are non-breaking, thin, en, and em spaces, among others.
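
As a rough illustration of this option, the Python sketch below maps a set of special space characters to a normal space. The exact set of characters the script treats as special is not listed here, so the character class is an assumption based on the Unicode space characters (non-breaking, en, em, thin, and their neighbors).

```python
import re

# Assumed set: no-break space, U+2002..U+200A (en, em, thin, hair, etc.),
# and narrow no-break space. The script's actual list may differ.
SPECIAL_SPACES = re.compile("[\u00A0\u2002-\u200A\u202F]")


def normalize_spaces(text: str) -> str:
    """Replace each special space character with a normal space."""
    return SPECIAL_SPACES.sub(" ", text)
```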

Change to tab — the user defines a minimum number of consecutive spaces. Any run of spaces at least that long is changed to a single tab character.

Only at beginning of paragraphs — Change to tab may be restricted to only instances of multiple spaces at the beginning of paragraphs.
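
The two settings above can be approximated with a regular expression. The following Python sketch is an assumption about the behavior, not the script's code: it works on a plain string in which paragraph ends are "\r" (as in InDesign GREP), and the names spaces_to_tab, minimum, and only_at_paragraph_start are hypothetical.

```python
import re


def spaces_to_tab(text: str, minimum: int = 3,
                  only_at_paragraph_start: bool = False) -> str:
    """Change each run of at least `minimum` spaces to a single tab."""
    if minimum < 2:
        raise ValueError("minimum must be at least 2")
    run = " {%d,}" % minimum  # greedy, so extra consecutive spaces are included
    if only_at_paragraph_start:
        # Start of the text, or directly after a paragraph end.
        pattern = r"(?:\A|(?<=\r))" + run
    else:
        pattern = run
    return re.sub(pattern, "\t", text)
```

With a minimum of 3, a pair of spaces between words is left for the other Spaces options to handle.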

Remove before or after tab — space characters before or after tabs are removed.

Remove at beginning of paragraphs — space characters at the beginning of paragraphs are removed. This also applies to table cells.

Remove before punctuation — removes space characters that precede a period, comma, colon, semicolon, exclamation mark, or question mark.
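
The three removal options above each correspond to a simple pattern. The Python sketch below shows one way to express them, assuming a plain string with "\r" as the paragraph end; table cells are not modeled, and the function name tidy_spaces is hypothetical.

```python
import re


def tidy_spaces(text: str) -> str:
    # Remove spaces directly before or directly after a tab.
    text = re.sub(r" +(?=\t)|(?<=\t) +", "", text)
    # Remove spaces at the start of the text or of any paragraph.
    text = re.sub(r"(?:\A|(?<=\r)) +", "", text)
    # Remove spaces that precede . , : ; ! or ?
    text = re.sub(r" +(?=[.,:;!?])", "", text)
    return text
```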

Between words, change two or more to one — two or more consecutive space characters are reduced to a single space, unless the choice Keep two spaces between sentences is enabled. See below for more details.

Include special space characters — special space characters, such as non-breaking, en, or em spaces, are also counted and reduced to a single space.

Keep two spaces between sentences — instances of two spaces are preserved, but only when the instance directly follows a period, signaling the end of a sentence. This option will not add spaces between sentences, only keep them if they already exist.
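
The interaction of these three options can be sketched in Python as follows. This is an interpretation rather than the script's implementation: it assumes that only a run of exactly two normal spaces directly after a period is preserved, and that longer runs after a period are still reduced to one. The function and parameter names are hypothetical.

```python
import re

# Assumed set of special spaces (see Unicode U+00A0, U+2002..U+200A, U+202F).
SPECIAL = "\u00A0\u2002-\u200A\u202F"


def collapse_spaces(text: str, keep_two_after_period: bool = False,
                    include_special: bool = False) -> str:
    """Reduce runs of two or more spaces to a single normal space."""
    space = "[ %s]" % SPECIAL if include_special else " "
    pattern = re.compile(r"(\.?)(%s{2,})" % space)

    def replace(match):
        period, run = match.group(1), match.group(2)
        if keep_two_after_period and period == "." and run == "  ":
            return match.group(0)  # keep the existing two spaces
        return period + " "

    return pattern.sub(replace, text)
```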

Section 3: Tabs

Remove at beginning of paragraphs — tab characters at the beginning of paragraphs are removed. This also applies to table cells.

Change two or more to one — two or more consecutive tab characters are reduced to a single tab.

Section 4: Other

Change double hyphens to em dash or en dash — instances of two hyphens are changed to a single em dash or en dash, as chosen by the user.

Change two periods to one — instances of two periods surrounded by spaces or other characters are changed to a single period.

Change three periods to ellipsis — instances of three periods surrounded by spaces or other characters are changed to an ellipsis character.
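
The three punctuation options above can be illustrated with patterns that match an exact count of characters. The Python sketch below is an assumption about the matching rules (runs of three or more hyphens and four or more periods are left alone); the function name and the dash parameter are hypothetical.

```python
import re

EM_DASH = "\u2014"
EN_DASH = "\u2013"
ELLIPSIS = "\u2026"


def fix_punctuation(text: str, dash: str = EM_DASH) -> str:
    # Exactly two hyphens become the chosen dash.
    text = re.sub(r"(?<!-)--(?!-)", dash, text)
    # Exactly three periods become an ellipsis character.
    text = re.sub(r"(?<!\.)\.\.\.(?!\.)", ELLIPSIS, text)
    # Exactly two periods become one.
    text = re.sub(r"(?<!\.)\.\.(?!\.)", ".", text)
    return text
```

Passing dash=EN_DASH gives the en dash variant of the first option.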

Fix flopped apostrophes — InDesign and word processing programs feature auto-correct that converts single and double quotation marks to “typographer’s quotes.” In most cases this is helpful, but in the rare case that a word begins with an apostrophe, this auto-correct feature errantly changes the apostrophe to an opening (left) single quotation mark. For example, the apostrophes in go get ’em or summer of ’88 are changed to an opening single quotation mark, the mirror image of an apostrophe (in other words, flopped). This option detects such cases and lets the user change each to the correct closing (right) mark, which doubles as an apostrophe.
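
How the script detects flopped apostrophes is not described, so the Python sketch below is only one possible heuristic: an opening single quotation mark is treated as flopped when it precedes a two-digit year or one of a short, hypothetical list of elided words. Because any heuristic of this kind can misfire, confirming each change remains worthwhile.

```python
import re

# Hypothetical list of words that commonly begin with an apostrophe.
ELIDED_WORDS = ("em", "til", "tis", "twas", "cause", "n")

FLOPPED = re.compile(
    r"\u2018(?=(?:%s)\b|\d{2}s?\b)" % "|".join(ELIDED_WORDS),
    re.IGNORECASE,
)


def fix_flopped_apostrophes(text: str) -> str:
    """Change a flopped opening mark (U+2018) to an apostrophe (U+2019)."""
    return FLOPPED.sub("\u2019", text)
```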

Fix foot and inch marks — as with flopped apostrophes, InDesign and word processing programs errantly change foot and inch marks to typographer’s quotes. This option detects single or double quotation marks that follow a digit or fraction glyph, and offers to change these to a single or double straight quotation mark, the proper symbol for feet or inches.
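
The detection rule stated above (a curly quotation mark that follows a digit or fraction glyph) translates directly into a lookbehind pattern. The Python sketch below is illustrative; the set of fraction glyphs is an assumption, and a closing quote that legitimately follows a digit would also match, which is one reason to confirm each change.

```python
import re

# Assumed fraction glyphs: U+00BC..U+00BE and U+2150..U+215E.
FRACTIONS = "\u00BC-\u00BE\u2150-\u215E"

CURLY_AFTER_NUMBER = re.compile(
    "(?<=[0-9%s])([\u2018\u2019\u201C\u201D])" % FRACTIONS
)


def fix_foot_inch_marks(text: str) -> str:
    """Change curly marks after a digit or fraction to straight marks."""
    def replace(match):
        return "'" if match.group(1) in "\u2018\u2019" else '"'
    return CURLY_AFTER_NUMBER.sub(replace, text)
```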

Remove empty paragraphs — empty paragraphs are removed.

Remove zero-width characters — removes various marker characters, such as bookmarks, text anchors, index, joiner/non-joiner, and others visible only when Show Hidden Characters is enabled. For text imported from a word processing program, these usually have a different meaning, and once brought into InDesign, they are excess. HOWEVER, for completed documents, these marks are likely intended and should remain. USE WITH CARE — this option removes all bookmarks and text anchors set as hyperlink destinations.

Remove forced line breaks — forced line breaks are removed. The script determines if a space character or a hyphen precedes or follows each line break, and for instances where both are absent, the line break is changed to a space to prevent words from crashing together.
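
The rule for choosing between removing a line break and changing it to a space can be sketched as follows. The Python below assumes a plain string in which a forced line break is "\n" (as in InDesign GREP). Treating consecutive breaks as one, and dropping a break at the very start or end of the text, are choices made for this sketch and are not documented behavior.

```python
import re


def remove_forced_line_breaks(text: str) -> str:
    def replace(match):
        before = text[match.start() - 1] if match.start() > 0 else ""
        after = text[match.end()] if match.end() < len(text) else ""
        if not before or not after:
            return ""   # break at the edge of the text: nothing to separate
        if before in (" ", "-") or after in (" ", "-"):
            return ""   # a space or hyphen already separates the words
        return " "      # keep the words from running together

    return re.sub(r"\n+", replace, text)
```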

Remove column/frame/page breaks — removes these break characters. Obviously, this may greatly upset text flow.

Remove excess at end of paragraphs and stories — spaces and tabs at the end of paragraphs are removed. This also applies to table cells. For the end of stories, in addition to spaces and tabs, forced line breaks and paragraph ends are removed. If one or more paragraph ends exist at the end of a story, one paragraph end will remain; otherwise the story concludes with a story end marker.
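
A minimal Python sketch of this cleanup, under the assumption that a story is a plain string with "\r" as the paragraph end and "\n" as the forced line break, might look like the following. Table cells are not modeled and the function name is hypothetical.

```python
import re


def trim_ends(story: str) -> str:
    # Spaces and tabs directly before a paragraph end or the story end.
    story = re.sub(r"[ \t]+(?=\r|\Z)", "", story)
    # At the story end, also drop forced line breaks and paragraph ends,
    # keeping a single paragraph end if at least one was present.
    tail = re.search(r"[ \t\n\r]+\Z", story)
    if tail:
        keep = "\r" if "\r" in tail.group(0) else ""
        story = story[:tail.start()] + keep
    return story
```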

Section 5: Settings

The current options may be saved and restored later. Select from the Load drop-down list to choose saved settings, and the current options are updated. Click the Delete button, and the saved settings selected in the Load drop-down list are permanently removed. Click the Save button, provide a name for the settings, and the current options are preserved. If the name already exists, the user may choose to replace the saved settings. Or click the checkbox Replace settings, and choose the settings to replace.

The script provides default saved settings named [Default]. These settings cannot be deleted but may be updated to the current values. Save settings, click the checkbox Replace settings, and choose [Default].
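
The save, load, replace, and delete rules described in this section can be modeled with a small in-memory store. This Python sketch is illustrative only: the class name, method names, and the option key "zoom" are hypothetical, and nothing here reflects how or where the script actually stores its settings.

```python
class SettingsStore:
    """In-memory model of named settings with a protected [Default] entry."""

    DEFAULT = "[Default]"

    def __init__(self, defaults):
        self._saved = {self.DEFAULT: dict(defaults)}

    def names(self):
        return list(self._saved)

    def save(self, name, options, replace=False):
        # An existing name is overwritten only when replacement is confirmed.
        if name in self._saved and not replace:
            raise KeyError("settings already exist: %s" % name)
        self._saved[name] = dict(options)

    def load(self, name):
        return dict(self._saved[name])

    def delete(self, name):
        # [Default] may be updated via save(..., replace=True) but not deleted.
        if name == self.DEFAULT:
            raise ValueError("[Default] cannot be deleted")
        del self._saved[name]
```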

Language

By default the script language is US English, which requires no further download or configuration. To have the script interface display another language, choose from the available languages below. Download the .i18n file and copy it to the script folder alongside the script. When launched, the script detects the language file and displays interface text in that language.

If your language is not listed, download the English file and translate it. The file is plain text formatted as JSON. Each entry contains the interface text in English and a second value for its translation, which in the English file is identical text. Copy the file, rename it to replace “en” with the relevant code for your language, then edit each line’s second value to the translation in your language. For more detailed instructions on how to edit and install i18n files, see How to Localize Scripts.
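
The exact layout of the language file is not shown on this page, so the Python sketch below assumes the simplest form consistent with the description: a JSON object whose keys are the English interface text and whose values are the translations. The sample content and function names are hypothetical; consult the downloaded English file for the real structure.

```python
import json

# Hypothetical sample in the assumed layout (English text -> translation).
SAMPLE = '{"Cancel": "Cancelar", "Skip": "Omitir", "Zoom": ""}'


def load_translations(json_text: str) -> dict:
    """Parse the language file and keep only string translations."""
    data = json.loads(json_text)
    return {key: value for key, value in data.items() if isinstance(value, str)}


def translate(table: dict, english: str) -> str:
    """Return the translation, falling back to English if missing or empty."""
    return table.get(english) or english
```

Falling back to the English text means a partially translated file still yields a usable interface.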

English: text-cleanup-en-i18n.zip

Spanish: text-cleanup-es-i18n.zip

For help installing scripts, see How to Install and Use Scripts in Adobe Creative Cloud Applications.

William is also available for hire to program custom solutions. Contact William for more information.

IMPORTANT: by downloading any of the scripts on this page you agree that the software is provided without any warranty, express or implied. USE AT YOUR OWN RISK. Always make backups of important data.