Text Cleanup

Script for Adobe InDesign
Latest update 9/27/2021, version 3.1

The script repairs common flaws in text in a single operation, rather than through repeated visits to the InDesign Find/Change dialog. It was created to clean up text imported from word processing applications, content that is often loaded with unwanted formatting: excess spaces between words, multiple spaces used to align text, unwanted line breaks, and more.

NEW as of version 2.8: change double periods to one, or three to ellipsis.

  • Apply to entire document or selected story
  • Confirmation to proceed, undo, or skip each change
  • Change defined number of spaces to tab character
  • Remove spaces before or after tabs
  • Remove spaces and tabs at beginning of paragraphs
  • Reduce multiple spaces or tabs to one
  • If desired, keep two spaces between sentences
  • Change double hyphens to em or en dash
  • Fix flopped apostrophes
  • Fix foot and inch marks
  • Remove empty paragraphs
  • Remove zero-width characters
  • Remove forced line breaks
  • Remove column/frame/page breaks
  • Remove excess at end of paragraphs and stories
  • Save and restore all settings
  • User-configurable localization
Download
Text Cleanup

You decide. Reward the author an
amount the solution is worth to you.

How to use the script

The interface is divided into five sections: Search, Spaces, Tabs, Other, and Settings. Enable desired options and click the OK button to begin.

The Scripts panel is closed to give a clear view of the layout, and the display is magnified to better show the changes to confirm. Also, Show Hidden Characters is enabled and the display mode is changed to Normal.

The first match found is selected in the layout and a confirmation dialog appears on screen. A description of the proposed change is displayed above a series of buttons:

Confirm — the selected text is changed and displayed. The button Confirm becomes OK and the former button OK becomes Undo. The remaining buttons other than Cancel are disabled. Click OK to continue as described below. Undo reverts the change and restores the prior confirmation dialog, where Skip may be chosen instead, if desired.

OK — (for either confirmation dialog) the selected text is changed as indicated, and the next proposed change is selected.

OK all — the selected text and all remaining matching instances are changed without further user intervention. This applies only to the current task (tasks defined as options selected). When processing of the next task begins, the confirmation dialog is again displayed. This repeats until all selected tasks are completed.

Skip — the selected text is not changed, and the next proposed change is selected.

Skip all — the selected text is not changed and all remaining matching instances are ignored. Processing resumes with the next task and again the confirmation dialog is displayed.

Cancel — processing ceases without further changes. Any changes previously accepted remain. If desired, the Edit menu item Undo restores the document to its state prior to launching the script.
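
The button behavior described above can be pictured as a small decision loop that runs once per task. The following is a minimal sketch in Python, not the script's actual code (InDesign scripts are written in ExtendScript). The function name run_task and the answer strings are hypothetical, and the Confirm/Undo preview step is not modeled.

```python
def run_task(matches, ask):
    """Process the matches of one task and return (accepted, cancelled).

    `ask` is a hypothetical callback standing in for the confirmation
    dialog. It receives a match and returns one of:
    "ok", "ok_all", "skip", "skip_all", "cancel".
    """
    accepted = []
    for index, match in enumerate(matches):
        answer = ask(match)
        if answer == "ok":
            accepted.append(match)           # change this match, go to the next
        elif answer == "ok_all":
            accepted.extend(matches[index:])  # change this and all remaining
            break
        elif answer == "skip":
            continue                          # leave this match unchanged
        elif answer == "skip_all":
            break                             # ignore this and all remaining
        elif answer == "cancel":
            return accepted, True             # stop; earlier changes remain
    return accepted, False
```

A caller would invoke run_task once for each enabled option and stop as soon as the second return value is True, which mirrors the dialog reappearing at the start of each task.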

The confirmation dialog appears centered on screen and near the top of the window. If the dialog obscures the layout, it may be moved to another location on screen, even to a secondary display. The dialog will maintain its new position until the next launch of the script.

If any story has overset text, a proposed change that falls within the overset text cannot be shown because it is hidden off-screen. In this case, a warning is displayed and the user may continue regardless, or decline and remedy the overset text before trying again. Declining is the recommended choice, because each change is then visible and can be confirmed.

The original intent of the script was to correct text imported from an unreliable source such as a word processing application. When the script is used on a completed document, be aware that most options will upset text flow and should be used with care.

The user is notified when processing is complete, or if processing is canceled.

Section 1: Search

Document — changes apply to the entire document that is currently open. If multiple documents are open, changes apply to the document in the top-most window.

Story — changes apply to the selected story. If no story is selected, the choice is disabled. The user may also choose Document to increase the scope of text affected.

Zoom — the percentage to which the display is magnified when processing begins. This allows the user a closer look at selected text to better judge if changes are acceptable.

Section 2: Spaces

Change all special to normal — special space characters are changed to normal space characters. Examples of special space characters are non-breaking, thin, en, and em spaces, among others.
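
As an illustration only, the idea can be sketched with a Python regular expression. The character list below is an assumption based on Unicode (non-breaking space, the U+2000 to U+200A family of fixed-width spaces, and narrow no-break space); the set of special spaces the script actually recognizes may differ.

```python
import re

# Assumed set of "special" spaces: no-break space, en/em/thin/hair and
# related fixed-width spaces (U+2000 to U+200A), narrow no-break space.
SPECIAL_SPACES = re.compile("[\u00A0\u2000-\u200A\u202F]")

def normalize_spaces(text):
    """Replace every special space character with a normal space."""
    return SPECIAL_SPACES.sub(" ", text)
```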

Change to tab — the user defines a minimum number of consecutive spaces. When a run of at least that many spaces is detected, the entire run is changed to a single tab character.

Only at beginning of paragraphs — Change to tab may be restricted to instances of multiple spaces at the beginning of paragraphs.
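
The two options above can be sketched as a single pattern. This is a hedged Python illustration rather than the script's own code: the function name and parameters are hypothetical, and the sketch assumes that paragraphs end with a carriage return ("\r").

```python
import re

def spaces_to_tab(text, minimum=3, paragraph_start_only=False):
    """Change each run of at least `minimum` spaces to one tab.

    Assumes paragraphs are separated by "\r". With
    paragraph_start_only=True, only runs at the start of a paragraph
    (or of the text) are changed.
    """
    if minimum < 2:
        raise ValueError("minimum must be at least 2")
    anchor = r"(?:\A|(?<=\r))" if paragraph_start_only else ""
    pattern = re.compile(anchor + " {%d,}" % minimum)
    return pattern.sub("\t", text)
```

With a minimum of 3, a run of two spaces is left alone, while a run of three or more becomes one tab regardless of its length.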

Remove before or after tab — space characters before or after tabs are removed.

Remove at beginning of paragraphs — space characters at the beginning of paragraphs are removed. This also applies to table cells.

Remove before punctuation — removes space characters that precede a period, comma, colon, semicolon, exclamation mark, or question mark.
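
The three removal options above each reduce to a short search-and-replace. The Python sketch below is illustrative only; the function names are hypothetical, paragraphs are assumed to end with "\r", and table cells are not modeled.

```python
import re

def remove_spaces_around_tabs(text):
    """Drop space characters directly before or after a tab."""
    return re.sub(r" *\t *", "\t", text)

def remove_leading_spaces(text):
    """Drop spaces at the start of the text and after each "\r"."""
    return re.sub(r"(?:\A|(?<=\r)) +", "", text)

def remove_spaces_before_punctuation(text):
    """Drop spaces that precede . , : ; ! or ?"""
    return re.sub(r" +([.,:;!?])", r"\1", text)
```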

Between words, change two or more to one — two or more consecutive space characters are reduced to a single space, unless the choice Keep two spaces between sentences is enabled. See below for more details.

Include special space characters — special space characters are also reduced to a single space, such as non-breaking, en, or em space, etc.

Keep two spaces between sentences — instances of two spaces are preserved, but only when the instance directly follows a period, signaling the end of a sentence. This option will not add spaces between sentences, only keep them if they already exist.
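
The interaction between reducing spaces and keeping two after a sentence can be sketched as follows. This Python example is an assumption-laden illustration: the function name is hypothetical, and it assumes that only a run of exactly two spaces after a period is preserved, while longer runs are reduced to one. The script may treat longer runs differently.

```python
import re

def collapse_spaces(text, keep_two_after_period=False):
    """Reduce runs of two or more spaces to a single space.

    If keep_two_after_period is True, a run of exactly two spaces that
    directly follows a period is left as it is (assumed behavior).
    """
    def replacement(match):
        start = match.start()
        follows_period = start > 0 and match.string[start - 1] == "."
        if keep_two_after_period and follows_period and len(match.group()) == 2:
            return match.group()
        return " "

    return re.sub(r" {2,}", replacement, text)
```

Note that the sketch never inserts a second space after a period; it only declines to remove one that already exists.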

Section 3: Tabs

Remove at beginning of paragraphs — tab characters at the beginning of paragraphs are removed. This also applies to table cells.

Change two or more to one — two or more consecutive tab characters are reduced to a single tab.

Section 4: Other

Change double hyphens to em dash or en dash — instances of two hyphens are changed to a single em dash or en dash, as chosen by the user.

Change two periods to one — instances of two periods surrounded by spaces or other characters are changed to a single period.

Change three periods to ellipsis — instances of three periods surrounded by spaces or other characters are changed to an ellipsis character.
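
The hyphen and period options above can be approximated with patterns that refuse to match inside longer runs. The following Python sketch is illustrative; the function names are hypothetical, and leaving runs of three or more hyphens and four or more periods untouched is an assumption, not documented behavior.

```python
import re

EM_DASH = "\u2014"
EN_DASH = "\u2013"
ELLIPSIS = "\u2026"

def fix_double_hyphens(text, dash="em"):
    """Change exactly two hyphens to an em dash or an en dash."""
    replacement = EM_DASH if dash == "em" else EN_DASH
    return re.sub(r"(?<!-)--(?!-)", replacement, text)

def fix_periods(text):
    """Change exactly two periods to one and exactly three to an ellipsis."""
    text = re.sub(r"(?<!\.)\.\.\.(?!\.)", ELLIPSIS, text)
    return re.sub(r"(?<!\.)\.\.(?!\.)", ".", text)
```

The lookbehind and lookahead keep the two-period rule from consuming part of a three-period run, so the order of the two substitutions does not matter.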

Fix flopped apostrophes — InDesign and word processing programs feature auto-correct that converts single and double quotation marks to “typographer’s quotes.” In most cases this is helpful, but in the rare case that a word begins with an apostrophe, auto-correct mistakenly changes the apostrophe to an opening (left) single quotation mark. For example, the apostrophes in go get ’em or summer of ’88 become opening single quotation marks, the mirror image of an apostrophe (in other words, flopped). This option detects such cases and lets the user change each to the correct closing (right) mark, which doubles as an apostrophe.
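
How the script decides that an opening quotation mark is really a flopped apostrophe is not documented here. One possible heuristic, shown purely as an assumption, is to flag an opening single quotation mark that precedes a two-digit year or a common clipped word. The word list and function name in this Python sketch are hypothetical.

```python
import re

# Hypothetical heuristic: U+2018 followed by two digits (a clipped year)
# or by one of a few common clipped words is treated as a flopped apostrophe.
FLOPPED = re.compile(r"\u2018(?=\d\d\b|(?:em|tis|twas|til|cause)\b)", re.IGNORECASE)

def fix_flopped_apostrophes(text):
    """Change flagged opening single quotes (U+2018) to apostrophes (U+2019)."""
    return FLOPPED.sub("\u2019", text)
```

A heuristic like this can misfire on genuinely quoted words, which is one reason a per-change confirmation is useful.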

Fix foot and inch marks — as with flopped apostrophes, InDesign and word processing programs mistakenly change foot and inch marks to typographer’s quotes. This option detects single or double quotation marks that follow a digit or fraction glyph, and offers to change them to single or double straight quotation marks, the proper symbols for feet and inches.
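
The detection rule above (a curly quotation mark directly after a digit or fraction glyph) maps naturally onto a lookbehind. This Python sketch is an illustration under assumptions: the function name is hypothetical, and "fraction glyph" is taken to mean the Unicode vulgar fractions U+00BC to U+00BE and U+2150 to U+215E.

```python
import re

# Curly single or double quote that directly follows a digit or a
# vulgar-fraction glyph (assumed ranges).
MARK = re.compile(r"(?<=[0-9\u00BC-\u00BE\u2150-\u215E])[\u2018\u2019\u201C\u201D]")

STRAIGHT = {"\u2018": "'", "\u2019": "'", "\u201C": '"', "\u201D": '"'}

def fix_foot_inch_marks(text):
    """Change curly quotes after a digit or fraction to straight marks."""
    return MARK.sub(lambda match: STRAIGHT[match.group()], text)
```

A pattern this simple also matches an apostrophe after a number, as in a possessive year, so each proposed change still deserves a look before it is accepted.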

Remove empty paragraphs — empty paragraphs are removed.

Remove zero-width characters — removes various marker characters, such as bookmarks, text anchors, index, joiner/non-joiner, and others visible only when Show Hidden Characters is enabled. For text imported from a word processing program, these usually have a different meaning, and once brought into InDesign, they are excess. HOWEVER, for completed documents, these marks are likely intended and should remain. USE WITH CARE — this option removes all bookmarks and text anchors set as hyperlink destinations.
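
For plain Unicode text, removing zero-width characters can be sketched as below. This Python example covers only standard Unicode characters (zero-width space, joiner, non-joiner, word joiner, and the byte order mark); InDesign's internal markers for bookmarks, text anchors, and index entries are not represented, and the function name is hypothetical.

```python
import re

# Standard Unicode zero-width characters only; InDesign-specific
# markers are outside the scope of this sketch.
ZERO_WIDTH = re.compile("[\u200B-\u200D\u2060\uFEFF]")

def remove_zero_width(text):
    """Delete zero-width space, ZWNJ, ZWJ, word joiner, and BOM."""
    return ZERO_WIDTH.sub("", text)
```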

Remove forced line breaks — forced line breaks are removed. The script determines if a space character or a hyphen precedes or follows each line break, and for instances where both are absent, the line break is changed to a space to prevent words from crashing together.
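
The rule for whether a removed line break leaves a space behind can be sketched like this. The Python example is illustrative and makes assumptions: the function name is hypothetical, a forced line break is represented by "\n", and a break at the very start or end of the text is simply dropped.

```python
import re

def remove_forced_breaks(text, line_break="\n"):
    """Remove forced line breaks without letting words run together.

    If a space or hyphen sits directly before or after the break, the
    break is deleted. Otherwise it is changed to a single space.
    """
    def replacement(match):
        source = match.string
        before = source[match.start() - 1] if match.start() > 0 else None
        after = source[match.end()] if match.end() < len(source) else None
        if before is None or after is None:
            return ""  # assumed: nothing to separate at the text edges
        if before in (" ", "-") or after in (" ", "-"):
            return ""
        return " "

    return re.sub(re.escape(line_break), replacement, text)
```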

Remove column/frame/page breaks — removes these break characters. Obviously, this may greatly upset text flow.

Remove excess at end of paragraphs and stories — spaces and tabs at the end of paragraphs are removed. This also applies to table cells. For the end of stories, in addition to spaces and tabs, forced line breaks and paragraph ends are removed. If one or more paragraph ends exist at the end of a story, one paragraph end will remain, otherwise the story concludes with a story end marker.
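
The paragraph-end and story-end rules can be sketched separately. This Python illustration assumes that a paragraph end is "\r" and a forced line break is "\n"; the function names are hypothetical and table cells are not modeled.

```python
import re

def clean_paragraph_ends(text):
    """Remove spaces and tabs that sit directly before a paragraph end
    or at the end of the text."""
    return re.sub(r"[ \t]+(?=\r|\Z)", "", text)

def clean_story_end(text):
    """Trim trailing spaces, tabs, line breaks, and paragraph ends.

    If the trimmed tail contained at least one paragraph end, exactly
    one paragraph end is kept.
    """
    body = text.rstrip(" \t\n\r")
    tail = text[len(body):]
    return body + ("\r" if "\r" in tail else "")
```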

Section 5: Settings

The current options may be saved and restored later. Choose saved settings from the Load drop-down list, and the current options are updated to match. Click the Delete button, and the saved settings selected in the Load drop-down list are permanently removed. Click the Save button and provide a name for the settings, and the current options are preserved under that name. If the name already exists, the user may choose to replace the saved settings. Alternatively, click the checkbox Replace settings and choose the settings to replace.

The script provides default saved settings named [Default]. These settings cannot be deleted but may be updated to the current values. To do so, click Save, click the checkbox Replace settings, and choose [Default].

Each time processing begins, the current options are preserved, and the next time the script is launched, options are restored to the last values used. The Zoom value is not included in saved settings, but it is preserved separately so that each launch of the script restores the last value used.
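
The save, replace, and delete rules described in this section can be modeled with a small in-memory store. This Python sketch is hypothetical throughout: the class name, method names, and option keys are invented for illustration, and nothing here reflects how the script actually stores its settings.

```python
class SettingsStore:
    """Hypothetical model of named saved settings."""

    DEFAULT = "[Default]"
    EXCLUDED = ("zoom",)  # the Zoom value is not part of saved settings

    def __init__(self, defaults):
        self._saved = {self.DEFAULT: self._strip(defaults)}

    def _strip(self, options):
        # Copy the options, leaving out anything that is never saved.
        return {k: v for k, v in options.items() if k not in self.EXCLUDED}

    def save(self, name, options, replace=False):
        # An existing name is only overwritten when replace is requested.
        if name in self._saved and not replace:
            raise KeyError("settings %r already exist" % name)
        self._saved[name] = self._strip(options)

    def load(self, name):
        return dict(self._saved[name])

    def delete(self, name):
        # [Default] may be replaced but never deleted.
        if name == self.DEFAULT:
            raise ValueError("[Default] cannot be deleted")
        del self._saved[name]

    def names(self):
        return sorted(self._saved)
```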

Localization

The script provides user-configurable localization. By default the script language is US English, which does not require further download or configuration. For other languages, download the Language Pack and copy the i18n file for the desired language to the script folder alongside the script file. When launched, the script detects the i18n file and the interface is displayed in that language. If the desired language is not present in the Language Pack, edit the English i18n file to translate it to the desired language, and copy the edited i18n file to the script folder alongside the script file. For details of how to edit and install i18n files, read How to Localize Scripts.

License details included in download

For help installing scripts, see How to Install and Use Scripts in Adobe Creative Cloud Applications.

IMPORTANT: by downloading the script you agree that the software is provided without any warranty, express or implied. USE AT YOUR OWN RISK. Always make backups of important data.