Text Cleanup

JavaScript for Adobe InDesign
Latest update 4/9/2018

The script repairs common flaws in text in one operation rather than through repeated visits to the InDesign Find/Change dialog. It was created to clean up text imported from word processing applications, content that is often loaded with unwanted formatting: extra spaces between words, multiple spaces used to align text, unwanted line breaks, and more.

  • Apply to entire document, selected story, or selected text
  • Confirmation to proceed, undo, or skip each change
  • Replace a defined number of spaces with a tab character
  • Remove spaces before or after tabs
  • Remove spaces and tabs at paragraph begin or end
  • Reduce multiple spaces or tabs to one
  • If desired, keep two spaces between sentences
  • Replace double hyphens with em or en dash
  • Remove forced line breaks
  • Remove excess at paragraph and story end
  • Remove empty paragraphs
  • Save and restore all settings

Free to download and use. Contributions of any amount are appreciated but not required.

Text Cleanup screen 1
Download
Text Cleanup

Instructions for use

The interface is divided into five sections: Search, Spaces, Tabs, Other, and Settings. When the OK button is clicked, processing begins.

The Scripts palette is closed to give a clear view of the layout, and the display is zoomed to better show the changes to confirm. Also, Show Hidden Characters is enabled, and if the display mode is currently Preview, the mode is changed to Normal. After processing is complete, all prior view settings are restored.

The first match discovered is selected in the layout and a confirmation dialog appears on screen. A description of the proposed change is displayed above a series of buttons:

Text Cleanup screen 2

Confirm — the selected change is performed and displayed. The Confirm button becomes OK and the former OK button becomes Undo. The remaining buttons other than Cancel are disabled. Clicking OK continues as described below. Undo reverts the change and returns to the prior confirmation dialog, where Skip may be chosen instead, if desired.

Text Cleanup screen 3

OK — (for either confirmation dialog) the selected text is removed or replaced as indicated, and the next proposed change is selected.

OK to all — the selected text, and all subsequent changes, are performed without further user intervention. Note: this applies only to the current task (e.g. replace multiple spaces). When the next task begins, the confirmation dialog is displayed again. This behavior repeats until all tasks (options selected) are complete.

Skip — the selected text remains unchanged, and the next proposed change is selected.

Skip all — the selected text remains unchanged, and all remaining changes related to the current task (e.g. replace multiple spaces) are abandoned. Processing resumes with the next task.

Cancel — processing ceases without further changes. Any changes previously accepted remain. Use undo to back out of all changes, or revert the document.
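
The button behavior described above can be sketched as a small loop. This is a minimal illustration in plain JavaScript, not the script's actual code: the function name runTask, the answer strings, and the decide callback (standing in for the confirmation dialog) are all assumptions, and the Confirm and Undo preview step is left out.

```javascript
// Hypothetical sketch: process the matches of one task.
// decide(match) stands in for the confirmation dialog and returns
// "ok", "okAll", "skip", "skipAll", or "cancel".
function runTask(matches, decide) {
    var applied = [];      // matches the user accepted
    var applyRest = false; // set once "OK to all" is chosen
    for (var i = 0; i < matches.length; i++) {
        var answer = applyRest ? "ok" : decide(matches[i]);
        if (answer === "cancel") {
            // Stop everything; changes already accepted remain.
            return { applied: applied, cancelled: true };
        }
        if (answer === "skipAll") {
            break; // abandon the rest of this task only
        }
        if (answer === "okAll") {
            applyRest = true;
            answer = "ok";
        }
        if (answer === "ok") {
            applied.push(matches[i]);
        }
        // "skip" falls through: the match is left unchanged
    }
    return { applied: applied, cancelled: false };
}
```

A caller would run this once per selected task and stop as soon as cancelled is true, which is one way OK to all and Skip all stay limited to the current task.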

The confirmation dialog may be moved to another location on screen if it obscures the layout. The dialog will maintain its new position until the next launch of the script.

Once processing is complete, the user is notified.

Additional notes:

  1. Some options carry a greater risk of upsetting a completed document and should be used with care. Understand each option and use the confirmation dialog to ensure the results are as expected. Skip any proposed changes that are not desired. The script was originally intended to be run immediately after text is imported from an unreliable source such as a word processing application, a common origin of poorly formatted content. It was not meant to be run repeatedly or on completed documents, but nothing prevents its use at any time. If a recurring flaw is noticed in a completed document, the script may certainly be used to correct it. When using the script on completed documents, limit the options to those of lesser risk, which are usually safe: removing multiple spaces, replacing double hyphens, and removing excess at paragraph and story end. Use other options only when the potential results are well understood. Also, when new content is imported, limit the search to the selected story or selected text so the operation does not affect the entire document.
  2. Tasks (options selected) are performed in the order listed on screen, with the exception of Remove excess at paragraph and story end and Remove forced line breaks. These two tasks are completed first because they may affect the remaining tasks, which are then performed in the listed order. For example, replacing spaces with a tab is done before reducing multiple spaces; otherwise the spaces would already have been reduced and the change to a tab would never occur. Though changes are performed in the most logical order, particularly odd formatting may warrant a second run of the script. If the desired result is not achieved the first time through, try a second pass that targets the flaw that remains.
  3. The confirmation function of the script selects the text that will be changed and brings it into view on screen. If any story includes overset text, it is not possible to show a proposed change that falls within the overset text, because it is hidden off-screen. If this condition exists, a warning is displayed and the user may continue regardless, or decline and remedy the overset text before trying again. Declining is the recommended choice, because each change can then be seen and confirmed before continuing.
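
The task ordering described in note 2 can be sketched as follows. This is only an illustration under stated assumptions: the task identifiers are made-up names for the options, the on-screen order is taken from the section descriptions on this page, and the relative order of the two run-first tasks is a guess.

```javascript
// Hypothetical task identifiers, in the order the options appear on screen.
var ON_SCREEN_ORDER = [
    "specialToNormal", "spacesToTab", "spacesAroundTab",
    "spacesAtParagraphBegin", "multipleSpaces",
    "tabsAtParagraphBegin", "multipleTabs",
    "doubleHyphen", "forcedLineBreaks", "excessAtEnds", "emptyParagraphs"
];

// These two run before everything else (their mutual order is assumed).
var RUN_FIRST = ["excessAtEnds", "forcedLineBreaks"];

function contains(list, item) {
    for (var i = 0; i < list.length; i++) {
        if (list[i] === item) { return true; }
    }
    return false;
}

// Return the selected tasks in the order they would be performed.
function orderTasks(selected) {
    var ordered = [];
    var i;
    for (i = 0; i < RUN_FIRST.length; i++) {
        if (contains(selected, RUN_FIRST[i])) { ordered.push(RUN_FIRST[i]); }
    }
    for (i = 0; i < ON_SCREEN_ORDER.length; i++) {
        var id = ON_SCREEN_ORDER[i];
        if (contains(selected, id) && !contains(RUN_FIRST, id)) { ordered.push(id); }
    }
    return ordered;
}
```

With this ordering, spacesToTab always precedes multipleSpaces, which matches the example given in note 2.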

Section 1: Search

Document — changes apply to the entire document that is currently open, or to the document in the top-most window if multiple documents are open.

Selected story — changes apply strictly to the selected story, or the story containing text currently selected. If no story is selected, the choice is disabled. The user may also choose Document to increase the scope of text affected.

Selection — changes apply strictly to the selected text. If no text is selected, the choice is disabled. The user may also choose Selected story or Document to increase the scope of text affected.

Section 2: Spaces

Replace all special with normal — all instances of special space characters are replaced with normal space characters. Examples of special space characters are non-breaking, thin, en, and em spaces, among others. Use with caution for completed documents, as the use of special space characters is likely intended. This option exists for imported text from unreliable sources such as word processing applications, in which the use of special space characters is usually by mistake and may cause undesired results while typesetting the text.
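
As a rough illustration of this option, the replacement can be expressed as one character class applied to a plain string. The function name and the exact set of characters are assumptions: the class below covers the Unicode non-breaking space, the en, em, thin, and related fixed-width spaces, and the narrow non-breaking space, which may not match the set the script treats as special.

```javascript
// Assumed set of special spaces: U+00A0 (non-breaking), U+2000 to U+200A
// (en, em, thin, hair, figure, and similar), and U+202F (narrow non-breaking).
var SPECIAL_SPACES = /[\u00A0\u2000-\u200A\u202F]/g;

// Replace every special space with a normal space (U+0020).
function replaceSpecialSpaces(text) {
    return text.replace(SPECIAL_SPACES, " ");
}
```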

Replace with tab — the user defines a minimum number of consecutive spaces. Any run of at least that many spaces, including any additional consecutive spaces, is replaced with a single tab character. Use with care as the results will be dramatic. Best used for newly imported text that will be subsequently styled.

Only at paragraph begin — replace with tab may be restricted to only instances of multiple spaces at the beginning of paragraphs.
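
The two options above might be sketched like this on a plain string. The function name is hypothetical, and the sketch assumes that a paragraph end is the carriage return character "\r"; it is not the script's own code.

```javascript
// Replace any run of at least minSpaces spaces with a single tab.
// If onlyAtParagraphBegin is true, only runs at the start of the text
// or directly after a paragraph end ("\r", an assumption) are replaced.
function spacesToTab(text, minSpaces, onlyAtParagraphBegin) {
    var n = Math.max(2, parseInt(minSpaces, 10) || 2); // at least two spaces
    var run = " {" + n + ",}";
    if (onlyAtParagraphBegin) {
        return text.replace(new RegExp("(^|\\r)" + run, "g"), "$1\t");
    }
    return text.replace(new RegExp(run, "g"), "\t");
}
```

For example, with a minimum of 3, a run of four spaces becomes one tab while a run of two spaces is left alone.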

Remove before or after tab — any number of space characters directly before or after a tab character are removed. The option is relatively safe, but in some cases may upset text flow. Use with care for completed documents.

Remove at paragraph begin — any number of space characters discovered at the beginning of any paragraph are removed.
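
A minimal sketch of the two removal options above, written as plain string functions. The names are made up for illustration, and "\r" is assumed to mark a paragraph end.

```javascript
// Remove any spaces directly before or after a tab, keeping the tab.
function removeSpacesAroundTabs(text) {
    return text.replace(/ *\t */g, "\t");
}

// Remove any spaces at the start of the text or directly after a
// paragraph end ("\r", an assumption).
function removeSpacesAtParagraphBegin(text) {
    return text.replace(/(^|\r) +/g, "$1");
}
```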

Between words, replace two or more with one — any instance of two or more consecutive space characters is reduced to a single space, unless the choice Keep two spaces between sentences is enabled. See below for more details.

Include special space characters — replacement of multiple spaces includes any special space characters such as non-breaking, en, or em space, etc. Any instance that includes special space characters will be replaced by a normal space character.

Keep two spaces between sentences — during replacement of multiple spaces, preserves instances of two spaces, but only when the instance directly follows a period, signaling the end of a sentence. This choice will not add spaces between sentences, only keep them if they already exist. Please note, if it were up to me (a typesetter of many years), this would not be an option. Unfortunately, clients exist who are adamant that sentences be separated by double spaces, regardless of any explanation of typography and how it differs from typewriters and lessons learned in high school. These clients write the check, so I’m in no position to argue. If you have similar clients, you’ll appreciate the option to preserve their double spaces despite our distaste for the practice.
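
The interaction of the three choices above can be sketched in one function. This is an illustration only: the function name and parameters are hypothetical, the special-space set is an assumed Unicode range, and the sketch assumes that only a run of exactly two normal spaces after a period is preserved (longer runs are still reduced to one).

```javascript
// Reduce runs of two or more spaces to a single normal space.
// keepTwoAfterPeriod: leave ".  " (period plus exactly two spaces) alone.
// includeSpecial: also count special spaces (assumed Unicode set) in a run.
function reduceSpaces(text, keepTwoAfterPeriod, includeSpecial) {
    var cls = includeSpecial ? "[ \\u00A0\\u2000-\\u200A\\u202F]" : " ";
    var run = new RegExp("(\\.?)(" + cls + "{2,})", "g");
    return text.replace(run, function (match, period, spaces) {
        if (keepTwoAfterPeriod && period === "." && spaces === "  ") {
            return match; // existing double space between sentences is kept
        }
        return period + " ";
    });
}
```

Note that nothing is ever added: a single space after a period stays a single space, in line with the description above.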

Section 3: Tabs

Remove at paragraph begin — any number of tab characters discovered at the beginning of any paragraph are removed.

Replace two or more with one — any instance of two or more consecutive tab characters is reduced to a single tab.
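
The two tab options work the same way as their space counterparts. A minimal sketch on a plain string, with hypothetical function names and "\r" assumed to be the paragraph end:

```javascript
// Remove any tabs at the start of the text or directly after a
// paragraph end ("\r", an assumption).
function removeTabsAtParagraphBegin(text) {
    return text.replace(/(^|\r)\t+/g, "$1");
}

// Reduce any run of two or more tabs to a single tab.
function reduceTabs(text) {
    return text.replace(/\t{2,}/g, "\t");
}
```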

Section 4: Other

Replace double hyphen with em dash or en dash — any instance of two hyphens is replaced with a single em dash or en dash, as chosen by the user.
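
As a small illustration, the dash replacement could look like this on a plain string. The function name and the "em" / "en" argument values are assumptions made for the sketch.

```javascript
// Replace each pair of hyphens with an em dash (U+2014) or,
// when dash is "en", an en dash (U+2013).
function replaceDoubleHyphen(text, dash) {
    var replacement = dash === "en" ? "\u2013" : "\u2014";
    return text.replace(/--/g, replacement);
}
```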

Remove forced line breaks — all forced line breaks are removed. The script determines if a space character precedes or follows each line break, and for instances where a space is absent, one will be inserted to ensure that words do not crash together once the line break is removed. Use with care for completed documents, as forced line breaks are likely intended and their removal may upset text flow. This option exists for imported text from unreliable sources such as word processing applications, in which forced line breaks are usually by mistake and may cause undesired results while typesetting the text.
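
The space check described above can be sketched as follows. This is a plain-string illustration rather than the script's code; it assumes a forced line break is the line feed character "\n", and the function name is hypothetical.

```javascript
// Remove each forced line break ("\n", an assumption). If no space sits
// directly before or after the break, insert one so the neighboring
// words do not run together.
function removeForcedLineBreaks(text) {
    return text.replace(/( ?)\n( ?)/g, function (match, before, after) {
        return (before || after) ? before + after : " ";
    });
}
```

Existing spaces are left as they are, so a break with a space on both sides leaves two spaces behind; reducing multiple spaces afterward would tidy that case.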

Remove excess at paragraph and story end — any combination of spaces or tabs, prior to the end of paragraphs, is removed. For stories, the same applies and includes the removal of forced line breaks and excess paragraph ends at the story end. If one or more paragraph ends exist at story end, one paragraph end will remain, otherwise the story concludes with a story end marker. This option is safe to use for any document as the result has no effect on text flow.
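
A minimal sketch of this cleanup for a single story held as a plain string. It assumes "\r" is a paragraph end and "\n" is a forced line break, and the function name is made up for illustration.

```javascript
// Clean the end of every paragraph and the end of the story.
function removeExcessAtEnds(story) {
    // Spaces and tabs directly before a paragraph end are removed.
    var text = story.replace(/[ \t]+\r/g, "\r");
    // At story end, remove the trailing run of spaces, tabs, forced line
    // breaks, and paragraph ends. Keep one paragraph end if any existed.
    return text.replace(/[ \t\n\r]+$/, function (tail) {
        return tail.indexOf("\r") !== -1 ? "\r" : "";
    });
}
```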

Remove empty paragraphs — instances of two consecutive paragraph ends are replaced with one. Use with care for completed documents as the result will be dramatic, eliminating space between paragraphs in cases where empty paragraphs have been used to create it. It’s not the right way to create space between paragraphs, but the practice is common even in completed documents. This option exists for imported text to prepare it for proper configuration of space between paragraphs (if desired) by giving the paragraph style a value for space before and/or after rather than inserting empty paragraphs.
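
In plain-string terms the idea is a single replacement. The sketch assumes "\r" is the paragraph end and that a run of more than two paragraph ends also collapses to one; the function name is hypothetical.

```javascript
// Collapse any run of consecutive paragraph ends ("\r", an assumption)
// into one, which removes the empty paragraphs between them.
function removeEmptyParagraphs(text) {
    return text.replace(/\r{2,}/g, "\r");
}
```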

Section 5: Settings

Current choices may be saved and restored later. Select from the Load drop-down list to choose saved settings, which will then update the current choices. Click the Delete button and the saved settings selected in the Load drop-down list will be permanently removed. Click the Save button, provide a name for the settings, and the current choices will be preserved. If the name already exists, the user may choose to replace the saved settings.

The script provides default saved settings named [Default] that will be created if the settings do not exist. The [Default] saved settings cannot be deleted but may be updated to any desired values by saving settings named [Default], which then overwrites the default settings.

Each time processing occurs, the current settings are preserved, and the next time the script is launched, settings are restored to the last values used.

Note that saving settings requires a file to store them, which resides in the script folder alongside the script file itself. It has the same name as the script but a different extension, “json”. The file may or may not be visible depending on the InDesign Scripts Panel option Display unsupported files. Normally only script files are visible, but when this option is enabled in the Scripts Panel fly-out menu, all files are visible.
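
The rules in this section can be sketched with a small in-memory store. Everything here is an assumption made for illustration: the function names, the layout of the stored object (a saved map plus a last entry for the last values used), and the example file name in the usage note. The actual contents of the script's json file are not documented on this page.

```javascript
var DEFAULT_NAME = "[Default]";

// Same folder and name as the script, with the extension changed to "json".
function settingsFileFor(scriptPath) {
    return scriptPath.replace(/\.[^.\/\\]+$/, "") + ".json";
}

// Open the store from JSON text (or start a new one), creating
// [Default] if it does not exist.
function openStore(json, defaults) {
    var store = json ? JSON.parse(json) : { saved: {}, last: null };
    if (!store.saved.hasOwnProperty(DEFAULT_NAME)) {
        store.saved[DEFAULT_NAME] = defaults;
    }
    return store;
}

// Save under a name; an existing name, including [Default], is overwritten.
function saveSettings(store, name, choices) {
    store.saved[name] = choices;
}

// Delete a saved entry; [Default] can never be deleted.
function deleteSettings(store, name) {
    if (name === DEFAULT_NAME || !store.saved.hasOwnProperty(name)) {
        return false;
    }
    delete store.saved[name];
    return true;
}
```

For a script stored at the hypothetical path /Scripts/text-cleanup.jsx, settingsFileFor gives /Scripts/text-cleanup.json, and the text written to that file would be JSON.stringify(store).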

CHANGE LOG

Version 18.4.9

a. Rewrote change engine to execute each task (option) on all stories, then next task, rather than execute all tasks on each story, then next story.
b. OK to all applies only to current task, not all tasks.
c. Improved confirmation dialog with undo.
d. Close scripts palette on launch to keep it from obscuring view of layout.
e. Display is zoomed, set to normal view, and hidden characters are displayed; prior view settings restored on completion.
f. Removed reading of legacy settings.

Version 17.12.20

a. Create saved settings [Default] if it does not exist.
b. Create saved settings file if it does not exist.
c. Disable delete settings button if none are selected or [Default] is selected.

For help installing scripts, see How to install and use scripts in Adobe Creative Cloud applications.

IMPORTANT: by downloading the script you agree that the software is provided without any warranty, express or implied. USE AT YOUR OWN RISK. Always make backups of important data.