# @alcalzone/ansi-tokenize > Efficiently modify strings containing [ANSI escape codes](https://en.wikipedia.org/wiki/ANSI_escape_code#Colors_and_Styles) If you find yourself modifying styled strings repeatedly, alternatives like [`slice-ansi`](https://github.com/chalk/slice-ansi/) may end up doing a lot of unnecessary work by re-parsing the string each time. This module provides a way to parse the string into an array of tokens (characters or ANSI codes), which can then be modified and re-serialized into a styled string. ## Install ``` $ npm install @alcalzone/ansi-tokenize ``` ## Usage ### Tokenize a string ```js import { tokenize } from "@alcalzone/ansi-tokenize"; // red "foo", followed by unstyled "bar" const str = "\x1B[31mfoo\x1B[39mbar"; const tokens = tokenize(str); // tokens will now look like this: [ { type: "ansi", code: "\x1B[31m", endCode: "\x1B[39m", }, { type: "char", value: "f", fullWidth: false, }, { type: "char", value: "o", fullWidth: false, }, { type: "char", value: "o", fullWidth: false, }, { type: "ansi", code: "\x1B[39m", endCode: "\x1B[39m", }, { type: "char", value: "b", fullWidth: false, }, { type: "char", value: "a", fullWidth: false, }, { type: "char", value: "r", fullWidth: false, }, ]; ``` Each token is either a character ```ts export interface Char { type: "char"; value: string; fullWidth: boolean; } ``` where - `value` is the string representation of the character - `fullWidth` is `true` if the character is full width (takes up 2 characters in monospace, like CJK characters) or an ANSI code ```ts export interface AnsiCode { type: "ansi"; code: string; endCode: string; } ``` where - `code` is the ANSI code that starts the style - and `endCode` is the corresponding ANSI code that ends the style. An `AnsiCode` can also be an end code, in which case `code` and `endCode` will be the same. ### Convert an array of tokens into an array of "styled" chars This representation is a 1:1 mapping of the original string, but not very useful for modifying the string. The `styledCharsFromTokens` function converts a token array to an array of characters, where each character has an all currently active styles associated with it: ```ts export interface StyledChar { type: "char"; value: string; fullWidth: boolean; styles: AnsiCode[]; } ``` Using the above example: ```js import { tokenize, styledCharsFromTokens } from "@alcalzone/ansi-tokenize"; // red "foo", followed by unstyled "bar" const str = "\x1B[31mfoo\x1B[39mbar"; const tokens = tokenize(str); const styledChars = styledCharsFromTokens(tokens); // styledChars will contain the following: [ { type: "char", value: "f", fullWidth: false, styles: [ { type: "ansi", code: "\x1B[31m", endCode: "\x1B[39m", }, ], }, { type: "char", value: "o", fullWidth: false, styles: [ { type: "ansi", code: "\x1B[31m", endCode: "\x1B[39m", }, ], }, { type: "char", value: "o", fullWidth: false, styles: [ { type: "ansi", code: "\x1B[31m", endCode: "\x1B[39m", }, ], }, { type: "char", value: "b", fullWidth: false, styles: [], }, { type: "char", value: "a", fullWidth: false, styles: [], }, { type: "char", value: "r", fullWidth: false, styles: [], }, ]; ``` ### Modify an array of styled characters For modification simply edit the items in the array as necessary, as long as the following rules are followed: 1. The `code` and `endCode` properties must match. You can use the `ansi-styles` module to do this. 2. The `fullWidth` property must be correct. You can use the `is-fullwidth-code-point` module to do this, or if working with multiple strings, turn those into styled char arrays first. E.g. to make the first `o` blue and bold: ```js import ansiStyles from "ansi-styles"; // ... include the above code styledChars[1].styles = [ { type: "ansi", code: ansiStyles.blue.open, endCode: ansiStyles.blue.close, }, { type: "ansi", code: ansiStyles.bold.open, endCode: ansiStyles.bold.close, }, ]; ``` ### Serialize a styled character array back to a string The `styledCharsToString` function converts a styled character array back to a string: ```js import { styledCharsToString } from "@alcalzone/ansi-tokenize"; // ... include the above code const strOut = styledCharsToString(styledChars); // str will now be '\x1B[31mf\x1B[34m\x1B[1mo\x1B[22m\x1B[31mo\x1B[39mbar' ``` This automatically figures out the least amount of escape codes necessary to achieve the desired result, as long as the `styles` arrays contain no unnecessary styles, e.g. blue + red foreground. ## Changelog ### 0.3.0 (2026-02-20) - Fix: preserve non-hyperlink OSC sequences when tokenizing (#54) - Fix: support ST-terminated OSC hyperlinks (#53) ### 0.2.5 (2026-02-11) - Fix: preserve grapheme clusters when tokenizing (#51) ### 0.2.4 (2026-01-29) - Fix: Support hyperlinks with parameters (#45) ### 0.2.3 (2026-01-02) - Fix: Regression when rendering adjacent `dim` and `bold` styles (#43, #44) ### 0.2.2 (2025-10-22) - Fix: Support compound SGR sequences with multiple attributes (#39, #40) ### 0.2.1 (2025-10-20) - Fix: Prevent `dim` and `bold` modifiers from canceling each other (#37) ### 0.2.0 (2025-04-24) - Breaking: Require Node.js 18+ - Fix: Detect emojis as being full width ### 0.1.3 (2023-09-07) - Fix: Support links ### 0.1.2 (2023-08-07) - Fix: Reduce minimum Node.js version to `14.13.1` ### 0.1.1 (2023-04-05) - Fix: Active styles are now correctly reset at the end of the string ### 0.1.0 (2023-03-20) Initial release