Less Broken = Win, Highlighter V0.8

 

Here 'tis, version 0.8. Most of the major changes dealt with fixing the HTML tokenizer, which now works *almost* flawlessly. Here are the updated CHANGELOG and README from v0.8.

CHANGELOG
0.8
  * Changed the way custom handlers work in syntax files. Instead of
    returning a highlighted string they should return a tokenset just as the
    main tokenize function does; this allows for recursion without hackery.
    See the README for updated documentation on how to write the handler
    funcs
  * Massive updates to the bundled HTML tokenizer so it handles comments,
    DOCTYPE, strings and self-closing tags correctly
  * Small updates to the bundled syntax files to take into account the new
    features and changes to the tokenizers
  * Added support for comments in the syntax files
  * Added a -o <outfile> switch which highlights using HTML (line numbering
    currently doesn't work in HTML output)

0.7.1
  * Added ability to link tokens in syntax files
  * You can now specify human readable colors in the syntax files as well as
    the SGR code, e.g.,
      T_TAG_START red
      T_TAG_END T_TAG_START
      T_HTML_COMMENT 36

0.7
  * This was the initial build/rebuild that introduced using external
    tokenizers and syntax files; the details of the individual changes
    weren't tracked.

README
Getting Started
-------------------------------------------------------------------------------

  Requirements
  -----------------------------------------------------------------------------
    Requires PHP 5.2.x or greater

  Installing
  -----------------------------------------------------------------------------
    1. Extract to desired directory
    2. Make 'highlight' executable
    3. Edit RESC_DIR constant in 'highlight' to point to the directory
    4. (optional) symlink highlight file in /usr/bin/

  Usage
  -----------------------------------------------------------------------------
    See command (highlight -h)

Extending
-------------------------------------------------------------------------------

  Syntax Files
  -----------------------------------------------------------------------------
    To add a syntax to the highlighter you need to create two files:
    syntax.syn and syntax.lib.

    syntax.syn will be a newline separated file containing the tokens and
    associated color (see man console_codes)

    Example:
      T_STRING 1;32
      T_ELSEIF 31

    The tokens in syntax.syn must match the tokens produced by syntax.lib.

    The .syn files can link together by using the #LINK directive, e.g.,
      T_STRING 1;32
      T_ELSEIF 31
      #LINK html.syn

    The color can also be another token which links multiple tokens together:
      T_STRING bgreen            // use human readable color
      T_ENCAPS_STRING T_STRING   // link T_ENCAPS_STRING to T_STRING

  Tokenizers
  -----------------------------------------------------------------------------
    The <syntax>.lib file at minimum must contain the function
    tokenize_<syntax>, e.g., tokenize_php. This function will return an array
    of tokenizations of the code passed to the function as a string. The
    array should follow the format of:

      array(
        0 => array(
          'token' => 'T_STRING',
          'string' => "'this is some string in the code'"
        ),
        1 => array(
          'token' => 'T_ELSEIF',
          'string' => 'elseif'
        )
      )

    See the bundled php.lib and php.syn for more examples.

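To make the token format above concrete, here is a minimal sketch of a tokenizer for a made-up "ini" syntax. Everything specific in it is an assumption for illustration: the `ini.lib` file name, the `tokenize_ini` function and the `T_INI_*` token names are hypothetical, not part of the bundled files. It only shows the shape of the return value, not how the bundled tokenizers are written.

```php
<?php
// Hypothetical tokenizer for a made-up "ini" syntax; it would live in ini.lib.
// The T_INI_* token names are illustrative and would need matching lines in
// a corresponding ini.syn file.
function tokenize_ini($code)
{
    $tokens = array();

    // Split the source into ';' comments, double-quoted strings and the text
    // in between. The delimiters are captured, so no source text is dropped.
    $parts = preg_split(
        '/(;[^\n]*|"(?:[^"\\\\]|\\\\.)*")/',
        $code,
        -1,
        PREG_SPLIT_DELIM_CAPTURE | PREG_SPLIT_NO_EMPTY
    );

    foreach ($parts as $part) {
        // Classify each piece by its first character (a rough heuristic:
        // an unterminated quote at the start of a piece is treated as a string).
        if ($part[0] === ';') {
            $token = 'T_INI_COMMENT';
        } elseif ($part[0] === '"') {
            $token = 'T_INI_STRING';
        } else {
            $token = 'T_INI_TEXT';
        }
        $tokens[] = array('token' => $token, 'string' => $part);
    }

    return $tokens;
}
```

Joining the 'string' values of the returned array in order gives back the original input, which is a handy sanity check for any tokenizer written in this format. A matching ini.syn would then assign a color (or another token) to each of the three token names, in the same style as the syntax file examples.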
    Each token may have a specialized handler for the color, see T_VARIABLE in
    php.syn and handlevar() in php.lib as an example. Note that this function
    CAN point directly to another syntax's tokenizer, e.g.,
      T_INLINE_HTML tokenize_html   // highlight HTML in PHP files

    Coinciding with the #LINK directive in the .syn files you must include()
    or require() the appropriate .tok file to produce tokens for the colors.

  Custom Handlers
  -----------------------------------------------------------------------------
    The custom handler functions are essentially mini-tokenizers which can
    themselves create tokens or act as wrapper functions to another syntax's
    tokenize function. In either case it should return an array of tokens
    just as the standard tokenize_* functions do.

    See handlevar in the bundled php.tok for an example of generating tokens
    without another tokenizer.

Further Info
-------------------------------------------------------------------------------
  If you can't get it working or you found a bug, etc. then send an email to
  shawn AT shawnbiddle DOT com with the subject 'Syntax Highlighter'
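
As a rough illustration of the Custom Handlers section, here is a sketch of a handler written in the new 0.8 style, returning tokens instead of a highlighted string. The function name, the assumption that a handler receives the matched source text as its only argument, and the `T_INI_*` token names are all hypothetical; the bundled handlevar() is the reference for the real calling convention.

```php
<?php
// Hypothetical custom handler for a double-quoted string token.
// Assumption: the handler is called with the token's source text and must
// return an array in the same format as a tokenize_* function.
function handle_ini_string($string)
{
    $tokens = array();

    // Separate backslash escapes (\n, \", ...) from the rest of the string
    // so that each can be given its own color in the .syn file.
    $parts = preg_split(
        '/(\\\\.)/s',
        $string,
        -1,
        PREG_SPLIT_DELIM_CAPTURE | PREG_SPLIT_NO_EMPTY
    );

    foreach ($parts as $part) {
        $tokens[] = array(
            'token'  => ($part[0] === '\\') ? 'T_INI_ESCAPE' : 'T_INI_STRING_BODY',
            'string' => $part,
        );
    }

    return $tokens;
}
```

The handler deliberately emits token names different from the one it would be attached to. If it returned the same token it handles, a highlighter that resolves handlers recursively could end up calling it forever.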