Adding syntax highlighting
The previous chapter describes how a simple HTML code editor can be built. In a plain text view, however, the structure and the content of an HTML file are not visually separated. Syntax highlighting improves legibility: when parts such as tags or attributes are displayed in a color or style different from the one used for content, the reader can find them in the document much more easily.
Syntax highlighting can be implemented in several ways. SimplyHTML uses regular expressions because they allow a pattern to be defined in a single expression.
Class SyntaxPane
A new class SyntaxPane is created as a subclass of JEditorPane. The constructor of SyntaxPane calls method setupPatterns, which defines the patterns for HTML tags, attributes and attribute content. Method setMarks (see below) applies syntax highlighting to a given part of the document in the SyntaxPane.
The SyntaxPane registers itself as a CaretListener and uses method caretUpdate to keep the syntax highlighting up to date for any changed text. When a document is first shown, setMarks is called for the entire content, so the initial highlighting takes longer for bigger documents. During editing only the highlighting of the current line is updated, so that typing is not slowed down too much.
A tradeoff of this approach is that constructs spanning several lines, such as multiline comments, are not handled.
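The limitation can be illustrated with a small, self-contained sketch. The comment pattern and the class name MultilineLimitSketch are hypothetical and not taken from SimplyHTML; they only show why a match that spans a line break is missed when each line is examined on its own.

```java
import java.util.regex.Pattern;

class MultilineLimitSketch {

    // Hypothetical pattern for an HTML comment. DOTALL lets '.' match line
    // breaks, so the pattern itself is able to span several lines.
    static final Pattern COMMENT = Pattern.compile("<!--.*?-->", Pattern.DOTALL);

    // True if the given text contains a complete comment.
    static boolean containsComment(String text) {
        return COMMENT.matcher(text).find();
    }

    public static void main(String[] args) {
        String document = "<!-- first line\nsecond line -->";
        // Whole document, as on initial display: the comment is found.
        System.out.println(containsComment(document));
        // One line at a time, as during typing: neither half is a complete comment.
        for (String line : document.split("\n")) {
            System.out.println(containsComment(line));
        }
    }
}
```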
Method setupPatterns
Method setupPatterns uses regular expressions to define a pattern for each element that is to be shown differently from normal content. An HTML tag, for instance, is enclosed in < and > and can have letters and numbers, with or without a slash, inside those markers. An attribute ends with =, and so on. For each Pattern an AttributeSet is created that holds the style to apply for that particular Pattern.
In method setupPatterns a Vector holds pairs of one Pattern and one AttributeSet, each pair wrapped in an instance of inner class RegExStyle.
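As a rough illustration of such definitions, the following sketch compiles three patterns and builds a style for one of them. The regular expressions, the colors and the class name PatternSetupSketch are assumptions made for this example; the expressions actually used in SimplyHTML may differ.

```java
import java.awt.Color;
import java.util.regex.Pattern;
import javax.swing.text.AttributeSet;
import javax.swing.text.SimpleAttributeSet;
import javax.swing.text.StyleConstants;

class PatternSetupSketch {

    // Assumed pattern: '<', optional slash, letters or digits, optional slash, '>'.
    static final Pattern TAG = Pattern.compile("</?[A-Za-z0-9]+/?>");

    // Assumed pattern: an attribute name followed by '='.
    static final Pattern ATTRIBUTE = Pattern.compile("[A-Za-z0-9-]+=");

    // Assumed pattern: attribute content enclosed in double quotes.
    static final Pattern ATTRIBUTE_CONTENT = Pattern.compile("\"[^\"]*\"");

    // Builds an AttributeSet with the given foreground color and bold flag.
    static AttributeSet style(Color color, boolean bold) {
        SimpleAttributeSet set = new SimpleAttributeSet();
        StyleConstants.setForeground(set, color);
        StyleConstants.setBold(set, bold);
        return set;
    }

    // Arbitrary example style for tags: bold and blue.
    static final AttributeSet TAG_STYLE = style(Color.BLUE, true);
}
```

Note that the assumed tag pattern only matches tags without attributes, such as <p>, </h1> or <br/>. A tag carrying attributes would need a more permissive expression.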
Inner class RegExStyle
Inner class RegExStyle is a convenience class that bundles a Pattern with a set of attributes. It simply has two fields, one for the Pattern and one for the AttributeSet, with their respective getters and setters. All defined RegExStyles are stored in Vector patterns of class SyntaxPane.
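A minimal sketch of such a bundle class might look as follows. It is not the SimplyHTML source: the enclosing class RegExStyleSketch, the constructor and the helper addStyle are hypothetical, the class is declared static so it can be used without a SyntaxPane instance, and generics are used for the Vector although the original code may predate them.

```java
import java.util.Vector;
import java.util.regex.Pattern;
import javax.swing.text.AttributeSet;
import javax.swing.text.SimpleAttributeSet;
import javax.swing.text.StyleConstants;

class RegExStyleSketch {

    // Bundles one Pattern with the AttributeSet to apply to its matches.
    static class RegExStyle {
        private Pattern pattern;
        private AttributeSet style;

        // Assumed convenience constructor.
        RegExStyle(Pattern pattern, AttributeSet style) {
            this.pattern = pattern;
            this.style = style;
        }

        Pattern getPattern() { return pattern; }
        void setPattern(Pattern pattern) { this.pattern = pattern; }
        AttributeSet getStyle() { return style; }
        void setStyle(AttributeSet style) { this.style = style; }
    }

    // Counterpart of Vector patterns in SyntaxPane.
    private final Vector<RegExStyle> patterns = new Vector<>();

    // Hypothetical helper: registers a pattern with a bold or plain style.
    void addStyle(String regex, boolean bold) {
        SimpleAttributeSet set = new SimpleAttributeSet();
        StyleConstants.setBold(set, bold);
        patterns.addElement(new RegExStyle(Pattern.compile(regex), set));
    }

    Vector<RegExStyle> getPatterns() {
        return patterns;
    }
}
```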
Method setMarks
Method setMarks is the public method of SyntaxPane that applies syntax highlighting to a given portion of the current document. It creates an instance of inner class StyleUpdater (see below) and passes it to invokeLater of class SwingUtilities, so that the styles are updated on the event dispatch thread without conflicting with other pending events.
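The hand-off to the event dispatch thread can be sketched as below. This is an assumption-laden illustration, not the SimplyHTML code: the class SetMarksSketch, its recording fields and the helper waitForEventQueue are hypothetical, and the StyleUpdater here only records what it was asked to do instead of styling a document.

```java
import java.lang.reflect.InvocationTargetException;
import javax.swing.SwingUtilities;

class SetMarksSketch {

    // Recorded by the queued task, for illustration only.
    volatile boolean ranOnEventThread;
    volatile int lastStart = -1;
    volatile int lastEnd = -1;

    // Queues the update and returns immediately; the work runs later on the
    // event dispatch thread.
    public void setMarks(int start, int end) {
        SwingUtilities.invokeLater(new StyleUpdater(start, end));
    }

    // Placeholder for the real StyleUpdater: it only records its arguments.
    private class StyleUpdater implements Runnable {
        private final int start;
        private final int end;

        StyleUpdater(int start, int end) {
            this.start = start;
            this.end = end;
        }

        @Override
        public void run() {
            ranOnEventThread = SwingUtilities.isEventDispatchThread();
            lastStart = start;
            lastEnd = end;
        }
    }

    // Hypothetical demo helper: blocks until everything queued so far has run.
    // Must not be called from the event dispatch thread itself.
    static void waitForEventQueue() {
        try {
            SwingUtilities.invokeAndWait(() -> { });
        } catch (InterruptedException | InvocationTargetException e) {
            throw new IllegalStateException(e);
        }
    }
}
```

Because invokeLater only queues the task, a caller that needs the result (for example in a test) has to wait for the queue to drain, which is what the helper is for.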
Inner class StyleUpdater
Class StyleUpdater implements the Runnable interface by wrapping its functionality in a public method named run. Its main task is to apply the styles associated with the regular expression patterns to a given portion of the document currently being edited.
This is done by iterating through Vector patterns of class SyntaxPane. For each Pattern a Matcher is created, and the style associated with the Pattern is applied to every match the Matcher finds.
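The matching loop might be sketched as follows. Several details are assumptions made for the example: the class name StyleUpdaterSketch, the use of a Map as a stand-in for the Vector of RegExStyle objects, the reset of the range before styling, and the demo helper highlight with its tag pattern.

```java
import java.awt.Color;
import java.util.LinkedHashMap;
import java.util.Map;
import java.util.regex.Matcher;
import java.util.regex.Pattern;
import javax.swing.text.AttributeSet;
import javax.swing.text.BadLocationException;
import javax.swing.text.DefaultStyledDocument;
import javax.swing.text.SimpleAttributeSet;
import javax.swing.text.StyleConstants;
import javax.swing.text.StyledDocument;

class StyleUpdaterSketch implements Runnable {

    private final StyledDocument doc;
    private final Map<Pattern, AttributeSet> styles; // stand-in for Vector patterns
    private final int start;
    private final int length;

    StyleUpdaterSketch(StyledDocument doc, Map<Pattern, AttributeSet> styles,
                       int start, int length) {
        this.doc = doc;
        this.styles = styles;
        this.start = start;
        this.length = length;
    }

    @Override
    public void run() {
        try {
            String text = doc.getText(start, length);
            // Assumption: reset the range first so stale highlighting disappears.
            doc.setCharacterAttributes(start, length, SimpleAttributeSet.EMPTY, true);
            for (Map.Entry<Pattern, AttributeSet> entry : styles.entrySet()) {
                Matcher matcher = entry.getKey().matcher(text);
                while (matcher.find()) {
                    // Matcher offsets are relative to 'text', so shift them by 'start'.
                    doc.setCharacterAttributes(start + matcher.start(),
                            matcher.end() - matcher.start(), entry.getValue(), true);
                }
            }
        } catch (BadLocationException e) {
            throw new IllegalStateException(e);
        }
    }

    // Demo helper: puts the text into a new document and colors simple tags blue.
    static StyledDocument highlight(String html) {
        StyledDocument doc = new DefaultStyledDocument();
        try {
            doc.insertString(0, html, null);
        } catch (BadLocationException e) {
            throw new IllegalStateException(e);
        }
        SimpleAttributeSet blue = new SimpleAttributeSet();
        StyleConstants.setForeground(blue, Color.BLUE);
        Map<Pattern, AttributeSet> styles = new LinkedHashMap<>();
        styles.put(Pattern.compile("</?[A-Za-z0-9]+/?>"), blue);
        // Called directly here; a pane would pass it to SwingUtilities.invokeLater.
        new StyleUpdaterSketch(doc, styles, 0, doc.getLength()).run();
        return doc;
    }
}
```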
Method caretUpdate
Each time the caret position changes, method caretUpdate determines the start and end position of the line the caret is currently in and calls method setMarks for this portion of text.
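One possible way to find the line around a caret position is to ask the document's root element for the child element containing that position. The sketch below takes this route as an assumption; SimplyHTML may determine the line differently, and the names LineBoundsSketch, lineBounds and documentOf are hypothetical. In a caretUpdate implementation the position would typically come from the getDot method of the CaretEvent.

```java
import javax.swing.text.BadLocationException;
import javax.swing.text.Document;
import javax.swing.text.Element;
import javax.swing.text.PlainDocument;

class LineBoundsSketch {

    // Returns {start, end} of the line containing caretPos. The end offset is
    // exclusive and is clamped to the document length.
    static int[] lineBounds(Document doc, int caretPos) {
        Element root = doc.getDefaultRootElement();
        // In a PlainDocument each child of the root element is one line.
        Element line = root.getElement(root.getElementIndex(caretPos));
        int start = line.getStartOffset();
        int end = Math.min(line.getEndOffset(), doc.getLength());
        return new int[] { start, end };
    }

    // Demo helper: creates a plain document holding the given text.
    static Document documentOf(String text) {
        Document doc = new PlainDocument();
        try {
            doc.insertString(0, text, null);
        } catch (BadLocationException e) {
            throw new IllegalStateException(e);
        }
        return doc;
    }
}
```

The returned pair could then be handed to setMarks so that only the current line is processed again.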
Recommended readings
'Regular Expressions and the Java™ Programming Language' at
http://developer.java.sun.com/developer/technicalArticles/releases/1.4regex/
and
presentation slides 'Rich Clients for Web Services' from JavaOne 2002 at
http://servlet.java.sun.com/javaone/resources/content/sf2002/conf/sessions/pdfs/2274.pdf