Updated: 30-Nov-2013

REBOL 3 Functions: parse

parse input rules /all /case

Parses a string or block series according to grammar rules.

Arguments:

input [series!] - Input series to parse

rules [block! string! char! none!] - Rules to parse by (none = ",;")

Refinements:

/all - For simple rules (not blocks) parse all chars including whitespace

/case - Uses case-sensitive comparison

See also:

trim  

Description

The parse function matches patterns of values and performs specific actions when those patterns match. A full summary can be found in "parsing: summary of parse operations".

Both string! and block! datatypes can be parsed. Parsing of strings matches specific characters or substrings. Parsing of blocks matches specific values, or specific datatypes, or sub-blocks.
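As a minimal sketch of the difference between the two input types (the inputs and the variable names below are made up for illustration and do not come from the original page):

```rebol
; String input: rules match characters and substrings.
str-ok: parse "abc" ["a" "bc"]

; Block input: rules can match values by datatype ...
types-ok: parse [1 "two" 3.0] [integer! string! decimal!]

; ... or by literal value, and can descend into a sub-block with INTO.
nested-ok: parse [point [10 20]] ['point into [2 integer!]]

print [str-ok types-ok nested-ok]
```

All three calls should return true, because each rule block matches its input through to the end.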

Most languages provide some way to parse strings. The ability to parse blocks of values as well is an important feature of the REBOL language.

The parse function takes two main arguments: an input to be parsed and the rules that are used to parse it. The rules are specified as a block of grammar productions that are to be matched.

General parse rules

Rules consist of these main elements:

Item       Description
keyword    a special word of the dialect, listed in the table below
word       get or set a variable (see below) - cannot be a keyword
path       get or set a variable via a path (see below)
value      match the input to a value (accepted datatypes depend on input datatype)
"|"        backtrack and match to next alternate rule (or)
[block]    a block of sub-rules
(paren)    evaluate an expression (a production)
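The elements above can be combined in a single rule. The following is only a sketch: the word a-or-b, the counter, and the input string are hypothetical names chosen for illustration.

```rebol
count: 0

; A word (a-or-b) referring to a block of sub-rules.
; Inside it: a value ("a"), a paren evaluated when "a" matches,
; and "|" introducing the alternate that is tried when "a" fails.
a-or-b: ["a" (count: count + 1) | "b"]

ok: parse "aabab" [some a-or-b]

print [ok count]
```

Here ok should be true and count should be 3, one increment for each "a" in the input.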

List of keywords

Within the parse dialect, these words are treated as keywords and cannot be used as variables.

Keyword                   Description
and rule                  match the rule, but do not advance the input (allows matching multiple rules to the same input)
any rule                  match the rule zero or more times; stop on failure or if input does not change
break                     break out of a match loop (such as any, some, while), always indicating success
change rule only value    match the rule, and if true, change the input to the new value (can be different lengths)
copy word                 set the word to a copy of the input for matched rules
do rule                   evaluate the input as code, then attempt to match to the rule
end                       match end of input
fail                      force current rule to fail, backtrack
if (expr)                 evaluate the expression (in a paren) and if false or none, fail and backtrack
insert only value         insert a value at the current input position (with optional ONLY for blocks by reference); input position is adjusted just past the insert
into rule                 match a series, then parse it with given rule; new series can be the same or different datatype
opt rule                  match to the rule once or not at all (zero or one times)
not rule                  invert the result of the next rule
quote arg                 accept next argument exactly as is (exception: paren)
reject                    similar to break: break out of a match loop (such as any, some, while), but indicate failure
remove rule               match the rule, and if true, remove the matched input
return rule               match the rule, and if true, immediately return the matched input as result of the PARSE function
set word                  set the word to the value of the input for matched rules
skip                      skip input (for the count range, if provided before it)
some rule                 match to the rule one or more times; stop on failure or if input does not change
then                      regardless of failure or success of what follows, skip the next alternate rule (branch)
thru rule                 scan forward in input for matching rules, advance input to tail of the match
to rule                   scan forward in input for matching rules, advance input to head of the match
while rule                like any, match to the rule zero or more times; stop on failure; does not care if input changes or not
??                        debugging output: prints the next parse rule value and shows the current input position (e.g. where you are in the string)

In addition, none is a special value that can be used as a default match rule. It is often used at the end of alternate rules to catch all no-match cases.
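A few of these keywords in use, as a hedged sketch. The inputs and the words digit, num, key and val are illustrative assumptions, not part of the original reference.

```rebol
digit: charset "0123456789"

; THRU, COPY, SOME and END on string input: num becomes "42".
ok-1: parse "id=42;" [thru "=" copy num some digit ";" end]

; SET and OPT on block input: key becomes the word width, val becomes 10.
; The optional string! is absent here, so OPT matches nothing.
ok-2: parse [width 10] [set key word! set val integer! opt string!]

; NONE as the last alternate: the optional sign may match nothing.
ok-3: parse "42" [["-" | none] some digit]

print [ok-1 ok-2 ok-3]
```

Each of the three calls should return true.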

Simple Parse

There is also a simple parse mode that does not require a rule block. Instead of rules, it takes a string of delimiter characters (or none) and uses them to split up the input string.

Parse also works in conjunction with bitsets (charset) to specify groups of special characters.

A simple parse returns a block of the split values. A rule-based parse returns true if the rules matched through the end of the input, and false otherwise.
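To illustrate the use of bitsets in a rule-based parse, here is a small sketch that counts vowels. The word names and the input string are chosen only for illustration.

```rebol
; A bitset that groups the characters of interest.
vowels: charset "aeiouAEIOU"

n: 0

; At each position: count the character if it is a vowel,
; otherwise skip over it. ANY stops at the end of the input.
ok: parse "Harry Haiku" [any [vowels (n: n + 1) | skip]]

print [ok n]
```

The parse reaches the end of the input, so ok should be true, and n should be 4 (a, a, i, u).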

print parse "divide on spaces" none
divide on spaces

print parse "Harry Haiku, 264 River Rd., Ukiah, 95482" ","
Harry Haiku 264 River Rd. Ukiah 95482

page: to string! read http://hq.rebol.net  ; read returns binary! in REBOL 3, so convert first
parse page [thru <title> copy title to </title>]
print title
Now is REBOL

digits: charset "0123456789"
area-code: ["(" 3 digits ")"]
phone-num: [3 digits "-" 4 digits]
print parse "(707)467-8000" [[area-code | none] phone-num]
true

