Public Member Functions | |
~Matcher () | |
Cleans up the dynamic memory used by this matcher. | |
std::string | replaceWithGroups (const std::string &str) |
unsigned long | getFlags () const |
std::string | getText () const |
bool | matches () |
bool | findFirstMatch () |
bool | findNextMatch () |
std::vector< std::string > | findAll () |
void | reset () |
std::string | getString () const |
void | setString (const std::string &newStr) |
int | getStartingIndex (const int groupNum=0) const |
int | getEndingIndex (const int groupNum=0) const |
std::string | getGroup (const int groupNum=0) const |
std::vector< std::string > | getGroups (const bool includeGroupZero=0) const |
int | getGroupNum () |
Static Public Attributes | |
static const int | MATCH_ENTIRE_STRING = 0x01 |
Used internally by matches to signal that the entire string must be matched. | |
Protected Member Functions | |
void | clearGroups () |
Called by reset to clear the group arrays. | |
Protected Attributes | |
Pattern * | pat |
The pattern we use to match. | |
std::string | str |
The string in which we are matching. | |
int | start |
The starting point of our match. | |
int * | starts |
An array of the starting positions for each group. | |
int * | ends |
An array of the ending positions for each group. | |
int * | groups |
An array of private data used by NFANodes during matching. | |
int * | groupIndeces |
An array of private data used by NFANodes during matching. | |
int * | groupPos |
An array of private data used by NFANodes during matching. | |
int | lm |
The ending index of the last match. | |
int | gc |
The number of capturing groups we have. | |
int | ncgc |
The number of non-capturing groups we have. | |
int | matchedSomething |
Whether or not we have matched something (used only by findFirstMatch and findNextMatch). | |
unsigned long | flags |
The flags with which we were made. | |
Friends | |
class | NFANode |
class | NFAStartNode |
class | NFAEndNode |
class | NFAGroupHeadNode |
class | NFAGroupLoopNode |
class | NFAGroupLoopPrologueNode |
class | NFAGroupTailNode |
class | NFALookBehindNode |
class | NFAStartOfLineNode |
class | NFAEndOfLineNode |
class | NFAEndOfMatchNode |
class | NFAReferenceNode |
class | Pattern |
Matcher is the preferred way to scan strings. Matchers are not thread-safe. They require very little dynamic memory, so you are encouraged to create several matcher instances where needed rather than share a single instance.
The most commonly used methods of the matcher are matches, findNextMatch, and getGroup. matches and findNextMatch both return success or failure; see their documentation for further details.
Unlike Java's Matcher, this class allows you to change the string you are matching against. This provides a small optimization, since you no longer need multiple matchers for a single pattern in a single thread.
This class also provides a convenient way to replace text with captured data via the replaceWithGroups method. A typical invocation looks like:
char buf[10000];
std::string str = "\\5 (user name \\1) uses \\7 for his/her shell and \\6 is their home directory";
FILE * fp = fopen("/etc/passwd", "r");
Pattern::registerPattern("entry", "[^:]+");
Pattern * p = Pattern::compile("^({entry}):({entry}):({entry}):({entry}):({entry}):({entry}):({entry})$",
                               Pattern::MULTILINE_MATCHING | Pattern::UNIX_LINE_MODE);
Matcher * m = p->createMatcher("");
while (fgets(buf, 9999, fp))
{
  m->setString(buf);
  if (m->matches())
  {
    printf("%s\n", m->replaceWithGroups(str).c_str());
  }
}
fclose(fp);
Calling any of the following functions before first calling matches, findFirstMatch, or findNextMatch results in undefined behavior and may cause your program to crash.
- replaceWithGroups
- getStartingIndex
- getEndingIndex
- getGroup
- getGroups
The function findFirstMatch will attempt to find the first match in the input string. The same results can be obtained by first calling reset followed by findNextMatch.
To avoid having to loop through a string to find every matching substring, use findAll. It finds all matching substrings and returns them in a vector. If you need to examine specific capture groups within each substring, do not use this method.
std::vector< std::string > Matcher::findAll ()
Returns a vector of every substring that matches the pattern, in order of occurrence.
References findNextMatch(), getGroup(), and reset().
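The behavior described for findAll can be illustrated without this library's headers. The following is a minimal sketch that uses the standard `<regex>` facilities as a stand-in; `findAllMatches` is a hypothetical name, and the standard library's pattern syntax may differ from this library's.

```cpp
#include <cassert>
#include <regex>
#include <string>
#include <vector>

// Stand-in for Matcher::findAll using std::regex (not this library):
// collect the full text of every non-overlapping match, in order.
std::vector<std::string> findAllMatches(const std::string &pattern, const std::string &text)
{
  const std::regex re(pattern);
  std::vector<std::string> found;
  for (std::sregex_iterator it(text.begin(), text.end(), re), end; it != end; ++it)
  {
    found.push_back(it->str()); // group zero: the whole matched substring
  }
  return found;
}
```

Only the whole matched substring is kept for each match, which is why a loop over individual matches is the better fit when capture groups are needed.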
bool Matcher::findFirstMatch ()
Scans the string for the first substring matching the pattern. The entire string does not necessarily have to match for this function to return success. Group variables are appropriately set and can be queried after this function returns.
References clearGroups(), ends, flags, Pattern::head, lm, matchedSomething, pat, start, starts, and str.
Referenced by findNextMatch().
bool Matcher::findNextMatch ()
Scans the string for the next substring matching the pattern. If no calls have been made to findFirstMatch or findNextMatch since the last call to reset, matches, or setString, then this function behaves like findFirstMatch.
References clearGroups(), ends, findFirstMatch(), flags, Pattern::head, lm, matchedSomething, pat, start, starts, and str.
Referenced by findAll(), and Pattern::findNthMatch().
int Matcher::getEndingIndex (const int groupNum=0) const
unsigned long Matcher::getFlags () const
std::string Matcher::getGroup (const int groupNum=0) const
Returns the specified group. An empty string ("") does not necessarily mean the group was not matched: a group such as (a*b?) can match a zero-length string. If an empty string is returned, call getStartingIndex to determine whether the group was actually matched.
groupNum | The group to query |
References ends, gc, starts, and str.
Referenced by findAll(), Pattern::findNthMatch(), getGroups(), and replaceWithGroups().
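The difference between a group that matched the empty string and a group that did not take part in the match can be seen in a small sketch. It uses the standard `<regex>` facilities as a stand-in, where `sub_match::matched` plays the role that getStartingIndex plays here; `GroupResult` and `queryGroup` are hypothetical names, and the value this library's getStartingIndex reports for an unmatched group is not assumed.

```cpp
#include <cassert>
#include <cstddef>
#include <regex>
#include <string>

// Result of querying one capture group after a search.
struct GroupResult
{
  bool participated; // true if the group took part in the match
  std::string text;  // captured text; may be empty either way
};

// Stand-in using std::regex (not this library): report the text of group n
// and whether that group actually participated in the match.
GroupResult queryGroup(const std::string &pattern, const std::string &input, std::size_t n)
{
  const std::regex re(pattern);
  std::smatch m;
  if (!std::regex_search(input, m, re) || n >= m.size())
  {
    return GroupResult{false, ""};
  }
  return GroupResult{m[n].matched, m[n].str()};
}
```

With the pattern `(a*)(b)?` and the input `x`, both groups yield an empty string, but only the first one participated in the match.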
int Matcher::getGroupNum () [inline]
std::vector< std::string > Matcher::getGroups (const bool includeGroupZero=0) const
Returns every capture group in a vector.
includeGroupZero | Whether or not to include capture group zero |
References gc, getGroup(), and start.
int Matcher::getStartingIndex (const int groupNum=0) const
Returns the starting index of the specified group.
groupNum | The group to query |
Referenced by Pattern::findNthMatch().
std::string Matcher::getString () const [inline]
Same as getText. Left in for backwards compatibility with old source code.
References str.
std::string Matcher::getText () const
The text being searched by the matcher.
References str.
bool Matcher::matches ()
Scans the string from start to finish for a match. The entire string must match for this function to return success. Group variables are appropriately set and can be queried after this function returns.
References clearGroups(), flags, Pattern::head, lm, MATCH_ENTIRE_STRING, matchedSomething, pat, and str.
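The contrast between whole-string matching (matches) and substring scanning (findFirstMatch, findNextMatch) can be sketched with the standard `<regex>` facilities as a stand-in for this library. `matchesEntire` and `containsMatch` are hypothetical names, and the pattern syntax is the standard library's default, which may differ from this library's.

```cpp
#include <cassert>
#include <regex>
#include <string>

// Whole-string match: analogous in spirit to Matcher::matches.
bool matchesEntire(const std::string &pattern, const std::string &text)
{
  return std::regex_match(text, std::regex(pattern));
}

// Substring match: analogous in spirit to Matcher::findFirstMatch.
bool containsMatch(const std::string &pattern, const std::string &text)
{
  return std::regex_search(text, std::regex(pattern));
}
```

For the pattern `[0-9]+` and the text `abc123`, the whole-string check fails while the substring check succeeds.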
std::string Matcher::replaceWithGroups (const std::string &str)
Replaces the back references in str with the appropriate captured text and returns the result. str should contain at least one back reference; otherwise this function does nothing.
str | The string in which to replace text |
References getGroup().
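The idea of substituting back references can be sketched in isolation. The helper below is hypothetical and is not the library's implementation: it assumes single-digit references of the form `\N`, takes the captured groups as a plain vector (index zero being the whole match), and copies through any reference to a group that does not exist. How replaceWithGroups itself treats multi-digit or out-of-range references is not assumed.

```cpp
#include <cassert>
#include <cctype>
#include <cstddef>
#include <string>
#include <vector>

// Hypothetical helper, not the library's implementation: expand single-digit
// back references of the form \N using groups[N] (groups[0] = whole match).
// References to groups that do not exist are copied through unchanged.
std::string expandBackrefs(const std::string &tmpl, const std::vector<std::string> &groups)
{
  std::string out;
  for (std::size_t i = 0; i < tmpl.size(); ++i)
  {
    const bool isRef = tmpl[i] == '\\' && i + 1 < tmpl.size() &&
                       std::isdigit(static_cast<unsigned char>(tmpl[i + 1]));
    if (isRef)
    {
      const std::size_t n = static_cast<std::size_t>(tmpl[i + 1] - '0');
      if (n < groups.size())
      {
        out += groups[n];
        ++i; // skip the digit that was just consumed
        continue;
      }
    }
    out += tmpl[i];
  }
  return out;
}
```

In C++ source the template is written with doubled backslashes, as in `"\\2 belongs to \\1"`, so that the string itself contains `\2` and `\1`.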
void Matcher::reset ()
Resets the internal state of the matcher.
References clearGroups(), lm, and matchedSomething.
Referenced by findAll(), and setString().
void Matcher::setString (const std::string &newStr) [inline]
Sets the string to scan.
newStr | The string to scan for subsequent matches |
Referenced by Pattern::findNthMatch().