I created a regular expression:

    ((\(\s*)          #match start parens
    |(\d+\.?\d*)      #match a number
    |([+-/%^!])       #match operators
    |([*])            #match the asterisk operator
    |(\s*\)))+        #match the end parens

that is supposed to separate parentheses, numbers (integers and decimals, such as 3 and 6.28), and operators (+-/*^%!).

I have tried a few tests:

    ( (2 3 +) 6.28 +)
    (3.14 6.28 +)
    ( (3 4 +) (5 6 +) *)

and I have noticed a few things. When I run the regular expression on expressions with two start parens, it seems to ignore one of the parentheses, and testing on the site seems to yield many instances of null and repetition of characters.

Is there a way to match a valid expression and have each token captured in its own group? For example, if I have the expression ( (2 3 +) 6.28 +), the groups generated would be: [(, (, 2, 3, +, 6.28, +, )]?

I remember one user posted an answer here that used a Python regular expression, and it worked like a charm. The expression used something like (?.) or (.?) and the rest of my expressions. Unfortunately, I neglected to copy it down and the answer has been deleted. Since then I have tried tweaking my expression quite a bit, but nothing has worked. Any extra help is appreciated.
Just for the sport, I looked at your question both for the RPN case and the non-RPN case. Here are the recursive regexes I came up with (Perl, PCRE, Python with the regex module), with a couple of caveats:

- I worked on the matching part of your question, not the tokenizing part.
- I didn't tune them, so please only take them as a starting point.

RPN

    ^(\((?![ ]*[+/*-])(?:[ ]*|[+/*-](?![ ]*[+/*-])|(?:[ ]*\d+)?[ ]*\d+[ ]*[+/*-](?![ ]*[+/*-])[ ]*|(?1))*\))

On the demo, you can see some sample strings that it matches, as well as "improper strings" where it fails.

Non-RPN

    ^(\((?![*\/])(?:\d++|[+*\/-](?![+*\/-])|(?1))*(?<![+*\/-])\))

On the demo, you can see some sample strings that it matches, as well as "improper strings" where it fails.

Discussion

- There are probably cases I haven't thought of where the patterns allow improper operations. I did this fast. If you find one, let me know and we'll look for a fix. :)
- I didn't deal with decimals. That is a simple addition, which I leave for you to do if you choose to.
- In RPN I haven't dealt with negative numbers (which typically have a special key on the calculator): instead, you can use -1* for instance.
- For the sake of tidiness, on the non-RPN one I have not allowed any spaces, but it wouldn't be hard to do so.

Related

When no parentheses are required, things are easier. In this related question, we're able to use a short-and-tidy

    ^\s*-?\d+(?:\s*[-+*/]\s*\d+\s*)+$
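Since the patterns above cover matching rather than tokenizing, here is a minimal sketch of the tokenizing side using only Python's standard re module. It is not the deleted answer, just one way to get the token list. Two things in the question's pattern look like the cause of the odd output: a repeated capturing group such as (...)+ only keeps its last iteration, and the class [+-/%^!] contains the range +-/, which also matches , and . characters. The names TOKEN_RE, tokenize and is_balanced are made up for this example, and the number rule (digits, optionally followed by a dot and more digits) is an assumption that is slightly stricter than the original \d+\.?\d*.

```python
import re

# One alternative per token kind. There is no outer repeated group, so
# findall() returns every token rather than only the last capture.
TOKEN_RE = re.compile(r"""
    \d+(?:\.\d+)?     # integer or decimal, e.g. 3 or 6.28
  | [-+*/%^!]         # one operator; the leading '-' is a literal, not a range
  | [()]              # a single opening or closing parenthesis
""", re.VERBOSE)


def tokenize(expr):
    """Return the tokens of expr in order; raise ValueError on stray characters."""
    tokens = TOKEN_RE.findall(expr)
    # Whatever remains after removing tokens and whitespace is invalid input.
    leftover = TOKEN_RE.sub(" ", expr).split()
    if leftover:
        raise ValueError("unexpected input: " + " ".join(leftover))
    return tokens


def is_balanced(tokens):
    """Check that parentheses in the token list are properly nested."""
    depth = 0
    for tok in tokens:
        if tok == "(":
            depth += 1
        elif tok == ")":
            depth -= 1
            if depth < 0:
                return False
    return depth == 0


if __name__ == "__main__":
    for sample in ["( (2 3 +) 6.28 +)", "(3.14 6.28 +)", "( (3 4 +) (5 6 +) *)"]:
        toks = tokenize(sample)
        print(toks, is_balanced(toks))
```

Tokenizing first and checking the nesting in plain code keeps the regex flat, so it runs on the built-in re module without recursion support. Validating operand and operator order would still need either a small stack-based pass over the tokens or one of the recursive patterns above.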