Regex with Parentheses


I am trying to remove the parenthesized parts from the following string:
Snowden (left), whose whereabouts remain unknown, made the extraordinary claim as his father, Lon (right), told US television he intended to travel
I am using the following regex: ([(].*[)]), but it matches:
(left), whose whereabouts remain unknown, made the extraordinary claim as his father, Lon (right)
Which makes sense, but isn't what I want.
What can I do to solve this? Does it have something to do with greedy or lazy?
EDIT:
I am using Python:
paren = re.findall(ur'([(\u0028][^)\u0029]*[)\u0029])', text, re.UNICODE)
if paren is not None:
    text = re.sub(s, '', text)
This leads to the following output:
Snowden (), whose whereabouts remain unknown, made the extraordinary claim as his father, Lon (), told US television he intended to travel
However, when I print paren.group(0) I get "(left)", meaning the parentheses are included. Why is this?
Thanks.
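A likely explanation for the "()" output, sketched under an assumption: the snippet above does not show where s comes from, so the code below assumes s is one of the strings returned by findall, such as "(left)". Passed back to re.sub as a pattern, the parentheses in that string act as a capturing group rather than literal characters, so only the word inside them is matched and removed. Note also that re.findall returns a list (never None), so it has no .group method. The text here is a shortened stand-in for the original sentence.

```python
import re

# Shortened stand-in for the sentence in the question.
text = "Snowden (left), whose father, Lon (right), told US television"

# Assumption: s is one of the strings findall returned, e.g. "(left)".
s = "(left)"

# As a pattern, "(left)" is a group around the word left, so only
# the word is matched and removed; the literal parentheses stay behind.
as_pattern = re.sub(s, "", text)

# re.escape turns the parentheses into literal characters.
escaped = re.sub(re.escape(s), "", text)

print(as_pattern)
print(escaped)
```

If the goal is only to delete the parenthesized parts, calling re.sub once with the original pattern avoids the round trip through findall altogether.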
Use a negated character class: ([(][^)]*[)]). This matches the opening (, then any number of characters that are not a closing ), then the closing ).
You can negate any character or set of characters in this way. To match a literal ^ caret, escape it as \^ outside a character class, or put it anywhere after the first position inside one, like so: [a^bc]. It is always a good idea to read the rules of the regular expression flavor you are working in, to know exactly what is possible and what the correct syntax is.
Greedy versus lazy matching is one feature that is not implemented the same way (if at all) in every regular expression engine; POSIX basic and extended regular expressions, for example, have no lazy quantifiers. It is better to say explicitly what you want to find than to depend on a rule that can be difficult to understand and debug.
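As a rough illustration in Python (the language mentioned in the question's edit), the sketch below compares the greedy pattern with the negated character class on the sentence from the question. The leading \s* in the last pattern is an addition of mine, not part of the pattern suggested above; it only tidies the space left in front of each removed parenthesis.

```python
import re

text = ("Snowden (left), whose whereabouts remain unknown, made the "
        "extraordinary claim as his father, Lon (right), told US television "
        "he intended to travel")

# Greedy .* runs on to the LAST closing parenthesis in the string.
greedy = re.findall(r"[(].*[)]", text)

# The negated class stops at the FIRST closing parenthesis.
negated = re.findall(r"[(][^)]*[)]", text)

# re.sub can delete the matches directly. The \s* prefix (my addition)
# also swallows the space in front of each parenthesis.
cleaned = re.sub(r"\s*[(][^)]*[)]", "", text)

print(greedy)
print(negated)
print(cleaned)
```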
Restrict the .* to match only things that aren't parentheses:
([(][^()]*[)])
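Excluding both kinds of parenthesis only matters when the text contains nested parentheses. A small sketch, using a made-up sample string that is not from the question:

```python
import re

# Hypothetical sample with nested parentheses (not from the question).
nested = "a (outer (inner) tail) b"

# [^)]* happily crosses the inner "(" and stops at the first ")".
only_close_excluded = re.findall(r"[(][^)]*[)]", nested)

# [^()]* cannot cross any parenthesis, so only the innermost pair matches.
both_excluded = re.findall(r"[(][^()]*[)]", nested)

print(only_close_excluded)
print(both_excluded)
```

Neither pattern handles nesting in full; the second simply guarantees that every match is a pair with no parentheses inside it.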
First, you don't need [] for a single character. Just escape the parentheses.
Second, use .*? for a non-greedy match:
/\(.*?\)/
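In Python the slashes are not part of the pattern, so the same idea might look like the sketch below, applied to the sentence from the question.

```python
import re

text = ("Snowden (left), whose whereabouts remain unknown, made the "
        "extraordinary claim as his father, Lon (right), told US television "
        "he intended to travel")

# .*? is lazy: it takes as few characters as possible before the next ")".
lazy_matches = re.findall(r"\(.*?\)", text)

# Removing the matches leaves the rest of the sentence untouched.
without_parens = re.sub(r"\(.*?\)", "", text)

print(lazy_matches)
print(without_parens)
```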
As pguardiario mentioned (and I upvoted him), you don't need a character class; just escape the parentheses.
His solution will work, with one caveat: if the text within the parentheses is hard-wrapped, the . won't match \n by default. You need a character class for that.
My proposed solution:
\([^)]*\)
This escapes the parentheses on either end and will always match whatever is within them, including line breaks (unless the text contains another parenthetical clause, of course).
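The hard-wrap caveat can be checked with a short Python sketch. The wrapped sample string is invented for illustration, and the re.DOTALL variant is shown only as an alternative that is not mentioned above.

```python
import re

# Hypothetical hard-wrapped text: the parenthetical spans two lines.
wrapped = "Lon (right,\nin the photo) said"

# By default "." does not match a newline, so the lazy pattern finds nothing.
dot_default = re.findall(r"\(.*?\)", wrapped)

# re.DOTALL lets "." match newlines as well.
dot_all = re.findall(r"\(.*?\)", wrapped, re.DOTALL)

# A negated character class matches newlines without any flag.
negated_class = re.findall(r"\([^)]*\)", wrapped)

print(dot_default)
print(dot_all)
print(negated_class)
```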
It's a matter of style, but I prefer [(] to \( so I would use ([(][^)]*[)])
You haven't specified which language you are using. If it is Perl, I would use the /x modifier, which lets me add spacing for clarity:
/ ( [(] [^)]* [)] ) /x
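Since the question's edit says the language is Python, the closest equivalent to /x there is the re.VERBOSE flag, which ignores whitespace outside character classes and allows # comments inside the pattern. A minimal sketch, with the outer capturing group left out because it is not needed for removal:

```python
import re

text = ("Snowden (left), whose whereabouts remain unknown, made the "
        "extraordinary claim as his father, Lon (right), told US television "
        "he intended to travel")

# With re.VERBOSE the pattern can be spread out and annotated.
paren_pattern = re.compile(r"""
    [(]      # a literal opening parenthesis
    [^)]*    # any run of characters other than a closing parenthesis
    [)]      # a literal closing parenthesis
""", re.VERBOSE)

verbose_matches = paren_pattern.findall(text)
verbose_cleaned = paren_pattern.sub("", text)

print(verbose_matches)
print(verbose_cleaned)
```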
